[Bioperl-l] Extracting gi no from refseq record

Siddhartha Basu basu at pharm.sunysb.edu
Fri Apr 4 17:36:52 EST 2003


Hi,
Sorry for not being specific.
Here is the exception message I got:

-----------EXCEPTION  -------------
MSG: swissprot stream with no ID. Not swissprot in my book
STACK Bio::SeqIO::swiss::next_seq /usr/lib/perl5/site_perl/5.8.0/Bio/SeqIO/swiss.pm:179
STACK Bio::Index::AbstractSeq::fetch /usr/lib/perl5/site_perl/5.8.0/Bio/Index/AbstractSeq.pm:145
STACK Bio::Index::AbstractSeq::get_Seq_by_acc /usr/lib/perl5/site_perl/5.8.0/Bio/Index/AbstractSeq.pm:213
STACK main::GetDes linkup.pl:165
STACK toplevel linkup.pl:101

--------------------------------------

By "multiple ids" I mean multiple lines that start with the ID line code.
Some of the entries that failed with this exception message are:

O95300, O95753, P01121, P00938, O15509, O15532.

Here is an example of one such entry:

===================================================================

ID   RHOB_HUMAN     STANDARD;      PRT;   196 AA.
ID   RHOB_MOUSE
ID   RHOB_RAT
AC   P01121; Q9CUV7;
DT   21-JUL-1986 (Rel. 01, Created)
DT   01-AUG-1988 (Rel. 08, Last sequence update)
DT   28-FEB-2003 (Rel. 41, Last annotation update)
DE   Transforming protein RhoB (H6).
GN   ARHB OR ARH6 OR RHOB.
OS   Homo sapiens (Human),
OS   Mus musculus (Mouse), and
OS   Rattus norvegicus (Rat).
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC   Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
OX   NCBI_TaxID=9606, 10090, 10116;
RN   [1]
RP   SEQUENCE FROM N.A.
RC   SPECIES=Human;
RX   MEDLINE=88203210; PubMed=3283705;
RA   Chardin P., Madaule P., Tavitian A.;
RT   "Coding sequence of human rho cDNAs clone 6 and clone 9.";
RL   Nucleic Acids Res. 16:2717-2717(1988).
RN   [2]
RP   SEQUENCE FROM N.A.
RC   SPECIES=Human; TISSUE=Brain;
RA   Puhl H.L. III, Ikeda S.R., Aronstam R.S.;
RL   Submitted (APR-2002) to the EMBL/GenBank/DDBJ databases.
RN   [3]
RP   SEQUENCE OF 29-196 FROM N.A.
RC   SPECIES=Human;
RX   MEDLINE=85201682; PubMed=3888408;
RA   Madaule P., Axel R.;
RT   "A novel ras-related gene family.";
RL   Cell 41:31-40(1985).
RN   [4]
RP   SEQUENCE FROM N.A.
RC   SPECIES=Mouse;
RX   MEDLINE=96428574; PubMed=8831676;
RA   Nakamura T., Asano M., Shindo-Okada N., Nishimura S., Monden Y.;
RT   "Cloning of the RhoB gene from the mouse genome and characterization
RT   of its promoter region.";
RL   Biochem. Biophys. Res. Commun. 226:688-694(1996).
RN   [5]
RP   SEQUENCE FROM N.A.
RC   SPECIES=Mouse; STRAIN=C57BL/6; TISSUE=Hippocampus;
RA   Westmark C.J., Malter J.S.;
RT   "RhoB mRNA is stabilized by HuR after UV light.";
RL   Submitted (FEB-2002) to the EMBL/GenBank/DDBJ databases.
RN   [6]
RP   SEQUENCE FROM N.A.
RC   SPECIES=Mouse; TISSUE=Salivary gland;
RA   Strausberg R.;
RL   Submitted (DEC-2001) to the EMBL/GenBank/DDBJ databases.

========truncated=======================================================

It fails specifically on the entries that have multiple ID lines.
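
To list every affected entry in a flat file up front, rather than
finding them one exception at a time, something like the following
could be used. It is only a minimal sketch: it is written in Python,
does not use BioPerl, and the names entries_with_multiple_ids and
SAMPLE are made up for illustration. It assumes each entry ends with
a "//" line, as in the Swiss-Prot flat-file format, and it only looks
at the ID and AC line codes.

```python
def entries_with_multiple_ids(lines):
    """Yield (primary_accession, id_names) for entries with more than one ID line."""
    ids, acc = [], None
    for line in lines:
        if line.startswith("ID "):
            parts = line.split()
            if len(parts) > 1:
                ids.append(parts[1])          # entry name, e.g. RHOB_HUMAN
        elif line.startswith("AC ") and acc is None:
            # First accession on the first AC line is the primary one.
            acc = line[2:].split(";")[0].strip()
        elif line.startswith("//"):
            # End of entry: report it only if it had several ID lines.
            if len(ids) > 1:
                yield acc, ids
            ids, acc = [], None
    # Also report a final entry that lacks the "//" terminator.
    if len(ids) > 1:
        yield acc, ids


# Header lines of the entry quoted above, plus the terminator.
SAMPLE = """\
ID   RHOB_HUMAN     STANDARD;      PRT;   196 AA.
ID   RHOB_MOUSE
ID   RHOB_RAT
AC   P01121; Q9CUV7;
DE   Transforming protein RhoB (H6).
//
"""

if __name__ == "__main__":
    for acc, ids in entries_with_multiple_ids(SAMPLE.splitlines()):
        print(acc, ",".join(ids))
```

For a real file, the function can be given an open file handle instead
of SAMPLE.splitlines(), since it only iterates over lines.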




bye
siddhartha






Hilmar Lapp wrote:
> Always email the exception message and stack trace copy&pasted, as 
> otherwise no-one has a clear idea what's happened. The SeqIO swissprot 
> parser will only retain the first species (and correspondingly only the 
> first NCBI taxon ID). Multiple ids mean multiple secondary accession 
> lines? These should all be parsed (Jason you fixed that: did you do that 
> on the stable branch too?); I don't know though whether they'll all be 
> indexed.
> 


