[Biopython-dev] ipi parser

Pierre Monestie pierre.monestie at lbri.lionbioscience.com
Wed May 19 10:33:13 EDT 2004


Sorry, my first message was too long; here is a shorter example.
Thanks for looking at it,
Pierre


ID   IPI00387610.1         IPI;      PRT;   697 AA.
AC   IPI00387610;
DT   18-NOV-2003 (IPI Rat rel. 1.9, Created)
DT   18-NOV-2003 (IPI Rat rel. 1.9, Last sequence update)
OS   Rattus norvegicus (Rat).
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC   Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.
OX   NCBI_TaxID=10116;
CC   -!- CHROMOSOME: 1.
DR   InterPro; IPR001611; LRR.
DR   InterPro; IPR007091; LRR_RNinh.
DR   InterPro; IPR003590; LRR_RNinh_sub.
DR   InterPro; IPR007111; NACHT_NTPase.
DR   Pfam; PF05729; NACHT; 1.
DR   PRINTS; PR00019; LEURICHRPT.
DR   SMART; SM00368; LRR_RI; 4.
DR   PROSITE; PS50503; LRR_RI; 1.
DR   PROSITE; PS50837; NACHT; 1.
DR   UniParc; UPI000021DDC2; -; -.
DR   ENSEMBL; ENSRNOP00000030672; ENSRNOG00000021996; M.
SQ   SEQUENCE   697 AA;  80092 MW;  D6C61D8C95F306AF CRC64;
     PLVLTDSGHS KLYQAHLKKK LTHDYARKFN IKAQDLFKQK FTQDDCDRFE NLLVSKATGK
     KPHMVFLQGV AGIGKSLMLT KLMLAWSEGI VFQNKFSYIF YFCCQDVKQL KRASLAELIS
     REWPNASAPT AEILSQPEKL LFIIDSLEVM ECNMSERESE LCDNCTEKQP VSLLLSSLLR
     RKMLPESSFL ISATPETFEK MEDRIECTNV KIITGFNENN IKMYFRSLFQ DKNRTLEAFS
     LVRENEQLFN VCQVPVLCWM VATCIKKEIE KGRDPVFICR RTTSLYTTHI FNLFTPQNAQ
     YPSKKSQDQL QGLCSLAAEG MWTDTFVFSE EALRRNGILD SDIPTLLDRR ILERSKESES
     CYIFLHPSLQ EVCAAVFYLL KSHLDHPSQD VKSVEALLFT FLKKAKVQWI FLGCFLFGLL
     HESEQEKLEM FFGHQLSQEI KHQLYQCLET ISVNEELQEQ IDGMKLFYCL FEMEDEAFLM
     QAMNCMEQIN FVAKDYSDVI VAAYCLKHCS TLKKLSFSTQ NILSEEQEHS YTEKLLICWH
     HMCSVLISSK DIHVLQVKDT NLNETAFWVL YNHLKYPSCT LKVLVIAACN LSPDDCKVFA
     SVLISSKMLK HLNLSSNNLD KGISSLCKAL CHPDCILKHL VVRHCLITTS GCQDLAEVLR
     HNQNLRSLQV SNNKIEDAGV KLLCDAIKQP NCHLENI
//
ID   IPI00187591.2         IPI;      PRT;   163 AA.
AC   IPI00187591;
DT   14-MAR-2003 (IPI Rat rel. 1.0, Created)
DT   18-NOV-2003 (IPI Rat rel. 1.9, Last sequence update)
OS   Rattus norvegicus (Rat).
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC   Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.
OX   NCBI_TaxID=10116;
CC   -!- CHROMOSOME: 19.
DR   UniParc; UPI00001CD005; -; -.
DR   ENSEMBL; ENSRNOP00000023455; ENSRNOG00000016991; M.
SQ   SEQUENCE   163 AA;  18683 MW;  A1998B08C0383825 CRC64;
     MDALEEESFA LSFSSASDAE FDAVVGCLED IIMDAEFQLL QRSFMDKYYQ EFEDTEENKL
     TYTPIFNEYI SLVEKYIEEQ LLERIPGFNM AAFTTTLQHH KDEVAGDIFD MLLTFTDFLA
     FKEMFLDYRA EKEGRGLDLS SGLVVTSLCK SSSTPASQNN LRH
//
ID   IPI00357878.1         IPI;      PRT;   690 AA.
AC   IPI00357878; IPI00201160;
DT   02-OCT-2003 (IPI Rat rel. 1.7, Created)
DT   02-OCT-2003 (IPI Rat rel. 1.7, Last sequence update)
DE   SIMILAR TO ARHGEF3 PROTEIN.
OS   Rattus norvegicus (Rat).
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC   Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.
OX   NCBI_TaxID=10116;
CC   -!- CHROMOSOME: 16.
DR   InterPro; IPR001849; PH.
DR   InterPro; IPR000219; RhoGEF.
DR   Pfam; PF00169; PH; 1.
DR   Pfam; PF00621; RhoGEF; 1.
DR   SMART; SM00233; PH; 1.
DR   SMART; SM00325; RhoGEF; 1.
DR   PROSITE; PS50010; DH_2; 1.
DR   PROSITE; PS50003; PH_DOMAIN; 1.
DR   ENSEMBL; ENSRNOP00000019511; ENSRNOG00000014363; -.
DR   REFSEQ_XP; XP_224588; GI:34876921; M.
DR   UniParc; UPI00001D0F0F; -; -.
SQ   SEQUENCE   690 AA;  77882 MW;  FAB74E51D05C987B CRC64;
     MENSENPPVD NRTSVLHPLL RQTTQTQFVH EPFTEGIQMS ALGYLKRKRK QSAQDEDAVS
     LCSLDISQPA RALLNPQQTL SERWIRDGLS ASSVVWMTER KGEKHYERER PALPVEPGIR
     SSLLEAVVGV RVAAAGIVEL GPSFTRDFCC RLGSAVTSQR AGPAAAMVAK DYPFYLTVKR
     ANCSLEAPLG SGVAKDEEPS NKRVKPLSRV TSLANLIPPV KTTPLKRFSQ TLQRSISFRN
     ESRPDILAPR AWSRNATSSS TKRRDSKLWS ETFDVCVSQV LTAKEIKRQE AIFELSQGEE
     DLIEDLKLAK KAYHDPMLKL SIMTEQELNQ IFGTLDSLIP LHEDLLSQLR DVRKPDGSTE
     HVGPILVGWL PCLSSYDSYC SNQVAAKALL DHKKQDHRVQ DFLQRCLESP FSRKLDLWNF
     LDIPRSRLVK YPLLLREILR HTPNDNPDQQ HLEEAINIIQ GIVAEINTKT GESECRYYKE
     RLLYLEEGQK DSLIDSSRVL CCHGELKNNR GVKLHVFLFQ EVLVITRAVT HNEQLCYQLY
     RQPIPVKDLT LEDLQDGEVR LGGSLRGAFS NNERIKNFFR VSFKNGSQSQ THSLQANDTF
     NKQQWLNCIR QAKETVLSAA GQAGLLDSES LSQSPGTENR ELRGETKLEQ MDQSDSESDC
     SMDTSEVSLE CERMEQTDAS CANSRPEENV
//
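
For anyone who wants to inspect the three records above without going through the Biopython parser, here is a minimal sketch of a line-code based reader. It assumes only what is visible in the sample: each record starts with an ID line, the sequence follows the SQ line, and "//" ends the record. The function names read_records and to_fasta are made up for this illustration, and the FASTA output is a guess based on the script name make_sptofasta.py; none of this is the Biopython API.

```python
def read_records(lines):
    """Yield (identifier, sequence) pairs from Swiss-Prot-style flat-file lines."""
    identifier, in_sequence, chunks = None, False, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("ID "):
            # Second whitespace-separated field, e.g. IPI00187591.2
            identifier = line.split()[1]
        elif line.startswith("SQ "):
            # Sequence data starts on the line after the SQ header.
            in_sequence = True
        elif line.startswith("//"):
            # End of record: emit what was collected and reset the state.
            if identifier is not None:
                yield identifier, "".join(chunks)
            identifier, in_sequence, chunks = None, False, []
        elif in_sequence:
            # Sequence lines are indented and grouped in blocks of ten residues.
            chunks.append(line.replace(" ", ""))


def to_fasta(identifier, sequence, width=60):
    """Format one record as FASTA text, wrapping the sequence at `width`."""
    rows = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return ">%s\n%s\n" % (identifier, "\n".join(rows))
```

Calling read_records(open(path)) on a file holding the sample and passing each pair to to_fasta would give one FASTA entry per record. The reader never looks at the DT lines at all.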



> -----Original Message-----
> From: Jeffrey Chang [mailto:jeffrey_chang at stanfordalumni.org]
> Sent: Tuesday, May 18, 2004 5:20 PM
> To: Pierre Monestie
> Cc: biopython-dev at biopython.org
> Subject: Re: [Biopython-dev] ipi parser
> 
> 
> Hello,
> 
> These errors are nearly always due to changes in the formats of the 
> records that occur from time to time.  Do you have a sample file, or 
> accession number, that I can use to see what's going on?
> 
> Jeff
> 
> 
> On May 18, 2004, at 4:12 PM, Pierre Monestie wrote:
> 
> > Hello,
> > I'm trying to use the Swiss-Prot parser to parse IPI. I read that the
> > parser should have been fixed for IPI; however, I get an error on the
> > date when I try to parse ipi.HUMAN.
> > I get:
> >   File "dbupdate/src/python/make_sptofasta.py", line 172, in ?
> >     parseandoutput('ipi',it,fl[0],fl[1],fl[2],fl[3],fl[4])
> >   File "dbupdate/src/python/make_sptofasta.py", line 46, in parseandoutput
> >     record = it.next()
> >   File "/lbri/gen/lib/python2.2/site-packages/Bio/SwissProt/SProt.py", line 166, in next
> >     return self._parser.parse(File.StringHandle(data))
> >   File "/lbri/gen/lib/python2.2/site-packages/Bio/SwissProt/SProt.py", line 290, in parse
> >     self._scanner.feed(handle, self._consumer)
> >   File "/lbri/gen/lib/python2.2/site-packages/Bio/SwissProt/SProt.py", line 333, in feed
> >     self._scan_record(uhandle, consumer)
> >   File "/lbri/gen/lib/python2.2/site-packages/Bio/SwissProt/SProt.py", line 338, in _scan_record
> >     fn(self, uhandle, consumer)
> >   File "/lbri/gen/lib/python2.2/site-packages/Bio/SwissProt/SProt.py", line 379, in _scan_dt
> >     self._scan_line('DT', uhandle, consumer.date, exactly_one=1)
> >   File "/lbri/gen/lib/python2.2/site-packages/Bio/SwissProt/SProt.py", line 360, in _scan_line
> >     read_and_call(uhandle, event_fn, start=line_type)
> >   File "/lbri/gen/lib/python2.2/site-packages/Bio/ParserSupport.py", line 301, in read_and_call
> >     method(line)
> >   File "/lbri/gen/lib/python2.2/site-packages/Bio/SwissProt/SProt.py", line 537, in date
> >     self.data.created = cols[1], int(self._chomp(cols[3]))
> > ValueError: invalid literal for int(): Human
> >
> > Thanks in advance for your help
> > Pierre Monestie
> >
> > _______________________________________________
> > Biopython-dev mailing list
> > Biopython-dev at biopython.org
> > http://biopython.org/mailman/listinfo/biopython-dev
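
The last frame of the traceback suggests why the DT line fails: the consumer splits the line on whitespace and converts the fourth column to an integer. The sketch below reproduces that step on a DT line from the sample records and on a Swiss-Prot-style DT line as the format looked at the time (the "Rel. 13" line is an illustrative assumption, not taken from this thread). The rstrip call only approximates what _chomp might do, and DT_PATTERN and tolerant_dt are hypothetical names for one possible tolerant alternative, not the fix that went into Biopython.

```python
import re


def swissprot_style_release(line):
    # Mirrors the failing statement in the traceback: split on whitespace
    # and treat the fourth column as the release number.
    # rstrip(",;.") stands in for self._chomp(), whose exact behaviour
    # is an assumption here.
    cols = line.split()
    return cols[1], int(cols[3].rstrip(",;."))


# Hypothetical pattern that accepts an optional source prefix such as
# "IPI Rat" before "rel." and a dotted release number such as "1.9".
DT_PATTERN = re.compile(
    r"^DT\s+(?P<date>\d{2}-[A-Z]{3}-\d{4})\s+"
    r"\((?P<source>.*?)\s*rel\.\s*(?P<release>[\d.]+),\s*(?P<event>[^)]+)\)",
    re.IGNORECASE,
)


def tolerant_dt(line):
    """Return (date, source, release, event); release is kept as a string."""
    match = DT_PATTERN.match(line)
    if match is None:
        raise ValueError("unrecognised DT line: %r" % line)
    return (match.group("date"), match.group("source"),
            match.group("release"), match.group("event"))
```

With "DT   18-NOV-2003 (IPI Rat rel. 1.9, Created)" the whitespace split puts "Rat" in the fourth column, so int() raises ValueError. For ipi.HUMAN the same column would presumably hold "Human", which matches the message in the traceback. The release is returned as a string because IPI releases such as 1.9 are not integers.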


