[Biopython-dev] ipi parser
Pierre Monestie
pierre.monestie at lbri.lionbioscience.com
Wed May 19 10:33:13 EDT 2004
Sorry, my first message was too long. Here is a shorter example:
Thanks for looking at it,
Pierre
ID IPI00387610.1 IPI; PRT; 697 AA.
AC IPI00387610;
DT 18-NOV-2003 (IPI Rat rel. 1.9, Created)
DT 18-NOV-2003 (IPI Rat rel. 1.9, Last sequence update)
OS Rattus norvegicus (Rat).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.
OX NCBI_TaxID=10116;
CC -!- CHROMOSOME: 1.
DR InterPro; IPR001611; LRR.
DR InterPro; IPR007091; LRR_RNinh.
DR InterPro; IPR003590; LRR_RNinh_sub.
DR InterPro; IPR007111; NACHT_NTPase.
DR Pfam; PF05729; NACHT; 1.
DR PRINTS; PR00019; LEURICHRPT.
DR SMART; SM00368; LRR_RI; 4.
DR PROSITE; PS50503; LRR_RI; 1.
DR PROSITE; PS50837; NACHT; 1.
DR UniParc; UPI000021DDC2; -; -.
DR ENSEMBL; ENSRNOP00000030672; ENSRNOG00000021996; M.
SQ SEQUENCE 697 AA; 80092 MW; D6C61D8C95F306AF CRC64;
PLVLTDSGHS KLYQAHLKKK LTHDYARKFN IKAQDLFKQK FTQDDCDRFE NLLVSKATGK
KPHMVFLQGV AGIGKSLMLT KLMLAWSEGI VFQNKFSYIF YFCCQDVKQL KRASLAELIS
REWPNASAPT AEILSQPEKL LFIIDSLEVM ECNMSERESE LCDNCTEKQP VSLLLSSLLR
RKMLPESSFL ISATPETFEK MEDRIECTNV KIITGFNENN IKMYFRSLFQ DKNRTLEAFS
LVRENEQLFN VCQVPVLCWM VATCIKKEIE KGRDPVFICR RTTSLYTTHI FNLFTPQNAQ
YPSKKSQDQL QGLCSLAAEG MWTDTFVFSE EALRRNGILD SDIPTLLDRR ILERSKESES
CYIFLHPSLQ EVCAAVFYLL KSHLDHPSQD VKSVEALLFT FLKKAKVQWI FLGCFLFGLL
HESEQEKLEM FFGHQLSQEI KHQLYQCLET ISVNEELQEQ IDGMKLFYCL FEMEDEAFLM
QAMNCMEQIN FVAKDYSDVI VAAYCLKHCS TLKKLSFSTQ NILSEEQEHS YTEKLLICWH
HMCSVLISSK DIHVLQVKDT NLNETAFWVL YNHLKYPSCT LKVLVIAACN LSPDDCKVFA
SVLISSKMLK HLNLSSNNLD KGISSLCKAL CHPDCILKHL VVRHCLITTS GCQDLAEVLR
HNQNLRSLQV SNNKIEDAGV KLLCDAIKQP NCHLENI
//
ID IPI00187591.2 IPI; PRT; 163 AA.
AC IPI00187591;
DT 14-MAR-2003 (IPI Rat rel. 1.0, Created)
DT 18-NOV-2003 (IPI Rat rel. 1.9, Last sequence update)
OS Rattus norvegicus (Rat).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.
OX NCBI_TaxID=10116;
CC -!- CHROMOSOME: 19.
DR UniParc; UPI00001CD005; -; -.
DR ENSEMBL; ENSRNOP00000023455; ENSRNOG00000016991; M.
SQ SEQUENCE 163 AA; 18683 MW; A1998B08C0383825 CRC64;
MDALEEESFA LSFSSASDAE FDAVVGCLED IIMDAEFQLL QRSFMDKYYQ EFEDTEENKL
TYTPIFNEYI SLVEKYIEEQ LLERIPGFNM AAFTTTLQHH KDEVAGDIFD MLLTFTDFLA
FKEMFLDYRA EKEGRGLDLS SGLVVTSLCK SSSTPASQNN LRH
//
ID IPI00357878.1 IPI; PRT; 690 AA.
AC IPI00357878; IPI00201160;
DT 02-OCT-2003 (IPI Rat rel. 1.7, Created)
DT 02-OCT-2003 (IPI Rat rel. 1.7, Last sequence update)
DE SIMILAR TO ARHGEF3 PROTEIN.
OS Rattus norvegicus (Rat).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.
OX NCBI_TaxID=10116;
CC -!- CHROMOSOME: 16.
DR InterPro; IPR001849; PH.
DR InterPro; IPR000219; RhoGEF.
DR Pfam; PF00169; PH; 1.
DR Pfam; PF00621; RhoGEF; 1.
DR SMART; SM00233; PH; 1.
DR SMART; SM00325; RhoGEF; 1.
DR PROSITE; PS50010; DH_2; 1.
DR PROSITE; PS50003; PH_DOMAIN; 1.
DR ENSEMBL; ENSRNOP00000019511; ENSRNOG00000014363; -.
DR REFSEQ_XP; XP_224588; GI:34876921; M.
DR UniParc; UPI00001D0F0F; -; -.
SQ SEQUENCE 690 AA; 77882 MW; FAB74E51D05C987B CRC64;
MENSENPPVD NRTSVLHPLL RQTTQTQFVH EPFTEGIQMS ALGYLKRKRK QSAQDEDAVS
LCSLDISQPA RALLNPQQTL SERWIRDGLS ASSVVWMTER KGEKHYERER PALPVEPGIR
SSLLEAVVGV RVAAAGIVEL GPSFTRDFCC RLGSAVTSQR AGPAAAMVAK DYPFYLTVKR
ANCSLEAPLG SGVAKDEEPS NKRVKPLSRV TSLANLIPPV KTTPLKRFSQ TLQRSISFRN
ESRPDILAPR AWSRNATSSS TKRRDSKLWS ETFDVCVSQV LTAKEIKRQE AIFELSQGEE
DLIEDLKLAK KAYHDPMLKL SIMTEQELNQ IFGTLDSLIP LHEDLLSQLR DVRKPDGSTE
HVGPILVGWL PCLSSYDSYC SNQVAAKALL DHKKQDHRVQ DFLQRCLESP FSRKLDLWNF
LDIPRSRLVK YPLLLREILR HTPNDNPDQQ HLEEAINIIQ GIVAEINTKT GESECRYYKE
RLLYLEEGQK DSLIDSSRVL CCHGELKNNR GVKLHVFLFQ EVLVITRAVT HNEQLCYQLY
RQPIPVKDLT LEDLQDGEVR LGGSLRGAFS NNERIKNFFR VSFKNGSQSQ THSLQANDTF
NKQQWLNCIR QAKETVLSAA GQAGLLDSES LSQSPGTENR ELRGETKLEQ MDQSDSESDC
SMDTSEVSLE CERMEQTDAS CANSRPEENV
//
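For reference, the DT lines in the records above put the database name and species before "rel.", which is what appears to trip the parser in the traceback quoted below. The following is only a minimal sketch, not the Bio.SwissProt code: naive_release assumes the parser splits the line on whitespace and converts the fourth column to an integer (as the failing statement in the traceback suggests), and parse_dt_line with DT_PATTERN is a hypothetical, more tolerant alternative. The Swiss-Prot style line used for comparison is an illustrative example, not taken from a real entry.

```python
import re

# Hypothetical pattern: date, then "(<optional source> rel. <release>, <event>)".
DT_PATTERN = re.compile(
    r"^DT\s+(?P<date>\d{2}-[A-Z]{3}-\d{4})\s+"
    r"\((?P<source>.*?)\s*[Rr]el\.\s*(?P<release>[\d.]+),\s*(?P<event>[^)]+)\)"
)


def naive_release(line):
    """Assumed old behaviour: take the 4th whitespace column as an integer."""
    cols = line.split()
    return int(cols[3].rstrip(",."))


def parse_dt_line(line):
    """Return (date, source, release, event); release stays a string ('1.9')."""
    match = DT_PATTERN.match(line)
    if match is None:
        raise ValueError("unrecognised DT line: %r" % line)
    return (match.group("date"), match.group("source"),
            match.group("release"), match.group("event"))


ipi_line = "DT 18-NOV-2003 (IPI Rat rel. 1.9, Created)"
sprot_style_line = "DT 01-JAN-1990 (Rel. 13, Created)"  # illustrative only

print(parse_dt_line(ipi_line))
print(parse_dt_line(sprot_style_line))
try:
    naive_release(ipi_line)
except ValueError as err:
    # The 4th column is the species name ("Rat"), not a release number.
    print("naive parse failed:", err)
```

Keeping the release as a string avoids a second failure, since IPI releases such as 1.9 are not integers.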
> -----Original Message-----
> From: Jeffrey Chang [mailto:jeffrey_chang at stanfordalumni.org]
> Sent: Tuesday, May 18, 2004 5:20 PM
> To: Pierre Monestie
> Cc: biopython-dev at biopython.org
> Subject: Re: [Biopython-dev] ipi parser
>
>
> Hello,
>
> These errors are nearly always due to changes in the formats of the
> records that occur from time to time. Do you have a sample file, or
> accession number, that I can use to see what's going on?
>
> Jeff
>
>
> On May 18, 2004, at 4:12 PM, Pierre Monestie wrote:
>
> > Hello,
> > I'm trying to use the Swissprot parser to parse IPI. I read that the
> > parser should have been fixed for IPI; however, I get an error on the
> > date when I try to parse ipi.HUMAN.
> > I get:
> >   File "dbupdate/src/python/make_sptofasta.py", line 172, in ?
> >     parseandoutput('ipi',it,fl[0],fl[1],fl[2],fl[3],fl[4])
> >   File "dbupdate/src/python/make_sptofasta.py", line 46, in parseandoutput
> >     record = it.next()
> >   File "/lbri/gen/lib/python2.2/site-packages/Bio/SwissProt/SProt.py", line 166, in next
> >     return self._parser.parse(File.StringHandle(data))
> >   File "/lbri/gen/lib/python2.2/site-packages/Bio/SwissProt/SProt.py", line 290, in parse
> >     self._scanner.feed(handle, self._consumer)
> >   File "/lbri/gen/lib/python2.2/site-packages/Bio/SwissProt/SProt.py", line 333, in feed
> >     self._scan_record(uhandle, consumer)
> >   File "/lbri/gen/lib/python2.2/site-packages/Bio/SwissProt/SProt.py", line 338, in _scan_record
> >     fn(self, uhandle, consumer)
> >   File "/lbri/gen/lib/python2.2/site-packages/Bio/SwissProt/SProt.py", line 379, in _scan_dt
> >     self._scan_line('DT', uhandle, consumer.date, exactly_one=1)
> >   File "/lbri/gen/lib/python2.2/site-packages/Bio/SwissProt/SProt.py", line 360, in _scan_line
> >     read_and_call(uhandle, event_fn, start=line_type)
> >   File "/lbri/gen/lib/python2.2/site-packages/Bio/ParserSupport.py", line 301, in read_and_call
> >     method(line)
> >   File "/lbri/gen/lib/python2.2/site-packages/Bio/SwissProt/SProt.py", line 537, in date
> >     self.data.created = cols[1], int(self._chomp(cols[3]))
> > ValueError: invalid literal for int(): Human
> >
> > Thanks in advance for your help
> > Pierre Monestie
> >
> > _______________________________________________
> > Biopython-dev mailing list
> > Biopython-dev at biopython.org
> > http://biopython.org/mailman/listinfo/biopython-dev
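To locate which records in a large file such as ipi.HUMAN would raise the ValueError shown in the traceback above, one option is to scan the DT lines before handing the file to the parser. This is a rough sketch under assumptions: records are taken to end with a "//" line, the check mirrors the integer conversion of the fourth whitespace column seen in the traceback, and the helper names iter_records and unexpected_dt_lines are hypothetical rather than part of Biopython. The second record in the sample is a made-up Swiss-Prot style entry included only for contrast.

```python
def iter_records(lines):
    """Yield each record as a list of lines; a '//' line ends a record."""
    record = []
    for line in lines:
        line = line.rstrip("\n")
        if line.strip() == "//":
            if record:
                yield record
            record = []
        elif line.strip():
            record.append(line)
    if record:  # tolerate a final record without a terminator
        yield record


def unexpected_dt_lines(lines):
    """Return (record_id, dt_line) pairs whose 4th column is not an integer."""
    hits = []
    for record in iter_records(lines):
        first = record[0].split()
        record_id = first[1] if len(first) > 1 and first[0] == "ID" else "?"
        for line in record:
            if line.startswith("DT"):
                cols = line.split()
                token = cols[3].rstrip(",.") if len(cols) > 3 else ""
                if not token.isdigit():
                    hits.append((record_id, line))
    return hits


sample = [
    "ID IPI00387610.1 IPI; PRT; 697 AA.",
    "DT 18-NOV-2003 (IPI Rat rel. 1.9, Created)",
    "//",
    "ID X_TEST STANDARD; PRT; 10 AA.",       # made-up entry for contrast
    "DT 01-JAN-1990 (Rel. 13, Created)",
    "//",
]
for record_id, dt_line in unexpected_dt_lines(sample):
    print(record_id, "->", dt_line)
```

With a real file, the same function could be called on an open file handle, since it only iterates over lines.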