[Biopython] Entrez.efetch

Michiel de Hoon mjldehoon at yahoo.com
Fri Feb 26 03:47:16 UTC 2010


> ##Try on E. coli genome:
> parseGenome("CP000819.1")
> ##Try on Drosophila chromosome 4
> parseGenome("NC_004353.3")
> ##Try on Drosophila X chromosome
> parseGenome("NC_004354")

Have you tried "NC_004354.3" instead of "NC_004354"?
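
A minimal sketch of that suggestion, in case it helps: it warns when an accession has no explicit version suffix before fetching. The names has_version and parse_genome are hypothetical, and the regular expression is only an assumption about what the accessions in this thread look like, not an official NCBI rule. The efetch call simply mirrors the one in the quoted function; whether the versioned ID actually resolves the error is not established here.

```python
import re

# Assumed shape: letters/underscores, digits, then ".<version>",
# e.g. "NC_004354.3" or "CP000819.1". Illustrative only.
_VERSIONED = re.compile(r"^[A-Za-z_]+\d+\.\d+$")


def has_version(accession):
    """Return True if the accession carries an explicit version suffix."""
    return bool(_VERSIONED.match(accession))


def parse_genome(genbank_id, db="genome"):
    """Fetch a GenBank record and report its feature count.

    db defaults to the value used in the quoted post.
    """
    # Imported here so has_version() can be used without Biopython installed.
    from Bio import Entrez, SeqIO

    if not has_version(genbank_id):
        print("Warning: %s has no version suffix" % genbank_id)
    handle = Entrez.efetch(db=db, rettype="gb", id=genbank_id)
    try:
        for seq_record in SeqIO.parse(handle, "gb"):
            print("%s with %i features"
                  % (seq_record.id, len(seq_record.features)))
    finally:
        # Close the handle even if parsing raises.
        handle.close()
```

With this, has_version("NC_004354") is False, so parse_genome("NC_004354") would print the warning before attempting the download.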

--Michiel.

--- On Thu, 2/25/10, Rohan Maddamsetti <rohan.maddamsetti at gmail.com> wrote:

> From: Rohan Maddamsetti <rohan.maddamsetti at gmail.com>
> Subject: [Biopython] Entrez.efetch
> To: biopython at lists.open-bio.org
> Date: Thursday, February 25, 2010, 9:33 PM
> Hello,
> 
> I'm new to Biopython (installed yesterday), so please bear with me. This
> problem is similar to one sent to the list on Wed, Oct 8, 2008, with the
> same subject line as this email, by a Stephan. Interestingly, though, my
> code works in a couple of cases (including the chromosome input used by
> Stephan), but not in a third. I wrote the following simple function.
> 
> def parseGenome(genbank_id):
>     handle = Entrez.efetch(db="genome",rettype="gb",id=genbank_id)
>     for seq_record in SeqIO.parse(handle,"gb"):
>         print "%s with %i features" % (seq_record.id, len(seq_record.features))
>     handle.close()
> 
> ##Try on E. coli genome:
> parseGenome("CP000819.1")
> ##Try on Drosophila chromosome 4
> parseGenome("NC_004353.3")
> ##Try on Drosophila X chromosome
> parseGenome("NC_004354")
> 
> And this is the output I get:
> 
> CP000819.1 with 8759 features
> NC_004353.3 with 1191 features
> Traceback (most recent call last):
>   File "BiasCalc.py", line 48, in <module>
>     parseGenome("NC_004354")
>   File "BiasCalc.py", line 38, in parseGenome
>     for seq_record in SeqIO.parse(handle,"gb"):
>   File "/Library/Frameworks/Python.framework/Versions/6.0.4/lib/python2.6/site-packages/Bio/GenBank/Scanner.py", line 420, in parse_records
>     record = self.parse(handle, do_features)
>   File "/Library/Frameworks/Python.framework/Versions/6.0.4/lib/python2.6/site-packages/Bio/GenBank/Scanner.py", line 403, in parse
>     if self.feed(handle, consumer, do_features):
>   File "/Library/Frameworks/Python.framework/Versions/6.0.4/lib/python2.6/site-packages/Bio/GenBank/Scanner.py", line 380, in feed
>     misc_lines, sequence_string = self.parse_footer()
>   File "/Library/Frameworks/Python.framework/Versions/6.0.4/lib/python2.6/site-packages/Bio/GenBank/Scanner.py", line 762, in parse_footer
>     raise ValueError("Premature end of file in sequence data")
> ValueError: Premature end of file in sequence data
> 
> Is this a bug, or am I doing something wrong? My eventual goal is to
> iterate through the features in the seq_record and collect GC content
> statistics for the coding regions and introns.
> 
> Thanks,
> Rohan
> _______________________________________________
> Biopython mailing list  -  Biopython at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython
> 
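
The "Premature end of file in sequence data" error in the traceback is what the parser raises when the sequence section stops before the record ends, which would be consistent with (though not proof of) a truncated download. A GenBank flat-file record ends with a line containing only "//", so one rough check is to look for that terminator in the downloaded text before parsing it. The helper name looks_complete is hypothetical, and this is only a sketch of the idea, not a full validation of the file.

```python
def looks_complete(genbank_text):
    """Return True if the last non-blank line is the '//' record terminator."""
    lines = [line.strip() for line in genbank_text.splitlines() if line.strip()]
    return bool(lines) and lines[-1] == "//"
```

One way to use it would be to read the efetch handle into a string (or save it to a file), call looks_complete on the text, and only then pass it to SeqIO.parse, retrying the download if the check fails.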


      



