[Biopython-dev] [Bug 2229] GenBank Scanner fails to scan over headers
bugzilla-daemon at portal.open-bio.org
Mon Mar 12 16:05:19 UTC 2007
http://bugzilla.open-bio.org/show_bug.cgi?id=2229
biopython-bugzilla at maubp.freeserve.co.uk changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |ASSIGNED
------- Comment #1 from biopython-bugzilla at maubp.freeserve.co.uk 2007-03-12 12:05 EST -------
Link to download the file, about 28 MB
ftp://ftp.ncbi.nih.gov/genbank/gbvrl1.seq.gz
This starts:
------------------------------------------------
GBVRL1.SEQ Genetic Sequence Data Bank
February 15 2007
NCBI-GenBank Flat File Release 158.0
Viral Sequences (Part 1)
72061 loci, 66147687 bases, from 72061 reported sequences
LOCUS AB000048 2007 bp DNA linear VRL 05-FEB-1999
DEFINITION Feline panleukopenia virus DNA for nonstructural protein 1,
complete cds.
ACCESSION AB000048
...
------------------------------------------------
Much smaller test case, 81 KB compressed:
ftp://ftp.ncbi.nih.gov/genbank/gbuna.seq.gz
File starts:
------------------------------------------------
GBUNA.SEQ Genetic Sequence Data Bank
February 15 2007
NCBI-GenBank Flat File Release 158.0
Unannotated Sequences
211 loci, 114018 bases, from 211 reported sequences
LOCUS AB086827 901 bp mRNA linea
...
------------------------------------------------
In both cases, and I assume in all of these archives, a fairly uniform
header precedes the GenBank records.
I suppose we could/should spot these headers and skip them. Does anyone know
offhand whether EMBL files do anything similar?
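
As a rough illustration of the "spot these and skip them" idea, here is a minimal sketch that discards everything before the first LOCUS line. It is not the Bio.GenBank scanner code; the function name skip_header and the marker argument are made up for this example, and it assumes the header never contains a line that itself starts with "LOCUS".

```python
import io


def skip_header(handle, marker="LOCUS"):
    """Yield lines from handle, starting at the first line that begins with marker.

    Hypothetical helper for illustration only; assumes no header line
    starts with the marker text.
    """
    in_records = False
    for line in handle:
        if not in_records and line.startswith(marker):
            in_records = True
        if in_records:
            yield line


# Tiny stand-in for the start of gbuna.seq (whitespace simplified).
sample = io.StringIO(
    "GBUNA.SEQ Genetic Sequence Data Bank\n"
    "February 15 2007\n"
    "NCBI-GenBank Flat File Release 158.0\n"
    "Unannotated Sequences\n"
    "211 loci, 114018 bases, from 211 reported sequences\n"
    "LOCUS AB086827 901 bp mRNA\n"
    "ACCESSION AB086827\n"
)

records = list(skip_header(sample))
print(records[0].rstrip())
```

The header line "211 loci, ..." is not matched because the check is case-sensitive and anchored at the start of the line. A file with no LOCUS line at all simply yields nothing.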