[Biopython-dev] [Patch] Genbank Parser
Peter Cock
p.j.a.cock at googlemail.com
Thu Oct 4 09:11:01 UTC 2012
On Mon, Oct 1, 2012 at 10:44 PM, Björn Grüning <bjoern at gruenings.eu> wrote:
> Hi Peter,
>
>> >
>> > the tbl2asn tool from the NCBI creates GenBank files that do not have a
>> > version number. Unfortunately that version number is used to fill
>> > consumer.data.id.
>> > I implemented the following fall-back:
>> > If there is no version information available, then it takes
>> > consumer.data.name for consumer.data.id. Does that make sense?
>> >
>> > Thanks!
>> > Bjoern
>>
>> Can you share some example output from tbl2asn that shows
>> this problem? Ideally something small we could include as a
>> unit test.
>
> please find attached a small, stripped version of such a GenBank file.
>
> Thanks,
> Bjoern
$ python
Python 2.7.2 (default, Jun 20 2012, 16:23:33)
[GCC 4.2.1 Compatible Apple Clang 4.0 (tags/Apple/clang-418.0.60)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from Bio import SeqIO
>>> r = SeqIO.read("tbl1asn_output.gb", "gb")
/Library/Python/2.7/site-packages/Bio/GenBank/__init__.py:1158: BiopythonParserWarning: Expected sequence length 300246, found 2220 ().
  BiopythonParserWarning)
>>> r.id
''
>>> r.name
'Seq1'
>>> r.description
'Glarea strain lozoyensis.'
>>> quit()
That warning is because this test file has only the start of the sequence
present, yet the LOCUS line still gives the original length.
$ head tbl1asn_output.gb
LOCUS       Seq1                  300246 bp    DNA     linear       10-MAY-2012
DEFINITION  Glarea strain lozoyensis.
ACCESSION
VERSION
KEYWORDS    .
SOURCE      Glarea
  ORGANISM  Glarea
            Unclassified.
REFERENCE   1
  AUTHORS   Test
I didn't use your patch. Looking over the code, the intent was already
that record.name would be used if there was no record.id. Sadly the
check was too strict about None versus an empty string, so the empty id
from a blank VERSION line was not replaced. Fixed:
https://github.com/biopython/biopython/commit/e67d22e4b4f344a5a3c15b6e939c82f58986d87f
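For anyone following along, the None versus empty string distinction can
be sketched roughly as below. This is not the code from the commit; the
function names are made up for illustration, and the only assumption is
that a blank VERSION line leaves the id as an empty string, as in the
session above.

```python
def strict_fallback(record_id, record_name):
    """Hypothetical helper: only treats None as a missing id."""
    if record_id is None:
        return record_name
    return record_id


def relaxed_fallback(record_id, record_name):
    """Hypothetical helper: treats both None and '' as a missing id."""
    if not record_id:
        return record_name
    return record_id


# A blank VERSION line gives an empty string rather than None (assumption
# based on the r.id value shown in the interactive session).
print(repr(strict_fallback("", "Seq1")))   # the empty id is kept
print(repr(relaxed_fallback("", "Seq1")))  # falls back to the name
```

With the relaxed check, a record that does have a VERSION line keeps its
id unchanged, since a non-empty string is truthy.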
Thanks for your help,
Peter