[Biopython-dev] [Bug 2591] GenBank files misparsed for long organism names
bugzilla-daemon at portal.open-bio.org
Mon Dec 15 20:33:51 UTC 2008
http://bugzilla.open-bio.org/show_bug.cgi?id=2591
------- Comment #4 from joelb at lanl.gov 2008-12-15 15:33 EST -------
I heard back from GenBank, and it seems they are saying the problem isn't
theirs:
>On Tue, December 9, 2008 10:30 am, gb-admin at ncbi.nlm.nih.gov wrote:
>> Hi Joel,
>>
>> I heard back from our database folks on this one. Essentially we do
>> allow the source line to line-wrap, but we never publicly announced
>> it. We apologize for this oversight and will be putting something
>> in the release notes regarding this. Hopefully BioPython and other
>> companies will be able to pick up this change and adapt once it is
>> announced in the release notes.
>>
>> thanks for pointing it out
>>
>> Linda
I just wrote back with the followup question:
>
>OK, but then a followup question. How does one distinguish, then, a
>line-wrapped organism line from the multiline phylogeny that follows?
>According to my reading of the specs (and most Bio* GenBank parsers'
>implementations) it seems that an equally-valid parsing of the following
>ORGANISM record is that it belongs to the "AKU_12601 Bacteria" kingdom.
>That is, there is no official way of signalling "this is the end of the
>multiline organism name" or "this begins the multiline phylogeny record."
>
>  ORGANISM  Salmonella enterica subsp. enterica serovar Paratyphi A str.
>            AKU_12601
>            Bacteria; Proteobacteria; Gammaproteobacteria;Enterobacteriales;
>            Enterobacteriaceae; Salmonella.
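
To make the ambiguity concrete, here is a minimal Python sketch of one possible heuristic for separating a wrapped organism name from the lineage that follows. It is not Biopython's actual parser and not anything the GenBank spec endorses: the function name split_organism_block is hypothetical, and the rule it applies (a continuation line belongs to the lineage once a line contains ';' or ends with '.') is an assumption that happens to work for this record. It would misfire on a wrapped name whose continuation line ends in a period, which is exactly why an official rule would be preferable.

```python
def split_organism_block(lines):
    """Split the lines of an ORGANISM block into (name, lineage).

    Assumed heuristic (NOT from the GenBank spec): the first line always
    belongs to the organism name; a later line starts the lineage if it
    contains ';' or ends with '.'. Everything after that is lineage too.
    """
    stripped = [line.strip() for line in lines if line.strip()]
    if not stripped:
        return "", []

    first = stripped[0]
    if first.startswith("ORGANISM"):
        # Drop the keyword, keep the start of the organism name.
        first = first[len("ORGANISM"):].strip()

    name_parts = [first]
    lineage_lines = []
    for line in stripped[1:]:
        if lineage_lines or ";" in line or line.endswith("."):
            lineage_lines.append(line)
        else:
            # No ';' and no trailing '.': treat as a wrapped name line.
            name_parts.append(line)

    name = " ".join(name_parts)
    lineage_text = " ".join(lineage_lines).rstrip(".")
    lineage = [taxon.strip() for taxon in lineage_text.split(";") if taxon.strip()]
    return name, lineage


record = [
    "  ORGANISM  Salmonella enterica subsp. enterica serovar Paratyphi A str.",
    "            AKU_12601",
    "            Bacteria; Proteobacteria; Gammaproteobacteria;Enterobacteriales;",
    "            Enterobacteriaceae; Salmonella.",
]
name, lineage = split_organism_block(record)
print(name)     # Salmonella enterica subsp. enterica serovar Paratyphi A str. AKU_12601
print(lineage)  # ['Bacteria', 'Proteobacteria', 'Gammaproteobacteria', 'Enterobacteriales', 'Enterobacteriaceae', 'Salmonella']
```

A parser that instead treats every continuation line as lineage would return "AKU_12601 Bacteria" as the first taxon, which is the misparse described above.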