[Biopython-dev] [Bug 3069] More robust feature parser for GenBank/EMBL records
bugzilla-daemon at portal.open-bio.org
Fri May 14 13:33:48 UTC 2010
http://bugzilla.open-bio.org/show_bug.cgi?id=3069
------- Comment #12 from biopython-bugzilla at maubp.freeserve.co.uk 2010-05-14 09:33 EST -------
(In reply to comment #11)
>
> I think we should probably output all IMGT records using the increased
> indentation. This way there will be no ambiguity and no information loss. If
> you want to manually "convert" to standard EMBL format, I think the truncation
> makes sense as you proposed it, and we could issue a warning about lost
> information.
I've found a page describing the IMGT file format, and it says their feature
table indent should be 26 characters (EMBL files use 21):
http://www.ebi.ac.uk/imgt/hla/docs/manual.html
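To make the indent difference concrete, here is a minimal sketch (not the actual Biopython code) of splitting an "FT" line at a given indent, and of the truncating IMGT-to-EMBL conversion with a warning that was proposed in comment #11. The function names and constants are hypothetical, and it assumes the feature key starts at column 6 as in EMBL files:

```python
import warnings

EMBL_INDENT = 21  # EMBL feature table indent
IMGT_INDENT = 26  # IMGT feature table indent, per the IMGT manual


def split_feature_line(line, indent):
    """Split an FT line into (key, value); key is '' on continuation lines."""
    if not line.startswith("FT"):
        raise ValueError("Not a feature table line: %r" % line)
    line = line.rstrip("\n")
    return line[5:indent].strip(), line[indent:]


def imgt_to_embl_line(line):
    """Re-indent one IMGT FT line to the EMBL width, truncating long keys."""
    key, value = split_feature_line(line, IMGT_INDENT)
    max_key = EMBL_INDENT - 5 - 1  # 15 characters plus one blank column
    if len(key) > max_key:
        warnings.warn("Feature key %r truncated to %r" % (key, key[:max_key]))
        key = key[:max_key]
    return ("FT   " + key).ljust(EMBL_INDENT) + value
```

Keys that already fit in the EMBL width pass through unchanged; only the longer keys lose information, which is why a warning seems appropriate.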
>
> I have already notified IMGT regarding the ">" problem, though they seem like
> they will be slow to change it. It's a very simple fix to the flatfile, and I
> did it manually with regular expressions. My preference is that we do NOT
> support the backwards notation, as it's clearly wrong. We'll have them fix
> it. In the meanwhile, I can post my python script that corrects it somewhere
> (maybe as a gist on github) and we can just point people to it in a warning if
> they are using the IMGT parser.
>
> Regarding the 1. problem, I have not yet told the IMGT people, but I will do
> so shortly.
>
The document I found does not discuss the details of the feature location
syntax, so I would expect it to follow the same rules as EMBL (and GenBank and
the DDBJ), see:
http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html
I now agree with you that it makes sense to treat this as a new format in SeqIO
(i.e. "imgt" rather than "embl"). The new code needed should be minimal too.
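As a rough illustration of why the new code could be small, the sketch below treats "imgt" as a subclass of an "embl" feature scanner that overrides only the indent. All class and function names here are hypothetical and are not Biopython's actual API; qualifier and location parsing are deliberately left out:

```python
class EmblFeatureScanner(object):
    """Hypothetical scanner that groups FT lines by feature key."""

    HEADER_WIDTH = 21  # EMBL feature table indent

    def parse_features(self, lines):
        features = []
        for line in lines:
            if not line.startswith("FT"):
                continue  # skip non-feature lines
            key = line[5:self.HEADER_WIDTH].strip()
            value = line[self.HEADER_WIDTH:].strip()
            if key:
                features.append((key, [value]))  # start of a new feature
            elif features:
                features[-1][1].append(value)  # continuation line
        return features


class ImgtFeatureScanner(EmblFeatureScanner):
    """Same logic; only the indent differs."""

    HEADER_WIDTH = 26  # IMGT feature table indent


SCANNERS = {"embl": EmblFeatureScanner, "imgt": ImgtFeatureScanner}


def parse_features(lines, fmt):
    """Pick a scanner by format name, as a format registry would."""
    try:
        scanner = SCANNERS[fmt]()
    except KeyError:
        raise ValueError("Unknown format %r" % fmt)
    return scanner.parse_features(lines)
```

With an explicit format name there is no need to guess the indent from the file contents, so a long IMGT key is never split by mistake.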
Peter