[Biopython-dev] [Bug 2819] Bio.SeqIO support for NCBI protein tables (*.ptt files)

bugzilla-daemon at portal.open-bio.org
Thu Apr 23 09:39:11 UTC 2009


http://bugzilla.open-bio.org/show_bug.cgi?id=2819





------- Comment #4 from biopython-bugzilla at maubp.freeserve.co.uk  2009-04-23 05:39 EST -------
Just to note that Bio/SeqIO/ProteinTableIO.py needs a minor improvement to cope
with one special case - features which wrap the origin, e.g. NEQ001 in
Nanoarchaeum equitans.

ftp://ftp.ncbi.nlm.nih.gov/genomes/Bacteria/Nanoarchaeum_equitans/NC_005213.gbk
ftp://ftp.ncbi.nlm.nih.gov/genomes/Bacteria/Nanoarchaeum_equitans/NC_005213.ptt

This is the first CDS in the GenBank file, location given as:
complement(join(490883..490885,1..879))

It is the last entry in the Protein Table file:
490883..879     -       ...

All my code needs to do is spot when start > end, and then add the two
appropriate sub-features (using the known genome length, 490885) and set the
location operator to join (to match what the GenBank parser does).  I'll do
this at some point assuming there is interest in adding this parser to
Bio.SeqIO.
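
For illustration, here is a minimal sketch of the start > end check in plain
Python. It is not the actual ProteinTableIO.py code: the function names
split_wrapped_location and parse_ptt_location are hypothetical, and locations
are shown as simple 1-based inclusive (start, end) tuples rather than real
sub-feature objects. The strand column is assumed to be handled separately by
the caller.

```python
def split_wrapped_location(start, end, genome_length):
    """Return 1-based inclusive (start, end) pairs for a feature location.

    A normal feature gives a single pair. A feature that wraps the origin
    of a circular genome (start > end) gives two pairs, in the same order
    as the GenBank join(...) for that feature.
    """
    if not (1 <= start <= genome_length and 1 <= end <= genome_length):
        raise ValueError("coordinates must lie within 1..genome_length")
    if start <= end:
        # Ordinary case: one contiguous region.
        return [(start, end)]
    # Wrapped case: from start to the end of the genome, then from the origin.
    return [(start, genome_length), (1, end)]


def parse_ptt_location(text, genome_length):
    """Parse a protein table location such as '490883..879' (hypothetical helper)."""
    start_text, end_text = text.strip().split("..")
    return split_wrapped_location(int(start_text), int(end_text), genome_length)


# NEQ001 in NC_005213, genome length 490885:
print(parse_ptt_location("490883..879", 490885))
# [(490883, 490885), (1, 879)]
```

The two pairs correspond to the 490883..490885 and 1..879 parts of the GenBank
join; a real implementation would turn each into a sub-feature and set the
location operator to join.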




