[Biopython-dev] [Bug 2622] New: Parsing between position locations like 5933^5934 in GenBank/EMBL files
bugzilla-daemon at portal.open-bio.org
Tue Oct 21 11:28:57 UTC 2008
http://bugzilla.open-bio.org/show_bug.cgi?id=2622
Summary: Parsing between position locations like 5933^5934 in
GenBank/EMBL files
Product: Biopython
Version: Not Applicable
Platform: All
OS/Version: All
Status: NEW
Severity: normal
Priority: P2
Component: Main Distribution
AssignedTo: biopython-dev at biopython.org
ReportedBy: biopython-bugzilla at maubp.freeserve.co.uk
GenBank and EMBL files can contain features with locations like 123^456,
handled in Biopython as BetweenPosition objects.
Quoting ftp://ftp.ncbi.nih.gov/genbank/gbrel.txt
> A site between two residues, such as an endonuclease cleavage site, is
> indicated by listing the two bases separated by a carat (e.g., 23^24).
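To make the notation concrete, here is a minimal sketch of splitting such a location string into the two residue numbers that flank the site. The helper name parse_between is hypothetical and is not part of Biopython; it only illustrates the syntax quoted above.

```python
import re

# Hypothetical helper for illustration only; this is not Biopython's parser.
_BETWEEN_RE = re.compile(r"^(\d+)\^(\d+)$")


def parse_between(location):
    """Return the two 1-based residue numbers flanking a site like '23^24'."""
    match = _BETWEEN_RE.match(location.strip())
    if match is None:
        raise ValueError("not a between-position location: %r" % location)
    return int(match.group(1)), int(match.group(2))


print(parse_between("23^24"))
print(parse_between("5933^5934"))
```

Both numbers are kept as plain one-based residue numbers here; how they should map onto Biopython's location objects is the open question in this report.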
A small GenBank file containing such locations is NC_005816.gbk, available
here:
ftp://ftp.ncbi.nih.gov/genomes/Bacteria/Yersinia_pestis_biovar_Microtus_91001/NC_005816.gbk
e.g.
     variation       5933^5934
                     /note="compared to AL109969"
                     /replace="a"
     variation       5933^5934
                     /note="compared to AF053945"
                     /replace="aa"
For a larger example, see NC_005027.gbk
ftp://ftp.ncbi.nih.gov/genomes/Bacteria/Pirellula_sp/NC_005027.gbk
e.g.
     misc_feature    41855^41856
                     /note="cosmid pircos-a3a12/ cosmid pircos-a1d04 joining
                     point"
See also one of the Biopython unit test examples, SC10H5.embl, a pre-2006 style
EMBL file from BioPerl.
As the following example script and its output will show, Biopython CVS (and, I
presume, several releases) does not parse these locations sensibly. There are
at least two issues. Firstly, there is a numerical error: 5933^5934 is treated
as 5932^11866 (position versus extension). Secondly, these locations might be
better represented without separate start/end objects.
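The arithmetic behind the first issue can be sketched as follows. This is one plausible reading of how 5932^11866 arises, assuming the first number is converted to a zero-based position and the second number is then added to it as an extension. Both function names are hypothetical, and this is not the actual Biopython code path.

```python
def as_position_and_extension(first, second):
    """Assumed misreading: zero-based position, with 'second' as an extension."""
    position = first - 1
    return position, position + second


def as_two_positions(first, second):
    """Intended reading: the two 1-based residues on either side of the site."""
    return first, second


print(as_position_and_extension(5933, 5934))  # (5932, 11866)
print(as_two_positions(5933, 5934))           # (5933, 5934)
```

Under that assumption the first function reproduces the 5932^11866 figures reported above, while the second keeps the two numbers as written in the file.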