[BioPython] genbank parser returns start position of the location decreased by one

Michiel De Hoon mdehoon at c2b2.columbia.edu
Fri Oct 14 18:05:10 EDT 2005


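I would think that this is intentional. Python uses zero-based indexing,
whereas GenBank starts counting at 1, so the parser decreases the start
position by one. The end position stays the same because a Python slice
excludes its end index.
In other words, gb_seqrecord.seq[0:4115] will return the sequence that you're
interested in.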
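
As a rough illustration of that conversion, here is a minimal sketch in plain
Python. It does not use Biopython; the helper names genbank_to_slice and
slice_to_genbank and the short test sequence are made up for this example.

```python
def genbank_to_slice(start, end):
    """Convert a 1-based, inclusive GenBank range to a Python slice.

    Hypothetical helper for illustration only, not a Biopython function.
    """
    # Only the start shifts by one: the slice end is exclusive, so the
    # 1-based inclusive end and the 0-based exclusive end are the same number.
    return slice(start - 1, end)


def slice_to_genbank(s):
    """Convert a Python slice back to a 1-based, inclusive (start, end) pair."""
    return (s.start + 1, s.stop)


# The CDS location from the original message: GenBank shows 467..2863.
cds = genbank_to_slice(467, 2863)
print(cds.start, cds.stop)    # 466 2863
print(slice_to_genbank(cds))  # (467, 2863)

# A made-up ten-base sequence; GenBank positions 3..6 are G, T, A, C.
seq = "ACGTACGTAC"
print(seq[genbank_to_slice(3, 6)])  # GTAC
```

The number of bases in the range is end - start + 1 in GenBank terms, which
equals stop - start for the slice, so no bases are lost by the shift.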

--Michiel.

Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032



-----Original Message-----
From: biopython-bounces at portal.open-bio.org on behalf of Martin MOKREJS
Sent: Fri 10/14/2005 5:12 PM
To: biopython at biopython.org
Subject: [BioPython] genbank parser returns start position of the location decreased by one
 
Hi,
  I am either too tired or have missed some point. I use biopython 1.40b to
fetch data from GenBank. The location: (467..2863), as seen on the GenBank web
pages, differs from the string returned by biopython. I get location:
(466..2863) instead. The latter number is never decreased, only the first one.
What's wrong? ;)
http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nucleotide&val=56117851
It happens with CDS feature data, but also with source, just anything:

FEATURES             Location/Qualifiers
     source          1..4115


$ python
Python 2.4.2 (#1, Oct  2 2005, 05:43:55) 
[GCC 3.4.4 (Gentoo 3.4.4-r1, ssp-3.4.4-1.0, pie-8.7.8)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from Bio import GenBank
>>> record_parser = GenBank.FeatureParser()
>>> ncbi_dict = GenBank.NCBIDictionary('nucleotide', 'genbank', parser = record_parser)
>>> gb_seqrecord = ncbi_dict['56117851']
>>> print _feature.location
(0..4115)
>>> 



