[Biopython] Problem with parsing strand in Homo_sapiens.GRCh37.68 genbank files

Susan Wilson smwilson at hpc.unm.edu
Tue Aug 14 14:54:24 UTC 2012


Hi Peter,

Thanks for the quick response. I downloaded the files from 
ftp://ftp.ensembl.org/pub/release-68/genbank/homo_sapiens/. I have version 
1.53 of Biopython. Maybe I should try 1.60? Here are some diagnostics:

$ head Homo_sapiens.GRCh37.68.chromosome.1.dat
LOCUS       1 249250621 bp DNA HTG 14-JUL-2012
DEFINITION  Homo sapiens chromosome 1 GRCh37 full sequence 1..249250621
             reannotated via EnsEMBL
ACCESSION   chromosome:GRCh37:1:1:249250621:1
VERSION     1GRCh37
KEYWORDS    .
SOURCE      human
   ORGANISM  Homo sapiens
             .
COMMENT     This sequence was annotated by the Ensembl system. Please visit the


Output from ipython:

import sys

sys.version_info
Out[3]: (2, 6, 5, 'final', 0)

sys.version
Out[4]: '2.6.5 (r265:79063, Apr 16 2010, 13:57:41) \n[GCC 4.4.3]'

import Bio

print Bio.__version__
1.53

On 08/14/2012 08:46 AM, Peter Cock wrote:
> On Tue, Aug 14, 2012 at 3:10 PM, Susan Wilson <smwilson at hpc.unm.edu> wrote:
>> Hi,
>>
>> I am parsing the gb files with biopython. My problem is that none of the
>> seqfeature.strand values are returning the plus strand (value == 1).
> That should only happen with a protein sequence.
>
>> The commands below are a bit fabricated. (For instance, I have left out the
>> opening and closing of fout.) I have read in
>> Homo_sapiens.GRCh37.68.chromosome.1.dat using SeqIO.read.
> What URL are you getting that file from?
>
> Which version of Biopython are you using? There were some strand
> related changes recently (internally moving it from the SeqFeature to
> the SeqFeature's location object).
>
> Thanks,
>
> Peter
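
One rough way to cross-check what the parser reports is to count strands straight from the raw feature table, without going through Biopython at all. The sketch below assumes the usual GenBank layout (feature keys indented by five spaces, qualifiers indented further) and treats any location starting with "complement(" as minus strand and everything else as plus, so a join that mixes strands would be miscounted. The function name count_strands and the sample record are made up for illustration.

```python
import re
from collections import Counter

# A feature key line: exactly five leading spaces, the key, then the location.
# Qualifier and continuation lines are indented further and do not match.
FEATURE_LINE = re.compile(r"^ {5}(\S+)\s+(\S+)")


def count_strands(lines):
    """Tally (feature key, strand) pairs from GenBank-formatted lines.

    Strand is -1 when the location starts with "complement(", else 1.
    This is a simplification: mixed-strand joins are counted as plus.
    """
    counts = Counter()
    in_features = False
    for line in lines:
        if line.startswith("FEATURES"):
            in_features = True
            continue
        if not in_features:
            continue
        # A non-blank first column marks the next top-level section (e.g. ORIGIN).
        if line[:1].strip():
            break
        match = FEATURE_LINE.match(line)
        if match:
            key, location = match.groups()
            strand = -1 if location.startswith("complement(") else 1
            counts[(key, strand)] += 1
    return counts


# Tiny made-up record, only to show the shape of the result.
SAMPLE = """\
LOCUS       demo 1000 bp DNA
FEATURES             Location/Qualifiers
     source          1..1000
     gene            complement(100..200)
                     /gene="A"
     gene            300..400
     CDS             join(300..320,350..400)
ORIGIN
        1 acgt
//
""".splitlines()

sample_counts = count_strands(SAMPLE)
for (key, strand), n in sorted(sample_counts.items()):
    print(key, strand, n)
```

To run it on the real file, pass an open file handle instead of SAMPLE, for example count_strands(open("Homo_sapiens.GRCh37.68.chromosome.1.dat")). If this shows plus-strand genes while seqfeature.strand never returns 1, the problem is more likely in the parsing step than in the file.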



