[Biopython] Problem with parsing strand in Homo_sapiens.GRCh37.68 genbank files
    Susan Wilson 
    smwilson at hpc.unm.edu
       
    Tue Aug 14 14:10:53 UTC 2012
    
    
  
Hi,
I am parsing the GenBank files with Biopython. My problem is that none of the 
seqfeature.strand values come back as the plus strand (value == 1).
The commands below are slightly abridged. (For instance, I have left out 
the opening and closing of fout.) I read in 
Homo_sapiens.GRCh37.68.chromosome.1.dat using SeqIO.read. The file 
written by command [13] contains only "-1" and "None". Is there a bug in the 
parser? Or am I making a mistake of some sort?
Thanks.
Susan
In [10]: genome
Out[10]: 
SeqRecord(seq=Seq('NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN...NNN', 
Alphabet()), id='1GRCh37', name='1', description='Homo sapiens 
chromosome 1 GRCh37 full sequence 1..249250621 reannotated via EnsEMBL', 
dbxrefs=[])
In [11]: len(genome)
Out[11]: 249250621
In [12]: len(genome.features)
Out[12]: 109751
In [13]: for f in genome.features:
      ...:     fout.write(str(f.strand) + "~" + str(f.location) + \
      ...: "~" + str(f.qualifiers.get('gene')) + "\n")
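
A quicker check than writing everything to a file is to tally the strand values directly. The following is only a minimal sketch: count_strands is a hypothetical helper name (not part of Biopython), and it assumes each feature exposes a .strand attribute as in the session above.

```python
from collections import Counter


def count_strands(features):
    """Tally how often each strand value (1, -1, None, ...) occurs.

    `features` is any iterable of objects with a `.strand` attribute,
    for example `genome.features` from the session above.
    """
    return Counter(f.strand for f in features)
```

Calling count_strands(genome.features) would then show at a glance whether any feature has strand 1, without going through fout.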
    
    