[Biopython] Parsing problem

Brad Chapman chapmanb at 50mail.com
Wed Dec 9 13:38:02 UTC 2009


Iwan and Peter;

> > I am new to BioPython and stumbled upon GFF.easy while searching through the
> > API docs. Actually, what I wanted was a way to parse that location string
> > into a SeqFeature-like thing from which I could get start, end and
> > strand. Unfortunately I could not find the correct parser in Bio.GenBank -
> > any suggestions are welcome.
> 
> Right now Bio.GenBank doesn't really expose the location parsing in an
> easy to use way like Bio.GFF.easy does.

If you don't like ugly code, please avert your eyes now. This will
work with the standard GenBank parser but is definitely not
future-proof, since it relies on private members. However, it'll work
for something quick n' dirty:

from Bio.GenBank import _FeatureConsumer
from Bio.SeqFeature import SeqFeature

def gb_string_to_feature(content, use_fuzziness=True):
    """Convert a GenBank location string into a SeqFeature.
    """
    consumer = _FeatureConsumer(use_fuzziness)
    consumer._cur_feature = SeqFeature()
    consumer.location(content)
    return consumer._cur_feature

# Parenthesized so the call also runs where print is a function.
print(gb_string_to_feature('complement(NC_012967.1:3622110..3624728)'))
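
If relying on private members is a worry and only the simplest location
strings need handling, a plain regular expression may be enough. The
following is a minimal sketch and not Biopython code: the function name
parse_simple_location and the pattern are made up for illustration, and
it only understands a single "start..end" span, optionally prefixed with
an accession and optionally wrapped in complement(...). Joins, fuzzy
positions and everything else are rejected. Note that it keeps the
1-based inclusive coordinates as written in the string, whereas
Biopython's SeqFeature locations use a 0-based start.

```python
import re

# Hypothetical helper, not part of Biopython. Matches an optional
# "ACCESSION:" prefix followed by a plain "start..end" span.
_SIMPLE_LOCATION = re.compile(
    r"^(?:(?P<ref>[A-Za-z0-9_.]+):)?(?P<start>\d+)\.\.(?P<end>\d+)$"
)


def parse_simple_location(text):
    """Return (ref, start, end, strand) for a simple location string.

    start and end stay 1-based and inclusive, as written in the string.
    strand is 1, or -1 when the span is wrapped in complement(...).
    ref is None when no accession prefix is present.
    """
    text = text.strip()
    strand = 1
    if text.startswith("complement(") and text.endswith(")"):
        strand = -1
        text = text[len("complement("):-1]
    match = _SIMPLE_LOCATION.match(text)
    if match is None:
        # Joins, fuzzy positions, etc. are deliberately not handled.
        raise ValueError("unsupported location string: %r" % text)
    return (match.group("ref"), int(match.group("start")),
            int(match.group("end")), strand)


print(parse_simple_location("complement(NC_012967.1:3622110..3624728)"))
```

For anything beyond these simple cases, the GenBank parser route above
is the safer bet, since it already knows the full location syntax.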

Hope this helps,
Brad


