[Biopython] Parsing problem

Iwan Grin iwan.grin at googlemail.com
Wed Dec 9 07:33:09 EST 2009


Hi Peter, thank you for your reply.

I am new to Biopython and stumbled upon Bio.GFF.easy while searching through
the API docs. What I actually wanted was a way to parse that location string
into a SeqFeature-like object from which I could get the start, end and
strand. Unfortunately I could not find the right parser in Bio.GenBank -
any suggestions are welcome.

I agree with you that Bio.GFF.easy expects the accession number before the
complement(...) wrapper rather than inside it. (For my purpose I do not
actually need the accession number at all.)
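
For illustration, here is a minimal sketch of what I am after, written with
plain regular expressions rather than any Biopython parser. The function name
parse_coded_by is hypothetical (it is not part of Biopython), and the sketch
assumes a single start..end range, optionally wrapped in complement(...), with
the accession either inside or outside the wrapper. It does not handle join(...)
or fuzzy positions such as <123.

```python
import re


def parse_coded_by(text):
    """Hypothetical helper (not a Biopython API).

    Returns (accession, start, end, strand) for a simple "coded_by" value.
    Coordinates are returned exactly as written in the string.
    """
    # Drop any whitespace or line breaks inside the qualifier value.
    text = "".join(text.split())
    accession = None
    strand = 1

    # Form 1: complement(ACCESSION:start..end)
    if text.startswith("complement(") and text.endswith(")"):
        strand = -1
        text = text[len("complement("):-1]

    # Split off the accession, if one is present.
    if ":" in text:
        accession, text = text.split(":", 1)

    # Form 2: ACCESSION:complement(start..end)
    if text.startswith("complement(") and text.endswith(")"):
        strand = -1
        text = text[len("complement("):-1]

    match = re.fullmatch(r"(\d+)\.\.(\d+)", text)
    if match is None:
        raise ValueError("unsupported location string: %r" % text)
    return accession, int(match.group(1)), int(match.group(2)), strand


print(parse_coded_by("complement(NC_012967.1:\n3622110..3624728)"))
print(parse_coded_by("NC_012967.1:complement(3622110..3624728)"))
```

Both calls should give the same accession, start, end and a strand of -1, so
the position of the accession relative to complement(...) would no longer
matter.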

Iwan

2009/12/9 Peter <biopython at maubp.freeserve.co.uk>

> On Tue, Dec 8, 2009 at 10:43 PM, Peter <biopython at maubp.freeserve.co.uk> wrote:
> > On Tue, Dec 8, 2009 at 6:52 PM, Iwan Grin <iwan.grin at googlemail.com> wrote:
> >> Hi all,
> >>
> >> I am having a little problem while trying to parse a GenBank (or rather
> >> GenPept) file using Biopython. I am trying to extract the position on
> >> the genome from the "coded_by" qualifier of the CDS feature of a
> >> protein.
> >>
> >> The "coded_by" string in this specific case looks like this:
> >>
> >> 'complement(NC_012967.1:
> >> 3622110..3624728)'
> >
> > Oh, one of those tricky cross references to another file :(
>
> It looks like the Bio.GFF.easy code expects that to be formatted
> as NC_012967.1:complement(3622110..3624728) and not as
> complement(NC_012967.1:3622110..3624728)
>
> Peter
>

