[Biopython-dev] Bio.GFF and Brad's code
Michiel de Hoon
mjldehoon at yahoo.com
Fri Apr 17 12:44:34 EDT 2009
--- On Fri, 4/17/09, Brad Chapman <chapmanb at 50mail.com> wrote:
> The GFF parser right now is really generating SeqFeature
> objects for each GFF line; the top level SeqRecords are a
> collection that holds the individual features. The SeqFeature
> object is pretty similar to GFF and the generic object you are
> proposing. For instance, here is a GFF line and the relevant
> attributes from SeqFeature for the line:
>
> I Orfeome PCR_product 12759747 12764936 . - . PCR_product "mv_B0019.1" ; Amplified 1 ; Amplified 1
>
> type: PCR_product
> location: [12759746:12764936]
> strand: -1
> qualifiers:
> Key: amplified, Value: ['1']
> Key: pcr_product, Value: ['mv_B0019.1']
> Key: source, Value: ['Orfeome']
>
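The mapping in the quoted example can be sketched in plain Python. This is only an illustration under stated assumptions: `parse_gff2_line` is a made-up helper, not part of BCBio.GFF or Biopython; it assumes whitespace-separated columns as the line appears above, lowercases qualifier keys as in the quoted output, and collapses repeated key/value pairs.

```python
def parse_gff2_line(line):
    """Turn one GFF2-style line into a plain dict (hypothetical helper)."""
    # The first eight columns are fixed; the remainder is the attribute text.
    cols = line.rstrip("\n").split(None, 8)
    seqid, source, ftype, start, end, _score, strand, _frame = cols[:8]
    attributes = cols[8] if len(cols) > 8 else ""

    qualifiers = {"source": [source]}
    for chunk in attributes.split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue
        key, _, value = chunk.partition(" ")
        value = value.strip().strip('"')
        values = qualifiers.setdefault(key.lower(), [])
        if value not in values:  # assumption: duplicate pairs collapse
            values.append(value)

    return {
        "seqid": seqid,
        "type": ftype,
        # GFF positions are 1-based and inclusive; the location shown in the
        # mail is 0-based and end-exclusive, so only the start shifts by one.
        "location": (int(start) - 1, int(end)),
        "strand": {"+": 1, "-": -1}.get(strand),
        "qualifiers": qualifiers,
    }


line = ('I Orfeome PCR_product 12759747 12764936 . - . '
        'PCR_product "mv_B0019.1" ; Amplified 1 ; Amplified 1')
feature = parse_gff2_line(line)
```

Here `feature["location"]` is a plain tuple standing in for the SeqFeature location; the real parser builds SeqFeature objects instead.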
Just to make sure I understand how this works, looking at your previous code example:
>>> from BCBio.GFF.GFFParser import GFFAddingIterator
>>> gff_iterator = GFFAddingIterator()
>>> rec_dict = gff_iterator.get_all_features(gff_file)
> The returned dictionary is like a dictionary from SeqIO.to_dict;
> keys are ids and values are SeqRecords.
What will be the key in rec_dict for the example GFF line above? Is it the "I" in the first column, as in
rec_dict["I"] = a SeqRecord with the SeqFeature you described above?
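To make the question concrete, here is a plain-Python sketch of the behaviour being guessed at, where lines are grouped under their first column. Whether GFFAddingIterator actually keys on that column is exactly what is being asked; `group_by_first_column` is a hypothetical helper, not part of BCBio.GFF, and it stores raw lines rather than SeqRecords.

```python
from collections import defaultdict


def group_by_first_column(gff_lines):
    """Group GFF lines by the value in their first column (hypothetical)."""
    records = defaultdict(list)
    for line in gff_lines:
        line = line.strip()
        # Skip blank lines and comment/directive lines.
        if not line or line.startswith("#"):
            continue
        seqid = line.split(None, 1)[0]
        records[seqid].append(line)
    return dict(records)


example_lines = [
    "##gff-version 2",
    'I Orfeome PCR_product 12759747 12764936 . - . '
    'PCR_product "mv_B0019.1" ; Amplified 1 ; Amplified 1',
]
grouped = group_by_first_column(example_lines)
```

Under this guess, `grouped` has the single key "I", holding the one feature line from the example.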
Best,
--Michiel