[Biopython] Storing SeqRecord objects with annotation
Hilmar Lapp
hlapp at gmx.net
Thu Jul 23 09:01:29 EDT 2009
On Jul 23, 2009, at 6:20 AM, Peter wrote:
> Currently the BioSQL schema doesn't have any explicit support
> for "per letter annotation"
I haven't been following the thread closely and so may be missing what
is really meant by this. If, however, you mean associating annotation
to a specific letter (position) in the sequence, BioSQL does support
this - you'd create a seqfeature with appropriate location, and attach
the annotation to the seqfeature.
Bioentry annotations are location-less, by comparison.
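
To make the seqfeature approach concrete, here is a rough sketch using SQLite from Python. The three tables are heavily simplified stand-ins for BioSQL's seqfeature, location, and seqfeature_qualifier_value tables: the real schema references ontology terms through term_id keys rather than plain text columns, and the helper names (annotate_position, annotations_at), the "per_letter_annotation" type, and the example qualifier are hypothetical. Positions are assumed to be 1-based.

```python
import sqlite3

# Simplified stand-ins for three BioSQL tables. The real schema uses
# term_id foreign keys into the term table; plain text columns are an
# assumption made here to keep the sketch short.
SCHEMA = """
CREATE TABLE seqfeature (
    seqfeature_id INTEGER PRIMARY KEY,
    bioentry_id   INTEGER NOT NULL,
    type_name     TEXT NOT NULL
);
CREATE TABLE location (
    location_id   INTEGER PRIMARY KEY,
    seqfeature_id INTEGER NOT NULL REFERENCES seqfeature(seqfeature_id),
    start_pos     INTEGER NOT NULL,
    end_pos       INTEGER NOT NULL
);
CREATE TABLE seqfeature_qualifier_value (
    seqfeature_id INTEGER NOT NULL REFERENCES seqfeature(seqfeature_id),
    name          TEXT NOT NULL,
    value         TEXT NOT NULL
);
"""


def annotate_position(conn, bioentry_id, position, name, value):
    """Attach one annotation to a single position (hypothetical helper)."""
    cur = conn.execute(
        "INSERT INTO seqfeature (bioentry_id, type_name) VALUES (?, ?)",
        (bioentry_id, "per_letter_annotation"),
    )
    feature_id = cur.lastrowid
    # A single-letter location has identical start and end positions.
    conn.execute(
        "INSERT INTO location (seqfeature_id, start_pos, end_pos) "
        "VALUES (?, ?, ?)",
        (feature_id, position, position),
    )
    conn.execute(
        "INSERT INTO seqfeature_qualifier_value (seqfeature_id, name, value) "
        "VALUES (?, ?, ?)",
        (feature_id, name, value),
    )
    return feature_id


def annotations_at(conn, bioentry_id, position):
    """Return (name, value) pairs for features covering the position."""
    rows = conn.execute(
        "SELECT q.name, q.value "
        "FROM seqfeature f "
        "JOIN location l ON l.seqfeature_id = f.seqfeature_id "
        "JOIN seqfeature_qualifier_value q "
        "  ON q.seqfeature_id = f.seqfeature_id "
        "WHERE f.bioentry_id = ? AND l.start_pos <= ? AND l.end_pos >= ? "
        "ORDER BY f.seqfeature_id",
        (bioentry_id, position, position),
    )
    return rows.fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
annotate_position(conn, 1, 42, "phred_quality", "37")
```

Calling annotations_at(conn, 1, 42) then returns the qualifier attached to that position. One feature per letter is verbose for dense data such as quality scores, which is probably why this may not be what you have in mind.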
>
> The GenBank file format simply doesn't have a concept of "per
> letter annotation"
Since it does in the above sense, I'm inclined to assume that you
really do mean something different than the above?
> [...]
> You can record any object in the SeqRecord's annotation dictionary.
> However, saving the result to a file will be tricky - and it wouldn't
> work in BioSQL either.
Note that that's not entirely true. If you have a textual
serialization (such as XML) of your object, you *can* store it in
bioentry_qualifier_value. This is what we do in BioPerl with a TagTree
annotation object that supports a nested hierarchical annotation
structure needed for lossless representation of some UniProt lines.
Obviously, that won't allow you to query very well by individual
elements of your custom annotation object. But you can build a custom
index (e.g., using Lucene) that does that.
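
As an illustration of the textual-serialization idea, here is a minimal Python sketch that turns a nested annotation into an XML string and back, using only the standard library. This is not how BioPerl's TagTree does it; the to_xml and from_xml helpers, the "item" tag used for list entries, and the example annotation are all made up for the sketch. The resulting string is the kind of value you could put into the value column of bioentry_qualifier_value.

```python
import xml.etree.ElementTree as ET


def to_xml(tag, value):
    """Recursively convert nested dicts, lists and scalars to an Element."""
    elem = ET.Element(tag)
    if isinstance(value, dict):
        for key, child in value.items():
            elem.append(to_xml(key, child))
    elif isinstance(value, list):
        # List entries get a fixed "item" tag (an assumption of this sketch).
        for item in value:
            elem.append(to_xml("item", item))
    else:
        elem.text = str(value)
    return elem


def from_xml(elem):
    """Inverse of to_xml; leaf values always come back as strings."""
    children = list(elem)
    if not children:
        return elem.text or ""
    if all(child.tag == "item" for child in children):
        return [from_xml(child) for child in children]
    return {child.tag: from_xml(child) for child in children}


# Hypothetical nested annotation; keys must be valid XML tag names.
annotation = {
    "comment": {
        "topic": "FUNCTION",
        "sources": ["source_a", "source_b"],
    }
}

# Serialize to text suitable for a single qualifier value.
xml_text = ET.tostring(to_xml("annotation", annotation), encoding="unicode")

# Parse the stored text back into the nested structure.
restored = from_xml(ET.fromstring(xml_text))
```

The round trip is only lossless for string leaves, and a dict whose only keys are "item" would be read back as a list, so a real implementation would need a less ambiguous encoding.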
-hilmar
--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================