[BioSQL-l] Storing "per letter" annotation?

Richard Holland dicknetherlands at gmail.com
Sun May 25 11:55:32 UTC 2008


For what it's worth, BioJava lets you define a sequence as a list of
symbols, and each symbol can carry as much information as you want.
For example, if you treat DNA as an alphabet of A, T, C, G, etc. and
quality scores as an alphabet consisting of the integers, then to
construct a quality-scored sequence you use BioJava to make a
cross-product alphabet of the two, in which each symbol in the
sequence is actually a pair of symbols, one from each alphabet. This
means you can combine any number of alphabets to define complex and
informative objects to represent each symbol in your sequence.
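
To make the idea concrete, here is a rough, language-neutral sketch in
Python. It is not BioJava code and does not use BioJava's API; the names
CrossProductAlphabet, is_dna, is_integer and make_sequence are all
hypothetical, made up for illustration. Each component alphabet is
modelled as a simple membership test, so an unbounded alphabet such as
the integers needs no special handling.

```python
from typing import Callable, List, Sequence, Tuple

# A component alphabet is modelled as a membership test (an assumption of
# this sketch), so unbounded alphabets such as the integers are allowed.
Alphabet = Callable[[object], bool]


def is_dna(symbol: object) -> bool:
    """Membership test for a plain four-letter DNA alphabet."""
    return symbol in {"A", "C", "G", "T"}


def is_integer(symbol: object) -> bool:
    """Membership test for the integers (bool is excluded on purpose)."""
    return isinstance(symbol, int) and not isinstance(symbol, bool)


class CrossProductAlphabet:
    """Hypothetical cross product of any number of component alphabets."""

    def __init__(self, *components: Alphabet) -> None:
        self.components = components

    def contains(self, symbol: object) -> bool:
        # A compound symbol is a tuple holding one part per component alphabet.
        return (
            isinstance(symbol, tuple)
            and len(symbol) == len(self.components)
            and all(test(part) for test, part in zip(self.components, symbol))
        )

    def make_sequence(self, *tracks: Sequence) -> List[Tuple]:
        # One track per component alphabet, all of the same length.
        if len(tracks) != len(self.components):
            raise ValueError("need exactly one track per component alphabet")
        if len({len(track) for track in tracks}) > 1:
            raise ValueError("all tracks must have the same length")
        symbols = list(zip(*tracks))
        for position, symbol in enumerate(symbols):
            if not self.contains(symbol):
                raise ValueError(f"invalid symbol {symbol!r} at position {position}")
        return symbols


# A quality-scored DNA sequence: each position pairs a base with a score.
dna_x_quality = CrossProductAlphabet(is_dna, is_integer)
scored = dna_x_quality.make_sequence("ACGT", [30, 31, 12, 40])
```

Adding a third track (a per-base float, say) would just mean passing a
third membership test and a third list of the same length.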

cheers,
Richard.

2008/5/25 Peter <biopython at maubp.freeserve.co.uk>:
> Hilmar wrote:
>>> It sounds like in essence you want to store alternative sequences in other
>>> alphabets for a sequence?
>
> Peter wrote:
>> I hadn't thought of it like that, but for many of the examples it
>> would just be one character per letter of sequence, so could be held
>> as an alternative sequence.  This doesn't really extend to cover
>> things like a list of integers or a list of floats, but would
>> certainly cover a number of use-cases.
>
> Now that I know which bits of BioPerl to search for, I see there has
> been some similar BioSQL discussion in the past, e.g.
> http://bioperl.org/pipermail/bioperl-l/2005-July/019280.html
>
> Hilmar wrote:
>>> In BioPerl we have Bio::Seq::SeqWithQuality and the more generic
>>> Bio::Seq::MetaI.
>
> I had wondered what metals had to do with sequences; in a different
> font, MetaI is of course short for MetaInformation!
>
> Peter
>
> P.S. I'll be away next week, so I probably won't follow up on this
> topic immediately.
> _______________________________________________
> BioSQL-l mailing list
> BioSQL-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biosql-l
>


