[Biojava-l] Gap: Basis Symbol vs Symbol
Matthew Pocock
mrp@sanger.ac.uk
Fri, 09 Feb 2001 11:05:40 +0000
Emig, Robin wrote:
> I have some code that uses codons with ambiguous bases via
> BasisSymbol. The problem is that I also try to deal with gap symbols at the
> same time. I thought the idea behind the gap symbol was that it would be
> universal, i.e. gap or gapxgapxgap would be the same symbol. However, I can't
> use my current code like that because I need to call getSymbols() on standard
> codons. This comes from BasisSymbol. Since gap only implements Symbol,
> my code blows up when a gap is thrown in.
> We could have gap implement BasisSymbol or AtomicSymbol; any ideas
> why not?
> The workaround is that I will create a basis symbol which is
> gapxgapxgap and try to deal with it differently elsewhere (I really liked
> being able to just say AlphabetManager.getGapSymbol() to deal with gaps).
> -Robin
Hi Robin,
Could you post the use case with the codons that is causing you
trouble? There may be a quick fix, or it may be showing up a fatal flaw
with gaps.
A column of gaps in an alignment (empty column) is not the same as a gap
in the whole alignment. The first case means that every sequence happens
to be gapped at that position and is modeled correctly by gap^n, easily
obtainable by passing a list of gap symbols into the alphabet's
getSymbol(list) method - this will return a BasisSymbol that spans a
null sub-space of the cross-product alphabet. The second case is modeled
correctly by the alignment being behind a GappedSymbolList that inserts a
gap at the required position.
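
As a rough illustration of the first case, here is a toy sketch in plain
Java. It deliberately does not use the BioJava API: GapSketch, GAP,
gapColumn and isEmptyColumn are hypothetical names, and a column is
modeled as a plain list of strings rather than a BasisSymbol. It only
shows that gap^n (one gap per sequence) is a different object from the
single gap, and that its shape depends on n.

```java
import java.util.Collections;
import java.util.List;

// Toy model only: nothing here is a BioJava class or method.
final class GapSketch {
    // Hypothetical stand-in for the single gap symbol.
    static final String GAP = "-";

    // Builds gap^n: one gap per sequence in an n-way column.
    public static List<String> gapColumn(int n) {
        return Collections.nCopies(n, GAP);
    }

    // True if the column is non-empty and every entry in it is a gap.
    public static boolean isEmptyColumn(List<String> column) {
        if (column.isEmpty()) {
            return false;
        }
        for (String s : column) {
            if (!GAP.equals(s)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> codonGap = gapColumn(3);               // gap x gap x gap
        List<String> singleGap = Collections.singletonList(GAP);

        // Both are all-gap columns, but they have different shapes,
        // so they do not compare equal.
        System.out.println(isEmptyColumn(codonGap));
        System.out.println(isEmptyColumn(singleGap));
        System.out.println(codonGap.equals(singleGap));
    }
}
```

In this sketch a three-way gap column and a one-way gap column both count
as "empty", yet they are not interchangeable, which is the same reason a
codon-shaped gap has to be built explicitly rather than reusing the
single gap.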
I will double-check the documentation before release time, but the
algebra for symbols definitely requires gap^n to be distinct from gap to
make everything work out (it is clearly a differently shaped null
sub-space).
Matthew