[Biojava-dev] Problems with cross-product alphabets and alignments

Lachlan James Coin lc1 at sanger.ac.uk
Fri Mar 14 15:50:31 EST 2003


Quoting Matthew Pocock <matthew_pocock at yahoo.co.uk>:

> Lachlan James Coin wrote:
> > Hi,
> > 
> > The one which I can fix:  FlexibleAlignment has addSequence() and
> > removeSequence
> 
> <snip/>
> 
> Go ahead and fix this.
> 
> > 
> > The one which I can't fix:  in the symbolAt() method of AbstractULAlignment
> > there is the following line
> >  alphabet.getSymbol(list);
> > 
> > where alphabet is a cross-product alphabet and list is a List of symbols.
> > However, if there is a gap symbol in this list, the symbol returned by the
> > cross-product alphabet is the gap symbol from the cross-product alphabet,
> > not the BasisSymbol comprising all of the symbols in the list.
> 
> That is a bug. The alphabet should be delegating to a method on 
> AlphabetManager, which should handle this case. To track this down, we 
> would need to know which alphabet impl is being used here, and trace its 
> getSymbol() method.

The alphabet being used is a SparseCrossProductAlphabet.  However, its
getSymbol() method is inherited from AbstractAlphabet.  The problem seems to be
in the following code within AbstractAlphabet.getSymbol().  Here atomic is a
count of the atomic symbols in the list of symbols (syms).  Because a gap is
not an atomic symbol it isn't counted, and hence atomic < syms.size() whenever
the list contains a gap, so the else branch is taken.  One way to fix this is
to make gap an atomic symbol, but this causes lots of other problems (at least
the validate() methods fail).

    if (atomic == syms.size()) {
      // every symbol in syms is atomic
      return getSymbolImpl(syms);
    } else {
      // at least one symbol in syms is not atomic (e.g. a gap)
      return AlphabetManager.createSymbol(
        '*', Annotation.EMPTY_ANNOTATION,
        syms, this
      );
    }
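
To make the counting concrete, here is a small stand-alone sketch of the
test above.  It does not use the BioJava classes: Sym, GAP, countAtomic()
and branchFor() are hypothetical stand-ins invented for illustration, and
the only assumption carried over is that a gap is not counted as atomic.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins, NOT the BioJava classes: just enough to model
// the "count the atomic symbols" test described in this message.
class GapCountSketch {
    /** Minimal symbol model: a name plus an "is atomic" flag. */
    static final class Sym {
        final String name;
        final boolean atomic;
        Sym(String name, boolean atomic) {
            this.name = name;
            this.atomic = atomic;
        }
    }

    // Assumption taken from the message: the gap is not an atomic symbol.
    static final Sym GAP = new Sym("-", false);

    /** Counts the atomic symbols in a column, like the 'atomic' variable. */
    static int countAtomic(List<Sym> syms) {
        int atomic = 0;
        for (Sym s : syms) {
            if (s.atomic) {
                atomic++;
            }
        }
        return atomic;
    }

    /** Names the branch that the if/else above would take for this column. */
    static String branchFor(List<Sym> syms) {
        return countAtomic(syms) == syms.size() ? "getSymbolImpl" : "createSymbol";
    }

    public static void main(String[] args) {
        Sym a = new Sym("a", true);
        Sym g = new Sym("g", true);
        System.out.println(branchFor(Arrays.asList(a, g)));   // getSymbolImpl
        System.out.println(branchFor(Arrays.asList(a, GAP))); // createSymbol
    }
}
```

So in this model any column containing a gap falls through to the
createSymbol branch, which matches the behaviour I am seeing.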

Any ideas?

Thanks,

Lachlan



