[Biojava-l] RE: Bug in HashedAlphabetIndex??
Matthew Pocock
mrp@sanger.ac.uk
Thu, 08 Mar 2001 11:07:35 +0000
Schreiber, Mark wrote:
>> -----Original Message-----
>> From: Matthew Pocock [mailto:mrp@sanger.ac.uk]
>> Sent: Thursday, March 08, 2001 5:00 AM
>> To: Schreiber, Mark
>> Cc: 'biojava-l@biojava.org'
>> Subject: Re: [Biojava-l] RE: Bug in HashedAlphabetIndex??
>>
>>
>> Hi Mark,
>>
>> I've fixed this on the main trunk. Thomas, could you port this to the
>> 1.1 branch?
>>
>
>
> Great, seems to work now. What was the problem??
>
We had originally copied the implementation of the iterator method
directly from SimpleCrossProductAlphabet, which stores symbols in a Map
and uses map.values().iterator(). SparseCrossProductAlphabet populates a
Map as symbols are required, which spreads the initialization cost and,
for simple cases (like alignments), vastly reduces the number of
symbols actually instantiated. This meant that the symbols iterator from
an un-populated alphabet didn't iterate over all symbols, just the ones
that had been explicitly asked for. This is now fixed by providing a
nifty implementation of Iterator - see the source code for more
details. The same trick can be used to build alphabet indexers for
large alphabets (any takers?).
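
For anyone who doesn't want to dig through the source, here is a rough
sketch of the general idea, not the actual BioJava code. The class name
SparseCrossProductSketch, its methods, and the use of plain Strings in
place of Symbol objects are all made up for illustration. The iterator
walks the factor alphabets like an odometer and creates each composite
symbol on demand, instead of iterating over whatever happens to be in
the cache.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;

// Hypothetical stand-in for a sparse cross-product alphabet (not a BioJava class).
// Strings play the role of symbols to keep the sketch self-contained.
class SparseCrossProductSketch {
    private final List<List<String>> factors;
    // Composite symbols are created and cached only when first requested.
    private final Map<List<String>, String> cache = new HashMap<>();

    SparseCrossProductSketch(List<List<String>> factors) {
        this.factors = factors;
    }

    // Create-on-demand lookup: this is what keeps the alphabet sparse.
    String getSymbol(List<String> parts) {
        return cache.computeIfAbsent(new ArrayList<>(parts),
                p -> "(" + String.join(" ", p) + ")");
    }

    // Number of composite symbols instantiated so far.
    int instantiated() {
        return cache.size();
    }

    // The problematic version would be: return cache.values().iterator();
    // which only sees symbols that have already been asked for.
    Iterator<String> iterator() {
        return new Iterator<String>() {
            // One position per factor alphabet, advanced like an odometer.
            private final int[] pos = new int[factors.size()];
            private boolean done =
                    factors.isEmpty() || factors.stream().anyMatch(List::isEmpty);

            @Override
            public boolean hasNext() {
                return !done;
            }

            @Override
            public String next() {
                if (done) {
                    throw new NoSuchElementException();
                }
                // Read off the current combination of factor symbols.
                List<String> parts = new ArrayList<>();
                for (int k = 0; k < pos.length; k++) {
                    parts.add(factors.get(k).get(pos[k]));
                }
                // Advance the rightmost position, carrying to the left on wrap.
                int i = pos.length - 1;
                while (i >= 0 && ++pos[i] == factors.get(i).size()) {
                    pos[i] = 0;
                    i--;
                }
                if (i < 0) {
                    done = true; // every position wrapped: all combinations visited
                }
                return getSymbol(parts);
            }
        };
    }
}
```

With two four-letter factor alphabets, this iterator visits all 16
composite symbols even if only one of them has been requested
beforehand, whereas iterating over the cache would return just that
one. A position-to-symbol mapping of the same odometer kind is
presumably what an indexer for a large alphabet would need, but that
part is speculation on my side.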
M