[Biojava-l] SymbolTokenization landing
Thomas Down
td2@sanger.ac.uk
Fri, 16 Nov 2001 16:30:55 +0000
Hi...
A while back, I posted a patch which replaced the current SymbolParser
objects with SymbolTokenizations, which encapsulate both
Symbol -> string and string -> Symbol mappings in a single object.
I've been maintaining this code as a `branch' of the main development,
including all the changes from the trunk. It all seems to be nice
and stable.
Anyway, I'd like to see this code checked in soon. It would certainly
be worth getting this change out of the way before we start on any
naming and directory work. Therefore, unless there are any objections,
I'm planning to check the code in on Monday or Tuesday of next week.
This change will require modifications to some (hopefully not too
many) applications. The changes which might affect existing code
are:
- The getParser() method on Alphabets has been replaced by
getTokenization(), which returns a SymbolTokenization object.
- SymbolTokenizations take over the role of SymbolParsers.
However, they have no direct equivalent of:
SymbolList sl = symParser.parse("agttcga");
Instead, use the constructor:
new SimpleSymbolList(tokenization, "agttcga");
(or, of course, use one of the various convenience methods, such as
DNATools.createDNA()).
- Symbols no longer have a getToken() method. Code which uses
this will have to either:
+ use getName() instead
+ get a SymbolTokenization from the appropriate Alphabet, then
use the tokenizeSymbol method.
+ For the specific case of DNA, there is a convenient method
DNATools.dnaToken(symbol);
added by popular request.
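
For anyone who wants a feel for the idea before looking at the patch,
here is a minimal, self-contained sketch of what "both directions in a
single object" means. It is a toy model only: TokenizationSketch, Base,
CharTokenization, register(), parseToken(), parse() and tokenize() are
hypothetical names made up for this illustration, not the BioJava
classes or signatures (only the name tokenizeSymbol is borrowed from
the list above). Note that the symbols here carry a name but no token
of their own, so the tokenization object is the only place that knows
about characters.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: none of these types are BioJava classes.
public class TokenizationSketch {

    /** Hypothetical stand-in for a symbol: it has a name, but no getToken(). */
    public enum Base { ADENINE, CYTOSINE, GUANINE, THYMINE }

    /** Hypothetical stand-in for a tokenization: both mappings live in one object. */
    public static final class CharTokenization {
        private final Map<Character, Base> byToken = new HashMap<>();
        private final Map<Base, Character> bySymbol = new EnumMap<>(Base.class);

        /** Registers one token/symbol pair in both directions at once. */
        public void register(char token, Base symbol) {
            byToken.put(token, symbol);
            bySymbol.put(symbol, token);
        }

        /** string -> Symbol direction, for a single token. */
        public Base parseToken(char token) {
            Base symbol = byToken.get(token);
            if (symbol == null) {
                throw new IllegalArgumentException("Unknown token: " + token);
            }
            return symbol;
        }

        /** Symbol -> string direction, for a single symbol. */
        public char tokenizeSymbol(Base symbol) {
            Character token = bySymbol.get(symbol);
            if (token == null) {
                throw new IllegalArgumentException("No token for: " + symbol);
            }
            return token;
        }

        /** Parses a whole string into a list of symbols. */
        public List<Base> parse(String text) {
            List<Base> symbols = new ArrayList<>();
            for (char c : text.toCharArray()) {
                symbols.add(parseToken(c));
            }
            return symbols;
        }

        /** Writes a list of symbols back out as a string. */
        public String tokenize(List<Base> symbols) {
            StringBuilder sb = new StringBuilder();
            for (Base b : symbols) {
                sb.append(tokenizeSymbol(b));
            }
            return sb.toString();
        }
    }

    /** Builds the toy DNA tokenization used in the example below. */
    public static CharTokenization dnaTokenization() {
        CharTokenization t = new CharTokenization();
        t.register('a', Base.ADENINE);
        t.register('c', Base.CYTOSINE);
        t.register('g', Base.GUANINE);
        t.register('t', Base.THYMINE);
        return t;
    }

    public static void main(String[] args) {
        CharTokenization t = dnaTokenization();
        List<Base> seq = t.parse("agttcga");
        System.out.println(seq.size());        // prints 7
        System.out.println(seq.get(0).name()); // prints ADENINE
        System.out.println(t.tokenize(seq));   // prints agttcga
    }
}
```

Because each pair is registered once and stored in both maps, the two
directions cannot drift apart, which is the main attraction over
keeping a parser and a per-symbol token separately.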
The patch is a bit big to send out to the list, but I'll send a copy
by e-mail or whatever to anyone who's interested.
Thomas.