[Biojava-dev] [Bug 2854] New: Selection of protein alphabet is hardcoded in ProteinTools class
bugzilla-daemon at portal.open-bio.org
Wed Jun 10 21:59:30 UTC 2009
http://bugzilla.open-bio.org/show_bug.cgi?id=2854
Summary: Selection of protein alphabet is hardcoded in ProteinTools class
Product: BioJava
Version: live (CVS source)
Platform: All
OS/Version: All
Status: NEW
Severity: normal
Priority: P2
Component: seq
AssignedTo: biojava-dev at biojava.org
ReportedBy: mdharsee at ocbn.ca
In our application we are calling createProtein() in class
org.biojava.bio.seq.ProteinTools to generate SymbolList objects to encapsulate
peptide sequences that are composed of the 20 common amino acid symbols, as
well as the 'X' ambiguity symbol.
However, createProtein() forces the selection of the PROTEIN-TERM alphabet from
AlphabetManager.xml through its call to getTAlphabet(), as copied below:
    public static SymbolList createProtein(String theProtein)
            throws IllegalSymbolException
    {
        SymbolTokenization p = null;
        try {
            p = getTAlphabet().getTokenization("token");
        } catch (BioException e) {
            throw new BioError("Something has gone badly wrong with Protein", e);
        }
        return new SimpleSymbolList(p, theProtein);
    }
This selection should be made based on the symbol content of the input
sequence(s) rather than being hardcoded. The PROTEIN-TERM alphabet should be
selected only if the input data contains the symbol 'TER' (terminus) or an
ambiguity symbol that spans the PROTEIN-TERM alphabet. Otherwise the simpler
PROTEIN alphabet should be selected.
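To illustrate the rule described above, here is a minimal sketch in plain
Java. It is not BioJava code: the class and method names (AlphabetChooser,
chooseAlphabetName) are hypothetical, and it assumes the terminus symbol
appears as '*' in the input string. Ambiguity symbols that span PROTEIN-TERM
are not handled here.

```java
// Hypothetical helper, not part of BioJava: picks an alphabet name from the
// characters actually present in the input sequence.
final class AlphabetChooser {

    // Assumption: the terminus symbol (TER) is tokenized as '*'.
    static final char TER_TOKEN = '*';

    private AlphabetChooser() {}

    // Returns "PROTEIN-TERM" only when the sequence contains the terminus
    // token; otherwise returns the simpler "PROTEIN".
    static String chooseAlphabetName(String sequence) {
        return sequence.indexOf(TER_TOKEN) >= 0 ? "PROTEIN-TERM" : "PROTEIN";
    }
}
```

The returned name could then be passed to AlphabetManager.alphabetForName()
to obtain the alphabet whose "token" tokenization is handed to
SimpleSymbolList, in place of the fixed getTAlphabet() call.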
On a related note, the PROTEIN alphabet defined in AlphabetManager.xml consists
of 22 residues and includes the less commonly found 'SEC' (selenocysteine, U)
and 'PYR' (pyrrolysine, O). However, many applications only require the
common 20-symbol alphabet that excludes the latter two residues. It would be
useful to include a new alphabet in AlphabetManager.xml that defines the
simpler 20-symbol set of common amino acids. Perhaps this point should be a
feature request?
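As a rough illustration of the 20-symbol set meant here, the sketch below
checks a sequence string against the common one-letter codes. It is plain
Java with hypothetical names (CommonResidues, isCommonOnly), assumes
one-letter tokens, and treats input as case-insensitive. It is an
application-side check only, not a proposed alphabet definition.

```java
// Hypothetical helper, not part of BioJava: tests whether a sequence uses
// only the 20 common amino acids (no U, O, X or terminus).
final class CommonResidues {

    // One-letter codes of the 20 common amino acids.
    static final String COMMON_20 = "ACDEFGHIKLMNPQRSTVWY";

    private CommonResidues() {}

    // Returns true when every character, upper-cased, is one of the 20
    // common one-letter codes. An empty sequence returns true.
    static boolean isCommonOnly(String sequence) {
        for (int i = 0; i < sequence.length(); i++) {
            char c = Character.toUpperCase(sequence.charAt(i));
            if (COMMON_20.indexOf(c) < 0) {
                return false;
            }
        }
        return true;
    }
}
```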
Cheers,
Moyez
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.