[Biojava-l] question about ambiguous symbols
Keith James
kdj at sanger.ac.uk
Wed Feb 5 15:52:03 EST 2003
>>>>> "dion" == dion whitehead <dion.whitehead at uni-bielefeld.de> writes:
dion> Hello, I am having a frustrating time attempting to
dion> read in RNA sequences. They contain the 'N' symbol, which is
dion> a standard ambiguity symbol, but the code trips up on this
dion> every time, saying it's not a recognized symbol in the
dion> alphabet. Do I have to specify it myself?
I got bitten by this too, when porting some of the code. If you look
in biojava-live/resources/org/biojava/bio/symbol/AlphabetManager.xml
you will see that the default RNA alphabet contains only the tokens
agcu-~, i.e. no ambiguity symbols at all.
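To see which characters in a sequence would be rejected before handing it to the parser, a small stand-alone check along these lines may help. It is only a sketch in plain Java and does not use the BioJava API: the class and method names are made up for illustration, the token list is the agcu-~ set mentioned above, and lowercasing the input is an assumption about how tokens are matched.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of BioJava.
public class RnaSymbolCheck {
    // Tokens of the default RNA alphabet as listed in this message.
    private static final String DEFAULT_RNA_TOKENS = "agcu-~";

    // Returns the distinct characters of seq that are not default RNA
    // tokens, in order of first appearance. The input is lowercased
    // first (an assumption, see the note above).
    public static Set<Character> unrecognized(String seq) {
        Set<Character> unknown = new LinkedHashSet<>();
        for (char c : seq.toLowerCase(Locale.ROOT).toCharArray()) {
            if (DEFAULT_RNA_TOKENS.indexOf(c) < 0) {
                unknown.add(c);
            }
        }
        return unknown;
    }

    public static void main(String[] args) {
        // A sequence containing the ambiguity symbol 'n'.
        System.out.println(unrecognized("augcnnagu")); // prints [n]
    }
}
```

An empty result means every character is one of the default tokens; anything else lists the symbols that would need an ambiguity mapping.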
I haven't tested this, but you could hack your AlphabetManager.xml to
include
  <ambiguityMapping token="n">
    <symbolref name="guanine" />
    <symbolref name="adenine" />
    <symbolref name="cytosine" />
    <symbolref name="uracil" />
  </ambiguityMapping>
as in the DNA alphabet. Not sure if this is the best solution - I'm
sure someone will say if it's not.
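Before pasting a hand-edited fragment into AlphabetManager.xml, it may be worth checking that it is well-formed and maps the token to the symbols you expect. The sketch below does that with the standard Java XML parser only. It reads just the fragment quoted above, not the real BioJava resource file, and the class and method names are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of BioJava.
public class AmbiguityMappingCheck {
    // The fragment suggested in this message, as a string.
    public static final String FRAGMENT =
        "<ambiguityMapping token=\"n\">"
        + "<symbolref name=\"guanine\" />"
        + "<symbolref name=\"adenine\" />"
        + "<symbolref name=\"cytosine\" />"
        + "<symbolref name=\"uracil\" />"
        + "</ambiguityMapping>";

    // Returns the symbolref names listed under the ambiguityMapping
    // element with the given token; empty if there is no such mapping.
    public static List<String> symbolsFor(String xml, String token) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            // Malformed XML or parser configuration problems end up here.
            throw new IllegalArgumentException("could not parse fragment", e);
        }
        List<String> names = new ArrayList<>();
        NodeList mappings = doc.getElementsByTagName("ambiguityMapping");
        for (int i = 0; i < mappings.getLength(); i++) {
            Element mapping = (Element) mappings.item(i);
            if (!token.equals(mapping.getAttribute("token"))) {
                continue;
            }
            NodeList refs = mapping.getElementsByTagName("symbolref");
            for (int j = 0; j < refs.getLength(); j++) {
                names.add(((Element) refs.item(j)).getAttribute("name"));
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // prints [guanine, adenine, cytosine, uracil]
        System.out.println(symbolsFor(FRAGMENT, "n"));
    }
}
```

This only confirms that the fragment parses and lists four symbols. Whether BioJava accepts it in the RNA alphabet still has to be tested against the library itself.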
Keith
--
- Keith James <kdj at sanger.ac.uk> bioinformatics programming support -
- Pathogen Sequencing Unit, The Wellcome Trust Sanger Institute, UK -