[Biojava-l] Problems parsing in RNA sequence in genbank format
Lachlan Coin
lc1 at sanger.ac.uk
Fri Apr 4 14:02:00 EST 2003
Hi,
I am parsing RNA sequence data in GenBank format. One of the positions
has an 'R', which is the IUPAC ambiguity code for A or G. However, the
RNA alphabet does not accept 'R' as a token for any symbol (I did a
quick test, and the token it gives for the A-or-G ambiguity symbol is
N). So the GenBank reader falls over when it gets to this position.
Any suggestions on how to handle this? Can we modify the symbol
tokenization for RNA to cope with this case?
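To make the case concrete, here is a minimal standalone sketch of the
IUPAC nucleotide codes as they would apply to RNA (U in place of T). It
is not BioJava's API: the class name IupacRna and the method expand are
hypothetical names made up for this illustration, and it only shows
which bases each one-letter code is expected to stand for.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration only; this is NOT part of BioJava.
final class IupacRna {
    // IUPAC one-letter code -> the RNA bases it can represent.
    private static final Map<Character, String> CODES = new HashMap<>();
    static {
        CODES.put('A', "A");   CODES.put('C', "C");
        CODES.put('G', "G");   CODES.put('U', "U");
        CODES.put('R', "AG");  CODES.put('Y', "CU");   // purine, pyrimidine
        CODES.put('S', "CG");  CODES.put('W', "AU");   // strong, weak
        CODES.put('K', "GU");  CODES.put('M', "AC");   // keto, amino
        CODES.put('B', "CGU"); CODES.put('D', "AGU");  // not A, not C
        CODES.put('H', "ACU"); CODES.put('V', "ACG");  // not G, not U
        CODES.put('N', "ACGU");                        // any base
    }

    // Returns the sorted set of bases a token may stand for (case-insensitive).
    static Set<Character> expand(char token) {
        String bases = CODES.get(Character.toUpperCase(token));
        if (bases == null) {
            throw new IllegalArgumentException("Unknown IUPAC code: " + token);
        }
        Set<Character> result = new TreeSet<>();
        for (char base : bases.toCharArray()) {
            result.add(base);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(expand('R')); // prints [A, G]
    }
}
```

With a table like this, 'R' in the sequence would resolve to the A-or-G
ambiguity symbol instead of being rejected as an unknown token.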
Thanks a lot,
Lachlan