[Biojava-l] issue with translating codons with N
Nick England
nickengland at gmail.com
Fri Sep 20 14:16:39 UTC 2013
Everyone,
I've stepped through with a debugger, and this is a bad bug.
The code to translate from RNA->Protein does the following:
- Take the ASCII value of each of the 3 RNA bases, multiply the first by
16, the second by 4 and the third by 1, and add them up.
- Assume there won't be any collisions.
Here are the values which it then uses:
A:65
G:71
C:67
U:85
N:78
ANA: 1417
CAU: 1417
ANG: 1423
CGC: 1423
Notice any hash collisions?
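To reproduce the arithmetic, here is a minimal sketch. It assumes the key is exactly 16*first + 4*second + third on the character values, as described above. The class and method names are made up for illustration and are not BioJava's.

```java
class CodonKeyDemo {
    // Weighted sum of the ASCII values: 16 * first + 4 * second + 1 * third.
    static int key(String codon) {
        return 16 * codon.charAt(0) + 4 * codon.charAt(1) + codon.charAt(2);
    }

    public static void main(String[] args) {
        // Two N-containing codons and the unambiguous codons they collide with.
        for (String codon : new String[] {"ANA", "CAU", "ANG", "CGC"}) {
            System.out.println(codon + ": " + key(codon));
        }
    }
}
```

CAU codes for His (H) and CGC for Arg (R) in the standard genetic code, which matches the HR output reported below.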
I don't get why this wasn't done with a standard Java HashMap, which would
ensure that any collisions were resolved. This is a pretty critical bug for
a bioinformatics package.
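A rough sketch of the HashMap idea, assuming codons are looked up as three-letter strings and anything not in the table becomes X. CodonTableSketch, its methods and the four-entry table are hypothetical and only there to illustrate the point; this is not the BioJava implementation.

```java
import java.util.HashMap;
import java.util.Map;

class CodonTableSketch {
    private final Map<String, Character> table = new HashMap<>();

    CodonTableSketch() {
        // Only a handful of standard-code entries, enough for the examples in this thread.
        table.put("CAU", 'H'); // His
        table.put("CGC", 'R'); // Arg
        table.put("UGU", 'C'); // Cys
        table.put("UAG", '*'); // stop
    }

    // HashMap compares keys with equals(), so two different codons can never share an entry.
    char translate(String codon) {
        return table.getOrDefault(codon, 'X');
    }

    // Translate an RNA string codon by codon, ignoring any trailing partial codon.
    String translateAll(String rna) {
        StringBuilder protein = new StringBuilder();
        for (int i = 0; i + 3 <= rna.length(); i += 3) {
            protein.append(translate(rna.substring(i, i + 3)));
        }
        return protein.toString();
    }

    public static void main(String[] args) {
        CodonTableSketch sketch = new CodonTableSketch();
        System.out.println(sketch.translateAll("ANAANG"));
        System.out.println(sketch.translateAll("GUNUGUUAGUGU"));
    }
}
```

With string keys, ANA and ANG simply miss the table and fall through to X instead of landing on the entries for CAU and CGC.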
Nick
On 20 September 2013 13:45, Nick England <nickengland at gmail.com> wrote:
> Hara,
>
> Hmm this is rather odd. I get the same issue with that sequence with a
> custom engine as well.
>
> My code has:
> Builder builder = new TranscriptionEngine.Builder();
> builder.initMet(false);
> builder.translateNCodons(true);
> builder.trimStop(false);
> TranscriptionEngine engine = builder.build();
> Sequence<AminoAcidCompound> seq = engine.translate(new DNASequence("GTNTGTTAGTGT"));
> assertEquals("XC*C", seq.toString());
> Sequence<AminoAcidCompound> seq2 = engine.translate(new DNASequence("ANAANG"));
> System.out.println(seq2);
>
> The first sequence translates as expected, but your sequence is
> translating as HR, when it should be XX. This looks like a pretty bad bug!
>
> Nick
>
>
> On 19 September 2013 19:59, Hara Dilley <hdilley at sutrobio.com> wrote:
>
>> Hi,
>>
>> Is there an issue with the DNA Translation in biojava3.core?
>> It appears to translate codons containing "N" in certain cases.
>> Executing:
>> new DNASequence("ANAANG").getRNASequence().getProteinSequence().getSequenceAsString();
>> produces the amino acids HR.
>>
>> thanks
>> Hara
>>
>> _______________________________________________
>> Biojava-l mailing list - Biojava-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/biojava-l
>>
>
>