[Biojava-l] SCF: support for ambiguities
Richard Holland
holland at eaglegenomics.com
Fri Oct 31 16:14:30 UTC 2008
A patch would be much appreciated!
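
For reference, the class-static idea suggested in the thread quoted below could be sketched roughly like this. It is only an illustration under stated assumptions, not BioJava code: the class name IupacTable, the Base enum and the put helper are all hypothetical stand-ins so that the example compiles on its own. In a real patch the map values would presumably be the Symbol objects returned by dna.getAmbiguity(...), built once in the same way, and the gap symbol would get its own static reference.

```java
import java.util.Collections;
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the SCF decoder: plain JDK types only, no BioJava.
final class IupacTable {

    // Stand-in for the four atomic DNA symbols (DNATools.a(), c(), g(), t()).
    enum Base { A, C, G, T }

    // Built once when the class is loaded, then shared by every decode() call.
    private static final Map<Character, Set<Base>> CODES = new HashMap<>();

    // Helper used only by the static initializer below.
    private static void put(char code, Base first, Base... rest) {
        CODES.put(code, Collections.unmodifiableSet(EnumSet.of(first, rest)));
    }

    static {
        put('A', Base.A);
        put('C', Base.C);
        put('G', Base.G);
        put('T', Base.T);
        // Two-nucleotide IUPAC ambiguity codes.
        put('W', Base.A, Base.T);
        put('S', Base.C, Base.G);
        put('R', Base.A, Base.G);
        put('Y', Base.C, Base.T);
        put('K', Base.G, Base.T);
        put('M', Base.A, Base.C);
        // N stands for any of the four bases.
        put('N', Base.A, Base.C, Base.G, Base.T);
    }

    // Looks up the pre-built set; nothing is allocated per base call.
    static Set<Base> decode(byte call) {
        char c = Character.toUpperCase((char) call);
        Set<Base> bases = CODES.get(c);
        if (bases == null) {
            throw new IllegalArgumentException("Unsupported base call: " + c);
        }
        return bases;
    }
}
```

With this layout, decode((byte) 'w') and decode((byte) 'W') hand back the very same pre-built object, so no HashSet is created per base call.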
cheers,
Richard
2008/10/31 community at struck.lu <community at struck.lu>:
> True. It was a first quick and dirty hack to get the rest of my project going.
>
> I think adding support for the IUPAC ambiguities to DNATools would be the most
> appropriate solution. The SCF class can then easily be adapted.
>
> Are there any plans to do so?
> If not, I could give it a try and submit a patch for DNATools and SCF.
>
> Greetings,
> Daniel
>
> "Richard Holland" <holland at eaglegenomics.com> wrote:
>
>> It is the correct method, yes.
>>
>> However your code constructs a new hash set every time it does the
>> check for W or S etc. It would be much more efficient to create
>> class-static references to the ambiguity symbols you need, instead of
>> (re)creating them every time they're encountered. A class-static gap
>> symbol reference would also be good in this situation.
>>
>> cheers,
>> Richard
>>
>>
>>
>> 2008/10/31 community at struck.lu <community at struck.lu>:
>> > Hello,
>> >
>> >
>> > I am using the SCF class in the context of HIV-1 population sequencing. In
>> > this context we sometimes get ambiguous base calls. To support them I
>> > extended the SCF class to allow for IUPAC ambiguities of up to 2 nucleotides.
>> >
>> > Therefore I simply added the following code to the "decode" function:
>> >
>> > #########################
>> > public Symbol decode(byte call) throws IllegalSymbolException {
>> >
>> >     //get the DNA Alphabet
>> >     Alphabet dna = DNATools.getDNA();
>> >
>> >     char c = (char) call;
>> >     switch (c) {
>> >         case 'a':
>> >         case 'A':
>> >             return DNATools.a();
>> >         case 'c':
>> >         case 'C':
>> >             return DNATools.c();
>> >         case 'g':
>> >         case 'G':
>> >             return DNATools.g();
>> >         case 't':
>> >         case 'T':
>> >             return DNATools.t();
>> >         case 'n':
>> >         case 'N':
>> >             return DNATools.n();
>> >         case '-':
>> >             return DNATools.getDNA().getGapSymbol();
>> >         case 'w':
>> >         case 'W':
>> >             //make the 'W' symbol
>> >             Set symbolsThatMakeW = new HashSet();
>> >             symbolsThatMakeW.add(DNATools.a());
>> >             symbolsThatMakeW.add(DNATools.t());
>> >             Symbol w = dna.getAmbiguity(symbolsThatMakeW);
>> >             return w;
>> >         case 's':
>> >         case 'S':
>> >             //make the 'S' symbol
>> >             Set symbolsThatMakeS = new HashSet();
>> >             symbolsThatMakeS.add(DNATools.c());
>> >             symbolsThatMakeS.add(DNATools.g());
>> >             Symbol s = dna.getAmbiguity(symbolsThatMakeS);
>> >             return s;
>> >         ... (and so on)
>> > #########################
>> >
>> > Is this the right way to do it? And if so, how can this code be submitted to
>> > the official biojava source code?
>> >
>> >
>> > Best regards,
>> > Daniel Struck
>> > _________________________________________________________
>> > Mail sent using root eSolutions Webmailer - www.root.lu
>> >
>> >
>> > _______________________________________________
>> > Biojava-l mailing list - Biojava-l at lists.open-bio.org
>> > http://lists.open-bio.org/mailman/listinfo/biojava-l
>> >
>>
>>
>
>
>
>
>
--
Richard Holland, BSc MBCS
Finance Director, Eagle Genomics Ltd
M: +44 7500 438846 | E: holland at eaglegenomics.com
http://www.eaglegenomics.com/