[Biojava-l] SCF: support for ambiguities

Richard Holland holland at eaglegenomics.com
Fri Oct 31 13:56:54 UTC 2008


It is the correct method, yes.

However, your code constructs a new hash set every time it checks for
W, S, etc. It would be much more efficient to create class-static
references to the ambiguity symbols you need, instead of (re)creating
them every time they're encountered. A class-static gap symbol
reference would also be good in this situation.
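
To illustrate the idea, here is a minimal sketch in plain Java. It
deliberately avoids the BioJava types so that it stands alone: the class
name AmbiguityTable and the use of sets of base characters in place of
Symbol objects are assumptions made for illustration only. In the real
decoder, the cached values would be the Symbol objects returned by
dna.getAmbiguity(...) plus the gap symbol, built once and kept in static
fields.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical stand-in for the decoder; plain Java, no BioJava types. */
final class AmbiguityTable {

    // Built once when the class is loaded, then only read.
    private static final Map<Character, Set<Character>> BY_CODE = new HashMap<>();

    static {
        register('A', "A");
        register('C', "C");
        register('G', "G");
        register('T', "T");
        register('N', "ACGT");
        register('-', "");   // gap: stands in for a cached gap symbol
        // Standard two-base IUPAC ambiguity codes.
        register('W', "AT");
        register('S', "CG");
        register('M', "AC");
        register('K', "GT");
        register('R', "AG");
        register('Y', "CT");
    }

    private AmbiguityTable() {
        // Static utility class, not meant to be instantiated.
    }

    // Stores one read-only set per code; called only from the static block.
    private static void register(char code, String bases) {
        Set<Character> set = new TreeSet<>();
        for (char base : bases.toCharArray()) {
            set.add(base);
        }
        BY_CODE.put(code, Collections.unmodifiableSet(set));
    }

    /** Looks up the cached entry for a base call, ignoring case. */
    static Set<Character> decode(byte call) {
        char code = Character.toUpperCase((char) call);
        Set<Character> bases = BY_CODE.get(code);
        if (bases == null) {
            throw new IllegalArgumentException("No symbol for call: " + code);
        }
        return bases;
    }
}
```

Because decode only performs a map lookup, repeated calls for 'W' or 'w'
return the same cached object instead of building a new set each time.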

cheers,
Richard



2008/10/31 community at struck.lu <community at struck.lu>:
> Hello,
>
>
> I am using the SCF class in the context of HIV-1 population sequencing. In
> this context we sometimes have ambiguous base calls. To support them I
> extended the SCF class to allow IUPAC ambiguity codes covering up to two
> nucleotides.
>
> Therefore I simply added the following code to the "decode" function:
>
> #########################
>        public Symbol decode(byte call) throws IllegalSymbolException {
>
>            //get the DNA Alphabet
>            Alphabet dna = DNATools.getDNA();
>
>            char c = (char) call;
>            switch (c) {
>                case 'a':
>                case 'A':
>                    return DNATools.a();
>                case 'c':
>                case 'C':
>                    return DNATools.c();
>                case 'g':
>                case 'G':
>                    return DNATools.g();
>                case 't':
>                case 'T':
>                    return DNATools.t();
>                case 'n':
>                case 'N':
>                    return DNATools.n();
>                case '-':
>                    return DNATools.getDNA().getGapSymbol();
>                case 'w':
>                case 'W':
>                    //make the 'W' symbol
>                    Set symbolsThatMakeW = new HashSet();
>                    symbolsThatMakeW.add(DNATools.a());
>                    symbolsThatMakeW.add(DNATools.t());
>                    Symbol w = dna.getAmbiguity(symbolsThatMakeW);
>                    return w;
>                case 's':
>                case 'S':
>                    //make the 'S' symbol
>                    Set symbolsThatMakeS = new HashSet();
>                    symbolsThatMakeS.add(DNATools.c());
>                    symbolsThatMakeS.add(DNATools.g());
>                    Symbol s = dna.getAmbiguity(symbolsThatMakeS);
>                    return s;
> ... (and so on)
> #########################
>
> Is this the right way to do it? And if so, how can this code be submitted
> for inclusion in the official BioJava source?
>
>
> Best regards,
> Daniel Struck
>
>
> _______________________________________________
> Biojava-l mailing list  -  Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l
>



-- 
Richard Holland, BSc MBCS
Finance Director, Eagle Genomics Ltd
M: +44 7500 438846 | E: holland at eaglegenomics.com
http://www.eaglegenomics.com/


