[Biojava-l] Pattern

Jerome LANE Jerome.Lane at igh.cnrs.fr
Mon Jun 25 18:15:52 UTC 2007


Hi,

I have used the BioJava Pattern class to match a DNA sequence, but I can't 
find all the matches in my sequence. For example, here is a bit of code that I 
wrote to search for the pattern "aa" in the DNA sequence "aaaa":

-----------------------------------
// Imports needed at the top of the file:
import org.biojava.bio.seq.DNATools;
import org.biojava.bio.symbol.FiniteAlphabet;
import org.biojava.bio.symbol.SymbolList;
import org.biojava.utils.regex.PatternFactory;

// Inside a method:
try {
    // Variables needed...
    org.biojava.utils.regex.Matcher occurences;
    FiniteAlphabet IUPAC = DNATools.getDNA();
    SymbolList WorkingSequence = DNATools.createDNA("aaaa");

    // Create pattern using pattern factory.
    org.biojava.utils.regex.Pattern pattern;
    PatternFactory FACTORY = PatternFactory.makeFactory(IUPAC);
    try {
        pattern = FACTORY.compile("aa");
    } catch (Exception e) { e.printStackTrace(); return; }
    System.out.println("Searching for: " + pattern.patternAsString());

    // Obtain iterator of matches.
    try {
        occurences = pattern.matcher(WorkingSequence);
    } catch (Exception e) { e.printStackTrace(); return; }

    // For each match, print the sequence object, then the start
    // position and the matched symbols.
    while (occurences.find()) {
        System.out.println("Match: " + "\t" + WorkingSequence
                + "\n" + occurences.start() + "\t"
                + occurences.group().seqString());
    }
} catch (Exception ex) {
    ex.printStackTrace();
    System.exit(1);
}
----------------------------

And this is the output :

----------------------------
Searching for: aa
Match:     org.biojava.bio.symbol.SimpleSymbolList@ea82ff69 length: 4
1    aa
Match:     org.biojava.bio.symbol.SimpleSymbolList@ea82ff69 length: 4
3    aa
--------------------------------
But for the input sequence "aaaa" I would expect 3 matches, at positions 1, 2 
and 3. Is there a parameter I can set to get overlapping matches as well?
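
For comparison, here is a minimal sketch that reproduces the same behaviour 
with the standard java.util.regex classes instead of BioJava. It assumes 
that the BioJava matcher follows the same convention as 
java.util.regex.Matcher, where each find() resumes after the end of the 
previous match, which the output above suggests but which I have not 
confirmed. The class and method names are made up for illustration, and 
whether org.biojava.utils.regex.Matcher offers an equivalent of find(int) 
should be checked against the BioJava API documentation. Note that 
java.util.regex reports 0-based offsets, whereas the BioJava output above is 
1-based.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only (not part of BioJava).
public class OverlappingMatches {

    // Plain find() loop: each search resumes after the end of the
    // previous match, so overlapping occurrences are skipped.
    public static List<Integer> nonOverlappingStarts(String regex, String text) {
        List<Integer> starts = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            starts.add(m.start());
        }
        return starts;
    }

    // find(int) loop: each search restarts one position after the start
    // of the previous match, so overlapping occurrences are reported too.
    public static List<Integer> overlappingStarts(String regex, String text) {
        List<Integer> starts = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        int from = 0;
        // find(int) throws if the index is past the end of the input.
        while (from <= text.length() && m.find(from)) {
            starts.add(m.start());
            from = m.start() + 1;
        }
        return starts;
    }

    public static void main(String[] args) {
        System.out.println(nonOverlappingStarts("aa", "aaaa")); // [0, 2]
        System.out.println(overlappingStarts("aa", "aaaa"));    // [0, 1, 2]
    }
}
```

The second loop is the behaviour I am after: offsets 0, 1 and 2 here 
correspond to positions 1, 2 and 3 in the BioJava numbering.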

Best regards

Jerome


