[Biojava-dev] Problem with SymbolListCharSequence and Regex

Ido M. Tamir tamir at imp.univie.ac.at
Wed Oct 29 16:54:18 EST 2003


Hi,
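Could this be a bug?

When a regex is matched against a SymbolListCharSequence, the
text returned by Matcher.group() extends one character further
to the right than expected: the pattern "C" should yield a
single character, but "ca" comes back.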

Thank you very much for
your time and effort.

Ido M. Tamir


Output for the testcase below:

string: C
symbol: ca gcat

---testcase:


package mf;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

import org.biojava.bio.seq.DNATools;
import org.biojava.bio.seq.io.SymbolListCharSequence;
import org.biojava.bio.symbol.SymbolList;


public class TestRegex {
	public static void main(String[] args) {
		try {
			Pattern p = Pattern.compile("C", Pattern.CASE_INSENSITIVE);
			String strSeq = "GCAT";
			SymbolList symSeq = DNATools.createDNA(strSeq);
			Matcher m = p.matcher( strSeq );
			if( m.find() ){
				System.out.println( "string: " + m.group() );
			}
			m = p.matcher( new SymbolListCharSequence(symSeq ));
			if( m.find() ){
				System.out.println( "symbol: " + m.group() + " " + new SymbolListCharSequence(symSeq ));
			}
		} catch (Exception e) {
			e.printStackTrace();
		}
	}
}
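
A possible explanation, offered as an assumption and not checked
against the BioJava source: Matcher.group() obtains its text by
calling subSequence(start, end) on the input, and the CharSequence
contract treats end as exclusive. If an implementation treated end
as inclusive (SymbolList coordinates are, as far as I know, 1-based
with inclusive ends), every match would come back one character too
long. The sketch below reproduces that symptom without BioJava.
WrappedSequence and firstMatch are hypothetical names made up for
this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SubSequenceDemo {

	/** Hypothetical CharSequence wrapper; this is NOT BioJava code. */
	static final class WrappedSequence implements CharSequence {
		private final String data;
		private final boolean inclusiveEnd;

		WrappedSequence(String data, boolean inclusiveEnd) {
			this.data = data;
			this.inclusiveEnd = inclusiveEnd;
		}

		public int length() {
			return data.length();
		}

		public char charAt(int index) {
			return data.charAt(index);
		}

		public CharSequence subSequence(int start, int end) {
			// The CharSequence contract says end is exclusive.
			// The "inclusive" variant simulates the suspected
			// off-by-one by taking one extra character (clamped
			// to the length of the data).
			int to = inclusiveEnd ? Math.min(end + 1, data.length()) : end;
			return new WrappedSequence(data.substring(start, to), inclusiveEnd);
		}

		public String toString() {
			return data;
		}
	}

	/** Returns the first case-insensitive match of "C", or null if none. */
	static String firstMatch(CharSequence text) {
		Matcher m = Pattern.compile("C", Pattern.CASE_INSENSITIVE).matcher(text);
		return m.find() ? m.group() : null;
	}

	public static void main(String[] args) {
		System.out.println("exclusive end: " + firstMatch(new WrappedSequence("gcat", false)));
		System.out.println("inclusive end: " + firstMatch(new WrappedSequence("gcat", true)));
	}
}
```

The exclusive variant returns "c", while the inclusive variant
returns "ca", which is the same result the test case above shows
for SymbolListCharSequence. If that is the cause, the fix would
belong in subSequence and not in the regex code.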



