[Biojava-dev] Problem with SymbolListCharSequence and Regex

Matthew Pocock matthew_pocock at yahoo.co.uk
Thu Oct 30 06:02:15 EST 2003


Hi,

This is a coordinate-system problem. Strings are indexed from 0 to 
length-1, and ranges include the min index but exclude the max index. 
Sequences are indexed from 1 to length, and ranges include both the min 
index and the max index.

There was a bug in SymbolListCharSequence: subSequence() wasn't taking 
this into account. Now fixed in CVS.

Matthew

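    public CharSequence subSequence(int start, int end)
    {
        // CharSequence range: 0-based, end exclusive   [start, end)
        // SymbolList range:   1-based, end inclusive   [start + 1, end]
        return new SymbolListCharSequence(syms.subList(start + 1, end), // was end + 1
                                          alphaTokens);
    }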
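
For anyone who wants to check the arithmetic without BioJava on the 
classpath, here is a minimal sketch of the same conversion. It is only an 
illustration: CoordinateDemo, toOneBasedInclusive and subListOneBased are 
hypothetical names, and a plain java.util.List stands in for a SymbolList 
(assumed to be 1-based with an inclusive end, as described above).

```java
import java.util.Arrays;
import java.util.List;

class CoordinateDemo {
    /**
     * Hypothetical helper: converts a 0-based, end-exclusive range
     * [start, end) into a 1-based, end-inclusive range [min, max].
     */
    public static int[] toOneBasedInclusive(int start, int end) {
        return new int[] { start + 1, end };
    }

    /**
     * Stand-in for a 1-based, end-inclusive subList. A plain List is
     * used here instead of a BioJava SymbolList (an assumption made
     * only to keep the sketch self-contained).
     */
    public static <T> List<T> subListOneBased(List<T> items, int min, int max) {
        return items.subList(min - 1, max);
    }

    public static void main(String[] args) {
        String text = "GCAT";
        List<Character> symbols = Arrays.asList('g', 'c', 'a', 't');

        // The regex "C" matches "GCAT" at the string range [1, 2).
        int start = 1;
        int end = 2;
        System.out.println(text.subSequence(start, end)); // C

        // Corrected conversion: (start + 1, end)
        int[] range = toOneBasedInclusive(start, end);
        System.out.println(subListOneBased(symbols, range[0], range[1])); // [c]

        // Old conversion: (start + 1, end + 1) runs one symbol too far
        System.out.println(subListOneBased(symbols, start + 1, end + 1)); // [c, a]
    }
}
```

The last line reproduces the extra trailing symbol ("ca" instead of "c") 
from the test case quoted below.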


Ido M. Tamir wrote:

>Hi,
>could this be a bug ?
>
>The captured regex group returned from a 
>SymbolListCharSequence extends one character further
>to the right than expected.
>
>Thank you very much for
>your time and effort.
>
>Ido M. Tamir
>
>
>Output for the testcase below:
>
>string: C
>symbol: ca gcat
>
>---testcase:
>
>
>package mf;
>
>import java.util.regex.Matcher;
>import java.util.regex.Pattern;
>
>import org.biojava.bio.seq.DNATools;
>import org.biojava.bio.seq.io.SymbolListCharSequence;
>import org.biojava.bio.symbol.SymbolList;
>
>
>public class TestRegex {
>	public static void main(String[] args) {
>		try {
>			Pattern p = Pattern.compile("C", Pattern.CASE_INSENSITIVE);
>			String strSeq = "GCAT";
>			SymbolList symSeq = DNATools.createDNA(strSeq);
>			Matcher m = p.matcher( strSeq );
>			if( m.find() ){
>				System.out.println( "string: " + m.group() );
>			}
>			m = p.matcher( new SymbolListCharSequence(symSeq ));
>			if( m.find() ){
>				System.out.println( "symbol: " + m.group() + " "
>					+ new SymbolListCharSequence(symSeq ));
>			}
>		} catch (Exception e) {
>			e.printStackTrace();
>		}
>	}
>}
>



