[Biojava-dev] [Bug 2164] New: Restriction Mapper - Thread (or dual core cpu) problem

bugzilla-daemon at portal.open-bio.org bugzilla-daemon at portal.open-bio.org
Wed Dec 13 12:59:10 UTC 2006


http://bugzilla.open-bio.org/show_bug.cgi?id=2164

           Summary: Restriction Mapper - Thread (or dual core cpu) problem
           Product: BioJava
           Version: 1.4
          Platform: PC
        OS/Version: Windows XP
            Status: NEW
          Severity: major
          Priority: P2
         Component: bio
        AssignedTo: biojava-dev at biojava.org
        ReportedBy: ilhami.visne at gmail.com


Last summer I wrote a program that uses RestrictionMapper. As in the example
(if I remember correctly), I used one thread per enzyme. Every time, I got an
error.

Then I noticed that if I use only one enzyme, I get no error. I thought this
could be a thread-safety issue, because with more than one enzyme, more than
one thread runs. I therefore changed my program to be single-threaded, and it
worked well, even with many enzymes. Until this week...

One of my clients ran my program on a dual-CPU machine. Guess what? The same
error again! I have a single-CPU laptop; a friend of mine has a dual-core
laptop, so I tried it on that machine myself. And yes, that is the problem:
for the same file I get no error on my single-core machine, but I get the same
error every time on the dual-core CPU. Two more important points:

1. Here I got an error for HpaII, but it can be any other enzyme.
2. My file has 24000 sequences. The sequence at which the exception is thrown
   is random too: sometimes the 5600th sequence, another time the 17456th. I
   checked, and all sequences are normal.

I ran my program several times today, on my single-core laptop and on the
dual-core laptop, to capture the stack trace, and I got three different stack
traces.

1- Exception in thread "Thread-13" org.biojava.bio.BioRuntimeException: Failed to complete search for HpaII CCGG (1/3)
        at org.biojava.bio.molbio.RestrictionSiteFinder.run(RestrictionSiteFinder.java:137)
        at org.biojava.utils.SimpleThreadPool$PooledThread.run(SimpleThreadPool.java:295)
Caused by: java.lang.NullPointerException
        at org.biojava.bio.seq.io.SymbolListCharSequence.charAt(SymbolListCharSequence.java:115)
        at java.lang.Character.codePointAt(Unknown Source)
        at java.util.regex.Pattern$Single.match(Unknown Source)
        at java.util.regex.Pattern$Curly.match(Unknown Source)
        at java.util.regex.Pattern$Start.match(Unknown Source)
        at java.util.regex.Matcher.search(Unknown Source)
        at java.util.regex.Matcher.find(Unknown Source)
        at org.biojava.bio.molbio.RestrictionSiteFinder.run(RestrictionSiteFinder.java:104)
        ... 1 more

2- Exception in thread "Thread-2" org.biojava.bio.BioRuntimeException: Failed to complete search for HpaII CCGG (1/3)
        at org.biojava.bio.molbio.RestrictionSiteFinder.run(RestrictionSiteFinder.java:137)
        at org.biojava.utils.SimpleThreadPool$PooledThread.run(SimpleThreadPool.java:295)
Caused by: java.lang.ArrayIndexOutOfBoundsException
        at java.lang.System.arraycopy(Native Method)
        at java.util.ArrayList.ensureCapacity(Unknown Source)
        at java.util.ArrayList.add(Unknown Source)
        at org.biojava.bio.seq.SimpleFeatureHolder.addFeature(SimpleFeatureHolder.java:92)
        at org.biojava.bio.seq.impl.ViewSequence.createFeature(ViewSequence.java:283)
        at org.biojava.bio.molbio.RestrictionSiteFinder.run(RestrictionSiteFinder.java:113)
        ... 1 more

3- Exception in thread "Thread-0" Exception in thread "Thread-1"
org.biojava.bio.BioRuntimeException: Failed to complete search for AluI AGCT (2/2)
        at org.biojava.bio.molbio.RestrictionSiteFinder.run(RestrictionSiteFinder.java:137)
        at org.biojava.utils.SimpleThreadPool$PooledThread.run(SimpleThreadPool.java:295)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 88
        at java.util.ArrayList.add(Unknown Source)
        at org.biojava.bio.seq.SimpleFeatureHolder.addFeature(SimpleFeatureHolder.java:92)
        at org.biojava.bio.seq.impl.ViewSequence.createFeature(ViewSequence.java:283)
        at org.biojava.bio.molbio.RestrictionSiteFinder.run(RestrictionSiteFinder.java:113)
        ... 1 more
org.biojava.bio.BioRuntimeException: Failed to complete search for MseI TTAA (1/3)
        at org.biojava.bio.molbio.RestrictionSiteFinder.run(RestrictionSiteFinder.java:137)
        at org.biojava.utils.SimpleThreadPool$PooledThread.run(SimpleThreadPool.java:295)
Caused by: java.lang.ArrayIndexOutOfBoundsException
        at java.lang.System.arraycopy(Native Method)
        at java.util.ArrayList.ensureCapacity(Unknown Source)
        at java.util.ArrayList.add(Unknown Source)
        at org.biojava.bio.seq.SimpleFeatureHolder.addFeature(SimpleFeatureHolder.java:92)
        at org.biojava.bio.seq.impl.ViewSequence.createFeature(ViewSequence.java:283)
        at org.biojava.bio.molbio.RestrictionSiteFinder.run(RestrictionSiteFinder.java:113)
        ... 1 more
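
Traces 2 and 3 both fail inside java.util.ArrayList.add, reached from
SimpleFeatureHolder.addFeature. That is consistent with several search threads
appending to one unsynchronized list, although the traces alone do not prove
it. The following is a minimal stand-alone sketch of that access pattern, not
BioJava code: the class SharedListSketch and the method fillConcurrently are
hypothetical names, and the list stands in for whatever the feature holder
stores internally. It shows the synchronized variant, which keeps every
element.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a feature holder shared by several search threads.
final class SharedListSketch {

    // Each of `threads` workers appends `perThread` items to one shared list.
    static List<Integer> fillConcurrently(int threads, final int perThread) {
        // synchronizedList serializes add(). With a bare ArrayList here,
        // concurrent add() calls may lose elements or throw
        // ArrayIndexOutOfBoundsException while the backing array is resized.
        final List<Integer> shared =
                Collections.synchronizedList(new ArrayList<Integer>());
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(new Runnable() {
                public void run() {
                    for (int i = 0; i < perThread; i++) {
                        shared.add(Integer.valueOf(i));
                    }
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join(); // wait until every worker has finished appending
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return shared;
    }

    public static void main(String[] args) {
        System.out.println(fillConcurrently(4, 10000).size()); // prints 40000
    }
}
```

Replacing the synchronized wrapper with a plain ArrayList makes the result
timing-dependent, which would match failures that appear at a different
sequence on each run and more readily on a multi-core machine.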

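Program to reproduce:

// Please use a big file. My file has ~24000 sequences; the length of a
// sequence doesn't matter. (The class name below is arbitrary.)
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;

import org.biojava.bio.molbio.RestrictionEnzymeManager;
import org.biojava.bio.molbio.RestrictionMapper;
import org.biojava.bio.seq.Sequence;
import org.biojava.bio.seq.SequenceIterator;
import org.biojava.bio.seq.io.SeqIOTools;
import org.biojava.utils.SimpleThreadPool;

public class RestrictionMapperRepro {
    public static void main(String[] args) throws Exception {
        File sequenceFile = new File("D:/CpG_chr1_plusminus 3000_hg18.fasta");
        BufferedReader br = new BufferedReader(new FileReader(sequenceFile));
        SequenceIterator iter = SeqIOTools.readFastaDNA(br);

        // One pool shared by the mapper; the enzyme searches run on its threads.
        SimpleThreadPool pool = new SimpleThreadPool();
        RestrictionMapper mapper = new RestrictionMapper(pool);
        mapper.addEnzyme(RestrictionEnzymeManager.getEnzyme("MseI"));
        mapper.addEnzyme(RestrictionEnzymeManager.getEnzyme("HpaII"));
        mapper.addEnzyme(RestrictionEnzymeManager.getEnzyme("AluI"));

        while (iter.hasNext()) {
            Sequence seq = iter.nextSequence();
            mapper.annotate(seq);
        }
    }
}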
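
One possible way to avoid the shared-state problem, sketched here under
assumptions rather than as a fix to BioJava itself, is to let each search
thread collect its hits in a private list and to add the results to the shared
structure from a single thread only. The sketch below uses plain strings and
java.util.regex instead of BioJava sequences; ParallelSearchSketch, findSites
and annotate are hypothetical names, and the "SITE@offset" strings merely
stand in for features.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ParallelSearchSketch {

    // Returns the start offsets of every non-overlapping match of `site`.
    static List<Integer> findSites(String dna, String site) {
        List<Integer> hits = new ArrayList<Integer>(); // private to this call
        Matcher m = Pattern.compile(Pattern.quote(site)).matcher(dna);
        while (m.find()) {
            hits.add(Integer.valueOf(m.start()));
        }
        return hits;
    }

    // Searches all sites in parallel, then merges on the calling thread only.
    static List<String> annotate(final String dna, final List<String> sites) {
        ExecutorService pool =
                Executors.newFixedThreadPool(Math.max(1, sites.size()));
        try {
            List<Future<List<Integer>>> pending =
                    new ArrayList<Future<List<Integer>>>();
            for (final String site : sites) {
                pending.add(pool.submit(new Callable<List<Integer>>() {
                    public List<Integer> call() {
                        return findSites(dna, site);
                    }
                }));
            }
            // Only this thread ever writes to `features`, so no locking is needed.
            List<String> features = new ArrayList<String>();
            for (int i = 0; i < sites.size(); i++) {
                for (Integer pos : pending.get(i).get()) {
                    features.add(sites.get(i) + "@" + pos);
                }
            }
            return features;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // prints [CCGG@2, CCGG@10, TTAA@6]
        System.out.println(annotate("AACCGGTTAACCGG", Arrays.asList("CCGG", "TTAA")));
    }
}
```

The searches still run concurrently, but the only mutable structure they share
is filled after each Future has completed, so the merge order is deterministic.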

