[Bioperl-l] Perl script uses all my cpu

Jason Stajich jason.stajich at gmail.com
Thu Oct 31 05:33:57 UTC 2013


A couple of things: if speed is your primary concern, I think you want to go lower-level and avoid using the modules unless necessary.

a) SearchIO parsing is slow -- if you want speed, dump the data to tabular format (-m 8) and just parse the columns with split.
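
As a rough illustration of the split approach, here is a minimal sketch. It assumes the -m 8 output is the usual 12-column, tab-separated BLAST-style table; the column labels and the sample line are my own, made up for illustration, so check them against what your fasta36 build actually prints.

```perl
use strict;
use warnings;

# Labels for the 12 tab-separated columns (assumed layout; these names are
# hypothetical and are not printed by the program itself).
my @COLS = qw(query subject pct_id aln_len mismatches gap_opens
              q_start q_end s_start s_end evalue bit_score);

# Turn one tabular line into a hash reference, or return nothing for
# comment lines, blank lines, and lines with too few columns.
sub parse_m8_line {
    my ($line) = @_;
    chomp $line;
    return if $line =~ /^#/ || $line !~ /\S/;
    my @f = split /\t/, $line;
    return unless @f >= @COLS;
    my %hit;
    @hit{@COLS} = @f[0 .. $#COLS];
    return \%hit;
}

# Usage sketch with an invented result line.
my $hit = parse_m8_line("kmer1\tchr1\t100.00\t21\t0\t0\t1\t21\t500\t520\t1e-06\t42.1\n");
print "$hit->{query} hits $hit->{subject} (E=$hit->{evalue})\n";
```

In a real run you would call parse_m8_line on each line read from the output file and skip the undefined results.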

b) If you specify the input filename as - you can pass in the sequence data as an input string instead of having to create the kmer file. It will also be faster to pipeline your analysis of many kmers at once rather than creating one file at a time.

You can do it through STDIN:

open(my $fh => "| fasta36 -T 1 -E 1e-5 - databasename > outfile") || die $!;
print $fh ">kmer$l\n",$seq,"\n";
close($fh) || die "fasta36 failed: $! $?";  # wait for fasta36 to finish writing outfile

open(my $infh => "outfile" ) || die $!;
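
To pipeline many kmers through a single run rather than one at a time, something like the following might work. It is only a sketch: stream_kmers is a hypothetical helper name, and it assumes the program accepts several FASTA records on STDIN in one go.

```perl
use strict;
use warnings;

# Write every k-mer as a FASTA record into the STDIN of one shell command.
# $cmd is any command that reads FASTA on STDIN (hypothetical helper).
sub stream_kmers {
    my ($cmd, $kmers) = @_;
    open(my $fh, '|-', $cmd) or die "cannot start '$cmd': $!";
    my $n = 0;
    for my $seq (@$kmers) {
        $n++;
        print {$fh} ">kmer$n\n$seq\n";
    }
    # close() on a pipe waits for the child to exit, so any file the
    # command writes is complete once this returns.
    close($fh) or die "command failed: " . ($! || "exit status $?");
    return $n;
}

# Usage sketch (assumes fasta36 is on PATH and 'databasename' exists):
# stream_kmers("fasta36 -T 1 -E 1e-5 -m 8 - databasename > outfile", \@kmers);
```

The program is started only once, which avoids both the per-kmer file and the per-kmer process startup cost.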

c) If you wanted to be even cleverer and never create files, try IPC::Open2 - http://perldoc.perl.org/IPC/Open2.html
You could push seq data in through STDIN and get the SearchIO output from STDOUT: you would just print to the handle attached to the program's STDIN and initialize a Bio::SearchIO object reading from the handle attached to its STDOUT, though there is some buffering that has to happen while you wait on the running program.
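
A minimal sketch of the Open2 idea, using a hypothetical helper called run_filter. It writes all the input, closes the write side, and only then reads, which is fine for a small input such as one kmer but could deadlock if the program emits a lot of output before it has consumed all of its input.

```perl
use strict;
use warnings;
use IPC::Open2;

# Feed $input to a command's STDIN and return the lines it prints on STDOUT.
# Intended for small inputs only (see the buffering caveat above).
sub run_filter {
    my ($input, @cmd) = @_;
    my $pid = open2(my $from_child, my $to_child, @cmd);
    print {$to_child} $input;
    close($to_child);            # send EOF so the child can finish
    my @lines = <$from_child>;   # read until the child closes its STDOUT
    close($from_child);
    waitpid($pid, 0);            # reap the child and set $?
    die "command exited with status " . ($? >> 8) . "\n" if $?;
    return @lines;
}

# Usage sketch (assumes fasta36 is on PATH and 'databasename' exists):
# my @out = run_filter(">kmer1\n$seq\n",
#                      qw(fasta36 -T 1 -E 1e-5 -m 8 - databasename));
```

Instead of slurping the lines, the read handle could be handed to Bio::SearchIO via its -fh argument if you still want the module to do the parsing.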

But it would be simpler to write to /dev/shm/outfile, a fast SSD drive, or /tmp and read back from that file. You could also keep the filehandle open and rewind it if you wanted to.  I would 
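
One way the scratch-file variant might look, as a sketch only. scratch_dir and reread are hypothetical helper names, and the sketch assumes the external command overwrites the same file on each run through a shell redirect.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempfile);

# Prefer the RAM-backed /dev/shm when it is available, else the system temp dir.
sub scratch_dir {
    return (-d '/dev/shm' && -w '/dev/shm') ? '/dev/shm' : File::Spec->tmpdir;
}

# Rewind an already-open handle and return the file's current lines.
sub reread {
    my ($fh) = @_;
    seek($fh, 0, 0) or die "seek failed: $!";
    return <$fh>;
}

# Usage sketch (assumes fasta36 is on PATH and 'databasename' exists):
# my ($fh, $path) = tempfile('searchXXXXXX', DIR => scratch_dir(), UNLINK => 1);
# for my $kmer_file (@kmer_files) {
#     system("fasta36 -T 1 -E 1e-5 -m 8 $kmer_file databasename > $path") == 0
#         or die "fasta36 failed: $?";
#     my @lines = reread($fh);   # same handle, rewound after each run
# }
```

Because the shell redirect truncates and rewrites the same file, the one open handle can be rewound after each run instead of being reopened.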

Jason

On Oct 29, 2013, at 10:24 PM, Jason Stajich <jason.stajich at gmail.com> wrote:

> Are you sure it is bioperl that is causing this? If you run top, I am sure you will find it is the fasta command.
> 
> Also, I am not sure why you initialize SearchIO twice. Just initialize it in the loop where you use it.
> 
> 
> Is the CPU usage just from running the application fasta36? You can specify the number of threads with -T, so to ask for 1 processor add "-T 1" to your fasta cmd.
> 
> On Oct 29, 2013, at 2:05 PM, Antony03 <antony.vincent.1 at ulaval.ca> wrote:
> 
>> Hi,
>> 
>> I wrote this perl script http://pastebin.com/PWVKvcQ6 and it uses bioperl
>> modules. It works well (I think) but it uses all my CPUs (8)... I don't
>> understand why.
>> 
>> Does anyone know how to execute my code on only one CPU?
>> 
>> Thanks!
>> 
>> Antony
>> 
>> 
>> 
> 
> Jason Stajich
> jason.stajich at gmail.com
> jason at bioperl.org
> 

Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org




