[Bioperl-l] Call to users/developers -- user cases that bring Bioperl to its knees

Bruno Vecchi vecchi.b at gmail.com
Mon Feb 23 16:52:57 UTC 2009


This trivial example, applied to a large input sequence, could help optimize
what I think is one of the most important BioPerl modules: Bio::SeqIO.

> #!/usr/bin/perl
> use strict;
> use warnings;
>
> use Bio::SeqIO;
>
> my $infile = 'sequences.gp';
>
> my $seqI = Bio::SeqIO->new(
>    -file => '<' . $infile,
>    -format => 'genbank',
>    -flush => 0,                    # This makes it go faster
> );
>
> my $seqO = Bio::SeqIO->new(
>    -fh => \*STDOUT,
>    -format => 'fasta',
> );
>
> while (my $seq = $seqI->next_seq) {
>    $seqO->write_seq($seq);
> }
>

Since I don't know what the policy is on file attachments on the mailing
list, I'll refrain from sending you the >4MB file that I had prepared for
profiling. I could send it to you directly if you ask me to, although any
sequence file will do.
Please note that scripts run several times slower under NYTProf, so even
short scripts like this one should be enough to produce useful profiles.
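
For anyone without a large file at hand, here is a minimal sketch for
generating synthetic input with core Perl only. It is not what I used for my
own profiling. Note that it writes FASTA rather than GenBank, so the script
above would need -format => 'fasta' on the input side, and the profile would
then exercise the FASTA parser instead of the GenBank one. The helper name
write_random_fasta and the file name sequences.fa are made up for
illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): write $n random DNA records
# of $len bases each to $path in FASTA format.
sub write_random_fasta {
    my ($path, $n, $len) = @_;
    my @bases = qw(A C G T);

    open my $out, '>', $path or die "Cannot write $path: $!";
    for my $i (1 .. $n) {
        # Build one random sequence of $len bases
        my $seq = join '', map { $bases[ int rand @bases ] } 1 .. $len;

        print {$out} ">synthetic_$i\n";
        # Wrap the sequence at 60 columns, as FASTA files usually are
        print {$out} "$1\n" while $seq =~ /(.{1,60})/g;
    }
    close $out or die "Cannot close $path: $!";
    return $path;
}

# Roughly 4 MB of input: 4000 records of 1000 bases each.
# Only runs when the file is executed directly, not when it is loaded.
write_random_fasta('sequences.fa', 4000, 1000) unless caller;
```

Raising the record count or length is an easy way to scale the run time up
towards the hour Albert asked for.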

Cheers,

Bruno.

2009/2/23 Albert Vilella <avilella at gmail.com>

> Hi all,
>
> I've discovered the profiling wonders of "perl -d:NYTProf -S" and I
> would like to play with it and Bioperl.
>
> Can interested users/developers provide a URL with a dataset that
> brings bioperl to its knees in
> terms of CPU usage for say, about 1h?
>
> Preferably no net access, no calling external programs or other
> complications, just data churning within Bioperl.
> And the data must be public, of course.
>
> The idea would be to try to identify optimizations in the code that
> could benefit us all,
>
> Cheers,
>
>    Albert.
