[Bioperl-l] postgres 8.3 - load_seqdatabase.pl / swissprot
Hilmar Lapp
hlapp at gmx.net
Sat Mar 22 21:40:55 EDT 2008
On Mar 22, 2008, at 7:36 PM, Erik wrote:
> The next thing is performance, it's really intolerably
> slow, and I don't think the database is the bottleneck -
> isn't it more likely bioperl object heaviness? I get
> continuous near 100% load for 1 cpu (this machine has 2
> cpus).
Is the database on the same machine? If yes, and a significant
fraction (~30-50% or even more) of the load is generated by the perl
script, rather than almost everything coming from the postmaster,
then indeed the database is not the bottleneck.
Of course, the BioPerl object creation overhead takes a toll too. I
would be surprised though if BioPerl can't parse more than 3.6
records/s on a modern CPU; you can convince yourself of that by
writing a simple script along the lines of the following and seeing
how fast it goes:
use Bio::SeqIO;

my $seqio = Bio::SeqIO->new(-file   => '<uniprot_sprot.dat',
                            -format => 'swiss');
my $n = 0;
while (my $seq = $seqio->next_seq) {
    $n++;
    # print something every 5,000 sequences or so
    print STDERR "$n\n" if $n % 5000 == 0;
}
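
To put an actual number on the parse rate, the same loop can be wrapped with a timer. This is only a rough sketch: measure_rate is a hypothetical helper name (not part of BioPerl), the input file is taken from the command line, and Time::HiRes is assumed to be available (it ships with core Perl).

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Hypothetical helper (not part of BioPerl): pulls records from any
# iterator until it returns undef, then returns (count, records/s).
sub measure_rate {
    my ($next_record, $report_every) = @_;
    my ($n, $start) = (0, time);
    while (defined(my $rec = $next_record->())) {
        $n++;
        if ($report_every && $n % $report_every == 0) {
            # progress line with the running rate so far
            printf STDERR "%d records, %.1f records/s\n",
                $n, $n / ((time - $start) || 1e-9);
        }
    }
    my $elapsed = (time - $start) || 1e-9;   # guard against division by zero
    return ($n, $n / $elapsed);
}

# Parse a real file only when one is given on the command line.
if (@ARGV) {
    require Bio::SeqIO;
    my $seqio = Bio::SeqIO->new(-file => $ARGV[0], -format => 'swiss');
    my ($n, $rate) = measure_rate(sub { $seqio->next_seq }, 5000);
    printf "%d records parsed, %.1f records/s\n", $n, $rate;
}
```

Run it with the path to uniprot_sprot.dat as the only argument. If the reported rate is far above the ~3.6 records/s seen during loading, parsing alone is probably not what is slowing things down.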
But maybe load_seqdatabase.pl or even BioSQL or BioPerl aren't
suitable for your use-case?
-hilmar
--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================