[Bioperl-l] timing out a blast in StandAloneBlast.pm

BHurwitz@twt.com BHurwitz@twt.com
Fri, 28 Jun 2002 11:39:04 -0500


I posted this once before but didn't get any responses, so I thought I'd
post a little more detail.  If this is the wrong place to ask this
question, just let me know.  Thanks!

I am using the StandAloneBlast.pm module in my program to run Blast, and so
far it is great!  I have one problem, though: some Blast runs take hours
because they are matching against repetitive elements and generating tons
of HSPs.  Since Blast does not have any way of setting a max number of
HSPs, I was thinking about altering the StandAloneBlast.pm module to set a
time limit on the Blast run and just retrieve the results it produced
within the specified period of time.  This would probably require some
sort of fork and exec rather than a system call, with alarm used for the
timing.
I was wondering if anyone has any advice, or if someone else has already
written similar code?
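
To make the idea concrete, here is a minimal sketch of the pattern I have
in mind: start the job as a child process, wait up to a time limit, kill it
if it runs over, and keep whatever it wrote before that.  It is written in
Python rather than Perl just to show the shape (in Perl the same pieces
would be fork, exec, and alarm).  The function name run_with_timeout and
the stand-in child command are made up for illustration; nothing here is
part of StandAloneBlast.pm.  Whether a cut-off Blast report is complete
enough to parse is a separate question that this does not answer.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd as a child process and stop it after timeout_s seconds.

    Returns (timed_out, stdout_text).  If the limit is hit, stdout_text
    holds only what the child had flushed to stdout before being killed.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
    try:
        out, _ = proc.communicate(timeout=timeout_s)
        return False, out
    except subprocess.TimeoutExpired:
        # The child is not killed automatically on timeout, so kill it,
        # then collect the output it produced up to that point.
        proc.kill()
        out, _ = proc.communicate()
        return True, out


if __name__ == "__main__":
    # Stand-in for a long search: print one line, then hang.
    # A real call would pass the Blast command line as a list instead.
    child = [sys.executable, "-c",
             "print('partial result', flush=True); "
             "import time; time.sleep(10)"]
    timed_out, text = run_with_timeout(child, 2.0)
    print("timed out:", timed_out)
    print("captured:", text.strip())
```

Anything the child has buffered but not yet flushed is lost when it is
killed, so how much of the result survives depends on how the program
writes its output.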

Probably the easiest solution would be to repeatmask the sequences before
putting them into Blast to limit these problems.  But there are two things
we are worried about: one is over-masking the sequences, and two, even if
a sequence is a repetitive element, we still need to find a place for it
on the genome.  Also, if I just time out the Blast, I wonder whether the
results I get will include the best HSPs, or whether the HSPs turn up in
random order during the search.

Thank you so much for any advice you can provide,

Bonnie