[Bioperl-l] parsing blast output and obtaining seq info

Gopinath Ganji gopi@bioinfo.sickkids.on.ca
Mon, 30 Jul 2001 16:12:53 -0400


Hey everyone,
I have been teaching myself some BioPerl over the past couple of weeks
to automate a few tasks in our project. The tasks at hand may be
summarized as follows:
- parse the BLAST output, filtering hits on frac_aligned_query inside
  filter()

- for each filtered hit, obtain the sequence (using
  get_by_acc($hit->name())), obtain the subsequence with subseq() and,
  for minus (antisense) strands, revcom() the string obtained from
  subseq()
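
In outline, the two steps above look roughly like the sketch below. It
is a simplified illustration under assumptions, not the actual script:
it assumes the Bio::SearchIO parser interface (the script itself may
use a different parser), it calls the Bio::DB::GenBank accessor by its
full name get_Seq_by_acc, and the file name, the 0.9 cutoff and the
body of filter() are placeholders.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Bio::SearchIO;
use Bio::DB::GenBank;

my $blast_file = 'multi_report.blast';   # placeholder file name
my $min_frac   = 0.9;                    # placeholder cutoff, not from the real script

# Keep a hit only if a large enough fraction of the query is aligned.
sub filter {
    my ($hit) = @_;
    my $frac = $hit->frac_aligned_query;
    return defined $frac && $frac >= $min_frac;
}

my $in = Bio::SearchIO->new(-format => 'blast', -file => $blast_file);
my $gb = Bio::DB::GenBank->new;

while (my $result = $in->next_result) {      # one result per report in the file
    while (my $hit = $result->next_hit) {
        next unless filter($hit);

        # The fetch goes over the network and can fail, so trap errors
        # and skip the hit instead of calling methods on nothing.
        my $seq = eval { $gb->get_Seq_by_acc($hit->name) };
        unless (defined $seq) {
            warn "could not fetch ", $hit->name, "\n";
            next;
        }

        my $start  = $hit->start('hit');
        my $end    = $hit->end('hit');
        my $strand = $hit->strand('hit');

        # subseq() returns a plain string; for the minus strand, truncate
        # to a sequence object first so that revcom() can be called on it.
        my $region = (defined $strand && $strand < 0)
            ? $seq->trunc($start, $end)->revcom->seq
            : $seq->subseq($start, $end);

        print ">", $hit->name, " $start-$end\n$region\n";
    }
}
```

Note that the sketch takes the hit coordinates from the whole hit
rather than from individual HSPs, which is a simplification.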

The BLAST output is in a multi-report format. For testing purposes I
used a shorter version of the file, which works without problems.
However, when I apply the script to the original file, I frequently get
error messages like "subseq() not defined" or "revcom() not defined",
each time halting at a different line in the script. When I looked at
the output generated by the script, it also seems to halt at a
different hit each time. It appears that the program crashes for some
reason; does it run out of memory, perhaps? I do know that the parse
method is known for memory leaks. However, the same thing happens when
I merely obtain the sequence using the Bio::DB::GenBank module (with the
get_by_acc method, see above). Is this module known to suffer from
leaks or crashes? Please clarify. Any help/suggestions/pointers would
be much appreciated.

Thanks in advance,
Gopi

[PS: Kindly let me know if I am not making any sense :-)]