[BioPython] How can I get a more explicit error
Yvan Strahm
yvan.strahm at bccs.uib.no
Thu Mar 19 04:10:17 EDT 2009
Hello Brad,
Thanks for the help, much appreciated.
I will look at Bowtie and Maq. In fact, I am interested in the reads that are not in the reference and
in how they differ from it: how many reads have 1, 2, 3, ... indels or mismatches.
Cheers,
yvan
Brad Chapman wrote:
> Hi Yvan;
>
>> I am trying to get a grip on Biopython and followed chapter 6 from the
>> tutorial (http://www.biopython.org/DIST/docs/tutorial/Tutorial.html)
>>
>> I ran this script:
> [...]
>> blast_results = result_handle.read()
> [...]
>> [yvans at lundalm BEE]$ python bioblast.py s_1_2_eland_extended.8000000.fta
>> Traceback (most recent call last):
>> File "bioblast.py", line 16, in <module>
>> blast_results = result_handle.read()
>> SystemError: Objects/stringobject.c:4271: bad argument to internal function
>>
>> This happens when the number of sequences blasted against the db is greater than 500000.
>> The sequences are short reads from a Solexa sequencing project.
>
> The result_handle.read() line is pulling the entire large BLAST result
> file into memory as a string. You will run out of memory with huge files,
> leading to the errors you are seeing.
>
> To limit the problem, run BLAST initially at the command line,
> and then process the resulting XML file with the BLAST parser
> as described here:
>
> http://biopython.org/DIST/docs/tutorial/Tutorial.html#htoc56
>
> This iterates over 1 record at a time, avoiding the memory issue.
>
> However, you should be using a short read aligner to map these reads
> to the genome. BLAST is not the right tool for this particular
> application; massive BLAST report files are going to be one of many
> problems you will run into analyzing the data. Here are a couple of
> popular aligners designed for the exact problem you are tackling:
>
> Bowtie: http://bowtie-bio.sourceforge.net/index.shtml
> Maq: http://maq.sourceforge.net/
>
> Hope this helps,
> Brad
> _______________________________________________
> BioPython mailing list - BioPython at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython
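
As a rough illustration of the record-at-a-time parsing Brad describes, here is a minimal sketch that also tallies reads by how many positions differ in their best hit. It assumes the BLAST output was already written to an XML file at the command line; the function names count_differences and main, and the "no hit" label, are made up for this example. The difference count is simply the alignment length minus the identities of the best HSP, so it lumps mismatches and gap columns together and ignores read ends that BLAST left out of the local alignment.

```python
from collections import Counter


def count_differences(records):
    """Tally reads by the number of differences in their best hit.

    `records` is any iterable of BLAST records, such as the iterator
    returned by Bio.Blast.NCBIXML.parse(). Only one record is held in
    memory at a time.
    """
    tally = Counter()
    for record in records:
        # Collect every HSP reported for this read.
        hsps = [hsp for aln in record.alignments for hsp in aln.hsps]
        if not hsps:
            tally["no hit"] += 1
            continue
        # Aligned columns that are not identities are mismatches or gaps.
        best = min(hsp.align_length - hsp.identities for hsp in hsps)
        tally[best] += 1
    return tally


def main(xml_path):
    # Imported here so count_differences() can be tried without Biopython.
    from Bio.Blast import NCBIXML

    with open(xml_path) as handle:
        # NCBIXML.parse() yields one record per query instead of
        # reading the whole report into a single string.
        tally = count_differences(NCBIXML.parse(handle))
    for differences, n_reads in tally.items():
        print(differences, n_reads)
```

Calling main("blast_results.xml") (a hypothetical file name) would print one line per difference count. A dedicated short read aligner will report mismatches more reliably than this; the sketch only shows that iterating over records avoids the memory problem.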