[BioPython] How can I get a more explicit error

Yvan Strahm yvan.strahm at bccs.uib.no
Thu Mar 19 08:10:17 UTC 2009


Hello Brad,

Thanks for the help, much appreciated.
I will look at Bowtie and Maq. In fact, I am interested in reads which are not in the reference and
   in how they differ from the reference: how many reads have 1, 2, 3, ... indels/mismatches.
Cheers,
yvan
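
A minimal sketch of the tally described above (how many reads have 1, 2, 3, ... mismatches or indels), assuming each read is already available as a pair of gapped alignment strings, for example the `query` and `sbjct` strings of an HSP from Biopython's NCBIXML parser. The function names and the toy sequences are made up for illustration, and indels are counted per gap column rather than per gap opening.

```python
from collections import Counter


def count_differences(query_aln, sbjct_aln):
    """Return (mismatches, gap_columns) for two equal-length aligned strings."""
    if len(query_aln) != len(sbjct_aln):
        raise ValueError("aligned strings must have the same length")
    mismatches = gaps = 0
    for q, s in zip(query_aln, sbjct_aln):
        if q == "-" or s == "-":
            gaps += 1  # one inserted or deleted base
        elif q.upper() != s.upper():
            mismatches += 1
    return mismatches, gaps


def difference_histogram(aligned_pairs):
    """Map (mismatches, gap_columns) -> number of reads with that profile."""
    return Counter(count_differences(q, s) for q, s in aligned_pairs)


# Made-up toy alignments, not real Solexa data.
toy_pairs = [
    ("ACGTACGT", "ACGTACGT"),  # identical to the reference
    ("ACGTACGT", "ACGAACGT"),  # one mismatch
    ("ACG-ACGT", "ACGTACGT"),  # one gap column
    ("ACGTACGT", "ACGTACGA"),  # one mismatch
]
histogram = difference_histogram(toy_pairs)
for (mismatches, gaps), n_reads in sorted(histogram.items()):
    print(f"{mismatches} mismatch(es), {gaps} gap column(s): {n_reads} read(s)")
```

Because local alignments may cover only part of a read, unaligned read ends are not counted here; a short read aligner reports such differences more directly.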

Brad Chapman wrote:
> Hi Yvan;
> 
>> I am trying to get to grips with Biopython and followed chapter 6 from the
>> tutorial (http://www.biopython.org/DIST/docs/tutorial/Tutorial.html)
>>
>> I run this script:
> [...]
>> blast_results = result_handle.read()
> [...]
>> [yvans at lundalm BEE]$ python bioblast.py s_1_2_eland_extended.8000000.fta
>> Traceback (most recent call last):
>>    File "bioblast.py", line 16, in <module>
>>      blast_results = result_handle.read()
>> SystemError: Objects/stringobject.c:4271: bad argument to internal function
>>
>> if the number of sequences blasted against the db is greater than 500000.
>> The sequences are small reads from a Solexa sequencing project.
> 
> The result_handle.read() line is pulling the entire large BLAST result
> file into memory as a string. You will run out of memory with huge files,
> leading to the errors you are seeing.
> 
> To limit the problem, run BLAST initially at the command line,
> and then process the resulting XML file with the BLAST parser
> as described here:
> 
> http://biopython.org/DIST/docs/tutorial/Tutorial.html#htoc56
> 
> This iterates over 1 record at a time, avoiding the memory issue.
> 
> However, you should be using a short read aligner to map these reads
> to the genome. BLAST is not the right tool for this particular
> application; massive BLAST report files are going to be one of many
> problems you will run into analyzing the data. Here are a couple of
> popular aligners designed for the exact problem you are tackling:
> 
> Bowtie: http://bowtie-bio.sourceforge.net/index.shtml
> Maq: http://maq.sourceforge.net/
> 
> Hope this helps,
> Brad
> _______________________________________________
> BioPython mailing list  -  BioPython at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython
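
The record-at-a-time approach Brad describes could look roughly like the sketch below. It assumes BLAST was run separately with XML output (`-m 7` for the older `blastall`, `-outfmt 5` for BLAST+) and that the result was saved to a file; the function names are hypothetical. `NCBIXML.parse` yields one record per query sequence, so only one record is held in memory at a time.

```python
def summarize(records):
    """Count reads with and without hits from any iterable of BLAST records."""
    with_hit = without_hit = 0
    for record in records:  # one query (read) at a time
        if record.alignments:
            with_hit += 1
        else:
            without_hit += 1  # read has no reported hit in the database
    return with_hit, without_hit


def summarize_blast_xml(xml_path):
    """Stream a BLAST XML file through Biopython's parser and summarize it."""
    # Imported here so that summarize() can be used without Biopython installed.
    from Bio.Blast import NCBIXML

    with open(xml_path) as handle:
        # NCBIXML.parse is a generator; it is fully consumed inside the
        # "with" block, before the file is closed.
        return summarize(NCBIXML.parse(handle))
```

A call such as `summarize_blast_xml("blast_results.xml")` (a placeholder file name) would then return the two counts without ever building the whole report as a single string.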


