[Biopython] multiple sequence blast

Dilara Ally dilara.ally at gmail.com
Sun Jul 3 19:27:53 UTC 2011


Hi Peter

How long would a big BLAST job with over 600,000 contigs take? Wouldn't
downloading the databases and running standalone BLAST take a lot of CPU
time and memory? Should I be doing this on a cluster?

Dilara
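
One way to make a job of this size manageable on a cluster is to split the contigs into batches and run standalone BLAST+ once per batch. The sketch below only illustrates that idea. The batch size, the batch_N.fasta file names, the choice of blastx, the e-value cutoff and the thread count are all assumptions for illustration and do not come from this thread. The commands are built but never executed.

```python
from itertools import islice
from pathlib import Path


def batches(records, batch_size):
    """Yield lists of up to batch_size items from any iterable."""
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def blast_command(query_fasta, db="nr", program="blastx", threads=4):
    """Build (but do not run) a BLAST+ command line for one batch file.

    The program, database, e-value and thread count are illustrative
    assumptions; adjust them to the actual analysis.
    """
    query_fasta = Path(query_fasta)
    return [
        program,
        "-query", str(query_fasta),
        "-db", db,
        "-outfmt", "5",  # XML output
        "-evalue", "1e-5",
        "-num_threads", str(threads),
        "-out", str(query_fasta.with_suffix(".xml")),
    ]


# Stand-in for the real contigs. Any iterable works here, for example
# the records returned by Bio.SeqIO.parse on the contig FASTA file.
contig_ids = [f"contig_{i}" for i in range(1, 11)]

for number, batch in enumerate(batches(contig_ids, 4), start=1):
    # Hypothetical file name; writing the batch to disk is left out.
    command = blast_command(f"batch_{number}.fasta")
    print(number, len(batch), " ".join(command))
```

Each command list could then be handed to subprocess.run or written into one job script per batch, so the batches can run independently on separate nodes.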

On 7/2/11 2:18 AM, Peter Cock wrote:
> On Thu, Jun 30, 2011 at 11:42 AM, Brad Chapman <chapmanb at 50mail.com> wrote:
>> Dilara;
>> Thanks for the message. It would be helpful if you'd include the
>> error message traceback that you got stuck on; this will help
>> pinpoint the problem.
>>
>> From reading your code, my guess is that you are getting an IOError
>> about files not existing. os.listdir returns only the names of the
>> files, not the full paths to where they are located.
> I would have suggested the same thing.
>
> In addition, are you really trying to run 100,000 contigs through
> the NCBI online BLAST service? If it works, it will take a long time,
> and NCBI might not like that and block your access. Big BLAST
> jobs like this are better done by installing BLAST+ (and in this case
> the NR database) locally. Biopython has wrappers to help call
> standalone BLAST too.
>
> Peter
>
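
The os.listdir point quoted above can be shown with a small example. This is a minimal sketch and not the original poster's code: the helper name list_full_paths and the file names are made up for illustration. The key step is joining each name returned by os.listdir back onto the directory before opening it.

```python
import os
import tempfile


def list_full_paths(directory):
    """Return the full path of every regular file in directory, sorted."""
    return sorted(
        os.path.join(directory, name)
        for name in os.listdir(directory)  # bare names only, no directory
        if os.path.isfile(os.path.join(directory, name))
    )


# Small demonstration in a throwaway directory with hypothetical files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("contig_b.fasta", "contig_a.fasta"):
        with open(os.path.join(tmp, name), "w") as handle:
            handle.write(">example\nACGT\n")

    for path in list_full_paths(tmp):
        # Opening the full path works from any working directory.
        with open(path) as handle:
            print(os.path.basename(path), handle.readline().strip())
```

Opening the bare name from os.listdir only works when the script happens to run inside that directory, which would explain an IOError about missing files.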


