[Biopython] problem blasting on line

Eric Talevich eric.talevich at gmail.com
Wed Nov 17 21:54:40 UTC 2010


On Wed, Nov 17, 2010 at 4:22 PM, Jessica Grant <jgrant at smith.edu> wrote:

> Hello,
>
> I am trying to use blast to extract contaminating sequences from a set of
> 454 sequence data.  My script uses NCBIWWW.qblast as follows:
>
> [...]
>
> I thought about downloading nr and using the standalone blast, but it seems
> the downloadable nr database comes in several parts, already formatted for
> blast.  Can I concatenate these?
>
> Any thoughts on the problem with the qblast or other ways to circumvent
> this problem would be greatly appreciated!
>
>
Hi Jessica,

If the problem boils down to grouping all the related sequences together, or
isolating the unrelated sequences, you might also have some luck with
CD-HIT:
http://weizhong-lab.ucsd.edu/cd-hit/
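
As a rough sketch of how the output could be used: CD-HIT writes a ".clstr"
file next to its output FASTA, for example after running something like
"cd-hit-est -i reads.fasta -o clustered.fasta -c 0.95" (the file names and the
0.95 identity threshold are placeholders, not values from your script). The
Python below groups read IDs by cluster and picks out clusters with a single
member. The helper names parse_clstr and singletons are hypothetical, and the
parser assumes the usual .clstr layout of ">Cluster N" header lines followed by
member lines of the form "0<TAB>250nt, >read_id... *".

```python
import re

# Captures the sequence ID between ">" and the trailing "..." on a member line,
# e.g. "0\t250nt, >read_001... *" -> "read_001" (assumed .clstr layout).
MEMBER_RE = re.compile(r">(\S+?)\.\.\.")


def parse_clstr(lines):
    """Return {cluster_name: [sequence IDs]} from the lines of a .clstr file."""
    clusters = {}
    current = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">Cluster"):
            # Header line such as ">Cluster 0" starts a new cluster.
            current = line[1:]
            clusters[current] = []
        elif current is not None:
            match = MEMBER_RE.search(line)
            if match:
                clusters[current].append(match.group(1))
    return clusters


def singletons(clusters):
    """Return IDs of sequences that did not cluster with anything else."""
    return [ids[0] for ids in clusters.values() if len(ids) == 1]


# Made-up example data in the assumed format, for illustration only.
SAMPLE = """>Cluster 0
0\t250nt, >read_001... *
1\t248nt, >read_002... at +/98.50%
>Cluster 1
0\t240nt, >read_003... *
"""

if __name__ == "__main__":
    grouped = parse_clstr(SAMPLE.splitlines())
    print(grouped)
    print(singletons(grouped))
```

With a real run you would pass the open file instead of SAMPLE, for instance
parse_clstr(open("clustered.fasta.clstr")). Whether the singletons or the
large clusters are the contaminants depends on your data, so treat the grouping
only as a starting point.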

Best,
Eric


