[Biopython] problem blasting on line
Eric Talevich
eric.talevich at gmail.com
Wed Nov 17 21:54:40 UTC 2010
On Wed, Nov 17, 2010 at 4:22 PM, Jessica Grant <jgrant at smith.edu> wrote:
> Hello,
>
> I am trying to use blast to extract contaminating sequences from a set of
> 454 sequence data. My script uses NCBIWWW.qblast as follows:
>
> [...]
>
> I thought about downloading nr and using the standalone blast, but it seems
> the downloadable nr database comes in several parts, already formatted for
> blast. Can I concatenate these?
>
> Any thoughts on the problem with the qblast or other ways to circumvent
> this problem would be greatly appreciated!
>
>
Hi Jessica,
If the problem boils down to grouping all the related sequences together, or
isolating the unrelated ones, you might also have some luck with CD-HIT,
which clusters sequences by similarity:
http://weizhong-lab.ucsd.edu/cd-hit/
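
A minimal sketch of how that could look from Python, assuming the reads are
nucleotide sequences in a FASTA file and that the cd-hit-est program from the
CD-HIT package is installed and on the PATH. The file names, the 0.95 identity
threshold, and the helper function names are illustrative assumptions, not
part of your script or of Biopython. The parser only relies on the plain-text
.clstr report that CD-HIT writes next to its output file.

```python
def build_cd_hit_est_command(fasta_in, out_prefix, identity=0.95, word_size=10):
    """Build the argument list for a cd-hit-est run (hypothetical helper).

    -i input FASTA, -o output prefix, -c identity threshold, -n word size.
    The word size should match the threshold; check the CD-HIT user guide.
    """
    return [
        "cd-hit-est",
        "-i", fasta_in,
        "-o", out_prefix,
        "-c", str(identity),
        "-n", str(word_size),
    ]


def parse_clstr(lines):
    """Map cluster number -> list of sequence names from a .clstr report."""
    clusters = {}
    current = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">Cluster"):
            # Header line, e.g. ">Cluster 12"
            current = int(line.split()[1])
            clusters[current] = []
        elif current is not None:
            # Member line, e.g. "0\t120nt, >read_1... *"
            name = line.split(">", 1)[1].split("...", 1)[0]
            clusters[current].append(name)
    return clusters


def singletons(clusters):
    """Names of sequences that ended up alone in their cluster."""
    return [members[0] for members in clusters.values() if len(members) == 1]


# Made-up sample in the .clstr layout, only to show the parsing step.
SAMPLE_CLSTR = (
    ">Cluster 0\n"
    "0\t120nt, >read_1... *\n"
    "1\t118nt, >read_7... at +/98.31%\n"
    ">Cluster 1\n"
    "0\t95nt, >read_4... *\n"
)

clusters = parse_clstr(SAMPLE_CLSTR.splitlines())
lonely = singletons(clusters)

# To run the clustering itself (requires CD-HIT to be installed):
#   import subprocess
#   subprocess.run(build_cd_hit_est_command("reads.fasta", "reads_c95"), check=True)
#   with open("reads_c95.clstr") as handle:
#       clusters = parse_clstr(handle)
```

Sequences that land in a cluster of their own are the candidates for being
unrelated to the rest, though with 454 data low-quality reads can also end up
there, so they are worth a second look before being called contaminants.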
Best,
Eric