[Biopython] Bio.Blast; entrez_query= multiple organisms

Peter Cock p.j.a.cock at googlemail.com
Thu Jul 7 11:42:45 UTC 2011


On Thu, Jul 7, 2011 at 12:30 PM, Jesse Colangelo-Lillis
<jessecolangelolillis at googlemail.com> wrote:
> Can someone tell me the format for specifying multiple organisms
> within the blast parameters?
>
> I have this:
>
> result_handle = NCBIWWW.qblast("blastp", "nr", gene_seq, expect=100,
> hitlist_size=1, entrez_query="unclassified Caudovirales[orgn]")
>
> but I actually want to blast against both 'Caudovirales' and
> 'unclassified Caudovirales'.
> Thanks for any help.

You would probably need explicit quotes around "unclassified Caudovirales"
in the Entrez query; otherwise I think it will be parsed as:

unclassified AND Caudovirales[orgn]

I would use the taxid rather than the name to avoid the space problem.

Is there a taxid that covers both your clades of interest?

Otherwise combine fields with OR (or AND as appropriate). Play with
the web interface to build the right query:

http://www.ncbi.nlm.nih.gov/protein/advanced
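
Something along these lines might work for the OR approach. This is an
untested sketch: the helper names organism_query and blast_against are
made up for illustration, and I am assuming the quoted-name form with
the [orgn] field tag is what the Entrez parser wants, so check the
resulting string in the web interface first.

```python
def organism_query(names):
    """Join organism names into one Entrez query, quoting each name.

    Quoting keeps a multi-word name such as "unclassified Caudovirales"
    together as one phrase instead of being split at the space.
    """
    return " OR ".join('"%s"[orgn]' % name for name in names)


def blast_against(gene_seq, names):
    """Run the BLAST search from the question, restricted to `names`.

    Hypothetical wrapper; this contacts the NCBI servers when called.
    """
    # Imported here so organism_query can be tried without Biopython.
    from Bio.Blast import NCBIWWW

    return NCBIWWW.qblast("blastp", "nr", gene_seq, expect=100,
                          hitlist_size=1,
                          entrez_query=organism_query(names))


query = organism_query(["Caudovirales", "unclassified Caudovirales"])
print(query)
```

Calling blast_against(gene_seq, [...]) with the same list should then
give a result handle as in your original code.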

Peter


