[Biopython] align sequence to genomic DNA

Peter Cock p.j.a.cock at googlemail.com
Sat Feb 28 11:53:38 UTC 2015


On Sat, Feb 28, 2015 at 5:48 AM, Horea Chrristian <h.chr at mail.ru> wrote:
> Thanks for your answer. I want my code to be as reproducible as possible and
> have as little dependencies as possible, so I would rather use remote BLAST
> than set it up (and expect all my users to set it up) locally.

Running BLAST remotely at the NCBI searches whatever the current nt
database is, so the results are not fully reproducible: the database is
updated regularly.

> is there any way to BLAST remotely just against the mouse?

Yes. The NCBI BLAST+ standalone command line tools have a -remote
option which sends the search to the NCBI servers, and with the
-entrez_query option you can add an Entrez search restriction to just
the mouse, e.g. txid10090[ORGN] using the NCBI taxonomy ID for the
house mouse.

e.g.
http://www.biopython.org/pipermail/biopython/2009-September/005620.html

However that means adding a dependency (the BLAST+ command
line tools).
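
As a rough sketch of the BLAST+ route described above, something like
the following builds a blastn command using -remote and -entrez_query.
The file names (query.fasta, mouse_hits.xml), the choice of blastn
against nt, XML output, and the helper function names are all
assumptions for illustration, not something from this thread:

```python
import shutil
import subprocess


def build_remote_blast_cmd(query_fasta, out_xml, taxid=10090,
                           program="blastn", db="nt"):
    """Return the argument list for a remote BLAST+ search limited to one taxon.

    The default taxid 10090 is the NCBI taxonomy ID for the house mouse.
    """
    return [
        program,
        "-query", query_fasta,
        "-db", db,
        "-remote",                                # run the search on the NCBI servers
        "-entrez_query", "txid%d[ORGN]" % taxid,  # Entrez restriction to one organism
        "-outfmt", "5",                           # XML output
        "-out", out_xml,
    ]


def run_remote_blast(query_fasta, out_xml, **kwargs):
    """Run the search; raises if the BLAST+ binary is not on the PATH."""
    cmd = build_remote_blast_cmd(query_fasta, out_xml, **kwargs)
    if shutil.which(cmd[0]) is None:
        raise RuntimeError("%s not found; BLAST+ must be installed" % cmd[0])
    subprocess.check_call(cmd)


if __name__ == "__main__":
    # Only print the command here; nothing is sent to the NCBI.
    print(" ".join(build_remote_blast_cmd("query.fasta", "mouse_hits.xml")))
```

Calling run_remote_blast("query.fasta", "mouse_hits.xml") would then
write an XML file which Bio.Blast.NCBIXML should be able to parse.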

I'm not sure if the NCBI offers a mouse-only database via QBLAST,
which would be the easiest route.

> You said BLAST might not be
> the best solution - what then? I tried EMBOSS' supermatcher, but that
> requires local sequences and is excruciatingly slow....
>
> Also, my main concern right now is finding the position of my hits on the
> genome (chromosome) - could you also tell me something about that?
>
> Best,
> Christian

How many query sequences do you have?

How long are your query sequences? Are your query sequences
high throughput sequencing reads?

Peter

