[Biopython] Blast using Biopython
Martin Mokrejs
mmokrejs at fold.natur.cuni.cz
Tue Oct 15 22:33:32 UTC 2013
Hi Tanya,
I suppose you are using the newer ncbi-tools++ suite. Try the legacy blastn from the ncbi-tools suite.
The version numbering is the same ... I have had better experience with "blastall -p blastn" from the old
suite. You can also try to find a switch that forces the really old blastn algorithm buried in
blastall (nowadays blastall uses the new algorithm, the one in the new ncbi-tools++ suite).
However, experience shows that "blastall -p blastn" gives different results from blastn,
although BOTH should in theory be using the new algorithm. If you can force the real
predecessor of the algorithm in blastall, you have a third method to test.
From blastall you get only limited results in the CSV-formatted output, and you cannot change the
output columns. For me, the important results can only be parsed from the XML/plaintext output of blastall.
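As a rough illustration of pulling subject coordinates out of BLAST XML, here is a minimal sketch using only the Python standard library. The element names (Hit_id, Hsp_hit-from, Hsp_hit-to, Hsp_hit-frame) are those of the NCBI BLAST XML format; the function name hsp_coordinates and the SAMPLE_XML fragment are made up for the example, and the fragment is hand-written, not real BLAST output.

```python
import xml.etree.ElementTree as ET

# Hand-written, heavily trimmed fragment in the shape of BLAST XML output.
# It is illustrative only and not the result of a real search.
SAMPLE_XML = """<BlastOutput>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_hits>
        <Hit>
          <Hit_id>contig1</Hit_id>
          <Hit_hsps>
            <Hsp>
              <Hsp_hit-from>4</Hsp_hit-from>
              <Hsp_hit-to>12</Hsp_hit-to>
              <Hsp_hit-frame>1</Hsp_hit-frame>
            </Hsp>
          </Hit_hsps>
        </Hit>
      </Iteration_hits>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>"""


def hsp_coordinates(xml_text):
    """Return one dict per HSP with the hit id and subject coordinates."""
    root = ET.fromstring(xml_text)
    rows = []
    for hit in root.iter("Hit"):
        hit_id = hit.findtext("Hit_id")
        for hsp in hit.iter("Hsp"):
            rows.append({
                "hit_id": hit_id,
                "hit_from": int(hsp.findtext("Hsp_hit-from")),
                "hit_to": int(hsp.findtext("Hsp_hit-to")),
                # The frame element may be absent in some outputs; default to 0.
                "hit_frame": int(hsp.findtext("Hsp_hit-frame", default="0")),
            })
    return rows


if __name__ == "__main__":
    for row in hsp_coordinates(SAMPLE_XML):
        print(row)
```

Biopython's own Bio.Blast.NCBIXML parser exposes the same information; the standard-library version is used here only to keep the sketch self-contained.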
You can increase the reward for a match ("-r 2") to get past some of the gaps at the edges, but it
depends on what queries you have and whether it gives you falsely widened alignments elsewhere. You
have to test that.
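A legacy blastall run with a raised match reward and XML output might be assembled as below. This is only a sketch: the file and database names are placeholders, the helper blastall_blastn_cmd is made up for the example, and actually running the command assumes the legacy ncbi-tools blastall binary is installed and on the PATH.

```python
def blastall_blastn_cmd(query, database, out_xml, reward=2):
    """Build the argument list for a legacy 'blastall -p blastn' run.

    -i  query file            -d  database name
    -m 7  XML output          -r  reward for a nucleotide match
    -o  output file
    """
    return [
        "blastall", "-p", "blastn",
        "-i", query,
        "-d", database,
        "-m", "7",
        "-r", str(reward),
        "-o", out_xml,
    ]


if __name__ == "__main__":
    # Placeholder names; replace with your own query, database and output.
    cmd = blastall_blastn_cmd("query.fasta", "my_nt_db", "result.xml")
    print(" ".join(cmd))
    # To run it for real (requires legacy blastall to be installed):
    # import subprocess
    # subprocess.run(cmd, check=True)
```

Running the same query with and without "-r 2" and comparing the HSP boundaries is one way to check whether the alignments are being widened falsely.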
Good luck,
Martin
Tanya Golubchik wrote:
> Hi guys,
>
> This is strictly speaking more about blast than biopython, but I was wondering if anyone has any tips on doing the following: searching for a hit in a nucleotide database using tblastn, but reporting the actual DNA sequence of the subject, rather than the translated protein sequence. Is there by any chance a way of extracting this from the XML output?
>
> What I'm finding is that blastn sometimes misses the edges, where substitutions close to the ends of my hit result in a truncated hit (rather than a complete hit with a mismatch or two). The full hit is reported correctly by tblastn, but of course this returns the protein translation rather than the original nucleotide sequence. It's probably a long shot, but just wondering if anyone has ideas -- the brute force approach would be to get the start and stop positions from tblastn and then extract and re-align this fragment to my query, but that seems redundant given that blast has already done this for me...
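The extraction step of the brute-force approach described in the quoted message could look roughly like this. It is a sketch under stated assumptions: the subject coordinates are taken to be 1-based and inclusive, and a negative frame is taken to mean the hit lies on the reverse strand. The helper name extract_subject_dna is made up for the example, and the coordinate convention should be checked against your own tblastn output.

```python
# Translation table for complementing plain A/C/G/T/N sequences.
_COMPLEMENT = str.maketrans("ACGTacgtNn", "TGCAtgcaNn")


def extract_subject_dna(subject_seq, hit_from, hit_to, hit_frame):
    """Return the nucleotide fragment of subject_seq covered by an HSP.

    Assumes hit_from and hit_to are 1-based inclusive positions on the
    subject, in either order, and that hit_frame < 0 marks the reverse strand.
    """
    start, end = sorted((hit_from, hit_to))
    fragment = subject_seq[start - 1:end]
    if hit_frame < 0:
        # Reverse-complement so the fragment reads in the hit's orientation.
        fragment = fragment.translate(_COMPLEMENT)[::-1]
    return fragment


if __name__ == "__main__":
    subject = "AAATGGCCTAAGGG"  # toy sequence, not real data
    print(extract_subject_dna(subject, 3, 11, 1))
    print(extract_subject_dna(subject, 3, 11, -1))
```

The fragment still has to be re-aligned to the query if a nucleotide-level alignment is needed; this only recovers the DNA under the tblastn hit.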