[Biopython] Help modify this code so it can do what I want it to do
Ivan Gregoretti
ivangreg at gmail.com
Mon Feb 3 13:43:17 UTC 2014
Hello Edson,
There is an argument that you can pass to tblastn called
max_hsps_per_subject. Try -max_hsps_per_subject 1 and be sure not to
pass the flag -ungapped. That might do the job for you.
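As a rough sketch only, this is how the call could be assembled from Python. The file and database names are made-up placeholders, and the option name is the one shown in the help text of my installed tblastn; check `tblastn -help` on your own system, since option names can differ between BLAST+ releases.

```python
import shlex


def build_tblastn_command(query_fasta, database, out_xml, max_hsps=1):
    """Return the tblastn argument list; nothing is executed here."""
    return [
        "tblastn",
        "-query", query_fasta,
        "-db", database,
        "-outfmt", "5",  # 5 = XML output, readable with Bio.Blast.NCBIXML
        "-out", out_xml,
        # Option name as printed by `tblastn -help` in this message.
        "-max_hsps_per_subject", str(max_hsps),
        # Note: -ungapped is deliberately not included.
    ]


# Placeholder names, for illustration only.
cmd = build_tblastn_command("my_proteins.fasta", "my_custom_db", "hits.xml")
print(shlex.join(cmd))
```

The list can be handed to subprocess.run(cmd, check=True) once the paths point at real files.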
The help says:

  tblastn -help
  ...
   *** Statistical options
   -dbsize <Int8>
     Effective length of the database
   -searchsp <Int8, >=0>
     Effective length of the search space
   -max_hsps_per_subject <Integer, >=0>
     Override maximum number of HSPs per subject to save for ungapped searches
     (0 means do not override)
     Default = `0'
  ...
Ivan
Ivan Gregoretti, PhD
On Mon, Feb 3, 2014 at 7:19 AM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> On Sun, Feb 2, 2014 at 7:28 PM, Edson Ishengoma
> <ishengomae at nm-aist.ac.tz> wrote:
>> Hi folks,
>>
>> I picked this code from somewhere and edited it a bit but it still can't
>> achieve what I need. I have an xml output of tblastn hits on my customized
>> database and now I am in the process to extract the results with biopython.
>> With tblastn sometimes the returned hit is multiple local hits corresponding
>> to certain positions along the query with significant scores. Now I want to
>> concatenate these local hits which initially requires sorting according to
>> positions.
>>
>> ...
>> complete_query_seq += str(query[q_start:q_end])
>> complete_sbjct_seq += str(query[sb_start:sb_end])
>> ...
>
> Shouldn't you be taking a slice from the subject sequence (the database
> match) there, rather than the query sequence?
>
> Another approach would be to use the alignment sequence fragments
> BLAST gives you (and remove the gap characters).
>
> Peter
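Peter's second suggestion could look roughly like the sketch below: sort the HSPs by their start position on the query, strip the gap characters from the aligned fragments, and join them. The Hsp namedtuple is only a stand-in I made up so the example runs on its own; its field names mirror the query_start, query and sbjct attributes of the HSP objects you get from Bio.Blast.NCBIXML. The sequences are invented, and overlapping HSPs are not handled.

```python
from collections import namedtuple

# Hypothetical minimal stand-in for a parsed BLAST HSP record.
Hsp = namedtuple("Hsp", ["query_start", "query", "sbjct"])


def concatenate_hsps(hsps):
    """Sort HSPs by query start and join their gap-free aligned fragments."""
    ordered = sorted(hsps, key=lambda h: h.query_start)
    query_seq = "".join(h.query.replace("-", "") for h in ordered)
    sbjct_seq = "".join(h.sbjct.replace("-", "") for h in ordered)
    return query_seq, sbjct_seq


# Two invented local hits, given out of order on purpose.
hsps = [
    Hsp(query_start=50, query="GHI-KL", sbjct="GHIJKL"),
    Hsp(query_start=1, query="MKTAY", sbjct="MK-AY"),
]
print(concatenate_hsps(hsps))  # ('MKTAYGHIKL', 'MKAYGHIJKL')
```

With real data you would pass alignment.hsps from each alignment of a parsed record instead of the invented list.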
> _______________________________________________
> Biopython mailing list - Biopython at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython