[Bioperl-l] Question about the definition of 'gaps' in blast -m8 output...

Chris Fields cjfields at illinois.edu
Fri Mar 20 19:10:06 UTC 2009


On Mar 20, 2009, at 12:23 PM, Dan Bolser wrote:

> 2009/3/19 Phillip San Miguel <pmiguel at purdue.edu>:
>> Dan Bolser wrote:
>>>
>>> 2009/3/18 Phillip San Miguel <pmiguel at purdue.edu>
>>>
>>>
>>>>
>>>> Dan Bolser wrote:
>>>>
>>>>
>>>>>
>>>>> Can someone clarify the definition of the 'gaps' column in the
>>>>> blast -m8 output format for me?
>>>>>
>>>>> I thought that the column 'gaps' was basically the number of
>>>>> columns in the HSP that contain a gap character.
>>>>>
>>>>>
>>>>
>>>> Hi Dan,
>>>> "gaps", to me, denotes the number of gaps, not the total length of
>>>> all the gaps.
>>>> Just my interpretation, but given your results my guess is that
>>>> whoever wrote blastall was thinking the way I do.
>>>>
>>>
>>>
>>> Yeah, I'll have to go look at the HSPs to confirm this... I'm just
>>> surprised that there are not more gaps of length >1, i.e. my data
>>> (given your interpretation) suggests that 90% of the HSPs have no
>>> gaps of length >1.
>>>
>>
>> Sounds about right. Depends on how you have gap opening vs gap
>> lengthening parameters set.
>
> I see. I thought that by default extension was less than opening, so I
> had expected there to be more gaps of length >1 ... anyway... where
> can I read more about selecting parameters for certain tasks?
> Currently I'm blasting tomato against potato sequence, and the two
> organisms are known to be 'highly syntenic' - I'm just not sure how
> that translates into how I should set the parameters. I'm after large
> alignments of large regions of the chromosome. My thinking is to just
> run through the list of HSPs and merge based on gap / window size
> (dynamic programming style) - that way I can play with the set of HSPs
> that I have, and look at the effect of different settings, then I can
> just globally align the matching regions using SW (if I need to). Does
> that sound reasonable, or is using the default settings just dumb?
>
> Cheers,
> Dan.

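On the original 'gaps' question, the two readings discussed above (number of gaps vs. number of gapped columns) can be told apart on a toy alignment. This is only a minimal sketch in Python: `gap_stats` and the two aligned strings are made up for illustration and are not part of BLAST or BioPerl. If I remember correctly, the commented header of the -m 9 variant labels this column "gap openings", which would fit the first reading.

```python
import re


def gap_stats(query_aln: str, subject_aln: str) -> tuple[int, int]:
    """Return (gap_openings, gap_columns) for one pairwise alignment.

    Hypothetical helper: both arguments are the aligned rows of an HSP,
    with '-' marking a gap position.
    """
    if len(query_aln) != len(subject_aln):
        raise ValueError("aligned rows must have the same length")
    # A run of consecutive '-' in either row counts as ONE gap opening.
    openings = sum(len(re.findall(r"-+", row)) for row in (query_aln, subject_aln))
    # Every alignment column holding a '-' in either row counts as a gap column.
    columns = sum(1 for q, s in zip(query_aln, subject_aln) if q == "-" or s == "-")
    return openings, columns


query = "ACGT--ACGTACGT"
subject = "ACGTTTACG-ACGT"
print(gap_stats(query, subject))  # (2, 3): two gaps, three gapped columns
```

If the 'gaps' column counts openings, an HSP like this one would report 2 rather than 3, so the two numbers only agree when every gap has length 1.
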
The zebrafinch group here is using BLAT for some of their work, though
I would suggest AVID, LAGAN, or maybe even MUMmer for this purpose (not
sure how the latter performs compared to the others; we have used it
for archaeal whole-genome alignments but nothing larger).
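
If you do want to experiment with merging the HSPs you already have, a rough sketch of the idea might look like the following. It is a simple greedy pass, not real dynamic programming, and it assumes forward-strand hits only; the `HSP` tuple, `merge_hsps`, and the `max_gap` threshold are hypothetical names chosen for illustration, with coordinates taken from the query/subject start and end columns of the -m8 table.

```python
from typing import NamedTuple


class HSP(NamedTuple):
    # Hypothetical record: coordinates of one forward-strand hit.
    q_start: int
    q_end: int
    s_start: int
    s_end: int


def merge_hsps(hsps: list[HSP], max_gap: int = 1000) -> list[HSP]:
    """Greedily merge collinear HSPs separated by at most max_gap
    on both the query and the subject."""
    merged: list[HSP] = []
    for h in sorted(hsps):  # sorted by q_start first
        if merged:
            last = merged[-1]
            q_gap = h.q_start - last.q_end
            s_gap = h.s_start - last.s_end
            collinear = h.s_start >= last.s_start
            if collinear and q_gap <= max_gap and s_gap <= max_gap:
                # Extend the current block to cover the new HSP.
                merged[-1] = HSP(last.q_start, max(last.q_end, h.q_end),
                                 last.s_start, max(last.s_end, h.s_end))
                continue
        merged.append(h)
    return merged


hits = [HSP(1, 100, 11, 110), HSP(150, 300, 160, 310), HSP(5000, 5200, 9000, 9200)]
for block in merge_hsps(hits, max_gap=500):
    print(block)
```

Rerunning this with different `max_gap` values is a cheap way to see how sensitive the merged blocks are to the window size before committing to a full alignment of each region.
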

chris



