[Bioperl-l] genbank (blast) alignments

Chris Fields cjfields at illinois.edu
Thu Jul 23 18:53:29 EDT 2009


Lots of emails to answer, so little time.  Doesn't help when my VPN  
goes out either ;>

What you want appears to be a multiple alignment generated from  
pairwise alignments.  As for a BioPerl module that does this, the  
answer is 'very likely not'.  However, the local BLAST executable does  
have several options for generating alignments from HSP data (assuming  
that's what you mean):

-m :     alignment view options:
	0 = pairwise,
	1 = query-anchored showing identities,
	2 = query-anchored no identities,
	3 = flat query-anchored, show identities,
	4 = flat query-anchored, no identities,
	5 = query-anchored no identities and blunt ends,
	6 = flat query-anchored, no identities and blunt ends,
	7 = XML Blast output,
	8 = tabular,
	9 = tabular with comment lines
	[Integer] default = 0
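
For the local executable, the call could be put together roughly like this. It's only a sketch in Python: it builds the argument list and does not run anything. The helper name blastall_args and the file/database names (query.fasta, refs) are placeholders I made up; -p, -d, -i, -o and -m are the usual legacy blastall flags.

```python
# View numbers and descriptions copied from the -m list above.
ALIGNMENT_VIEWS = {
    0: "pairwise",
    1: "query-anchored showing identities",
    2: "query-anchored no identities",
    3: "flat query-anchored, show identities",
    4: "flat query-anchored, no identities",
    5: "query-anchored no identities and blunt ends",
    6: "flat query-anchored, no identities and blunt ends",
    7: "XML Blast output",
    8: "tabular",
    9: "tabular with comment lines",
}


def blastall_args(program, database, query, view=0, out=None):
    """Build (but do not run) a legacy blastall argument list.

    program, database, query and out are caller-supplied placeholders;
    view must be one of the -m values listed above.
    """
    if view not in ALIGNMENT_VIEWS:
        raise ValueError("unknown -m view: %r" % (view,))
    args = ["blastall", "-p", program, "-d", database,
            "-i", query, "-m", str(view)]
    if out is not None:
        args += ["-o", out]
    return args


if __name__ == "__main__":
    # Hypothetical file and database names, for illustration only.
    print(" ".join(blastall_args("blastp", "refs", "query.fasta", view=4)))
```

The list can be handed to subprocess.run() if blastall is actually installed and the database has been formatted.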

You can also select these views by reformatting the results on the  
BLAST web site (here's a chunk of the output; note the Query row):

Query         61   PVTVGEIDITLYRDDLS-KKTSND-E--PLVKGADIPVDIT-------DQKVILVDDVLY  109
NP_389430     61   PVTVGEIDITLYRDDLS-KKTSND-E--PLVKGADIPVDIT-------DQKVILVDDVLY  109
YP_001421124  61   PVTVGEIDITLYRDDLT-KKTSNE-E--PLVKGADIPADIT-------DQKVIVVDDVLY  109
YP_078940     63   KVTVGELDITLYRDDLS-KKTSNK-E--PLVKGADIPADIT-------DQKVILVDDVLY  111
ZP_03053294   61   PVIVGELDITLYRDDLT-KKTENQ-D--PLVKGADIPADIN-------DKTLIVVDDVLF  109
YP_001486689  61   PVIVGELDITLYRDDLT-KKTDNQ-D--PLVKGADIPADIN-------DKTLIVVDDVLF  109
YP_002949168  60   AVPVGELDITLYRDDLT-VKTIDH-E--PLVKGTDVPFDVT-------NKKVILVDDVLF  108
ZP_01860800   61   KMPVGEIDITLYRDDLT-VKTANE-E--PEVKGSDLPVDVT-------DKKVILIDDVLF  109
ZP_04121773   61   EMEVGELDITLYRDDLT-LQSKNK-E--PLVKGSDIPVDIT-------KKKVILVDDVLY  109
ZP_04218628   61   EMEVGELDITLYRDDLT-LQSKNK-E--PLVKGSDIPVDIT-------KKKVILVDDVLY  109
YP_002316154  66   SIPVGELDITLYRDDLT-VKTDDR-E--PLVKGTDVPFSVT-------NQKVILVDDVLF  114
ZP_00240953   61   EMEVGELDITLYRDDLT-LQSKNE-E--PLVKGSDIPVDIT-------KKKVILVDDVLY  109
YP_037953     61   EIEVGELDITLYRDDLT-LQSKNK-E--PLVKGSDIPVDIT-------KKKVILVDDVLY  109
ZP_04193166   61   KMEVGELDITLYRDDLT-LQSKNK-E--PLVKGSDIPVDIT-------KKKVILVDDVLY  109
NP_833611     61   EMEVGELDITLYRDDLT-LQSKNK-E--PLVKGSDIPVDIT-------KKKVILVDDVLY  109
ZP_03018932   61   EMEVGELDITLYRDDLT-LQSKNK-E--PLVKGSDIPVDIT-------KKKVILVDDVLY  109
...

BioPerl does not have a parser for that format, BTW, but it wouldn't  
be too hard to get something working quickly based on one of the  
current parsers.  It could probably go in AlignIO or SearchIO (or both).
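
To give an idea of how little is needed, here is a rough sketch of such a parser. It is in Python rather than Perl and is not BioPerl code; the names parse_flat_query_anchored and to_fasta are made up for illustration. It assumes each row sits on a single line (id, start, aligned residues, end), that every id shows up once in every block, and that the residue column contains no spaces. Real reports may break any of those assumptions.

```python
import re
from collections import OrderedDict

# One row of the flat query-anchored view: id, start, aligned residues, end.
ROW = re.compile(r"^(\S+)\s+(\d+)\s+(\S+)\s+(\d+)\s*$")


def parse_flat_query_anchored(lines):
    """Return an ordered {id: aligned string} mapping.

    Rows with the same id are concatenated in the order they are seen,
    so successive blocks of the report are joined end to end.
    Lines that do not look like a row (blank lines, '...') are skipped.
    """
    rows = OrderedDict()
    for line in lines:
        match = ROW.match(line.rstrip("\n"))
        if match is None:
            continue
        name, _start, residues, _end = match.groups()
        rows[name] = rows.get(name, "") + residues
    return rows


def to_fasta(rows):
    """Write the gapped rows as aligned FASTA text."""
    return "".join(">%s\n%s\n" % (name, seq) for name, seq in rows.items())


if __name__ == "__main__":
    # Two rows taken from the chunk of output above.
    chunk = [
        "Query         61   PVTVGEIDITLYRDDLS-KKTSND-E--PLVKGADIPVDIT-------DQKVILVDDVLY  109",
        "YP_078940     63   KVTVGELDITLYRDDLS-KKTSNK-E--PLVKGADIPADIT-------DQKVILVDDVLY  111",
        "...",
    ]
    print(to_fasta(parse_flat_query_anchored(chunk)), end="")
```

Aligned FASTA is used here only because it is trivial to write; the gapped strings could be handed to any alignment writer (MSF included) once they are all the same length.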

chris

On Jul 23, 2009, at 2:38 PM, Robert Buels wrote:

> Wow, that silence is deafening.  I can't believe somebody who knows  
> what they're talking about hasn't written you back yet.
>
> Perhaps you could do some kind of transformation where you read in  
> the BLAST report with Bio::SearchIO, and then write to MSF with  
> Bio::AlignIO::msf?  You would probably need to do some fiddling to  
> create the proper objects and relationships that Bio::AlignIO::msf  
> would want.
>
> But this reply probably isn't helpful, because you probably already  
> knew that much.  I'm mostly just trying to add to this thread so  
> that people who actually know a lot about BioPerl's functions in  
> this area will see it and hopefully be of more help.
>
> Rob
>
> -- 
> Robert Buels
> Bioinformatics Analyst, Sol Genomics Network
> Boyce Thompson Institute for Plant Research
> Tower Rd
> Ithaca, NY  14853
> Tel: 503-889-8539
> rmb32 at cornell.edu
> http://www.sgn.cornell.edu
>
>
> Thomas Keller wrote:
>> Greetings,
>> Blast 2.2.21 has a multi-sequence alignment feature that is really  
>> handy: put in the accession number of the refseq in one sequence  
>> field and a concatenated fasta file of the Sanger reads to align in  
>> the second box and it does the alignments. Unfortunately, the  
>> output is a series of alignments rather than the more useful msf  
>> format with all reads aligned with the reference.
>> Is there a bioperl module that reads the blast alignments and  
>> converts it to an msf alignment?
>> thanks,
>> Tom
>> kellert at ohsu.edu
>> 503-494-2442
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
