[Bioperl-l] convert fasta output to blast -m8?

Jason Stajich jason.stajich at duke.edu
Tue May 24 12:17:57 EDT 2005


I think this all depends on your versions of FASTA and Bioperl. There
were some changes in the FASTA output format which broke the
SearchIO::fasta parser in older Bioperl releases. I answered a similar
question recently on the list:
   http://bioperl.org/pipermail/bioperl-l/2005-May/018870.html

Also, if all you need is -m8-style output, I would run fasta with the
-d 0 -m 9 options.

And if you really just want to convert FASTA output to BLAST-style
tables (which I do all the time for my own work) and want a super-fast
parser for it, I wrote a simple script for this. It is in
scripts/searchio/fastam9_to_table.PLS
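
For reference, a BLAST -m8 row is just 12 tab-separated columns: query
id, subject id, percent identity, alignment length, mismatches, gap
openings, query start, query end, subject start, subject end, e-value
and bit score. Below is a minimal sketch of writing one such row in
plain Perl. It is not the code in fastam9_to_table.PLS. The sub name
m8_row, the hash keys and the sample values are all made up for
illustration.

```perl
use strict;
use warnings;

# The 12 columns of BLAST -m8 tabular output, in order.
# These key names are hypothetical, chosen only for this sketch.
my @M8_COLUMNS = qw(query subject pct_identity aln_length mismatches
                    gap_opens q_start q_end s_start s_end evalue bits);

# Build one tab-delimited -m8 style row from a hash ref of fields.
sub m8_row {
    my ($f) = @_;
    for my $col (@M8_COLUMNS) {
        die "missing field '$col'\n" unless defined $f->{$col};
    }
    return join("\t",
        $f->{query},
        $f->{subject},
        sprintf('%.2f', $f->{pct_identity}),
        @{$f}{qw(aln_length mismatches gap_opens
                 q_start q_end s_start s_end)},
        $f->{evalue},
        sprintf('%.1f', $f->{bits}),
    );
}

# Invented values, only to show the layout of a row.
print m8_row({
    query        => 'query1',
    subject      => 'subject1',
    pct_identity => 25.5,
    aln_length   => 300,
    mismatches   => 200,
    gap_opens    => 5,
    q_start      => 1,
    q_end        => 290,
    s_start      => 10,
    s_end        => 305,
    evalue       => '8.3e-05',
    bits         => 44,
}), "\n";
```

The hash would be filled from whatever parses the FASTA report, for
example the per-HSP values that SearchIO hands back, or fields split
out of the -m 9 summary lines.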


-jason

On May 24, 2005, at 11:14 AM, Amir Karger wrote:

> Hi.
>
> I've been asked to translate Fasta output to Blast -m8 output. I  
> could do it
> by hand, but I have a feeling SearchIO & Writer can do this pretty  
> easily.
> Can someone give me a couple hints?
>
> I tried running a ridiculously simple script on fasta -m9 output:
>
> use Bio::SearchIO;
> my $searchio = new Bio::SearchIO(-format => 'fasta',
>                                 -file   => 'short.out');
> while( my $result = $searchio->next_result ) {
>     print $result->query_name;
> }
>
> And I got:
>
> Use of uninitialized value in concatenation (.) or string at
> /usr/local/lib/perl5/site_perl/5.8.4/Bio/Search/HSP/GenericHSP.pm line 231,
> <GEN1> line 61.
>
> ------------- EXCEPTION  -------------
> MSG: Did not specify a Query End or Query Begin -verbose 0 -algorithm FASTP
> -score 186.3 -hit_frame 0 -hsp_length 300 -hit_seq
> PPPPPPTAETFDSDQTSSFSDINSTTASAPTTPAPALPPASPEVRKEETHPKHSLPPLPNQFAPLPDPPQ 
> HNSPPQ
> NNAPSQPQSNPFPFPIPEIPSTQSATNPFPFPVPQQQ-- 
> FNQAPSMGIPQQNRPLPQLPNRNNRPVPPPPPMRTTT
> EGSGVRL---PAPPPP---PRRGPAPPPPPHRHVTSNTL------ 
> NSAGGNSLLPQATGRRGPAPPPPPRASRPTP
> NVTMQQNPQQYNNSNRPFGYQTNSNMSSPPPPPVTTFNTLTPQMTAATGQPAVPLPQNTQAPSQATNVPV 
> AP
> -hit_length 300 -query_length 300 -query_frame 0 -swscore 212 -rank 1
> -query_seq
> MYQSMTVP-PFRPYGGDDIRVVSDLSRFDYQPDQKIRSRNPTPP--- 
> STINDNVSSSKLTLDTIIPLY---SSKID
> ERPKYSPLRQQEDRSTQYPSPPIPVKEEPTITIPKREKKKVRYSIGVQVPQDNGGISMTNNPAPPAPVPV 
> PVPAPA
> PPPPPPKDIAPRSMPYPQDINNANNLPPMPQPTSQLYPQQQLPPLPYKDSSSITSPQKRLEKKLIKQVMN 
> RPVIQF
> KADRFGQNYEGEYFTISANFVIYVFEVCCSVVEIVLSSILLQRDQDI -homology_seq
> :.: :  : .. ..:      . .  :  .  .  : ::   :  ..:. :.  .    .:.    :..  
>    :
> :. ::.   .: ::  :: ...:   .:.:... :     ...  ...:. .   :::: :   :  .:: 
> :::::
> . ..  ..      :.:..   .:: :..  :    ::   . . ..:  :
> -hit_name lcl|cerevisiae|YOR181W| -bits 44.0 -query_name
> lcl|albicans|CA0100| -evalue 8.3e-05 (qs='
> STACK Bio::Search::HSP::GenericHSP::new
> /usr/local/lib/perl5/site_perl/5.8.4/Bio/Search/HSP/GenericHSP.pm:231
> STACK Bio::Search::HSP::FastaHSP::new
> /usr/local/lib/perl5/site_perl/5.8.4/Bio/Search/HSP/FastaHSP.pm:97
> STACK Bio::Factory::ObjectFactory::create_object
> /usr/local/lib/perl5/site_perl/5.8.4/Bio/Factory/ObjectFactory.pm:150
> STACK Bio::SearchIO::SearchResultEventBuilder::end_hsp
> /usr/local/lib/perl5/site_perl/5.8.4/Bio/SearchIO/SearchResultEventBuilder.pm:275
> STACK Bio::SearchIO::fasta::end_element
> /usr/local/lib/perl5/site_perl/5.8.4/Bio/SearchIO/fasta.pm:872
> STACK Bio::SearchIO::fasta::next_result
> /usr/local/lib/perl5/site_perl/5.8.4/Bio/SearchIO/fasta.pm:403
> STACK toplevel a.pl:8
>
> --------------------------------------
> lcl|albicans|CA0099|
>
> (The last thing is actually the query, so it's sort of doing the right
> thing. And line 61 of short.out, where the uninitialized value
> happens, is the beginning of the second hit.)
>
> Running bp_filter_search.pl -format fasta -score 150 on the same  
> output file
> produced no output at all. Is -m9 confusing it? Or is there some other
> problem?
>
> Pointers to docs etc. appreciated.
>
> -Amir Karger

--
Jason Stajich
jason.stajich at duke.edu
http://www.duke.edu/~jes12/



