[Bioperl-l] How to parse BLAST output - all hits of each query in new file

Wed Nov 25 19:21:27 UTC 2009

whoops: change the following line:
my $blio = Bio::SearchIO->new( -file => $qid.".bls", -format=>'blast' );

to

my $blio = Bio::SearchIO->new( -file => ">$qid\.bls", -format=>'blast' );

(I always forget that...)
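
For completeness, here is a minimal sketch of the whole round trip. It takes a slightly different route from the hash shown in the quoted message: since each result returned by next_result already corresponds to one query and carries all of that query's hits, it writes each result straight to its own file. Assumptions that are mine and not from the thread: the report name comes from the command line, output goes through Bio::SearchIO::Writer::TextResultWriter (so you get a text rendering of the result, not necessarily a byte-for-byte copy of the original report), and the file-name cleanup is only illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Bio::SearchIO;
use Bio::SearchIO::Writer::TextResultWriter;

# Assumption: the multi-query BLAST report is given as the first argument.
my $report = shift @ARGV or die "usage: $0 blast_report\n";

my $in = Bio::SearchIO->new( -file => $report, -format => 'blast' );

# Each result holds one query together with all of its hits.
while ( my $result = $in->next_result ) {

    # Query names can contain characters that are awkward in file names
    # (for example '|' in NCBI-style IDs), so replace them with '_'.
    ( my $name = $result->query_name ) =~ s/[^\w.-]/_/g;

    my $writer = Bio::SearchIO::Writer::TextResultWriter->new();
    my $out    = Bio::SearchIO->new(
        -writer => $writer,
        -file   => ">$name.bls",    # leading '>' opens the file for writing
    );
    $out->write_result($result);
}
```

Run it as "perl split_blast.pl myreport.bls" (the script name is hypothetical); it should leave one .bls file per query in the current directory.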
MAJ

----- Original Message ----- 
From: "Mark A. Jensen" <maj at fortinbras.us>
To: "Tim" <timbourine81 at gmail.com>; <bioperl-l at lists.open-bio.org>
Sent: Wednesday, November 25, 2009 1:20 PM
Subject: Re: [Bioperl-l] How to parse BLAST output - all hits of each query in new file


> hey Tim--
>
> Sounds like you need to go about collecting your queries inside out:
>
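> my %hits_by_query;
> # $in is assumed to be a Bio::SearchIO object reading your BLAST report
> while ( my $result = $in->next_result ) {
>  push @{$hits_by_query{$result->query_name}}, $result->hits;
> }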
>
> I believe each hash element, keyed by the query name, will now contain
> an arrayref to the set of hits associated with that query.
> From here, I believe
>
> use Bio::Search::Result::BlastResult;
> use Bio::SearchIO;
>
> foreach my $qid ( keys %hits_by_query ) {
>  my $result = Bio::Search::Result::BlastResult->new();
>  $result->add_hit($_) for ( @{$hits_by_query{$qid}} );
>  my $blio = Bio::SearchIO->new( -file => $qid.".bls", -format=>'blast' );
>  $blio->write_result($result);
> }
>
> will do what you want.
>
> hope this helps -
> Mark
>
> ----- Original Message ----- 
> From: "Tim" <timbourine81 at gmail.com>
> To: <bioperl-l at lists.open-bio.org>
> Sent: Wednesday, November 25, 2009 12:40 PM
> Subject: [Bioperl-l] How to parse BLAST output - all hits of each query in new file
>
>
>> Dear bioperl users,
>>
>> I am a real newbie and have - maybe a very trivial - question.
>>
>> I searched the mailing list archive and many howtos but I have not found
>> a concrete answer to my problem. So hopefully you can help me :)
>>
>> Background: I use the latest Bioperl version (installed it two weeks
>> ago).
>> When I use Bio::Tools::Run::StandAloneBlast to BLAST one fasta file
>> including different sequences, I get a BLAST output with many queries
>> each having several hits / sbjcts.
>>
>> My problem is how to parse *all* hits of *one* query into a single new
>> file. And this for all the queries I have in my BLAST output file.
>>
>> Or is it better the other way round: first make fasta files with only
>> a single sequence each and BLAST each file? But how can I automate that
>> using Bioperl?
>>
>> I tried Bio::SearchIO but can only parse all queries and their
>> respective hits into a single file...
>> I think iteration is also necessary here, but I do not really know how
>> to do that with Bio::SearchIO.
>> Or do I have to use the module Bio::Index::Blast?
>>
>> I can index a file (see below), but I have no idea what comes next...
>>
>> ###How I index a file...
>>
>> #!/usr/bin/perl -w
>>
>> $ENV{BIOPERL_INDEX_TYPE} = "SDBM_File";
>>
>> use Bio::Index::Fasta;
>>
>>
>> $file_name = "8_to_BLAST_two_seq_index.fasta";
>> $id = "48882";
>> $inx = Bio::Index::Fasta->new (-filename => $file_name . ".idx",
>> -write_flag => 1);
>> $inx->make_index($file_name);
>>
>>
>> Hopefully, you can give me at least some hints on what to look for.
>>
>> A big THANKS in advance!
>>
>> Cheers,
>>
>> Tim
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l