[Bioperl-l] parse blast-xml output

Brian Osborne bosborne11 at verizon.net
Wed Jan 18 19:20:36 UTC 2012



http://www.bioperl.org/wiki/HOWTO:SearchIO#Sorting
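
If only the first hit of each result is wanted, one option is to call next_hit once per result instead of looping over it. Below is a minimal sketch of that idea, not a tested recipe. It assumes the hits in the report are already in the order you want (BLAST normally lists them best first) and that the reports match the same file name prefix as in the quoted script. The helper name first_hits is made up for illustration and is not part of BioPerl.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns one [query, hit name, hit description] row
# per result, taken from the first hit only. Results without hits are skipped.
sub first_hits {
    my ($searchio) = @_;
    my @rows;
    while ( my $result = $searchio->next_result ) {
        # A single next_hit call (no inner while loop) yields the first hit.
        my $hit = $result->next_hit or next;
        push @rows, [ $result->query_name, $hit->name, $hit->description ];
    }
    return @rows;
}

# Same file selection as in the quoted script, with the dots escaped.
for my $blast_report ( grep { /^454AllContigs\.fna\.masked/ } glob('*') ) {
    require Bio::SearchIO;    # loaded only when there is a report to parse
    my $searchio = Bio::SearchIO->new(
        -format => 'blastxml',
        -file   => $blast_report,
    );
    print join( "\t", @$_ ), "\n" for first_hits($searchio);
}
```

The only real change from the quoted code is that the inner while loop over next_hit is replaced by a single call, so each query contributes at most one output line.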


On Jan 18, 2012, at 5:57 AM, Jordi Durban wrote:

> Hi all!
> I'm trying to parse an XML BLAST output (-m 7 option) in order to get the
> best hit (I mean the first one) from each result I got.
> This is what I've done:
> my @files=<*>;
> my @files2 = grep (/^454AllContigs.fna.masked*/, @files);
> foreach my $blast_report (@files2) {
>     # Get the report
>     my $searchio = new Bio::SearchIO (-format => 'blastxml',
>         -file => $blast_report, -best => 'true');
>     while ( my $result = $searchio->next_result ) {
>         my $query = $result->query_name();
>         #~ print $query,"\n"; ##### result query names
>         while ( my $hits = $result->next_hit ) {
>             #~ print $hits,"\n"; ###### all of the hits
>             my $name = $hits->name();
>             my $desc = $hits->description();
>             print $query."\t".$name."\t".$desc,"\n";
>         }
>     }
> }
> 
> But it does not work: I get all of the hits for each query instead of only
> the first one. For example:
> contig01181    gi|63794|emb|X03832.1|    Chicken mRNA 3' end for fast
> skeletal troponin I (sTnI)
> contig01181    gi|110293358|gb|DQ646396.1|    Lama pacos troponin 1 type 2
> (Tnni2) mRNA, partial cds
> contig01181    gi|298897248|emb|FQ224489.1|    Rattus norvegicus
> TL0ACA64YG07 mRNA sequence
> contig01181    gi|298892466|emb|FQ217985.1|    Rattus norvegicus
> TL0ACA12YG21 mRNA sequence
> contig01181    gi|298889559|emb|FQ217454.1|    Rattus norvegicus
> TL0ACA25YO07 mRNA sequence
> contig01181    gi|298888987|emb|FQ223772.1|    Rattus norvegicus
> TL0ACA87YD21 mRNA sequence
> 
> I know some Perl and I think this is a real newbie question, but any help
> would be appreciated.
> Thanks a lot.
> -- 
> Jordi
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l



