[Bioperl-l] Splitting BLAST report

Dave Messina David.Messina at sbc.su.se
Wed May 11 15:28:39 UTC 2011


Hi Luis-Miguel,

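That's right, you can't write out blastxml from within BioPerl: there is no
Bio::SearchIO::Writer::blastxml module, which is why Bio::SearchIO->new fails
to load it.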

Have you looked at blast_formatter that comes with BLAST+, though? From the
manual:

"It may be helpful to view the same BLAST results in different formats. A
user may first parse the tabular format looking for matches meeting a
certain criteria, then go back and examine the relevant alignments in the
full BLAST report."

So you might be able to do what you want using command-line tools from the
NCBI.
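
A rough sketch of driving blast_formatter from Perl, in case it helps. It
assumes the search is rerun with BLAST+ and saved as an archive (-outfmt 11),
since as far as I know legacy blastall cannot produce one. formatter_command
and the file names are made up for illustration. Note that this re-renders the
whole archive in another format; it does not split it by query.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build the argument list for one blast_formatter call.
# $archive: BLAST+ archive written with -outfmt 11 (assumed to exist)
# $outfmt:  BLAST+ format code, e.g. 6 = tabular, 5 = XML, 0 = pairwise
# $out:     output file name
sub formatter_command {
    my ($archive, $outfmt, $out) = @_;
    return ('blast_formatter',
            '-archive', $archive,
            '-outfmt',  $outfmt,
            '-out',     $out);
}

# When run as "script.pl hits.asn", write a tabular and an XML view
# of the same results next to the archive.
if (!caller && @ARGV == 1) {
    my $archive = $ARGV[0];
    for my $job ([6, "$archive.tsv"], [5, "$archive.xml"]) {
        my @cmd = formatter_command($archive, @$job);
        # List form of system() avoids shell quoting problems.
        system(@cmd) == 0 or die "blast_formatter failed: $?";
    }
}
```

The tabular file can then be filtered first, and the XML or pairwise view
consulted only for the hits that pass.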

Otherwise, XML is highly structured, so if you just want to break the report
up into one file per query, it would probably be pretty straightforward using
good old-fashioned Perl.
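
Something along these lines might do as a starting point. It is only a sketch
and makes a few assumptions: the report is a single XML document with one
<Iteration> element per query, each tag sits on its own line (as NCBI writes
it), and the output files are simply numbered. split_blastxml and the
query_NNNNN.xml naming are my own inventions, not anything from BioPerl.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split a multi-query BLAST XML report into one file per <Iteration>.
# Reads from the filehandle $in, writes into the directory $dir and
# returns the list of files written.
sub split_blastxml {
    my ($in, $dir) = @_;
    my (@header, @block, @files);
    my $footer = "  </BlastOutput_iterations>\n</BlastOutput>\n";
    my $state  = 'header';
    my $n      = 0;

    while (my $line = <$in>) {
        if ($state eq 'header') {
            # Everything up to and including <BlastOutput_iterations>
            # is copied to the top of every output file.
            push @header, $line;
            $state = 'between' if $line =~ /<BlastOutput_iterations>/;
        }
        elsif ($state eq 'between') {
            # Matches <Iteration> only, not <Iteration_iter-num> etc.
            if ($line =~ /<Iteration>/) {
                @block = ($line);
                $state = 'iteration';
            }
        }
        else {
            push @block, $line;
            if ($line =~ m{</Iteration>}) {
                my $file = sprintf "%s/query_%05d.xml", $dir, ++$n;
                open my $out, '>', $file or die "Cannot write $file: $!";
                print {$out} @header, @block, $footer;
                close $out or die "Cannot close $file: $!";
                push @files, $file;
                $state = 'between';
            }
        }
    }
    return @files;
}

# When run as "script.pl report.xml outdir", split the given report.
if (!caller && @ARGV == 2) {
    my ($xml, $dir) = @ARGV;
    mkdir $dir unless -d $dir;
    open my $in, '<', $xml or die "Cannot read $xml: $!";
    my @files = split_blastxml($in, $dir);
    print scalar(@files), " report(s) written to $dir\n";
}
```

The header is copied verbatim, so any query fields it carries still describe
the first query, and each <Iteration_iter-num> keeps its original value. I
would check that whatever reads the pieces afterwards does not mind.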


Dave




2011/5/11 Luis-Miguel Rodríguez Rojas <lmrodriguezr at gmail.com>

> Dear all,
>
> Is there a way to split a BlastXML report with multiple queries into several
> BlastXML reports with one query each?
>
> So far, I have something similar to the following code:
>
> #!/usr/bin/perl
> use strict;
> use Bio::SearchIO;
> use Bio::Tools::Run::StandAloneBlast;
>
> # $file contains the output file
> # $severalQueries contains the queries
> # %args contains other BLAST parameters
> # [...] First, run the large BLAST:
> my $factory = Bio::Tools::Run::StandAloneBlast->new(%args);
> $factory->o($file);
> $factory->m(7); # BlastXML
> my $report = $factory->blastall($severalQueries);
>
> # $dir is the output directory
> # [...] Now, in another script, or another part of the script:
> my $report = Bio::SearchIO->new(-file=>$file, -format=>'blastxml');
> mkdir $dir unless -d $dir;
> while(my $result = $report->next_result){
>    my $newFile = $dir."/".$result->query_accession.".xml";
>    my $searchIO = Bio::SearchIO->new(-file=>">$newFile",
>                                      -output_format=>'blastxml');
>    $searchIO->write_result($result);
> }
>
> # [...]
> __END__
>
> The BLAST runs correctly, and the first output (the large XML) is there.
> However, the second part fails with the following message, clearly stating
> that I can't create a Bio::SearchIO object with output format BlastXML:
>
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: Failed to load module Bio::SearchIO::Writer::blastxml. Can't locate
> Bio/SearchIO/Writer/blastxml.pm in @INC (@INC contains:
> /home/equipe/resistance/lrodrigu/Runbox/Unus/lib
> /home/equipe/resistance/lrodrigu/lib/perl5
> /home/equipe/resistance/lrodrigu/lib/perl5/5.8.8
> /home/equipe/resistance/lrodrigu/lib/perl5/site_perl/5.8.8
> /usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi
> /usr/lib/perl5/site_perl/5.8.8 /usr/lib/perl5/site_perl
> /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi
> /usr/lib/perl5/vendor_perl/5.8.8 /usr/lib/perl5/vendor_perl
> /usr/lib/perl5/5.8.8/i386-linux-thread-multi /usr/lib/perl5/5.8.8 .) at
> /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm line 439, <GEN25> line
> 48675.
> STACK: Error::throw
> STACK: Bio::Root::Root::throw
> /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:368
> STACK: Bio::Root::Root::_load_module
> /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:441
> STACK: Bio::SearchIO::new
> /usr/lib/perl5/site_perl/5.8.8/Bio/SearchIO.pm:180
> STACK: Unus::Blast::run
> /home/equipe/resistance/lrodrigu/Runbox/Unus/lib/Unus/Blast.pm:88
> STACK: Unus::Orth::BsrAuto::extract_values
> /home/equipe/resistance/lrodrigu/Runbox/Unus/lib/Unus/Orth/BsrAuto.pm:47
> STACK: Unus::Orth::BsrAuto::thresholds
> /home/equipe/resistance/lrodrigu/Runbox/Unus/lib/Unus/Orth/BsrAuto.pm:31
> STACK: Unus::Orth::Bsr::build_orthref_file
> /home/equipe/resistance/lrodrigu/Runbox/Unus/lib/Unus/Orth/Bsr.pm:42
> STACK: Unus::Orth::Bsr::run
> /home/equipe/resistance/lrodrigu/Runbox/Unus/lib/Unus/Orth/Bsr.pm:25
> STACK: Unus::Unus::calculate_orthologs
> /home/equipe/resistance/lrodrigu/Runbox/Unus/lib/Unus/Unus.pm:175
> STACK: Unus::Unus::run
> /home/equipe/resistance/lrodrigu/Runbox/Unus/lib/Unus/Unus.pm:149
> STACK: /home/equipe/resistance/lrodrigu/bin/unus2:45
> -----------------------------------------------------------
>
>
> I checked the supported output formats (Bio::SearchIO::Writer::*) and none
> of them seems to be either BlastXML or Blast, so I assume I am not heading
> in the right direction.
>
> Thanks in advance!
>
> Best,
> LRR
>
> --
> Luis M. Rodriguez-R
> [ http://thebio.me/lrr ]
> ---------------------------------
> UMR Résistance des Plantes aux Bioagresseurs - Group effecteur/cible
> Institut de Recherche pour le Développement, Montpellier, France
> [ http://bioinfo-prod.mpl.ird.fr/xantho | Luismiguel.Rodriguez at ird.fr ]
> +33 (0) 6.29.74.55.93
>
> Unidad de Bioinformática del Laboratorio de Micología y Fitopatología
> Universidad de Los Andes, Bogotá, Colombia
> [ http://lamfu.uniandes.edu.co | luisrodr at uniandes.edu.co ]
> +57 (1) 3.39.49.49 ext 2777
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>



