[Bioperl-l] Splitting BLAST report

Luis-Miguel Rodríguez Rojas lmrodriguezr at gmail.com
Wed May 11 15:32:50 UTC 2011


Hello Dave,

Thanks for your answer.  Yes, I am currently working on your second
suggestion.  It is pretty simple, indeed.
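
For readers of the archive, the second suggestion (breaking the XML up
query by query with plain Perl) could look roughly like the sketch below.
It is only a sketch under some assumptions: the report is a single
<BlastOutput> document with one <Iteration> element per query, and each
<Iteration> and </Iteration> tag sits on its own line.  Reports made of
several concatenated XML documents are not handled.  The sub name
split_blast_xml and the query_NNNNNN.xml file names are made up for
illustration; files are numbered rather than named after the query,
because query IDs may contain characters that are awkward in file names.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split a multi-query BLAST XML report into one file per <Iteration>.
# Hypothetical helper: returns the list of files it wrote.
sub split_blast_xml {
    my ($file, $dir) = @_;
    mkdir $dir unless -d $dir;
    open my $in, '<', $file or die "Cannot read $file: $!";

    # Closing tags appended to every per-query file.
    my $footer = "</BlastOutput_iterations>\n</BlastOutput>\n";
    my ($header, $iteration, $count, @written) = ('', '', 0);
    my $in_iteration = 0;

    while (my $line = <$in>) {
        # Matches <Iteration> only, not <Iteration_iter-num> and friends.
        if ($line =~ /<Iteration>/) {
            $in_iteration = 1;
            $iteration    = '';
        }
        if ($in_iteration) {
            $iteration .= $line;
            if ($line =~ m{</Iteration>}) {
                $in_iteration = 0;
                $count++;
                my $out_file = sprintf "%s/query_%06d.xml", $dir, $count;
                open my $out, '>', $out_file
                    or die "Cannot write $out_file: $!";
                print {$out} $header, $iteration, $footer;
                close $out or die "Cannot close $out_file: $!";
                push @written, $out_file;
            }
        }
        elsif ($count == 0) {
            # Everything before the first <Iteration> is the shared header.
            $header .= $line;
        }
    }
    close $in;
    return @written;
}

# Usage: perl split_blast_xml.pl report.xml output_dir
split_blast_xml(@ARGV) if @ARGV == 2;
```

The header is copied unchanged into every file, so any per-query fields
it carries (for instance the BlastOutput_query-* elements) still describe
whatever query the original header described.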

Thanks!
LRR

--
Luis M. Rodriguez-R
[ http://thebio.me/lrr ]
---------------------------------
UMR Résistance des Plantes aux Bioagresseurs - Group effecteur/cible
Institut de Recherche pour le Développement, Montpellier, France
[ http://bioinfo-prod.mpl.ird.fr/xantho | Luismiguel.Rodriguez at ird.fr ]
+33 (0) 6.29.74.55.93

Unidad de Bioinformática del Laboratorio de Micología y Fitopatología
Universidad de Los Andes, Bogotá, Colombia
[ http://lamfu.uniandes.edu.co | luisrodr at uniandes.edu.co ]
+57 (1) 3.39.49.49 ext 2777



2011/5/11 Dave Messina <David.Messina at sbc.su.se>

> Hi Luis-Miguel,
>
> That's right, you can't write out blastxml from within BioPerl.
>
> Have you looked at blast_formatter that comes with BLAST+, though? From the
> manual:
>
> "It may be helpful to view the same BLAST results in different formats. A
> user may first parse the tabular format looking for matches meeting a
> certain criteria, then go back and examine the relevant alignments in the
> full BLAST report."
>
> So you might be able to do what you want using command-line tools from the
> NCBI.
>
> Otherwise, xml is extremely structured, so if you just want to break it up
> into files query by query, it would probably be pretty straightforward using
> good old-fashioned Perl.
>
>
> Dave
>
>
>
>
> 2011/5/11 Luis-Miguel Rodríguez Rojas <lmrodriguezr at gmail.com>
>
>> Dear all,
>>
>> Is there a way to split a BlastXML report with multiple queries into
>> several BlastXML reports with one query each?
>>
>> So far, I have something similar to the following code:
>>
>> #!/usr/bin/perl
>> use strict;
>> use Bio::SearchIO;
>> use Bio::Tools::Run::StandAloneBlast;
>>
>> # $file contains the output file
>> # $severalQueries contains the queries
>> # %args contains other BLAST parameters
>> # [...] First, run the large BLAST:
>> my $factory = Bio::Tools::Run::StandAloneBlast->new(%args);
>> $factory->o($file);
>> $factory->m(7); # BlastXML
>> my $report = $factory->blastall($severalQueries);
>>
>> # $dir is the output directory
>> # [...] Now, in another script, or another part of the script:
>> my $report = Bio::SearchIO->new(-file=>$file, -format=>'blastxml');
>> mkdir $dir unless -d $dir;
>> while(my $result = $report->next_result){
>>    my $newFile = $dir."/".$result->query_accession.".xml";
>>    my $searchIO = Bio::SearchIO->new(-file=>">$newFile",
>>                                      -output_format=>'blastxml');
>>    $searchIO->write_result($result);
>> }
>>
>> # [...]
>> __END__
>>
>> The BLAST runs correctly, and the first output (the large XML) is there.
>> However, the second part fails with the following message, clearly stating
>> that I can't create a Bio::SearchIO object with output format BlastXML:
>>
>> ------------- EXCEPTION: Bio::Root::Exception -------------
>> MSG: Failed to load module Bio::SearchIO::Writer::blastxml. Can't locate
>> Bio/SearchIO/Writer/blastxml.pm in @INC (@INC contains:
>> /home/equipe/resistance/lrodrigu/Runbox/Unus/lib
>> /home/equipe/resistance/lrodrigu/lib/perl5
>> /home/equipe/resistance/lrodrigu/lib/perl5/5.8.8
>> /home/equipe/resistance/lrodrigu/lib/perl5/site_perl/5.8.8
>> /usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi
>> /usr/lib/perl5/site_perl/5.8.8 /usr/lib/perl5/site_perl
>> /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi
>> /usr/lib/perl5/vendor_perl/5.8.8 /usr/lib/perl5/vendor_perl
>> /usr/lib/perl5/5.8.8/i386-linux-thread-multi /usr/lib/perl5/5.8.8 .) at
>> /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm line 439, <GEN25> line
>> 48675.
>> STACK: Error::throw
>> STACK: Bio::Root::Root::throw
>> /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:368
>> STACK: Bio::Root::Root::_load_module
>> /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:441
>> STACK: Bio::SearchIO::new
>> /usr/lib/perl5/site_perl/5.8.8/Bio/SearchIO.pm:180
>> STACK: Unus::Blast::run
>> /home/equipe/resistance/lrodrigu/Runbox/Unus/lib/Unus/Blast.pm:88
>> STACK: Unus::Orth::BsrAuto::extract_values
>> /home/equipe/resistance/lrodrigu/Runbox/Unus/lib/Unus/Orth/BsrAuto.pm:47
>> STACK: Unus::Orth::BsrAuto::thresholds
>> /home/equipe/resistance/lrodrigu/Runbox/Unus/lib/Unus/Orth/BsrAuto.pm:31
>> STACK: Unus::Orth::Bsr::build_orthref_file
>> /home/equipe/resistance/lrodrigu/Runbox/Unus/lib/Unus/Orth/Bsr.pm:42
>> STACK: Unus::Orth::Bsr::run
>> /home/equipe/resistance/lrodrigu/Runbox/Unus/lib/Unus/Orth/Bsr.pm:25
>> STACK: Unus::Unus::calculate_orthologs
>> /home/equipe/resistance/lrodrigu/Runbox/Unus/lib/Unus/Unus.pm:175
>> STACK: Unus::Unus::run
>> /home/equipe/resistance/lrodrigu/Runbox/Unus/lib/Unus/Unus.pm:149
>> STACK: /home/equipe/resistance/lrodrigu/bin/unus2:45
>> -----------------------------------------------------------
>>
>>
>> I checked the supported output formats (Bio::SearchIO::Writer::*) and none
>> of them seems to be either BlastXML or Blast, so I assume I am not heading
>> in the right direction.
>>
>> Thanks in advance!
>>
>> Best,
>> LRR
>>
>> --
>> Luis M. Rodriguez-R
>> [ http://thebio.me/lrr ]
>> ---------------------------------
>> UMR Résistance des Plantes aux Bioagresseurs - Group effecteur/cible
>> Institut de Recherche pour le Développement, Montpellier, France
>> [ http://bioinfo-prod.mpl.ird.fr/xantho | Luismiguel.Rodriguez at ird.fr ]
>> +33 (0) 6.29.74.55.93
>>
>> Unidad de Bioinformática del Laboratorio de Micología y Fitopatología
>> Universidad de Los Andes, Bogotá, Colombia
>> [ http://lamfu.uniandes.edu.co | luisrodr at uniandes.edu.co ]
>> +57 (1) 3.39.49.49 ext 2777
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>
>



