[Bioperl-l] Fwd: low complexity filter in StandAloneBlastPlus

Paul Cantalupo pcantalupo at gmail.com
Thu Feb 13 22:59:37 UTC 2014


Paul Cantalupo
University of Pittsburgh


---------- Forwarded message ----------
From: Paul Cantalupo <pcantalupo at gmail.com>
Date: Thu, Feb 13, 2014 at 5:59 PM
Subject: Re: [Bioperl-l] low complexity filter in StandAloneBlastPlus
To: Carnë Draug <carandraug+dev at gmail.com>


Hi Carne,

Careful, Table C3 [4] is for BLASTP, where the low-complexity filter is
off by default. The default depends on the BLAST program, though. For
example, the filter is on by default for tblastn (see Table C5 [4]).
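
Since the option name differs between programs, here is a minimal sketch of
a lookup that returns the right -method_args for turning the filter off. The
helper name no_filter_args and the %FILTER_OPTION table are made up for
illustration and are not part of BioPerl; the option names are the blast+
command-line flags as I understand them (-dust for blastn, -seg for the
others), so please check them against the manual for your blast+ version.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): map a BLAST+ program name to
# the command-line option that controls its low-complexity filter.
# Assumption: blastn uses -dust, the other programs listed here use -seg.
my %FILTER_OPTION = (
    blastn  => '-dust',
    blastp  => '-seg',
    blastx  => '-seg',
    tblastn => '-seg',
    tblastx => '-seg',
);

# Return an array ref suitable for -method_args that disables the filter.
sub no_filter_args {
    my ($method) = @_;
    my $option = $FILTER_OPTION{$method}
        or die "No known low-complexity option for '$method'\n";
    return [ $option => 'no' ];
}

# Usage with a factory like the one in your code:
#   my $result = $fac->tblastn(
#       -query       => $file,
#       -outfile     => "$file.bls",
#       -method_args => no_filter_args('tblastn'),
#   );
```

Passing the option explicitly this way also means the result does not depend
on whichever default a given program happens to use.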

Paul


Paul Cantalupo
University of Pittsburgh


On Thu, Feb 13, 2014 at 5:52 PM, Carnë Draug <carandraug+dev at gmail.com> wrote:

> On 13 February 2014 22:38, Carnë Draug <carandraug+dev at gmail.com> wrote:
> > On 13 February 2014 22:09, Paul Cantalupo <pcantalupo at gmail.com> wrote:
> >> Hi Carne,
> >>
> >> Take a look at the synopsis of
> >> Bio::Tools::Run::StandAloneBlastPlus::BlastMethods. I think you need to
> >> use the method_args parameter:
> >>
> >>  $result = $fac->blastn( -query => 'query_seqs.fas',
> >>                          -outfile => 'query.bls',
> >>                          -method_args => [ '-dust' => 'no' ] );
> >>
> >
> > Hi Paul
> >
> > thanks for your reply, but where do you see this documentation? It is
> > not in the latest release [1], the bioperl-run repository [2], or
> > bioperl-live [3] (which doesn't even have ::BlastMethods).
> >
> > Also, I did what you suggested but got a "Blast run: parameter 'dust'
> > is not available for method 'tblastn'" error.
> >
> > This is my simple code, from the very start (including creation of the
> > database):
> >
> > Bio::Tools::Run::StandAloneBlastPlus->new(
> >   -db_data => $db_data,
> >   -db_name => $db_name,
> >   -create => 1,
> > )->make_db();
> >
> > my $fac = Bio::Tools::Run::StandAloneBlastPlus->new(
> >   -db_name => $db_name,
> > );
> >
> > foreach my $file (@files) {
> >   my $result = $fac->tblastn(
> >     -query   => $file,
> >     -outfile => $file . '.bls',
> >     -method_args => ['-dust' => 'no'],
> >   );
> >   ## do stuff with $result
> > }
> >
> > Thank you
> > Carnë
> >
> > [1] https://metacpan.org/pod/Bio::Tools::Run::StandAloneBlastPlus::BlastMethods
> > [2] https://github.com/bioperl/bioperl-run/blob/master/lib/Bio/Tools/Run/StandAloneBlastPlus/BlastMethods.pm
> > [3] https://github.com/bioperl/bioperl-live/blob/master/Bio/Tools/Run/StandAloneBlast.pm
>
> For future reference, I just figured it out by reading the blast+ manual,
> Table C3 [4], and some comments in the source of
> Bio::Tools::Run::StandAloneBlastPlus. Basically, dustmasker is used
> for nucleotides only and segmasker for protein. From your example, I
> deduced that I could use "seg" to disable the filter for protein
> sequences. The following works fine and gives me the same results as
> disabling the low complexity filter in the NCBI web interface.
>
> my $result = $fac->tblastn(
>   -query   => $file,
>   -outfile => $file . '.bls',
>   -method_args => ['-seg' => 'no'],
> );
>
> Also, for future reference, note that blast+ default is "no"
> (according to its manual), but bioperl's module changes it to "yes".
> I'm guessing this is to use the same defaults as the NCBI web
> interface.
>
> Thank you all, one more time.
>
> Carnë
>
> [4] http://www.ncbi.nlm.nih.gov/books/NBK1763/
>



