[Bioperl-l] clustalw

Sendu Bala bix at sendu.me.uk
Tue Mar 6 13:33:59 UTC 2007


Luba Pardo wrote:
> Hello,
> I tried to post this question yesterday (sorry if you get the email several
> times).

We did. Please only send one email and trust that it will make it to the 
list.


> I am trying to run a script for Clustalw based on few examples. I always get
> an error:
> 
> EXCEPTION: Bio::Root::Exception -------------
> MSG: Bad input data (sequences need an id ) or less than 2 sequences in
> ARRAY(0x8861280) !
> STACK: Error::throw
> STACK: Bio::Root::Root::throw
> /usr/lib/perl5/site_perl/5.8.1/Bio/Root/Root.pm:359
> STACK: Bio::Tools::Run::Alignment::Clustalw::align
> /usr/lib/perl5/site_perl/5.8.1/Bio/Tools/Run/Alignment/Clustalw.pm:484
> STACK: clustal1.pl:17

As the Exception message states, you probably didn't supply 2 or more 
sequences. See how many elements @seq_array has after your while loop. 
What exactly is 'clustalw.fa'? Is it really a plain, unaligned 
multi-fasta file with 2 or more sequences in it?
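
A quick way to check this is a small diagnostic script along the following lines. It is only a sketch: it assumes the input is the same 'clustalw.fa' used in your script (or a file given on the command line), and it simply reports how many sequences Bio::SeqIO read and which ids they carry.

```perl
use strict;
use warnings;
use Bio::SeqIO;

# Assumption: the file name comes from the command line, falling back to
# the name used in the quoted script.
my $file = @ARGV ? $ARGV[0] : 'clustalw.fa';

my $in = Bio::SeqIO->new(-file => $file, -format => 'fasta');

# Collect every sequence object the parser returns.
my @seq_array;
while (my $seq = $in->next_seq) {
    push @seq_array, $seq;
}

# Report what was actually read, so an empty or single-entry file is obvious.
printf "Read %d sequence(s) from '%s'\n", scalar @seq_array, $file;
for my $seq (@seq_array) {
    my $id = $seq->display_id;
    $id = '(none)' unless defined $id && length $id;
    printf "  id=%s length=%d\n", $id, $seq->length;
}

warn "align() needs at least 2 sequences, each with an id\n"
    if @seq_array < 2;
```

If this reports fewer than 2 sequences, or any id shown as '(none)', the problem is in the input file rather than in Clustalw.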


>  or
> 
> EXCEPTION: Bio::Root::NotImplemented -------------
> MSG: Abstract method "Bio::Tools::Run::WrapperBase::run" is not implemented
> by package Bio::Tools::Run::Alignment::Clustalw.
> This is not your fault - author of Bio::Tools::Run::Alignment::Clustalw
> should be blamed!

You're using code from the synopsis of the 'live' (latest, CVS-only) 
version of Bio::Tools::Run::Alignment::Clustalw but do not have that 
version installed. The run() method was only added recently. If you 
actually want the run() method, update the Clustalw module from CVS.

http://www.bioperl.org/wiki/Getting_BioPerl#CVS
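
To see which of these methods your installed copy actually provides, something like the sketch below may help. The helper name defines_method is made up for illustration. It checks whether the Clustalw package itself defines a method, because an inherited abstract stub (such as the run() in Bio::Tools::Run::WrapperBase) would make a plain can() test succeed even though the method is not implemented.

```perl
use strict;
use warnings;

# Hypothetical helper: true only if $class itself defines $method.
# Inherited methods (including abstract stubs) do not count.
sub defines_method {
    my ($class, $method) = @_;
    no strict 'refs';
    return defined &{"${class}::${method}"} ? 1 : 0;
}

my $class = 'Bio::Tools::Run::Alignment::Clustalw';

if (eval "require $class; 1") {
    # Methods used in the synopsis of the newer module version.
    for my $method (qw(align tree run footprint)) {
        printf "%-10s %s\n", $method,
            defines_method($class, $method)
                ? 'defined in this install'
                : 'not defined in this install';
    }
}
else {
    print "Could not load $class: $@";
}
```

Any method reported as not defined will only become available after updating the module.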


> THIS IS THE SCRIPT
> 
> BEGIN {$ENV{CLUSTALDIR} = '/home/luba/bin/clustalx1.82.linux/';}
> 
> use Bio::SeqIO;
> use Bio::Tools::Run::Alignment::Clustalw;
> use Bio::SimpleAlign;
> use Bio::AlignIO;
> #use strict;
> use warnings;
> 
> my @params = ('ktuple' => 2, 'matrix' => 'BLOSUM');
>   my $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params);
> 
>   my $str = Bio::SeqIO->new(-file=> 'clustalw.fa ', '-format' => 'Fasta');
>   my @seq_array =();
>   while ( my $seq = $str->next_seq() ) {push (@seq_array, $seq) ;}
>   my $seq_array_ref = \@seq_array;
> 
>   my $aln = $factory->align($seq_array_ref);
> 
>    # Get a tree of the sequences
>   $tree = $factory->tree(\@seq_array);
> 
>   # Get both an alignment and a tree
>   ($aln, $tree) = $factory->run(\@seq_array);
> 
>   # Do a footprinting analysis on the supplied sequences, getting back the
>   # most conserved sub-alignments
>   my @results = $factory->footprint(\@seq_array);
>   foreach my $result (@results) {
>     print $result->consensus_string, "\n";
>   }

You need to learn to read and understand the synopsis code before trying 
to use it. The synopsis code usually isn't intended to be used 
wholesale. Rather, as in this case, it demonstrates a few useful things 
that might not make sense all in the same script. So there's no need for 
you to get an alignment with the align() method, a tree with the tree() 
method and then get the alignment and tree again with the run() method. 
You also don't need to do footprinting with footprint() unless you're 
actually interested in footprinting!

tree() and footprint() won't work for you because, again, those are 
recent additions to the module. Upgrade from CVS if you really want to 
footprint.
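
If all you want is the alignment, a trimmed-down version of your script could look roughly like this. It is a sketch rather than a tested fix: it keeps your CLUSTALDIR setting and parameters as they were, uses only align(), and writes the file name without the trailing space that appears in your script (whether that space matters in your setup is an assumption worth checking). Printing the result in 'clustalw' format to STDOUT is just one choice of output.

```perl
# Set the program location before the Clustalw module is loaded,
# exactly as in the original script.
BEGIN { $ENV{CLUSTALDIR} = '/home/luba/bin/clustalx1.82.linux/'; }

use strict;
use warnings;
use Bio::SeqIO;
use Bio::AlignIO;
use Bio::Tools::Run::Alignment::Clustalw;

my @params  = ('ktuple' => 2, 'matrix' => 'BLOSUM');
my $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params);

# Read the unaligned sequences.
my $in = Bio::SeqIO->new(-file => 'clustalw.fa', -format => 'fasta');
my @seq_array;
while (my $seq = $in->next_seq) {
    push @seq_array, $seq;
}

# Fail early with a clear message instead of the align() exception.
die "Need at least 2 sequences, got " . scalar(@seq_array) . "\n"
    if @seq_array < 2;

# Run the alignment only; no tree(), run() or footprint() calls.
my $aln = $factory->align(\@seq_array);

# Print the alignment in Clustalw format.
my $out = Bio::AlignIO->new(-fh => \*STDOUT, -format => 'clustalw');
$out->write_aln($aln);
```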


