[Bioperl-l] questions on Bio::Tools::Run::Alignment::Clustalw

Dave Messina David.Messina at sbc.su.se
Thu May 19 12:28:37 UTC 2011


Hi Lorenzo,



> Anyway, what the error message means?:
>

"Uninitialized value" means that the variable doesn't have a value — it's
not set to anything.


> so I guess the problem should refer to $self->executable,
>

Yes, I think that's probably right.



>  which must be solved after changing the executable name to clustalw (is it
> right?).
>

Well, yes, that's part of it. But I think the real problem must be that the
directory where you have the clustalw executable is not being found by
BioPerl.

I notice that there's a space after the directory name when you set
$ENV{CLUSTALDIR}. Get rid of the space, so it looks like:

BEGIN { $ENV{CLUSTALDIR} = '/Applications/Bioinformatics/clustalw-2.0.10-macosx/' }

BioPerl is not properly catching the trailing space — I'll look into why and
see if I can get it to be a little more defensive against this kind of
thing.
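
In the meantime, a script can guard against this itself. The following is only a rough sketch in plain Perl, not what BioPerl does internally: clean_dir and find_program are hypothetical helper names, and 'clustalw' is assumed to be the name of the executable. It trims stray whitespace from the directory setting and checks that an executable file really is there.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: trim leading/trailing whitespace from a directory
# setting; returns undef if nothing usable is left.
sub clean_dir {
    my ($dir) = @_;
    return undef unless defined $dir;
    $dir =~ s/^\s+|\s+$//g;
    return length($dir) ? $dir : undef;
}

# Hypothetical helper: return the full path to the program if it exists
# and is executable, otherwise undef.
sub find_program {
    my ($dir, $name) = @_;
    $dir = clean_dir($dir);
    return undef unless defined $dir;
    my $path = File::Spec->catfile($dir, $name);
    return (-x $path) ? $path : undef;
}

# 'clustalw' is an assumed executable name; adjust to match your install.
my $exe = find_program($ENV{CLUSTALDIR}, 'clustalw');
print defined $exe
    ? "found $exe\n"
    : "clustalw not found; check CLUSTALDIR\n";
```

If the cleaned value is meant to be used by BioPerl, it would presumably still have to be assigned to $ENV{CLUSTALDIR} inside a BEGIN block, as in the line above.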



>  However, I don't understand the rest of the error message:
>
> 	sh: align: command not found
>
>
Warning: long-winded explanation follows.

So, align is a parameter that the bioperl code is trying to pass to the
clustalw executable (I bet $command back on line 754 is set to 'align').
Since $self->executable has no value, the first thing passed to the shell
to execute is the word 'align'. And since there's no program called align in
your PATH, you get a 'command not found' from the shell. The same exact
thing would happen if you opened your terminal and typed align and hit
return.

tl;dr it's a side effect of $self->executable being uninitialized.
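
To make that concrete, here is a toy illustration in plain Perl. It is not the actual BioPerl code: build_command and program_word are hypothetical names, and the paths and the -infile argument are made up for the example. It just shows how an undefined executable leaves 'align' as the first word the shell sees.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the step that assembles the shell command.
sub build_command {
    my ($exe, $command, $args) = @_;
    # An undefined $exe interpolates as an empty string. Normally that
    # triggers the "uninitialized value" warning; it is silenced here
    # only to keep the demo output quiet.
    no warnings 'uninitialized';
    return "$exe $command $args";
}

# The shell treats the first whitespace-separated word as the program name.
sub program_word {
    my ($cmd) = @_;
    return (split ' ', $cmd)[0];
}

my $good = build_command('/usr/local/bin/clustalw', 'align', '-infile=seqs.fa');
my $bad  = build_command(undef, 'align', '-infile=seqs.fa');

print program_word($good), "\n";   # /usr/local/bin/clustalw
print program_word($bad),  "\n";   # align
```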



> Regarding my last question, what I want is to align the sequences, using
> clustalw preferably, to get the total number of aligned aas for both the
> longest and the shortest sequence in the alignment. I need these data to
> apply the following formula:
>
>      	  I' = I * Min(n1/L1, n2/L2)
>
>    	 where I is the percentage of identical aas in the aligned region,
> 	Li is the length of sequence i and ni is the number of aas in the
> 	aligned regions in sequence i
>
>
Yes, I think you'll have to loop through the seqs in the alignment one by
one to get how many aas each one has in the aligned region, but (if you
haven't already) do look through the Bio::SimpleAlign docs and see if
something there will be of use. There might also be some scripts in the
scripts directory that do something like this.
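
As a starting point, here is a rough sketch of that loop for a pair of sequences, in plain Perl with no BioPerl calls. It rests on several assumptions that you should check against your own definitions: the formula is read as I' = I * Min(n1/L1, n2/L2); the "aligned region" is taken to run from the first to the last column in which both sequences have a residue; I is the percentage of identical columns in that region, with gap columns counted in the denominator; and gaps are written as '-' or '.'. The name adjusted_identity is hypothetical. With a Bio::SimpleAlign object, the gapped strings could come from calling seq() on each sequence returned by each_seq().

```perl
use strict;
use warnings;
use List::Util qw(min);

# Hypothetical helper: takes two gapped strings of equal length and
# returns I' as a percentage, under the assumptions stated above.
sub adjusted_identity {
    my ($s1, $s2) = @_;
    die "aligned strings differ in length\n"
        unless length($s1) == length($s2);

    my @c1 = split //, uc $s1;
    my @c2 = split //, uc $s2;

    # Columns where both sequences have a residue; the first and last of
    # these bound the aligned region (an assumption, see above).
    my @both = grep { $c1[$_] !~ /[-.]/ && $c2[$_] !~ /[-.]/ } 0 .. $#c1;
    return 0 unless @both;
    my ($start, $end) = ($both[0], $both[-1]);

    # n1, n2: residues of each sequence inside the region;
    # $ident: columns in the region where both residues match.
    my ($n1, $n2, $ident) = (0, 0, 0);
    for my $i ($start .. $end) {
        my $r1 = $c1[$i] !~ /[-.]/;
        my $r2 = $c2[$i] !~ /[-.]/;
        $n1++ if $r1;
        $n2++ if $r2;
        $ident++ if $r1 && $r2 && $c1[$i] eq $c2[$i];
    }

    # L1, L2: full ungapped sequence lengths.
    (my $u1 = $s1) =~ tr/-.//d;
    (my $u2 = $s2) =~ tr/-.//d;
    my ($len1, $len2) = (length($u1), length($u2));

    my $identity = 100 * $ident / ($end - $start + 1);
    return $identity * min($n1 / $len1, $n2 / $len2);
}

# Made-up example: 4 identical columns, each sequence 6 residues long.
printf "%.1f\n", adjusted_identity('MKTAYL--', '--TAYLGG');   # 66.7
```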


Dave



