[Bioperl-l] Recoding Bio::SimpleAlign
Jun Yin
jun.yin at ucd.ie
Fri Jul 16 15:54:36 UTC 2010
Dear all,
I am the Google Summer of Code student working on refactoring Bio::Align
subsystems. The first aim of the project is to recode Bio::SimpleAlign. The
package is really useful, but it was created a long time ago and written by
several people, and it has become somewhat inconsistent as a result.
I tried to keep the package consistent (e.g. method calling, coding styles)
with the previous distribution. However, there are still a few changes.
Since this package is created and used by the community, I think it is
better to show it to everyone before it is merged with the major
distribution. Any suggestions and criticisms are welcome.
Here are the major improvements to Bio::SimpleAlign:
1. MSA modification and selection methods are more consistent and easier to
use. I have enabled multiple/reverse selections for all sequence/column
selection methods, and changed the names to be more understandable.
For example,
$aln->select() and $aln->select_noncont() are both deprecated and replaced
by $aln->select_Seqs(). Selections can be made on both seqs and columns, so
the method name needs to say explicitly which one is meant.
For example, multiple sequence selections can be called by:
$newaln=$aln->select_Seqs([4..10,20..35,37]);
$newaln=$aln->select_Seqs(-selection=>[4..10,20..35,37]);
Or you can toggle the selection (reverse selection) using:
$newaln=$aln->select_Seqs([4..10,20..35,37],1);
$newaln=$aln->select_Seqs(-selection=>[4..10,20..35,37],-toggle=>1);
If you call the method the old way, e.g.
$newaln=$aln->select(1,5);
A warning will be shown:
select - deprecated method. Use select_Seqs() instead.
The call is then redirected to:
$newaln=$aln->select_Seqs([1..5]);
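The selection, toggle and redirect behaviour described above can be
illustrated with a small stand-alone sketch. This is not the
Bio::SimpleAlign code: select_items and select_range_deprecated are
hypothetical helpers working on a plain array of names, and 1-based
positions are an assumption taken from the select(1,5) example.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of Bio::SimpleAlign: returns the items at the
# given 1-based positions, or every other item when $toggle is true.
sub select_items {
    my ($items, $selection, $toggle) = @_;
    my %picked = map { $_ => 1 } @$selection;
    my $want   = $toggle ? 0 : 1;    # toggle reverses which items are kept
    my @kept;
    for my $pos (1 .. scalar @$items) {
        my $in_selection = $picked{$pos} ? 1 : 0;
        push @kept, $items->[$pos - 1] if $in_selection == $want;
    }
    return \@kept;
}

# Hypothetical wrapper in the style of the old select(start, end) call:
# warn about the deprecation, then forward to the new-style selection.
sub select_range_deprecated {
    my ($items, $start, $end) = @_;
    warn "select - deprecated method. Use select_Seqs() instead.\n";
    return select_items($items, [$start .. $end]);
}

my @seqs   = map { "seq$_" } 1 .. 6;
my $chosen = select_items(\@seqs, [2 .. 3, 5]);       # normal selection
my $others = select_items(\@seqs, [2 .. 3, 5], 1);    # toggled selection
print "@$chosen\n";    # seq2 seq3 seq5
print "@$others\n";    # seq1 seq4 seq6
```

The toggle flag only changes which side of the selection is kept, so a
single loop can serve both the normal and the reversed case.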
2. Gap chars/missing chars are handled more consistently in the package.
Default values for gap char and missing char are now set in the package.
The gap char should be read or set through $aln->gap_char, e.g.
$aln->gap_char("-").
3. Some redundant methods have been removed, and methods have been moved to
more reasonable categories.
For example, $aln->select and $aln->select_noncont are deprecated now.
Please use $aln->select_Seqs.
4. Some methods are renamed. Methods selecting/giving objects are
capitalized, e.g. each_seq to each_Seq.
Other methods are renamed to give clearer information, for example:
$aln->purge is renamed to $aln->remove_redundant_Seqs
$aln->splice_by_seq_pos is renamed to $aln->remove_gaps
For further information, you can visit:
http://spreadsheets.google.com/ccc?key=0AssLTcJFJMbXdDFfZGpJZlhidFY5blBneGdhQUZ6WFE&hl=en&authkey=CJTCw4QL
Cheers,
Jun Yin
Ph.D. student in U.C.D.
Bioinformatics Laboratory
Conway Institute
University College Dublin