[Bioperl-l] Bio::SimpleAlign, uniq_seq

Tristan Lefebure tristan.lefebure at gmail.com
Thu Apr 2 13:30:51 UTC 2009


Thanks you both,

To internally store the ST composition, so that I can reuse it in the same
script, I made the following modifications to SimpleAlign.pm:

diff /usr/local/share/perl/5.10.0/Bio/SimpleAlign.pm
/usr/local/share/perl/5.10.0/Bio/SimpleAlignMod.pm
590a591,592
>     #modified to also returned an array of the ST composition
>     my %st;
651a654
>         push @{$st{$order{$str}}}, $_->id();
655c658
<     return $aln;
---
>     return ($aln, %st);

This is probably not really BioPerl compliant. Being an OBO ignorant, I
wonder if we could add this information somewhere either once in the $aln
object, or by little pieces in each  Bio::LocatableSeq objects?

Thks,

--Tristan

On Thu, Apr 2, 2009 at 12:15 AM, Mark A. Jensen <maj at fortinbras.us> wrote:

> Thanks Weigang-- I didn't look carefully enough--
> I'll make a change to the POD.
> so Tristan, in your code below, add
>
> $aln->verbose(1);
>
> before you invoke uniq_seq(). The ST's should
> then be sent to stderr (as "warns").
>
> MAJ
> ----- Original Message ----- From: "Weigang Qiu" <weigangq at gmail.com>
> To: "Mark A. Jensen" <maj at fortinbras.us>
> Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>; <
> tristan.lefebure at gmail.com>
> Sent: Wednesday, April 01, 2009 11:57 PM
> Subject: Re: [Bioperl-l] Bio::SimpleAlign, uniq_seq
>
>
>
>  Mark and Tristan,
>>
>> I am the original instigator of the uniq_seq method. The STDERR
>> implementation was used so that STDOUT could be piped. But it did not
>> conform to bioperl convention of using the $self->debug() method. I think
>> that's why these lines were commented out and re-implemented using the
>> $self->debug method. So, turning on the debug option should give the
>> intended ST mapping for each sequence in stderr.
>>
>> weigang
>>
>> On Wed, Apr 1, 2009 at 10:28 PM, Mark A. Jensen <maj at fortinbras.us>
>> wrote:
>>
>>  Tristan--
>>> Strange: it looks like the prints to stderr have been commented out in
>>> the
>>> source (back in revision 10242; 1.6 is rev 15582). The
>>> two statements are easy to find in the SimpleAlign.pm uniq_seq() source;
>>> you can
>>> uncomment them to work around this.
>>> You are right, this is rather an unconventional way to specify an output
>>> option-- can Chris comment?
>>> Mark
>>> ----- Original Message ----- From: "Tristan Lefebure" <
>>> tristan.lefebure at gmail.com>
>>> To: "BioPerl List" <bioperl-l at lists.open-bio.org>
>>> Sent: Wednesday, April 01, 2009 11:11 PM
>>> Subject: [Bioperl-l] Bio::SimpleAlign, uniq_seq
>>>
>>>
>>>
>>>  Hi there,
>>>
>>>>
>>>> I'm trying to use the uniq_seq function from the Bio::SimpleAlign
>>>> module.
>>>> Here is the description:
>>>>
>>>> Title     : uniq_seq
>>>> Usage     : $aln->uniq_seq():  Remove identical sequences in
>>>>           in the alignment.  Ambiguous base ("N", "n") and
>>>>           leading and ending gaps ("-") are NOT counted as
>>>>           differences.
>>>> Function  : Make a new alignment of unique sequence types (STs)
>>>> Returns   : 1. a new Bio::SimpleAlign object (all sequences renamed as
>>>> "ST")
>>>>           2. ST of each sequence in STDERR
>>>> Argument  : None
>>>>
>>>> What I'm trying to obtain is the ST composition (i.e. what is supposed
>>>> to
>>>> go
>>>> to STDERR), but I see nothing...
>>>>
>>>> An example:
>>>>
>>>> --------test.fasta:
>>>>
>>>>  seq1
>>>>>
>>>>>  AAATTTC
>>>>
>>>>  seq2
>>>>>
>>>>>  CAATTTC
>>>>
>>>>  seq3
>>>>>
>>>>>  AAATTTC
>>>> -------
>>>>
>>>>
>>>> ----------test.pl:
>>>> #! /usr/bin/perl
>>>>
>>>> use strict;
>>>> use warnings;
>>>> use Bio::AlignIO;
>>>> use Bio::SimpleAlign;
>>>> use Getopt::Long;
>>>>
>>>> my $in  = Bio::AlignIO->new(-file   => 'test.fasta' ,
>>>>                           -format => 'fasta');
>>>>
>>>> my $out = Bio::AlignIO->new(-file   => ">test.out" ,
>>>>                           -format => 'fasta');
>>>>
>>>> while ( my $aln = $in->next_aln() ) {
>>>> my $red_aln = $aln->uniq_seq;
>>>>      $out->write_aln($red_aln);
>>>>  }
>>>> -------------
>>>>
>>>> If you run:
>>>>
>>>> ./test.pl &> log
>>>>
>>>> you will get nothing written into the log file... (but the test.out is
>>>> OK)
>>>>
>>>> Am I missing something?
>>>> By the way, wouldn't it be more convenient to have the ST composition
>>>> returned
>>>> in an array?
>>>>
>>>> Thanks,
>>>>
>>>> --Tristan
>>>> (BioPerl 1.6)
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Bioperl-l mailing list
>>>> Bioperl-l at lists.open-bio.org
>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>
>>>>
>>>>
>>>>  _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>>
>>
>>
>> --
>> Weigang Qiu
>> Department of Biological Sciences
>> Hunter College, City University of New York
>> 695 Park Avenue
>> New York, NY 10065
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>>
>>



More information about the Bioperl-l mailing list