[Bioperl-l] Bio::SimpleAlign, uniq_seq

Mark A. Jensen maj at fortinbras.us
Thu Apr 2 13:46:36 UTC 2009


Hi Tristan--
I think this is a good thought,
Can you register this as an enhancement at http://bugzilla.bioperl.org ?
Please go ahead and attach the diff as a patch to the 'bug' report--
thanks for *your* input-
cheers, Mark
----- Original Message ----- 
From: "Tristan Lefebure" <tristan.lefebure at gmail.com>
To: "Mark A. Jensen" <maj at fortinbras.us>
Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>; "Weigang Qiu" 
<weigangq at gmail.com>
Sent: Thursday, April 02, 2009 9:30 AM
Subject: Re: [Bioperl-l] Bio::SimpleAlign, uniq_seq


> Thanks you both,
>
> To internally store the ST composition, so that I can reuse it in the same
> script, I made the following modifications to SimpleAlign.pm:
>
> diff /usr/local/share/perl/5.10.0/Bio/SimpleAlign.pm
> /usr/local/share/perl/5.10.0/Bio/SimpleAlignMod.pm
> 590a591,592
>>     #modified to also returned an array of the ST composition
>>     my %st;
> 651a654
>>         push @{$st{$order{$str}}}, $_->id();
> 655c658
> <     return $aln;
> ---
>>     return ($aln, %st);
>
> This is probably not really BioPerl compliant. Being an OBO ignorant, I
> wonder if we could add this information somewhere either once in the $aln
> object, or by little pieces in each  Bio::LocatableSeq objects?
>
> Thks,
>
> --Tristan
>
> On Thu, Apr 2, 2009 at 12:15 AM, Mark A. Jensen <maj at fortinbras.us> wrote:
>
>> Thanks Weigang-- I didn't look carefully enough--
>> I'll make a change to the POD.
>> so Tristan, in your code below, add
>>
>> $aln->verbose(1);
>>
>> before you invoke uniq_seq(). The ST's should
>> then be sent to stderr (as "warns").
>>
>> MAJ
>> ----- Original Message ----- From: "Weigang Qiu" <weigangq at gmail.com>
>> To: "Mark A. Jensen" <maj at fortinbras.us>
>> Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>; <
>> tristan.lefebure at gmail.com>
>> Sent: Wednesday, April 01, 2009 11:57 PM
>> Subject: Re: [Bioperl-l] Bio::SimpleAlign, uniq_seq
>>
>>
>>
>>  Mark and Tristan,
>>>
>>> I am the original instigator of the uniq_seq method. The STDERR
>>> implementation was used so that STDOUT could be piped. But it did not
>>> conform to bioperl convention of using the $self->debug() method. I think
>>> that's why these lines were commented out and re-implemented using the
>>> $self->debug method. So, turning on the debug option should give the
>>> intended ST mapping for each sequence in stderr.
>>>
>>> weigang
>>>
>>> On Wed, Apr 1, 2009 at 10:28 PM, Mark A. Jensen <maj at fortinbras.us>
>>> wrote:
>>>
>>>  Tristan--
>>>> Strange: it looks like the prints to stderr have been commented out in
>>>> the
>>>> source (back in revision 10242; 1.6 is rev 15582). The
>>>> two statements are easy to find in the SimpleAlign.pm uniq_seq() source;
>>>> you can
>>>> uncomment them to work around this.
>>>> You are right, this is rather an unconventional way to specify an output
>>>> option-- can Chris comment?
>>>> Mark
>>>> ----- Original Message ----- From: "Tristan Lefebure" <
>>>> tristan.lefebure at gmail.com>
>>>> To: "BioPerl List" <bioperl-l at lists.open-bio.org>
>>>> Sent: Wednesday, April 01, 2009 11:11 PM
>>>> Subject: [Bioperl-l] Bio::SimpleAlign, uniq_seq
>>>>
>>>>
>>>>
>>>>  Hi there,
>>>>
>>>>>
>>>>> I'm trying to use the uniq_seq function from the Bio::SimpleAlign
>>>>> module.
>>>>> Here is the description:
>>>>>
>>>>> Title     : uniq_seq
>>>>> Usage     : $aln->uniq_seq():  Remove identical sequences in
>>>>>           in the alignment.  Ambiguous base ("N", "n") and
>>>>>           leading and ending gaps ("-") are NOT counted as
>>>>>           differences.
>>>>> Function  : Make a new alignment of unique sequence types (STs)
>>>>> Returns   : 1. a new Bio::SimpleAlign object (all sequences renamed as
>>>>> "ST")
>>>>>           2. ST of each sequence in STDERR
>>>>> Argument  : None
>>>>>
>>>>> What I'm trying to obtain is the ST composition (i.e. what is supposed
>>>>> to
>>>>> go
>>>>> to STDERR), but I see nothing...
>>>>>
>>>>> An example:
>>>>>
>>>>> --------test.fasta:
>>>>>
>>>>>  seq1
>>>>>>
>>>>>>  AAATTTC
>>>>>
>>>>>  seq2
>>>>>>
>>>>>>  CAATTTC
>>>>>
>>>>>  seq3
>>>>>>
>>>>>>  AAATTTC
>>>>> -------
>>>>>
>>>>>
>>>>> ----------test.pl:
>>>>> #! /usr/bin/perl
>>>>>
>>>>> use strict;
>>>>> use warnings;
>>>>> use Bio::AlignIO;
>>>>> use Bio::SimpleAlign;
>>>>> use Getopt::Long;
>>>>>
>>>>> my $in  = Bio::AlignIO->new(-file   => 'test.fasta' ,
>>>>>                           -format => 'fasta');
>>>>>
>>>>> my $out = Bio::AlignIO->new(-file   => ">test.out" ,
>>>>>                           -format => 'fasta');
>>>>>
>>>>> while ( my $aln = $in->next_aln() ) {
>>>>> my $red_aln = $aln->uniq_seq;
>>>>>      $out->write_aln($red_aln);
>>>>>  }
>>>>> -------------
>>>>>
>>>>> If you run:
>>>>>
>>>>> ./test.pl &> log
>>>>>
>>>>> you will get nothing written into the log file... (but the test.out is
>>>>> OK)
>>>>>
>>>>> Am I missing something?
>>>>> By the way, wouldn't it be more convenient to have the ST composition
>>>>> returned
>>>>> in an array?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> --Tristan
>>>>> (BioPerl 1.6)
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Bioperl-l mailing list
>>>>> Bioperl-l at lists.open-bio.org
>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>>
>>>>>
>>>>>
>>>>>  _______________________________________________
>>>> Bioperl-l mailing list
>>>> Bioperl-l at lists.open-bio.org
>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>
>>>>
>>>
>>>
>>> --
>>> Weigang Qiu
>>> Department of Biological Sciences
>>> Hunter College, City University of New York
>>> 695 Park Avenue
>>> New York, NY 10065
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>>
>>>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> 




More information about the Bioperl-l mailing list