[Bioperl-l] ace to msf format?

Wes Barris wes.barris at csiro.au
Tue Sep 2 19:09:40 EDT 2003


Jason Stajich wrote:

> Perhaps it makes sense to instead derive a flushed alignment from a Contig
> - i.e. a get_aln() method - which will make a new SimpleAlign object,
> padding the individual sequences with the necessary leading and trailing
> gap characters?

Yes, that is exactly what I want.

> Wes - if this is something you need, perhaps you could look into trying to
> write a method of this sort?

Yes, I will try.  However, I will first have to better understand how all
the pieces of Bioperl "fit together" before I will know where to begin.
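
As a starting point, the padding step described above can be sketched
without touching Bioperl at all. This is only a minimal illustration,
assuming each read comes with a 1-based start column in contig coordinates
and an already-gapped sequence string. The flush_reads helper and the
sample reads are hypothetical; they are not part of Bioperl or of any
existing get_aln() implementation.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not a Bioperl method): pad each read with leading
# and trailing gap characters so that every row has the same length.
# Each read is a hash ref with an id, a 1-based start column in contig
# coordinates, and its (already gapped) sequence string.
sub flush_reads {
    my ($gap, @reads) = @_;

    # The flush alignment must be as wide as the right-most read end.
    my $width = 0;
    for my $r (@reads) {
        my $end = $r->{start} - 1 + length($r->{seq});
        $width = $end if $end > $width;
    }

    my @rows;
    for my $r (@reads) {
        my $lead  = $gap x ($r->{start} - 1);
        my $trail = $gap x ($width - ($r->{start} - 1) - length($r->{seq}));
        push @rows, { id => $r->{id}, seq => $lead . $r->{seq} . $trail };
    }
    return @rows;
}

# Made-up sample reads, for illustration only.
my @reads = (
    { id => 'read1', start => 1, seq => 'ACGTAC' },
    { id => 'read2', start => 4, seq => 'TACGGT' },
    { id => 'read3', start => 7, seq => 'GGTA'   },
);

printf "%-6s %s\n", $_->{id}, $_->{seq} for flush_reads('-', @reads);
```

Every row comes out the same length, which is what a flush alignment
requires. It does nothing about the problem Robson raises below: regions
of a read that are not actually aligned to the consensus would still be
printed as if they were.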

> 
> -jason
> 
> On Tue, 2 Sep 2003, Robson Francisco de Souza wrote:
> 
> 
>>	Hi Wes and Jason,
>>
>>	There are indeed some caveats when trying to use
>>Bio::Assembly::Contig objects as Bio::Align::AlignI objects. Not all
>>methods defined in this interface are implemented and some are not
>>working (checked it yesterday using Wes's code). Most routines that are
>>not working can be corrected without much work and some not yet
>>implemented are easy to write but I'm not sure we'll ever get full
>>compliance to the AlignI interface.
>>	I'd like to discuss that further but for now let me just clarify
>>why I believe there will be no way to print contigs using msf.pm: contigs
>>are not flush, i.e. most contigs will be alignments of sequences of
>>different lengths and, even worse, sequences in a contig may be only
>>locally aligned to each other, which implies that some regions of any
>>sequence in the alignment might not be aligned to the contig consensus but
>>will get printed to MSF anyway. As far as I understand the AlignI interface,
>>such an alignment (a set of local alignments) is not supported.
>>	I've been considering removing AlignI from @ISA in
>>Bio::Assembly::Contig and defining a ContigI interface for it as it seems
>>to me that the AlignI interface is not generic enough to describe contigs.
>>The main problem is that any sequence in a contig is only partially
>>aligned to a subsequence of the consensus, which makes some of the methods
>>from AlignI meaningless (e.g. Bio::Align::AlignI::length, which is used by
>>msf.pm). I'd like to hear comments from others on this.
>>	So, do not try to use MSF, CLUSTALW or any other multiple global
>>alignment format for printing assemblies; you won't get what you want.
>>
>>						Robson
>>
>>On Mon, 1 Sep 2003, Wes Barris wrote:
>>
>>>Thanks Jason, that makes sense.  Perhaps I'm missing something obvious
>>>but I am getting an error when treating each contig as a Bio::SimpleAlign
>>>object.  Here is my code:
>>>
>>>#!/usr/local/bin/perl -w
>>>#
>>>use strict;
>>>use Bio::Assembly::IO;
>>>use Bio::AlignIO;
>>>#
>>>my $usage = "Usage: $0 <infile.ace>\n";
>>>my $infile = shift or die $usage;
>>>
>>>my $io = new Bio::Assembly::IO(-file=>$infile, -format=>'ace');
>>>my $assembly = $io->next_assembly;
>>>
>>>foreach my $contig ($assembly->all_contigs()) {
>>>    my $name = "cn".$contig->id;
>>>    print("$name\n");
>>>    my $outstream = new Bio::AlignIO(-format=>'msf', -file=>">$name");
>>>    $outstream->write_aln($contig);
>>>    undef $outstream;
>>>    }
>>>
>>>And here is the runtime error:
>>>
>>>cn1
>>>Use of uninitialized value in hash element at
>>>/usr/lib/perl5/site_perl/5.6.1/Bio/Assembly/Contig.pm line 1305, <GEN0> line 33990.
>>>Use of uninitialized value in hash element at
>>>/usr/lib/perl5/site_perl/5.6.1/Bio/Assembly/Contig.pm line 1305, <GEN0> line 33990.
>>>Can't call method "alphabet" on an undefined value at
>>>/usr/lib/perl5/site_perl/5.6.1/Bio/AlignIO/msf.pm line 180, <GEN0> line 33990.
>>>
>>>I am using bioperl-1.2.2.
>>>
>>>
>>>
>>>>Your code below is calling it in scalar context, which will just set $aln
>>>>to the number of elements in the returned array.
>>>>
>>>>-jason
>>>>
>>>>On Mon, 1 Sep 2003, Wes Barris wrote:
>>>>
>>>>
>>>>
>>>>>Brian Osborne wrote:
>>>>>
>>>>>
>>>>>
>>>>>>Wes,
>>>>>>
>>>>>>I don't think this is possible in Bioperl. To put it more generally, AlignIO
>>>>>>can't accommodate Assembly objects currently. AlignIO is the module that
>>>>>>takes in a variety of alignment formats and interconverts them, analogous to
>>>>>>SeqIO. I'll be corrected if I'm wrong.
>>>>>>
>>>>>>Brian O.
>>>>>
>>>>>I am kind of new to this so I could be wrong, but isn't an Assembly a group
>>>>>of alignments?  So, from one assembly, a group of alignments could be
>>>>>generated?
>>>>>
>>>>>
>>>>>
>>>>>>-----Original Message-----
>>>>>>From: bioperl-l-bounces at portal.open-bio.org
>>>>>>[mailto:bioperl-l-bounces at portal.open-bio.org]On Behalf Of Wes Barris
>>>>>>Sent: Thursday, August 28, 2003 7:58 PM
>>>>>>To: Bioperl Mailing List
>>>>>>Subject: [Bioperl-l] ace to msf format?
>>>>>>
>>>>>>Can anyone give me a hint as to how I could use bioperl to read in
>>>>>>an ACE assembly and write out an MSF formatted alignment?  This shows
>>>>>>what I have figured out so far:
>>>>>>
>>>>>>#!/usr/local/bin/perl -w
>>>>>>#
>>>>>>use strict;
>>>>>>use Bio::Assembly::IO;
>>>>>>#
>>>>>>my $usage = "Usage: $0 <infile.ace>\n";
>>>>>>my $infile = shift or die $usage;
>>>>>>
>>>>>>my $io = new Bio::Assembly::IO(-file=>$infile, -format=>'ace');
>>>>>>my $assembly = $io->next_assembly;
>>>>>>
>>>>>>my $aln = $assembly->all_contigs();
>>>>>>
>>>>>>--
>>>>>>Wes Barris
>>>>>>E-Mail: Wes.Barris at csiro.au
>>>>>>
>>>>>>
>>>>>>_______________________________________________
>>>>>>Bioperl-l mailing list
>>>>>>Bioperl-l at portal.open-bio.org
>>>>>>http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>--
>>>>Jason Stajich
>>>>Duke University
>>>>jason at cgt.mc.duke.edu
>>>
>>>
>>>--
>>>Wes Barris
>>>E-Mail: Wes.Barris at csiro.au
>>>
>>
>>
> 
> 
> --
> Jason Stajich
> Duke University
> jason at cgt.mc.duke.edu


-- 
Wes Barris
E-Mail: Wes.Barris at csiro.au


