[Bioperl-l] ace to msf format?

Robson Francisco de Souza rfsouza at citri.iq.usp.br
Tue Sep 2 10:59:30 EDT 2003


	Hi Wes and Jason,

	There are indeed some caveats when trying to use
Bio::Assembly::Contig objects as Bio::Align::AlignI objects. Not all
methods defined in this interface are implemented, and some do not
work (I checked yesterday using Wes's code). Most of the broken routines
can be fixed without much work, and some of the unimplemented ones are
easy to write, but I'm not sure we'll ever get full compliance with the
AlignI interface.
	I'd like to discuss that further, but for now let me just clarify
why I believe there is no way to print contigs using msf.pm: contigs
are not flush, i.e. most contigs are alignments of sequences of
different lengths and, even worse, sequences in a contig may be only
locally aligned to each other. This implies that some regions of a
sequence in the alignment might not be aligned to the contig consensus
but would get printed to MSF anyway. As far as I understand the AlignI
interface, such an alignment (a set of local alignments) is not supported.
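
	To make "not flush" concrete, here is a minimal sketch in plain Perl.
It makes no BioPerl calls; the read names, start offsets, sequences and the
is_flush helper are all made up for illustration and are not taken from
Wes's ACE file or from the BioPerl API.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical reads as they might sit in a contig: each one has its own
# start offset on the consensus and its own length (made-up data).
my @reads = (
    { id => 'read1', start => 1, seq => 'ACGTACGTAC' },
    { id => 'read2', start => 4, seq => 'TACGTACGTTGA' },
    { id => 'read3', start => 9, seq => 'ACGTT' },
);

# An alignment is "flush" when every row has the same length.
# Returns 1 if all rows share one length, 0 otherwise.
sub is_flush {
    my @rows = @_;
    my %lengths;
    $lengths{ length($_->{seq}) }++ for @rows;
    return scalar(keys %lengths) == 1 ? 1 : 0;
}

for my $read (@reads) {
    printf "%-6s start=%2d length=%2d\n",
        $read->{id}, $read->{start}, length($read->{seq});
}
print "flush: ", (is_flush(@reads) ? "yes" : "no"), "\n";
```

	A writer for a global multiple alignment format expects rows of equal
length, which is exactly what the reads above do not provide.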
	I've been considering removing AlignI from @ISA in
Bio::Assembly::Contig and defining a ContigI interface for it, as it seems
to me that the AlignI interface is not generic enough to describe contigs.
The main problem is that any sequence in a contig is only partially
aligned to a subsequence of the consensus, which makes some of the AlignI
methods meaningless (e.g. Bio::Align::AlignI::length, which is used by
msf.pm). I'd like to hear comments from others on this.
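
	As a rough illustration of why a single alignment length is hard to
define for a contig, the plain-Perl sketch below lists the consensus
columns each read covers. The consensus, the reads and the covered_range
helper are hypothetical; this is not how Bio::Assembly::Contig stores its
coordinates.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up consensus and reads; "start" is the 1-based consensus column
# where the read begins.
my $consensus = 'ACGTACGTACGTTGAC';
my @reads = (
    { id => 'read1', start => 1,  seq => 'ACGTACGT' },
    { id => 'read2', start => 5,  seq => 'ACGTACGTT' },
    { id => 'read3', start => 12, seq => 'TTGAC' },
);

# Return the (start, end) consensus columns covered by one read.
sub covered_range {
    my ($read) = @_;
    my $end = $read->{start} + length($read->{seq}) - 1;
    return ($read->{start}, $end);
}

for my $read (@reads) {
    my ($start, $end) = covered_range($read);
    printf "%-6s covers %2d..%2d (%d of %d consensus columns)\n",
        $read->{id}, $start, $end, $end - $start + 1, length($consensus);
}
```

	No read spans the whole consensus, so the row lengths and the
consensus length all differ, and a format writer that asks for one
alignment length has no obviously right answer to use.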
	So, do not try to use MSF, CLUSTALW or any other multiple global
alignment format for printing assemblies; you won't get what you want.

						Robson

On Mon, 1 Sep 2003, Wes Barris wrote:
> Thanks Jason, that makes sense.  Perhaps I'm missing something obvious
> but I am getting an error when treating each contig as a Bio::SimpleAlign
> object.  Here is my code:
> 
> #!/usr/local/bin/perl -w
> #
> use strict;
> use Bio::Assembly::IO;
> use Bio::AlignIO;
> #
> my $usage = "Usage: $0 <infile.ace>\n";
> my $infile = shift or die $usage;
> 
> my $io = new Bio::Assembly::IO(-file=>$infile, -format=>'ace');
> my $assembly = $io->next_assembly;
> 
> foreach my $contig ($assembly->all_contigs()) {
>     my $name = "cn".$contig->id;
>     print("$name\n");
>     my $outstream = new Bio::AlignIO(-format=>'msf', -file=>">$name");
>     $outstream->write_aln($contig);
>     undef $outstream;
>     }
> 
> And here is the runtime error:
> 
> cn1
> Use of uninitialized value in hash element at 
> /usr/lib/perl5/site_perl/5.6.1/Bio/Assembly/Contig.pm line 1305, <GEN0> line 33990.
> Use of uninitialized value in hash element at 
> /usr/lib/perl5/site_perl/5.6.1/Bio/Assembly/Contig.pm line 1305, <GEN0> line 33990.
> Can't call method "alphabet" on an undefined value at 
> /usr/lib/perl5/site_perl/5.6.1/Bio/AlignIO/msf.pm line 180, <GEN0> line 33990.
> 
> I am using bioperl-1.2.2.
> 
> 
> > 
> > Your code below is calling it in scalar context which will just have $aln
> > being set to the length of the returned array.
> > 
> > -jason
> > 
> > On Mon, 1 Sep 2003, Wes Barris wrote:
> > 
> > 
> >>Brian Osborne wrote:
> >>
> >>
> >>>Wes,
> >>>
> >>>I don't think this is possible in Bioperl. To put it more generally, AlignIO
> >>>can't accommodate Assembly objects currently. AlignIO is the module that
> >>>takes in a variety of alignment formats and interconverts them, analogous to
> >>>SeqIO. I'll be corrected if I'm wrong.
> >>>
> >>>Brian O.
> >>
> >>I am kind of new to this so I could be wrong, but isn't an Assembly a group
> >>of alignments?  So, from one assembly, a group of alignments could be
> >>generated?
> >>
> >>
> >>>-----Original Message-----
> >>>From: bioperl-l-bounces at portal.open-bio.org
> >>>[mailto:bioperl-l-bounces at portal.open-bio.org]On Behalf Of Wes Barris
> >>>Sent: Thursday, August 28, 2003 7:58 PM
> >>>To: Bioperl Mailing List
> >>>Subject: [Bioperl-l] ace to msf format?
> >>>
> >>>Can anyone give me a hint as to how I could use bioperl to read in
> >>>an ACE assembly and write out an MSF formatted alignment?  This shows
> >>>what I have figured out so far:
> >>>
> >>>#!/usr/local/bin/perl -w
> >>>#
> >>>use strict;
> >>>use Bio::Assembly::IO;
> >>>#
> >>>my $usage = "Usage: $0 <infile.ace>\n";
> >>>my $infile = shift or die $usage;
> >>>
> >>>my $io = new Bio::Assembly::IO(-file=>$infile, -format=>'ace');
> >>>my $assembly = $io->next_assembly;
> >>>
> >>>my $aln = $assembly->all_contigs();
> >>>
> >>>--
> >>>Wes Barris
> >>>E-Mail: Wes.Barris at csiro.au
> >>>
> >>>
> >>>_______________________________________________
> >>>Bioperl-l mailing list
> >>>Bioperl-l at portal.open-bio.org
> >>>http://portal.open-bio.org/mailman/listinfo/bioperl-l
> >>>
> >>
> >>
> >>
> > 
> > --
> > Jason Stajich
> > Duke University
> > jason at cgt.mc.duke.edu
> 
> 
> -- 
> Wes Barris
> E-Mail: Wes.Barris at csiro.au
> 
> 



