[Bioperl-l] Is it possible to do contig alignments?

De-Jian,ZHAO zhaodj at ioz.ac.cn
Fri Aug 24 09:34:07 UTC 2007


On Fri, Aug 24, 2007 12:43, Florent Angly wrote:
> Dear list members,
>
> I would like to "produce" an alignment of a contig, or more exactly
> visualize it in such a fashion, based on the aligned sequences
> provided to me by a sequence assembler:
>
> Consensus: ACGTACGTTG
> Sequence1: ACG-AC
> Sequence2:  CGTACGT
> Sequence3:     AC-TTG
>
> It sounds like a very trivial task but after searching for a long
> time, it seems impossible using the methods BioPerl provides.
>
> Using the Bio::Align classes, it seems like the only way is if the
> sequences have the same aligned length, i.e. like this:
>
> Consensus: ACGTACGTTG
> Sequence1: ACG-AC----
> Sequence2: -CGTACGT--
> Sequence3: ----AC-TTG
>
> It's not very satisfactory if I have to pad the sequences with gaps
> manually. In the context of a phylogenetic alignment, it might make
> sense, but not for contigs.

How do you pad the sequences with gaps manually? Do you just replace
the hyphens with blanks? If so, you can write a Perl script to
automate this process.
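
A minimal sketch of that kind of automation, shown in Python for
brevity (the same string arithmetic ports directly to Perl). It
assumes the assembler reports, for each read, a 0-based start column
relative to the gapped consensus. The function name pad_reads, the
(name, start, seq) tuples, and the start offsets 0, 1 and 4 (read off
the example in the question) are illustrative assumptions, not part of
BioPerl or of any assembler's output format.

```python
def pad_reads(consensus, reads, fill="-"):
    """Pad each aligned read to the full consensus width.

    reads is a list of (name, start, aligned_seq) tuples, where start
    is the assumed 0-based column of the read's first character
    relative to the gapped consensus.
    """
    width = len(consensus)
    padded = []
    for name, start, seq in reads:
        end = start + len(seq)
        if start < 0 or end > width:
            raise ValueError(f"{name} does not fit inside the consensus")
        # Leading fill up to the start column, trailing fill to the end.
        padded.append((name, fill * start + seq + fill * (width - end)))
    return padded


consensus = "ACGTACGTTG"
reads = [
    ("Sequence1", 0, "ACG-AC"),
    ("Sequence2", 1, "CGTACGT"),
    ("Sequence3", 4, "AC-TTG"),
]

padded = pad_reads(consensus, reads)
print(f"Consensus: {consensus}")
for name, seq in padded:
    print(f"{name}: {seq}")
# Consensus: ACGTACGTTG
# Sequence1: ACG-AC----
# Sequence2: -CGTACGT--
# Sequence3: ----AC-TTG
```

Calling pad_reads(consensus, reads, fill=" ") instead pads with
blanks, which gives the ragged contig-style layout while leaving the
internal gap characters of each read untouched.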

> For assemblies, whole sequences are mapped onto contigs.
> Bio::LocatableSeq does not help here because it defines locations
> _within_ the sequence (the name LocatableSeq was pretty misleading
> to me).
>
> I think it's very strange that contigs have the coordinates of the
> aligned sequences composing them but there is no straightforward way
> to exploit this information.
>
> So what's the bottom line? Am I missing something obvious, an
> out-of-the-box solution? Is it a "missing feature" of BioPerl that is
> planned to be implemented in the future or that should be requested?
> Should I pad my sequences with dashes or spaces after assembly? Or is
> it expected that my aligned reads coming from my assembly be padded
> with lots of gaps at their beginning and end? What's the BioPerl
> philosophy here?
>
> Thanks for giving me pointers,
>
> Florent
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


-- 
De-Jian Zhao
Institute of Zoology,Chinese Academy of Sciences
+86-10-64807217
zhaodj at ioz.ac.cn







