[Bioperl-l] Help on a basic EST-genomic alignment script

Edward Chuong echuong at gmail.com
Tue Jun 29 18:13:52 EDT 2004


Hi,

I tried FASTX, which looks like it gives alignment results similar to
those from estwise. Can bioperl read the fastx output? I always
thought "fasta" output was just the simple >header followed by
sequence, but the output from fastx has far more information. However,
even fastx output is missing the original nucleotide sequence, which
wouldn't be a problem if it included the nucleotide coordinates,
but it doesn't as far as I can see.

Ultimately I need something that will align an EST (from a file) to a
mus CDS (retrieved from GenBank) and give me an aln object I can use to
find dn/ds in bioperl (which means the alignment must be in the correct
protein coding frame).
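
To illustrate why the frame matters, here is a minimal Python sketch (not
Bioperl code, and not a real dn/ds estimator). It only counts codon pairs
that differ and labels each as synonymous or nonsynonymous under the standard
genetic code. The function name and the toy sequences are made up for the
example.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_codon_differences(seq_a, seq_b):
    """Count differing codon pairs as (synonymous, nonsynonymous).

    Both sequences must be aligned, equal in length, and a multiple of 3.
    Codons containing gaps or ambiguous bases are skipped.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned and a multiple of 3 long")
    syn = nonsyn = 0
    for i in range(0, len(seq_a), 3):
        a = seq_a[i:i + 3].upper()
        b = seq_b[i:i + 3].upper()
        if a == b or a not in CODON_TABLE or b not in CODON_TABLE:
            continue
        if CODON_TABLE[a] == CODON_TABLE[b]:
            syn += 1
        else:
            nonsyn += 1
    return syn, nonsyn


# Toy sequences (hypothetical): same alignment, read in frame and shifted by 1.
est = "ATGAAACTG"
cds = "ATGAAGCTT"
in_frame = classify_codon_differences(est, cds)
shifted = classify_codon_differences(est[1:7], cds[1:7])
print(in_frame, shifted)
```

Read in frame, both changes are silent; read one base off, the same
nucleotide change looks like an amino acid replacement.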

So estwise and fastx/y both align the translated sequences
beautifully, but it seems like I can't parse their output in bioperl,
and they don't return the alignment as nucleotides anyway.
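
One way around the missing nucleotides, assuming the protein-level alignment
can be extracted somehow and the in-frame coding sequence for each protein is
at hand, is to thread the codons back onto the gapped protein strings. Below
is a minimal Python sketch of that idea (not Bioperl code). The function
names and the toy data are hypothetical, and it assumes no frameshifts inside
the aligned region.

```python
def thread_codons(aligned_protein, cds, gap="-"):
    """Map a gapped protein string back onto its ungapped, in-frame CDS.

    Each residue consumes one codon from `cds`; each gap becomes three
    gap characters, so the result stays in the protein's reading frame.
    """
    n_residues = sum(1 for aa in aligned_protein if aa != gap)
    if len(cds) < 3 * n_residues:
        raise ValueError("CDS is too short for the aligned protein")
    codons = []
    pos = 0
    for aa in aligned_protein:
        if aa == gap:
            codons.append(gap * 3)
        else:
            codons.append(cds[pos:pos + 3])
            pos += 3
    return "".join(codons)


def protein_to_codon_alignment(aligned_proteins, cds_by_name):
    """Apply thread_codons to every row of a protein alignment."""
    return {
        name: thread_codons(row, cds_by_name[name])
        for name, row in aligned_proteins.items()
    }


# Toy data (hypothetical): the EST lacks the third residue of the mus protein.
aligned = {"est": "MK-L", "mus": "MKTL"}
cds = {"est": "ATGAAACTG", "mus": "ATGAAGACCCTG"}
codon_aln = protein_to_codon_alignment(aligned, cds)
for name, row in codon_aln.items():
    print(name, row)
```

The rows come out equal in length and a multiple of three, which is the
shape a codon-based dn/ds calculation expects.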

The closest thing I can find is est2genome from EMBOSS. It aligns the
EST to the mus cDNA nucleotide sequences great--but that alignment
isn't in any particular coding frame, which would cause problems if I
stuck it in a dn/ds module. Also, est2genome doesn't appear to output
in standard EMBOSS format.
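
If the start position of the alignment within the CDS is known, a
nucleotide alignment like this could in principle be trimmed to whole codons
of the CDS. The Python sketch below (not Bioperl code) shows that under
stated assumptions: the CDS row is the reference, `ref_start` is the 0-based
CDS position of its first aligned base, and columns where the CDS has a gap
(insertions in the EST) are simply dropped. All names are hypothetical. Gaps
left in the EST row would still need handling before any dn/ds step.

```python
def trim_to_frame(ref_aln, est_aln, ref_start, gap="-"):
    """Trim a pairwise nucleotide alignment to whole reference codons.

    ref_aln, est_aln: equal-length gapped strings (reference CDS and EST).
    ref_start: 0-based CDS coordinate of the first reference base.
    Returns (ref, est) with reference-gap columns removed and partial
    codons at both ends cut off.
    """
    if len(ref_aln) != len(est_aln):
        raise ValueError("alignment rows must have equal length")
    # Keep only columns that correspond to a real CDS position.
    cols = [(r, e) for r, e in zip(ref_aln, est_aln) if r != gap]
    # Skip bases until the next codon boundary of the CDS.
    lead = (-ref_start) % 3
    cols = cols[lead:]
    # Drop a trailing partial codon.
    cols = cols[:len(cols) - len(cols) % 3]
    ref = "".join(r for r, _ in cols)
    est = "".join(e for _, e in cols)
    return ref, est


# Toy alignment (hypothetical) that starts at CDS position 1, mid-codon.
ref_row = "TGAAAC-CTGG"
est_row = "TGAAGCTCT-G"
trimmed = trim_to_frame(ref_row, est_row, ref_start=1)
print(trimmed)
```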

Anyone with experience doing something similar with dn/ds care to
share how you got a proper alignment object?

Thanks!

-Ed 

On Mon, 28 Jun 2004 19:16:11 -0400 (EDT), Jason Stajich
<jason at cgt.duhs.duke.edu> wrote:
> 
> Hmm - I guess estwise doesn't provide a machine parseable output as I
> would have thought.  What does one do, Ewan?  No one has written a
> wise prettyblock alignment parser yet sadly.
> 
> -jason
> 
> On Mon, 28 Jun 2004, Edward Chuong wrote:
> 
> > > Just use estwise as a standalone program not from within perl.
> > >  % estwise protein est
> > >
> > > estwise is pretty slow so I wouldn't embark on this route unless you know
> > > what you are doing.  Try a BLAST or FASTA route first to get likely
> > > homologs.
> > >
> >
> > Hey,
> >
> > I'm using blast already to find the likely homologs, which is working
> > fine, and I get the homolog CDS/protein sequence by querying genbank
> > with the accessionID from blast.
> >
> > ESTwise seems fast enough for my very small ESTs. I'm not sure what
> > fastx is used for. I need a library..?
> >
> > How can I automate running estwise? Should I just use some sort of shell script?
> >
> > Is there any way to get the alignment I get from estwise into one of
> > the dn/ds modules in bioperl?
> >
> > Thanks so much for helping!
> >
> > -Ed
> >
> 
> --
> Jason Stajich
> Duke University
> jason at cgt.mc.duke.edu
> 


-- 
Edward Chuong
http://iacs5.ucsd.edu/~echuong

