[Bioperl-l] app to translate EST contigs

Elia Stupka elia@fugu-sg.org
Fri, 10 Jan 2003 13:45:45 +0800 (SGT)


> I want to generate a 'likely' set of protein sequences from a large set
> of EST contigs to populate a mass spec database.  Sequences are fairly
> high GC-content (60-65% GC). Plant/algal species.

There is a very nice HMM-based program called ESTScan which detects coding
regions in ESTs, copes with the sequencing errors commonly found in them
(including frameshifts), and predicts translations. The program can be run
at:

http://www.ch.embnet.org/software/ESTScan.html

though you can also obtain the binary and run it locally of course.
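
To see why an error-tolerant predictor matters, here is a minimal sketch of
the naive alternative: translate all six frames and keep the longest
stop-free stretch. This is NOT what ESTScan does, only a baseline for
comparison. The function names are made up for illustration, and the
standard genetic code is assumed. A single insertion or deletion in the EST
shifts the frame and cuts the result short, which is exactly the case the
HMM is meant to handle.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate frame 0 of seq; codons with ambiguous bases become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def longest_orf(seq):
    """Naive baseline: longest stop-free peptide over all six frames."""
    seq = seq.upper()
    best = ""
    for strand in (seq, reverse_complement(seq)):
        for frame in range(3):
            # Split on stop codons and keep the longest open stretch.
            for segment in translate(strand[frame:]).split("*"):
                if len(segment) > len(best):
                    best = segment
    return best
```

Calling longest_orf() on each contig gives a quick first pass, but it has no
notion of coding potential and will happily pick a long open stretch on the
wrong strand, so treat its output as a sanity check rather than a prediction.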

Also, if you have a large set of ESTs, you would be best off first
clustering them with a tool such as StackPack, which is free for academic
sites:

http://juju.egenetics.com/stackpack/

StackPack will cluster the ESTs for you so that you can then run ESTScan
on the cluster consensus sequences rather than on the individual EST
sequences.

We've been using these tools for a while in our own projects, so feel free
to contact me for further help.

Elia

********************************
* http://www.fugu-sg.org/~elia *
* tel:    +65 6874 1467        *
* mobile: +65 9030 7613        *
* fax:    +65 6779 1117        *
********************************