[Bioperl-l] How to get the corresponding DNA sequence of a protein gi#??

Jason Stajich jason at cgt.mc.duke.edu
Mon Apr 21 14:46:26 EDT 2003


You might read my post here:
http://bioperl.org/pipermail/bioperl-l/2003-April/011918.html

The code there works only when the protein record is annotated with a
back link to its nucleotide record, i.e. a coded_by qualifier on the
protein's CDS feature.
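
To make the back link concrete: the coded_by value is a location string
naming the nucleotide accession and the coordinates of the coding region.
Below is a minimal sketch of pulling that apart, in plain Python with no
BioPerl or network access. The function name parse_coded_by and the
accession AB000000.1 are made up for illustration, and the sketch assumes
one strand for the whole location (no complement() nested inside join()).

```python
import re

# One segment of a coded_by value, e.g. "AB000000.1:100..400".
# "<" and ">" mark partial ends; they are skipped, not recorded.
SEGMENT = re.compile(r"([A-Za-z0-9_]+(?:\.\d+)?):<?(\d+)\.\.>?(\d+)")


def parse_coded_by(value):
    """Return (strand, [(accession, start, end), ...]) for a coded_by string.

    Assumption: a leading "complement(" applies to the whole location.
    """
    strand = -1 if value.strip().startswith("complement(") else 1
    segments = [(acc, int(start), int(end))
                for acc, start, end in SEGMENT.findall(value)]
    if not segments:
        raise ValueError("no nucleotide location found in %r" % value)
    return strand, segments


if __name__ == "__main__":
    # Hypothetical accession, for illustration only.
    print(parse_coded_by("complement(AB000000.1:100..400)"))
```

With the accession and coordinates in hand, the nucleotide record can be
fetched and the subsequence cut out (reverse-complemented when the strand
is -1).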

Of course, if the back links are missing this is more of an NCBI question:
we provide the tools, not the data, so when a link doesn't exist you have
to get creative.

Things like:

Parse in the whole genome sequence and grab the annotated proteins from it
and see which ones match the ones you are interested in.
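
The matching step in that approach might look like the sketch below. It
assumes you have already pulled (location, translation) pairs out of the
genome record's CDS features by some other means, and it only finds exact
sequence matches; find_coding_regions and the toy sequences are
hypothetical names and data for illustration.

```python
def normalize(protein):
    # Compare case-insensitively and ignore a trailing stop symbol.
    return protein.strip().upper().rstrip("*")


def find_coding_regions(annotated_cds, wanted):
    """Map each wanted protein id to the genome locations that encode it.

    annotated_cds: iterable of (location_string, translation) pairs taken
                   from the genome annotation.
    wanted:        dict of protein id -> protein sequence.
    Only exact sequence matches are reported.
    """
    by_translation = {}
    for location, translation in annotated_cds:
        by_translation.setdefault(normalize(translation), []).append(location)
    return {pid: by_translation.get(normalize(seq), [])
            for pid, seq in wanted.items()}


if __name__ == "__main__":
    # Toy data, not real annotation.
    cds = [("100..400", "MKT*"), ("complement(500..800)", "MAAV")]
    print(find_coding_regions(cds, {"p1": "mkt", "p2": "MXX"}))
```

Proteins that come back with an empty list have no identical annotated
translation in that genome, and need a similarity search instead.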

a) get the genome sequence for the organism your peptide gi # comes from
b) tblastn the peptide against it
c) clean up the alignment with genewise if these genes have introns.
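
For step b), the DNA coordinates can be read straight off a tabular
tblastn report. The sketch below assumes the standard 12-column tabular
layout (query id, subject id, % identity, alignment length, mismatches,
gap openings, q. start, q. end, s. start, s. end, e-value, bit score), in
which a hit on the reverse strand has s. start greater than s. end. The
function name hit_to_dna_interval and the sample line are made up for
illustration.

```python
def hit_to_dna_interval(line):
    """Turn one tab-separated tblastn hit line into a DNA interval.

    Assumes the 12-column tabular layout; columns 9 and 10 are the
    subject (DNA) start and end, 1-based.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 12:
        raise ValueError("expected 12 tab-separated columns, got %d" % len(fields))
    sstart, send = int(fields[8]), int(fields[9])
    return {
        "query": fields[0],
        "subject": fields[1],
        "start": min(sstart, send),
        "end": max(sstart, send),
        # Subject start > end means the hit lies on the reverse strand.
        "strand": 1 if sstart <= send else -1,
    }


if __name__ == "__main__":
    # Invented hit line, not real BLAST output.
    sample = "pep1\tgenome1\t98.5\t200\t3\t0\t1\t200\t900\t301\t1e-100\t400"
    print(hit_to_dna_interval(sample))
```

A single hit only covers one aligned block, so for genes with introns the
intervals still need stitching together, which is where step c) comes in.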

-jason

On Mon, 21 Apr 2003, Sally Li wrote:

> Hi,
>
> The following is a blast result (partial) from
> gi|29836496 (coronavirus genome's putative gene).
>
>
> gi|547041
> gi|58980
> gi|7769344
> gi|74849
>
> These are Genbank numbers. They are protein ids. We
> can easily obtain sequences based on these gi# using
> bioperl modules. But how can I get the corresponding
> DNA sequences?
>
> Thank you for your help.
>
> Sally

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu

