[BioPython] NCBI: from protein to CDS

Iddo Friedberg idoerg at burnham.org
Thu Apr 10 19:24:29 EDT 2003


Sorry, slight mistake in my question.

The protein sequence is retrieved using gi-10956263.

The CDS link then holds the following:

val=10956247  (which is the nucleotide GI)
itemID=98 (which tells us which bit of the nucleotide sequence actually 
codes for this protein).
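
For what it is worth, here is a minimal sketch of reading those two fields
out of the CDS link with Python's standard URL parser. It is only the
"read the URL" workaround, not an official NCBI mechanism; the function name
parse_cds_link is made up for illustration, and the URL is the one from my
original message quoted below.

```python
from urllib.parse import urlparse, parse_qs


def parse_cds_link(url):
    """Return (nucleotide_gi, item_id) as strings from an Entrez viewer CDS link.

    Hypothetical helper: it assumes the link carries 'val' and 'itemID'
    query fields, as in the example URL from this thread.
    """
    query = parse_qs(urlparse(url).query)
    # parse_qs maps each field name to a list of values; take the first one.
    return query["val"][0], query["itemID"][0]


url = ("http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi"
       "?val=10956247&itemID=98&view=gbwithparts")
nucleotide_gi, item_id = parse_cds_link(url)
print(nucleotide_gi, item_id)
```

The values come back as strings, which is usually what you want when
pasting them into another URL. A link missing either field raises KeyError.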

Best,

Iddo

Iddo Friedberg wrote:
> Hi,
> 
> Can anyone suggest a painless way to retrieve a coding sequence using a 
> protein gi number via NCBI? Manually, that would mean:
> 
> 1) Go to the protein page using the protein gi.
> 2) Click on the CDS link.
> 3) Get the DNA sequence coding for it.
> 
> Here is why this is a bummer:
> 
> Using the gi-10956247 a protein record is retrieved from NCBI. The URL 
> for the CDS looks like:
> 
> "http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?val=10956247&itemID=98&view=gbwithparts" 
> 
> 
> the "val" field holds the same gi number, but the itemID field varies 
> (that depends on the coding location in the full genome record, in this 
> case). So I cannot use the protein gi to go and retrieve its coding 
> sequence.
> 
> I *can* write something to read the URL itemID field value, I just 
> thought there might be a more elegant way. Maybe even using "legitimate" 
> NCBI mechanisms...
> 
> Thanks,
> 
> Iddo
> 
> 
> 

-- 
Iddo Friedberg, Ph.D.
The Burnham Institute
10901 N. Torrey Pines Rd.
La Jolla, CA 92037
USA
Tel: +1 (858) 646 3100 x3516
Fax: +1 (858) 646 3171
http://bioinformatics.ljcrf.edu/~iddo


