[Bioperl-l] fetch gene sequence with EUtilities.pm
Chris Fields
cjfields at illinois.edu
Wed Jun 10 13:20:43 UTC 2009
EntrezGene doesn't contain the sequence information; I believe it just
links to the sequence in a specified nucleotide record with given
coordinates. You can get to it, but it takes a little trickery: in
essence you need to use the UID to get the gene summary information,
extract the accession and coordinates from it, then grab the sequence
record using the start, end, and strand (efetch's seq_start, seq_stop,
and strand parameters).
A dump of the esummary info for UID 18131, for instance (using
$eutil->print_all), gives this info (abbreviated somewhat):
UID :18131
Name :Notch3
Description :Notch gene homolog 3 (Drosophila)
Orgname :Mus musculus
...
GenomicInfo
  GenomicInfoType
    ChrLoc :17
    ChrAccVer :NC_000083.5
    ChrStart :32303796
    ChrStop :32257837
GeneWeight :23049
The genomic info section gives the accession.version, start, end, and
(implicitly) the strand: ChrStop is less than ChrStart here, which
indicates the minus strand. I have added an example to the cookbook:
http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook#How_do_I_retrieve_the_DNA_sequence_using_EntrezGene_IDs.3F
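
To make the coordinate handling concrete, here is a rough sketch of the
last step in Python (not BioPerl, and not the cookbook code). It only
builds the efetch query from the GenomicInfo values shown above and does
not contact NCBI. The function name efetch_params and the coord_offset
argument are made up for illustration; seq_start, seq_stop, and strand
are the efetch parameter names (strand 1 = plus, 2 = minus). Whether the
esummary coordinates need shifting before use is not stated in this
message, so the offset is left as an option that defaults to no shift.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities efetch endpoint.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_params(genomic_info, coord_offset=0):
    """Turn esummary GenomicInfo values into efetch query parameters.

    genomic_info: dict with ChrAccVer, ChrStart and ChrStop, as in the dump.
    coord_offset: hypothetical knob, added to both coordinates; it defaults
                  to 0 because the coordinate base is an open assumption.
    """
    start = int(genomic_info["ChrStart"])
    stop = int(genomic_info["ChrStop"])

    # The strand is implicit: start > stop means the minus strand (2).
    strand = 2 if start > stop else 1

    # Request the region low-to-high; the strand parameter carries the
    # orientation.
    low, high = sorted((start, stop))

    return {
        "db": "nuccore",
        "id": genomic_info["ChrAccVer"],
        "seq_start": low + coord_offset,
        "seq_stop": high + coord_offset,
        "strand": strand,
        "rettype": "fasta",
        "retmode": "text",
    }


# Values copied from the esummary dump for UID 18131 (Notch3).
notch3 = {
    "ChrAccVer": "NC_000083.5",
    "ChrStart": "32303796",
    "ChrStop": "32257837",
}

params = efetch_params(notch3)
url = EFETCH_URL + "?" + urlencode(params)
print(url)
```

Fetching the printed URL (or passing the same values to an efetch call
in Bio::DB::EUtilities) should return only the gene region rather than
the whole chromosome record.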
chris
On Jun 9, 2009, at 6:20 AM, Adam Witney wrote:
> Hi,
>
> I have been experimenting with the Bio::DB::EUtilities module, with
> help from the Cookbook. But I can't seem to figure out how to get
> the DNA sequence of a gene; all the examples seem to be fetching
> protein sequence.
>
> How would I go about fetching a sequence using an Entrez GeneID?
>
> thanks for any help
>
> adam