[Bioperl-l] Refseq and Splice Variants

Stefan Kirov skirov at utk.edu
Mon Mar 14 15:34:32 EST 2005


What is your starting ID: RefSeq or gene? Do you want all of them or just 
some? In any case, the LL_tmpl (LocusLink) file has this data and there is a 
parser for it (hopefully an Entrez Gene parser will be there soon). You can 
also get the gene2refseq file from 
ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/. It is tab-delimited and pretty 
easy to use. You need columns 1 and 4, I think.
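
As a rough illustration of the gene2refseq approach, here is a minimal Python sketch (not Bioperl) that groups RNA RefSeq accessions by gene ID. The column positions are assumptions: the sketch takes GeneID from the second tab-separated field and the RNA accession from the fourth (zero-based indexes 1 and 3), so check them against the README in the same FTP directory. The function name and the sample rows are made up for illustration.

```python
from collections import defaultdict


def group_refseq_by_gene(lines, gene_col=1, rna_col=3):
    """Return {gene_id: sorted list of RNA accessions} from tab-delimited lines.

    gene_col and rna_col are zero-based column indexes. The defaults are an
    assumption about the gene2refseq layout, not something this thread confirms.
    """
    groups = defaultdict(set)
    for line in lines:
        line = line.rstrip("\r\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and the header line
        fields = line.split("\t")
        if len(fields) <= max(gene_col, rna_col):
            continue  # skip short or malformed rows
        gene_id = fields[gene_col]
        rna_acc = fields[rna_col]
        if rna_acc == "-":
            continue  # "-" marks a row with no RNA accession
        groups[gene_id].add(rna_acc)
    return {gene: sorted(accs) for gene, accs in groups.items()}


if __name__ == "__main__":
    # Made-up rows for illustration only; these are not real records.
    sample = [
        "#tax_id\tGeneID\tstatus\tRNA_nucleotide_accession.version",
        "9606\t100\tREVIEWED\tNM_000001.1",
        "9606\t100\tREVIEWED\tNM_000002.1",
        "9606\t200\tVALIDATED\tNM_000003.2",
        "9606\t300\tNA\t-",
    ]
    for gene, accs in group_refseq_by_gene(sample).items():
        if len(accs) > 1:  # keep only genes with more than one transcript
            print(gene, ", ".join(accs))
```

To use it on the real file, pass an open file handle instead of the sample list (if the download is gzip-compressed, `gzip.open(path, "rt")` gives a text handle). Any gene ID that maps to more than one accession is a candidate set of splice variants. FASTA headers would then need to be matched against these accessions, with or without the version suffix depending on how your headers are written.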
Stefan

Peter J Stogios wrote:

> Hi,
>
> I am wondering if there is a way of easily identifying Refseq 
> sequences that are splice variants of the same gene.  If a gene has 
> multiple splice products that are supported by experimental evidence, 
> they get their own Refseq identifier, but there is no explicit 
> reference to the underlying gene they came from (outside of the 
> identifier line).
>
> What I am trying to do is group sets of Refseq sequences in FASTA 
> format into sets of splice variants of the same gene.  Does anyone 
> know of a way, using Bioperl, that I can accomplish this?
>
> Thanks,
>
> ~
> Peter J Stogios
> Ph.D. candidate, Privé Lab
> Dept. of Medical Biophysics, University of Toronto
> Ontario Cancer Institute, Princess Margaret Hospital
> e: pstogios at uhnres.utoronto.ca
> w: http://xtal.uhnres.utoronto.ca/prive
> p: (416) 946-4501x3280
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l



