[Bioperl-l] Getting sequences by base pair locations

Cui, Wenwu (NIH/NLM/NCBI) [C] cuiw at ncbi.nlm.nih.gov
Fri Jul 28 13:46:50 UTC 2006


Maybe the easiest way is to use LWP to fetch the sequence from the
Ensembl export page. Here is an example URL for the region
CHIMP1A:10:12345678:12348888:

 

http://www.ensembl.org/Pan_troglodytes/exportview?format=fasta&l=10%3A12345678-12348888&action=export&_format=Text&output=txt&submit=Continue+%3E%3E
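
As a rough sketch of how that URL could be built for each hit and then fetched: the block below uses Python's standard urllib rather than LWP, purely as an illustration. The helper names build_export_url and fetch_region are made up for this example, the parameter values are copied from the URL above, and it assumes the exportview page still answers at that address, which may no longer be true.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Base of the export URL, taken from the example in this message.
EXPORT_BASE = "http://www.ensembl.org/Pan_troglodytes/exportview"


def build_export_url(chrom, start, end):
    """Build the export URL for chrom:start-end (hypothetical helper)."""
    params = {
        "format": "fasta",
        "l": f"{chrom}:{start}-{end}",  # region, e.g. 10:12345678-12348888
        "action": "export",
        "_format": "Text",
        "output": "txt",
        "submit": "Continue >>",
    }
    # urlencode percent-escapes ':' and '>' and turns the space into '+'.
    return EXPORT_BASE + "?" + urlencode(params)


def fetch_region(chrom, start, end, timeout=60):
    """Download the text for one region (assumes the URL is still served)."""
    with urlopen(build_export_url(chrom, start, end), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Calling build_export_url("10", 12345678, 12348888) reproduces the URL above. For a few hundred hits, a loop over fetch_region with a short pause between requests would be the polite way to use the server.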

 

Wenwu Cui 

________________________________

From: Yuval Itan [mailto:y.itan at ucl.ac.uk] 
Sent: Friday, July 28, 2006 8:08 AM
To: bioperl-l at lists.open-bio.org
Subject: [Bioperl-l] Getting sequences by base pair locations

 

Hello all, 

 

I BLATed a few hundred human genes against the chimp genome and kept
the best chimp hit for each human gene.

I have the base pair start and end locations for every chimp hit, and I
need to get the sequence for each of these hits. Here are the bp
locations of a few example hits:

 

Start    End
142854   144504
154479   155198
153066   167370
163146   163559

I have one chimp genome file (about 3GB) including all chromosomes, but
I could also get one file per chromosome if that would make things
easier. Does anyone have a script, or a link to an interface, that can
do the job?

 

Thank you very much. 
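
For the local-file route asked about above, a minimal sketch in plain Python (not BioPerl) might look like the following. It assumes a FASTA file whose header lines start with the chromosome name, and it assumes the start and end values are 1-based and inclusive. If the coordinates come straight from a BLAT PSL file, the starts are 0-based, so they would need 1 added first. The function names and the example record name are hypothetical.

```python
def read_fasta_record(path, name):
    """Return the sequence of the record whose header's first word is `name`."""
    chunks = []
    found = False
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                if found:
                    break  # reached the next record, so this one is complete
                words = line[1:].split()
                found = bool(words) and words[0] == name
            elif found:
                chunks.append(line.strip())
    if not found:
        raise KeyError(f"no record named {name!r} in {path}")
    return "".join(chunks)


def subsequence(seq, start, end):
    """Slice seq by 1-based, inclusive coordinates (an assumption, see above)."""
    if start < 1 or end < start or end > len(seq):
        raise ValueError(f"bad range {start}-{end} for length {len(seq)}")
    return seq[start - 1:end]
```

With one file per chromosome, the chromosome would be read once with read_fasta_record(path, "chr10") (record name assumed) and subsequence would then be called for every start and end pair on it, which avoids rescanning the 3GB file for each hit.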




