[Bioperl-l] Getting sequences by base pair locations

Cook, Malcolm MEC at stowers-institute.org
Fri Jul 28 16:44:43 UTC 2006


There are many options.

But it looks like you only have start and end coordinates! How do you
know which chromosome/contig each hit was on?

Assuming you have this, and you ran blat with a local copy of the
blat program and the genome, then in addition to the blat command you
have the twoBitToFa command, which can extract the hit regions from the
genome's .2bit file (see
http://genome.ucsc.edu/goldenPath/help/blatSpec.html).
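If you would rather work directly from the FASTA file, here is a
minimal sketch in plain Python of pulling regions out by name and
coordinates. The function names (read_fasta, extract_regions) and the
toy sequences are made up for illustration, and I am assuming the
coordinates are 1-based and inclusive; check that against how your
hits were reported. It holds one sequence in memory at a time, so one
file per chromosome is not required.

```python
import io


def read_fasta(handle):
    """Yield (name, sequence) pairs from an open FASTA file handle."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            # The sequence name is the first word of the header line.
            name, chunks = (header[0] if header else ""), []
        elif line:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def extract_regions(handle, regions):
    """Return {(name, start, end): subsequence} for each requested region.

    regions is a list of (name, start, end) tuples. Coordinates are
    ASSUMED to be 1-based and inclusive.
    """
    wanted = {}
    for name, start, end in regions:
        wanted.setdefault(name, []).append((start, end))

    hits = {}
    for name, seq in read_fasta(handle):
        for start, end in wanted.get(name, []):
            # Convert 1-based inclusive to Python's 0-based half-open slice.
            hits[(name, start, end)] = seq[start - 1:end]
    return hits


# Toy data standing in for a genome file; with a real file you would pass
# open("your_genome.fa") instead (the file name here is hypothetical).
demo = io.StringIO(">chr1 toy\nACGTACGTAC\nGGGTTTAAAC\n>chr2\nTTTTCCCCGG\n")
result = extract_regions(
    demo, [("chr1", 3, 6), ("chr1", 9, 12), ("chr2", 1, 4)]
)
for (name, start, end), subseq in sorted(result.items()):
    print(f">{name}:{start}-{end}\n{subseq}")
```

Note that the sequence name is still required for each region, which
is why the chromosome/contig question above matters. If the
coordinates come straight from blat's PSL output, the start values
there are, as far as I know, 0-based, so add 1 to each start before
passing it in.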

Or did you do the blat at ucsc?

Malcolm Cook

Database Applications Manager, Bioinformatics
Stowers Institute for Medical Research 

oh - I replied similarly in the Bio BB forum, but it is held for
moderation so am replying here as well
 


________________________________

	From: bioperl-l-bounces at lists.open-bio.org
[mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Yuval Itan
	Sent: Friday, July 28, 2006 7:08 AM
	To: bioperl-l at lists.open-bio.org
	Subject: [Bioperl-l] Getting sequences by base pair locations
	
	
	Hello all, 

	I was BLATing a few hundred human genes against the chimp
genome, and kept the best chimp hits for every human gene. 
	I have the base pair start and end location for every chimp hit,
and I need to get the sequence for each of these chimp hits. Here is an
example for a few chimp hits bp locations: 

	Start End 
	142854 144504 
	154479 155198 
	153066 167370 
	163146 163559 

	I have one chimp genome file (about 3GB) including all
chromosomes, but I could also get one file per chromosome if that would
make things easier. Does anyone have a script or a link for an interface
that can do the job? 

	Thank you very much. 




