[Bioperl-l] newby Q: fetch sequence from one database and blast against another and retrieve coordinates

Tiffanie Moss tiffanie.moss at gmail.com
Mon Jul 18 16:20:10 UTC 2011


I have two FASTA files I want to use as databases: a contig file and a
scaffold file (from the assembled contigs). For the sequences I am
interested in, I have coordinates in the contig database and I need the
corresponding coordinates in the scaffold database. The contigs in the
contig file are named as parts of a scaffold (i.e. scaffold1000-3,
scaffold1000-4, etc.), and the scaffolds are listed in the scaffold file as
single entries (i.e. scaffold1000, scaffold1002, etc.). I want a script that
can use the contig coordinates to fetch the sequence, BLAST that sequence
against the corresponding scaffold in the scaffold file, and report the
scaffold coordinates (start and stop). Then I want to compare these
coordinates to another file containing EST scaffold location coordinates, in
order to determine whether my sequences of interest are located in these
regions.

Can anyone guide me as to where I can find a Perl or BioPerl script that I
can adapt to do this? I've started a script that will compare the scaffold
coordinates of the two files, but first I need to extract the sequences from
the contig file and get the corresponding scaffold location coordinates.
Many thanks in advance.
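
To make the workflow concrete, here is a rough sketch of the steps described
above. It is only an illustration under several assumptions: it is written in
Python rather than Perl/BioPerl, it uses an exact string match as a stand-in
for the BLAST step (a real run would need BLAST to tolerate mismatches and
reverse-strand hits), it assumes contig coordinates are 1-based and inclusive,
and it assumes the scaffold name is the contig name with the trailing "-N"
removed. All function names and the example sequences are hypothetical.

```python
import io


def read_fasta(handle):
    """Return {id: sequence} for every record in a FASTA file handle."""
    records, name, chunks = {}, None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                records[name] = "".join(chunks)
            # The ID is the first whitespace-delimited word after '>'.
            name, chunks = line[1:].split()[0], []
        elif line and name is not None:
            chunks.append(line)
    if name is not None:
        records[name] = "".join(chunks)
    return records


def scaffold_coords(contigs, scaffolds, contig_id, start, end):
    """Map 1-based inclusive contig coordinates onto the parent scaffold.

    Exact matching stands in for BLAST here; returns
    (scaffold_id, start, end) or None if the fragment is not found.
    """
    fragment = contigs[contig_id][start - 1:end].upper()
    if not fragment:
        return None
    # Assumed naming rule: 'scaffold1000-3' belongs to 'scaffold1000'.
    scaffold_id = contig_id.rsplit("-", 1)[0]
    offset = scaffolds[scaffold_id].upper().find(fragment)
    if offset < 0:
        return None
    return scaffold_id, offset + 1, offset + len(fragment)


def overlaps(a_start, a_end, b_start, b_end):
    """True if two inclusive intervals share at least one position."""
    return a_start <= b_end and b_start <= a_end


# Made-up example data, standing in for the contig and scaffold files.
contigs = read_fasta(io.StringIO(">scaffold1000-3\nACGTTGCA\n"))
scaffolds = read_fasta(io.StringIO(">scaffold1000\nGGGGACGTTGCANNNNCCCC\n"))

hit = scaffold_coords(contigs, scaffolds, "scaffold1000-3", 2, 6)
print(hit)
if hit is not None:
    # Compare against a hypothetical EST region at scaffold positions 8-20.
    print(overlaps(hit[1], hit[2], 8, 20))
```

With real data, the two io.StringIO handles would be replaced by open() calls
on the contig and scaffold files, and the exact-match lookup by parsing the
start and stop of the best BLAST hit.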

-- 
Tiffanie Yael Moss
PhD candidate
Case Western Reserve University
Department of Biology
2080 Adelbert Road, Millis 127
Cleveland, Ohio  44106-7080
Fax: (216) 368-4672
Ph: (216) 368-5301


