[Bioperl-l] Can't find gene sequence in chromosome sequence

Sean Davis sdavis2 at mail.nih.gov
Fri Jan 6 06:34:19 EST 2006




On 1/6/06 3:35 AM, "Andrew Walsh" <walsh at cenix-bioscience.com> wrote:

> If you look at the entry in the .gbs file (release 34.1), the exon
> coordinates for that mRNA are on the negative strand.  Are you using the
> transcript sequence or the gene sequence?  If you are using the gene
> sequence, reverse complementing should do the trick.  If you are using
> the transcript sequence, this will not work since you are missing the
> introns.
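
To make the strand and intron points above concrete, here is a minimal
sketch on toy data (plain Python, standard library only).  The sequences,
the exon/intron layout and the helper names are all made up for
illustration; they are not taken from the real mouse entry.

```python
# Toy data only: every sequence below is invented for illustration.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def find_on_either_strand(chromosome: str, query: str):
    """Return (strand, 0-based start on the forward strand), or None."""
    pos = chromosome.find(query)
    if pos != -1:
        return ("+", pos)
    pos = chromosome.find(reverse_complement(query))
    if pos != -1:
        return ("-", pos)
    return None

# Hypothetical minus-strand gene: two exons separated by one intron.
exon1, intron, exon2 = "ATGGCCTTAG", "GTAAGTCCCTTTCAG", "GATTACAGGA"
gene = exon1 + intron + exon2   # unspliced gene sequence
mrna = exon1 + exon2            # spliced transcript, intron removed
chromosome = "TTTTGGGG" + reverse_complement(gene) + "CCCCAAAA"

gene_hit = find_on_either_strand(chromosome, gene)   # found on "-" strand
mrna_hit = find_on_either_strand(chromosome, mrna)   # None: intron missing
exon_hits = [find_on_either_strand(chromosome, e) for e in (exon1, exon2)]

print(gene_hit, mrna_hit, exon_hits)
```

In this toy case the unspliced gene is found only after reverse
complementing, the spliced transcript is not found on either strand, and
each exon is found separately.  On the minus strand the exons appear in
reverse order along the forward coordinates.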

Another option that is readily available and more robust is to use BLAT
at the UCSC Genome Browser: simply paste the sequence into the BLAT search
form.  In addition to the complexities already noted, an mRNA sequence does
NOT necessarily match the associated genomic sequence base-for-base because
of SNPs, lower-quality sequence reads, etc.  Finally, since you have the
accession, you could simply look it up at UCSC and get the (curated) BLAT
results on the RefSeq track.
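
To illustrate the base-for-base point, here is a small sketch (plain
Python, toy sequences I made up) showing that a single mismatch is enough
to make an exact substring search fail, while an ungapped scan that counts
mismatches still locates the region.  The function name is hypothetical,
and this naive scan is far too slow for a whole chromosome and ignores
introns and indels, so it is not a substitute for BLAT.

```python
def best_window(chromosome: str, query: str):
    """Return (start, mismatches) for the ungapped window closest to query."""
    best = None
    for start in range(len(chromosome) - len(query) + 1):
        window = chromosome[start:start + len(query)]
        mismatches = sum(a != b for a, b in zip(window, query))
        if best is None or mismatches < best[1]:
            best = (start, mismatches)
    return best

# Invented genomic fragment and a read of it carrying one substitution.
genomic = "CCGGAATTACGTTGCATGCAAGGTTCC"
exon_read = "ACGTTGCTTGCAAGG"   # differs from genomic[8:23] at one base

exact = genomic.find(exon_read)          # exact search fails
approx = best_window(genomic, exon_read) # approximate scan still finds it

print(exact, approx)
```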

Sean


> 
> hz5 at njit.edu wrote:
>> NM accessions are mRNA, so the exons should be separated by introns in
>> the genomic sequence. Did you consider this when you searched?
>> 
>> Quoting Sam Al-Droubi <saldroubi at yahoo.com>:
>> 
>> 
>>> All,
>>> 
>>> I downloaded the fasta sequence for a mouse gene from
>>> genbank with accession number NM_01167.  I also
>>> downloaded the mouse chromosome 3 FASTA file from
>>> NCBI
>>> (ftp://ftp.ncbi.nlm.nih.gov/genomes/M_musculus/Assembled_chromosomes/mm_chr3.fa.gz).
>> 
>>> The problem is that I cannot find the gene sequence
>>> in the chromosome sequence. I used Perl
>>> index($chr_obj->seq,$seq_obj->seq) and I get -1,
>>> meaning no match.  I then searched by hand using grep
>>> and emacs and, to my surprise, the gene sequence is not
>>> in the mm_chr3.fa file. What am I doing wrong?  Do I
>>> have the wrong chromosome file?  I am positive that
>>> this gene is on this chromosome according to GenBank.
>>> By the way, I am doing this so that I can extract the
>>> promoter region right before the gene starts on the
>>> chromosome. 
>>> 
>>> Thank you in advance.
>>> 
>>> 
>>> 
>>> Sincerely, 
>>> Sam Al-Droubi, M.S.
>>> saldroubi at yahoo.com
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at portal.open-bio.org
>>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>> 
>> 
>> 
>> 
>> 
>> =========================================================
>> Haibo Zhang, PhD
>> Computational Biology
>> http://www.cyberpostdoc.org/
>> Share postdoc information in cyberspace. Welcome your stories, suggestions
>> and advice!
>> 
> 



