[Bioperl-l] Retrieving promoter sequenc

Hermann Norpois hnorpois at googlemail.com
Mon Apr 30 18:06:40 UTC 2012


Dear list,

I am trying to write a script that retrieves the 700 bp sequence upstream of
the TSS at the 5' end of a gene (a putative promoter sequence). This page gave
me some information on how to do so (chapters *Using Bio::DB::EntrezGene to get
genomic coordinates* and *Using Bio::DB::GenBank when you have genomic
coordinates to get a Seq object*):
http://www.bioperl.org/wiki/HOWTO:Getting_Genomic_Sequences

However, I do not know how to define $chr_acc_ver (see below).

#!/bin/perl -w
use strict;
use Bio::DB::EntrezGene;
use Bio::SeqIO;
use Bio::DB::GenBank;

my $id = "12064"; # bdnf

my $seqio_obj = Bio::SeqIO->new(-file => '>s2.fasta', -format => 'fasta' );

my $db = Bio::DB::EntrezGene->new;

my $seq = $db->get_Seq_by_id($id);

my $ac = $seq->annotation;

for my $ann ($ac->get_Annotations('dblink')) {
    if ($ann->database eq "Evidence Viewer") {
        # get the sequence identifier, the start, and the stop
        my ($contig, $from, $to) = $ann->url =~
          /contig=([^&]+).+from=(\d+)&to=(\d+)/;
        my $chr_start = $from - 700;
        my $chr_stop  = $from;
        my $gb = Bio::DB::GenBank->new(-format    => 'genbank',
                                       -seq_start => $chr_start,
                                       -seq_stop  => $chr_stop,
                                       # -strand  => $strand
                                      );

        # How do I define $chr_acc_ver?
        my $obj = $gb->get_Seq_by_id($chr_acc_ver);

        $seqio_obj->write_seq($obj);
        # print "$contig\t$from\t$to\n$chr_start\t$chr_stop\n";
    }
}
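
For reference, the coordinate arithmetic I am aiming for can be sketched
separately from the database calls. This is only a minimal sketch under my own
assumptions: upstream_window is a hypothetical helper (not part of BioPerl),
coordinates are taken to be 1-based and inclusive with $from <= $to, and the
strand is given as 1 or -1. If both ends are inclusive, $from-700 to $from as
in the script above would span 701 bases and would only make sense for a
plus-strand gene.

```perl
use strict;
use warnings;

# Hypothetical helper (not a BioPerl function): returns the (start, stop)
# of the $len bases directly upstream of a gene. Assumes 1-based inclusive
# coordinates with $from <= $to, and $strand given as 1 or -1.
sub upstream_window {
    my ($from, $to, $strand, $len) = @_;
    if ($strand >= 0) {
        # Plus strand: the window ends one base before the gene start.
        my $start = $from - $len;
        $start = 1 if $start < 1;    # clamp at the beginning of the contig
        return ($start, $from - 1);
    }
    # Minus strand: upstream lies beyond the larger coordinate.
    return ($to + 1, $to + $len);
}

# Example with made-up coordinates for a plus-strand gene.
my ($start, $stop) = upstream_window(1000, 5000, 1, 700);
print "$start\t$stop\n";
```

For a minus-strand gene the fetched sequence would presumably still have to be
reverse-complemented, which is what the commented-out -strand option is for.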

Can anybody give me a hint on how this might work?

Thanks,
Hermann Norpois



