[Bioperl-l] blast hit to feature gene sequence in bioperl?

Chris Fields cjfields at illinois.edu
Mon Aug 17 14:22:26 UTC 2009


Yes, that's possible.  Take the hit coordinates from the BLAST report and
use Bio::DB::GenBank to pull out the matching subsequence, as in the
example below.  Note that the strand values differ from BioPerl's -1/0/1
convention; for efetch, strand 1 = plus (default) and 2 = minus (complement).

================================
use Bio::DB::GenBank;

my $factory = Bio::DB::GenBank->new(-format => 'genbank',
  -seq_start => $seqstart,
  -seq_stop => $seqend,
  -strand => $strand, # 1=plus, 2=minus
  );

# $id should be a UID; use get_Seq_by_acc() for accessions
my $seq = $factory->get_Seq_by_id($id);
================================

This returns a Bio::Seq object, though, so you'll need to write it out
through a SeqIO output stream.  Alternatively, you can use
Bio::DB::EUtilities to get the raw record via efetch, something like
this (untested):

================================
use Bio::DB::EUtilities;

my $fetcher = Bio::DB::EUtilities->new(
  -eutil => 'efetch',
  -db => 'nucleotide',
  -rettype => 'gb');

# loop: for each hit/HSP, grab sequence...
$fetcher->set_parameters(
  -id         => $id,       # UID or accession
  -seq_start  => $seqstart, # hit start
  -seq_stop   => $seqend,   # hit end
  -strand     => $strand    # 1=plus, 2=minus
);

# then get raw content
$fetcher->get_Response(-file => ">$id.gb");
================================
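
Since the strand conventions differ, a small helper for the per-hit loop
might be handy.  This is only a sketch: efetch_range() is a made-up name
(not part of BioPerl), and the swap assumes efetch wants seq_start to be
no larger than seq_stop.

```perl
use strict;
use warnings;

# Hypothetical helper (not a BioPerl function): turn hit coordinates and a
# BioPerl-style strand (-1/0/1) into efetch-style parameters, where
# strand 1 = plus and 2 = minus.
sub efetch_range {
    my ($start, $end, $strand) = @_;

    # Assumption: efetch expects seq_start <= seq_stop, so swap if needed.
    ($start, $end) = ($end, $start) if $start > $end;

    return (
        -seq_start => $start,
        -seq_stop  => $end,
        # Only a negative strand maps to 2; 0 or undef falls back to plus.
        -strand    => (defined $strand && $strand < 0) ? 2 : 1,
    );
}

# Example with made-up coordinates for a minus-strand hit
my %params = efetch_range(1200, 350, -1);
print join(', ', map { "$_ => $params{$_}" } sort keys %params), "\n";
```

Inside a Bio::SearchIO loop, the returned list could be passed along
with the ID, e.g. $fetcher->set_parameters(-id => $hit->accession,
efetch_range($hsp->start('hit'), $hsp->end('hit'), $hsp->strand('hit')));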

You could probably do something similar with Ensembl if the database
versions match; see:

http://www.bioperl.org/wiki/HOWTO:Getting_Genomic_Sequences
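
Going back to the Bio::DB::GenBank route: writing the returned Bio::Seq
to a SeqIO stream could look roughly like the following.  It is a sketch
only; the sequence is a made-up stand-in built locally so the snippet
runs without network access, and FASTA is just one possible output format.

```perl
use strict;
use warnings;
use Bio::Seq;
use Bio::SeqIO;

# Stand-in for the object returned by $factory->get_Seq_by_id($id);
# the ID and residues here are invented for illustration.
my $seq = Bio::Seq->new(
    -display_id => 'example_hit',
    -seq        => 'ATGGCCATTGTAATGGGCCGC',
);

# Collect the output in a string; pass -file => '>hits.fasta' instead of
# -fh to write to a file.
my $output = '';
open my $fh, '>', \$output or die "Cannot open in-memory handle: $!";

my $out = Bio::SeqIO->new(-fh => $fh, -format => 'fasta');
$out->write_seq($seq);
$out->close;

print $output;
```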

chris

On Aug 17, 2009, at 8:06 AM, David Quan wrote:

> Hello there,
>
>        I've been browsing around bioperl documentation and have used
> a blast parser, but am wondering if it is possible to use the start
> and end information for a hit to trace back to a gene in genbank and
> extract the sequence for that gene?  I have not been able to find
> elements that would work in such a way.  Hints and recommendations for
> elements that would be capable of behaving in such a way would be
> greatly appreciated.  Thanks very much.
>
> David N. Quan
>
> -- 
> Love of country is, at heart, trust in a nation's people, faith in
> their better nature, esteem for their best hopes, understanding for
> the magnificence and the distinctiveness and the huge, infinitely
> shaded cultural palette of their simple humanity.    --Bradley Burston
