[Bioperl-l] Bio::DB::EntrezGene or Bio::DB::Query::GenBank to obtain sequence metadata without sequence

Dan Kortschak dan.kortschak at adelaide.edu.au
Sun Oct 11 18:03:52 EDT 2009


Hi Russell,

I ended up using a hodgepodge of Bio::DB::GenBank and
Bio::DB::EUtilities:

#!/usr/bin/perl

use strict;
use warnings;

use Bio::DB::EUtilities;
use Bio::DB::GenBank;

my @uids = qw(89161185 89161199 89161205 89161207 51511721 89161210 89161213 51511724 89161216 89161187 51511727 89161190 51511729 51511730 51511731 51511732 51511734 51511735 42406306 51511747 51511750 89161203 17981852 89161218 89161220);

# Ask for a single base per record: only the annotation is wanted, not the sequence.
my $gb = Bio::DB::GenBank->new(-complexity => 1,-seq_stop=>1);
my $seqio = $gb->get_Stream_by_gi(\@uids);
my $summary = Bio::DB::EUtilities->new(-eutil => 'esummary',
                           -db => 'nucleotide',
                           -id => \@uids);

print "chromosome,refSeq,name,length\n";

my $index=0;

# Step through both result streams together; warn if the records get out of sync.
while (my $seq = $seqio->next_seq() and my $ds = $summary->next_DocSum) {
	warn "Database queries don't reconcile: ",$seq->primary_id,"-",$ds->get_id,"\n" if $seq->primary_id != $ds->get_id;
	print $index++,",",$seq->id(),".",$seq->version,",";
	my ($feat)=$seq->get_SeqFeatures;	# first feature of the record
	my $has_source = defined $feat && defined $seq->species;
	if( $has_source && $feat->annotation->get_Annotations('chromosome')) {
		print $seq->species->binomial;
		print " chromosome ",$feat->annotation->get_Annotations('chromosome'),",";
	} elsif ($has_source && $feat->annotation->get_Annotations('organelle')) {
		print $seq->species->binomial;
		print " ",$feat->annotation->get_Annotations('organelle'),",";
	} else {
		# Fail-over: use the leading words of the description line.
		my ($name) = $seq->desc =~ /^([[:alnum:] ]+)[[:graph:]]/;
		print defined $name ? $name : "", ",";
	}
	print $ds->get_contents_by_name('Length') if $ds->get_contents_by_name('Length');
	print "\n";
}


If Bio::DB::EUtilities can return a RichSeq or equivalent, I might switch
over from Bio::DB::GenBank, but this pretty much works at the moment. The
fail-overs give me grief and I don't like the kludge of asking for a
single base, but they work to get some of the details that the DocSum
doesn't provide (a sensible title, for example).
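
A minimal sketch of the description fail-over on its own, with no database
access, in case it helps to see what the pattern keeps. The helper name
fallback_name and the description strings are made up for illustration;
only the regular expression comes from the script above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): return the leading run of
# letters, digits and spaces in a description, or undef if nothing matches.
sub fallback_name {
    my ($desc) = @_;
    return unless defined $desc;
    my ($name) = $desc =~ /^([[:alnum:] ]+)[[:graph:]]/;
    return $name;
}

# Made-up description strings, for illustration only.
for my $desc ('Example organism chromosome 1, primary assembly',
              'Example organism plasmid pX; complete sequence',
              '') {
    my $name = fallback_name($desc);
    print defined $name ? "'$name'" : 'no match', "\n";
}
```

The pattern needs one printable character after the captured run, so a
description made up only of letters, digits and spaces, and ending in a
letter or digit, loses its final character.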

cheers
Dan

On Mon, 2009-10-12 at 08:46 +1300, Smithies, Russell wrote:

> Or you could try using Bio::DB::EUtilities, specifying 'gene' as the database and 'table' as the rettype.
> I'm not sure what rettypes are allowed under B:D:E but it should be in the docs.




