[Bioperl-l] Gene Type in Entrez gene?

Jill jillianrowe91286 at gmail.com
Wed Sep 28 06:03:32 UTC 2011


Hi there,

I am using the Bio::DB::EUtilities module to download gene sequences
based on a query.

<code>
while (my $docsum = $summaries->next_DocSum) {
  ## some items in DocSum are also named ChrStart so we pick the
  ## genomic information item and get the coordinates from it
  my ($genomic_info)  = $docsum->get_Items_by_name('GenomicInfoType');

  ## some entries may have no data on genomic coordinates; this
  ## condition filters them out
  if (!$genomic_info) {
    ## found no genomic coordinates data
    next;
  }

  ## get coordinates of sequence
  ## get_contents_by_name always returns a list
  my ($chr_acc_ver) = $genomic_info->get_contents_by_name("ChrAccVer");
  my ($chr_start)   = $genomic_info->get_contents_by_name("ChrStart");
  my ($chr_stop)    = $genomic_info->get_contents_by_name("ChrStop");
  my $strand;

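  ## $bp5_extra (set earlier in the script) is the number of extra
  ## bases to include on each side of the gene
  if ($chr_start < $chr_stop) {
    ## plus strand: widen the region outwards from start and stop
    $strand     = 1;
    $chr_start  = $chr_start + 1 - $bp5_extra;
    $chr_stop   = $chr_stop  + 1 + $bp5_extra;
  } elsif ($chr_start > $chr_stop) {
    ## minus strand: start is the larger coordinate, so the signs flip
    $strand     = 2;
    $chr_start  = $chr_start + 1 + $bp5_extra;
    $chr_stop   = $chr_stop  + 1 - $bp5_extra;
  } else {
    ## start equals stop, so there is no region to fetch
    next;
  }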

  while (my $item = $docsum->next_Item('flattened')) {
    next if ($item->get_name =~ m/NomenclatureName/);
    if ($item->get_name =~ m/Description/) {
      $description = $item->get_content if $item->get_content;
      $description =~ tr/ /_/;
      print $description, "\n";
    }
    if ($item->get_name =~ m/Name/) {
      $name = $item->get_content if $item->get_content;
      print $name, "\n";
    }
    printf("%-20s:%s\n", $item->get_name, $item->get_content)
      if $item->get_content;
  }
}
</code>
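
The coordinate handling in that loop can be pulled out and checked on its
own. Below is a minimal sketch of it, not part of BioPerl: the helper name
padded_region and its $flank argument (standing in for $bp5_extra) are
made up for illustration, and it assumes the "+1" is there to turn the
ChrStart/ChrStop values into 1-based positions.

```perl
use strict;
use warnings;

# Hypothetical helper (not a BioPerl function): shift ChrStart/ChrStop
# by +1, pad the region by $flank bases on each side, and return a
# strand code (1 = plus, 2 = minus) alongside the new coordinates.
sub padded_region {
    my ($chr_start, $chr_stop, $flank) = @_;

    # Same as the "else { next }" branch: no usable region.
    return if $chr_start == $chr_stop;

    my $strand = $chr_start < $chr_stop ? 1 : 2;

    # On the minus strand start > stop, so the padding direction flips.
    my $pad = $strand == 1 ? $flank : -$flank;

    return ($chr_start + 1 - $pad, $chr_stop + 1 + $pad, $strand);
}

my @plus  = padded_region(1000, 2000, 100);
my @minus = padded_region(2000, 1000, 100);
my @empty = padded_region(1500, 1500, 100);

print "@plus\n";     # prints "901 2101 1"
print "@minus\n";    # prints "2101 901 2"
print scalar(@empty), "\n";    # prints "0"
```

Keeping this in a small sub makes it easy to test the plus and minus
strand cases separately from the Entrez calls.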

I then go on to use GenBank to download the sequences for each
chromosome slice. That part works great. But I am also trying to get
the gene type (either protein coding or pseudo). I can see it in the
summary on the Entrez Gene site, but I can't get to it through BioPerl.
When I print out all the contents of the summary, it doesn't show up
there either.

Any help?

Thanks!


