[Bioperl-l] Bio::DB::RefSeq and NC_007092

armendarez77 at hotmail.com armendarez77 at hotmail.com
Tue Mar 2 21:06:17 UTC 2010


Hello,

I am writing a script to remotely access annotation files and parse information using Bio::DB::RefSeq and Bio::DB::Genbank.  I was testing it with random RefSeq accession numbers (NC_######) when something odd happened.  When I used the accession number 'NC_007092', the script seemed to freeze.  After some time, 'Out of Memory' was printed to the terminal.  

When I investigated the annotation file associated with NC_007092, a MapViewer page opened.  It turns out that NC_007092 is a genome shotgun sequence, but it does not start with 'NZ' as I though all shotgun sequences did.  

Is this a random event that I don't have to worry much about or is there a way to pre-screen accession numbers to ensure they are associated with complete genome RefSeq files?  

I've included my script in case there is something I missed that could have prevented this.

Thank you,

Veronica


_________________

use strict;
use Bio::Perl;
use Getopt::Long;
use IO::Handle;

my $accessionNumber;

GetOptions("accessionNumber=s"=>\$accessionNumber);
unless($accessionNumber){
    print<<"OPTIONS";
    options for $0
    accessionNumber    -a    accession number
OPTIONS
die;
}

my $description = annotation_info($accessionNumber);

print "$description\n";



sub annotation_info{
    
    my $seqObj;
    
    my $accNum = shift(@_);
    
    my $rs = Bio::DB::RefSeq->new();
    my $gb = Bio::DB::GenBank->new();
    

    if($accNum =~ /\w\w_\d{6}/){ #RefSeq annotations include an underscore in their accession number
        
        $seqObj = $rs->get_Seq_by_id($accNum);
    }
    elsif($accNum !~ /_/){ #GenBank annotation
        $seqObj = $gb->get_Seq_by_id($accNum);
    }
    
    return $seqObj->desc();
}

 		 	   		  
_________________________________________________________________
Hotmail: Trusted email with Microsoft’s powerful SPAM protection.
http://clk.atdmt.com/GBL/go/201469226/direct/01/



More information about the Bioperl-l mailing list