[Bioperl-l] Re: RefSeq and NT_****** contig Id

Heikki Lehvaslaiho heikki at nildram.co.uk
Mon Jul 14 23:41:43 EDT 2003


Jing,

Bio::DB::RefSeq used to inherit from Bio::DB::NCBIHelper, but lately it
has been a subclass of Bio::DB::DBFetch. Looks like in the transition we
lost the warning:

  $self->throw("NT_ contigs are whole chromosome files which are
    not part of regular database distributions. Go to  
    ftp://ftp.ncbi.nih.gov/genomes/.") 
	if $ids =~ /NT_/;

It is also true that the NCBI Entrez web interface now allows retrieving
NT_ contigs, so it would be possible to hack the RefSeq class to retrieve
them. However, NCBI has asked us to help limit the load on their online
services, so I am hesitant to do that when their eutils server is excluding
them (Or is it? Do we just need different parameters?). Downloading a
28,477,090 base mouse chromosome 1 sequence with tons of annotation is
certainly heavy. The warning should definitely be put back in.
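
Until the check is restored in the module, a guard along these lines could
be run in a script before calling get_Seq_by_acc. This is only a minimal
sketch: reject_nt_contigs is a hypothetical helper, not part of BioPerl, and
anchoring the pattern at the start of the id is my assumption (the original
check matched /NT_/ anywhere in $ids).

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): die early when any requested
# accession looks like an NT_ contig, mirroring the throw() that was lost.
sub reject_nt_contigs {
    my @ids = @_;

    # Assumption: contig accessions start with "NT_" followed by digits.
    my @contigs = grep { /^NT_\d+/ } @ids;

    die "NT_ contigs are whole chromosome files which are not part of "
      . "regular database distributions. Go to "
      . "ftp://ftp.ncbi.nih.gov/genomes/. Offending ids: @contigs\n"
      if @contigs;

    return @ids;
}

# Usage: NC_ accessions pass through, NT_ accessions raise an exception
# that the caller traps with eval.
for my $id (qw(NC_000913 NT_039167)) {
    if ( eval { reject_nt_contigs($id); 1 } ) {
        print "$id: ok to fetch\n";
    }
    else {
        print "$id: refused\n";
    }
}
```

Dying instead of returning undef means the caller sees why nothing came
back, rather than an unexplained undefined $seq.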

Yours,
	-Heikki

P.S. DBI is for accessing a local relational database and is not needed here.
	-H.

On Mon, 2003-07-14 at 19:36, jzhao wrote:
> Dear Sir,
> 
> I was trying to retrieve some mouse contig data from the RefSeq database 
> with Bioperl. My testing perl script looks like:
> 
> use Bio::DB::RefSeq;
> use Bio::SeqIO;
> use DBI;
> use strict;
> 
> my $gb = new Bio::DB::RefSeq;
> my $seq = $gb->get_Seq_by_acc('NT_039167');
> 
> if ( defined $seq ) {
> 	print "seq defined\n";
> }
> else {
> 	print "seq undefined\n";
> }
> 
> This script works with accession ids like NC_000913 (bacterial genome), but 
> with an NT_****** contig id, $seq comes back undefined. I checked: these 
> contig data are stored on the RefSeq db ftp site, so why are they not 
> available through the DBI interface? Is there anything I'm missing here?
> 
> Thank you very much,
> Jing
-- 
______ _/      _/_____________________________________________________
      _/      _/                      http://www.ebi.ac.uk/mutations/
     _/  _/  _/  Heikki Lehvaslaiho    heikki_at_ebi ac uk
    _/_/_/_/_/  EMBL Outstation, European Bioinformatics Institute
   _/  _/  _/  Wellcome Trust Genome Campus, Hinxton
  _/  _/  _/  Cambs. CB10 1SD, United Kingdom
     _/      Phone: +44 (0)1223 494 644   FAX: +44 (0)1223 494 468
___ _/_/_/_/_/________________________________________________________


