[Bioperl-l] Missing Sequences

Ewan Birney birney@ebi.ac.uk
Thu, 30 May 2002 17:32:11 +0100 (BST)


On Thu, 30 May 2002, Mick Watson wrote:

> This is an old-ish problem when using Bioperl to fetch multiple
> sequences from GenBank/EMBL.
> 
> I am using EMBL.pm (Bioperl 1.0) to fetch multiple sequences that have
> been identified from a BLAST search against UniGene.  Parsing the
> accession from UniGene entries is simple as I just look for the
> 
>     /gb=.....
> 
> token and I have the accessions.  Simple.
> 
> The problem is, I guess, that these are GenBank accessions so I get the
> following list:
> 
> AL117415 AJ291674 AJ291673 AJ291675 NM_022139 AF253318 NM_025220
> AB055891 BI826766 BG547620
> 
> When I use EMBL.pm to fetch these, it croaks with the error that
> NM_022139 and NM_025220 do not exist, and when I try to fetch them from
> the EBI, it's right, they don't.  However, when I go to the NCBI, they
> DO exist in GenBank (or at least the NCBI's nucleotide fetch tool says
> that they do).
> 
> So my question is why is it that there are sequences in GenBank that
> aren't in EMBL?  I'm guessing the NM_ prefix has some sort of
> relevance....
> 

They are RefSeq sequences. I think we have a RefSeq database module in
Bioperl (Bio::DB::RefSeq).
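
One way to avoid the failed fetches is to split the accession list before
fetching, so that only the non-RefSeq accessions go to EMBL. The sketch
below is in Python rather than Perl and is only an illustration: the
function name split_accessions and the pattern are my own, and the test
(two capital letters followed by an underscore, as in NM_) is a heuristic
for spotting RefSeq accessions, not an official parser.

```python
import re

# Heuristic (an assumption of this sketch): RefSeq accessions start with
# two capital letters and an underscore, e.g. NM_022139, optionally
# followed by a ".version" suffix.
REFSEQ_PATTERN = re.compile(r"^[A-Z]{2}_[A-Z0-9]+(\.\d+)?$")


def split_accessions(accessions):
    """Return (refseq, other): RefSeq-style accessions and all the rest."""
    refseq, other = [], []
    for acc in accessions:
        if REFSEQ_PATTERN.match(acc):
            refseq.append(acc)
        else:
            other.append(acc)
    return refseq, other


# The accession list from the original question.
accessions = (
    "AL117415 AJ291674 AJ291673 AJ291675 NM_022139 "
    "AF253318 NM_025220 AB055891 BI826766 BG547620"
).split()

refseq, other = split_accessions(accessions)
print("RefSeq (fetch from a RefSeq source):", refseq)
print("EMBL/GenBank/DDBJ (fetch from EMBL):", other)
```

The second list can then go to EMBL.pm as before, and the first to
whichever RefSeq source is available.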


"GenBank" means at least two things:

  (a) the American part of the international sequence database, shared
with partners in Europe and Japan (accession numbers such as AL117415
are common to all three)

  (b) the internal system used at NCBI to store DNA sequences, including
sequences *not* in the international databases because they are derived
from sequences in there (RefSeq).


It is a long-standing gripe from us Europeans that NCBI conveniently blends
all these concepts into one.


> Also, this looks as if this will force me to use GenBank.pm to fetch the
> sequences and not EMBL.pm, and I don't want to do this for various
> reasons....
> 
> Thanks
> Mick
> 

-----------------------------------------------------------------
Ewan Birney. Mobile: +44 (0)7970 151230, Work: +44 1223 494420
<birney@ebi.ac.uk>. 
-----------------------------------------------------------------