[Bioperl-l] Uniprot/Swiss accessions?

Cook, Malcolm MEC at stowers.org
Mon May 18 21:11:40 UTC 2009


Chris,

livelists, eh?  Cool!  So, the gis could be obtained using an eutils search and then translated to accessions using livelists.
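
The translation step could look something like the sketch below. It rests on an assumption that should be checked against the LiveLists README: that each record is one comma-separated line of the form accession,version,gi. The function names and the sample records are made up for illustration. Only the gis you actually want are kept in memory, since the full list is very large.

```python
import csv
from typing import Dict, Iterable, List, Optional, Tuple


def load_gi_map(lines: Iterable[str], wanted: Iterable[str]) -> Dict[str, str]:
    """Build a gi -> 'accession.version' lookup for the wanted gis only.

    ASSUMPTION: each record is 'accession,version,gi' (comma-separated).
    Verify this against the LiveLists README before relying on it.
    """
    wanted_set = set(wanted)
    gi_to_acc: Dict[str, str] = {}
    for row in csv.reader(lines):
        if len(row) != 3:
            continue  # skip blank or malformed records
        accession, version, gi = (field.strip() for field in row)
        if gi in wanted_set:
            gi_to_acc[gi] = f"{accession}.{version}"
    return gi_to_acc


def translate(gis: Iterable[str],
              gi_to_acc: Dict[str, str]) -> List[Tuple[str, Optional[str]]]:
    """Pair each gi with its accession, or None if the gi was not listed."""
    return [(gi, gi_to_acc.get(gi)) for gi in gis]


if __name__ == "__main__":
    # Made-up records, standing in for lines read from a LiveLists file.
    sample = ["P12345,1,111", "Q67890,2,222"]
    my_gis = ["222", "999"]
    mapping = load_gi_map(sample, my_gis)
    for gi, acc in translate(my_gis, mapping):
        print(gi, acc)
```

For a real (gzipped) file, the lines could be streamed with gzip.open(path, "rt") rather than loading the whole file at once.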

On a side note: do you happen to know if livelists includes RefSeq identifiers/gis?

Thx,

Malcolm Cook
Stowers Institute for Medical Research - Kansas City, Missouri
  

> -----Original Message-----
> From: Chris Fields [mailto:cjfields at illinois.edu] 
> Sent: Monday, May 18, 2009 3:44 PM
> To: Cook, Malcolm
> Cc: 'Smithies, Russell'; 'BioPerl List'
> Subject: Re: [Bioperl-l] Uniprot/Swiss accessions?
> 
> If you need to retain mapping between acc => gi it gets a 
> little more complicated; most procedures to NCBI return a 
> 'bag' of gi's w/o any relation to their original accession.  
> You can grab them via esummary, though you'll have to 
> iterate through them.
> 
> The other option is LiveLists (has both nuc and protein acc => gi).   
> I'm assuming this would have the swissprot accessions 
> included (famous last words):
> 
> ftp://ftp.ncbi.nih.gov/genbank/livelists/README.genbank.livelists
> 
> chris
> 
> 
> 
> On May 18, 2009, at 9:34 AM, Cook, Malcolm wrote:
> 
> > you could:
> >
> > 1) Use eutils search with -database protein -term "srcdb swiss
> > prot"[Properties]  If you use a retmax of 100000 it should only
> > take a few seconds to download the 458,445 gi numbers.
> >  I just did it.
> >
> > 2) use fastacmd to extract the fasta from nr for these gis, and
> > parse the defline.
> >  (assuming you have a copy of nr)
> >
> >
> > Does this work for you?
> >
> >
> > Malcolm Cook
> > Stowers Institute for Medical Research - Kansas City, Missouri
> >
> >
> >> -----Original Message-----
> >> From: bioperl-l-bounces at lists.open-bio.org
> >> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of
> >> Smithies, Russell
> >> Sent: Sunday, May 17, 2009 11:53 PM
> >> To: 'BioPerl List'
> >> Subject: [Bioperl-l] Uniprot/Swiss accessions?
> >>
> >> Does anyone know of a way to get GI numbers for Uniprot/Swissprot 
> >> accessions?
> >>
> >> Fasta from Uniprot's FTP site doesn't formatdb correctly (with
> >> the -o T option) as it's missing the gi number in the fasta header.
> >> NCBI won't let you use SwissProt ids in batch-entrez and I don't
> >> want to have to look up all 466,739 of them.
> >> I could use Bio::DB::EUtilities and query each id but even at 10
> >> queries/second (the limit changed recently) it would take too long.
> >>
> >> Any ideas?
> >> Is there a swissprot2gi list somewhere?
> >>
> >> Thanx,
> >>
> >>
> >> Russell Smithies
> >>
> >> Bioinformatics Applications Developer T +64 3 489 9085 E  
> >> russell.smithies at agresearch.co.nz
> >>
> >> Invermay  Research Centre
> >> Puddle Alley,
> >> Mosgiel,
> >> New Zealand
> >> T  +64 3 489 3809
> >> F  +64 3 489 9174
> >> www.agresearch.co.nz
> >>
> >>
> >>
> >>
> >> _______________________________________________
> >> Bioperl-l mailing list
> >> Bioperl-l at lists.open-bio.org
> >> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >>
> 
> 


