[Bioperl-l] Uniprot/Swiss accessions?
Cook, Malcolm
MEC at stowers.org
Mon May 18 21:11:40 UTC 2009
Chris,
LiveLists, eh? Cool! So the GIs could be obtained with an eutils search and then translated to accessions using LiveLists.
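A minimal sketch of that two-step idea, in Python rather than BioPerl for brevity. The esearch endpoint and its db/term/retmax/retstart parameters are standard E-utilities; the helper names are hypothetical, and the LiveLists row layout (accession,version,gi, comma-separated) is an assumption that should be checked against README.genbank.livelists before relying on it.

```python
import csv
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(term, db="protein", retmax=100000, retstart=0):
    """Build an esearch URL (hypothetical helper); fetch it with urllib.request.urlopen."""
    params = {"db": db, "term": term, "retmax": retmax, "retstart": retstart}
    return ESEARCH + "?" + urlencode(params)


def parse_gis(xml_text):
    """Pull the GI numbers out of the <IdList> of an esearch XML response."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.findall("./IdList/Id")]


def gi_to_accession(livelist_lines, wanted_gis):
    """Stream LiveLists-style rows and keep only the GIs we asked for.

    ASSUMPTION: each row is 'accession,version,gi'. Rows with any other
    number of fields are skipped.
    """
    wanted = set(wanted_gis)
    mapping = {}
    for row in csv.reader(livelist_lines):
        if len(row) != 3:
            continue
        acc, version, gi = (field.strip() for field in row)
        if gi in wanted:
            mapping[gi] = f"{acc}.{version}"
    return mapping
```

To page through a result set larger than retmax, call esearch_url again with retstart advanced by retmax. Passing an open file handle as livelist_lines streams the file line by line, so the whole list never has to sit in memory.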
On a side note: do you happen to know if LiveLists includes RefSeq identifiers/GIs?
Thx,
Malcolm Cook
Stowers Institute for Medical Research - Kansas City, Missouri
> -----Original Message-----
> From: Chris Fields [mailto:cjfields at illinois.edu]
> Sent: Monday, May 18, 2009 3:44 PM
> To: Cook, Malcolm
> Cc: 'Smithies, Russell'; 'BioPerl List'
> Subject: Re: [Bioperl-l] Uniprot/Swiss accessions?
>
> If you need to retain the mapping between acc => gi it gets a
> little more complicated; most queries to NCBI return a
> 'bag' of gi's w/o any relation to their original accession.
> You can grab them via esummary, but you'll have to
> iterate through them.
>
> The other option is LiveLists (has both nuc and protein acc => gi).
> I'm assuming this would have the swissprot accessions
> included (famous last words):
>
> ftp://ftp.ncbi.nih.gov/genbank/livelists/README.genbank.livelists
>
> chris
>
>
>
> On May 18, 2009, at 9:34 AM, Cook, Malcolm wrote:
>
> > you could:
> >
> > 1) Use eutils search with -database protein -term "srcdb swiss prot"[Properties].
> > If you use a retmax of 100000 it should only take a few seconds to
> > download the 458,445 gi numbers.
> > I just did it.
> >
> > 2) use fastacmd to extract the fasta from nr for these gis, and parse
> > the defline.
> > (assuming you have a copy of nr)
> >
> >
> > Does this work for you?
> >
> >
> > Malcolm Cook
> > Stowers Institute for Medical Research - Kansas City, Missouri
> >
> >
> >> -----Original Message-----
> >> From: bioperl-l-bounces at lists.open-bio.org
> >> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of
> >> Smithies, Russell
> >> Sent: Sunday, May 17, 2009 11:53 PM
> >> To: 'BioPerl List'
> >> Subject: [Bioperl-l] Uniprot/Swiss accessions?
> >>
> >> Does anyone know of a way to get GI numbers for Uniprot/Swissprot
> >> accessions?
> >>
> >> Fasta from Uniprot's FTP site doesn't formatdb correctly (with the -o
> >> T option) as it's missing the gi number in the fasta header.
> >> NCBI won't let you use SwissProt ids in batch-entrez and I don't want
> >> to have to look up all 466,739 of them.
> >> I could use Bio::DB::EUtilities and query each id but even at 10
> >> queries/second (the limit changed recently) it would take too long.
> >>
> >> Any ideas?
> >> Is there a swissprot2gi list somewhere?
> >>
> >> Thanx,
> >>
> >>
> >> Russell Smithies
> >>
> >> Bioinformatics Applications Developer T +64 3 489 9085 E
> >> russell.smithies at agresearch.co.nz
> >>
> >> Invermay Research Centre
> >> Puddle Alley,
> >> Mosgiel,
> >> New Zealand
> >> T +64 3 489 3809
> >> F +64 3 489 9174
> >> www.agresearch.co.nz
> >>
> >>
> >>
> >>
> >> _______________________________________________
> >> Bioperl-l mailing list
> >> Bioperl-l at lists.open-bio.org
> >> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >>