[Bioperl-l] Uniprot/Swiss accessions?
Chris Fields
cjfields at illinois.edu
Mon May 18 16:44:17 EDT 2009
If you need to retain the mapping between acc => gi it gets a little more
complicated; most queries to NCBI return a 'bag' of gi's without any
relation to their original accessions. You can still recover the pairing
via esummary, but you'll have to iterate through the returned records.
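
For what it's worth, here is a minimal Python sketch of that iteration. It assumes the version 1.0 esummary XML layout for the protein database, where each DocSum carries the gi in <Id> and the accession in the Item named "Caption". The sample XML, the accessions, the gi values, and the function name acc_to_gi are all made up for illustration; check the layout against a real response before relying on it.

```python
import xml.etree.ElementTree as ET

# Hypothetical esummary-style response (assumed layout, made-up values).
SAMPLE = """<eSummaryResult>
  <DocSum>
    <Id>1111</Id>
    <Item Name="Caption" Type="String">P00001</Item>
    <Item Name="Gi" Type="Integer">1111</Item>
  </DocSum>
  <DocSum>
    <Id>2222</Id>
    <Item Name="Caption" Type="String">P00002</Item>
    <Item Name="Gi" Type="Integer">2222</Item>
  </DocSum>
</eSummaryResult>"""


def acc_to_gi(xml_text):
    """Walk every DocSum and pair its accession (Caption) with its gi (Id)."""
    mapping = {}
    root = ET.fromstring(xml_text)
    for docsum in root.iter("DocSum"):
        gi = docsum.findtext("Id")
        acc = docsum.findtext("Item[@Name='Caption']")
        # Skip records that lack either field rather than guessing.
        if gi and acc:
            mapping[acc] = int(gi)
    return mapping


if __name__ == "__main__":
    for acc, gi in acc_to_gi(SAMPLE).items():
        print(acc, gi)
```

The point is only that the acc => gi relation has to be rebuilt record by record from the summaries, since the search itself hands back bare gi's.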
The other option is LiveLists (has both nuc and protein acc => gi).
I'm assuming this would have the swissprot accessions included (famous
last words):
ftp://ftp.ncbi.nih.gov/genbank/livelists/README.genbank.livelists
chris
On May 18, 2009, at 9:34 AM, Cook, Malcolm wrote:
> you could:
>
> 1) Use eutils search with -database protein -term "srcdb swiss
> prot"[Properties]
> If you use a retmax of 100000 it should only take a few seconds to
> download the 458,445 GI numbers.
> I just did it.
>
> 2) Use fastacmd to extract the fasta from nr for these gis, and
> parse the deflines.
> (assuming you have a copy of nr)
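
The two steps above can be sketched roughly as follows in Python. The first function only builds the paged esearch URLs (458,445 ids at retmax 100000 is five requests) and does not contact NCBI; the second pulls Swiss-Prot accession => gi pairs out of an nr-style defline. Assumptions: the function names are hypothetical, the defline entries look like gi|<number>|sp|<accession>|<name> and are separated either by Ctrl-A or by " >", and NCBI still accepts the retmax used here (current limits may be lower, so treat it as a parameter).

```python
import math
import re
from urllib.parse import urlencode

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_urls(term, total, retmax=100000):
    """Build one esearch URL per page of `retmax` ids (no network access)."""
    pages = math.ceil(total / retmax)
    urls = []
    for page in range(pages):
        query = urlencode({
            "db": "protein",
            "term": term,
            "retmax": retmax,
            "retstart": page * retmax,  # offset of the first id on this page
        })
        urls.append(ESEARCH + "?" + query)
    return urls


def sp_acc_to_gi(defline):
    """Return {swissprot_accession: gi} for the 'sp' entries in one defline."""
    mapping = {}
    # Merged nr deflines hold several entries; split on Ctrl-A or ' >gi|'.
    for entry in re.split(r"\x01| >(?=gi\|)", defline.lstrip(">")):
        fields = entry.split("|")
        # Assumed layout: gi|<number>|sp|<accession>|<entry name> ...
        if len(fields) >= 4 and fields[0] == "gi" and fields[2] == "sp":
            acc = fields[3].split(".")[0]  # drop a version suffix if present
            mapping[acc] = int(fields[1])
    return mapping


if __name__ == "__main__":
    urls = esearch_urls('"srcdb swiss prot"[Properties]', 458445)
    print(len(urls), "requests")
    # Made-up defline for illustration only.
    print(sp_acc_to_gi(">gi|1111|sp|P00001|ABC_HUMAN Example protein"))
```

Non-Swiss-Prot entries in a merged defline (ref, gb, and so on) are simply skipped, so the result keeps only the acc => gi pairs being asked about.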
>
>
> Does this work for you?
>
>
> Malcolm Cook
> Stowers Institute for Medical Research - Kansas City, Missouri
>
>
>> -----Original Message-----
>> From: bioperl-l-bounces at lists.open-bio.org
>> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of
>> Smithies, Russell
>> Sent: Sunday, May 17, 2009 11:53 PM
>> To: 'BioPerl List'
>> Subject: [Bioperl-l] Uniprot/Swiss accessions?
>>
>> Does anyone know of a way to get GI numbers for
>> Uniprot/Swissprot accessions?
>>
>> Fasta from Uniprot's FTP site doesn't formatdb correctly
>> (with the -o T option) as it's missing the gi number in the
>> fasta header.
>> NCBI won't let you use SwissProt ids in batch-entrez and I
>> don't want to have to look up all 466,739 of them.
>> I could use Bio::DB::EUtilities and query each id but even at
>> 3 queries/second (the limit changed recently) it would take too
>> long.
>>
>> Any ideas?
>> Is there a swissprot2gi list somewhere?
>>
>> Thanx,
>>
>>
>> Russell Smithies
>>
>> Bioinformatics Applications Developer
>> T +64 3 489 9085
>> E russell.smithies at agresearch.co.nz
>>
>> Invermay Research Centre
>> Puddle Alley,
>> Mosgiel,
>> New Zealand
>> T +64 3 489 3809
>> F +64 3 489 9174
>> www.agresearch.co.nz
>>
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>