[Bioperl-l] Uniprot/Swiss accessions?

granjeau at tagc.univ-mrs.fr granjeau at tagc.univ-mrs.fr
Mon May 18 17:39:07 EDT 2009


Maybe you could try the PICR service at EBI
http://www.ebi.ac.uk/Tools/picr/
or some other ID converter (for example, some of the Gene Ontology
tools), or even SRS.

I think there could be more than one gi per sp accession (it's not clear
to me whether you are looking at SwissProt only or at UniProtKB, i.e.
SP+TrEMBL).

Please let us know which solution you end up using.

Regards,
Samuel

> If you need to retain mapping between acc => gi it gets a little more
> complicated; most procedures to NCBI return a 'bag' of gi's w/o any
> relation to their original accession.  You can grab them via esummary,
> though, but you'll have to iterate through them.
>
> The other option is LiveLists (has both nuc and protein acc => gi).
> I'm assuming this would have the swissprot accessions included (famous
> last words):
>
> ftp://ftp.ncbi.nih.gov/genbank/livelists/README.genbank.livelists
>
> chris
>
>
>
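The esummary route chris describes can keep the acc => gi pairing if the accession is read back out of each returned record. The following is a minimal sketch in Python rather than BioPerl. It assumes the version 1.0 DocSum XML layout for the protein database, where <Id> holds the gi and the "Caption" item holds the accession without its version. The helper name acc_to_gi and the sample record are illustrative only.

```python
import xml.etree.ElementTree as ET


def acc_to_gi(esummary_xml):
    """Return {accession: gi} built from an esummary DocSum XML string."""
    mapping = {}
    root = ET.fromstring(esummary_xml)
    for docsum in root.iter("DocSum"):
        gi = docsum.findtext("Id")
        if gi is None:
            continue
        # Assumption: the Caption item carries the unversioned accession.
        for item in docsum.findall("Item"):
            if item.get("Name") == "Caption" and item.text:
                mapping[item.text.strip()] = int(gi)
    return mapping


# Illustrative response fragment, not captured from a live query.
SAMPLE = """<eSummaryResult>
  <DocSum>
    <Id>129295</Id>
    <Item Name="Caption" Type="String">P01013</Item>
    <Item Name="Gi" Type="Integer">129295</Item>
  </DocSum>
</eSummaryResult>"""

if __name__ == "__main__":
    print(acc_to_gi(SAMPLE))
```

Each esummary call would be given one batch of gi's, and the per-batch dictionaries merged afterwards.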
> On May 18, 2009, at 9:34 AM, Cook, Malcolm wrote:
>
>> you could:
>>
>> 1) Use eutils search with -database protein -term "srcdb swiss
>> prot"[Properties]
>>  If you use a retmax of 100000 it should only take a few seconds to
>> download the 458,445 gi numbers.
>>  I just did it.
>>
>> 2) use fastacmd to extract the fasta from nr for these gis, and
>> parse the defline.
>>  (assuming you have a copy of nr)
>>
>>
>> Does this work for you?
>>
>>
>> Malcolm Cook
>> Stowers Institute for Medical Research - Kansas City, Missouri
>>
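Malcolm's step 1 can be sketched as a plain E-utilities esearch request. The Python below only builds the request URLs and parses a response; it makes no network call. The db, term and retmax values come from his message, while the function names, the paging helper and the sample XML are assumptions made for illustration. The server may enforce a different cap on retmax than the 100000 used here.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
TERM = '"srcdb swiss prot"[Properties]'  # query term quoted in the thread


def esearch_url(retstart=0, retmax=100000):
    """Build one esearch URL for a page of SwissProt-derived protein gi's."""
    params = {"db": "protein", "term": TERM,
              "retstart": retstart, "retmax": retmax}
    return ESEARCH + "?" + urlencode(params)


def parse_ids(esearch_xml):
    """Return (total hit count, gi's on this page) from esearch XML."""
    root = ET.fromstring(esearch_xml)
    count = int(root.findtext("Count"))
    ids = [int(e.text) for e in root.findall("./IdList/Id")]
    return count, ids


def page_offsets(count, retmax=100000):
    """retstart values needed to page through `count` hits."""
    return list(range(0, count, retmax))


# Illustrative response with made-up ids, not a real result.
SAMPLE = """<eSearchResult>
  <Count>3</Count><RetMax>3</RetMax><RetStart>0</RetStart>
  <IdList><Id>111</Id><Id>222</Id><Id>333</Id></IdList>
</eSearchResult>"""

if __name__ == "__main__":
    print(esearch_url())
    print(parse_ids(SAMPLE))
    print(page_offsets(458445))
```

With the 458,445 hits reported above and a page size of 100000, this works out to five requests.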
>>
>>> -----Original Message-----
>>> From: bioperl-l-bounces at lists.open-bio.org
>>> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of
>>> Smithies, Russell
>>> Sent: Sunday, May 17, 2009 11:53 PM
>>> To: 'BioPerl List'
>>> Subject: [Bioperl-l] Uniprot/Swiss accessions?
>>>
>>> Does anyone know of a way to get GI numbers for
>>> Uniprot/Swissprot accessions?
>>>
>>> Fasta from Uniprot's FTP site doesn't formatdb correctly
>>> (with the -o T option) as it's missing the gi number in the
>>> fasta header.
>>> NCBI won't let you use SwissProt ids in batch-entrez and I
>>> don't want to have to look up all 466,739 of them.
>>> I could use Bio::DB::Eutilities and query each id but even at
>>> 10 queries/second (the limit changed recently) it would take too
>>> long.
>>>
>>> Any ideas?
>>> Is there a swissprot2gi list somewhere?
>>>
>>> Thanx,
>>>
>>>
>>> Russell Smithies
>>>
>>> Bioinformatics Applications Developer
>>> T +64 3 489 9085
>>> E  russell.smithies at agresearch.co.nz
>>>
>>> Invermay  Research Centre
>>> Puddle Alley,
>>> Mosgiel,
>>> New Zealand
>>> T  +64 3 489 3809
>>> F  +64 3 489 9174
>>> www.agresearch.co.nz
>>>
>>>
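For the swissprot2gi list Russell asks about, Malcolm's step 2 (parsing the deflines of nr FASTA records) could look roughly like this in Python. The sketch assumes the classic NCBI defline layout "gi|<number>|sp|<accession>.<version>|<entry name>" and scans the whole line, so merged deflines are handled too. The function name and the example defline are illustrative. Pairs are returned as a list rather than a dictionary because, as Samuel notes, one accession may correspond to more than one gi.

```python
import re

# gi|<digits>|sp|<accession>[.<version>]|  (assumed legacy NCBI defline layout)
GI_SP = re.compile(r"gi\|(\d+)\|sp\|([A-Z0-9]+)(?:\.\d+)?\|")


def sp_gi_pairs(defline):
    """Return [(swissprot_accession, gi), ...] found in one FASTA defline."""
    return [(acc, int(gi)) for gi, acc in GI_SP.findall(defline)]


# Illustrative merged defline; the second entry is not from SwissProt.
EXAMPLE = (">gi|129295|sp|P01013.1|OVAX_CHICK Ovalbumin-related protein X "
           ">gi|999|ref|NP_000001.1| hypothetical entry")

if __name__ == "__main__":
    for acc, gi in sp_gi_pairs(EXAMPLE):
        print(acc, gi, sep="\t")
```

Running this over every defline in the fastacmd output and writing one tab-separated line per pair would give the lookup table.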


