[Bioperl-l] Fetching particular tags of genbank/genpept features via eutils

Jason Stajich jason.stajich at gmail.com
Wed Sep 25 20:15:55 UTC 2013


More likely you want another place to ask the question - biostars perhaps - http://www.biostars.org/

If you were only searching genomes, you could perhaps grab the .ptt files from the FTP site, though you would need to look up the GI ids to get the accession number for each NP_XX, which is a bit convoluted, and I'm not sure you can round-trip.

But the product descriptions for genomes, such as the bacterial ones, are in these .ptt files.
ftp.ncbi.nih.gov:/genomes/Bacteria/Yersinia_pestis_A1122_uid158119/NC_017169.ptt
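
As a rough illustration of pulling product descriptions out of such a file, here is a minimal Python sketch. It assumes the usual .ptt layout as I remember it: a couple of free-text header lines, then a tab-delimited table whose header row starts with "Location" and includes "PID" (the GI number) and "Product" columns. The function name parse_ptt and the sample rows are made up for the example, so check the column names against a real file before relying on it.

```python
import io


def parse_ptt(handle):
    """Return a {PID: product} dict from a .ptt protein table.

    Assumption: the table header row starts with "Location" and
    contains "PID" and "Product" columns; lines before it are skipped.
    """
    products = {}
    header = None
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if header is None:
            # Skip the free-text lines until the column header shows up.
            if fields[0] == "Location":
                header = fields
            continue
        if len(fields) != len(header):
            continue  # ignore blank or malformed rows
        record = dict(zip(header, fields))
        products[record["PID"]] = record["Product"]
    return products


# Made-up sample data in the assumed layout (not a real NCBI record).
SAMPLE = (
    "Hypothetical organism chromosome, complete genome - 1..5000\n"
    "2 proteins\n"
    "Location\tStrand\tLength\tPID\tGene\tSynonym\tCode\tCOG\tProduct\n"
    "100..400\t+\t99\t111111\tabcA\tHYP_0001\t-\t-\texample transporter\n"
    "500..900\t-\t132\t222222\t-\tHYP_0002\t-\t-\thypothetical protein\n"
)

products = parse_ptt(io.StringIO(SAMPLE))
print(products["111111"])
```

For a real file you would pass an open file handle instead of the StringIO object. Note that the keys are GI numbers, not NP_/YP_ accessions, so the GI-to-accession lookup mentioned above is still needed.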

Jason

On Sep 24, 2013, at 11:00 PM, Alexey Morozov <alexeymorozov1991 at gmail.com> wrote:

> Dear colleagues,
> I'm not sure if this list is the right place to ask, but I have a question
> regarding NCBI eutils. Say, I have lots of CDS's IDs like NP_769305.1 or
> YP_001638012.1. "Lots" as in "Millions of them" (a metagenomic project). Of
> course, I want to get something actually meaningful, so I decided to use
> "product" field of associated protein records. Can I spare fetching
> complete records, which will easily be gigabytes upon gigabytes of
> unnecessary data and get at least only "protein" feature of genpept file?
> 
> -- 
> Alexey Morozov,
> LIN SB RAS, bioinformatics group.
> Irkutsk, Russia.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org




