[Biopython] Finding what is the most recent Pubmed ID or list of all valid PMIDs

Peter biopython at maubp.freeserve.co.uk
Mon Jul 5 20:54:37 UTC 2010


On Mon, Jul 5, 2010 at 9:00 PM, Renato Alves <rjalves at igc.gulbenkian.pt> wrote:
> Greetings All,
>
> I'm trying to figure out a way to have a more or less up-to-date list of
> Pubmed IDs for validation purposes. This has to be done
> programmatically.
>
> My first attempt was to look for this on NCBI's FTP site. I could find
> ftp://ftp.ncbi.nih.gov/pubmed/deleted_pmids.txt but no information
> about the most recent PMID.
>
> I also tried to use EInfo but the count section under PubMed seems
> either outdated or completely unrelated to the total number of assigned
> PMIDs. Even when adding the total of deleted_pmids + the number from
> EInfo I couldn't get accurate information.
>
> So my question is, does anyone know how to get either a list of all the
> valid PMIDs or simply the most recent PMID?

To try and work out the latest PMID, I'd start by trying a PubMed search
by date, using a recent threshold.
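
As a rough sketch of that idea, something like the following might do as a
starting point. The helper names, the two-day window and the use of an
[edat] (Entrez date) range in the search term are my own assumptions, and
the result is only an estimate: it relies on PMIDs being handed out in
roughly increasing order, and on the window being small enough that all
hits fit within retmax.

```python
from datetime import date, timedelta


def recent_entrez_date_term(days_back, today=None):
    """Build a PubMed term for records with an Entrez date in the last N days."""
    today = today or date.today()
    start = today - timedelta(days=days_back)
    # PubMed date ranges are written as YYYY/MM/DD:YYYY/MM/DD[field].
    return "{:%Y/%m/%d}:{:%Y/%m/%d}[edat]".format(start, today)


def highest_pmid(id_list):
    """Return the numerically largest PMID in a list of ID strings, or None."""
    return max((int(pmid) for pmid in id_list), default=None)


def estimate_latest_pmid(email, days_back=2, retmax=10000):
    """Search PubMed for recently added records and return the largest PMID seen."""
    # Imported here so the two helpers above can be used without Biopython.
    from Bio import Entrez

    Entrez.email = email  # NCBI asks for a contact address
    handle = Entrez.esearch(
        db="pubmed",
        term=recent_entrez_date_term(days_back),
        retmax=retmax,
    )
    record = Entrez.read(handle)
    handle.close()
    return highest_pmid(record["IdList"])
```

Calling estimate_latest_pmid("you@example.org") (with your own address)
would then give a lower bound on the most recent PMID rather than a
guaranteed maximum.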

How many PMIDs are you trying to validate? Would it make
sense to use Entrez to do the validation (in batches)?
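
For the batch route, here is one possible sketch. It assumes that an
ESearch on the [uid] field only returns IDs that actually exist in PubMed,
which you would want to check against a few known good and bad PMIDs
first. The function names and the batch size of 100 are just illustrative
choices of mine.

```python
def batches(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def uid_term(pmids):
    """Build an ESearch term matching any of the given PMIDs by UID."""
    return " OR ".join("{}[uid]".format(pmid) for pmid in pmids)


def find_known_pmids(pmids, email, batch_size=100):
    """Return the set of PMIDs (as strings) that PubMed reports as hits."""
    # Imported here so the two helpers above can be used without Biopython.
    from Bio import Entrez

    Entrez.email = email  # NCBI asks for a contact address
    known = set()
    for batch in batches(list(pmids), batch_size):
        handle = Entrez.esearch(
            db="pubmed",
            term=uid_term(batch),
            retmax=len(batch),
        )
        record = Entrez.read(handle)
        handle.close()
        known.update(record["IdList"])
    return known
```

Anything in your input that is missing from the returned set, e.g.
set(map(str, candidates)) - find_known_pmids(candidates, "you@example.org"),
would then be a candidate for an invalid or deleted PMID.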

Peter


