[Biopython] Finding what is the most recent Pubmed ID or list of all valid PMIDs

Renato Alves rjalves at igc.gulbenkian.pt
Mon Jul 5 22:13:34 UTC 2010


From Peter on 07/05/2010 09:54 PM:
> On Mon, Jul 5, 2010 at 9:00 PM, Renato Alves <rjalves at igc.gulbenkian.pt> wrote:
>> Greetings All,
>>
>> I'm trying to figure out a way to have a more or less up-to-date list of
>> PubMed IDs for validation purposes. This has to be done
>> programmatically.
>>
>> My first attempt was to look for this in NCBI's FTP. I could find
>> ftp://ftp.ncbi.nih.gov/pubmed/deleted_pmids.txt but no information
>> about the most recent PMID.
>>
>> I also tried to use EInfo but the count section under PubMed seems
>> either outdated or completely unrelated to the total number of assigned
>> PMIDs. Even when adding the total of deleted_pmids + the number from
>> EInfo I couldn't get accurate information.
>>
>> So my question is, does anyone know how to get either a list of all the
>> valid PMIDs or simply the most recent PMID?
> 
> To try and work out the latest PMID, I'd start by trying a PubMed search
> by date, using a recent threshold.
> 
> What number of PMIDs are you trying to validate? Would it make
> sense to use Entrez to do the validation (in batches)?
> 
> Peter
I've thought of using Entrez but I was trying to avoid it by using
information available locally. I have no idea how many or which PMIDs will
be requested.
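
As a rough illustration of combining the two ideas in this thread, the sketch
below pre-filters PMIDs against locally available data and then confirms the
survivors with Entrez in batches. It rests on several assumptions not stated in
the thread: that deleted_pmids.txt holds one numeric PMID per line, that an
upper bound (here called latest_pmid) has been obtained some other way, and
that OR-ing "[pmid]" terms in an ESearch query is an acceptable way to check a
batch. All function names are hypothetical, and the local check is only a
necessary condition, not proof that a PMID is valid.

```python
def load_deleted_pmids(lines):
    """Parse an iterable of text lines into a set of integer PMIDs.

    Assumes one numeric PMID per line; anything else is skipped.
    """
    deleted = set()
    for line in lines:
        token = line.strip()
        if token.isdigit():
            deleted.add(int(token))
    return deleted


def passes_local_check(pmid, deleted, latest_pmid):
    """Cheap local filter: in range and not known to be deleted."""
    return 0 < pmid <= latest_pmid and pmid not in deleted


def batches(items, size=100):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def confirm_with_entrez(pmids, email, batch_size=100):
    """Return the subset of `pmids` that an ESearch query finds in PubMed.

    Needs Biopython and network access; the query form is an assumption.
    """
    from Bio import Entrez  # imported here so the helpers above work offline

    Entrez.email = email
    confirmed = set()
    for chunk in batches(sorted(pmids), batch_size):
        term = " OR ".join("%d[pmid]" % p for p in chunk)
        handle = Entrez.esearch(db="pubmed", term=term, retmax=len(chunk))
        record = Entrez.read(handle)
        handle.close()
        confirmed.update(int(p) for p in record["IdList"])
    return confirmed
```

With a local copy of the file, load_deleted_pmids(open("deleted_pmids.txt"))
would build the set once, and only PMIDs that pass passes_local_check would
need to go through confirm_with_entrez.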

But indeed, by searching by date I can get a rough idea of what the
most recent PMID might be.
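
A minimal sketch of such a date-based search with Bio.Entrez follows. The
function names, the "all[sb]" search term, the seven-day window and the retmax
value are illustrative assumptions; reldate and datetype are standard ESearch
parameters. The result is only an estimate, since the highest PMID among the
returned hits need not be the highest PMID assigned so far.

```python
def recent_pubmed_query(days_back=7, retmax=1000):
    """Build ESearch keyword arguments limited to recently added records."""
    return {
        "db": "pubmed",
        "term": "all[sb]",    # assumed catch-all subset term
        "reldate": days_back,  # only records from the last N days
        "datetype": "edat",    # filter on the Entrez date
        "retmax": retmax,
    }


def highest_pmid(id_list):
    """Return the largest PMID in a list of ID strings, or None if empty."""
    return max((int(pmid) for pmid in id_list), default=None)


def estimate_latest_pmid(email, days_back=7):
    """Query PubMed and return the highest PMID among recent hits.

    Needs Biopython and network access.
    """
    from Bio import Entrez  # imported here so the helpers above work offline

    Entrez.email = email
    handle = Entrez.esearch(**recent_pubmed_query(days_back))
    record = Entrez.read(handle)
    handle.close()
    return highest_pmid(record["IdList"])
```

Calling estimate_latest_pmid("you@example.org") with a real contact address
would return an integer that can serve as an approximate upper bound when
checking incoming PMIDs.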

Thanks Peter
