From cjfields at illinois.edu Thu Mar 18 18:02:32 2010 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 18 Mar 2010 17:02:32 -0500 Subject: [Open-bio-l] Fwd: [Bioperl-l] Google Summer of Code is *ON* for OBF projects! References: <4BA29706.8040606@cornell.edu> Message-ID: <980817EA-2C31-4D1C-93B0-2A836024E290@illinois.edu> (forwarding to the Open-Bio list, as the original post is still clearing the OBF mail filters) Hi all, Great news: Google announced today that the Open Bioinformatics Foundation has been accepted as a mentoring organization for this summer's Google Summer of Code! GSoC is a Google-sponsored student internship program for open-source projects, open to students from around the world (not just US residents). Students are paid a $5000 USD stipend to work as a developer on an open-source project for the summer. For more on GSoC, see GSoC 2010 FAQ at http://tinyurl.com/yzemdfo Student applications are due April 9, 2010 at 19:00 UTC. Students who are interested in participating should look at the OBF's GSoC page at http://open-bio.org/wiki/Google_Summer_of_Code, which lists project ideas, and who to contact about applying. For current developers on OBF projects, please consider volunteering to be a mentor if you have not already, and contribute project ideas. Just list your name and project ideas on OBF wiki and on the relevant project's GSoC wiki page. Thanks to all who helped make OBF's application to GSoC a success, and let's have a great, productive summer of code! Rob Buels OBF GSoC 2010 Administrator From rmb32 at cornell.edu Thu Mar 18 17:11:34 2010 From: rmb32 at cornell.edu (Robert Buels) Date: Thu, 18 Mar 2010 14:11:34 -0700 Subject: [Open-bio-l] Google Summer of Code is *ON* for OBF projects! Message-ID: <4BA29706.8040606@cornell.edu> Hi all, Great news: Google announced today that the Open Bioinformatics Foundation has been accepted as a mentoring organization for this summer's Google Summer of Code! 
GSoC is a Google-sponsored student internship program for open-source projects, open to students from around the world (not just US residents). Students are paid a $5000 USD stipend to work as a developer on an open-source project for the summer. For more on GSoC, see GSoC 2010 FAQ at http://tinyurl.com/yzemdfo Student applications are due April 9, 2010 at 19:00 UTC. Students who are interested in participating should look at the OBF's GSoC page at http://open-bio.org/wiki/Google_Summer_of_Code, which lists project ideas, and who to contact about applying. For current developers on OBF projects, please consider volunteering to be a mentor if you have not already, and contribute project ideas. Just list your name and project ideas on OBF wiki and on the relevant project's GSoC wiki page. Thanks to all who helped make OBF's application to GSoC a success, and let's have a great, productive summer of code! Rob Buels OBF GSoC 2010 Administrator From peter at maubp.freeserve.co.uk Wed Mar 24 10:08:26 2010 From: peter at maubp.freeserve.co.uk (Peter) Date: Wed, 24 Mar 2010 14:08:26 +0000 Subject: [Open-bio-l] Fwd: [Utilities-announce] NCBI Revised E-utility Usage Policy In-Reply-To: References: Message-ID: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com> Hi, This is probably of interest to all the Bio* projects offering access to the NCBI Entrez utilities. See forwarded message below. I *think* the new guidelines basically say that the email & tool parameters are optional BUT if your IP address ever gets banned for excessive use you then have to register an email & tool combination. Regarding the email address, the NCBI say to use the email of the developer (not the end user). However, they do not distinguish between the developers of a library (like us), and the developers of an application or script using a library (who may also be the end user). 
Currently we (Biopython) and I think BioPerl ask developers using our libraries to populate the email address themselves. I *think* this is still the right action. Peter ---------- Forwarded message ---------- From: Date: Wed, Mar 24, 2010 at 1:53 PM Subject: [Utilities-announce] NCBI Revised E-utility Usage Policy To: NLM/NCBI List utilities-announce New E-utility documentation now on the NCBI Bookshelf The Entrez Programming Utilities (E-Utilities) Help documentation has been added to the NCBI Bookshelf, and so is now fully integrated with the Entrez search and retrieval system as a part of the Bookshelf database. This help document has been divided into chapters for better organization and includes several new sample Perl scripts. At present this book covers the standard URL interface for the E-utilities; material about the SOAP interface will be added soon and is still available at the same URL: http://eutils.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html. Revised E-utility usage policy In December, 2009 NCBI announced a change to the usage policy for the E-utilities that would require all requests to contain non-null values for both the &email and &tool parameters. After several consultations with our users and developers, we have decided to revise this policy change, and the revised policy is described in detail at the following link: http://www.ncbi.nlm.nih.gov/bookshelf/br.fcgi?book=helpeutils&part=chapter2#chapter2.Usage_Guidelines_and_Requiremen Please let us know if you have any questions or concerns about this policy change. Thank you, The E-Utilities Team NIH/NLM/NCBI eutilities at ncbi.nlm.nih.gov.
_______________________________________________ Utilities-announce mailing list http://www.ncbi.nlm.nih.gov/mailman/listinfo/utilities-announce From cjfields at illinois.edu Wed Mar 24 10:37:13 2010 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 24 Mar 2010 09:37:13 -0500 Subject: [Open-bio-l] [Bioperl-l] Fwd: [Utilities-announce] NCBI Revised E-utility Usage Policy In-Reply-To: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com> References: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com> Message-ID: <38D43B03-4A85-48CB-913A-CD564EB5168C@illinois.edu> On Mar 24, 2010, at 9:08 AM, Peter wrote: > Hi, > > This is probably of interest to all the Bio* projects offering access > to the NCBI > Entrez utilities. See forwarded message below. > > I *think* the new guidelines basically say that the email & tool parameters are > optional BUT if your IP address ever gets banned for excessive use you then > have to register an email & tool combination. > > Regarding the email address, the NCBI say to use the email of the developer > (not the end user). However, they do not distinguish between the developers > of a library (like us), and the developers of an application or script using a > library (who may also be the end user). > > Currently we (Biopython) and I think BioPerl ask developers using our libraries > to populate the email address themselves. I *think* this is still the > right action. > > Peter Basically, that's the same tactic I'm taking with Bio::DB::EUtilities (and I think with the SOAP-based ones as well). We're providing a specific set of tools for users to write their own end applications. I can try contacting them regarding this to get an official response to clarify this somewhat.
Re: the tool parameter, we currently set the tool itself to 'BioPerl' as a default, but always leave the email blank and issue a warning if it isn't set. We could just as easily leave both blank and issue warnings for both. chris > ---------- Forwarded message ---------- > From: > Date: Wed, Mar 24, 2010 at 1:53 PM > Subject: [Utilities-announce] NCBI Revised E-utility Usage Policy > To: NLM/NCBI List utilities-announce > > > New E-utility documentation now on the NCBI Bookshelf > > The Entrez Programming Utilities (E-Utilities) Help documentation has > been added to the NCBI Bookshelf, and so is now fully integrated with > the Entrez search and retrieval system as a part of the Bookshelf > database. This help document has been divided into chapters for better > organization and includes several new sample Perl scripts. At present > this book covers the standard URL interface for the E-utilties; > material about the SOAP interface will be added soon and is still > available at the same URL: > http://eutils.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html. > > > > Revised E-utility usage policy > > In December, 2009 NCBI announced a change to the usage policy for the > E-utilities that would require all requests to contain non-null values > for both the &email and &tool parameters. After several consultations > with our users and developers, we have decided to revise this policy > change, and the revised policy is described in detail at the following > link: > > http://www.ncbi.nlm.nih.gov/bookshelf/br.fcgi?book=helpeutils&part=chapter2#chapter2.Usage_Guidelines_and_Requiremen > > Please let us know if you have any questions or concerns about this > policy change. > > > > Thank you, > > The E-Utilities Team > > NIH/NLM/NCBI > > eutilities at ncbi.nlm.nih.gov. 
> > > > _______________________________________________ > Utilities-announce mailing list > http://www.ncbi.nlm.nih.gov/mailman/listinfo/utilities-announce > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From biopython at maubp.freeserve.co.uk Wed Mar 24 10:51:46 2010 From: biopython at maubp.freeserve.co.uk (Peter) Date: Wed, 24 Mar 2010 14:51:46 +0000 Subject: [Open-bio-l] [Bioperl-l] Fwd: [Utilities-announce] NCBI Revised E-utility Usage Policy In-Reply-To: <38D43B03-4A85-48CB-913A-CD564EB5168C@illinois.edu> References: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com> <38D43B03-4A85-48CB-913A-CD564EB5168C@illinois.edu> Message-ID: <320fb6e01003240751v2afd5d5bwa39590afa9b13209@mail.gmail.com> On Wed, Mar 24, 2010 at 2:37 PM, Chris Fields wrote: > > On Mar 24, 2010, at 9:08 AM, Peter wrote: > >> Hi, >> >> This is probably of interest to all the Bio* projects offering access >> to the NCBI Entrez utilities. See forwarded message below. >> >> I *think* the new guidelines basically say that the email & tool parameters are >> optional BUT if your IP address ever gets banned for excessive use you then >> have to register an email & tool combination. >> >> Regarding the email address, the NCBI say to use the email of the developer >> (not the end user). However, they do not distinguish between the developers >> of a library (like us), and the developers of an application or script using a >> library (who may also be the end user). >> >> Currently we (Biopython) and I think BioPerl ask developers using our libraries >> to populate the email address themselves. I *think* this is still the >> right action. >> >> Peter > > > Basically, that's the same tactic I'm going with with Bio::DB::EUtilities (and I > think with the SOAP-based ones as well). 
We're providing a specific set of > tools for user to write up their own applications end applications. I can try > contacting them regarding this to get an official response to clarify this > somewhat. Please give the NCBI an email - you can CC me too if you like. > Re: the tool parameter, we currently set the tool itself to 'BioPerl' as a > default, but always leave the email blank and issue a warning if it isn't > set. We could just as easily leave both blank and issue warnings for both. We currently leave out the email and set the tool parameter to "Biopython" by default but this can be overridden. Currently leaving out the email does cause Biopython to give a warning. Peter From maj at fortinbras.us Wed Mar 24 10:48:55 2010 From: maj at fortinbras.us (Mark A. Jensen) Date: Wed, 24 Mar 2010 10:48:55 -0400 Subject: [Open-bio-l] Fwd: [Utilities-announce] NCBI Revised E-utility UsagePolicy In-Reply-To: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com> References: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com> Message-ID: <9834AA2D2FCC4F918C538A7B7C152B1D@NewLife> Hey Peter-- thanks for this heads-up-- cheers MAJ ----- Original Message ----- From: "Peter" To: Cc: "bioperl-l list" ; "Biopython-Dev Mailing List" ; ; Sent: Wednesday, March 24, 2010 10:08 AM Subject: [Open-bio-l] Fwd: [Utilities-announce] NCBI Revised E-utility UsagePolicy Hi, This is probably of interest to all the Bio* projects offering access to the NCBI Entrez utilities. See forwarded message below. I *think* the new guidelines basically say that the email & tool parameters are optional BUT if your IP address ever gets banned for excessive use you then have to register an email & tool combination. Regarding the email address, the NCBI say to use the email of the developer (not the end user). However, they do not distinguish between the developers of a library (like us), and the developers of an application or script using a library (who may also be the end user).
Currently we (Biopython) and I think BioPerl ask developers using our libraries to populate the email address themselves. I *think* this is still the right action. Peter ---------- Forwarded message ---------- From: Date: Wed, Mar 24, 2010 at 1:53 PM Subject: [Utilities-announce] NCBI Revised E-utility Usage Policy To: NLM/NCBI List utilities-announce New E-utility documentation now on the NCBI Bookshelf The Entrez Programming Utilities (E-Utilities) Help documentation has been added to the NCBI Bookshelf, and so is now fully integrated with the Entrez search and retrieval system as a part of the Bookshelf database. This help document has been divided into chapters for better organization and includes several new sample Perl scripts. At present this book covers the standard URL interface for the E-utilties; material about the SOAP interface will be added soon and is still available at the same URL: http://eutils.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html. Revised E-utility usage policy In December, 2009 NCBI announced a change to the usage policy for the E-utilities that would require all requests to contain non-null values for both the &email and &tool parameters. After several consultations with our users and developers, we have decided to revise this policy change, and the revised policy is described in detail at the following link: http://www.ncbi.nlm.nih.gov/bookshelf/br.fcgi?book=helpeutils&part=chapter2#chapter2.Usage_Guidelines_and_Requiremen Please let us know if you have any questions or concerns about this policy change. Thank you, The E-Utilities Team NIH/NLM/NCBI eutilities at ncbi.nlm.nih.gov. 
_______________________________________________ Utilities-announce mailing list http://www.ncbi.nlm.nih.gov/mailman/listinfo/utilities-announce -------------------------------------------------------------------------------- > _______________________________________________ > Open-Bio-l mailing list > Open-Bio-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/open-bio-l > From hlapp at drycafe.net Wed Mar 24 11:27:37 2010 From: hlapp at drycafe.net (Hilmar Lapp) Date: Wed, 24 Mar 2010 11:27:37 -0400 Subject: [Open-bio-l] [Bioperl-l] Fwd: [Utilities-announce] NCBI Revised E-utility Usage Policy In-Reply-To: <320fb6e01003240751v2afd5d5bwa39590afa9b13209@mail.gmail.com> References: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com> <38D43B03-4A85-48CB-913A-CD564EB5168C@illinois.edu> <320fb6e01003240751v2afd5d5bwa39590afa9b13209@mail.gmail.com> Message-ID: <5D427F97-706E-4F66-95BA-2B397520C4FA@drycafe.net> On Mar 24, 2010, at 10:51 AM, Peter wrote: > Please give the NCBI an email - you can CC me too if you like. Can't this be the developers' mailing list (or lists, the appropriate one for each toolkit)? We can even whitelist all NCBI sender addresses so they can easily email us if there are issues. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : =========================================================== From cjfields at illinois.edu Wed Mar 24 11:31:34 2010 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 24 Mar 2010 10:31:34 -0500 Subject: [Open-bio-l] Toolkits and the new eutils policies Message-ID: To whom it may concern, Just want to get some clarification from the eutils folks regarding the changes. I'm a core developer for the BioPerl toolkit and am writing in collaboration with several other Bio* toolkits (Biopython, BioRuby, BioJava). We all have interfaces to eutils, either via the standard interface, SOAP-based, or both.
We're seeking a clarification regarding the new rules, specifically the rules concerning 'tool' and 'email'. Per the rules proposed in Dec. 2009, many of us have already implemented changes to address them. As we're basically language-specific toolkits (classes, modules, etc) that are used for developing other applications, we have taken the stance that the end-user will need to start providing at least the email in order to properly use the eutils-specific tools, with a warning issued otherwise. This is based on several reasons, foremost being the toolkits are very widely used (so may be spread over potentially thousands of IPs) and are being used in a large number of downstream applications. As an example of the latter, BioJava is used in several free and commercial applications, such as Taverna and Geneious. Currently, with BioPerl and Biopython, we set the 'tool' parameter specifically to the toolkit name by default. This parameter can be overridden. However, from reading the newest rules it appears that each tool (and thus each toolkit) should have one common email, no more. This also appears to make the assertion that users using these toolkits (with 'tool' set and using the relevant emails) may essentially be tied down if one IP decides to abuse the rules. This unfortunately makes the incorrect generalization that each tool is created by one developer, and simply does not make sense in large collaborative projects such as ours, where the software is used primarily in downstream applications and scripts. Setting the tool could be beneficial to the development team, but at the same time it could be a tremendous hindrance. So, what exactly should we do? Do we set the 'tool' by default, or leave it to the user? Similarly, how do we treat 'email'? If we do set them, would the entire group of users for that tool be blocked if one end-user abuses the system? Should we leave it up to the user to register themselves (both tool and email)? 
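For readers unfamiliar with the behaviour described above, here is a minimal sketch in Python of the "default tool, caller-supplied email, warn if missing" approach. It is an illustration only, not the actual BioPerl or Biopython code: the function name `build_eutils_url`, the `strict` switch, the placeholder toolkit name `MyToolkit` and the choice of the `esearch.fcgi` endpoint are all assumptions made for this example.

```python
import warnings
from urllib.parse import urlencode

# E-utilities search endpoint (assumed here purely for illustration).
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Hypothetical toolkit name used as the default 'tool' value.
DEFAULT_TOOL = "MyToolkit"


def build_eutils_url(params, email=None, tool=DEFAULT_TOOL, strict=False):
    """Build a request URL, attaching the 'tool' and 'email' parameters.

    'tool' defaults to the toolkit name but can be overridden by the caller.
    'email' is never filled in by the library: if the caller leaves it out,
    a warning is issued (or an error is raised when strict=True).
    """
    query = dict(params)  # copy so the caller's dict is not modified
    if tool:
        query["tool"] = tool
    if email:
        query["email"] = email
    else:
        message = (
            "No email address supplied for E-utilities requests; "
            "please set one so NCBI can contact you about problems."
        )
        if strict:
            raise ValueError(message)
        warnings.warn(message, UserWarning)
    return EUTILS_BASE + "?" + urlencode(query)
```

Calling `build_eutils_url({"db": "pubmed", "term": "biopython"})` emits the warning and still returns a URL that carries `tool=MyToolkit`. Passing `strict=True` turns the missing email into an error instead, which corresponds to the stricter option raised later in this thread.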
So, we're at an impasse and really need your help. Sincerely, chris Christopher Fields Core Developer, BioPerl Project IGB Postdoctoral Fellow Genomics of Neural & Behavioral Plasticity University of Illinois Urbana-Champaign Institute for Genomic Biology 1206 W. Gregory Dr., MC-195 Urbana, IL 61801 From biopython at maubp.freeserve.co.uk Wed Mar 24 11:41:59 2010 From: biopython at maubp.freeserve.co.uk (Peter) Date: Wed, 24 Mar 2010 15:41:59 +0000 Subject: [Open-bio-l] [Bioperl-l] Fwd: [Utilities-announce] NCBI Revised E-utility Usage Policy In-Reply-To: <5D427F97-706E-4F66-95BA-2B397520C4FA@drycafe.net> References: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com> <38D43B03-4A85-48CB-913A-CD564EB5168C@illinois.edu> <320fb6e01003240751v2afd5d5bwa39590afa9b13209@mail.gmail.com> <5D427F97-706E-4F66-95BA-2B397520C4FA@drycafe.net> Message-ID: <320fb6e01003240841s5d127ff7v72e425ee00aec34c@mail.gmail.com> On Wed, Mar 24, 2010 at 3:27 PM, Hilmar Lapp wrote: > > > On Mar 24, 2010, at 10:51 AM, Peter wrote: > >> Please give the NCBI an email - you can CC me too if you like. > > Can't this be the developers' mailing list (or lists, the appropriate one > for each toolkit)? We can even whitelist all NCBI sender addresses so they > can easily email us if there are issues. > I'd wondered about signing up the dev mailing lists to the NCBI E-utility mailing list but I'm not sure how the passwords would work (the monthly reminder would also get sent to the list): http://www.ncbi.nlm.nih.gov/mailman/listinfo/utilities-announce But for a discussion of the guidelines, then yes, why not CC the open-bio-l at lists.open-bio.org list and we can manually moderate and/or whitelist the NCBI.
Peter From cjfields at illinois.edu Wed Mar 24 11:44:21 2010 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 24 Mar 2010 10:44:21 -0500 Subject: [Open-bio-l] [Bioperl-l] Fwd: [Utilities-announce] NCBI Revised E-utility Usage Policy In-Reply-To: <320fb6e01003240751v2afd5d5bwa39590afa9b13209@mail.gmail.com> References: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com> <38D43B03-4A85-48CB-913A-CD564EB5168C@illinois.edu> <320fb6e01003240751v2afd5d5bwa39590afa9b13209@mail.gmail.com> Message-ID: <338BDDD8-2A66-4086-BFB7-35EC8F8F0D66@illinois.edu> On Mar 24, 2010, at 9:51 AM, Peter wrote: > On Wed, Mar 24, 2010 at 2:37 PM, Chris Fields wrote: >> >> On Mar 24, 2010, at 9:08 AM, Peter wrote: >> >>> Hi, >>> >>> This is probably of interest to all the Bio* projects offering access >>> to the NCBI Entrez utilities. See forwarded message below. >>> >>> I *think* the new guidelines basically say that the email & tool parameters are >>> optional BUT if your IP address ever gets banned for excessive use you then >>> have to register an email & tool combination. >>> >>> Regarding the email address, the NCBI say to use the email of the developer >>> (not the end user). However, they do not distinguish between the developers >>> of a library (like us), and the developers of an application or script using a >>> library (who may also be the end user). >>> >>> Currently we (Biopython) and I think BioPerl ask developers using our libraries >>> to populate the email address themselves. I *think* this is still the >>> right action. >>> >>> Peter >> >> >> Basically, that's the same tactic I'm going with with Bio::DB::EUtilities (and I >> think with the SOAP-based ones as well). We're providing a specific set of >> tools for user to write up their own applications end applications. I can try >> contacting them regarding this to get an official response to clarify this >> somewhat. > > Please give the NCBI an email - you can CC me too if you like. 
Sent, have cc'd the open-bio list. Don't want to cross-post this too much, so I think we should move the discussion there. >> Re: the tool parameter, we currently set the tool itself to 'BioPerl' as a >> default, but always leave the email blank and issue a warning if it isn't >> set. We could just as easily leave both blank and issue warnings for both. > > We currently leave out the email and set the tool parameter to "Biopython" > by default but this can be overridden. Currently leaving out the email does > cause Biopython to give a warning. > > Peter We follow the same, then (down to the warning). This is mentioned in my post to them, I'll wait to see what they say. My concern is the wording of the new rules. Each tool and email must be registered with them if an IP is blocked. Does this mean each tool is assigned one specific email? And an IP that is blocked can register it to be allowed back into the fold? With that in mind, should we register each of our toolkits with them? Probably not a bad thing (it might help us as devs to get an idea of use), but then if one user abuses the rules will their actions affect all toolkit users? Is this all done on a per-IP basis, per-toolkit basis, etc? Unfortunately, at least to me, none of this is made very clear, so I'm hoping there is some clarification from their end. chris From maj at fortinbras.us Wed Mar 24 12:37:56 2010 From: maj at fortinbras.us (Mark A. 
Jensen) Date: Wed, 24 Mar 2010 12:37:56 -0400 Subject: [Open-bio-l] [Bioperl-l] Fwd: [Utilities-announce] NCBI RevisedE-utility Usage Policy In-Reply-To: <5D427F97-706E-4F66-95BA-2B397520C4FA@drycafe.net> References: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com><38D43B03-4A85-48CB-913A-CD564EB5168C@illinois.edu><320fb6e01003240751v2afd5d5bwa39590afa9b13209@mail.gmail.com> <5D427F97-706E-4F66-95BA-2B397520C4FA@drycafe.net> Message-ID: I think this is a great idea--- MAJ ----- Original Message ----- From: "Hilmar Lapp" To: "Peter" Cc: ; "Biopython-Dev Mailing List" ; ; "bioperl-l list" ; "Chris Fields" ; Sent: Wednesday, March 24, 2010 11:27 AM Subject: Re: [Bioperl-l] [Open-bio-l] Fwd: [Utilities-announce] NCBI RevisedE-utility Usage Policy > > On Mar 24, 2010, at 10:51 AM, Peter wrote: > >> Please give the NCBI an email - you can CC me too if you like. > > > Can't this be the developers' mailing list (or lists, the appropriate one for > each toolkit)? We can even whitelist all NCBI sender addresses so they can > easily email us if there are issues. 
> > -hilmar > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : > =========================================================== > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From andy.jenkinson at ebi.ac.uk Wed Mar 24 12:24:59 2010 From: andy.jenkinson at ebi.ac.uk (Andy Jenkinson) Date: Wed, 24 Mar 2010 16:24:59 +0000 Subject: [Open-bio-l] Fwd: [Utilities-announce] NCBI Revised E-utility Usage Policy In-Reply-To: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com> References: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com> Message-ID: <54C09681-5D8A-461F-8A8D-090E50DAE340@ebi.ac.uk> From my experience, if you set a default value for something and there is very little advantage to changing it, people will rarely bother to do so. The library developer's email address is not very useful for NCBI, who I assume wish to use it to contact whoever is consuming their resources. Being able to contact the Bio* developer doesn't really allow them to do this. The Bio* mailing list would be an option because there is at least some chance the app developer will get the email, but on balance I think it'd be better to incentivise people to change it themselves. So I would say: leave it blank and give a warning. Cheers, Andy On 24 Mar 2010, at 14:08, Peter wrote: > Hi, > > This is probably of interest to all the Bio* projects offering access > to the NCBI > Entrez utilities. See forwarded message below. > > I *think* the new guidelines basically say that the email & tool parameters are > optional BUT if your IP address ever gets banned for excessive use you then > have to register an email & tool combination. > > Regarding the email address, the NCBI say to use the email of the developer > (not the end user).
However, they do not distinguish between the developers > of a library (like us), and the developers of an application or script using a > library (who may also be the end user). > > Currently we (Biopython) and I think BioPerl ask developers using our libraries > to populate the email address themselves. I *think* this is still the > right action. > > Peter > > ---------- Forwarded message ---------- > From: > Date: Wed, Mar 24, 2010 at 1:53 PM > Subject: [Utilities-announce] NCBI Revised E-utility Usage Policy > To: NLM/NCBI List utilities-announce > > > New E-utility documentation now on the NCBI Bookshelf > > The Entrez Programming Utilities (E-Utilities) Help documentation has > been added to the NCBI Bookshelf, and so is now fully integrated with > the Entrez search and retrieval system as a part of the Bookshelf > database. This help document has been divided into chapters for better > organization and includes several new sample Perl scripts. At present > this book covers the standard URL interface for the E-utilties; > material about the SOAP interface will be added soon and is still > available at the same URL: > http://eutils.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html. > > > > Revised E-utility usage policy > > In December, 2009 NCBI announced a change to the usage policy for the > E-utilities that would require all requests to contain non-null values > for both the &email and &tool parameters. After several consultations > with our users and developers, we have decided to revise this policy > change, and the revised policy is described in detail at the following > link: > > http://www.ncbi.nlm.nih.gov/bookshelf/br.fcgi?book=helpeutils&part=chapter2#chapter2.Usage_Guidelines_and_Requiremen > > Please let us know if you have any questions or concerns about this > policy change. > > > > Thank you, > > The E-Utilities Team > > NIH/NLM/NCBI > > eutilities at ncbi.nlm.nih.gov. 
> > > > _______________________________________________ > Utilities-announce mailing list > http://www.ncbi.nlm.nih.gov/mailman/listinfo/utilities-announce > _______________________________________________ > Open-Bio-l mailing list > Open-Bio-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/open-bio-l From cjfields at illinois.edu Wed Mar 24 14:21:51 2010 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 24 Mar 2010 13:21:51 -0500 Subject: [Open-bio-l] Fwd: [Utilities-announce] NCBI Revised E-utility Usage Policy In-Reply-To: <54C09681-5D8A-461F-8A8D-090E50DAE340@ebi.ac.uk> References: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com> <54C09681-5D8A-461F-8A8D-090E50DAE340@ebi.ac.uk> Message-ID: Neither the eutils notification nor the new eutils docs make this very clear. For instance, from reading the documentation, one would only have to register the tool and email once an IP is blocked. However, later on it is indicated that the values supplied must be registered with NCBI or they will be blocked, which (to me at least) reads as if they must be registered regardless. Which is it? Also, there is the bit about the tool and email belonging to the software developer or organization, not the end-user, likely for the reasons Hilmar mentions. Does this mean each tool has one assigned email? This would then mean we need to either set both and register them just in case, or leave both empty and warn the user. We have a bit of time to work out the specifics, just hoping NCBI responds (one never knows with them). chris On Mar 24, 2010, at 11:24 AM, Andy Jenkinson wrote: >> From my experience, if you set a default value for something and there is very little advantage to changing it, people will rarely bother to do so. > > The library developer's email address is not very useful for NCBI, who I assume wish to use it to contact whoever is consuming their resources. 
Being able to contact the Bio* developer doesn't really allow them to do this. The Bio* mailing list would be an option because there is at least some chance the app developer will get the email, but on balance I think it'd be better to incentivise people to change it themselves. > > So I would say: leave it blank and give a warning. > > Cheers, > Andy > > On 24 Mar 2010, at 14:08, Peter wrote: > >> Hi, >> >> This is probably of interest to all the Bio* projects offering access >> to the NCBI >> Entrez utilities. See forwarded message below. >> >> I *think* the new guidelines basically say that the email & tool parameters are >> optional BUT if your IP address ever gets banned for excessive use you then >> have to register an email & tool combination. >> >> Regarding the email address, the NCBI say to use the email of the developer >> (not the end user). However, they do not distinguish between the developers >> of a library (like us), and the developers of an application or script using a >> library (who may also be the end user). >> >> Currently we (Biopython) and I think BioPerl ask developers using our libraries >> to populate the email address themselves. I *think* this is still the >> right action. >> >> Peter >> >> ---------- Forwarded message ---------- >> From: >> Date: Wed, Mar 24, 2010 at 1:53 PM >> Subject: [Utilities-announce] NCBI Revised E-utility Usage Policy >> To: NLM/NCBI List utilities-announce >> >> >> New E-utility documentation now on the NCBI Bookshelf >> >> The Entrez Programming Utilities (E-Utilities) Help documentation has >> been added to the NCBI Bookshelf, and so is now fully integrated with >> the Entrez search and retrieval system as a part of the Bookshelf >> database. This help document has been divided into chapters for better >> organization and includes several new sample Perl scripts. 
At present >> this book covers the standard URL interface for the E-utilties; >> material about the SOAP interface will be added soon and is still >> available at the same URL: >> http://eutils.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html. >> >> >> >> Revised E-utility usage policy >> >> In December, 2009 NCBI announced a change to the usage policy for the >> E-utilities that would require all requests to contain non-null values >> for both the &email and &tool parameters. After several consultations >> with our users and developers, we have decided to revise this policy >> change, and the revised policy is described in detail at the following >> link: >> >> http://www.ncbi.nlm.nih.gov/bookshelf/br.fcgi?book=helpeutils&part=chapter2#chapter2.Usage_Guidelines_and_Requiremen >> >> Please let us know if you have any questions or concerns about this >> policy change. >> >> >> >> Thank you, >> >> The E-Utilities Team >> >> NIH/NLM/NCBI >> >> eutilities at ncbi.nlm.nih.gov. >> >> >> >> _______________________________________________ >> Utilities-announce mailing list >> http://www.ncbi.nlm.nih.gov/mailman/listinfo/utilities-announce >> _______________________________________________ >> Open-Bio-l mailing list >> Open-Bio-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/open-bio-l > > > _______________________________________________ > Open-Bio-l mailing list > Open-Bio-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/open-bio-l From birney at ebi.ac.uk Wed Mar 24 15:08:25 2010 From: birney at ebi.ac.uk (Ewan Birney) Date: Wed, 24 Mar 2010 19:08:25 +0000 Subject: [Open-bio-l] Fwd: [Utilities-announce] NCBI Revised E-utility Usage Policy In-Reply-To: References: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com> <54C09681-5D8A-461F-8A8D-090E50DAE340@ebi.ac.uk> Message-ID: <4168F915-DFF8-4C97-B0E9-7058A777E102@ebi.ac.uk> Sorry to perk up here, but I think the right thing is to throw an exception if it's called 
without a "tool" and "email" parameter. Of course, then a client programmer can abuse this, but they are forced to put something in there. On 24 Mar 2010, at 18:21, Chris Fields wrote: > Neither the eutils notification nor the new eutils docs make this > very clear. For instance, from reading the documentation, one would > only have to register the tool and email once an IP is blocked. > However, later on it is indicated that the values supplied must be > registered with NCBI or they will be blocked, which (to me at least) > reads as if they must be registered regardless. Which is it? > > Also, there is the bit about the tool and email belonging to the > software developer or organization, not the end-user, likely for the > reasons Hilmar mentions. Does this mean each tool has one assigned > email? This would then mean we need to either set both and register > them just in case, or leave both empty and warn the user. > > We have a bit of time to work out the specifics, just hoping NCBI > responds (one never knows with them). > > chris > > On Mar 24, 2010, at 11:24 AM, Andy Jenkinson wrote: > >>> From my experience, if you set a default value for something and >>> there is very little advantage to changing it, people will rarely >>> bother to do so. >> >> The library developer's email address is not very useful for NCBI, >> who I assume wish to use it to contact whoever is consuming their >> resources. Being able to contact the Bio* developer doesn't really >> allow them to do this. The Bio* mailing list would be an option >> because there is at least some chance the app developer will get >> the email, but on balance I think it'd be better to incentivise >> people to change it themselves. >> >> So I would say: leave it blank and give a warning. >> >> Cheers, >> Andy >> >> On 24 Mar 2010, at 14:08, Peter wrote: >> >>> Hi, >>> >>> This is probably of interest to all the Bio* projects offering >>> access >>> to the NCBI >>> Entrez utilities. 
See forwarded message below. >>> >>> I *think* the new guidelines basically say that the email & tool >>> parameters are >>> optional BUT if your IP address ever gets banned for excessive use >>> you then >>> have to register an email & tool combination. >>> >>> Regarding the email address, the NCBI say to use the email of the >>> developer >>> (not the end user). However, they do not distinguish between the >>> developers >>> of a library (like us), and the developers of an application or >>> script using a >>> library (who may also be the end user). >>> >>> Currently we (Biopython) and I think BioPerl ask developers using >>> our libraries >>> to populate the email address themselves. I *think* this is still >>> the >>> right action. >>> >>> Peter >>> >>> ---------- Forwarded message ---------- >>> From: >>> Date: Wed, Mar 24, 2010 at 1:53 PM >>> Subject: [Utilities-announce] NCBI Revised E-utility Usage Policy >>> To: NLM/NCBI List utilities-announce >> > >>> >>> >>> New E-utility documentation now on the NCBI Bookshelf >>> >>> The Entrez Programming Utilities (E-Utilities) Help documentation >>> has >>> been added to the NCBI Bookshelf, and so is now fully integrated >>> with >>> the Entrez search and retrieval system as a part of the Bookshelf >>> database. This help document has been divided into chapters for >>> better >>> organization and includes several new sample Perl scripts. At >>> present >>> this book covers the standard URL interface for the E-utilties; >>> material about the SOAP interface will be added soon and is still >>> available at the same URL: >>> http://eutils.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html. >>> >>> >>> >>> Revised E-utility usage policy >>> >>> In December, 2009 NCBI announced a change to the usage policy for >>> the >>> E-utilities that would require all requests to contain non-null >>> values >>> for both the &email and &tool parameters. 
After several >>> consultations >>> with our users and developers, we have decided to revise this policy >>> change, and the revised policy is described in detail at the >>> following >>> link: >>> >>> http://www.ncbi.nlm.nih.gov/bookshelf/br.fcgi?book=helpeutils&part=chapter2#chapter2.Usage_Guidelines_and_Requiremen >>> >>> Please let us know if you have any questions or concerns about this >>> policy change. >>> >>> >>> >>> Thank you, >>> >>> The E-Utilities Team >>> >>> NIH/NLM/NCBI >>> >>> eutilities at ncbi.nlm.nih.gov. >>> >>> >>> >>> _______________________________________________ >>> Utilities-announce mailing list >>> http://www.ncbi.nlm.nih.gov/mailman/listinfo/utilities-announce >>> _______________________________________________ >>> Open-Bio-l mailing list >>> Open-Bio-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/open-bio-l >> >> >> _______________________________________________ >> Open-Bio-l mailing list >> Open-Bio-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/open-bio-l > > > _______________________________________________ > Open-Bio-l mailing list > Open-Bio-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/open-bio-l From cjfields at illinois.edu Thu Mar 25 00:51:25 2010 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 24 Mar 2010 23:51:25 -0500 Subject: [Open-bio-l] Fwd: [Utilities-announce] NCBI Revised E-utility Usage Policy In-Reply-To: <4168F915-DFF8-4C97-B0E9-7058A777E102@ebi.ac.uk> References: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com> <54C09681-5D8A-461F-8A8D-090E50DAE340@ebi.ac.uk> <4168F915-DFF8-4C97-B0E9-7058A777E102@ebi.ac.uk> Message-ID: <0E6B9B67-A222-47F4-A533-96847D030D54@illinois.edu> Yep, tend to agree. And the client program would be ultimately responsible for whatever is entered in. 
Of course, we'll want to register certain tools/emails anyway ('BioPerl' possibly along with a white-listed email to the mailing list), just to be on the safe side. Don't want the hassle of dealing with someone else pretending to be one of the various Bio*. chris On Mar 24, 2010, at 2:08 PM, Ewan Birney wrote: > > Sorry to perk up here, but I think the right thing is to throw an exception > if it's called without a "tool" and "email" parameter. Of course, then a client > programmer can abuse this, but they are forced to put something in there. > > > On 24 Mar 2010, at 18:21, Chris Fields wrote: > >> Neither the eutils notification nor the new eutils docs make this very clear. For instance, from reading the documentation, one would only have to register the tool and email once an IP is blocked. However, later on it is indicated that the values supplied must be registered with NCBI or they will be blocked, which (to me at least) reads as if they must be registered regardless. Which is it? >> >> Also, there is the bit about the tool and email belonging to the software developer or organization, not the end-user, likely for the reasons Hilmar mentions. Does this mean each tool has one assigned email? This would then mean we need to either set both and register them just in case, or leave both empty and warn the user. >> >> We have a bit of time to work out the specifics, just hoping NCBI responds (one never knows with them). >> >> chris >> >> On Mar 24, 2010, at 11:24 AM, Andy Jenkinson wrote: >> >>>> From my experience, if you set a default value for something and there is very little advantage to changing it, people will rarely bother to do so. >>> >>> The library developer's email address is not very useful for NCBI, who I assume wish to use it to contact whoever is consuming their resources. Being able to contact the Bio* developer doesn't really allow them to do this. 
The Bio* mailing list would be an option because there is at least some chance the app developer will get the email, but on balance I think it'd be better to incentivise people to change it themselves. >>> >>> So I would say: leave it blank and give a warning. >>> >>> Cheers, >>> Andy >>> >>> On 24 Mar 2010, at 14:08, Peter wrote: >>> >>>> Hi, >>>> >>>> This is probably of interest to all the Bio* projects offering access >>>> to the NCBI >>>> Entrez utilities. See forwarded message below. >>>> >>>> I *think* the new guidelines basically say that the email & tool parameters are >>>> optional BUT if your IP address ever gets banned for excessive use you then >>>> have to register an email & tool combination. >>>> >>>> Regarding the email address, the NCBI say to use the email of the developer >>>> (not the end user). However, they do not distinguish between the developers >>>> of a library (like us), and the developers of an application or script using a >>>> library (who may also be the end user). >>>> >>>> Currently we (Biopython) and I think BioPerl ask developers using our libraries >>>> to populate the email address themselves. I *think* this is still the >>>> right action. >>>> >>>> Peter >>>> >>>> ---------- Forwarded message ---------- >>>> From: >>>> Date: Wed, Mar 24, 2010 at 1:53 PM >>>> Subject: [Utilities-announce] NCBI Revised E-utility Usage Policy >>>> To: NLM/NCBI List utilities-announce >>>> >>>> >>>> New E-utility documentation now on the NCBI Bookshelf >>>> >>>> The Entrez Programming Utilities (E-Utilities) Help documentation has >>>> been added to the NCBI Bookshelf, and so is now fully integrated with >>>> the Entrez search and retrieval system as a part of the Bookshelf >>>> database. This help document has been divided into chapters for better >>>> organization and includes several new sample Perl scripts. 
>>>> >>>> >>>> >>>> _______________________________________________ >>>> Utilities-announce mailing list >>>> http://www.ncbi.nlm.nih.gov/mailman/listinfo/utilities-announce >>>> _______________________________________________ >>>> Open-Bio-l mailing list >>>> Open-Bio-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/open-bio-l >>> >>> >>> _______________________________________________ >>> Open-Bio-l mailing list >>> Open-Bio-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/open-bio-l >> >> >> _______________________________________________ >> Open-Bio-l mailing list >> Open-Bio-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/open-bio-l > From birney at ebi.ac.uk Thu Mar 25 07:07:30 2010 From: birney at ebi.ac.uk (Ewan Birney) Date: Thu, 25 Mar 2010 11:07:30 +0000 (GMT) Subject: [Open-bio-l] Fwd: [Utilities-announce] NCBI Revised E-utility Usage Policy In-Reply-To: <0E6B9B67-A222-47F4-A533-96847D030D54@illinois.edu> References: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com> <54C09681-5D8A-461F-8A8D-090E50DAE340@ebi.ac.uk> <4168F915-DFF8-4C97-B0E9-7058A777E102@ebi.ac.uk> <0E6B9B67-A222-47F4-A533-96847D030D54@illinois.edu> Message-ID: Oddly, I don't think you want to be registering BioPerl as a client with an email. Rather the Bioperl libraries should prevent a client programmer from using the functions without an email and program type entered. This forces the decision onto the client programmer. -- ----------------------------------------------------------------- Ewan Birney. 
Work: +44 1223 494420
Email: birney "at" ebi.ac.uk
Clerical Assistant: shelley "at" ebi.ac.uk
Please cc shelley for urgent or diary-dependent requests
-----------------------------------------------------------------

From biopython at maubp.freeserve.co.uk Thu Mar 25 07:19:09 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Thu, 25 Mar 2010 11:19:09 +0000
Subject: [Open-bio-l] Fwd: [Utilities-announce] NCBI Revised E-utility Usage Policy
In-Reply-To:
References: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com> <54C09681-5D8A-461F-8A8D-090E50DAE340@ebi.ac.uk> <4168F915-DFF8-4C97-B0E9-7058A777E102@ebi.ac.uk> <0E6B9B67-A222-47F4-A533-96847D030D54@illinois.edu>
Message-ID: <320fb6e01003250419q1ed0af70g913a6ab4a5de23b5@mail.gmail.com>

On Thu, Mar 25, 2010 at 11:07 AM, Ewan Birney wrote:
>
> Oddly, I don't think you want to be registering BioPerl as a client
> with an email. Rather the Bioperl libraries should prevent a client
> programmer from using the functions without an email and program type
> entered. This forces the decision onto the client programmer.
>

I think for most cases, having a Bio* mailing list as a default Entrez
email address is pointless (we have almost no control over how end users
will call the Entrez functions, whether they will use the history or not,
etc).

The current behaviour of defaulting to no email but raising a warning
seems OK. As Ewan says (and as the NCBI earlier said they would require),
changing this to make the email mandatory is also a sensible option.

There is a special case for running the Bio* unit tests, where it might
make sense to include the developers' mailing list (and maybe set the
tool to something like "BioPerl-unittests" rather than just "BioPerl").
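A minimal sketch, in Python, of the two behaviours weighed above: warn when
the email or tool is missing, or refuse to go on. The names
`build_identity_params`, `strict` and `MissingEntrezIdentity` are
hypothetical and made up for illustration; they are not part of Biopython,
BioPerl or any NCBI interface, and the sketch says nothing about what NCBI
itself enforces.

```python
import warnings


class MissingEntrezIdentity(ValueError):
    """Hypothetical error: a request would go out without email or tool."""


def build_identity_params(email=None, tool=None, strict=False):
    """Return the email/tool query parameters for a request.

    With strict=False a missing value only triggers a warning (the
    "leave it blank and give a warning" option); with strict=True it
    raises, which forces the client programmer to decide.
    """
    params = {}
    for name, value in (("email", email), ("tool", tool)):
        if value:
            params[name] = value
        elif strict:
            raise MissingEntrezIdentity(f"'{name}' must be set by the caller")
        else:
            warnings.warn(f"no '{name}' supplied for this request", UserWarning)
    return params
```

Note that neither mode sets a library-wide default email, so whatever value
is sent always comes from the person writing the script.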
Peter From pmiguel at purdue.edu Thu Mar 25 07:44:10 2010 From: pmiguel at purdue.edu (Phillip San Miguel) Date: Thu, 25 Mar 2010 07:44:10 -0400 Subject: [Open-bio-l] [Bioperl-l] Fwd: [Utilities-announce] NCBI Revised E-utility Usage Policy In-Reply-To: <338BDDD8-2A66-4086-BFB7-35EC8F8F0D66@illinois.edu> References: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com> <38D43B03-4A85-48CB-913A-CD564EB5168C@illinois.edu> <320fb6e01003240751v2afd5d5bwa39590afa9b13209@mail.gmail.com> <338BDDD8-2A66-4086-BFB7-35EC8F8F0D66@illinois.edu> Message-ID: <4BAB4C8A.3050903@purdue.edu> Chris Fields wrote: > On Mar 24, 2010, at 9:51 AM, Peter wrote: > > >> On Wed, Mar 24, 2010 at 2:37 PM, Chris Fields wrote: >> >>> >> Please give the NCBI an email - you can CC me too if you like. >> > > Sent, have cc'd the open-bio list. Don't want to cross-post this too much, so I think we should move the discussion there. > > >>> Re: the tool parameter, we currently set the tool itself to 'BioPerl' as a >>> default, but always leave the email blank and issue a warning if it isn't >>> set. We could just as easily leave both blank and issue warnings for both. >>> >> We currently leave out the email and set the tool parameter to "Biopython" >> by default but this can be overridden. Currently leaving out the email does >> cause Biopython to give a warning. >> >> Peter >> > > We follow the same, then (down to the warning). This is mentioned in my post to them, I'll wait to see what they say. > > My concern is the wording of the new rules. Each tool and email must be registered with them if an IP is blocked. Does this mean each tool is assigned one specific email? And an IP that is blocked can register it to be allowed back into the fold? With that in mind, should we register each of our toolkits with them? Probably not a bad thing (it might help us as devs to get an idea of use), but then if one user abuses the rules will their actions affect all toolkit users? 
Is this all done on a per-IP basis, per-toolkit basis, etc? > > Unfortunately, at least to me, none of this is made very clear, so I'm hoping there is some clarification from their end. > > chris > Maybe GenBank is hoping that developers will create Genbank rules-compliant modules when accessing their resources. That is, for EUtilities by default, the tools would check the local time and cut off requests to 100 if outside the hours of 9PM-5AM Eastern Time. Also the number of requests could be limited to 3 per second. But it seems like it would be better if Genbank would return some sort of "load" field with the response to each request. That would allow feedback control of a series of requests. It could be tuned however Genbank likes, but past a certain threshold the client program would know that another request within a certain amount of time will result in the IP being banned. -- Phillip From biopython at maubp.freeserve.co.uk Thu Mar 25 08:10:26 2010 From: biopython at maubp.freeserve.co.uk (Peter) Date: Thu, 25 Mar 2010 12:10:26 +0000 Subject: [Open-bio-l] [Bioperl-l] Fwd: [Utilities-announce] NCBI Revised E-utility Usage Policy In-Reply-To: <4BAB4C8A.3050903@purdue.edu> References: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com> <38D43B03-4A85-48CB-913A-CD564EB5168C@illinois.edu> <320fb6e01003240751v2afd5d5bwa39590afa9b13209@mail.gmail.com> <338BDDD8-2A66-4086-BFB7-35EC8F8F0D66@illinois.edu> <4BAB4C8A.3050903@purdue.edu> Message-ID: <320fb6e01003250510q3593c1dexcb1bc360d26b2682@mail.gmail.com> On Thu, Mar 25, 2010 at 11:44 AM, Phillip San Miguel wrote: > > Maybe GenBank is hoping that developers will create Genbank rules-compliant > modules when accessing their resources. That is, for EUtilities by default, > the tools would check the local time and cut off requests to 100 if outside > the hours of 9PM-5AM Eastern Time. Also the number of requests could be > limited to 3 per second. 
Biopython (and I assume BioPerl et al) already enforces the Entrez 3
requests per second rule. That bit is easy.

If we assume the user has their timezone information set up correctly, it
should also be possible to count the number of requests made within the
hours of 9AM to 5PM Eastern Time and issue a warning or raise an error if
over 100. Currently Biopython leaves this to the user.

Interestingly the older guideline text here gives the 100 limit,
http://www.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html#UserSystemRequirements
but the new text does not:
http://www.ncbi.nlm.nih.gov/bookshelf/br.fcgi?book=helpeutils&part=chapter2#chapter2.Usage_Guidelines_and_Requiremen

> But it seems like it would be better if Genbank would return some sort of
> "load" field with the response to each request. That would allow feedback
> control of a series of requests. It could be tuned however Genbank likes,
> but past a certain threshold the client program would know that another
> request within a certain amount of time will result in the IP being banned.

Interesting idea - could be useful for large jobs.

Peter

P.S. We're talking about more than just GenBank here - the Entrez
utilities cover multiple databases.
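The request spacing and the time-of-day check discussed above could be
sketched roughly as follows, in Python. This is an illustration under
stated assumptions, not how Biopython or BioPerl actually implement it:
`Throttle`, `to_eastern` and `is_off_peak` are hypothetical names, and the
default off-peak window (weekends, or 9 PM to 5 AM Eastern on weekdays) is
taken from the guideline wording quoted in this thread and left
configurable.

```python
import time
from datetime import datetime
from zoneinfo import ZoneInfo  # Python 3.9+; relies on the system tz database


class Throttle:
    """Space calls at least 1/max_per_second seconds apart."""

    def __init__(self, max_per_second=3, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, if any

    def wait(self):
        """Block until the next request may be sent."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def to_eastern(when: datetime) -> datetime:
    """Convert an aware datetime to US Eastern time, daylight saving included."""
    return when.astimezone(ZoneInfo("America/New_York"))


def is_off_peak(local: datetime, start_hour=21, end_hour=5) -> bool:
    """True on weekends, or from start_hour until end_hour on weekdays.

    `local` must already be expressed in Eastern time (see to_eastern).
    The window is an assumption based on the guideline text in this thread.
    """
    if local.weekday() >= 5:  # Saturday is 5, Sunday is 6
        return True
    return local.hour >= start_hour or local.hour < end_hour
```

A client would call `throttle.wait()` before each request, and could check
`is_off_peak(to_eastern(datetime.now(ZoneInfo("UTC"))))` before starting a
large job. Using a named zone rather than a fixed UTC offset avoids having
to track summer/winter time by hand.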

From biopython at maubp.freeserve.co.uk Thu Mar 25 08:18:38 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Thu, 25 Mar 2010 12:18:38 +0000
Subject: [Open-bio-l] [Bioperl-l] Fwd: [Utilities-announce] NCBI Revised E-utility Usage Policy
In-Reply-To: <320fb6e01003250510q3593c1dexcb1bc360d26b2682@mail.gmail.com>
References: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com> <38D43B03-4A85-48CB-913A-CD564EB5168C@illinois.edu> <320fb6e01003240751v2afd5d5bwa39590afa9b13209@mail.gmail.com> <338BDDD8-2A66-4086-BFB7-35EC8F8F0D66@illinois.edu> <4BAB4C8A.3050903@purdue.edu> <320fb6e01003250510q3593c1dexcb1bc360d26b2682@mail.gmail.com>
Message-ID: <320fb6e01003250518y7219d055ja8f67de8b15a7bdc@mail.gmail.com>

On Thu, Mar 25, 2010 at 12:10 PM, Peter wrote:
> If we assume the user has their timezone information setup right, it should
> also be possible to count the number of requests made within the hours of
> 9AM to 5PM Eastern Time and issue a warning or raise an error if over 100.
> Currently Biopython leaves this to the user.
>
> Interestingly the older guideline text here gives the 100 limit,
> http://www.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html#UserSystemRequirements
> but the new text does not:
> http://www.ncbi.nlm.nih.gov/bookshelf/br.fcgi?book=helpeutils&part=chapter2#chapter2.Usage_Guidelines_and_Requiremen

Eastern Standard Time (EST) is 5 hours behind Coordinated Universal Time
(UTC), aka Greenwich Mean Time (GMT), thus 09:00 to 17:00 EST is 14:00 to
22:00 UTC. I notice the NCBI do not appear to mention summer/winter time
(daylight saving time), which may be an oversight.
Peter From andy.jenkinson at ebi.ac.uk Thu Mar 25 08:50:01 2010 From: andy.jenkinson at ebi.ac.uk (Andy Jenkinson) Date: Thu, 25 Mar 2010 12:50:01 +0000 Subject: [Open-bio-l] Fwd: [Utilities-announce] NCBI Revised E-utility Usage Policy In-Reply-To: References: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com> <54C09681-5D8A-461F-8A8D-090E50DAE340@ebi.ac.uk> <4168F915-DFF8-4C97-B0E9-7058A777E102@ebi.ac.uk> <0E6B9B67-A222-47F4-A533-96847D030D54@illinois.edu> Message-ID: I think Chris meant to register them with NCBI rather than use them as default values. Purely to prevent application developers registering their applications as "BioPerl". I think we all agree that default values would not be helpful! On 25 Mar 2010, at 11:07, Ewan Birney wrote: > > Oddly, I don't think you want to be registering BioPerl as a client > with an email. Rather the Bioperl libraries should prevent a client > programmer from using the functions without an email and program type > entered. This forces the decision onto the client programmer. > > > -- > ----------------------------------------------------------------- > Ewan Birney. Work: +44 1223 494420 > Email: birney "at" ebi.ac.uk > Clerical Assistant: shelley "at" ebi.ac.uk > Please cc shelley for urgent or diary-dependent requests > ----------------------------------------------------------------- From cjfields at illinois.edu Thu Mar 25 09:10:43 2010 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 25 Mar 2010 08:10:43 -0500 Subject: [Open-bio-l] Fwd: [Utilities-announce] NCBI Revised E-utility Usage Policy In-Reply-To: References: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com> <54C09681-5D8A-461F-8A8D-090E50DAE340@ebi.ac.uk> <4168F915-DFF8-4C97-B0E9-7058A777E102@ebi.ac.uk> <0E6B9B67-A222-47F4-A533-96847D030D54@illinois.edu> Message-ID: Andy, Ewan, Yes, that's what I meant; I do not think a set of defaults is a good idea. 
The other advantage to registering them is the list would get immediate updates from NCBI when changes occur (instead of finding out about them second-hand from other subscribers). The list is very low traffic. From their online docs: 'In addition, developers may request that the value of email be added to the E-utility mailing list that provides announcements of software updates, known bugs and other policy changes affecting the E-utilities.' chris On Mar 25, 2010, at 7:50 AM, Andy Jenkinson wrote: > I think Chris meant to register them with NCBI rather than use them as default values. Purely to prevent application developers registering their applications as "BioPerl". I think we all agree that default values would not be helpful! > > On 25 Mar 2010, at 11:07, Ewan Birney wrote: > >> >> Oddly, I don't think you want to be registering BioPerl as a client >> with an email. Rather the Bioperl libraries should prevent a client >> programmer from using the functions without an email and program type >> entered. This forces the decision onto the client programmer. >> >> >> -- >> ----------------------------------------------------------------- >> Ewan Birney. 
Work: +44 1223 494420
>> Email: birney "at" ebi.ac.uk
>> Clerical Assistant: shelley "at" ebi.ac.uk
>> Please cc shelley for urgent or diary-dependent requests
>> -----------------------------------------------------------------
>

From peter at maubp.freeserve.co.uk Thu Mar 25 09:18:05 2010
From: peter at maubp.freeserve.co.uk (Peter)
Date: Thu, 25 Mar 2010 13:18:05 +0000
Subject: [Open-bio-l] Fwd: [Utilities-announce] NCBI Revised E-utility Usage Policy
In-Reply-To:
References: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com> <54C09681-5D8A-461F-8A8D-090E50DAE340@ebi.ac.uk> <4168F915-DFF8-4C97-B0E9-7058A777E102@ebi.ac.uk> <0E6B9B67-A222-47F4-A533-96847D030D54@illinois.edu>
Message-ID: <320fb6e01003250618j749eee50y8832e1a2543debd0@mail.gmail.com>

On Thu, Mar 25, 2010 at 1:10 PM, Chris Fields wrote:
> Andy, Ewan,
>
> Yes, that's what I meant; I do not think a set of defaults is a good idea.

Why? I agree that putting a default project email address in is a bad
idea, but having a default tool seems fine. Perhaps I have misunderstood
you.

If any Biopython/BioPerl user has written a dozen scripts using
Entrez should they really be expected to give them all a (unique) tool
name in the Entrez requests? Having it default to Biopython/BioPerl
seems reasonable to me (in combination with the script writer's email
address).

The whole hassle about registering a tool+email is only if you need your
IP address unblocked, typically if you or someone at your institute or
ISP has previously abused the servers.

Again we come back to the fact the new NCBI guidelines are still
unclear.

> The other advantage to registering them is the list would get immediate
> updates from NCBI when changes occur (instead of finding out about
> them second-hand from other subscribers). The list is very low traffic.

Well that is an advantage, but in practice having a few people from
each project on the NCBI mailing list isn't a big hassle.
Peter From biopython at maubp.freeserve.co.uk Thu Mar 25 09:35:19 2010 From: biopython at maubp.freeserve.co.uk (Peter) Date: Thu, 25 Mar 2010 13:35:19 +0000 Subject: [Open-bio-l] [Bioperl-l] Fwd: [Utilities-announce] NCBI Revised E-utility Usage Policy In-Reply-To: <320fb6e01003250518y7219d055ja8f67de8b15a7bdc@mail.gmail.com> References: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com> <38D43B03-4A85-48CB-913A-CD564EB5168C@illinois.edu> <320fb6e01003240751v2afd5d5bwa39590afa9b13209@mail.gmail.com> <338BDDD8-2A66-4086-BFB7-35EC8F8F0D66@illinois.edu> <4BAB4C8A.3050903@purdue.edu> <320fb6e01003250510q3593c1dexcb1bc360d26b2682@mail.gmail.com> <320fb6e01003250518y7219d055ja8f67de8b15a7bdc@mail.gmail.com> Message-ID: <320fb6e01003250635k26a614bdhc0c983c42198bfb0@mail.gmail.com> On Thu, Mar 25, 2010 at 12:18 PM, Peter wrote: > On Thu, Mar 25, 2010 at 12:10 PM, Peter wrote: >> If we assume the user has their timezone information setup right, it should >> also be possible to count the number of requests made within the hours of >> 9AM to 5PM Eastern Time and issue a warning or raise an error if over 100. >> Currently Biopython leaves this to the user. >> >> Interestingly the older guideline text here gives the 100 limit, >> http://www.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html#UserSystemRequirements >> but the new text does not: >> http://www.ncbi.nlm.nih.gov/bookshelf/br.fcgi?book=helpeutils&part=chapter2#chapter2.Usage_Guidelines_and_Requiremen > > Eastern Time (EST) is 5 hours behind of Coordinated Universal Time (UTC) > aka Greenwich Mean Time (GMT), thus 09:00 to 17:00 EST is 14:00 to 22:00 > UTC. I notice the NCBI do not appear to mention summer/winter time > (daylight saving time), which may be an oversight. 
> > Peter I've been looking at this more closely, the old guideline was: http://www.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html#UserSystemRequirements "Run retrieval scripts on weekends or between 9 pm and 5 am Eastern Time weekdays for any series of more than 100 requests." This doesn't define a series - for example, would it be OK to run a script making 75 requests every two hours? This could be regarded as multiple separate series each under 100 requests, but the cumulative count over the 8 peak hours is 600 requests. Sadly the new guidelines are even more vague: http://www.ncbi.nlm.nih.gov/bookshelf/br.fcgi?book=helpeutils&part=chapter2#chapter2.Usage_Guidelines_and_Requiremen "... and limit large jobs to either weekends or between 9:00 PM and 5:00 AM Eastern time during weekdays." Not very helpful - maybe leaving this rule down to the user (as Biopython currently does) is the best option. Peter From andy.jenkinson at ebi.ac.uk Thu Mar 25 09:53:20 2010 From: andy.jenkinson at ebi.ac.uk (Andy Jenkinson) Date: Thu, 25 Mar 2010 13:53:20 +0000 Subject: [Open-bio-l] Fwd: [Utilities-announce] NCBI Revised E-utility Usage Policy In-Reply-To: <320fb6e01003250618j749eee50y8832e1a2543debd0@mail.gmail.com> References: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com> <54C09681-5D8A-461F-8A8D-090E50DAE340@ebi.ac.uk> <4168F915-DFF8-4C97-B0E9-7058A777E102@ebi.ac.uk> <0E6B9B67-A222-47F4-A533-96847D030D54@illinois.edu> <320fb6e01003250618j749eee50y8832e1a2543debd0@mail.gmail.com> Message-ID: <58FB83B7-937F-40B4-9158-61863BFA10D1@ebi.ac.uk> On 25 Mar 2010, at 13:18, Peter wrote: > On Thu, Mar 25, 2010 at 1:10 PM, Chris Fields wrote: >> Andy, Ewan, >> >> Yes, that's what I meant; I do not think a set of defaults is a good idea. > > Why? I agree that putting a default project email address in is a bad > idea, but having a default tool seems fine. Perhaps I have misunderstood > you. Yes, I specifically mean the email parameter. 
I guess the only reason for the 'tool' parameter is to go some way to ensuring that the email address does actually belong to the tool developer? Otherwise it doesn't seem a very helpful piece of information to me, especially if it is one tool per email. > Again we come back to the fact the new NCBI guidelines are still > unclear. Indeed From cjfields at illinois.edu Thu Mar 25 11:43:24 2010 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 25 Mar 2010 10:43:24 -0500 Subject: [Open-bio-l] Fwd: [Utilities-announce] NCBI Revised E-utility Usage Policy In-Reply-To: <320fb6e01003250618j749eee50y8832e1a2543debd0@mail.gmail.com> References: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com> <54C09681-5D8A-461F-8A8D-090E50DAE340@ebi.ac.uk> <4168F915-DFF8-4C97-B0E9-7058A777E102@ebi.ac.uk> <0E6B9B67-A222-47F4-A533-96847D030D54@illinois.edu> <320fb6e01003250618j749eee50y8832e1a2543debd0@mail.gmail.com> Message-ID: On Mar 25, 2010, at 8:18 AM, Peter wrote: > On Thu, Mar 25, 2010 at 1:10 PM, Chris Fields wrote: >> Andy, Ewan, >> >> Yes, that's what I meant; I do not think a set of defaults is a good idea. > > Why? I agree that putting a default project email address in is a bad > idea, but having a default tool seems fine. Perhaps I have misunderstood > you. > > If any Biopython/BioPerl user has written a dozen scripts using > Entrez should they really be expected to give them all a (unique) tool > name in the Entrez requests? Having it default to Biopython/BioPerl > seems reasonable to me (in combination with the script writer's email > address). > > The whole hassle about registering a tool+email is only if you need your > IP address unblocked, typically if you or someone at your institute or > ISP has previously abused the servers. Let's play devil's advocate. The best way I can think of to describe this is to lay out a possible scenario. 
Suppose an end user (out of possibly thousands of end users, scattered across many IPs) uses one of the eutils modules/classes where the tool is set but the email isn't (our current status). They set 'email' to their local one, and then proceed to somehow abuse NCBI's rules and are blocked. In order to be reinstated, they will have to register both the tool and email with NCBI. Until then, does this block anyone else with the same tool? Just those from that IP? Not clear at the moment. To proceed further, now the user registers the tool and email (both need to be registered to unblock). If I understand the eutils documents correctly, as stated, only one email (supposedly the software developers) is registered per tool (also supposedly the software developers). The 'Bio*' tool name could end up being registered by anyone (wittingly or unwittingly), using their own personal email. If another user uses the same tool name with a different email, would they be blocked? If not, and that user tries to register as above (after subsequent abuse), would a conflict occur and they be notified of the prior registration? Again, it's not clear what happens. We have until probably sometime in May to decide a course of action (June 1 is the enforcement date I believe), but this relies on NCBI clarifying a few things first. The current documentation (at least to me) does seem to indicate that each tool must have a single corresponding email when registered. Unless it is clarified, from my perspective the only safe course that addresses all concerns is to leave both tool and email unset, and then register a respective toolkit/email to keep it within the specific dev group as a safeguard. That last bit is for many reasons I've already outlined; an additional one is the fact that we already have a default set for 'tool' now (and have had one set for a while), so by legacy anyone using older versions will have 'Bio*' preset already when June 1 hits. 
BTW, I don't consider the above scenario out of the realm of possibility, particularly if they truly intend on enforcing the rules this time around. We've had many users who have asked the question 'how can I download my batch of 1,00,000 records via eutils'. Potential lack of common sense doesn't stop the persistent or the desperate. > Again we come back to the fact the new NCBI guidelines are still > unclear. > >> The other advantage to registering them is the list would get immediate >> updates from NCBI when changes occur (instead of finding out about >> them second-hand from other subscribers). The list is very low traffic. > > Well that is an advantage, but in practice having a few people from > each project on the NCBI mailing list isn't a big hassle. > > Peter Right, but my point is there are no intermediaries, the news goes straight to the list. We're not reliant on (possibly busy, possibly absent) developers for second-hand news. We've been bitten by this before many times with NCBI, both with eutils and BLAST changes, a good many which NCBI announced but not passed on to our mailing list. Saying this now, it makes me wonder whether we should have a master list of some sort to gather such announcements that may impact developers (eutils, BLAST, GenBank/EMBL/UniProt releases, etc). chris From peter at maubp.freeserve.co.uk Thu Mar 25 12:13:57 2010 From: peter at maubp.freeserve.co.uk (Peter) Date: Thu, 25 Mar 2010 16:13:57 +0000 Subject: [Open-bio-l] Fwd: [Utilities-announce] NCBI Revised E-utility Usage Policy In-Reply-To: References: <54C09681-5D8A-461F-8A8D-090E50DAE340@ebi.ac.uk> <4168F915-DFF8-4C97-B0E9-7058A777E102@ebi.ac.uk> <0E6B9B67-A222-47F4-A533-96847D030D54@illinois.edu> <320fb6e01003250618j749eee50y8832e1a2543debd0@mail.gmail.com> Message-ID: <320fb6e01003250913k662aba2awd2121a57d1cd8596@mail.gmail.com> On Thu, Mar 25, 2010 at 3:43 PM, Chris Fields wrote: > > Let's play devil's advocate. 
The best way I can think of to > describe this is to lay out a possible scenario. Suppose an > end user (out of possibly thousands of end users, scattered > across many IPs) uses one of the eutils modules/classes where > the tool is set but the email isn't (our current status). > They set 'email' to their local one, and then proceed to > somehow abuse NCBI's rules and are blocked. In order to be > reinstated, they will have to register both the tool and > email with NCBI. Until then, does this block anyone else > with the same tool? Just those from that IP? Not clear at > the moment. I had understood that so far the NCBI just blocked by IP. But only they know for sure, and it may change. > To proceed further, now the user registers the tool and > email (both need to be registered to unblock). If I > understand the eutils documents correctly, as stated, only > one email (supposedly the software developers) is registered > per tool (also supposedly the software developers). The > 'Bio*' tool name could end up being registered by anyone > (wittingly or unwittingly), using their own personal email. Ah. That is possible - although I had been assuming the NCBI would be looking at the *combination* of tool+email, that may not be the case. > If another user uses the same tool name with a different > email, would they be blocked? If not, and that user tries > to register as above (after subsequent abuse), would a > conflict occur and they be notified of the prior > registration? Again, it's not clear what happens. Indeed. > We have until probably sometime in May to decide a course > of action (June 1 is the enforcement date I believe), but > this relies on NCBI clarifying a few things first. The > current documentation (at least to me) does seem to indicate > that each tool must have a single corresponding email when > registered. 
Unless it is clarified, from my perspective > the only safe course that addresses all concerns is to > leave both tool and email unset, and then register a > respective toolkit/email to keep it within the specific > dev group as a safeguard. That last bit is for many > reasons I've already outlined; an additional one is the > fact that we already have a default set for 'tool' now > (and have had one set for a while), so by legacy anyone > using older versions will have 'Bio*' preset already when > June 1 hits. Biopython has also been setting a default tool value (with no default email) for some time. > BTW, I don't consider the above scenario out of the realm > of possibility, particularly if they truly intend on > enforcing the rules this time around. We've had many > users who have asked the question 'how can I download > my batch of 1,00,000 records via eutils'. Potential > lack of common sense doesn't stop the persistent or > the desperate. Agreed - sooner or later someone will do something silly with the Bio* Entrez wrappers. Continuing to play Devil's advocate, let's suppose we had a default tool and email set. That *could* result in the NCBI blocking all users of that Bio* toolkit running with this default tool+email which would be a big problem (even if just for a day or two while talking to the NCBI). Therefore I don't think we should be setting a default tool+email (without clarification from the NCBI). > Right, but my point is there are no intermediaries, > the news goes straight to the list. We're not reliant > on (possibly busy, possibly absent) developers for > second-hand news. We've been bitten by this before > many times with NCBI, both with eutils and BLAST > changes, a good many which NCBI announced but not > passed on to our mailing list. > > Saying this now, it makes me wonder whether we should > have a master list of some sort to gather such > announcements that may impact developers (eutils, > BLAST, GenBank/EMBL/UniProt releases, etc). 
This is an excellent idea, but need not be linked to any default email used for Entrez. Peter From cjfields at illinois.edu Thu Mar 25 13:44:31 2010 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 25 Mar 2010 12:44:31 -0500 Subject: [Open-bio-l] Fwd: [Utilities-announce] NCBI Revised E-utility Usage Policy In-Reply-To: <320fb6e01003250913k662aba2awd2121a57d1cd8596@mail.gmail.com> References: <54C09681-5D8A-461F-8A8D-090E50DAE340@ebi.ac.uk> <4168F915-DFF8-4C97-B0E9-7058A777E102@ebi.ac.uk> <0E6B9B67-A222-47F4-A533-96847D030D54@illinois.edu> <320fb6e01003250618j749eee50y8832e1a2543debd0@mail.gmail.com> <320fb6e01003250913k662aba2awd2121a57d1cd8596@mail.gmail.com> Message-ID: On Mar 25, 2010, at 11:13 AM, Peter wrote: > On Thu, Mar 25, 2010 at 3:43 PM, Chris Fields wrote: > > > > Let's play devil's advocate. The best way I can think of to > > describe this is to lay out a possible scenario. Suppose an > > end user (out of possibly thousands of end users, scattered > > across many IPs) uses one of the eutils modules/classes where > > the tool is set but the email isn't (our current status). > > They set 'email' to their local one, and then proceed to > > somehow abuse NCBI's rules and are blocked. In order to be > > reinstated, they will have to register both the tool and > > email with NCBI. Until then, does this block anyone else > > with the same tool? Just those from that IP? Not clear at > > the moment. > > I had understood that so far the NCBI just blocked by IP. > But only they know for sure, and it may change. Yes. And they have every right to change; they have been hammered by spam for years now, and the tool/email has always been implied to be required (though never really enforced). So it was only a matter of time before they did something about it. > > To proceed further, now the user registers the tool and > > email (both need to be registered to unblock).
If I > > understand the eutils documents correctly, as stated, only > > one email (supposedly the software developers) is registered > > per tool (also supposedly the software developers). The > > 'Bio*' tool name could end up being registered by anyone > > (wittingly or unwittingly), using their own personal email. > > Ah. That is possible - although I had been assuming the NCBI > would be looking at the *combination* of tool+email, that > may not be the case. > > > If another user uses the same tool name with a different > > email, would they be blocked? If not, and that user tries > > to register as above (after subsequent abuse), would a > > conflict occur and they be notified of the prior > > registration? Again, it's not clear what happens. > > Indeed. > > > We have until probably sometime in May to decide a course > > of action (June 1 is the enforcement date I believe), but > > this relies on NCBI clarifying a few things first. The > > current documentation (at least to me) does seem to indicate > > that each tool must have a single corresponding email when > > registered. Unless it is clarified, from my perspective > > the only safe course that addresses all concerns is to > > leave both tool and email unset, and then register a > > respective toolkit/email to keep it within the specific > > dev group as a safeguard. That last bit is for many > > reasons I've already outlined; an additional one is the > > fact that we already have a default set for 'tool' now > > (and have had one set for a while), so by legacy anyone > > using older versions will have 'Bio*' preset already when > > June 1 hits. > > Biopython has also been setting a default tool value > (with no default email) for some time. Right. Same with BioPerl, for many years now, with a few different names (some class-specific, others toolkit-based). Kind of a mess, really. 
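[Editorial sketch, not part of the original thread.] The behaviour described above, a default 'tool' value with no default 'email', might look roughly like the following. This is a minimal sketch under stated assumptions, not the actual BioPerl or Biopython code: the function name build_esearch_url and the default tool string ExampleBioToolkit are hypothetical. Only the ESearch endpoint and the db, term, tool and email parameter names are taken from the NCBI E-utilities interface. No request is sent; the function only builds the URL.

```python
import warnings
from urllib.parse import urlencode

# Real E-utilities ESearch endpoint; everything else here is illustrative.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Hypothetical toolkit-level default, standing in for "BioPerl"/"Biopython".
DEFAULT_TOOL = "ExampleBioToolkit"


def build_esearch_url(db, term, email=None, tool=DEFAULT_TOOL):
    """Build (but do not send) an ESearch URL carrying tool and email.

    Hypothetical helper: the tool falls back to a toolkit-wide default,
    while the email has no default and triggers a warning when missing.
    """
    params = {"db": db, "term": term}
    if tool:
        params["tool"] = tool
    if email:
        params["email"] = email
    else:
        # No default email: nudge the script author to supply their own.
        warnings.warn(
            "No email set for NCBI E-utilities; please pass your own "
            "contact address.",
            UserWarning,
        )
    return ESEARCH_URL + "?" + urlencode(params)


if __name__ == "__main__":
    # A script author overriding the default tool name and giving an email.
    print(build_esearch_url("pubmed", "biopython",
                            email="a.user@example.org", tool="MyPipeline"))
```

Swapping the warnings.warn call for a raised exception would give the stricter "mandatory email" variant; which of the two a toolkit should use is a policy choice rather than a technical one.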
> > BTW, I don't consider the above scenario out of the realm > > of possibility, particularly if they truly intend on > > enforcing the rules this time around. We've had many > > users who have asked the question 'how can I download > > my batch of 1,00,000 records via eutils'. Potential > > lack of common sense doesn't stop the persistent or > > the desperate. > > Agreed - sooner or later someone will do something silly > with the Bio* Entrez wrappers. Continuing to play Devil's > advocate, let's suppose we had a default tool and email > set. That *could* result in the NCBI blocking all users > of that Bio* toolkit running with this default tool+email > which would be a big problem (even if just for a day or > two while talking to the NCBI). Therefore I don't think > we should be setting a default tool+email (without > clarification from the NCBI). Yes, agreed. > > Right, but my point is there are no intermediaries, > > the news goes straight to the list. We're not reliant > > on (possibly busy, possibly absent) developers for > > second-hand news. We've been bitten by this before > > many times with NCBI, both with eutils and BLAST > > changes, a good many which NCBI announced but not > > passed on to our mailing list. > > > > Saying this now, it makes me wonder whether we should > > have a master list of some sort to gather such > > announcements that may impact developers (eutils, > > BLAST, GenBank/EMBL/UniProt releases, etc). > > This is an excellent idea, but need not be linked to > any default email used for Entrez. > > Peter Agreed. Adding the tool's related email to the NCBI eutil mailing list is supposed to be optional, anyway. 
chris From cjfields at illinois.edu Thu Mar 25 14:17:29 2010 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 25 Mar 2010 13:17:29 -0500 Subject: [Open-bio-l] Toolkits and the new eutils policies In-Reply-To: <2F6ECC14-3871-4BA4-AD68-96C9279EBA58@ncbi.nlm.nih.gov> References: <2F6ECC14-3871-4BA4-AD68-96C9279EBA58@ncbi.nlm.nih.gov> Message-ID: I'll cc this for the others in BioPerl, Biopython, Bioruby, Ensembl, etc so they know and can pass it around. Thanks Eric, happy to have that clarified. chris On Mar 25, 2010, at 1:06 PM, sayers wrote: > Hi Chris, > > Thanks for your note. Organizations such as yours were in our minds when we considered this policy change, and perhaps we need to consider improving the wording of the policy so that it is clearer on the points that you raise. > > Our requests for registered values for &tool and &email are directed at "end-developers" of software, not necessarily at toolkit developers such as yourself, whose products may be used by end-developers. We appreciate any efforts you can make to encourage and facilitate the use of &tool and &email by end-developers, but the act of registering values of these parameters is the responsibility of the end-developers. The value of &email should be a contact address for the end-developer, and ideally the value of &tool should be the name of the software package that uses BioPerl or other toolkits. > > My suggestion would be similar to what you mention in your second paragraph: to leave both &tool and &email without values and give a warning or some equivalent to developers if they fail to set them. > > Regarding blocks, we block on the basis of IP addresses only. The essence of the new policy is that we are far more likely to block requests that do not have registered values of &tool and &email.
If we need to block activity that does have a registered &tool or &email value, we will only block those IPs that are causing the abusive activity, not all IPs using that &tool/&email value. > > Thanks again for your comments and please let me know if you have further questions. > > Regards, > Eric > ___________________ > Eric W. Sayers, PhD > NCBI/NLM/NIH > 45 Center Drive, MSC 6511 > Bldg 45, Room 4AN.44C > Bethesda, MD 20892 > sayers at ncbi.nlm.nih.gov > > > > > On Mar 24, 2010, at 11:31 AM, Chris Fields wrote: > >> To whom it may concern, >> >> Just want to get some clarification from the eutils folks with the changes. I'm a core developer for the BioPerl toolkit and am in collaboration with several other Bio* toolkits (BioPython, BioRuby, BioJava). We all have interfaces to eutils, either via the standard interface, SOAP-based, or both. >> >> We're seeking a clarification regarding the new rules, specifically the rules concerning 'tool' and 'email'. Per the rules proposed in Dec. 2009, many of us have already implemented changes to address them. As we're basically language-specific toolkits (classes, modules, etc) that are used for developing other applications, we have taken the stance that the end-user will need to start providing at least the email in order to properly use the eutils-specific tools, with a warning issued otherwise. This is based on several reasons, foremost being the toolkits are very widely used (so may be spread over potentially thousands of IPs) and are being used in a large number of downstream applications. As an example of the latter, BioJava is used in several free and commercial applications, such as Taverna and Geneious. >> >> Currently, with BioPerl and Biopython, we set the 'tool' parameter specifically to the toolkit name by default. This parameter can be overridden. However, from reading the newest rules it appears that each tool (and thus each toolkit) should have one common email, no more.
This also appears to make the assertion that users using these toolkits (with 'tool' set and using the relevant emails) may essentially be tied down if one IP decides to abuse the rules. >> >> This unfortunately makes the incorrect generalization that each tool is created by one developer, and simply does not make sense in large collaborative projects such as ours, where the software is used primarily in downstream applications and scripts. Setting the tool could be beneficial to the development team, but at the same time it could be a tremendous hindrance. >> >> So, what exactly should we do? Do we set the 'tool' by default, or leave it to the user? Similarly, how do we treat 'email'? If we do set them, would the entire group of users for that tool be blocked if one end-user abuses the system? Should we leave it up to the user to register themselves (both tool and email)? So, we're at an impasse and really need your help. >> >> Sincerely, >> >> chris >> >> >> Christopher Fields >> Core Developer, BioPerl Project >> IGB Postdoctoral Fellow >> Genomics of Neural & Behavioral Plasticity >> University of Illinois Urbana-Champaign >> Institute for Genomic Biology >> 1206 W. Gregory Dr. , MC-195 >> Urbana, IL 61801 >> > From biopython at maubp.freeserve.co.uk Thu Mar 25 14:31:18 2010 From: biopython at maubp.freeserve.co.uk (Peter) Date: Thu, 25 Mar 2010 18:31:18 +0000 Subject: [Open-bio-l] Toolkits and the new eutils policies In-Reply-To: References: <2F6ECC14-3871-4BA4-AD68-96C9279EBA58@ncbi.nlm.nih.gov> Message-ID: <320fb6e01003251131t4d3e2f02qace9a3ba89d2f6b6@mail.gmail.com> On Thu, Mar 25, 2010 at 6:17 PM, Chris Fields wrote: > I'll cc this for the others in BioPerl, Biopython, Bioruby, > Ensembl, etc so they know and can pass it around. > Thanks Eric, happy to have that clarified. 
> > chris Thanks Chris (& Eric), Looks like we are fine as things stand: continue to encourage the user to set the email (with a warning if omitted), and try to encourage them to override the tool parameter if appropriate (e.g. if part of a larger application like a Galaxy workflow). [I don't see any point in forcing people to invent tool names for each of their one off Entrez scripts, or interactive sessions - defaulting to BioPerl etc here seems sane] Peter From birney at ebi.ac.uk Thu Mar 25 15:59:31 2010 From: birney at ebi.ac.uk (Ewan Birney) Date: Thu, 25 Mar 2010 19:59:31 +0000 (GMT) Subject: [Open-bio-l] Toolkits and the new eutils policies In-Reply-To: <320fb6e01003251131t4d3e2f02qace9a3ba89d2f6b6@mail.gmail.com> References: <2F6ECC14-3871-4BA4-AD68-96C9279EBA58@ncbi.nlm.nih.gov> <320fb6e01003251131t4d3e2f02qace9a3ba89d2f6b6@mail.gmail.com> Message-ID: On Thu, 25 Mar 2010, Peter wrote: > On Thu, Mar 25, 2010 at 6:17 PM, Chris Fields wrote: >> I'll cc this for the others in BioPerl, Biopython, Bioruby, >> Ensembl, etc so they know and can pass it around. >> Thanks Eric, happy to have that clarified. >> >> chris > > Thanks Chris (& Eric), > > Looks like we are fine as things stand: continue to > encourage the user to set the email (with a warning > if omitted), and try to encourage them to override the > tool parameter if appropriate (e.g. if part of a larger > application like a Galaxy workflow). > > [I don't see any point in forcing people to invent > tool names for each of their one off Entrez scripts, > or interactive sessions - defaulting to BioPerl etc > here seems sane] At the very least teh default should be "BioPerl Toolkit Placeholder For Non Registered Client" so that NCBI know precisely that the end programmer has not put something in there sensibly. And there should be a loud warning. I think it's fine to actually throw an exception. If someone is running a one off script, then they made the function call and can modify it. 
If someone's developing something more serious then they've got the time to think it through. I see little benefit in letting a default happen with just a warning. > > Peter > _______________________________________________ > Open-Bio-l mailing list > Open-Bio-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/open-bio-l > -- ----------------------------------------------------------------- Ewan Birney. Work: +44 1223 494420 Email: birney "at" ebi.ac.uk Clerical Assistant: shelley "at" ebi.ac.uk Please cc shelley for urgent or diary-dependent requests ----------------------------------------------------------------- From cjfields at illinois.edu Thu Mar 25 16:34:01 2010 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 25 Mar 2010 15:34:01 -0500 Subject: [Open-bio-l] Toolkits and the new eutils policies In-Reply-To: References: <2F6ECC14-3871-4BA4-AD68-96C9279EBA58@ncbi.nlm.nih.gov> <320fb6e01003251131t4d3e2f02qace9a3ba89d2f6b6@mail.gmail.com> Message-ID: <30558517-17C8-4863-B13A-CA3661076265@illinois.edu> On Mar 25, 2010, at 2:59 PM, Ewan Birney wrote: > On Thu, 25 Mar 2010, Peter wrote: > >> On Thu, Mar 25, 2010 at 6:17 PM, Chris Fields wrote: >>> I'll cc this for the others in BioPerl, Biopython, Bioruby, >>> Ensembl, etc so they know and can pass it around. >>> Thanks Eric, happy to have that clarified. >>> >>> chris >> >> Thanks Chris (& Eric), >> >> Looks like we are fine as things stand: continue to >> encourage the user to set the email (with a warning >> if omitted), and try to encourage them to override the >> tool parameter if appropriate (e.g. if part of a larger >> application like a Galaxy workflow). 
>> >> [I don't see any point in forcing people to invent >> tool names for each of their one off Entrez scripts, >> or interactive sessions - defaulting to BioPerl etc >> here seems sane] > > At the very least teh default should be "BioPerl Toolkit Placeholder For Non Registered Client" so that NCBI know precisely that the end programmer has not put something in there sensibly. > > And there should be a loud warning. I think it's fine to actually > throw an exception. If someone is running a one off script, then they > made the function call and can modify it. If someone's developing something more serious then they've got the time to think it through. > > > I see little benefit in letting a default happen with just a warning. I agree re: throwing an exception. Not sure I see the point of setting the tool at all if we're throwing an exception when the tool isn't changed from the default (would be just as easy to throw if it's not set), but I definitely see the benefit of an exception re: email. chris >> >> Peter >> _______________________________________________ >> Open-Bio-l mailing list >> Open-Bio-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/open-bio-l >> > > -- > ----------------------------------------------------------------- > Ewan Birney. 
Work: +44 1223 494420 > Email: birney "at" ebi.ac.uk > Clerical Assistant: shelley "at" ebi.ac.uk > Please cc shelley for urgent or diary-dependent requests > ----------------------------------------------------------------- From biopython at maubp.freeserve.co.uk Thu Mar 25 18:39:23 2010 From: biopython at maubp.freeserve.co.uk (Peter) Date: Thu, 25 Mar 2010 22:39:23 +0000 Subject: [Open-bio-l] Toolkits and the new eutils policies In-Reply-To: References: <2F6ECC14-3871-4BA4-AD68-96C9279EBA58@ncbi.nlm.nih.gov> <320fb6e01003251131t4d3e2f02qace9a3ba89d2f6b6@mail.gmail.com> Message-ID: <320fb6e01003251539k917c0b5xaafb9550cc27afee@mail.gmail.com> On Thu, Mar 25, 2010 at 7:59 PM, Ewan Birney wrote: > > On Thu, 25 Mar 2010, Peter wrote: >> >> Thanks Chris (& Eric), >> >> Looks like we are fine as things stand: continue to >> encourage the user to set the email (with a warning >> if omitted), and try to encourage them to override the >> tool parameter if appropriate (e.g. if part of a larger >> application like a Galaxy workflow). >> >> [I don't see any point in forcing people to invent >> tool names for each of their one off Entrez scripts, >> or interactive sessions - defaulting to BioPerl etc >> here seems sane] > > At the very least teh default should be "BioPerl Toolkit Placeholder For Non > Registered Client" so that NCBI know precisely that the end programmer has > not put something in there sensibly. No, that's just silly IMHO. Using "BioPerl" on its own serves just the same purpose (indeed, the NCBI will be used to this from existing users and all versions of BioPerl to date), The extra long version doesn't add any useful information and more importantly makes the URLs much longer which can be a real issue because long URLs can break (e.g. if going via a proxy). > And there should be a loud warning. I think it's fine to actually > throw an exception. If someone is running a one off script, then they > made the function call and can modify it. 
If someone's developing something > more serious then they've got the time to think it through. > > I see little benefit in letting a default happen with just a warning. Making the email and/or tool mandatory vs throwing an exception just an implementation detail. I think the issue is should BioPerl etc treat the email and tool as optional, optional with a warning, or mandatory. Note the NCBI does not seem to be making them mandatory (for now). Peter From cjfields at illinois.edu Thu Mar 25 22:10:54 2010 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 25 Mar 2010 21:10:54 -0500 Subject: [Open-bio-l] Toolkits and the new eutils policies In-Reply-To: <320fb6e01003251539k917c0b5xaafb9550cc27afee@mail.gmail.com> References: <2F6ECC14-3871-4BA4-AD68-96C9279EBA58@ncbi.nlm.nih.gov> <320fb6e01003251131t4d3e2f02qace9a3ba89d2f6b6@mail.gmail.com> <320fb6e01003251539k917c0b5xaafb9550cc27afee@mail.gmail.com> Message-ID: <19CCAA14-DACE-4F20-BE2C-742EA0F46995@illinois.edu> On Mar 25, 2010, at 5:39 PM, Peter wrote: > On Thu, Mar 25, 2010 at 7:59 PM, Ewan Birney wrote: >> >> On Thu, 25 Mar 2010, Peter wrote: >>> >>> Thanks Chris (& Eric), >>> >>> Looks like we are fine as things stand: continue to >>> encourage the user to set the email (with a warning >>> if omitted), and try to encourage them to override the >>> tool parameter if appropriate (e.g. if part of a larger >>> application like a Galaxy workflow). >>> >>> [I don't see any point in forcing people to invent >>> tool names for each of their one off Entrez scripts, >>> or interactive sessions - defaulting to BioPerl etc >>> here seems sane] >> >> At the very least teh default should be "BioPerl Toolkit Placeholder For Non >> Registered Client" so that NCBI know precisely that the end programmer has >> not put something in there sensibly. > > No, that's just silly IMHO. 
Using "BioPerl" on its own serves just > the same purpose (indeed, the NCBI will be used to this from > existing users and all versions of BioPerl to date), The extra > long version doesn't add any useful information and more > importantly makes the URLs much longer which can be a > real issue because long URLs can break (e.g. if going via > a proxy). I don't think this is meant literally, just the general idea that setting it to a specific value indicates the user in question didn't reset it. >> And there should be a loud warning. I think it's fine to actually >> throw an exception. If someone is running a one off script, then they >> made the function call and can modify it. If someone's developing something >> more serious then they've got the time to think it through. >> >> I see little benefit in letting a default happen with just a warning. > > Making the email and/or tool mandatory vs throwing an > exception just an implementation detail. > > I think the issue is should BioPerl etc treat the email and > tool as optional, optional with a warning, or mandatory. > Note the NCBI does not seem to be making them > mandatory (for now). > > Peter It is listed as a user requirement here, has been for a long while actually: http://www.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html#UserSystemRequirements It just hasn't been enforced. The new rule, as I understand it, is they will likely start enforcing it. chris From maj at fortinbras.us Fri Mar 26 09:04:56 2010 From: maj at fortinbras.us (Mark A. Jensen) Date: Fri, 26 Mar 2010 09:04:56 -0400 Subject: [Open-bio-l] ack! code.open-bio.org Message-ID: Hey Chris, can't ping code again. Sorry about this. 
MAJ From biopython at maubp.freeserve.co.uk Sat Mar 27 08:50:04 2010 From: biopython at maubp.freeserve.co.uk (Peter) Date: Sat, 27 Mar 2010 12:50:04 +0000 Subject: [Open-bio-l] Toolkits and the new eutils policies In-Reply-To: <19CCAA14-DACE-4F20-BE2C-742EA0F46995@illinois.edu> References: <2F6ECC14-3871-4BA4-AD68-96C9279EBA58@ncbi.nlm.nih.gov> <320fb6e01003251131t4d3e2f02qace9a3ba89d2f6b6@mail.gmail.com> <320fb6e01003251539k917c0b5xaafb9550cc27afee@mail.gmail.com> <19CCAA14-DACE-4F20-BE2C-742EA0F46995@illinois.edu> Message-ID: <320fb6e01003270550t7ae22d60waceffac7914fce6d@mail.gmail.com> On Fri, Mar 26, 2010 at 2:10 AM, Chris Fields wrote: > > On Mar 25, 2010, at 5:39 PM, Peter wrote: > >> On Thu, Mar 25, 2010 at 7:59 PM, Ewan Birney wrote: >>> >>> At the very least teh default should be "BioPerl Toolkit >>> Placeholder For Non Registered Client" so that NCBI know >>> precisely that the end programmer has not put something in >>> there sensibly. >> >> No, that's just silly IMHO. Using "BioPerl" on its own serves just >> the same purpose (indeed, the NCBI will be used to this from >> existing users and all versions of BioPerl to date), The extra >> long version doesn't add any useful information and more >> importantly makes the URLs much longer which can be a >> real issue because long URLs can break (e.g. if going via >> a proxy). > > I don't think this is meant literally, just the general idea that > setting it to a specific value indicates the user in question > didn't reset it. > That makes more sense ;) >>> And there should be a loud warning. I think it's fine to actually >>> throw an exception. If someone is running a one off script, then they >>> made the function call and can modify it. If someone's developing >>> something more serious then they've got the time to think it through. >>> >>> I see little benefit in letting a default happen with just a warning. 
>> >> Making the email and/or tool mandatory vs throwing an >> exception just an implementation detail. >> >> I think the issue is should BioPerl etc treat the email and >> tool as optional, optional with a warning, or mandatory. >> Note the NCBI does not seem to be making them >> mandatory (for now). >> >> Peter > > > It is listed as a user requirement here, has been for a long > while actually: > > http://www.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html#UserSystemRequirements > > It just hasn't been enforced. The new rule, as I understand > it, is they will likely start enforcing it. So is our status quo fine? email - left out by default but with a warning tool - set to BioPerl (or similar) by default Peter
From peter at maubp.freeserve.co.uk Wed Mar 24 14:08:26 2010 From: peter at maubp.freeserve.co.uk (Peter) Date: Wed, 24 Mar 2010 14:08:26 +0000 Subject: [Open-bio-l] Fwd: [Utilities-announce] NCBI Revised E-utility Usage Policy In-Reply-To: References: Message-ID: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com> Hi, This is probably of interest to all the Bio* projects offering access to the NCBI Entrez utilities. See forwarded message below. I *think* the new guidelines basically say that the email & tool parameters are optional BUT if your IP address ever gets banned for excessive use you then have to register an email & tool combination. Regarding the email address, the NCBI say to use the email of the developer (not the end user). However, they do not distinguish between the developers of a library (like us), and the developers of an application or script using a library (who may also be the end user). Currently we (Biopython) and I think BioPerl ask developers using our libraries to populate the email address themselves. I *think* this is still the right action. Peter ---------- Forwarded message ---------- From: Date: Wed, Mar 24, 2010 at 1:53 PM Subject: [Utilities-announce] NCBI Revised E-utility Usage Policy To: NLM/NCBI List utilities-announce New E-utility documentation now on the NCBI Bookshelf The Entrez Programming Utilities (E-Utilities) Help documentation has been added to the NCBI Bookshelf, and so is now fully integrated with the Entrez search and retrieval system as a part of the Bookshelf database. This help document has been divided into chapters for better organization and includes several new sample Perl scripts. At present this book covers the standard URL interface for the E-utilties; material about the SOAP interface will be added soon and is still available at the same URL: http://eutils.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html.
Revised E-utility usage policy In December, 2009 NCBI announced a change to the usage policy for the E-utilities that would require all requests to contain non-null values for both the &email and &tool parameters. After several consultations with our users and developers, we have decided to revise this policy change, and the revised policy is described in detail at the following link: http://www.ncbi.nlm.nih.gov/bookshelf/br.fcgi?book=helpeutils&part=chapter2#chapter2.Usage_Guidelines_and_Requiremen Please let us know if you have any questions or concerns about this policy change. Thank you, The E-Utilities Team NIH/NLM/NCBI eutilities at ncbi.nlm.nih.gov. _______________________________________________ Utilities-announce mailing list http://www.ncbi.nlm.nih.gov/mailman/listinfo/utilities-announce From cjfields at illinois.edu Wed Mar 24 14:37:13 2010 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 24 Mar 2010 09:37:13 -0500 Subject: [Open-bio-l] [Bioperl-l] Fwd: [Utilities-announce] NCBI Revised E-utility Usage Policy In-Reply-To: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com> References: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com> Message-ID: <38D43B03-4A85-48CB-913A-CD564EB5168C@illinois.edu> On Mar 24, 2010, at 9:08 AM, Peter wrote: > Hi, > > This is probably of interest to all the Bio* projects offering access > to the NCBI > Entrez utilities. See forwarded message below. > > I *think* the new guidelines basically say that the email & tool parameters are > optional BUT if your IP address ever gets banned for excessive use you then > have to register an email & tool combination. > > Regarding the email address, the NCBI say to use the email of the developer > (not the end user).
However, they do not distinguish between the developers > of a library (like us), and the developers of an application or script using a > library (who may also be the end user). > > Currently we (Biopython) and I think BioPerl ask developers using our libraries > to populate the email address themselves. I *think* this is still the > right action. > > Peter Basically, that's the same tactic I'm going with with Bio::DB::EUtilities (and I think with the SOAP-based ones as well). We're providing a specific set of tools for users to write their own end applications. I can try contacting them regarding this to get an official response to clarify this somewhat. Re: the tool parameter, we currently set the tool itself to 'BioPerl' as a default, but always leave the email blank and issue a warning if it isn't set. We could just as easily leave both blank and issue warnings for both. chris > ---------- Forwarded message ---------- > From: > Date: Wed, Mar 24, 2010 at 1:53 PM > Subject: [Utilities-announce] NCBI Revised E-utility Usage Policy > To: NLM/NCBI List utilities-announce > > > New E-utility documentation now on the NCBI Bookshelf > > The Entrez Programming Utilities (E-Utilities) Help documentation has > been added to the NCBI Bookshelf, and so is now fully integrated with > the Entrez search and retrieval system as a part of the Bookshelf > database. This help document has been divided into chapters for better > organization and includes several new sample Perl scripts. At present > this book covers the standard URL interface for the E-utilties; > material about the SOAP interface will be added soon and is still > available at the same URL: > http://eutils.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html. > > > > Revised E-utility usage policy > > In December, 2009 NCBI announced a change to the usage policy for the > E-utilities that would require all requests to contain non-null values > for both the &email and &tool parameters.
After several consultations > with our users and developers, we have decided to revise this policy > change, and the revised policy is described in detail at the following > link: > > http://www.ncbi.nlm.nih.gov/bookshelf/br.fcgi?book=helpeutils&part=chapter2#chapter2.Usage_Guidelines_and_Requiremen > > Please let us know if you have any questions or concerns about this > policy change. > > > > Thank you, > > The E-Utilities Team > > NIH/NLM/NCBI > > eutilities at ncbi.nlm.nih.gov. > > > > _______________________________________________ > Utilities-announce mailing list > http://www.ncbi.nlm.nih.gov/mailman/listinfo/utilities-announce > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From biopython at maubp.freeserve.co.uk Wed Mar 24 14:51:46 2010 From: biopython at maubp.freeserve.co.uk (Peter) Date: Wed, 24 Mar 2010 14:51:46 +0000 Subject: [Open-bio-l] [Bioperl-l] Fwd: [Utilities-announce] NCBI Revised E-utility Usage Policy In-Reply-To: <38D43B03-4A85-48CB-913A-CD564EB5168C@illinois.edu> References: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com> <38D43B03-4A85-48CB-913A-CD564EB5168C@illinois.edu> Message-ID: <320fb6e01003240751v2afd5d5bwa39590afa9b13209@mail.gmail.com> On Wed, Mar 24, 2010 at 2:37 PM, Chris Fields wrote: > > On Mar 24, 2010, at 9:08 AM, Peter wrote: > >> Hi, >> >> This is probably of interest to all the Bio* projects offering access >> to the NCBI Entrez utilities. See forwarded message below. >> >> I *think* the new guidelines basically say that the email & tool parameters are >> optional BUT if your IP address ever gets banned for excessive use you then >> have to register an email & tool combination. >> >> Regarding the email address, the NCBI say to use the email of the developer >> (not the end user). 
However, they do not distinguish between the developers >> of a library (like us), and the developers of an application or script using a >> library (who may also be the end user). >> >> Currently we (Biopython) and I think BioPerl ask developers using our libraries >> to populate the email address themselves. I *think* this is still the >> right action. >> >> Peter > > > Basically, that's the same tactic I'm going with with Bio::DB::EUtilities (and I > think with the SOAP-based ones as well). We're providing a specific set of > tools for user to write up their own applications end applications. I can try > contacting them regarding this to get an official response to clarify this > somewhat. Please give the NCBI an email - you can CC me too if you like. > Re: the tool parameter, we currently set the tool itself to 'BioPerl' as a > default, but always leave the email blank and issue a warning if it isn't > set. We could just as easily leave both blank and issue warnings for both. We currently leave out the email and set the tool parameter to "Biopython" by default but this can be overridden. Currently leaving out the email does cause Biopython to give a warning. Peter From maj at fortinbras.us Wed Mar 24 14:48:55 2010 From: maj at fortinbras.us (Mark A. Jensen) Date: Wed, 24 Mar 2010 10:48:55 -0400 Subject: [Open-bio-l] Fwd: [Utilities-announce] NCBI Revised E-utility UsagePolicy In-Reply-To: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com> References: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com> Message-ID: <9834AA2D2FCC4F918C538A7B7C152B1D@NewLife> Hey Peter-- thanks for this heads-up-- cheers MAJ ----- Original Message ----- From: "Peter" To: Cc: "bioperl-l list" ; "Biopython-Dev Mailing List" ; ; Sent: Wednesday, March 24, 2010 10:08 AM Subject: [Open-bio-l] Fwd: [Utilities-announce] NCBI Revised E-utility UsagePolicy Hi, This is probably of interest to all the Bio* projects offering access to the NCBI Entrez utilities.
See forwarded message below. I *think* the new guidelines basically say that the email & tool parameters are optional BUT if your IP address ever gets banned for excessive use you then have to register an email & tool combination. Regarding the email address, the NCBI say to use the email of the developer (not the end user). However, they do not distinguish between the developers of a library (like us), and the developers of an application or script using a library (who may also be the end user). Currently we (Biopython) and I think BioPerl ask developers using our libraries to populate the email address themselves. I *think* this is still the right action. Peter ---------- Forwarded message ---------- From: Date: Wed, Mar 24, 2010 at 1:53 PM Subject: [Utilities-announce] NCBI Revised E-utility Usage Policy To: NLM/NCBI List utilities-announce New E-utility documentation now on the NCBI Bookshelf The Entrez Programming Utilities (E-Utilities) Help documentation has been added to the NCBI Bookshelf, and so is now fully integrated with the Entrez search and retrieval system as a part of the Bookshelf database. This help document has been divided into chapters for better organization and includes several new sample Perl scripts. At present this book covers the standard URL interface for the E-utilties; material about the SOAP interface will be added soon and is still available at the same URL: http://eutils.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html. Revised E-utility usage policy In December, 2009 NCBI announced a change to the usage policy for the E-utilities that would require all requests to contain non-null values for both the &email and &tool parameters. 
After several consultations with our users and developers, we have decided to revise this policy change, and the revised policy is described in detail at the following link: http://www.ncbi.nlm.nih.gov/bookshelf/br.fcgi?book=helpeutils&part=chapter2#chapter2.Usage_Guidelines_and_Requiremen Please let us know if you have any questions or concerns about this policy change. Thank you, The E-Utilities Team NIH/NLM/NCBI eutilities at ncbi.nlm.nih.gov. _______________________________________________ Utilities-announce mailing list http://www.ncbi.nlm.nih.gov/mailman/listinfo/utilities-announce -------------------------------------------------------------------------------- > _______________________________________________ > Open-Bio-l mailing list > Open-Bio-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/open-bio-l > From hlapp at drycafe.net Wed Mar 24 15:27:37 2010 From: hlapp at drycafe.net (Hilmar Lapp) Date: Wed, 24 Mar 2010 11:27:37 -0400 Subject: [Open-bio-l] [Bioperl-l] Fwd: [Utilities-announce] NCBI Revised E-utility Usage Policy In-Reply-To: <320fb6e01003240751v2afd5d5bwa39590afa9b13209@mail.gmail.com> References: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com> <38D43B03-4A85-48CB-913A-CD564EB5168C@illinois.edu> <320fb6e01003240751v2afd5d5bwa39590afa9b13209@mail.gmail.com> Message-ID: <5D427F97-706E-4F66-95BA-2B397520C4FA@drycafe.net> On Mar 24, 2010, at 10:51 AM, Peter wrote: > Please give the NCBI an email - you can CC me too if you like. Can't this be the developers' mailing list (or lists, the appropriate one for each toolkit)? We can even whitelist all NCBI sender addresses so they can easily email us if there are issues. 
-hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : =========================================================== From cjfields at illinois.edu Wed Mar 24 15:31:34 2010 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 24 Mar 2010 10:31:34 -0500 Subject: [Open-bio-l] Toolkits and the new eutils policies Message-ID: To whom it may concern, Just want to get some clarification from the eutils folks with the changes. I'm a core developer for the BioPerl toolkit and in collaboration with several other Bio* toolkits (BioPython, BioRuby, BioJava). We all have interfaces to eutils, either via the standard interface, SOAP-based, or both. We're seeking a clarification regarding the new rules, specifically the rules concerning 'tool' and 'email'. Per the rules proposed in Dec. 2009, many of us have already implemented changes to address them. As we're basically language-specific toolkits (classes, modules, etc) that are used for developing other applications, we have taken the stance that the end-user will need to start providing at least the email in order to properly use the eutils-specific tools, with a warning issued otherwise. This is based on several reasons, foremost being the toolkits are very widely used (so may be spread over potentially thousands of IPs) and are being used in a large number of downstream applications. As an example of the latter, BioJava is used in several free and commercial applications, such as Taverna and Geneious. Currently, with BioPerl and Biopython, we set the 'tool' parameter specifically to the toolkit name by default. This parameter can be overridden. However, from reading the newest rules it appears that each tool (and thus each toolkit) should have one common email, no more.
This also appears to make the assertion that users using these toolkits (with 'tool' set and using the relevant emails) may essentially be tied down if one IP decides to abuse the rules. This unfortunately makes the incorrect generalization that each tool is created by one developer, and simply does not make sense in large collaborative projects such as ours, where the software is used primarily in downstream applications and scripts. Setting the tool could be beneficial to the development team, but at the same time it could be a tremendous hindrance. So, what exactly should we do? Do we set the 'tool' by default, or leave it to the user? Similarly, how do we treat 'email'? If we do set them, would the entire group of users for that tool be blocked if one end-user abuses the system? Should we leave it up to the user to register themselves (both tool and email)? So, we're at an impasse and really need your help. Sincerely, chris Christopher Fields Core Developer, BioPerl Project IGB Postdoctoral Fellow Genomics of Neural & Behavioral Plasticity University of Illinois Urbana-Champaign Institute for Genomic Biology 1206 W. Gregory Dr. , MC-195 Urbana, IL 61801 From biopython at maubp.freeserve.co.uk Wed Mar 24 15:41:59 2010 From: biopython at maubp.freeserve.co.uk (Peter) Date: Wed, 24 Mar 2010 15:41:59 +0000 Subject: [Open-bio-l] [Bioperl-l] Fwd: [Utilities-announce] NCBI Revised E-utility Usage Policy In-Reply-To: <5D427F97-706E-4F66-95BA-2B397520C4FA@drycafe.net> References: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com> <38D43B03-4A85-48CB-913A-CD564EB5168C@illinois.edu> <320fb6e01003240751v2afd5d5bwa39590afa9b13209@mail.gmail.com> <5D427F97-706E-4F66-95BA-2B397520C4FA@drycafe.net> Message-ID: <320fb6e01003240841s5d127ff7v72e425ee00aec34c@mail.gmail.com> On Wed, Mar 24, 2010 at 3:27 PM, Hilmar Lapp wrote: > > > On Mar 24, 2010, at 10:51 AM, Peter wrote: > >> Please give the NCBI an email - you can CC me too if you like. 
> > Can't this be the developers' mailing list (or lists, the appropriate one > for each toolkit)? We can even whitelist all NCBI sender addresses so they > can easily email us if there are issues. > I'd wondered about signing up the dev mailing lists to the NCBI E-utility mailing list but I'm not sure how the passwords would work (the monthly reminder would also get sent to the list): http://www.ncbi.nlm.nih.gov/mailman/listinfo/utilities-announce But for a discussion of the guidelines, then yes, why not CC the open-bio-l at lists.open-bio.org list and we can manually moderate and/or white list the NCBI. Peter From cjfields at illinois.edu Wed Mar 24 15:44:21 2010 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 24 Mar 2010 10:44:21 -0500 Subject: [Open-bio-l] [Bioperl-l] Fwd: [Utilities-announce] NCBI Revised E-utility Usage Policy In-Reply-To: <320fb6e01003240751v2afd5d5bwa39590afa9b13209@mail.gmail.com> References: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com> <38D43B03-4A85-48CB-913A-CD564EB5168C@illinois.edu> <320fb6e01003240751v2afd5d5bwa39590afa9b13209@mail.gmail.com> Message-ID: <338BDDD8-2A66-4086-BFB7-35EC8F8F0D66@illinois.edu> On Mar 24, 2010, at 9:51 AM, Peter wrote: > On Wed, Mar 24, 2010 at 2:37 PM, Chris Fields wrote: >> >> On Mar 24, 2010, at 9:08 AM, Peter wrote: >> >>> Hi, >>> >>> This is probably of interest to all the Bio* projects offering access >>> to the NCBI Entrez utilities. See forwarded message below. >>> >>> I *think* the new guidelines basically say that the email & tool parameters are >>> optional BUT if your IP address ever gets banned for excessive use you then >>> have to register an email & tool combination. >>> >>> Regarding the email address, the NCBI say to use the email of the developer >>> (not the end user). However, they do not distinguish between the developers >>> of a library (like us), and the developers of an application or script using a >>> library (who may also be the end user).
>>> >>> Currently we (Biopython) and I think BioPerl ask developers using our libraries >>> to populate the email address themselves. I *think* this is still the >>> right action. >>> >>> Peter >> >> >> Basically, that's the same tactic I'm going with with Bio::DB::EUtilities (and I >> think with the SOAP-based ones as well). We're providing a specific set of >> tools for user to write up their own applications end applications. I can try >> contacting them regarding this to get an official response to clarify this >> somewhat. > > Please give the NCBI an email - you can CC me too if you like. Sent, have cc'd the open-bio list. Don't want to cross-post this too much, so I think we should move the discussion there. >> Re: the tool parameter, we currently set the tool itself to 'BioPerl' as a >> default, but always leave the email blank and issue a warning if it isn't >> set. We could just as easily leave both blank and issue warnings for both. > > We currently leave out the email and set the tool parameter to "Biopython" > by default but this can be overridden. Currently leaving out the email does > cause Biopython to give a warning. > > Peter We follow the same, then (down to the warning). This is mentioned in my post to them, I'll wait to see what they say. My concern is the wording of the new rules. Each tool and email must be registered with them if an IP is blocked. Does this mean each tool is assigned one specific email? And an IP that is blocked can register it to be allowed back into the fold? With that in mind, should we register each of our toolkits with them? Probably not a bad thing (it might help us as devs to get an idea of use), but then if one user abuses the rules will their actions affect all toolkit users? Is this all done on a per-IP basis, per-toolkit basis, etc? Unfortunately, at least to me, none of this is made very clear, so I'm hoping there is some clarification from their end. 
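The two behaviours under discussion in this thread (the status quo of defaulting 'tool' to the toolkit name and warning when 'email' is unset, versus the stricter option of raising an exception) can be sketched roughly as follows. This is only an illustrative Python sketch, not code from BioPerl or Biopython: the function name build_eutils_url, the strict flag, the MissingIdentityError class and the base URL are all assumptions introduced here for illustration and do not come from the thread.

```python
import warnings
from urllib.parse import urlencode

# Assumed base URL for the E-utilities URL interface (not given in this thread).
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/"


class MissingIdentityError(ValueError):
    """Hypothetical error raised in strict mode when tool or email is unset."""


def build_eutils_url(utility, params, tool="BioPerl", email=None, strict=False):
    """Build a request URL, adding 'tool' and 'email' when they are set.

    strict=False mirrors the status quo: warn and carry on.
    strict=True mirrors the 'throw an exception' proposal.
    """
    missing = [name for name, value in (("tool", tool), ("email", email)) if not value]
    if missing:
        message = "E-utilities request without: " + ", ".join(missing)
        if strict:
            raise MissingIdentityError(message)
        warnings.warn(message, UserWarning, stacklevel=2)

    query = dict(params)  # copy so the caller's dict is left untouched
    if tool:
        query["tool"] = tool
    if email:
        query["email"] = email
    return EUTILS_BASE + utility + ".fcgi?" + urlencode(query)
```

With the defaults above, a call that passes no email still produces a URL but emits a warning, while passing strict=True turns the same omission into an error the calling script has to deal with. Which of the two a toolkit should ship by default is exactly the open question here.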
chris From maj at fortinbras.us Wed Mar 24 16:37:56 2010 From: maj at fortinbras.us (Mark A. Jensen) Date: Wed, 24 Mar 2010 12:37:56 -0400 Subject: [Open-bio-l] [Bioperl-l] Fwd: [Utilities-announce] NCBI RevisedE-utility Usage Policy In-Reply-To: <5D427F97-706E-4F66-95BA-2B397520C4FA@drycafe.net> References: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com><38D43B03-4A85-48CB-913A-CD564EB5168C@illinois.edu><320fb6e01003240751v2afd5d5bwa39590afa9b13209@mail.gmail.com> <5D427F97-706E-4F66-95BA-2B397520C4FA@drycafe.net> Message-ID: I think this is a great idea--- MAJ ----- Original Message ----- From: "Hilmar Lapp" To: "Peter" Cc: ; "Biopython-Dev Mailing List" ; ; "bioperl-l list" ; "Chris Fields" ; Sent: Wednesday, March 24, 2010 11:27 AM Subject: Re: [Bioperl-l] [Open-bio-l] Fwd: [Utilities-announce] NCBI RevisedE-utility Usage Policy > > On Mar 24, 2010, at 10:51 AM, Peter wrote: > >> Please give the NCBI an email - you can CC me too if you like. > > > Can't this be the developers' mailing list (or lists, the appropriate one for > each toolkit)? We can even whitelist all NCBI sender addresses so they can > easily email us if there are issues. 
> > -hilmar > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : > =========================================================== > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From andy.jenkinson at ebi.ac.uk Wed Mar 24 16:24:59 2010 From: andy.jenkinson at ebi.ac.uk (Andy Jenkinson) Date: Wed, 24 Mar 2010 16:24:59 +0000 Subject: [Open-bio-l] Fwd: [Utilities-announce] NCBI Revised E-utility Usage Policy In-Reply-To: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com> References: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com> Message-ID: <54C09681-5D8A-461F-8A8D-090E50DAE340@ebi.ac.uk> From my experience, if you set a default value for something and there is very little advantage to changing it, people will rarely bother to do so. The library developer's email address is not very useful for NCBI, who I assume wish to use it to contact whoever is consuming their resources. Being able to contact the Bio* developer doesn't really allow them to do this. The Bio* mailing list would be an option because there is at least some chance the app developer will get the email, but on balance I think it'd be better to incentivise people to change it themselves. So I would say: leave it blank and give a warning. Cheers, Andy On 24 Mar 2010, at 14:08, Peter wrote: > Hi, > > This is probably of interest to all the Bio* projects offering access > to the NCBI > Entrez utilities. See forwarded message below. > > I *think* the new guidelines basically say that the email & tool parameters are > optional BUT if your IP address ever gets banned for excessive use you then > have to register an email & tool combination. > > Regarding the email address, the NCBI say to use the email of the developer > (not the end user).
However, they do not distinguish between the developers > of a library (like us), and the developers of an application or script using a > library (who may also be the end user). > > Currently we (Biopython) and I think BioPerl ask developers using our libraries > to populate the email address themselves. I *think* this is still the > right action. > > Peter > > ---------- Forwarded message ---------- > From: > Date: Wed, Mar 24, 2010 at 1:53 PM > Subject: [Utilities-announce] NCBI Revised E-utility Usage Policy > To: NLM/NCBI List utilities-announce > > > New E-utility documentation now on the NCBI Bookshelf > > The Entrez Programming Utilities (E-Utilities) Help documentation has > been added to the NCBI Bookshelf, and so is now fully integrated with > the Entrez search and retrieval system as a part of the Bookshelf > database. This help document has been divided into chapters for better > organization and includes several new sample Perl scripts. At present > this book covers the standard URL interface for the E-utilties; > material about the SOAP interface will be added soon and is still > available at the same URL: > http://eutils.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html. > > > > Revised E-utility usage policy > > In December, 2009 NCBI announced a change to the usage policy for the > E-utilities that would require all requests to contain non-null values > for both the &email and &tool parameters. After several consultations > with our users and developers, we have decided to revise this policy > change, and the revised policy is described in detail at the following > link: > > http://www.ncbi.nlm.nih.gov/bookshelf/br.fcgi?book=helpeutils&part=chapter2#chapter2.Usage_Guidelines_and_Requiremen > > Please let us know if you have any questions or concerns about this > policy change. > > > > Thank you, > > The E-Utilities Team > > NIH/NLM/NCBI > > eutilities at ncbi.nlm.nih.gov. 
> > > > _______________________________________________ > Utilities-announce mailing list > http://www.ncbi.nlm.nih.gov/mailman/listinfo/utilities-announce > _______________________________________________ > Open-Bio-l mailing list > Open-Bio-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/open-bio-l From cjfields at illinois.edu Wed Mar 24 18:21:51 2010 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 24 Mar 2010 13:21:51 -0500 Subject: [Open-bio-l] Fwd: [Utilities-announce] NCBI Revised E-utility Usage Policy In-Reply-To: <54C09681-5D8A-461F-8A8D-090E50DAE340@ebi.ac.uk> References: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com> <54C09681-5D8A-461F-8A8D-090E50DAE340@ebi.ac.uk> Message-ID: Neither the eutils notification nor the new eutils docs make this very clear. For instance, from reading the documentation, one would only have to register the tool and email once an IP is blocked. However, later on it is indicated that the values supplied must be registered with NCBI or they will be blocked, which (to me at least) reads as if they must be registered regardless. Which is it? Also, there is the bit about the tool and email belonging to the software developer or organization, not the end-user, likely for the reasons Hilmar mentions. Does this mean each tool has one assigned email? This would then mean we need to either set both and register them just in case, or leave both empty and warn the user. We have a bit of time to work out the specifics, just hoping NCBI responds (one never knows with them). chris On Mar 24, 2010, at 11:24 AM, Andy Jenkinson wrote: >> From my experience, if you set a default value for something and there is very little advantage to changing it, people will rarely bother to do so. > > The library developer's email address is not very useful for NCBI, who I assume wish to use it to contact whoever is consuming their resources. 
Being able to contact the Bio* developer doesn't really allow them to do this. The Bio* mailing list would be an option because there is at least some chance the app developer will get the email, but on balance I think it'd be better to incentivise people to change it themselves. > > So I would say: leave it blank and give a warning. > > Cheers, > Andy > > On 24 Mar 2010, at 14:08, Peter wrote: > >> Hi, >> >> This is probably of interest to all the Bio* projects offering access >> to the NCBI >> Entrez utilities. See forwarded message below. >> >> I *think* the new guidelines basically say that the email & tool parameters are >> optional BUT if your IP address ever gets banned for excessive use you then >> have to register an email & tool combination. >> >> Regarding the email address, the NCBI say to use the email of the developer >> (not the end user). However, they do not distinguish between the developers >> of a library (like us), and the developers of an application or script using a >> library (who may also be the end user). >> >> Currently we (Biopython) and I think BioPerl ask developers using our libraries >> to populate the email address themselves. I *think* this is still the >> right action. >> >> Peter >> >> ---------- Forwarded message ---------- >> From: >> Date: Wed, Mar 24, 2010 at 1:53 PM >> Subject: [Utilities-announce] NCBI Revised E-utility Usage Policy >> To: NLM/NCBI List utilities-announce >> >> >> New E-utility documentation now on the NCBI Bookshelf >> >> The Entrez Programming Utilities (E-Utilities) Help documentation has >> been added to the NCBI Bookshelf, and so is now fully integrated with >> the Entrez search and retrieval system as a part of the Bookshelf >> database. This help document has been divided into chapters for better >> organization and includes several new sample Perl scripts. 
At present >> this book covers the standard URL interface for the E-utilties; >> material about the SOAP interface will be added soon and is still >> available at the same URL: >> http://eutils.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html. >> >> >> >> Revised E-utility usage policy >> >> In December, 2009 NCBI announced a change to the usage policy for the >> E-utilities that would require all requests to contain non-null values >> for both the &email and &tool parameters. After several consultations >> with our users and developers, we have decided to revise this policy >> change, and the revised policy is described in detail at the following >> link: >> >> http://www.ncbi.nlm.nih.gov/bookshelf/br.fcgi?book=helpeutils&part=chapter2#chapter2.Usage_Guidelines_and_Requiremen >> >> Please let us know if you have any questions or concerns about this >> policy change. >> >> >> >> Thank you, >> >> The E-Utilities Team >> >> NIH/NLM/NCBI >> >> eutilities at ncbi.nlm.nih.gov. >> >> >> >> _______________________________________________ >> Utilities-announce mailing list >> http://www.ncbi.nlm.nih.gov/mailman/listinfo/utilities-announce >> _______________________________________________ >> Open-Bio-l mailing list >> Open-Bio-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/open-bio-l > > > _______________________________________________ > Open-Bio-l mailing list > Open-Bio-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/open-bio-l From birney at ebi.ac.uk Wed Mar 24 19:08:25 2010 From: birney at ebi.ac.uk (Ewan Birney) Date: Wed, 24 Mar 2010 19:08:25 +0000 Subject: [Open-bio-l] Fwd: [Utilities-announce] NCBI Revised E-utility Usage Policy In-Reply-To: References: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com> <54C09681-5D8A-461F-8A8D-090E50DAE340@ebi.ac.uk> Message-ID: <4168F915-DFF8-4C97-B0E9-7058A777E102@ebi.ac.uk> Sorry to perk up here, but I think the right thing is to throw an exception if it's called 
without a "tool" and "email" parameter. Of course, then a client programmer can abuse this, but they are forced to put something in there. On 24 Mar 2010, at 18:21, Chris Fields wrote: > Neither the eutils notification nor the new eutils docs make this > very clear. For instance, from reading the documentation, one would > only have to register the tool and email once an IP is blocked. > However, later on it is indicated that the values supplied must be > registered with NCBI or they will be blocked, which (to me at least) > reads as if they must be registered regardless. Which is it? > > Also, there is the bit about the tool and email belonging to the > software developer or organization, not the end-user, likely for the > reasons Hilmar mentions. Does this mean each tool has one assigned > email? This would then mean we need to either set both and register > them just in case, or leave both empty and warn the user. > > We have a bit of time to work out the specifics, just hoping NCBI > responds (one never knows with them). > > chris
From cjfields at illinois.edu Thu Mar 25 04:51:25 2010 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 24 Mar 2010 23:51:25 -0500 Subject: [Open-bio-l] Fwd: [Utilities-announce] NCBI Revised E-utility Usage Policy In-Reply-To: <4168F915-DFF8-4C97-B0E9-7058A777E102@ebi.ac.uk> References: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com> <54C09681-5D8A-461F-8A8D-090E50DAE340@ebi.ac.uk> <4168F915-DFF8-4C97-B0E9-7058A777E102@ebi.ac.uk> Message-ID: <0E6B9B67-A222-47F4-A533-96847D030D54@illinois.edu> Yep, tend to agree. And the client program would be ultimately responsible for whatever is entered in.
Of course, we'll want to register certain tools/emails anyway ('BioPerl' possibly along with a white-listed email to the mailing list), just to be on the safe side. Don't want the hassle of dealing with someone else pretending to be one of the various Bio*. chris On Mar 24, 2010, at 2:08 PM, Ewan Birney wrote: > > Sorry to perk up here, but I think the right thing is to throw an exception > if it's called without a "tool" and "email" parameter. Of course, then a client > programmer can abuse this, but they are forced to put something in there.
From birney at ebi.ac.uk Thu Mar 25 11:07:30 2010 From: birney at ebi.ac.uk (Ewan Birney) Date: Thu, 25 Mar 2010 11:07:30 +0000 (GMT) Subject: [Open-bio-l] Fwd: [Utilities-announce] NCBI Revised E-utility Usage Policy In-Reply-To: <0E6B9B67-A222-47F4-A533-96847D030D54@illinois.edu> References: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com> <54C09681-5D8A-461F-8A8D-090E50DAE340@ebi.ac.uk> <4168F915-DFF8-4C97-B0E9-7058A777E102@ebi.ac.uk> <0E6B9B67-A222-47F4-A533-96847D030D54@illinois.edu> Message-ID: Oddly, I don't think you want to be registering BioPerl as a client with an email. Rather the Bioperl libraries should prevent a client programmer from using the functions without an email and program type entered. This forces the decision onto the client programmer. -- ----------------------------------------------------------------- Ewan Birney.
Work: +44 1223 494420 Email: birney "at" ebi.ac.uk Clerical Assistant: shelley "at" ebi.ac.uk Please cc shelley for urgent or diary-dependent requests ----------------------------------------------------------------- From biopython at maubp.freeserve.co.uk Thu Mar 25 11:19:09 2010 From: biopython at maubp.freeserve.co.uk (Peter) Date: Thu, 25 Mar 2010 11:19:09 +0000 Subject: [Open-bio-l] Fwd: [Utilities-announce] NCBI Revised E-utility Usage Policy In-Reply-To: References: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com> <54C09681-5D8A-461F-8A8D-090E50DAE340@ebi.ac.uk> <4168F915-DFF8-4C97-B0E9-7058A777E102@ebi.ac.uk> <0E6B9B67-A222-47F4-A533-96847D030D54@illinois.edu> Message-ID: <320fb6e01003250419q1ed0af70g913a6ab4a5de23b5@mail.gmail.com> On Thu, Mar 25, 2010 at 11:07 AM, Ewan Birney wrote: > > Oddly, I don't think you want to be registering BioPerl as a client > with an email. Rather the Bioperl libraries should prevent a client > programmer from using the functions without an email and program type > entered. This forces the decision onto the client programmer. > I think for most cases, having a Bio* mailing list as a default Entrez email address is pointless (we have almost no control over how end users will call the Entrez functions, if they will use the history or not, etc). The current behaviour of defaulting to no email but raising a warning seems OK. As Ewan says (and the NCBI earlier said they would required), changing this to make the email mandatory is also a sensible option. There is a special case for running the Bio* unit tests, where it might make sense to include the developer's mailing list (and maybe set the tool to something like "BioPerl-unittests" rather than just "BioPerl"). 
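For illustration only, here is a minimal Python sketch of the two behaviours weighed in this thread: warn when the email or tool is missing (the current default described above), or refuse to proceed (the mandatory option Ewan suggests). The names build_entrez_params and MissingEntrezParameterError are hypothetical and are not part of Biopython or BioPerl; the "BioPerl-unittests" value is the unit-test tool name floated above.

```python
import warnings


class MissingEntrezParameterError(ValueError):
    """Hypothetical error raised when email or tool has not been supplied."""


def build_entrez_params(email=None, tool=None, strict=True, **extra):
    """Return a dict of query parameters for an E-utilities request.

    strict=True  -> raise if email or tool is missing (mandatory option).
    strict=False -> warn and leave the missing parameter out (warn-only option).
    """
    missing = [name for name, value in (("email", email), ("tool", tool))
               if not value]
    if missing:
        message = "E-utilities parameter(s) not set: " + ", ".join(missing)
        if strict:
            raise MissingEntrezParameterError(message)
        warnings.warn(message, UserWarning)

    params = dict(extra)
    if email:
        params["email"] = email
    if tool:
        params["tool"] = tool
    return params


if __name__ == "__main__":
    # A unit-test run could pass its own tool name, as suggested above.
    print(build_entrez_params(email="dev-list@example.org",
                              tool="BioPerl-unittests", db="pubmed"))
```

With strict=True the decision is pushed onto the client programmer, since no request parameters are produced until both values are supplied.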
Peter From pmiguel at purdue.edu Thu Mar 25 11:44:10 2010 From: pmiguel at purdue.edu (Phillip San Miguel) Date: Thu, 25 Mar 2010 07:44:10 -0400 Subject: [Open-bio-l] [Bioperl-l] Fwd: [Utilities-announce] NCBI Revised E-utility Usage Policy In-Reply-To: <338BDDD8-2A66-4086-BFB7-35EC8F8F0D66@illinois.edu> References: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com> <38D43B03-4A85-48CB-913A-CD564EB5168C@illinois.edu> <320fb6e01003240751v2afd5d5bwa39590afa9b13209@mail.gmail.com> <338BDDD8-2A66-4086-BFB7-35EC8F8F0D66@illinois.edu> Message-ID: <4BAB4C8A.3050903@purdue.edu> Chris Fields wrote: > On Mar 24, 2010, at 9:51 AM, Peter wrote: > > >> On Wed, Mar 24, 2010 at 2:37 PM, Chris Fields wrote: >> >>> >> Please give the NCBI an email - you can CC me too if you like. >> > > Sent, have cc'd the open-bio list. Don't want to cross-post this too much, so I think we should move the discussion there. > > >>> Re: the tool parameter, we currently set the tool itself to 'BioPerl' as a >>> default, but always leave the email blank and issue a warning if it isn't >>> set. We could just as easily leave both blank and issue warnings for both. >>> >> We currently leave out the email and set the tool parameter to "Biopython" >> by default but this can be overridden. Currently leaving out the email does >> cause Biopython to give a warning. >> >> Peter >> > > We follow the same, then (down to the warning). This is mentioned in my post to them, I'll wait to see what they say. > > My concern is the wording of the new rules. Each tool and email must be registered with them if an IP is blocked. Does this mean each tool is assigned one specific email? And an IP that is blocked can register it to be allowed back into the fold? With that in mind, should we register each of our toolkits with them? Probably not a bad thing (it might help us as devs to get an idea of use), but then if one user abuses the rules will their actions affect all toolkit users? 
Is this all done on a per-IP basis, per-toolkit basis, etc? > > Unfortunately, at least to me, none of this is made very clear, so I'm hoping there is some clarification from their end. > > chris > Maybe GenBank is hoping that developers will create Genbank rules-compliant modules when accessing their resources. That is, for EUtilities by default, the tools would check the local time and cut off requests to 100 if outside the hours of 9PM-5AM Eastern Time. Also the number of requests could be limited to 3 per second. But it seems like it would be better if Genbank would return some sort of "load" field with the response to each request. That would allow feedback control of a series of requests. It could be tuned however Genbank likes, but past a certain threshold the client program would know that another request within a certain amount of time will result in the IP being banned. -- Phillip From biopython at maubp.freeserve.co.uk Thu Mar 25 12:10:26 2010 From: biopython at maubp.freeserve.co.uk (Peter) Date: Thu, 25 Mar 2010 12:10:26 +0000 Subject: [Open-bio-l] [Bioperl-l] Fwd: [Utilities-announce] NCBI Revised E-utility Usage Policy In-Reply-To: <4BAB4C8A.3050903@purdue.edu> References: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com> <38D43B03-4A85-48CB-913A-CD564EB5168C@illinois.edu> <320fb6e01003240751v2afd5d5bwa39590afa9b13209@mail.gmail.com> <338BDDD8-2A66-4086-BFB7-35EC8F8F0D66@illinois.edu> <4BAB4C8A.3050903@purdue.edu> Message-ID: <320fb6e01003250510q3593c1dexcb1bc360d26b2682@mail.gmail.com> On Thu, Mar 25, 2010 at 11:44 AM, Phillip San Miguel wrote: > > Maybe GenBank is hoping that developers will create Genbank rules-compliant > modules when accessing their resources. That is, for EUtilities by default, > the tools would check the local time and cut off requests to 100 if outside > the hours of 9PM-5AM Eastern Time. Also the number of requests could be > limited to 3 per second. 
Biopython (and I assume BioPerl et al) already enforces the Entrez 3 requests per second rule. That bit is easy. If we assume the user has their timezone information setup right, it should also be possible to count the number of requests made within the hours of 9AM to 5PM Eastern Time and issue a warning or raise an error if over 100. Currently Biopython leaves this to the user. Interestingly the older guideline text here gives the 100 limit, http://www.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html#UserSystemRequirements but the new text does not: http://www.ncbi.nlm.nih.gov/bookshelf/br.fcgi?book=helpeutils&part=chapter2#chapter2.Usage_Guidelines_and_Requiremen > But it seems like it would be better if Genbank would return some sort of > "load" field with the response to each request. That would allow feedback > control of a series of requests. It could be tuned however Genbank likes, > but past a certain threshold the client program would know that another > request within a certain amount of time will result in the IP being banned. Interesting idea - could be useful for large jobs. Peter P.S. We're talking about more than just GenBank here - The Entrez utilities cover multiple databases. 
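As a rough illustration of the "3 requests per second" rule mentioned above, the following Python sketch spaces successive requests at least a third of a second apart. It is an assumption-laden example, not the code Biopython or BioPerl actually use; the class name RequestThrottle and its parameters are hypothetical.

```python
import time


class RequestThrottle:
    """Enforce a minimum delay between successive calls to wait()."""

    def __init__(self, max_per_second=3, clock=time.monotonic, sleep=time.sleep):
        # With 3 requests per second, consecutive requests must be
        # at least 1/3 of a second apart.
        self.min_interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        """Block until enough time has passed since the previous request."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


if __name__ == "__main__":
    throttle = RequestThrottle(max_per_second=3)
    for i in range(5):
        throttle.wait()  # call this immediately before each Entrez request
        print("request", i, "at", round(time.monotonic(), 2))
```

The clock and sleep functions are injectable only so the spacing logic can be checked without real delays.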
From biopython at maubp.freeserve.co.uk Thu Mar 25 12:18:38 2010 From: biopython at maubp.freeserve.co.uk (Peter) Date: Thu, 25 Mar 2010 12:18:38 +0000 Subject: [Open-bio-l] [Bioperl-l] Fwd: [Utilities-announce] NCBI Revised E-utility Usage Policy In-Reply-To: <320fb6e01003250510q3593c1dexcb1bc360d26b2682@mail.gmail.com> References: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com> <38D43B03-4A85-48CB-913A-CD564EB5168C@illinois.edu> <320fb6e01003240751v2afd5d5bwa39590afa9b13209@mail.gmail.com> <338BDDD8-2A66-4086-BFB7-35EC8F8F0D66@illinois.edu> <4BAB4C8A.3050903@purdue.edu> <320fb6e01003250510q3593c1dexcb1bc360d26b2682@mail.gmail.com> Message-ID: <320fb6e01003250518y7219d055ja8f67de8b15a7bdc@mail.gmail.com> On Thu, Mar 25, 2010 at 12:10 PM, Peter wrote: > If we assume the user has their timezone information setup right, it should > also be possible to count the number of requests made within the hours of > 9AM to 5PM Eastern Time and issue a warning or raise an error if over 100. > Currently Biopython leaves this to the user. > > Interestingly the older guideline text here gives the 100 limit, > http://www.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html#UserSystemRequirements > but the new text does not: > http://www.ncbi.nlm.nih.gov/bookshelf/br.fcgi?book=helpeutils&part=chapter2#chapter2.Usage_Guidelines_and_Requiremen Eastern Standard Time (EST) is 5 hours behind Coordinated Universal Time (UTC), aka Greenwich Mean Time (GMT), thus 09:00 to 17:00 EST is 14:00 to 22:00 UTC. I notice the NCBI do not appear to mention summer/winter time (daylight saving time), which may be an oversight.
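On the daylight saving point, a small Python sketch shows one way a client could test for the off-peak window without hard-coding a UTC offset. It assumes Python 3.9+ with time zone data available for zoneinfo, and it assumes "Eastern Time" means the America/New_York zone; the window used (weekends, or 9 PM to 5 AM on weekdays) is taken from the guideline wording quoted in this thread. The function names are hypothetical.

```python
from datetime import datetime, time, timezone
from zoneinfo import ZoneInfo

# Assumption: NCBI's "Eastern Time" corresponds to this IANA zone.
EASTERN = ZoneInfo("America/New_York")


def eastern_utc_offset_hours(moment):
    """Return the Eastern offset from UTC, in hours, at an aware datetime."""
    return moment.astimezone(EASTERN).utcoffset().total_seconds() / 3600


def is_off_peak(moment):
    """True if an aware datetime is on a weekend or between 9 PM and 5 AM Eastern."""
    local = moment.astimezone(EASTERN)
    if local.weekday() >= 5:  # Saturday is 5, Sunday is 6
        return True
    return local.time() >= time(21, 0) or local.time() < time(5, 0)


if __name__ == "__main__":
    winter = datetime(2010, 1, 13, 15, 0, tzinfo=timezone.utc)
    spring = datetime(2010, 3, 25, 15, 0, tzinfo=timezone.utc)
    # The offset changes once daylight saving time is in effect.
    print(eastern_utc_offset_hours(winter), eastern_utc_offset_hours(spring))
    print(is_off_peak(spring))
```

Converting to the named zone rather than subtracting a fixed 5 hours keeps the check valid across the summer/winter change; pass timezone-aware datetimes only.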
Peter From andy.jenkinson at ebi.ac.uk Thu Mar 25 12:50:01 2010 From: andy.jenkinson at ebi.ac.uk (Andy Jenkinson) Date: Thu, 25 Mar 2010 12:50:01 +0000 Subject: [Open-bio-l] Fwd: [Utilities-announce] NCBI Revised E-utility Usage Policy In-Reply-To: References: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com> <54C09681-5D8A-461F-8A8D-090E50DAE340@ebi.ac.uk> <4168F915-DFF8-4C97-B0E9-7058A777E102@ebi.ac.uk> <0E6B9B67-A222-47F4-A533-96847D030D54@illinois.edu> Message-ID: I think Chris meant to register them with NCBI rather than use them as default values. Purely to prevent application developers registering their applications as "BioPerl". I think we all agree that default values would not be helpful! On 25 Mar 2010, at 11:07, Ewan Birney wrote: > > Oddly, I don't think you want to be registering BioPerl as a client > with an email. Rather the Bioperl libraries should prevent a client > programmer from using the functions without an email and program type > entered. This forces the decision onto the client programmer. > > > -- > ----------------------------------------------------------------- > Ewan Birney. Work: +44 1223 494420 > Email: birney "at" ebi.ac.uk > Clerical Assistant: shelley "at" ebi.ac.uk > Please cc shelley for urgent or diary-dependent requests > ----------------------------------------------------------------- From cjfields at illinois.edu Thu Mar 25 13:10:43 2010 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 25 Mar 2010 08:10:43 -0500 Subject: [Open-bio-l] Fwd: [Utilities-announce] NCBI Revised E-utility Usage Policy In-Reply-To: References: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com> <54C09681-5D8A-461F-8A8D-090E50DAE340@ebi.ac.uk> <4168F915-DFF8-4C97-B0E9-7058A777E102@ebi.ac.uk> <0E6B9B67-A222-47F4-A533-96847D030D54@illinois.edu> Message-ID: Andy, Ewan, Yes, that's what I meant; I do not think a set of defaults is a good idea. 
The other advantage to registering them is the list would get immediate updates from NCBI when changes occur (instead of finding out about them second-hand from other subscribers). The list is very low traffic. From their online docs: 'In addition, developers may request that the value of email be added to the E-utility mailing list that provides announcements of software updates, known bugs and other policy changes affecting the E-utilities.' chris On Mar 25, 2010, at 7:50 AM, Andy Jenkinson wrote: > I think Chris meant to register them with NCBI rather than use them as default values. Purely to prevent application developers registering their applications as "BioPerl". I think we all agree that default values would not be helpful! > > On 25 Mar 2010, at 11:07, Ewan Birney wrote: > >> >> Oddly, I don't think you want to be registering BioPerl as a client >> with an email. Rather the Bioperl libraries should prevent a client >> programmer from using the functions without an email and program type >> entered. This forces the decision onto the client programmer. >> >> >> -- >> ----------------------------------------------------------------- >> Ewan Birney. 
Work: +44 1223 494420 >> Email: birney "at" ebi.ac.uk >> Clerical Assistant: shelley "at" ebi.ac.uk >> Please cc shelley for urgent or diary-dependent requests >> ----------------------------------------------------------------- > From peter at maubp.freeserve.co.uk Thu Mar 25 13:18:05 2010 From: peter at maubp.freeserve.co.uk (Peter) Date: Thu, 25 Mar 2010 13:18:05 +0000 Subject: [Open-bio-l] Fwd: [Utilities-announce] NCBI Revised E-utility Usage Policy In-Reply-To: References: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com> <54C09681-5D8A-461F-8A8D-090E50DAE340@ebi.ac.uk> <4168F915-DFF8-4C97-B0E9-7058A777E102@ebi.ac.uk> <0E6B9B67-A222-47F4-A533-96847D030D54@illinois.edu> Message-ID: <320fb6e01003250618j749eee50y8832e1a2543debd0@mail.gmail.com> On Thu, Mar 25, 2010 at 1:10 PM, Chris Fields wrote: > Andy, Ewan, > > Yes, that's what I meant; I do not think a set of defaults is a good idea. Why? I agree that putting a default project email address in is a bad idea, but having a default tool seems fine. Perhaps I have misunderstood you. If any Biopython/BioPerl user has written a dozen scripts using Entrez should they really be expected to give them all a (unique) tool name in the Entrez requests? Having it default to Biopython/BioPerl seems reasonable to me (in combination with the script writer's email address). The whole hassle about registering a tool+email is only if you need your IP address unblocked, typically if you or someone at your institute or ISP has previously abused the servers. Again we come back to the fact the new NCBI guidelines are still unclear. > The other advantage to registering them is the list would get immediate > updates from NCBI when changes occur (instead of finding out about > them second-hand from other subscribers). The list is very low traffic. Well that is an advantage, but in practice having a few people from each project on the NCBI mailing list isn't a big hassle.
Peter From biopython at maubp.freeserve.co.uk Thu Mar 25 13:35:19 2010 From: biopython at maubp.freeserve.co.uk (Peter) Date: Thu, 25 Mar 2010 13:35:19 +0000 Subject: [Open-bio-l] [Bioperl-l] Fwd: [Utilities-announce] NCBI Revised E-utility Usage Policy In-Reply-To: <320fb6e01003250518y7219d055ja8f67de8b15a7bdc@mail.gmail.com> References: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com> <38D43B03-4A85-48CB-913A-CD564EB5168C@illinois.edu> <320fb6e01003240751v2afd5d5bwa39590afa9b13209@mail.gmail.com> <338BDDD8-2A66-4086-BFB7-35EC8F8F0D66@illinois.edu> <4BAB4C8A.3050903@purdue.edu> <320fb6e01003250510q3593c1dexcb1bc360d26b2682@mail.gmail.com> <320fb6e01003250518y7219d055ja8f67de8b15a7bdc@mail.gmail.com> Message-ID: <320fb6e01003250635k26a614bdhc0c983c42198bfb0@mail.gmail.com> On Thu, Mar 25, 2010 at 12:18 PM, Peter wrote: > On Thu, Mar 25, 2010 at 12:10 PM, Peter wrote: >> If we assume the user has their timezone information setup right, it should >> also be possible to count the number of requests made within the hours of >> 9AM to 5PM Eastern Time and issue a warning or raise an error if over 100. >> Currently Biopython leaves this to the user. >> >> Interestingly the older guideline text here gives the 100 limit, >> http://www.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html#UserSystemRequirements >> but the new text does not: >> http://www.ncbi.nlm.nih.gov/bookshelf/br.fcgi?book=helpeutils&part=chapter2#chapter2.Usage_Guidelines_and_Requiremen > > Eastern Time (EST) is 5 hours behind of Coordinated Universal Time (UTC) > aka Greenwich Mean Time (GMT), thus 09:00 to 17:00 EST is 14:00 to 22:00 > UTC. I notice the NCBI do not appear to mention summer/winter time > (daylight saving time), which may be an oversight. 
> > Peter I've been looking at this more closely, the old guideline was: http://www.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html#UserSystemRequirements "Run retrieval scripts on weekends or between 9 pm and 5 am Eastern Time weekdays for any series of more than 100 requests." This doesn't define a series - for example, would it be OK to run a script making 75 requests every two hours? This could be regarded as multiple separate series each under 100 requests, but the cumulative count over the 8 peak hours is 600 requests. Sadly the new guidelines are even more vague: http://www.ncbi.nlm.nih.gov/bookshelf/br.fcgi?book=helpeutils&part=chapter2#chapter2.Usage_Guidelines_and_Requiremen "... and limit large jobs to either weekends or between 9:00 PM and 5:00 AM Eastern time during weekdays." Not very helpful - maybe leaving this rule down to the user (as Biopython currently does) is the best option. Peter From andy.jenkinson at ebi.ac.uk Thu Mar 25 13:53:20 2010 From: andy.jenkinson at ebi.ac.uk (Andy Jenkinson) Date: Thu, 25 Mar 2010 13:53:20 +0000 Subject: [Open-bio-l] Fwd: [Utilities-announce] NCBI Revised E-utility Usage Policy In-Reply-To: <320fb6e01003250618j749eee50y8832e1a2543debd0@mail.gmail.com> References: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com> <54C09681-5D8A-461F-8A8D-090E50DAE340@ebi.ac.uk> <4168F915-DFF8-4C97-B0E9-7058A777E102@ebi.ac.uk> <0E6B9B67-A222-47F4-A533-96847D030D54@illinois.edu> <320fb6e01003250618j749eee50y8832e1a2543debd0@mail.gmail.com> Message-ID: <58FB83B7-937F-40B4-9158-61863BFA10D1@ebi.ac.uk> On 25 Mar 2010, at 13:18, Peter wrote: > On Thu, Mar 25, 2010 at 1:10 PM, Chris Fields wrote: >> Andy, Ewan, >> >> Yes, that's what I meant; I do not think a set of defaults is a good idea. > > Why? I agree that putting a default project email address in is a bad > idea, but having a default tool seems fine. Perhaps I have misunderstood > you. Yes, I specifically mean the email parameter. 
I guess the only reason for the 'tool' parameter is to go some way to ensuring that the email address does actually belong to the tool developer? Otherwise it doesn't seem a very helpful piece of information to me, especially if it is one tool per email. > Again we come back to the fact the new NCBI guidelines are still > unclear. Indeed From cjfields at illinois.edu Thu Mar 25 15:43:24 2010 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 25 Mar 2010 10:43:24 -0500 Subject: [Open-bio-l] Fwd: [Utilities-announce] NCBI Revised E-utility Usage Policy In-Reply-To: <320fb6e01003250618j749eee50y8832e1a2543debd0@mail.gmail.com> References: <320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com> <54C09681-5D8A-461F-8A8D-090E50DAE340@ebi.ac.uk> <4168F915-DFF8-4C97-B0E9-7058A777E102@ebi.ac.uk> <0E6B9B67-A222-47F4-A533-96847D030D54@illinois.edu> <320fb6e01003250618j749eee50y8832e1a2543debd0@mail.gmail.com> Message-ID: On Mar 25, 2010, at 8:18 AM, Peter wrote: > On Thu, Mar 25, 2010 at 1:10 PM, Chris Fields wrote: >> Andy, Ewan, >> >> Yes, that's what I meant; I do not think a set of defaults is a good idea. > > Why? I agree that putting a default project email address in is a bad > idea, but having a default tool seems fine. Perhaps I have misunderstood > you. > > If any Biopython/BioPerl user has written a dozen scripts using > Entrez should they really be expected to give them all a (unique) tool > name in the Entrez requests? Having it default to Biopython/BioPerl > seems reasonable to me (in combination with the script writer's email > address). > > The whole hassle about registering a tool+email is only if you need your > IP address unblocked, typically if you or someone at your institute or > ISP has previously abused the servers. Let's play devil's advocate. The best way I can think of to describe this is to lay out a possible scenario. 
Suppose an end user (out of possibly thousands of end users, scattered across many IPs) uses one of the eutils modules/classes where the tool is set but the email isn't (our current status). They set 'email' to their local one, and then proceed to somehow abuse NCBI's rules and are blocked. In order to be reinstated, they will have to register both the tool and email with NCBI. Until then, does this block anyone else with the same tool? Just those from that IP? Not clear at the moment. To proceed further, now the user registers the tool and email (both need to be registered to unblock). If I understand the eutils documents correctly, as stated, only one email (supposedly the software developers) is registered per tool (also supposedly the software developers). The 'Bio*' tool name could end up being registered by anyone (wittingly or unwittingly), using their own personal email. If another user uses the same tool name with a different email, would they be blocked? If not, and that user tries to register as above (after subsequent abuse), would a conflict occur and they be notified of the prior registration? Again, it's not clear what happens. We have until probably sometime in May to decide a course of action (June 1 is the enforcement date I believe), but this relies on NCBI clarifying a few things first. The current documentation (at least to me) does seem to indicate that each tool must have a single corresponding email when registered. Unless it is clarified, from my perspective the only safe course that addresses all concerns is to leave both tool and email unset, and then register a respective toolkit/email to keep it within the specific dev group as a safeguard. That last bit is for many reasons I've already outlined; an additional one is the fact that we already have a default set for 'tool' now (and have had one set for a while), so by legacy anyone using older versions will have 'Bio*' preset already when June 1 hits. 
BTW, I don't consider the above scenario out of the realm of possibility, particularly if they truly intend on enforcing the rules this time around. We've had many users who have asked the question 'how can I download my batch of 1,00,000 records via eutils'. Potential lack of common sense doesn't stop the persistent or the desperate. > Again we come back to the fact the new NCBI guidelines are still > unclear. > >> The other advantage to registering them is the list would get immediate >> updates from NCBI when changes occur (instead of finding out about >> them second-hand from other subscribers). The list is very low traffic. > > Well that is an advantage, but in practice having a few people from > each project on the NCBI mailing list isn't a big hassle. > > Peter Right, but my point is there are no intermediaries, the news goes straight to the list. We're not reliant on (possibly busy, possibly absent) developers for second-hand news. We've been bitten by this before many times with NCBI, both with eutils and BLAST changes, a good many which NCBI announced but not passed on to our mailing list. Saying this now, it makes me wonder whether we should have a master list of some sort to gather such announcements that may impact developers (eutils, BLAST, GenBank/EMBL/UniProt releases, etc). chris From peter at maubp.freeserve.co.uk Thu Mar 25 16:13:57 2010 From: peter at maubp.freeserve.co.uk (Peter) Date: Thu, 25 Mar 2010 16:13:57 +0000 Subject: [Open-bio-l] Fwd: [Utilities-announce] NCBI Revised E-utility Usage Policy In-Reply-To: References: <54C09681-5D8A-461F-8A8D-090E50DAE340@ebi.ac.uk> <4168F915-DFF8-4C97-B0E9-7058A777E102@ebi.ac.uk> <0E6B9B67-A222-47F4-A533-96847D030D54@illinois.edu> <320fb6e01003250618j749eee50y8832e1a2543debd0@mail.gmail.com> Message-ID: <320fb6e01003250913k662aba2awd2121a57d1cd8596@mail.gmail.com> On Thu, Mar 25, 2010 at 3:43 PM, Chris Fields wrote: > > Let's play devil's advocate. 
The best way I can think of to > describe this is to lay out a possible scenario. Suppose an > end user (out of possibly thousands of end users, scattered > across many IPs) uses one of the eutils modules/classes where > the tool is set but the email isn't (our current status). > They set 'email' to their local one, and then proceed to > somehow abuse NCBI's rules and are blocked. In order to be > reinstated, they will have to register both the tool and > email with NCBI. Until then, does this block anyone else > with the same tool? Just those from that IP? Not clear at > the moment. I had understood that so far the NCBI just blocked by IP. But only they know for sure, and it may change. > To proceed further, now the user registers the tool and > email (both need to be registered to unblock). If I > understand the eutils documents correctly, as stated, only > one email (supposedly the software developers) is registered > per tool (also supposedly the software developers). The > 'Bio*' tool name could end up being registered by anyone > (wittingly or unwittingly), using their own personal email. Ah. That is possible - although I had been assuming the NCBI would be looking at the *combination* of tool+email, that may not be the case. > If another user uses the same tool name with a different > email, would they be blocked? If not, and that user tries > to register as above (after subsequent abuse), would a > conflict occur and they be notified of the prior > registration? Again, it's not clear what happens. Indeed. > We have until probably sometime in May to decide a course > of action (June 1 is the enforcement date I believe), but > this relies on NCBI clarifying a few things first. The > current documentation (at least to me) does seem to indicate > that each tool must have a single corresponding email when > registered. 
Unless it is clarified, from my perspective > the only safe course that addresses all concerns is to > leave both tool and email unset, and then register a > respective toolkit/email to keep it within the specific > dev group as a safeguard. That last bit is for many > reasons I've already outlined; an additional one is the > fact that we already have a default set for 'tool' now > (and have had one set for a while), so by legacy anyone > using older versions will have 'Bio*' preset already when > June 1 hits. Biopython has also been setting a default tool value (with no default email) for some time. > BTW, I don't consider the above scenario out of the realm > of possibility, particularly if they truly intend on > enforcing the rules this time around. We've had many > users who have asked the question 'how can I download > my batch of 1,00,000 records via eutils'. Potential > lack of common sense doesn't stop the persistent or > the desperate. Agreed - sooner or later someone will do something silly with the Bio* Entrez wrappers. Continuing to play Devil's advocate, let's suppose we had a default tool and email set. That *could* result in the NCBI blocking all users of that Bio* toolkit running with this default tool+email which would be a big problem (even if just for a day or two while talking to the NCBI). Therefore I don't think we should be setting a default tool+email (without clarification from the NCBI). > Right, but my point is there are no intermediaries, > the news goes straight to the list. We're not reliant > on (possibly busy, possibly absent) developers for > second-hand news. We've been bitten by this before > many times with NCBI, both with eutils and BLAST > changes, a good many which NCBI announced but not > passed on to our mailing list. > > Saying this now, it makes me wonder whether we should > have a master list of some sort to gather such > announcements that may impact developers (eutils, > BLAST, GenBank/EMBL/UniProt releases, etc). 
This is an excellent idea, but need not be linked to any default email used for Entrez. Peter From cjfields at illinois.edu Thu Mar 25 17:44:31 2010 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 25 Mar 2010 12:44:31 -0500 Subject: [Open-bio-l] Fwd: [Utilities-announce] NCBI Revised E-utility Usage Policy In-Reply-To: <320fb6e01003250913k662aba2awd2121a57d1cd8596@mail.gmail.com> References: <54C09681-5D8A-461F-8A8D-090E50DAE340@ebi.ac.uk> <4168F915-DFF8-4C97-B0E9-7058A777E102@ebi.ac.uk> <0E6B9B67-A222-47F4-A533-96847D030D54@illinois.edu> <320fb6e01003250618j749eee50y8832e1a2543debd0@mail.gmail.com> <320fb6e01003250913k662aba2awd2121a57d1cd8596@mail.gmail.com> Message-ID: On Mar 25, 2010, at 11:13 AM, Peter wrote: > On Thu, Mar 25, 2010 at 3:43 PM, Chris Fields wrote: > > > > Let's play devil's advocate. The best way I can think of to > > describe this is to lay out a possible scenario. Suppose an > > end user (out of possibly thousands of end users, scattered > > across many IPs) uses one of the eutils modules/classes where > > the tool is set but the email isn't (our current status). > > They set 'email' to their local one, and then proceed to > > somehow abuse NCBI's rules and are blocked. In order to be > > reinstated, they will have to register both the tool and > > email with NCBI. Until then, does this block anyone else > > with the same tool? Just those from that IP? Not clear at > > the moment. > > I had understood that so far the NCBI just blocked by IP. > But only they know for sure, and it may change. Yes. And they have every right to change; they have been hammered by spam for years now, and the tool/email has always been implied to be required (though never really enforced). So it was only a matter of time before they did something about it. > > To proceed further, now the user registers the tool and > > email (both need to be registered to unblock).
If I > > understand the eutils documents correctly, as stated, only > > one email (supposedly the software developers) is registered > > per tool (also supposedly the software developers). The > > 'Bio*' tool name could end up being registered by anyone > > (wittingly or unwittingly), using their own personal email. > > Ah. That is possible - although I had been assuming the NCBI > would be looking at the *combination* of tool+email, that > may not be the case. > > > If another user uses the same tool name with a different > > email, would they be blocked? If not, and that user tries > > to register as above (after subsequent abuse), would a > > conflict occur and they be notified of the prior > > registration? Again, it's not clear what happens. > > Indeed. > > > We have until probably sometime in May to decide a course > > of action (June 1 is the enforcement date I believe), but > > this relies on NCBI clarifying a few things first. The > > current documentation (at least to me) does seem to indicate > > that each tool must have a single corresponding email when > > registered. Unless it is clarified, from my perspective > > the only safe course that addresses all concerns is to > > leave both tool and email unset, and then register a > > respective toolkit/email to keep it within the specific > > dev group as a safeguard. That last bit is for many > > reasons I've already outlined; an additional one is the > > fact that we already have a default set for 'tool' now > > (and have had one set for a while), so by legacy anyone > > using older versions will have 'Bio*' preset already when > > June 1 hits. > > Biopython has also been setting a default tool value > (with no default email) for some time. Right. Same with BioPerl, for many years now, with a few different names (some class-specific, others toolkit-based). Kind of a mess, really. 
> > BTW, I don't consider the above scenario out of the realm > > of possibility, particularly if they truly intend on > > enforcing the rules this time around. We've had many > > users who have asked the question 'how can I download > > my batch of 1,00,000 records via eutils'. Potential > > lack of common sense doesn't stop the persistent or > > the desperate. > > Agreed - sooner or later someone will do something silly > with the Bio* Entrez wrappers. Continuing to play Devil's > advocate, let's suppose we had a default tool and email > set. That *could* result in the NCBI blocking all users > of that Bio* toolkit running with this default tool+email > which would be a big problem (even if just for a day or > two while talking to the NCBI). Therefore I don't think > we should be setting a default tool+email (without > clarification from the NCBI). Yes, agreed. > > Right, but my point is there are no intermediaries, > > the news goes straight to the list. We're not reliant > > on (possibly busy, possibly absent) developers for > > second-hand news. We've been bitten by this before > > many times with NCBI, both with eutils and BLAST > > changes, a good many which NCBI announced but not > > passed on to our mailing list. > > > > Saying this now, it makes me wonder whether we should > > have a master list of some sort to gather such > > announcements that may impact developers (eutils, > > BLAST, GenBank/EMBL/UniProt releases, etc). > > This is an excellent idea, but need not be linked to > any default email used for Entrez. > > Peter Agreed. Adding the tool's related email to the NCBI eutil mailing list is supposed to be optional, anyway. 
chris From cjfields at illinois.edu Thu Mar 25 18:17:29 2010 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 25 Mar 2010 13:17:29 -0500 Subject: [Open-bio-l] Toolkits and the new eutils policies In-Reply-To: <2F6ECC14-3871-4BA4-AD68-96C9279EBA58@ncbi.nlm.nih.gov> References: <2F6ECC14-3871-4BA4-AD68-96C9279EBA58@ncbi.nlm.nih.gov> Message-ID: I'll cc this for the others in BioPerl, Biopython, Bioruby, Ensembl, etc so they know and can pass it around. Thanks Eric, happy to have that clarified. chris On Mar 25, 2010, at 1:06 PM, sayers wrote: > Hi Chris, > > Thanks for your note. Organizations such as yours were in our minds when we considered this policy change, and perhaps we need to consider improving the wording of the policy so that it is clearer on the points that you raise. > > Our requests for registered values for &tool and &email are directed at "end-developers" of software, not necessarily at toolkit developers such as yourself, whose products may be used by end-developers. We appreciate any efforts you can make to encourage and facilitate the use of &tool and &email by end-developers, but the act of registering values of these parameters is the responsibility of the end-developers. The value of &email should be a contact address for the end-developer, and ideally the value of &tool should be the name of the software package that uses BioPerl or other toolkits. > > My suggestion would be similar to what you mention in your second paragraph: to leave both &tool and &email without values and give a warning or some equivalent to developers if they fail to set them. > > Regarding blocks, we block on the basis of IP addresses only. The essence of the new policy is that we are far more likely to block requests that do not have registered values of &tool and &email.
If we need to block activity that does have a registered &tool or &email value, we will only block those IPs that are causing the abusive activity, not all IPs using that &tool/&email value. > > Thanks again for your comments and please let me know if you have further questions. > > Regards, > Eric > ___________________ > Eric W. Sayers, PhD > NCBI/NLM/NIH > 45 Center Drive, MSC 6511 > Bldg 45, Room 4AN.44C > Bethesda, MD 20892 > sayers at ncbi.nlm.nih.gov > > > > > On Mar 24, 2010, at 11:31 AM, Chris Fields wrote: > >> To whom it may concern, >> >> Just want to get some clarification from the eutils folks with the changes. I'm a core developer for the BioPerl toolkit and in collaboration with several other Bio* toolkits (BioPython, BioRuby, BioJava). We all have interfaces to eutils, either via the standard interface, SOAP-based, or both. >> >> We're seeking a clarification regarding the new rules, specifically the rules concerning 'tool' and 'email'. Per the rules proposed in Dec. 2009, many of us have already implemented changes to address them. As we're basically language-specific toolkits (classes, modules, etc) that are used for developing other applications, we have taken the stance that the end-user will need to start providing at least the email in order to properly use the eutils-specific tools, with a warning issued otherwise. This is based on several reasons, foremost being the toolkits are very widely used (so may be spread over potentially thousands of IPs) and are being used in a large number of downstream applications. As an example of the latter, BioJava is used in several free and commercial applications, such as Taverna and Geneious. >> >> Currently, with BioPerl and Biopython, we set the 'tool' parameter specifically to the toolkit name by default. This parameter can be overridden. However, from reading the newest rules it appears that each tool (and thus each toolkit) should have one common email, no more.
This also appears to make the assertion that users using these toolkits (with 'tool' set and using the relevant emails) may essentially be tied down if one IP decides to abuse the rules. >> >> This unfortunately makes the incorrect generalization that each tool is created by one developer, and simply does not make sense in large collaborative projects such as ours, where the software is used primarily in downstream applications and scripts. Setting the tool could be beneficial to the development team, but at the same time it could be a tremendous hindrance. >> >> So, what exactly should we do? Do we set the 'tool' by default, or leave it to the user? Similarly, how do we treat 'email'? If we do set them, would the entire group of users for that tool be blocked if one end-user abuses the system? Should we leave it up to the user to register themselves (both tool and email)? So, we're at an impasse and really need your help. >> >> Sincerely, >> >> chris >> >> >> Christopher Fields >> Core Developer, BioPerl Project >> IGB Postdoctoral Fellow >> Genomics of Neural & Behavioral Plasticity >> University of Illinois Urbana-Champaign >> Institute for Genomic Biology >> 1206 W. Gregory Dr. , MC-195 >> Urbana, IL 61801 >> > From biopython at maubp.freeserve.co.uk Thu Mar 25 18:31:18 2010 From: biopython at maubp.freeserve.co.uk (Peter) Date: Thu, 25 Mar 2010 18:31:18 +0000 Subject: [Open-bio-l] Toolkits and the new eutils policies In-Reply-To: References: <2F6ECC14-3871-4BA4-AD68-96C9279EBA58@ncbi.nlm.nih.gov> Message-ID: <320fb6e01003251131t4d3e2f02qace9a3ba89d2f6b6@mail.gmail.com> On Thu, Mar 25, 2010 at 6:17 PM, Chris Fields wrote: > I'll cc this for the others in BioPerl, Biopython, Bioruby, > Ensembl, etc so they know and can pass it around. > Thanks Eric, happy to have that clarified. 
> > chris Thanks Chris (& Eric), Looks like we are fine as things stand: continue to encourage the user to set the email (with a warning if omitted), and try to encourage them to override the tool parameter if appropriate (e.g. if part of a larger application like a Galaxy workflow). [I don't see any point in forcing people to invent tool names for each of their one off Entrez scripts, or interactive sessions - defaulting to BioPerl etc here seems sane] Peter From birney at ebi.ac.uk Thu Mar 25 19:59:31 2010 From: birney at ebi.ac.uk (Ewan Birney) Date: Thu, 25 Mar 2010 19:59:31 +0000 (GMT) Subject: [Open-bio-l] Toolkits and the new eutils policies In-Reply-To: <320fb6e01003251131t4d3e2f02qace9a3ba89d2f6b6@mail.gmail.com> References: <2F6ECC14-3871-4BA4-AD68-96C9279EBA58@ncbi.nlm.nih.gov> <320fb6e01003251131t4d3e2f02qace9a3ba89d2f6b6@mail.gmail.com> Message-ID: On Thu, 25 Mar 2010, Peter wrote: > On Thu, Mar 25, 2010 at 6:17 PM, Chris Fields wrote: >> I'll cc this for the others in BioPerl, Biopython, Bioruby, >> Ensembl, etc so they know and can pass it around. >> Thanks Eric, happy to have that clarified. >> >> chris > > Thanks Chris (& Eric), > > Looks like we are fine as things stand: continue to > encourage the user to set the email (with a warning > if omitted), and try to encourage them to override the > tool parameter if appropriate (e.g. if part of a larger > application like a Galaxy workflow). > > [I don't see any point in forcing people to invent > tool names for each of their one off Entrez scripts, > or interactive sessions - defaulting to BioPerl etc > here seems sane] At the very least the default should be "BioPerl Toolkit Placeholder For Non Registered Client" so that NCBI know precisely that the end programmer has not put something in there sensibly. And there should be a loud warning. I think it's fine to actually throw an exception. If someone is running a one off script, then they made the function call and can modify it.
If someone's developing something more serious then they've got the time to think it through. I see little benefit in letting a default happen with just a warning. > > Peter > _______________________________________________ > Open-Bio-l mailing list > Open-Bio-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/open-bio-l > -- ----------------------------------------------------------------- Ewan Birney. Work: +44 1223 494420 Email: birney "at" ebi.ac.uk Clerical Assistant: shelley "at" ebi.ac.uk Please cc shelley for urgent or diary-dependent requests ----------------------------------------------------------------- From cjfields at illinois.edu Thu Mar 25 20:34:01 2010 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 25 Mar 2010 15:34:01 -0500 Subject: [Open-bio-l] Toolkits and the new eutils policies In-Reply-To: References: <2F6ECC14-3871-4BA4-AD68-96C9279EBA58@ncbi.nlm.nih.gov> <320fb6e01003251131t4d3e2f02qace9a3ba89d2f6b6@mail.gmail.com> Message-ID: <30558517-17C8-4863-B13A-CA3661076265@illinois.edu> On Mar 25, 2010, at 2:59 PM, Ewan Birney wrote: > On Thu, 25 Mar 2010, Peter wrote: > >> On Thu, Mar 25, 2010 at 6:17 PM, Chris Fields wrote: >>> I'll cc this for the others in BioPerl, Biopython, Bioruby, >>> Ensembl, etc so they know and can pass it around. >>> Thanks Eric, happy to have that clarified. >>> >>> chris >> >> Thanks Chris (& Eric), >> >> Looks like we are fine as things stand: continue to >> encourage the user to set the email (with a warning >> if omitted), and try to encourage them to override the >> tool parameter if appropriate (e.g. if part of a larger >> application like a Galaxy workflow). 
>> >> [I don't see any point in forcing people to invent >> tool names for each of their one off Entrez scripts, >> or interactive sessions - defaulting to BioPerl etc >> here seems sane] > > At the very least teh default should be "BioPerl Toolkit Placeholder For Non Registered Client" so that NCBI know precisely that the end programmer has not put something in there sensibly. > > And there should be a loud warning. I think it's fine to actually > throw an exception. If someone is running a one off script, then they > made the function call and can modify it. If someone's developing something more serious then they've got the time to think it through. > > > I see little benefit in letting a default happen with just a warning. I agree re: throwing an exception. Not sure I see the point of setting the tool at all if we're throwing an exception when the tool isn't changed from the default (would be just as easy to throw if it's not set), but I definitely see the benefit of an exception re: email. chris >> >> Peter >> _______________________________________________ >> Open-Bio-l mailing list >> Open-Bio-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/open-bio-l >> > > -- > ----------------------------------------------------------------- > Ewan Birney. 
Work: +44 1223 494420 > Email: birney "at" ebi.ac.uk > Clerical Assistant: shelley "at" ebi.ac.uk > Please cc shelley for urgent or diary-dependent requests > ----------------------------------------------------------------- From biopython at maubp.freeserve.co.uk Thu Mar 25 22:39:23 2010 From: biopython at maubp.freeserve.co.uk (Peter) Date: Thu, 25 Mar 2010 22:39:23 +0000 Subject: [Open-bio-l] Toolkits and the new eutils policies In-Reply-To: References: <2F6ECC14-3871-4BA4-AD68-96C9279EBA58@ncbi.nlm.nih.gov> <320fb6e01003251131t4d3e2f02qace9a3ba89d2f6b6@mail.gmail.com> Message-ID: <320fb6e01003251539k917c0b5xaafb9550cc27afee@mail.gmail.com> On Thu, Mar 25, 2010 at 7:59 PM, Ewan Birney wrote: > > On Thu, 25 Mar 2010, Peter wrote: >> >> Thanks Chris (& Eric), >> >> Looks like we are fine as things stand: continue to >> encourage the user to set the email (with a warning >> if omitted), and try to encourage them to override the >> tool parameter if appropriate (e.g. if part of a larger >> application like a Galaxy workflow). >> >> [I don't see any point in forcing people to invent >> tool names for each of their one off Entrez scripts, >> or interactive sessions - defaulting to BioPerl etc >> here seems sane] > > At the very least teh default should be "BioPerl Toolkit Placeholder For Non > Registered Client" so that NCBI know precisely that the end programmer has > not put something in there sensibly. No, that's just silly IMHO. Using "BioPerl" on its own serves just the same purpose (indeed, the NCBI will be used to this from existing users and all versions of BioPerl to date), The extra long version doesn't add any useful information and more importantly makes the URLs much longer which can be a real issue because long URLs can break (e.g. if going via a proxy). > And there should be a loud warning. I think it's fine to actually > throw an exception. If someone is running a one off script, then they > made the function call and can modify it. 
If someone's developing something > more serious then they've got the time to think it through. > > I see little benefit in letting a default happen with just a warning. Making the email and/or tool mandatory vs throwing an exception is just an implementation detail. I think the issue is should BioPerl etc treat the email and tool as optional, optional with a warning, or mandatory. Note the NCBI does not seem to be making them mandatory (for now). Peter From cjfields at illinois.edu Fri Mar 26 02:10:54 2010 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 25 Mar 2010 21:10:54 -0500 Subject: [Open-bio-l] Toolkits and the new eutils policies In-Reply-To: <320fb6e01003251539k917c0b5xaafb9550cc27afee@mail.gmail.com> References: <2F6ECC14-3871-4BA4-AD68-96C9279EBA58@ncbi.nlm.nih.gov> <320fb6e01003251131t4d3e2f02qace9a3ba89d2f6b6@mail.gmail.com> <320fb6e01003251539k917c0b5xaafb9550cc27afee@mail.gmail.com> Message-ID: <19CCAA14-DACE-4F20-BE2C-742EA0F46995@illinois.edu> On Mar 25, 2010, at 5:39 PM, Peter wrote: > On Thu, Mar 25, 2010 at 7:59 PM, Ewan Birney wrote: >> >> On Thu, 25 Mar 2010, Peter wrote: >>> >>> Thanks Chris (& Eric), >>> >>> Looks like we are fine as things stand: continue to >>> encourage the user to set the email (with a warning >>> if omitted), and try to encourage them to override the >>> tool parameter if appropriate (e.g. if part of a larger >>> application like a Galaxy workflow). >>> >>> [I don't see any point in forcing people to invent >>> tool names for each of their one off Entrez scripts, >>> or interactive sessions - defaulting to BioPerl etc >>> here seems sane] >> >> At the very least the default should be "BioPerl Toolkit Placeholder For Non >> Registered Client" so that NCBI know precisely that the end programmer has >> not put something in there sensibly. > > No, that's just silly IMHO.
Using "BioPerl" on its own serves just > the same purpose (indeed, the NCBI will be used to this from > existing users and all versions of BioPerl to date), The extra > long version doesn't add any useful information and more > importantly makes the URLs much longer which can be a > real issue because long URLs can break (e.g. if going via > a proxy). I don't think this is meant literally, just the general idea that setting it to a specific value indicates the user in question didn't reset it. >> And there should be a loud warning. I think it's fine to actually >> throw an exception. If someone is running a one off script, then they >> made the function call and can modify it. If someone's developing something >> more serious then they've got the time to think it through. >> >> I see little benefit in letting a default happen with just a warning. > > Making the email and/or tool mandatory vs throwing an > exception just an implementation detail. > > I think the issue is should BioPerl etc treat the email and > tool as optional, optional with a warning, or mandatory. > Note the NCBI does not seem to be making them > mandatory (for now). > > Peter It is listed as a user requirement here, has been for a long while actually: http://www.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html#UserSystemRequirements It just hasn't been enforced. The new rule, as I understand it, is they will likely start enforcing it. chris From maj at fortinbras.us Fri Mar 26 13:04:56 2010 From: maj at fortinbras.us (Mark A. Jensen) Date: Fri, 26 Mar 2010 09:04:56 -0400 Subject: [Open-bio-l] ack! code.open-bio.org Message-ID: Hey Chris, can't ping code again. Sorry about this. 
MAJ From biopython at maubp.freeserve.co.uk Sat Mar 27 12:50:04 2010 From: biopython at maubp.freeserve.co.uk (Peter) Date: Sat, 27 Mar 2010 12:50:04 +0000 Subject: [Open-bio-l] Toolkits and the new eutils policies In-Reply-To: <19CCAA14-DACE-4F20-BE2C-742EA0F46995@illinois.edu> References: <2F6ECC14-3871-4BA4-AD68-96C9279EBA58@ncbi.nlm.nih.gov> <320fb6e01003251131t4d3e2f02qace9a3ba89d2f6b6@mail.gmail.com> <320fb6e01003251539k917c0b5xaafb9550cc27afee@mail.gmail.com> <19CCAA14-DACE-4F20-BE2C-742EA0F46995@illinois.edu> Message-ID: <320fb6e01003270550t7ae22d60waceffac7914fce6d@mail.gmail.com> On Fri, Mar 26, 2010 at 2:10 AM, Chris Fields wrote: > > On Mar 25, 2010, at 5:39 PM, Peter wrote: > >> On Thu, Mar 25, 2010 at 7:59 PM, Ewan Birney wrote: >>> >>> At the very least teh default should be "BioPerl Toolkit >>> Placeholder For Non Registered Client" so that NCBI know >>> precisely that the end programmer has not put something in >>> there sensibly. >> >> No, that's just silly IMHO. Using "BioPerl" on its own serves just >> the same purpose (indeed, the NCBI will be used to this from >> existing users and all versions of BioPerl to date), The extra >> long version doesn't add any useful information and more >> importantly makes the URLs much longer which can be a >> real issue because long URLs can break (e.g. if going via >> a proxy). > > I don't think this is meant literally, just the general idea that > setting it to a specific value indicates the user in question > didn't reset it. > That makes more sense ;) >>> And there should be a loud warning. I think it's fine to actually >>> throw an exception. If someone is running a one off script, then they >>> made the function call and can modify it. If someone's developing >>> something more serious then they've got the time to think it through. >>> >>> I see little benefit in letting a default happen with just a warning. 
>> >> Making the email and/or tool mandatory vs throwing an >> exception is just an implementation detail. >> >> I think the issue is should BioPerl etc treat the email and >> tool as optional, optional with a warning, or mandatory. >> Note the NCBI does not seem to be making them >> mandatory (for now). >> >> Peter > > > It is listed as a user requirement here, has been for a long > while actually: > > http://www.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html#UserSystemRequirements > > It just hasn't been enforced. The new rule, as I understand > it, is they will likely start enforcing it. So is our status quo fine? email - left out by default but with a warning tool - set to BioPerl (or similar) by default Peter
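For illustration only, the status quo summarized above (email left out by default but with a warning, tool set to the toolkit name by default) might look roughly like the Python sketch below. This is not the actual BioPerl or Biopython code. The names build_eutils_query, DEFAULT_TOOL and the strict flag are hypothetical, invented for this example; strict stands in for the "throw an exception" option raised earlier in the thread. The sketch only builds the query string and does not contact NCBI.

```python
import warnings
from urllib.parse import urlencode

# Hypothetical default: the toolkit's own name, as discussed in the thread.
DEFAULT_TOOL = "BioPerl"


def build_eutils_query(params, email=None, tool=DEFAULT_TOOL, strict=False):
    """Return a URL-encoded query string carrying the tool and email parameters.

    params -- dict of the caller's own query parameters (e.g. db, term)
    email  -- contact address of the end-developer; no default is supplied
    tool   -- name of the calling software; defaults to the toolkit name
    strict -- hypothetical switch: raise instead of warn when email is unset
    """
    query = dict(params)  # copy so the caller's dict is not modified
    query["tool"] = tool
    if email:
        query["email"] = email
    elif strict:
        raise ValueError("email must be set before querying E-utilities")
    else:
        warnings.warn(
            "No email set; NCBI asks end-developers to supply one.",
            UserWarning,
            stacklevel=2,
        )
    return urlencode(query)
```

A larger application would pass its own tool name and the end-developer's email, for example build_eutils_query({"db": "pubmed"}, email="dev@example.org", tool="MyPipeline"), while a one-off script that sets neither would still work but would see the warning.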