From eric.talevich at gmail.com Thu Aug 1 16:04:29 2013
From: eric.talevich at gmail.com (Eric Talevich)
Date: Thu, 1 Aug 2013 13:04:29 -0700
Subject: [Biopython-dev] 1.62b test coverage report
In-Reply-To:
References:
Message-ID:

On Wed, Jul 31, 2013 at 12:40 AM, Peter Cock wrote:

> On Wednesday, July 31, 2013, Ben Fulton wrote:
>
> > I ran Ned Batchelder's coverage tool against the 1.62 beta code to
> > see how much code is covered by tests. The overall total was 74%
> > which is pretty respectable.
> >
> > I ran the tests on a fairly fresh machine, which meant I had to
> > install a lot of software, some of which I either didn't get
> > installed properly, or the tests are out of date, or there were
> > failures for some other reason. I ended up having to skip seven
> > test files:
> >
> > Dialign_Tool
> > EmbossPhylipNew
> > Mafft
> > PopGen_DFDist
> > PopGen_FDist
> > XXMotif
> > phyml
> >
>
> I'm pretty sure I have some or all of those setup on at least one
> of my test machines, so with a little more work together we
> can try to resolve those (which may mean updating the docs).
>

I just fixed the error in test_phyml_tool.py, it was a simple one:

https://github.com/biopython/biopython/commit/90da547f0a85c00d3ca300bdf52bdb96ddeb449f

> > There were three tests I managed to get running but still had
> > failures:
> >
> > FastTree
> > NCBI_BLAST
> > Prank_too
>

The FastTree test is not based on the unittest framework, so the output
contains the word "Failed" in three places to describe error-handling
tests that worked correctly. Can we see the output for this one? (It
works on my machine.)

The test is also fairly new, so there could be some
version-compatibility issues there too.

Thanks,
Eric

From ben at benfulton.net Thu Aug 1 22:20:49 2013
From: ben at benfulton.net (Ben Fulton)
Date: Thu, 1 Aug 2013 22:20:49 -0400
Subject: [Biopython-dev] 1.62b test coverage report
In-Reply-To:
References:
Message-ID:

My test machine was running Ubuntu 12.04.

For fasttree I installed version 2.1.4-1~ubuntu12.04.1 using apt-get,
and got this error:

ApplicationError: Command 'fasttree -out temp_test.tree
Quality/example.fasta' returned non-zero exit status 1, 'Unknown or
incorrect use of option -out'

The NCBI_BLAST error involves rpsblast not being in the install.
Version 2.2.25-7 using apt-get.

Dialign is version 2.2.1-5 using apt-get. I got two errors: first,
DIALIGN2_DIR not being set. It was installed to /usr/bin so I set
DIALIGN2_DIR to that directory; then I got "Environment variable
DIALIGN2_DIR directory missing BLOSUM file." I'm not sure either of
these items are needed, though I may have missed them in the
documentation.

I downloaded version 130708 of Prank from
http://code.google.com/p/prank-msa/downloads/list. The error is on line
165 of the test file:

AssertionError:
-----------------
PRANK v.130708:
-----------------

Input for the analysis
- converting 'Quality/example.fasta' to 'temp with space.phy'

EmbossPhylipNew I tried to install from source, but it was complicated
and I didn't get it finished.

I'll send some notes on the other errors when I get a few minutes.

From glenveegee at gmail.com Fri Aug 2 04:17:14 2013
From: glenveegee at gmail.com (Glen van Ginkel)
Date: Fri, 2 Aug 2013 09:17:14 +0100
Subject: [Biopython-dev] Fwd: pdb-l: Announcement: wwPDB Workshop on mmCIF/PDBx for Programmers, 20/21 Nov-13, Cambridge (UK)
In-Reply-To: <51FB69C6.3040200@ebi.ac.uk>
References: <51FB69C6.3040200@ebi.ac.uk>
Message-ID:

Hi all,

Given Lenna's recent work on the mmCIF parser I thought this might be of
interest.

Kind regards,

Glen

wwPDB Workshop on mmCIF/PDBx for Programmers
--------------------------------------------

What, why and how?
------------------
The world of the PDB will be changing rapidly and profoundly over the
next few years. A major change will involve the transition from PDB to
mmCIF/PDBx as the principal deposition and dissemination format (see
http://www.wwpdb.org/news/news_2013.html#22-May-2013 and
http://wwpdb.org/workshop/wgroup.html).
To help software developers in the area of structural biology to make
the transition and begin supporting the mmCIF/PDBx format in their own
programs, wwPDB (http://wwpdb.org/) is organising a programmers
workshop. This two-day event will include lectures by experts in
mmCIF/PDBx (http://mmcif.rcsb.org/) and developers of language-specific
libraries or packages (C/C++, Java, Python). Ample time will be devoted
to tutorials and individual "code hacking", with the experts available
to assist the workshop participants. Confirmed tutors include Paul Adams
(Phenix), Eugene Krissinel (CCP4), Garib Murshudov (Refmac), Andreas
Prlic (RCSB), Sameer Velankar (PDBe) and John Westbrook (RCSB).

When and where?
---------------
The workshop will be held at the EMBL-EBI (http://ebi.ac.uk/) in
Hinxton, Cambridge, UK, on 20 and 21 November 2013.

How much?
---------
If you are selected as a participant, we expect you to pay for your own
travel to and from Cambridge. However, there is no fee for this
workshop, and we will provide accommodation (at the HolidayInn Express
in nearby Duxford; http://www.hiexpresscambridgeduxford.co.uk/), lunches
and a workshop dinner on the 20th (all thanks to generous funding from
the Wellcome Trust to PDBe).

Who can apply and how?
----------------------
This workshop is intended for "high-powered" software developers in any
area of structural biology and structural bioinformatics whose products
process (read/write) PDB data - e.g., X-ray, NMR, 3DEM, SAXS/SANS,
hybrid methods, visualisation, validation, modelling, docking, structure
prediction, etc. To ensure a high ratio of tutors to workshop
participants, the number of participants is limited to 15.

You can apply for the workshop by sending an e-mail to Sameer Velankar
at PDBe (sameer at ebi.ac.uk) no later than 31 August 2013.
Please include:

- a brief description of the software program(s) or package(s) you have
  developed or are developing, what it does, in which field, how many
  users, relevant publications, etc.;
- what programming language(s) you are specifically interested in;
- how you would benefit from this workshop;
- any specific topics or questions you would like to see addressed in
  the workshop.

If the workshop is oversubscribed, we will use the information and
motivation provided by the applicants to select the participants.

Participants are expected to bring their own laptop with compilers etc.
installed. No previous knowledge of mmCIF/PDBx is strictly needed, but
participants who are aware of the basic principles of the format will
probably gain more from the workshop.

Applicants will be informed by mid-September if they have been selected
or not, or if they are on the stand-by list.

For informal inquiries about the workshop, please contact Sameer
Velankar at PDBe (sameer at ebi.ac.uk).

Please feel free to distribute this announcement to other interested
people or fora!

--Gerard Kleywegt & Sameer Velankar
Protein Data Bank in Europe
A member of the Worldwide Protein Data Bank

---
Gerard J. Kleywegt, PDBe, EMBL-EBI, Hinxton, UK
gerard at ebi.ac.uk ..................... pdbe.org
Secretary: Pauline Haslam pdbe_admin at ebi.ac.uk
TO UNSUBSCRIBE OR CHANGE YOUR SUBSCRIPTION OPTIONS, please see
https://lists.sdsc.edu/mailman/listinfo/pdb-l .

From p.j.a.cock at googlemail.com Fri Aug 2 05:16:53 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 2 Aug 2013 10:16:53 +0100
Subject: [Biopython-dev] Fwd: pdb-l: Announcement: wwPDB Workshop on mmCIF/PDBx for Programmers, 20/21 Nov-13, Cambridge (UK)
In-Reply-To:
References: <51FB69C6.3040200@ebi.ac.uk>
Message-ID:

Thanks for forwarding that Glen - it would be great if any of
our structural Biopython folk could go.

Is anyone interested & reasonably close to Cambridge UK?

Peter

On Fri, Aug 2, 2013 at 9:17 AM, Glen van Ginkel wrote:
> Hi all,
>
> Given Lenna's recent work on the mmCIF parser I thought this might be of
> interest.
>
> Kind regards,
>
> Glen

From p.j.a.cock at googlemail.com Fri Aug 2 05:31:27 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 2 Aug 2013 10:31:27 +0100
Subject: [Biopython-dev] 1.62b test coverage report
In-Reply-To:
References:
Message-ID:

Thanks for these details Ben - it sounds like a mixture of real
test failures, and mere warnings that an external tool wasn't
found.

On Fri, Aug 2, 2013 at 3:20 AM, Ben Fulton wrote:
> My test machine was running Ubuntu 12.04.
>
> For fasttree I installed version 2.1.4-1~ubuntu12.04.1 using apt-get, and
> got this error:
> ApplicationError: Command 'fasttree -out temp_test.tree
> Quality/example.fasta' returned non-zero exit status 1, 'Unknown or
> incorrect use of option -out'

I don't seem to have fasttree installed at all, and from the
test and wrapper it is not explicit about which version it
was originally written for.

> The NCBI_BLAST error involves rpsblast not being in the install.
> Version 2.2.25-7 using apt-get.

I believe this is down to an NCBI stupidity with binary name
clashes, both the old 'legacy' C BLAST and the new C++
BLAST+ suite have a binary called rpsblast.

Our test code copes with this by searching the path and checking
each rpsblast binary found - looking for the new version only.

However, Debian policy is to resolve ambiguities like this with
a unilateral renaming - in this case I *think* they called the new
binary rpsblast+ instead. Can you confirm that? I don't have
access to a Debian machine right now.

So, strictly speaking the Biopython test is correct - you don't
have the new rpsblast installed.
However, it would be more helpful if we also checked for the
Debian alias rpsblast+ too.

That shouldn't be too complicated to do - especially if you
could rerun the tests using Biopython from git for me?

> Dialign is version 2.2.1-5 using apt-get. I got two errors: first,
> DIALIGN2_DIR not being set. It was installed to /usr/bin so I set
> DIALIGN2_DIR to that directory; then I got "Environment variable
> DIALIGN2_DIR directory missing BLOSUM file." I'm not sure either of these
> items are needed, though I may have missed them in the documentation.

This again looks like a Debian packaging issue versus the manual
install instructions for Dialign. Perhaps they have fixed Dialign
to find its matrix under a data folder...

You could try simply commenting out the check on the environment
variable in test_Dialign_tool.py and seeing if the tests pass or not.

> I downloaded version 130708 of Prank from
> http://code.google.com/p/prank-msa/downloads/list. The error is on line 165
> of the test file:
>
> AssertionError:
> -----------------
> PRANK v.130708:
> -----------------
>
> Input for the analysis
> - converting 'Quality/example.fasta' to 'temp with space.phy'

This sounds like a minor change in the stdout with recent
versions of PRANK.

> EmbossPhylipNew I tried to install from source, but it was complicated and I
> didn't get it finished.
>
> I'll send some notes on the other errors when I get a few minutes.

Thanks,

Peter

From p.j.a.cock at googlemail.com Fri Aug 2 08:00:54 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 2 Aug 2013 13:00:54 +0100
Subject: [Biopython-dev] 1.62b test coverage report
In-Reply-To:
References:
Message-ID:

On Fri, Aug 2, 2013 at 10:31 AM, Peter Cock wrote:
>
>> The NCBI_BLAST error involves rpsblast not being in the install.
>> Version 2.2.25-7 using apt-get.
>
> I believe this is down to an NCBI stupidity with binary name
> clashes, both the old 'legacy' C BLAST and the new C++
> BLAST+ suite have a binary called rpsblast.
>
> Our test code copes with this by searching the path and checking
> each rpsblast binary found - looking for the new version only.
>
> However, Debian policy is to resolve ambiguities like this with
> a unilateral renaming - in this case I *think* they called the new
> binary rpsblast+ instead. Can you confirm that? I don't have
> access to a Debian machine right now.

Certainly this was their plan and was done on Bio-Linux,
http://lists.debian.org/debian-med/2011/05/msg00025.html

> So, strictly speaking the Biopython test is correct - you don't
> have the new rpsblast installed. However, it would be more
> helpful if we also checked for the Debian alias rpsblast+ too.
>
> That shouldn't be too complicated to do - especially if you
> could rerun the tests using Biopython from git for me?

This commit is now on our master branch,
https://github.com/biopython/biopython/commit/148b681a66061cc03d70f940a2efdede29adc64a

Thanks,

Peter

From anaryin at gmail.com Fri Aug 2 12:13:04 2013
From: anaryin at gmail.com (João Rodrigues)
Date: Fri, 2 Aug 2013 09:13:04 -0700
Subject: [Biopython-dev] Fwd: pdb-l: Announcement: wwPDB Workshop on mmCIF/PDBx for Programmers, 20/21 Nov-13, Cambridge (UK)
In-Reply-To:
References: <51FB69C6.3040200@ebi.ac.uk>
Message-ID:

Hi Peter, Glen,

I'll be going (or trying to at least).

Cheers,

João

2013/8/2 Peter Cock

> Thanks for forwarding that Glen - it would be great if any of
> our structural Biopython folk could go.
>
> Is anyone interested & reasonably close to Cambridge UK?
>
> Peter

From p.j.a.cock at googlemail.com Fri Aug 2 12:20:02 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 2 Aug 2013 17:20:02 +0100
Subject: [Biopython-dev] Fwd: pdb-l: Announcement: wwPDB Workshop on mmCIF/PDBx for Programmers, 20/21 Nov-13, Cambridge (UK)
In-Reply-To:
References: <51FB69C6.3040200@ebi.ac.uk>
Message-ID:

That's good news João - thanks!

Peter.

On Fri, Aug 2, 2013 at 5:13 PM, João Rodrigues wrote:
> Hi Peter, Glen,
>
> I'll be going (or trying to at least).
>
> Cheers,
>
> João

From ben at benfulton.net Sun Aug 4 21:28:34 2013
From: ben at benfulton.net (Ben Fulton)
Date: Sun, 4 Aug 2013 21:28:34 -0400
Subject: [Biopython-dev] 1.62b test coverage report
In-Reply-To:
References:
Message-ID:

Fixed the following:

I had installed Mafft version 6.850-1 from apt-get, which apparently is
more than a year old and doesn't work. The tests ran after I installed
it from source.

I had not gotten a path set up properly for XXMotif; once I did the
tests all ran.

The DiAlign tests passed after I removed the precondition checks.

Did not fix:

The site http://www.rubic.rdg.ac.uk/~mab/software.html is down, and I
can't find anywhere else to install the PopGen software from.

So with all of those modifications, I ran coverage against the latest
code from GitHub.
Results are once again available on my website,
http://benfulton.net/BioPython162_Coverage , and the following issues
remain:

EmbossPhylipNew - skipped, too hard to install
Fasttree - error, apparently a versioning issue
PopGen_FDist and PopGen_DFdist - skipped, unavailable
Prank - failed, recent versions of the tool have some kind of output change

From yeyanbo289 at gmail.com Mon Aug 5 04:57:34 2013
From: yeyanbo289 at gmail.com (Yanbo Ye)
Date: Mon, 5 Aug 2013 16:57:34 +0800
Subject: [Biopython-dev] GSOC weekly update 8
Message-ID:

Hi all,

I post an update for the Biopython.Phylo project here:
http://blog.yeyanbo.com/posts/google-summer-of-code-8.html

Thanks,
Yanbo

--
Yanbo Ye
Guangzhou Institutes of Biomedicine and Health,
Chinese Academy of Sciences
190 Kaiyuan Avenue, Science Park, Guangzhou, China

Email: ye_yanbo at gibh.ac.cn
Web: http://www.yeyanbo.com
Phone: (86)-020-32093810

From p.j.a.cock at googlemail.com Mon Aug 5 07:46:00 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Mon, 5 Aug 2013 12:46:00 +0100
Subject: [Biopython-dev] 1.62b test coverage report
In-Reply-To:
References:
Message-ID:

On Mon, Aug 5, 2013 at 2:28 AM, Ben Fulton wrote:
>
> The site http://www.rubic.rdg.ac.uk/~mab/software.html is down, and I can't
> find anywhere else to install the PopGen software from.
>

There seems to be a fairly recent snapshot on archive.org,
http://web.archive.org/web/20120510013219/http://www.rubic.rdg.ac.uk/~mab/software.html

Meanwhile, I have emailed Dr. Mark Beaumont at Reading
University to ask about the server status.

Regards,

Peter

From p.j.a.cock at googlemail.com Mon Aug 5 08:14:04 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Mon, 5 Aug 2013 13:14:04 +0100
Subject: [Biopython-dev] 1.62b test coverage report
In-Reply-To:
References:
Message-ID:

On Mon, Aug 5, 2013 at 12:46 PM, Peter Cock wrote:
> On Mon, Aug 5, 2013 at 2:28 AM, Ben Fulton wrote:
>>
>> The site http://www.rubic.rdg.ac.uk/~mab/software.html is down, and I can't
>> find anywhere else to install the PopGen software from.
>> > > There seems to be a fairly recent snapshot on archive.org, > http://web.archive.org/web/20120510013219/http://www.rubic.rdg.ac.uk/~mab/software.html > > Meanwhile, I have emailed Dr. Mark Beaumont at Reading > University to ask about the server status. Mark has moved to Bristol: http://www.maths.bris.ac.uk/people/profile/mamab FDist and DFDist are available here now: http://www.maths.bris.ac.uk/~mamab/ We need to update the Biopython documentation (and check those versions from Bristol still work with our tests). Tiago, could you handle that? Thanks, Peter From arklenna at gmail.com Mon Aug 5 09:11:19 2013 From: arklenna at gmail.com (Lenna Peterson) Date: Mon, 5 Aug 2013 09:11:19 -0400 Subject: [Biopython-dev] Bugzilla --> RedMine --> GitHub issues? In-Reply-To: References: Message-ID: Peter, It's been a few days that I can't connect to redmine. I just got a error page saying RoR couldn't start or connect to the MySQL server. Cheers, Lenna On Mon, Jul 22, 2013 at 10:36 AM, Peter Cock wrote: > On Mon, Jul 22, 2013 at 12:43 PM, Peter Cock > wrote: > > > > Well this isn't tomorrow - but I'm back from BOSC 2013 in Germany now. > > > > In the absence of any dissenting views, and the fact that RedMine is > > also offline right now (which I've raised with the OBF admin volunteers), > > Fixed again :) > > > I've enabled GitHub issues & linked to this from the main page: > > > > https://github.com/biopython/biopython/issues > > > > You'll notice there are already lots of issues there - all pull request > > related. This is one reason why an automated import of the old > > Bugzilla/RedMine issues could be complicated. > > > > Various other bits of our documentation will need to be updated... > > Hopefully done now, e.g. 
> > https://github.com/biopython/biopython/commit/e836f4fadde494a8253b4a4114a36ff3259eb079 > > https://github.com/biopython/biopython/commit/e836f4fadde494a8253b4a4114a36ff3259eb079 > > Note that there doesn't seem to be a way to turn off new issues in > a RedMine project - there are hacks via removing the ability from > the roles, but I fear that would affect the other projects still using > the RedMine server (e.g. BioPerl). > > Instead we may just have to do the triage/migration and then > drop the links to the old RedMine server from the website etc. > > Peter > _______________________________________________ > Biopython-dev mailing list > Biopython-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython-dev > From p.j.a.cock at googlemail.com Mon Aug 5 09:43:19 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Mon, 5 Aug 2013 14:43:19 +0100 Subject: [Biopython-dev] 1.62b test coverage report In-Reply-To: References: Message-ID: On Mon, Aug 5, 2013 at 1:14 PM, Peter Cock wrote: > On Mon, Aug 5, 2013 at 12:46 PM, Peter Cock wrote: >> On Mon, Aug 5, 2013 at 2:28 AM, Ben Fulton wrote: >>> >>> The site http://www.rubic.rdg.ac.uk/~mab/software.html is down, and I can't >>> find anywhere else to install the PopGen software from. >>> >> >> There seems to be a fairly recent snapshot on archive.org, >> http://web.archive.org/web/20120510013219/http://www.rubic.rdg.ac.uk/~mab/software.html >> >> Meanwhile, I have emailed Dr. Mark Beaumont at Reading >> University to ask about the server status. > > Mark has moved to Bristol: > http://www.maths.bris.ac.uk/people/profile/mamab > > FDist and DFDist are available here now: > http://www.maths.bris.ac.uk/~mamab/ > > We need to update the Biopython documentation (and check > those versions from Bristol still work with our tests). > > Tiago, could you handle that? According to his email auto-reply, Tiago is away right now. 
I've updated a couple of URLs in the source code: https://github.com/biopython/biopython/commit/70667063701041b73147c502c933fa8bfde1d850 Ben - did you see anything else which needs updating here? Thanks, Peter From p.j.a.cock at googlemail.com Mon Aug 5 10:01:12 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Mon, 5 Aug 2013 15:01:12 +0100 Subject: [Biopython-dev] Bugzilla --> RedMine --> GitHub issues? In-Reply-To: References: Message-ID: On Mon, Aug 5, 2013 at 2:11 PM, Lenna Peterson wrote: > Peter, > > It's been a few days that I can't connect to redmine. I just got a error > page saying RoR couldn't start or connect to the MySQL server. > > Cheers, > > Lenna OK, Chris Dag has got RedMine to work again, and told me what he did in case I need to restart if this happens again. If any RedMine guru is reading and has some thoughts on the cause and long term solution, drop us an email please. As to issue triage - I suggest you start with anything you filed or commented on, then things you are familiar with. But any order is fine really. I suggest for "moving" an issue, we file the new GitHub issue (linking to the old issue, but also trying to capture any relevant information from the old bug tracker to be self sufficient), and then close the old RedMine issue with a link to its replacement. Thanks, Peter From p.j.a.cock at googlemail.com Mon Aug 5 10:26:32 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Mon, 5 Aug 2013 15:26:32 +0100 Subject: [Biopython-dev] Bio.XXX.Applications vs Bio.motifs.applications Message-ID: Hi all, I've noticed that as part of migrating from Bio.Motif to Bio.motifs, the Applications module has acquired a lower case name. Lower case module names are in principle a good thing (PEP8) but elsewhere in Biopython the Applications modules are all using title case. Would a lower case shorter name be better, such as apps (i.e. Bio.motifs.apps in this case)? 
This could also be adopted in other modules for a gradual conversion if desired (e.g. introduce Bio.Phylo.apps as an alias for Bio.Phylo.Applications). What do people think? Thanks, Peter From dalke at dalkescientific.com Mon Aug 5 21:18:06 2013 From: dalke at dalkescientific.com (Andrew Dalke) Date: Tue, 6 Aug 2013 03:18:06 +0200 Subject: [Biopython-dev] Adopting BSD 3-Clause license for Biopython? In-Reply-To: References: Message-ID: <9B34F2CB-2D39-40C5-A462-3C99CFB317D3@dalkescientific.com> On Jul 24, 2013, at 11:13 AM, Peter Cock wrote: > The current Biopython License is very short and liberal, and I have > long described it as an MIT/BSD type licence. However the actual > wording matches neither of these exactly (as far as I could tell): That's my doing. When Jeff and I started Biopython in 1999 we needed to choose a license. We started with the Python license, which (for 1.5.2) was: Permission to use, copy, modify, and distribute this software and its documentation for any purpose and without fee is hereby granted, provided that the above copyright notice appear in all copies and that both that copyright notice and this permission notice appear in supporting documentation, and that the names of Stichting Mathematisch Centrum or CWI or Corporation for National Research Initiatives or CNRI not be used in advertising or publicity pertaining to distribution of the software without specific, written prior permission. While CWI is the initial source for this software, a modified version is made available by the Corporation for National Research Initiatives (CNRI) at the Internet address ftp://ftp.python.org. 
STICHTING MATHEMATISCH CENTRUM AND CNRI DISCLAIM ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS, IN NO EVENT SHALL STICHTING MATHEMATISCH CENTRUM OR CNRI BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. Compare that to the Biopython license, with the alterations marked: Permission to use, copy, modify, and distribute this software and its documentation >>>with or without modifications<< and for any purpose and without fee is hereby granted, provided that >>any copyright notices<<< appear in all copies and that both >>>those copyright notices<<< and this permission notice appear in supporting documentation, and that the names of >>>the contributors or copyright holders<<< not be used in advertising or publicity pertaining to distribution of the software without specific prior permission. [2nd paragraph of original Python license omitted] >>>THE CONTRIBUTORS AND COPYRIGHT HOLDERS OF THIS SOFTWARE<<< DISCLAIM ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS, IN NO EVENT SHALL >>>THE CONTRIBUTORS OR COPYRIGHT HOLDERS<<< BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. This was called a "Python-style license", and you can see an example at http://effbot.org/zone/copyright.htm . 
Indeed, his PIL package is an example of a current Python module which still uses that license: http://www.pythonware.com/products/pil/license.htm You'll see that Fredrik Lundh refers to it as the "Historical Permission Notice and Disclaimer", and points to: http://opensource.org/licenses/historical.php Further note that the OSI comments that "This License has been voluntarily deprecated by its author" .. whatever that means ... and that http://opensource.org/proliferation-report describes it as "redundant with more popular licenses", and more specifically the BSD. > In theory we could ask the OSI to approve our current license, but as > they explain "yet another license" is not a good thing to encourage: > http://opensource.org/proliferation It wouldn't be a "yet another license" as it's already registered with the OSI ... almost. The one odd alteration I made was to add "with or without modifications", because some people on comp.lang.python expressed concern that "use, copy, modify, and distribute" could be interpreted to be restrictive, as in "you can modify the original source code, or distribute the original source code, but you can't distribute the modified source code." I've since learned that this is a hyper-picky interpretation with no legal bearing. I don't know if that "with or without modifications" is different enough that the OSI would say it doesn't fall under the 'Historical Permission Notice and Disclaimer'. In any case, I agree with a relicensing. The current license is from a bygone era. Nowadays I just pick the MIT license. If there's anything copyright by me still remaining in Biopython, I hereby relicense it under the MIT and/or one of the standard n-clause BSD licenses, at your choice.
Cheers, Andrew dalke at dalkescientific.com From p.j.a.cock at googlemail.com Tue Aug 6 05:11:33 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Tue, 6 Aug 2013 10:11:33 +0100 Subject: [Biopython-dev] Adopting BSD 3-Clause license for Biopython? In-Reply-To: <9B34F2CB-2D39-40C5-A462-3C99CFB317D3@dalkescientific.com> References: <9B34F2CB-2D39-40C5-A462-3C99CFB317D3@dalkescientific.com> Message-ID: On Tue, Aug 6, 2013 at 2:18 AM, Andrew Dalke wrote: > On Jul 24, 2013, at 11:13 AM, Peter Cock wrote: >> The current Biopython License is very short and liberal, and I have >> long described it as an MIT/BSD type licence. However the actual >> wording matches neither of these exactly (as far as I could tell): > > That's my doing. When Jeff and I started Biopython in 1999 we > needed to choose a license. We started with the Python license, > which (for 1.5.2) was: > > ... Ah - with hindsight I should have checked the older Python licenses, but I was thinking more of their current very long version. > You'll see that Fredrik Lundh refers to it as the "Historical > Permission Notice and Disclaimer", and points to: > > http://opensource.org/licenses/historical.php > > Further note that the OSI comments that "This License has been > voluntarily deprecated by its author" .. whatever that > means ... and that that http://opensource.org/proliferation-report > describes it as "redundant with more popular licenses", and > more specifically the BSD. > >> In theory we could ask the OSI to approve our current license, but as >> they explain "yet another license" is not a good thing to encourage: >> http://opensource.org/proliferation > > It wouldn't be a "yet another license" as it's already > registered with the OSI ... almost. 
> > The one odd alteration I made was to add "with or without > modifications", because some people on comp.lang.python > expressed concern that "use, copy, modify, and distribute" > could be interpreted to be restrictive, as in "you can > modify it original source code, or distribute the original > source code, but you can't distribute the modified source > code. I've since learned that this is a hyper-picky > interpretation with no legal bearing. > > I don't know if that "with or without modifications" is > enough different that the OSI would say it's doesn't fall > under the 'Historical Permission Notice and Disclaimer', Thanks for that background information. Educational. > In any case, I agree with a relicensing. The current > license is from a bygone era. Nowadays I just pick the MIT > license. > > If there's anything copyright by me still remaining in > Biopython, I hereby relicense it under the MIT and/or one > of the standard n-clause BSD licenses, at your choice. That's great Andrew - thank you, Peter From p.j.a.cock at googlemail.com Tue Aug 6 18:51:22 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Tue, 6 Aug 2013 23:51:22 +0100 Subject: [Biopython-dev] Adjusting the xxMotif wrapper / Bio.Application plans Message-ID: Hi Christian et al., I've just noticed something in the XXmotif wrapper which I should have raised back in November 2012 when it was committed. This is to do with the way the options were define, e.g. _Option(["--negSet", "negSet", "negset", "NEGSET"], "sequence set which has to be used as a reference set", filename = True, equate = False), The first argument is a list of names, aliases which can be used via the (legacy) set_parameter method. Of these the first is what goes in the actual command string, and the last must be a valid Python identifier and becomes a property and a keyword argument for the __init__ method (and ideally follow PEP8 guidelines). 
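A minimal sketch of the alias convention just described, written in plain Python rather than against Bio.Application itself. OptionSpec, its attributes, and the file name background.fasta are hypothetical names chosen for illustration; only the rule (the first name goes on the command line, the last name becomes the Python property and keyword) comes from the message.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class OptionSpec:
    """Hypothetical option definition (illustration only, not Bio.Application)."""

    names: List[str]
    description: str
    value: Optional[str] = None

    def __post_init__(self) -> None:
        # The last alias is exposed as a property and keyword argument,
        # so it has to be a valid Python identifier.
        if not self.names[-1].isidentifier():
            raise ValueError(f"{self.names[-1]!r} is not a valid identifier")

    @property
    def switch(self) -> str:
        """First alias: the text placed in the command string."""
        return self.names[0]

    @property
    def python_name(self) -> str:
        """Last alias: the property and keyword argument name."""
        return self.names[-1]

    def render(self) -> str:
        """Return the command-line fragment, space separated (no '=')."""
        return "" if self.value is None else f"{self.switch} {self.value}"


# Four aliases: the all upper case spelling ends up as the Python name.
four = OptionSpec(["--negSet", "negSet", "negset", "NEGSET"],
                  "reference sequence set")

# Two aliases: the lower case spelling ends up as the Python name.
two = OptionSpec(["--negSet", "negset"],
                 "reference sequence set",
                 value="background.fasta")  # hypothetical file name

print(four.python_name)
print(two.python_name)
print(two.render())
```

With only two names per option there is no ambiguity about which spelling becomes the keyword argument, which is the point of the suggestion that follows.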
Normally the _Option would just have TWO aliases, in this case ["--negSet", "negset"] would seem best. Clearly I'd not documented this well enough, but I've tried to make this more explicit now: https://github.com/biopython/biopython/commit/39a88714ab7ee7a8dc4ed2b7a7ea71569fdd4293 Was there a special reason for all these case variants in the XXmotif options?? We could perhaps just change this now in the newer Bio.motifs module, despite this being live in the Biopython 1.61 release... since right now the nasty all upper case aliases are being used as the property names and keyword names. But that could break a few scripts already using the XXmotif wrapper in Bio.motifs.applications. Looking ahead, other than set_parameter, all the other legacy bits in Bio.Application have been removed - so we could take a fresh look at whether we can transition to a more explicit application definition, which I hope is possible with the class files defining these properties explicitly (perhaps with decorators for things like validation methods) - rather than implicitly as now via the __init__ method which doesn't suit things like autogenerated API docs. There may be a catch in how to best make the parameter order explicit (currently done via the parameters being in a list) which can be vital for many command line tools. Regards, Peter From christian at brueffer.de Thu Aug 8 06:37:19 2013 From: christian at brueffer.de (Christian Brueffer) Date: Thu, 08 Aug 2013 12:37:19 +0200 Subject: [Biopython-dev] Adjusting the xxMotif wrapper / Bio.Application plans In-Reply-To: References: Message-ID: <520374DF.9070301@brueffer.de> On 8/7/13 0:51 , Peter Cock wrote: > Hi Christian et al., > > I've just noticed something in the XXmotif wrapper which > I should have raised back in November 2012 when it was > committed. This is to do with the way the options were > define, e.g.
> > _Option(["--negSet", "negSet", "negset", "NEGSET"], > "sequence set which has to be used as a reference set", > filename = True, > equate = False), > > The first argument is a list of names, aliases which can > be used via the (legacy) set_parameter method. Of > these the first is what goes in the actual command > string, and the last must be a valid Python identifier > and becomes a property and a keyword argument > for the __init__ method (and ideally follow PEP8 > guidelines). > Yeah, unfortunately I wasn't aware of this detail. > Normally the _Option would just have TWO alias, > in this case ["--negSeq, "negset"] would seem best. > > Clearly I'd not documented this well enough, but > I've tried to make this more explicit now: > https://github.com/biopython/biopython/commit/39a88714ab7ee7a8dc4ed2b7a7ea71569fdd4293 > > Was there a special reason for all these case variants > in the XXmotif options?? > I basically followed the example set by Bio/Align/Applications/_Clustalw.py. The "rationale" was to allow for people to use their favourite spelling variety. I guess it was bad luck this happened to serve as an example, as it was the first piece of code I ever touched in BioPython. It would be nice to streamline all application wrappers in this regard sometime... Chris From p.j.a.cock at googlemail.com Thu Aug 8 07:00:22 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Thu, 8 Aug 2013 12:00:22 +0100 Subject: [Biopython-dev] Adjusting the xxMotif wrapper / Bio.Application plans In-Reply-To: <520374DF.9070301@brueffer.de> References: <520374DF.9070301@brueffer.de> Message-ID: On Thu, Aug 8, 2013 at 11:37 AM, Christian Brueffer wrote: >> >> Was there a special reason for all these case variants >> in the XXmotif options?? > > I basically followed the example set by > Bio/Align/Applications/_Clustalw.py. Ah. 
Without checking I think maybe the ClustalW documentation used both cases - but the order was deliberately with the lower case one last as that was used in the Python object as the property name and keyword. > The "rationale" was to allow for people to use their favourite > spelling variety. > > I guess it was bad luck this happened to serve as an example, as it > was the first piece of code I ever touched in BioPython. > > It would be nice to streamline all application wrappers in this regard > sometime... Yeah, perhaps we can formally deprecate set_parameter in the next release which means all the aliases 'go away' and that leaves us with just the final entry exposed as the usable property name and keyword. Peter From arklenna at gmail.com Thu Aug 8 15:54:58 2013 From: arklenna at gmail.com (Lenna Peterson) Date: Thu, 8 Aug 2013 15:54:58 -0400 Subject: [Biopython-dev] PDB occupancy behavior Message-ID: Hi all, I just submitted a pull request I'd like wider feedback on. https://github.com/biopython/biopython/pull/207 In summary, I am using software-produced PDB files that simply stop after the coordinate data, so occupancy data is missing. Currently, the Biopython PDBParser sets missing or blank occupancy to 0.0. I am suggesting changing this to 1.0. I would like to see if anyone knows of situations in which this would be a bad idea. Cheers, Lenna From anaryin at gmail.com Thu Aug 8 16:02:39 2013 From: anaryin at gmail.com (João Rodrigues) Date: Thu, 8 Aug 2013 13:02:39 -0700 Subject: [Biopython-dev] [Biopython] PDB occupancy behavior In-Reply-To: References: Message-ID: Hi Lenna, As I mentioned in the Github email, I think it's fine. It doesn't matter if the occupancy is 0 or 1 in case of a model most of the time. I agree with it. The only bad thing I can think about is having occupancy for a certain atom larger than 1 in some bogus cases but to be honest, no software that I know of bothers checking that...
Cheers, João 2013/8/8 Lenna Peterson > Hi all, > > I just submitted a pull request I'd like wider feedback on. > > https://github.com/biopython/biopython/pull/207 > > In summary, I am using software-produced PDB files that simply stop after > the coordinate data, so occupancy data is missing. Currently, the Biopython > PDBParser sets missing or blank occupancy to 0.0. I am suggesting changing > this to 1.0. > > I would like to see if anyone knows of situations in which this would be a > bad idea. > > Cheers, > > Lenna > _______________________________________________ > Biopython mailing list - Biopython at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython > From p.j.a.cock at googlemail.com Thu Aug 8 18:37:27 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Thu, 8 Aug 2013 23:37:27 +0100 Subject: [Biopython-dev] [Biopython] PDB occupancy behavior In-Reply-To: References: Message-ID: Thanks everyone - that seems like a clear consensus, patch applied :) Peter On Thu, Aug 8, 2013 at 9:30 PM, Sampson, Jared wrote: > Thanks, Lenna and João - > > I also agree, 1.0 is a better default occupancy value. For most > structural manipulation purposes, unless specified otherwise, we must assume > the atoms listed are present in the structure at full occupancy. Setting a > reduced occupancy can be useful for partially bound ligands, disordered > loops, and so forth, but doing so is the exception, not the rule. > > Cheers, > Jared > > -- > Jared Sampson > Xiangpeng Kong Lab > NYU Langone Medical Center > Old Public Health Building, Room 610 > 341 East 25th Street > New York, NY 10016 > 212-263-7898 > http://kong.med.nyu.edu/ > > > > > On Aug 8, 2013, at 4:02 PM, João Rodrigues > > wrote: > > Hi Lenna, > > As I mentioned in the Github email, I think it's fine. It doesn't matter > if the occupancy is 0 or 1 in case of a model most of the time. I agree > with it.
The only bad thing I can think about is having occupancy for > a certain atom larger than 1 in some bogus cases but to be honest, > no software that I know of bothers checking that... > > Cheers, > > João > > > 2013/8/8 Lenna Peterson > > > Hi all, > > I just submitted a pull request I'd like wider feedback on. > > https://github.com/biopython/biopython/pull/207 > > In summary, I am using software-produced PDB files that simply stop after > the coordinate data, so occupancy data is missing. Currently, the > Biopython PDBParser sets missing or blank occupancy to 0.0. I am > suggesting changing this to 1.0. > > I would like to see if anyone knows of situations in which this would be a > bad idea. > > Cheers, > > Lenna From ben at benfulton.net Thu Aug 8 21:03:10 2013 From: ben at benfulton.net (Ben Fulton) Date: Thu, 8 Aug 2013 21:03:10 -0400 Subject: [Biopython-dev] 1.62b test coverage report In-Reply-To: References: Message-ID: Everything else is passing. The PopGen files pass as well after installing them from source. On Mon, Aug 5, 2013 at 9:43 AM, Peter Cock wrote: > On Mon, Aug 5, 2013 at 1:14 PM, Peter Cock > wrote: > > On Mon, Aug 5, 2013 at 12:46 PM, Peter Cock > wrote: > >> On Mon, Aug 5, 2013 at 2:28 AM, Ben Fulton wrote: > >>> > >>> The site http://www.rubic.rdg.ac.uk/~mab/software.html is down, and I > can't > >>> find anywhere else to install the PopGen software from. > >>> > >> > >> There seems to be a fairly recent snapshot on archive.org, > >> > http://web.archive.org/web/20120510013219/http://www.rubic.rdg.ac.uk/~mab/software.html > >> > >> Meanwhile, I have emailed Dr. Mark Beaumont at Reading > >> University to ask about the server status. > > > > Mark has moved to Bristol: > > http://www.maths.bris.ac.uk/people/profile/mamab > > > > FDist and DFDist are available here now: > > http://www.maths.bris.ac.uk/~mamab/ > > > > We need to update the Biopython documentation (and check > > those versions from Bristol still work with our tests).
> > > > Tiago, could you handle that? > > According to his email auto-reply, Tiago is away right now. > > I've updated a couple of URLs in the source code: > > https://github.com/biopython/biopython/commit/70667063701041b73147c502c933fa8bfde1d850 > > Ben - did you see anything else which needs updating here? > > Thanks, > > Peter > From mok at bioxray.dk Fri Aug 9 04:39:55 2013 From: mok at bioxray.dk (Morten Kjeldgaard) Date: Fri, 9 Aug 2013 10:39:55 +0200 Subject: [Biopython-dev] PDB occupancy behavior Message-ID: Lenna wrote: > In summary, I am using software-produced PDB files that simply stop after > the coordinate data, so occupancy data is missing. Currently, the Biopython > PDBParser sets missing or blank occupancy to 0.0. I am suggesting changing > this to 1.0. I think it is an incorrect default behaviour to set the occupancy to 1 if it's not present in the file. If the occupancy is not there, you can't say anything about it, and it should be set to 0, so the current defaults are correct IMO. If, for some reason, you NEED the occupancy to be 1, and it is not, it is very simple to write a loop modifying it. I.e. special needs should be taken care of in the users program, not Bio.PDB. Cheers, Morten -- Morten Kjeldgaard, asc. professor, MSc, PhD Dept. of Molecular Biology and Genetics, Aarhus University Gustav Wieds Vej 10C, Building 3135, DK-8000 Aarhus C, Denmark. From mok at bioxray.dk Fri Aug 9 04:33:37 2013 From: mok at bioxray.dk (Morten Kjeldgaard) Date: Fri, 9 Aug 2013 10:33:37 +0200 Subject: [Biopython-dev] Redmine issue 2727 ready for pull Message-ID: <0743AFDE-D1B2-4348-AFFE-3CE5CC227FE4@bioxray.dk> Hi, I've finally gotten around to following up to a very old patch I sent to the redmine bug tracker [1]. The patch addresses the problem that Bio.PDB does not parse the important CRYST1 record. In the bug comments, Peter Cock asked to include the explanation of the new keys in the docstring. That has now been done. 
Peter also asks about the default values chosen (if the CRYST1 header is not present). These are probably universally chosen default values in various crystallographic programs, and these values are also used in PDB entries containing NMR entries, for example. My github branch containing the patch #2727 is in [2]. I am using Bio.PDB quite a lot, and I would like to contribute more to it in the future. Cheers, Morten [1] https://redmine.open-bio.org/issues/2727 [2] https://github.com/mok0/biopython/tree/pdbwork From p.j.a.cock at googlemail.com Fri Aug 9 04:47:15 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Fri, 9 Aug 2013 09:47:15 +0100 Subject: [Biopython-dev] PDB occupancy behavior In-Reply-To: References: Message-ID: On Fri, Aug 9, 2013 at 9:39 AM, Morten Kjeldgaard wrote: > Lenna wrote: > > > In summary, I am using software-produced PDB files that simply stop > after > > the coordinate data, so occupancy data is missing. Currently, the > Biopython > > PDBParser sets missing or blank occupancy to 0.0. I am suggesting > changing > > this to 1.0. > > I think it is an incorrect default behaviour to set the occupancy to 1 if it's not present in the file. If the occupancy is not there, you can't say anything about it, and it should be set to 0, so the current defaults are correct IMO. > > If, for some reason, you NEED the occupancy to be 1, and it is not, it is very simple to write a loop modifying it. I.e. special needs should be taken care of in the users program, not Bio.PDB. > > Cheers, > Morten > > How about the special float values NaN or NA instead? Or the Python special value None? Peter From mok at bioxray.dk Fri Aug 9 05:07:13 2013 From: mok at bioxray.dk (Morten Kjeldgaard) Date: Fri, 9 Aug 2013 11:07:13 +0200 Subject: [Biopython-dev] PDB occupancy behavior In-Reply-To: References: Message-ID: <3626CAF5-41E2-43C7-8C0E-49FC83786EE0@bioxray.dk> On 09/08/2013, at 10:47, Peter Cock wrote: > How about the special float values NaN or NA instead? > Or the Python special value None? TBH I don't think there is any good reason to change the current defaults. On the contrary, we should be careful when changing default values since this might break users' programs. My point is, that Lenna wants to read files that does not follow the PDB standard, and so she needs to make provisions for that in her own program, not the toolkit. Putting None in the value of a field that isn't there, but should be according the format specification is more reasonable, since it alerts the user to the fact that something is fishy. However, it should only be done this way if that is a philosophy used throughout the Biopython toolkit. Is it? I would warn against using NaN since it is non-pythonic and a nightmare to deal with in practice.
Cheers, Morten From p.j.a.cock at googlemail.com Fri Aug 9 07:06:46 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Fri, 9 Aug 2013 12:06:46 +0100 Subject: [Biopython-dev] PDB occupancy behavior In-Reply-To: <3626CAF5-41E2-43C7-8C0E-49FC83786EE0@bioxray.dk> References: <3626CAF5-41E2-43C7-8C0E-49FC83786EE0@bioxray.dk> Message-ID: On Fri, Aug 9, 2013 at 10:07 AM, Morten Kjeldgaard wrote: > On 09/08/2013, at 10:47, Peter Cock wrote: > > > How about the special float values NaN or NA instead? > > Or the Python special value None? > > TBH I don't think there is any good reason to change the current defaults. > On the contrary, we should be careful when changing default values since > this might break users' programs. > > My point is, that Lenna wants to read files that does not follow the PDB > standard, and so she needs to make provisions for that in her own program, > not the toolkit. > > Do you think this should be something handled differently in strict and permissive mode? Should missing occupancy give a warning or error in strict mode? Peter From arklenna at gmail.com Fri Aug 9 09:07:41 2013 From: arklenna at gmail.com (Lenna Peterson) Date: Fri, 9 Aug 2013 09:07:41 -0400 Subject: [Biopython-dev] PDB occupancy behavior In-Reply-To: References: <3626CAF5-41E2-43C7-8C0E-49FC83786EE0@bioxray.dk> Message-ID: On Friday, 9 August 2013, Peter Cock wrote: > On Fri, Aug 9, 2013 at 10:07 AM, Morten Kjeldgaard > > wrote: > > > On 09/08/2013, at 10:47, Peter Cock > > wrote: > > > > > How about the special float values NaN or NA instead? > > > Or the Python special value None? > > > > TBH I don't think there is any good reason to change the current > defaults. > > On the contrary, we should be careful when changing default values since > > this might break users' programs. > > > > My point is, that Lenna wants to read files that does not follow the PDB > > standard, and so she needs to make provisions for that in her own > program, > > not the toolkit. 
> > > > > Do you think this should be something handled differently in strict and > permissive mode? Should missing occupancy give a warning or error in strict > mode? (Resending to dev list) None in permissive mode makes a lot of sense to me. Missing occupancy is a fatal error in strict mode. Lenna From p.j.a.cock at googlemail.com Fri Aug 9 09:14:44 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Fri, 9 Aug 2013 14:14:44 +0100 Subject: [Biopython-dev] PDB occupancy behavior In-Reply-To: References: <3626CAF5-41E2-43C7-8C0E-49FC83786EE0@bioxray.dk> Message-ID: On Fri, Aug 9, 2013 at 2:07 PM, Lenna Peterson wrote: > On Friday, 9 August 2013, Peter Cock wrote: > >> On Fri, Aug 9, 2013 at 10:07 AM, Morten Kjeldgaard > >> wrote: >> >> > On 09/08/2013, at 10:47, Peter Cock > >> wrote: >> > >> > > How about the special float values NaN or NA instead? >> > > Or the Python special value None? >> > >> > TBH I don't think there is any good reason to change the current >> defaults. >> > On the contrary, we should be careful when changing default values since >> > this might break users' programs. >> > >> > My point is, that Lenna wants to read files that does not follow the PDB >> > standard, and so she needs to make provisions for that in her own >> > program, not the toolkit. >> > >> > >> Do you think this should be something handled differently in strict and >> permissive mode? Should missing occupancy give a warning or error in strict >> mode? > > (Resending to dev list) > > None in permissive mode makes a lot of sense to me. > > Missing occupancy is a fatal error in strict mode. > > Lenna Good (error in strict mode). Do you think a warning in permissive mode for missing occupancy is also worth adding, or would using None as the value indicate that nicely? 
Peter From arklenna at gmail.com Fri Aug 9 09:46:54 2013 From: arklenna at gmail.com (Lenna Peterson) Date: Fri, 9 Aug 2013 09:46:54 -0400 Subject: [Biopython-dev] PDB occupancy behavior In-Reply-To: References: <3626CAF5-41E2-43C7-8C0E-49FC83786EE0@bioxray.dk> Message-ID: On Friday, 9 August 2013, Peter Cock wrote: > On Fri, Aug 9, 2013 at 2:07 PM, Lenna Peterson > > wrote: > > On Friday, 9 August 2013, Peter Cock wrote: > > > >> On Fri, Aug 9, 2013 at 10:07 AM, Morten Kjeldgaard > > > >> wrote: > >> > >> > On 09/08/2013, at 10:47, Peter Cock > > > >> wrote: > >> > > >> > > How about the special float values NaN or NA instead? > >> > > Or the Python special value None? > >> > > >> > TBH I don't think there is any good reason to change the current > >> defaults. > >> > On the contrary, we should be careful when changing default values > since > >> > this might break users' programs. > >> > > >> > My point is, that Lenna wants to read files that does not follow the > PDB > >> > standard, and so she needs to make provisions for that in her own > >> > program, not the toolkit. > >> > > >> > > >> Do you think this should be something handled differently in strict and > >> permissive mode? Should missing occupancy give a warning or error in > strict > >> mode? > > > > (Resending to dev list) > > > > None in permissive mode makes a lot of sense to me. > > > > Missing occupancy is a fatal error in strict mode. > > > > Lenna > > Good (error in strict mode). > > Do you think a warning in permissive mode for missing occupancy > is also worth adding, or would using None as the value indicate > that nicely? > > Peter > I have some concern about changing the type of an attribute but I imagine any end user who cares about occupancy doesn't want spurious values of either 1.0 or 0.0 anyway. I'm not at a computer right now but I believe most problems in the PDB parser are fatal in strict and warnings in permissive. So there should already be a warning in place. 
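The strict/permissive split described above can be sketched roughly as follows. This is an illustration only, not the actual Bio.PDB code: `parse_occupancy`, `OccupancyError` and `OccupancyWarning` are hypothetical names chosen for the example.

```python
import warnings


class OccupancyWarning(UserWarning):
    """Hypothetical warning category for recoverable parse problems."""


class OccupancyError(Exception):
    """Hypothetical exception raised for parse problems in strict mode."""


def parse_occupancy(field, permissive=True):
    """Convert the occupancy text of an ATOM record to a float.

    In permissive mode a missing or unparsable value gives a warning
    and None; in strict mode it is a fatal error.
    """
    try:
        return float(field)
    except ValueError:
        if not permissive:
            raise OccupancyError("Invalid or missing occupancy: %r" % field)
        warnings.warn("Invalid or missing occupancy; using None",
                      OccupancyWarning)
        return None
```

Returning None rather than 0.0 or 1.0 keeps the warning and the value consistent: both say that the data were not in the file.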
It occurred to me it would also be possible to create an "ultra-permissive" mode designed for parsing computationally produced files, and suppress some of the warnings (e.g. missing occupancy and B-factor). That way, the current behavior could be left unchanged. Possibly a permissiveness level (0 for strict, 1 for current permissive, 2 for even more permissive). Anyway, I'd be happy to implement any of these options (current parser to None, restore previous behavior and None in a new permissiveness level, other?) and of course update the unit test. Cheers, Lenna From p.j.a.cock at googlemail.com Fri Aug 9 10:22:29 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Fri, 9 Aug 2013 15:22:29 +0100 Subject: [Biopython-dev] PDB occupancy behavior In-Reply-To: References: <3626CAF5-41E2-43C7-8C0E-49FC83786EE0@bioxray.dk> Message-ID: On Fri, Aug 9, 2013 at 2:46 PM, Lenna Peterson wrote: > On Friday, 9 August 2013, Peter Cock wrote: >> >> Good (error in strict mode). >> >> Do you think a warning in permissive mode for missing occupancy >> is also worth adding, or would using None as the value indicate >> that nicely? >> >> Peter > > > > I have some concern about changing the type of an attribute but I imagine > any end user who cares about occupancy doesn't want spurious values of > either 1.0 or 0.0 anyway. > > I'm not at a computer right now but I believe most problems in the PDB > parser are fatal in strict and warnings in permissive. So there should > already be a warning in place. > > It occurred to me it would also be possible to create an "ultra-permissive" > mode designed for parsing computationally produced files, and suppress some > of the warnings (e.g. missing occupancy and B-factor). That way, the current > behavior could be left unchanged. Possibly a permissiveness level (0 for > strict, 1 for current permissive, 2 for even more permissive).
> > Anyway, I'd be happy to implement any of these options (current parser to > None, restore previous behavior and None in a new permissiveness level, > other?) and of course update the unit test. You should be able to silence the PDB warnings in two lines anyway, so I don't think we really need an ultra-permissive no-warnings mode. Peter From anaryin at gmail.com Fri Aug 9 13:26:59 2013 From: anaryin at gmail.com (João Rodrigues) Date: Fri, 9 Aug 2013 10:26:59 -0700 Subject: [Biopython-dev] Moratorium on commits? Message-ID: Dear all, The situation with the occupancy in the PDBParser led me to think of one thing. Since not everybody is in the same timezone, has the same availability, etc, how about we introduce a brief moratorium on commits of, say, 3 days (except for critical bug fixes)? This will probably give everybody enough time to read the email and give their opinion. The downside is that it will make things roll a bit slower, but then again, 3 days is not so much. Cheers, João From p.j.a.cock at googlemail.com Fri Aug 9 15:06:21 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Fri, 9 Aug 2013 20:06:21 +0100 Subject: [Biopython-dev] Moratorium on commits? In-Reply-To: References: Message-ID: On Fri, Aug 9, 2013 at 6:26 PM, João Rodrigues wrote: > Dear all, > > The situation with the occupancy in the PDBParser led to think of one > thing. > > Since not everybody is in the same timezone, has the same availability, > etc, what about we introduce a brief moratorium over commits of say 3 days > (except for critical bug fixes)? This will give everybody probably enough > time to read the email and give their opinion. > > The downside is that it will make things roll a bit slower but then again, > 3 days is not so much.. > > Cheers, > > João I don't think that's really needed for small commits like this which are simple to interpret.
In this case there were three opinions in favour of the idea, with a fourth counter view appearing later, resulting in a further tweak. Longer periods of discussion are far more important on large code additions or major changes. Peter From arklenna at gmail.com Sat Aug 10 20:43:36 2013 From: arklenna at gmail.com (Lenna Peterson) Date: Sat, 10 Aug 2013 20:43:36 -0400 Subject: [Biopython-dev] Redmine issue 2727 ready for pull In-Reply-To: <0743AFDE-D1B2-4348-AFFE-3CE5CC227FE4@bioxray.dk> References: <0743AFDE-D1B2-4348-AFFE-3CE5CC227FE4@bioxray.dk> Message-ID: Hi Morten, I think this looks great. Why not submit a pull request? Cheers, Lenna On Fri, Aug 9, 2013 at 4:33 AM, Morten Kjeldgaard wrote: > Hi, > > I've finally gotten around to following up to a very old patch I sent to > the redmine bug tracker [1]. The patch addresses the problem that Bio.PDB > does not parse the important CRYST1 record. In the bug comments, Peter > Cock asked to include the explanation of the new keys in the docstring. > That has now been done. > > Peter also asks about the default values chosen (if the CRYST1 header is > not present). These are probably universally chosen default values in > various crystallographic programs, and these values are also used in PDB > entries containinging NMR entries, for example. > > My github branch containing the patch #2727 is in [2]. I am using Bio.PDB > quite a lot, and I would like to contribute more to it in the future. 
> > Cheers, > Morten > > > [1] https://redmine.open-bio.org/issues/2727 > [2] https://github.com/mok0/biopython/tree/pdbwork > _______________________________________________ > Biopython-dev mailing list > Biopython-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython-dev > From mok at bioxray.dk Sun Aug 11 14:33:05 2013 From: mok at bioxray.dk (Morten Kjeldgaard) Date: Sun, 11 Aug 2013 20:33:05 +0200 Subject: [Biopython-dev] Redmine issue 2727 ready for pull In-Reply-To: References: <0743AFDE-D1B2-4348-AFFE-3CE5CC227FE4@bioxray.dk> Message-ID: On 11/08/2013, at 02:43, Lenna Peterson wrote: > I think this looks great. Why not submit a pull request? Thanks! Excuse me for my ignorance, but how do I submit a pull request? (I thought that is what I did by posting to the -dev list). Cheers, Morten From mok at bioxray.dk Sun Aug 11 14:28:36 2013 From: mok at bioxray.dk (Morten Kjeldgaard) Date: Sun, 11 Aug 2013 20:28:36 +0200 Subject: [Biopython-dev] Moratorium on commits? In-Reply-To: References: Message-ID: On 09/08/2013, at 21:06, Peter Cock wrote: > On Fri, Aug 9, 2013 at 6:26 PM, Jo?o Rodrigues wrote: >> Dear all, >> >> The situation with the occupancy in the PDBParser led to think of one >> thing. >> >> Since not everybody is in the same timezone, has the same availability, >> etc, what about we introduce a brief moratorium over commits of say 3 days >> (except for critical bug fixes)? This will give everybody probably enough >> time to read the email and give their opinion. >> >> The downside is that it will make things roll a bit slower but then again, >> 3 days is not so much.. >> >> Cheers, >> >> Jo?o > > I don't think that's really needed for small commits like > this which are simple to interpret. In this case there were > three opinions in favour of the idea, with a fourth counter > view appearing later, resulting in a further tweak. 
> > Longer periods of discussion are far more important on > large code additions or major changes. Sorry, but I don't agree that this is a "small commit". It may not be large in terms of number of bytes, but it is large in terms of impact, since it affects users' programs in unpredictable ways. Whenever a change is made that affects values returned to the user, it is worth spending a few days discussing it, to let people have a chance to think through the consequences of the change. Cheers, Morten From arklenna at gmail.com Sun Aug 11 14:40:38 2013 From: arklenna at gmail.com (Lenna Peterson) Date: Sun, 11 Aug 2013 14:40:38 -0400 Subject: [Biopython-dev] Redmine issue 2727 ready for pull In-Reply-To: References: <0743AFDE-D1B2-4348-AFFE-3CE5CC227FE4@bioxray.dk> Message-ID: On Sun, Aug 11, 2013 at 2:33 PM, Morten Kjeldgaard wrote: > > On 11/08/2013, at 02:43, Lenna Peterson wrote: > > > I think this looks great. Why not submit a pull request? > > Thanks! Excuse me for my ignorance, but how do I submit a pull request? (I > thought that is what I did by posting to the -dev list). > > Cheers, > Morten Hey Morten, It's good to let the dev list know you have code ready to merge in, but if you do it on github, it will show up here too: https://github.com/biopython/biopython/pulls Here's github's instructions: https://help.github.com/articles/creating-a-pull-request Cheers, Lenna From p.j.a.cock at googlemail.com Sun Aug 11 16:50:46 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Sun, 11 Aug 2013 21:50:46 +0100 Subject: [Biopython-dev] Moratorium on commits? In-Reply-To: References: Message-ID: On Sun, Aug 11, 2013 at 7:28 PM, Morten Kjeldgaard wrote: > > On 09/08/2013, at 21:06, Peter Cock wrote: > >> On Fri, Aug 9, 2013 at 6:26 PM, Jo?o Rodrigues wrote: >>> Dear all, >>> >>> The situation with the occupancy in the PDBParser led to think of one >>> thing. 
>>> >>> Since not everybody is in the same timezone, has the same availability, >>> etc, what about we introduce a brief moratorium over commits of say 3 >>> days (except for critical bug fixes)? This will give everybody probably >>> enough time to read the email and give their opinion. >>> >>> The downside is that it will make things roll a bit slower but then >>> again, 3 days is not so much.. >>> >>> Cheers, >>> >>> João >> >> I don't think that's really needed for small commits like >> this which are simple to interpret. In this case there were >> three opinions in favour of the idea, with a fourth counter >> view appearing later, resulting in a further tweak. >> >> Longer periods of discussion are far more important on >> large code additions or major changes. > > Sorry, but I don't agree that this is a "small commit". It may > not be large in terms of number of bytes, but it is large in > terms of impact, since it affects users' programs in > unpredictable ways. Hello again Morten, I did mean small in terms of the amount of code changed, which I tried to make clear from the rest of the email, but as discussed below, I think the PDB occupancy change was also small in terms of behaviour. > Whenever a change is made that affects values > returned to the user, it is worth spending a few days > discussing it, to let people have a chance to think > through the consequences of the change. Almost any change impacts the user in some way. I still feel this was a minor change (although of course important to some, including you). This is parsing of malformed PDB files where the user ALREADY gets a warning (or error in strict mode, where there would be no functional change) that there is a problem with the occupancy data.
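Earlier in this thread Peter noted that the PDB warnings can be silenced in two lines. A minimal sketch of that idea, using only the standard `warnings` module, is given here. To keep it runnable without Biopython installed, `PDBConstructionWarning` is defined locally as a stand-in and `noisy_parse` is a hypothetical function; in Biopython itself the class would, to the best of my knowledge, be imported from `Bio.PDB.PDBExceptions`.

```python
import warnings


class PDBConstructionWarning(UserWarning):
    """Local stand-in for the warning class raised by the PDB parser."""


def noisy_parse():
    """Hypothetical parser call that warns about a malformed record."""
    warnings.warn("Invalid or missing occupancy", PDBConstructionWarning)
    return "structure"


# catch_warnings() restores the previous filters on exit, so the
# suppression does not leak into the rest of the program.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # The suppression itself: one filter on the warning category
    # (plus the import of the category in real code).
    warnings.simplefilter("ignore", PDBConstructionWarning)
    result = noisy_parse()
```

Because the filter is keyed on the warning category, unrelated warnings from other code are still reported.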
One reason why I specifically talked about small commits (in the sense of a simple diff) above is they are trivial to revert if the need arises, or as in this case, modify: https://github.com/biopython/biopython/commit/500c3c2ea900fd8c8f5123f571d4d9a244ee898e This change was suggested and supported by people who've been actively contributing to the Biopython structural module for some time, so I had reason to trust their good judgement, and as I wrote at the time there was a clear consensus with three people in all happy with the idea: http://lists.open-bio.org/pipermail/biopython-dev/2013-August/010773.html Changes where there isn't clear agreement are generally discussed over a longer time period. Note that Biopython is already relatively strict about not breaking things and preserving backwards compatibility (to the point where it does delay new features). We do care about not breaking existing scripts without warning - so when people speak up on the list that something is likely to cause them trouble, we do listen. Is that any clearer? Regards, Peter From zruan1991 at gmail.com Sun Aug 11 18:04:10 2013 From: zruan1991 at gmail.com (Zheng Ruan) Date: Sun, 11 Aug 2013 18:04:10 -0400 Subject: [Biopython-dev] Codon Alignment GSoC Update Message-ID: Hi all, An update of Codon Alignment Project can be found at (http://zruanweb.com/). In the next week, I will be implementing the Maximum Likelihood method for dN/dS ratio estimation. I do not anticipate to write any code for the optimization and Scipy's functionality is most suitable to be used here. This might be a new dependency for Biopython. Is it okay to add this? Or are there some other functions in Biopython for optimization problems? Thanks! Best, Zheng Ruan From kai.blin at biotech.uni-tuebingen.de Mon Aug 12 06:53:17 2013 From: kai.blin at biotech.uni-tuebingen.de (Kai Blin) Date: Mon, 12 Aug 2013 12:53:17 +0200 Subject: [Biopython-dev] Moratorium on commits? 
In-Reply-To: References: Message-ID: <5208BE9D.1090900@biotech.uni-tuebingen.de> On 2013-08-09 19:26, João Rodrigues wrote: Dear biopython devs, > Since not everybody is in the same timezone, has the same availability, > etc, what about we introduce a brief moratorium over commits of say 3 days > (except for critical bug fixes)? This will give everybody probably enough > time to read the email and give their opinion. I've been through discussions like this before, in a lot of open source projects I'm involved in. I don't think this is a good step to take. Saying that "all patches need to wait unless they're special" will eventually lead to a dilution of what is considered special, and then lead to a point where most patches by core contributors happen to be special and patches by new contributors aren't. Because the policy doesn't explicitly state this, you then create a very unwelcoming atmosphere for the project. I would recommend considering whether avoiding the occasional revert is worth that cost. Personally, one of the things I like about BioPython is how fast I'm able to get bugfixes in. My two cents, Kai -- Dipl.-Inform. Kai Blin kai.blin at biotech.uni-tuebingen.de Institute for Microbiology and Infection Medicine Division of Microbiology/Biotechnology Eberhard-Karls-Universität Tübingen Auf der Morgenstelle 28 Phone : ++49 7071 29-78841 D-72076 Tübingen Fax : ++49 7071 29-5979 Germany Homepage: http://www.mikrobio.uni-tuebingen.de/ag_wohlleben From tiagoantao at gmail.com Mon Aug 12 07:33:40 2013 From: tiagoantao at gmail.com (Tiago Antão) Date: Mon, 12 Aug 2013 12:33:40 +0100 Subject: [Biopython-dev] Moratorium on commits? In-Reply-To: <5208BE9D.1090900@biotech.uni-tuebingen.de> References: <5208BE9D.1090900@biotech.uni-tuebingen.de> Message-ID: Hi, On 12 August 2013 11:53, Kai Blin wrote: > Personally, one of the things I like about BioPython is how fast I'm able > to get bugfixes in.
> > I agree that the light approach to process is great. 99% of the patches are uncontroversial and would suffer from a heavier process. For the rare cases where there are problems, revert can be used. My code has been reverted a couple of times and I am fine with that (when one commits to a public project with shared ownership one should expect peer-review, sometimes heated discussion and corrections - it is normal). If one thinks a change can be problematic, an initial discussion would be a good idea. Of course, sometimes we do not know until after the fact; then again, the good thing about version control is that we can undo things... Generally things have been working very well and I would not change the process to something heavier just because of a single case. Single cases should be sorted on a case-by-case basis, with no stress. My 2p, Tiago From yeyanbo289 at gmail.com Mon Aug 12 09:25:22 2013 From: yeyanbo289 at gmail.com (Yanbo Ye) Date: Mon, 12 Aug 2013 21:25:22 +0800 Subject: [Biopython-dev] GSOC weekly update 8 Message-ID: Hi all, My update about Biopython.Phylo project can be found here: http://blog.yeyanbo.com/posts/google-summer-of-code-9.html Best, Yanbo -- Yanbo Ye Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences 190 Kaiyuan Avenue, Science Park, Guangzhou, China Email: ye_yanbo at gibh.ac.cn Web: http://www.yeyanbo.com Phone: (86)-020-32093810 From mok at bioxray.dk Mon Aug 12 14:33:26 2013 From: mok at bioxray.dk (Morten Kjeldgaard) Date: Mon, 12 Aug 2013 20:33:26 +0200 Subject: [Biopython-dev] Moratorium on commits? In-Reply-To: References: Message-ID: <677A1A76-6B62-43E4-A54E-695A834D6088@bioxray.dk> On 11/08/2013, at 22:50, Peter Cock wrote: > I still feel this was a minor change (although of > course important to some, including you).
This is > parsing of malformed PDF files where the user > ALREADY gets a warning (or error in strict mode, > where there would be no functional change) that > there is a problem with the occupancy data. > > One reason why I specifically talked about small > commits (in the sense of a simple diff) above is > they are trivial to revert if the need arises, or as > in this case, modify: > https://github.com/biopython/biopython/commit/500c3c2ea900fd8c8f5123f571d4d9a244ee898e > > This change was suggested and supported by > people who've been actively contributing to the > Biopython structural module for some time, so I > had reason to trust their good judgement, and as > I wrote at the time there was a clear consensus > with three people in all happy with the idea: > http://lists.open-bio.org/pipermail/biopython-dev/2013-August/010773.html I respect that you listen more to developers that have been contributing for a long time. That is quite understandable, but I hope that does not prevent me from contributing my opinions. What prompted my response was the suggestion that the occupancy should be set to 1.0 if it is absent from the file, i.e. if the PDB file is malformed. I think that is incorrect behavior, and I say that not as a core developer, but as a crystallographer. If invalid data is present in the file, you do not want the toolkit transforming it to valid data. After thinking about it, the suggestion to set values to None when they are not defined in a malformed file now appears quite reasonable, but if it is done this way with occupancies, it should also be done this way with B-factors, chain identifiers and other values that are mandatory in the file according to the format specs. From the user's perspective, if the values returned are None, you are alerted to the fact that something is wrong, and you should make an appropriate choice, whatever that may be.
Cheers, Morten From arklenna at gmail.com Mon Aug 12 15:25:20 2013 From: arklenna at gmail.com (Lenna Peterson) Date: Mon, 12 Aug 2013 15:25:20 -0400 Subject: [Biopython-dev] Moratorium on commits? In-Reply-To: <677A1A76-6B62-43E4-A54E-695A834D6088@bioxray.dk> References: <677A1A76-6B62-43E4-A54E-695A834D6088@bioxray.dk> Message-ID: On Mon, Aug 12, 2013 at 2:33 PM, Morten Kjeldgaard wrote: > > On 11/08/2013, at 22:50, Peter Cock wrote: > > > I still feel this was a minor change (although of > > course important to some, including you). This is > > parsing of malformed PDF files where the user > > ALREADY gets a warning (or error in strict mode, > > where there would be no functional change) that > > there is a problem with the occupancy data. > > > > One reason why I specifically talked about small > > commits (in the sense of a simple diff) above is > > they are trivial to revert if the need arises, or as > > in this case, modify: > > > https://github.com/biopython/biopython/commit/500c3c2ea900fd8c8f5123f571d4d9a244ee898e > > > > This change was suggested and supported by > > people who've been actively contributing to the > > Biopython structural module for some time, so I > > had reason to trust their good judgement, and as > > I wrote at the time there was a clear consensus > > with three people in all happy with the idea: > > > http://lists.open-bio.org/pipermail/biopython-dev/2013-August/010773.html > > > I respect that you listen more to developers that have been contributing > for a long time. That is quite understandable, but I hope that does not > prevent me from contributing my opinions. > > What prompted my response was the suggestion that the occupancy should be > set to 1.0 if it is abscent from the file, i.e. if the PDB file is > malformed. I think that is an incorrect behavior, and I say that not as a > core developer, but as a crystallographer. If invalid data is present in > the file, you do not want the toolkit transforming it to valid data. 
>
I appreciate the physical/practical feedback about the commits.
> After thinking about it, the suggestion to set values to None when they are > not defined in a malformed file now appears quite reasonable, but if it is > done this way with occupancies, it should also done this way with > B-factors, chain identifiers and other values that are mandatory in the > file according to the format specs. From the users perspective, if the > values returned are None, you are alerted to the fact that something is > wrong, and you should make an appropriate choice, whatever that may be. > >
I agree that `None` is a good warning value for missing data. I just skimmed the code and summarized how some of the missing values are handled:
* Serial number: 0
* Chain: fatal in both strict and permissive modes (i.e. no try/except)
* Coordinates: fatal in both strict and permissive modes
* Occupancy: we recently decided to set as None in permissive
* B-factor: 0.0 in permissive (code comment states this is PDB default)
* Model seq id: 0
The StructureBuilder class also has certain ways of handling duplicate residues and atoms that I'm not particularly familiar with. For example, I'm not quite sure what will happen if successive atoms have missing serial numbers. PDB is a format where there's always a balance between absolute adherence to the format and enough flexibility to deal with the wide range of malformed files. Lenna From mok at bioxray.dk Mon Aug 12 15:42:28 2013 From: mok at bioxray.dk (Morten Kjeldgaard) Date: Mon, 12 Aug 2013 21:42:28 +0200 Subject: [Biopython-dev] Moratorium on commits?
In-Reply-To: References: <677A1A76-6B62-43E4-A54E-695A834D6088@bioxray.dk> Message-ID: <0F6D9BF5-BFAA-4118-8D90-936AC44A29FA@bioxray.dk> On 12/08/2013, at 21:25, Lenna Peterson wrote: > * B-factor: 0.0 in permissive (code comment states this is PDB default) The default referred to in that code comment is what the PDB annotators put in that field if the information is not provided by the depositor (which could be the case for e.g. an NMR model). From the PDB Atomic Coordinate Entry Format Description, Version 3.30:
* If the depositor provides the data, then the isotropic B value is given for the temperature factor.
* If there are neither isotropic B values from the depositor, nor anisotropic temperature factors in ANISOU, then the default value of 0.0 is used for the temperature factor.
In other words, the PDB format specification has no recommendations for what default values should be used if the field is blank in a malformed file, only what their staff should put in the entry when they receive it from the depositor. So IMO Biopython is free to use None if the B-value is missing in a malformed file. (I haven't checked the other items that Lenna mentions.) Cheers, Morten From anaryin at gmail.com Mon Aug 12 15:51:03 2013 From: anaryin at gmail.com (João Rodrigues) Date: Mon, 12 Aug 2013 12:51:03 -0700 Subject: [Biopython-dev] Bio.PDB - Missing values (was Moratorium on commits?) Message-ID: Hi all, Moving to a new thread because this is a very specific issue. I think that, from a programming point of view (but I'm a biologist so correct me if I'm wrong) having None values upon parsing is probably a better idea. Then, when writing, these should be translated to whatever default there is in the PDB documentation. Cheers, João From p.j.a.cock at googlemail.com Mon Aug 12 16:36:15 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Mon, 12 Aug 2013 21:36:15 +0100 Subject: [Biopython-dev] Bio.PDB - Missing values (was Moratorium on commits?) In-Reply-To: References: Message-ID: On Monday, August 12, 2013, João Rodrigues wrote: > Hi all, > > Moving to a new thread because this is a very specific issue. > > I think that, from a programming point of view (but I'm a biologist so > correct me if I'm wrong) having None values upon parsing is probably a > better idea. Then, when writing, these should be translated to whatever > default there is in the PDB documentation. > Or throw an error to force the user to fix it? Or write a blank occupancy to allow preservation of the (flawed) input? (Thank you for raising the output question now, it is a logical consequence of putting None in the parsed structure) Peter From anaryin at gmail.com Mon Aug 12 16:39:30 2013 From: anaryin at gmail.com (João Rodrigues) Date: Mon, 12 Aug 2013 13:39:30 -0700 Subject: [Biopython-dev] Bio.PDB - Missing values (was Moratorium on commits?) In-Reply-To: References: Message-ID: Throwing an error might not be a good idea because when dealing with models they sometimes have missing fields... then we'd have to fix them all somehow before parsing them. The None value seems a good indicator that something is amiss, while not putting any value there. There should also be a warning upon writing that the value is being replaced by a default value. Blank is also good actually, maybe we could add an option to the writer/parser to "preserve" values?
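The three output options raised in this exchange (substitute a default with a warning, raise an error, or write a blank field) can be sketched as a single formatting helper. This is only an illustration under stated assumptions: `format_occupancy` and its `missing` argument are hypothetical names, not part of Bio.PDB, and the default of 1.0 is an assumed placeholder. The 6-character, two-decimal layout follows the occupancy field of the PDB ATOM record.

```python
import warnings


def format_occupancy(value, missing="default", default=1.0):
    """Return the 6-character occupancy field for an ATOM record.

    `missing` selects what happens when value is None:
    "default" writes `default` and warns, "blank" writes spaces
    (preserving the flawed input), "error" raises ValueError.
    """
    if value is None:
        if missing == "blank":
            return " " * 6
        if missing == "error":
            raise ValueError("occupancy is None; fix it before writing")
        if missing != "default":
            raise ValueError("unknown policy: %r" % missing)
        warnings.warn("Missing occupancy written as %.2f" % default)
        value = default
    return "%6.2f" % value
```

Keeping the policy as an explicit argument leaves the choice with the caller, which matches the point made earlier in the thread that the user should decide what a missing value means.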
Cheers, Jo?o 2013/8/12 Peter Cock > > > On Monday, August 12, 2013, Jo?o Rodrigues wrote: > >> Hi all, >> >> Moving to a new thread because this is a very specific issue. >> >> I think that, from a programming point of view (but I'm a biologist so >> correct me if I'm wrong) having None values upon parsing is probably a >> better idea. Then, when writing, these should be translated to whatever >> default there is in the PDB documentation. >> > > Or throw an error to force the user to fix it? > > Or write a blank occupancy to allow preservation of the > (flawed) input? > > (Thank you for raising the output question now, it is a logically > consequence of putting None in the parsed structure) > > Peter > > From p.j.a.cock at googlemail.com Mon Aug 12 16:40:24 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Mon, 12 Aug 2013 21:40:24 +0100 Subject: [Biopython-dev] Moratorium on commits? In-Reply-To: <677A1A76-6B62-43E4-A54E-695A834D6088@bioxray.dk> References: <677A1A76-6B62-43E4-A54E-695A834D6088@bioxray.dk> Message-ID: On Monday, August 12, 2013, Morten Kjeldgaard wrote: > > On 11/08/2013, at 22:50, Peter Cock > > wrote: > > > I still feel this was a minor change (although of > > course important to some, including you). This is > > parsing of malformed PDF files where the user > > ALREADY gets a warning (or error in strict mode, > > where there would be no functional change) that > > there is a problem with the occupancy data. 
> > > > One reason why I specifically talked about small > > commits (in the sense of a simple diff) above is > > they are trivial to revert if the need arises, or as > > in this case, modify: > > > https://github.com/biopython/biopython/commit/500c3c2ea900fd8c8f5123f571d4d9a244ee898e > > > > This change was suggested and supported by > > people who've been actively contributing to the > > Biopython structural module for some time, so I > > had reason to trust their good judgement, and as > > I wrote at the time there was a clear consensus > > with three people in all happy with the idea: > > > http://lists.open-bio.org/pipermail/biopython-dev/2013-August/010773.html > > > I respect that you listen more to developers that have been contributing for a long time. That is quite understandable, but I hope that does not prevent me from contributing my opinions. Of course not - your input (which was after the initial change) has already resulted in a review of that change and the adoption of None instead. So thank you for speaking up, Peter From eric.talevich at gmail.com Mon Aug 12 18:35:05 2013 From: eric.talevich at gmail.com (Eric Talevich) Date: Mon, 12 Aug 2013 15:35:05 -0700 Subject: [Biopython-dev] Codon Alignment GSoC Update In-Reply-To: References: Message-ID: Hi Zheng, Nice work this week. For the next tasks: 1. It's probably not a high priority to implement all of the dN/dS approaches described in Yang's book (i.e. LWL85m, LPB93, Ina95), beyond the simple early methods (NG86, LWL85) and the finale, YN00. If you get around to doing them all, cool, but if you only have time to do one more I'd pick YN00. 2. SciPy is a relatively large dependency, so I recommend making it a runtime import -- do the import from within the function that needs it, rather than at the top-level scope of the module. E.g.: Bio.Phylo._utils.to_networkx 3. Where are you focusing your documentation efforts? 
If you're keeping most of the descriptions in the docstrings, it would be convenient to format the text as reStructuredText for processing with Epydoc and Sphinx. Time permitting, it would also be nice to have a chapter on this work in the Tutorial, see Doc/Tutorial.tex (also fine to write this up as a separate LaTeX document first and roll it in later). Cheers, Eric On Sun, Aug 11, 2013 at 3:04 PM, Zheng Ruan wrote: > Hi all, > > An update of Codon Alignment Project can be found at (http://zruanweb.com/). > In the next week, I will be implementing the Maximum Likelihood method for > dN/dS ratio estimation. I do not anticipate to write any code for the > optimization and Scipy's functionality is most suitable to be used here. > This might be a new dependency for Biopython. Is it okay to add this? Or > are there some other functions in Biopython for optimization problems? > Thanks! > > Best, > Zheng Ruan > From eric.talevich at gmail.com Mon Aug 12 19:03:07 2013 From: eric.talevich at gmail.com (Eric Talevich) Date: Mon, 12 Aug 2013 16:03:07 -0700 Subject: [Biopython-dev] GSOC weekly update 8 In-Reply-To: References: Message-ID: Hi Yanbo, Looks like excellent progress. At some point, would you mind documenting how the bit array operations are used to represent trees, e.g. how a bit array (BitString instance) should be interpreted in terms of taxa and tree topologies? 
Thanks, Eric On Mon, Aug 12, 2013 at 6:25 AM, Yanbo Ye wrote: > Hi all, > > My update about Biopython.Phylo project can be found here: > http://blog.yeyanbo.com/posts/google-summer-of-code-9.html > > Best, > Yanbo > > -- > > *Yanbo Ye* > *Guangzhou Institutes of Biomedicine and Health, * > *Chinese Academy of Sciences* > *190 Kaiyuan Avenue, Science Park, Guangzhou, China** > * > * > * > *Email: ye_yanbo at gibh.ac.cn* > *Web: http://www.yeyanbo.com* > *Phone: (86)-020-32093810* > From p.j.a.cock at googlemail.com Wed Aug 14 05:44:24 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 14 Aug 2013 10:44:24 +0100 Subject: [Biopython-dev] setuptools breaking biopython-1.62b installation In-Reply-To: <1374797351.81889.YahooMailNeo@web164002.mail.gq1.yahoo.com> References: <1374651068.98742.YahooMailNeo@web164005.mail.gq1.yahoo.com> <86a9lcl1nt.fsf@fastmail.fm> <1374797351.81889.YahooMailNeo@web164002.mail.gq1.yahoo.com> Message-ID: On Friday, July 26, 2013 Peter wrote: > On Wed, Jul 24, 2013 Peter Cock wrote: >> On Wed, Jul 24, 2013 Brad Chapman wrote: >>> >>> Peter and Michiel; >>> >>>>> Do we actually need setuptools? >>>>> Looking at setup.py, it seems that distutils is sufficient for our >>>>> needs. >>>>> If so, let's remove the dependency on setuptools. >>> >>> We used setuptools/distribute to install dependencies, although >>> practically this doesn't work well since pip doesn't finish NumPy >>> installation before installing Biopython. So I'm fine with taking it out >>> if you want to simplify the setup and avoid the extra dependency. >> >> Sounds like a plan - but we should all test this change, especially >> users of PIP, easy_install, virtual env etc. >> > > So who's going to do the commit - Brad or Michiel? > > Peter > On Fri, Jul 26, 2013 at 1:09 AM, Michiel de Hoon wrote: > Brad, can you do it? > Best, > -Michiel. 
I've done it: https://github.com/biopython/biopython/commit/f8e51906709d0c85be9f2b921eb3f68eed5524f9 This needs some more testing now - particularly with the non-standard install options like pip, easy_install, etc. Peter From p.j.a.cock at googlemail.com Thu Aug 15 07:28:47 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Thu, 15 Aug 2013 12:28:47 +0100 Subject: [Biopython-dev] Releasing Biopython 1.62 next week? Message-ID: Hello all, Are there any remaining issues people think need to be resolved prior to releasing Biopython 1.62? If not, unless anyone else volunteers, I will make time for this next week. Possible issues worth reviewing - please reply on the existing threads: Changes to setup.py to remove use of setuptools, this would benefit from wider testing: https://github.com/biopython/biopython/commit/f8e51906709d0c85be9f2b921eb3f68eed5524f9 http://lists.open-bio.org/pipermail/biopython-dev/2013-August/010806.html Changes to PDB occupancy, do we need to change PDB writing in light of this? http://lists.open-bio.org/pipermail/biopython-dev/2013-August/010802.html Update the Prank tool test to work with recent versions: http://lists.open-bio.org/pipermail/biopython-dev/2013-August/010757.html Note that PyPy now has a beta out supporting Python 3, it would be nice to fully test with that as well... http://morepypy.blogspot.co.uk/2013/07/pypy3-21-beta-1.html Thanks, Peter From arklenna at gmail.com Thu Aug 15 09:18:35 2013 From: arklenna at gmail.com (Lenna Peterson) Date: Thu, 15 Aug 2013 09:18:35 -0400 Subject: [Biopython-dev] Bio.PDB - Missing values (was Moratorium on commits?) In-Reply-To: References: Message-ID: On Monday, 12 August 2013, João Rodrigues wrote: > Throwing an error might not be a good idea because when dealing with models > they sometimes have missing fields... then we'd have to fix them all > somehow before parsing them. > > The None value seems a good indicator that something is amiss, while not > putting any value there.
There should also be a warning upon writing that > the value is being replaced by a default value. Blank is also good > actually, maybe we could add an option to the writer/parser to "preserve" > values? > > I don't think writing string "None" into a fixed width field would be a good idea. So it's probably best to change occupancy (and any other missing values set to None) to blank, correct width fields for writing. I've never tangled with the writer and I have incoming PhD students this week but I can attempt to add this functionality early next week. Lenna From p.j.a.cock at googlemail.com Thu Aug 15 09:23:50 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Thu, 15 Aug 2013 14:23:50 +0100 Subject: [Biopython-dev] Bio.PDB - Missing values (was Moratorium on commits?) In-Reply-To: References: Message-ID: On Thu, Aug 15, 2013 at 2:18 PM, Lenna Peterson wrote: > On Monday, 12 August 2013, João Rodrigues wrote: >> >> Throwing an error might not be a good idea because when dealing with >> models >> they sometimes have missing fields... then we'd have to fix them all >> somehow before parsing them. >> >> The None value seems a good indicator that something is amiss, while not >> putting any value there. There should also be a warning upon writing that >> the value is being replaced by a default value. Blank is also good >> actually, maybe we could add an option to the writer/parser to "preserve" >> values? >> > > I don't think writing string "None" into a fixed width field would be a good > idea. So it's probably best to change occupancy (and any other missing > values set to None) to blank, correct width fields for writing. I didn't mean to suggest writing the string "None" in the field, and I'm not sure if João did - it would certainly be an invalid PDB file. I agree that where the data structure has None (e.g. from our parser) then the writer could use a blank string (of the appropriate width).
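A minimal sketch of the blank-field idea, not the current Bio.PDB writer code. It assumes the occupancy field is six characters wide and normally formatted as %6.2f; the names format_occupancy and OCCUPANCY_WIDTH are hypothetical, and a plain UserWarning stands in for whatever Biopython warning class would really be used:

```python
import warnings

# Assumption: occupancy is a 6-character field, normally written as %6.2f.
OCCUPANCY_WIDTH = 6


def format_occupancy(occupancy):
    """Return the fixed-width occupancy text for one ATOM line (illustrative)."""
    if occupancy is None:
        # Missing value: warn, then emit spaces so the columns stay aligned
        # and the string "None" never reaches the file.
        warnings.warn("Missing occupancy written as a blank field")
        return " " * OCCUPANCY_WIDTH
    return "%6.2f" % occupancy
```

With this helper, format_occupancy(1.0) gives "  1.00", and format_occupancy(None) gives six spaces plus a warning.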
For mandatory fields like occupancy, this should give a warning. > I've never tangled with the writer and I have incoming PhD students this > week but I can attempt to add this functionality early next week. That would be great (assuming no-one else wants to tackle it sooner). Thanks, Peter From arklenna at gmail.com Thu Aug 15 10:54:53 2013 From: arklenna at gmail.com (Lenna Peterson) Date: Thu, 15 Aug 2013 10:54:53 -0400 Subject: [Biopython-dev] Bio.PDB - Missing values (was Moratorium on commits?) In-Reply-To: References: Message-ID: > > I don't think writing string "None" into a fixed width field would be a > good > > idea. So it's probably best to change occupancy (and any other missing > > values set to None) to blank, correct width fields for writing. > > I didn't mean to suggest writing the string "None" in the field, and > I'm not sure if João did - it would certainly be an invalid PDB file. > > I didn't mean anyone was suggesting we intentionally do this, but I bet that's what the writer is doing now! From eric.talevich at gmail.com Thu Aug 15 13:35:00 2013 From: eric.talevich at gmail.com (Eric Talevich) Date: Thu, 15 Aug 2013 10:35:00 -0700 Subject: [Biopython-dev] 1.62b test coverage report In-Reply-To: References: Message-ID: On Fri, Aug 2, 2013 at 2:31 AM, Peter Cock wrote: > Thanks for these details Ben - it sounds like a mixture of real > test failures, and mere warnings that an external tool wasn't > found. > > On Fri, Aug 2, 2013 at 3:20 AM, Ben Fulton wrote: > > My test machine was running Ubuntu 12.04. > > > > For fasttree I installed version 2.1.4-1~ubuntu12.04.1 using apt-get, and > > got this error: > > ApplicationError: Command 'fasttree -out temp_test.tree > > Quality/example.fasta' returned non-zero exit status 1, 'Unknown or > > incorrect use of option -out' > > I don't seem to have fasttree installed at all, and from the > test and wrapper it is not explicit about which version it > was originally written for.
> I pushed a patch to not use the potentially problematic '-out' flag: https://github.com/biopython/biopython/commit/771c1ed23bbb39dcf37805b4cb7bb23ffcb0c46a According to FastTree's changelog ( http://www.microbesonline.org/fasttree/ChangeLog), the -out option was added in version 2.1.5, released August 30, 2012. So the 'fasttree' package on the stable Ubuntu (12.04) does not have the -out flag, but the package in subsequent Ubuntus and other Debian derivatives does. -Eric From eric.talevich at gmail.com Thu Aug 15 19:44:38 2013 From: eric.talevich at gmail.com (Eric Talevich) Date: Thu, 15 Aug 2013 16:44:38 -0700 Subject: [Biopython-dev] 1.62b test coverage report In-Reply-To: References: Message-ID: On Fri, Aug 2, 2013 at 2:31 AM, Peter Cock wrote: > Thanks for these details Ben - it sounds like a mixture of real > test failures, and mere warnings that an external tool wasn't > found. > > On Fri, Aug 2, 2013 at 3:20 AM, Ben Fulton wrote: > > My test machine was running Ubuntu 12.04. > [...] > > I downloaded version 130708 of Prank from > > http://code.google.com/p/prank-msa/downloads/list. The error is on line > 165 > > of the test file: > > > > AssertionError: > > ----------------- > > PRANK v.130708: > > ----------------- > > > > Input for the analysis > > - converting 'Quality/example.fasta' to 'temp with space.phy' > > This sounds like a minor change in the stdout with recent > versions of PRANK. > > It's more exciting than that: Old versions of Prank created .xml and .dnd files by default, and had "-noxml" and "-notree" options to avoid creating them (or clean them up, whichever). New Pranks do not create these files by default, but do have "-showxml" and "-showtree" flags if you want them. I removed the use of these flags in the unit test. One of the tests used the set_parameter method, so I substituted the "-dots" flag for "-notree". 
It passes on my machine now: https://github.com/biopython/biopython/commit/30d7bcfb6eab8283a53372b2ad64b59be7461eb3 The doctests in Bio/Align/Applications/_Prank.py should probably change, too, since the same flags are used there. (I have not done this.) -Eric From w.arindrarto at gmail.com Fri Aug 16 03:14:24 2013 From: w.arindrarto at gmail.com (Wibowo Arindrarto) Date: Fri, 16 Aug 2013 09:14:24 +0200 Subject: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week? In-Reply-To: <1373712847.72527.YahooMailNeo@web164004.mail.gq1.yahoo.com> References: <1373620014.53407.YahooMailNeo@web164001.mail.gq1.yahoo.com> <1373680350.55044.YahooMailNeo@web164003.mail.gq1.yahoo.com> <1373712847.72527.YahooMailNeo@web164004.mail.gq1.yahoo.com> Message-ID: Hi Michiel, Peter, In preparation for the 1.62 release, I've made the following changes to Bio.NCBIStandalone and Bio.ParserSupport: * Migrated the two modules under Bio.SearchIO._legacy * Upgraded their PendingDeprecationWarning to BiopythonDeprecationWarning I've pushed the changes to this branch: https://github.com/bow/biopython/tree/bio_blast_migrate Tests seem to be running fine still, but now there is the awkward situation where if users import Bio.NCBIStandalone and/or Bio.ParserSupport directly they will be greeted with two warnings: the BiopythonWarning for the modules' deprecation and the BiopythonExperimentalWarning for SearchIO. We could suppress the SearchIO warning in Bio.NCBIStandalone and Bio.ParserSupport. But before this is done, I was wondering if we have a defined timeline for removing a BiopythonExperimentalWarning? (i.e. if it will be removed in this release, then we could do that instead). Any opinions on this :)? Cheers, Bow On Sat, Jul 13, 2013 at 12:54 PM, Michiel de Hoon wrote: > Hi Bow, > > >> Would it be ok if we move parts that are used by SearchIO into their own >> private classes in >> Bio.SearchIO, while putting the BiopythonDeprecationWarning on the current >> files? 
> > That sounds fine to me. Any other opinions, anybody? > > Best, > -Michiel. > > ________________________________ > From: Wibowo Arindrarto > To: Michiel de Hoon > Cc: Peter Cock ; Eric Talevich > ; Zheng Ruan ; Biopython-Dev > Mailing List > Sent: Saturday, July 13, 2013 3:58 PM > Subject: Re: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week? > > Hi Michiel, > > There are two classes from Bio.Blast.NCBIStandalone still being used > by Bio.SearchIO internally (for the BLAST text parser): the > BlastParser and the Iterator classes. The BlastParser class itself > still relies on Bio.ParserSupport. Would it be ok if we move parts > that are used by SearchIO into their own private classes in > Bio.SearchIO, while putting the BiopythonDeprecationWarning on the > current files? > > Best regards, > Bow > > On Sat, Jul 13, 2013 at 3:52 AM, Michiel de Hoon > wrote: >> The following pieces of code had a PendingDeprecationWarning in Biopython >> release 1.61, and can be upgraded to a BiopythonDeprecationWarning: >> >> Bio.Blast.NCBIStandalone (entire module). This module has had a >> PendingDeprecationWarning since September 2010. >> >> Bio.Motif (entire module). Its functionality is available from Bio.motifs, >> so Bio.Motif can be deprecated. >> >> Bio.ParserSupport (entire module). This module is currently only being >> used by Bio.Blast.NCBIStandalone, and has had a PendingDeprecationWarning >> since September 2011. >> >> Any final objections? 
>> >> Best, >> -Michiel >> _______________________________________________ >> Biopython-dev mailing list >> Biopython-dev at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/biopython-dev > > From p.j.a.cock at googlemail.com Fri Aug 16 05:31:13 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Fri, 16 Aug 2013 10:31:13 +0100 Subject: [Biopython-dev] 1.62b test coverage report In-Reply-To: References: Message-ID: On Fri, Aug 16, 2013 at 12:44 AM, Eric Talevich wrote: > On Fri, Aug 2, 2013 at 2:31 AM, Peter Cock wrote: >> On Fri, Aug 2, 2013 at 3:20 AM, Ben Fulton wrote: >> > I downloaded version 130708 of Prank from >> > http://code.google.com/p/prank-msa/downloads/list. >> > The error is on line 165 of the test file: >> > >> > AssertionError: >> > ----------------- >> > PRANK v.130708: >> > ----------------- >> > >> > Input for the analysis >> > - converting 'Quality/example.fasta' to 'temp with space.phy' >> >> This sounds like a minor change in the stdout with recent >> versions of PRANK. >> > > It's more exciting than that: Old versions of Prank created .xml and .dnd > files by default, and had "-noxml" and "-notree" options to avoid creating > them (or clean them up, whichever). New Pranks do not create these files by > default, but do have "-showxml" and "-showtree" flags if you want them. Well that API break is a bit annoying, but your test changes make sense. Do we need to add these new switches to the wrapper itself? 
Peter From eric.talevich at gmail.com Sun Aug 18 14:14:13 2013 From: eric.talevich at gmail.com (Eric Talevich) Date: Sun, 18 Aug 2013 11:14:13 -0700 Subject: [Biopython-dev] 1.62b test coverage report In-Reply-To: References: Message-ID: On Fri, Aug 16, 2013 at 2:31 AM, Peter Cock wrote: > On Fri, Aug 16, 2013 at 12:44 AM, Eric Talevich wrote: > > On Fri, Aug 2, 2013 at 2:31 AM, Peter Cock wrote: > >> On Fri, Aug 2, 2013 at 3:20 AM, Ben Fulton wrote: > >> > I downloaded version 130708 of Prank from > >> > http://code.google.com/p/prank-msa/downloads/list. > >> > The error is on line 165 of the test file: > >> > > >> > AssertionError: > >> > ----------------- > >> > PRANK v.130708: > >> > ----------------- > >> > > >> > Input for the analysis > >> > - converting 'Quality/example.fasta' to 'temp with space.phy' > >> > >> This sounds like a minor change in the stdout with recent > >> versions of PRANK. > >> > > > > It's more exciting than that: Old versions of Prank created .xml and .dnd > > files by default, and had "-noxml" and "-notree" options to avoid > creating > > them (or clean them up, whichever). New Pranks do not create these files > by > > default, but do have "-showxml" and "-showtree" flags if you want them. > > Well that API break is a bit annoying, but your test changes make sense. > > Do we need to add these new switches to the wrapper itself? > Here's the commit to add those switches to the wrapper: https://github.com/biopython/biopython/commit/cc234b75e6e82cf9f51e3384a4fbfa1e888a3af1 I suppose it would be helpful if the wrapper detected the version of Prank and handled the show(tree|xml) flags appropriately to avoid errors. But that would require running the executable first, I think, which is not something our wrappers normally do. (And then it would make sense to cache the result for the duration of the running process.) 
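A minimal sketch of the detect-once-and-cache idea described above, not how the Biopython wrappers behave. It assumes the executable prints a banner like the "PRANK v.130708:" line quoted earlier in this thread when run without arguments; parse_prank_version, cached_prank_version and the runner parameter are hypothetical names for illustration:

```python
import re
import subprocess
from functools import lru_cache

# Banner format taken from the failure output quoted in this thread:
#   "PRANK v.130708:"
_BANNER = re.compile(r"PRANK v\.(\d+)")


def parse_prank_version(text):
    """Return the version number from a PRANK banner as an int, or None."""
    match = _BANNER.search(text)
    return int(match.group(1)) if match else None


def _run_and_capture(executable):
    # Assumption: running the tool with no arguments prints the banner.
    result = subprocess.run([executable], stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT,
                            universal_newlines=True)
    return result.stdout


@lru_cache(maxsize=None)
def cached_prank_version(executable="prank", runner=_run_and_capture):
    """Run the executable at most once per (executable, runner) pair."""
    return parse_prank_version(runner(executable))
```

Passing a fake runner makes it possible to check the caching without PRANK installed; a second call with the same arguments returns the stored value and does not launch the tool again.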
-Eric From p.j.a.cock at googlemail.com Sun Aug 18 14:39:08 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Sun, 18 Aug 2013 19:39:08 +0100 Subject: [Biopython-dev] 1.62b test coverage report In-Reply-To: References: Message-ID: On Sun, Aug 18, 2013 at 7:14 PM, Eric Talevich wrote: > On Fri, Aug 16, 2013 at 2:31 AM, Peter Cock wrote: >> >> Well that API break is a bit annoying, but your test changes make sense. >> >> Do we need to add these new switches to the wrapper itself? > > > Here's the commit to add those switches to the wrapper: > https://github.com/biopython/biopython/commit/cc234b75e6e82cf9f51e3384a4fbfa1e888a3af1 > > I suppose it would be helpful if the wrapper detected the version of Prank > and handled the show(tree|xml) flags appropriately to avoid errors. But that > would require running the executable first, I think, which is not something > our wrappers normally do. (And then it would make sense to cache the result > for the duration of the running process.) > > -Eric Historically we've just documented this kind of issue in the parameter docstring - the idea of auto-running the tool in the background to check the version just sounds like Trouble. 
Peter From yeyanbo289 at gmail.com Mon Aug 19 03:36:00 2013 From: yeyanbo289 at gmail.com (Yanbo Ye) Date: Mon, 19 Aug 2013 15:36:00 +0800 Subject: [Biopython-dev] GSOC weekly update 10 Message-ID: Hi all, Biopython.Phylo project update of last week is here: http://blog.yeyanbo.com/posts/google-summer-of-code-10.html Thanks, Yanbo -- Yanbo Ye Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences 190 Kaiyuan Avenue, Science Park, Guangzhou, China Email: ye_yanbo at gibh.ac.cn Web: http://www.yeyanbo.com Phone: (86)-020-32093810 From zruan1991 at gmail.com Mon Aug 19 11:06:05 2013 From: zruan1991 at gmail.com (Zheng Ruan) Date: Mon, 19 Aug 2013 11:06:05 -0400 Subject: [Biopython-dev] Codon Alignment GSoC Weekly Update Message-ID: Hi all, An update of CodonAlignment GSoC can be found at (http://zruanweb.com/). Thanks for your comments and suggestions. Best, Zheng Ruan From michael.maher at ucsf.edu Mon Aug 19 15:24:04 2013 From: michael.maher at ucsf.edu (Cyrus Maher) Date: Mon, 19 Aug 2013 12:24:04 -0700 Subject: [Biopython-dev] Fwd: New Biopython (sub)module? In-Reply-To: References: Message-ID: Hi everybody!!- My name is (Michael) Cyrus Maher, and I'm a PhD student at UCSF in the lab of Dr. Ryan D. Hernandez (http://bts.ucsf.edu/hernandez_lab/)... I am writing because I'm interested in submitting a new Biopython module. Since this is likely a one-time event, the wiki recommends proceeding through a developer. After speaking with Peter Cock, he recommended that I open things up for discussion on the mailing list. Attached is a draft that describes a new method, termed MOSAIC, which integrates multiple sequence alignments from an arbitrary number of sources. We show that it greatly increases the number of orthologs that we are able to detect while maintaining or improving functional-, phylogenetic-, and sequence identity-based measures of ortholog quality.
Code and documentation may be found here: https://dl.dropboxusercontent.com/u/43327584/html/index.html Looking forward to hearing what you think! Best, -Cyrus -------------- next part -------------- A non-text attachment was scrubbed... Name: OD_fullpaper_8_5_13.docx Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document Size: 1666812 bytes Desc: not available URL: From davidjosephcain at gmail.com Mon Aug 19 17:18:48 2013 From: davidjosephcain at gmail.com (David Cain) Date: Mon, 19 Aug 2013 17:18:48 -0400 Subject: [Biopython-dev] Fwd: New Biopython (sub)module? In-Reply-To: References: Message-ID: Hi, Cyrus! Before the constructive criticism, I just wanted to say your module looks excellent and thank you for opening it up as free software! I'm by no means a developer (just interested in Biopython's development), but I noticed your code generally doesn't adhere to PEP8. If you're interested in getting feedback from others, it's quite valuable to format your code by the standards. (Proper PEP 8 code has a look and feel that's easier for the trained eye to view). Key things that detract from your module's readability: - CamelCase method, module, and field names (when a Python developer sees these, they're prone to assuming the name is for a class). Of course, Biopython doesn't provide the best example here, but there are reasons for that (it'll be fixed eventually). All-caps names are either refrained from use, or used for constants (i.e. you may wish to rename your module `mosaic`). - Very long line wrapping - you should really try to keep your lines to 79 characters - Using integers as booleans (you should stick to True/False, e.g. `while True` in lieu of `while 1`) - module renamings: it's much easier to see `random.shuffle` over `r.shuffle`, as one can assume `random` is the standard module, whereas `r` might be completely different. 
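To illustrate the points above, a short sketch in PEP 8 style. The function and constant names are made up for the example and are not taken from MOSAIC:

```python
import random  # full module name, rather than "import random as r"

MAX_ATTEMPTS = 100  # all-caps is kept for constants


def shuffle_until_changed(items, rng=random):
    """Shuffle a copy of items until its order differs from the input.

    Gives up after MAX_ATTEMPTS tries (e.g. when all items are equal).
    """
    original = list(items)
    attempts = 0
    while True:  # a real boolean, rather than "while 1"
        candidate = list(original)
        rng.shuffle(candidate)
        attempts += 1
        if candidate != original or attempts >= MAX_ATTEMPTS:
            return candidate
```

Lowercase names with underscores mark this as a function, each line stays under 79 characters, and a reader can see at a glance that random is the standard library module.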
Also, your module should definitely remove usage of pdb if you wish to publish it as part of an official Python package. Would you be open to hosting a development branch of your code on GitHub or a similar community-editable resource? Any acceptance to the official Biopython distribution would of course be up to the main devs, but I'd be more than happy to test your code and make suggestions, regardless of its integration to a third-party package. David From christian at brueffer.de Tue Aug 20 07:36:09 2013 From: christian at brueffer.de (Christian Brueffer) Date: Tue, 20 Aug 2013 13:36:09 +0200 Subject: [Biopython-dev] Fwd: New Biopython (sub)module? In-Reply-To: References: Message-ID: <521354A9.6020701@brueffer.de> On 8/19/13 21:24 , Cyrus Maher wrote: > Hi everybody!!- > > My name is (Michael) Cyrus Maher, and I'm a PhD student at UCSF in the lab > of Dr. Ryan D. Hernandez (http://bts.ucsf.edu/hernandez_lab/)... > > I am writing because I'm interested in submitting a new Biopython module. > Since this is likely a one-time event, the wiki recommends proceeding > through a developer. After speaking with Peter Cock, he recommended that I > open things up for discussion on the mailing list. > > Attached is a draft that describes a new method, termed MOSAIC, which > integrates multiple sequence alignments from an arbitrary number number of > sources. We show that it greatly increases the number of orthologs that we > are able to detect while maintaining or improving functional-, > phylogenetic-, and sequence identity-based measures of ortholog quality. > > Code and documentation may be found here: > > https://dl.dropboxusercontent.com/u/43327584/html/index.html > > Looking forward to hearing what you think! > Hi Cyrus, I agree with David on the PEP8 issue. A very nice tool to use is the pep8 checker, https://pypi.python.org/pypi/pep8 I see that you use MSAProbs. I have an MSAProbs application wrapper in the works. 
I haven't submitted it yet due to incomplete unit tests, but maybe it's useful to you: https://github.com/cbrueffer/biopython/tree/msaprobs Cheers, Chris From michael.maher at ucsf.edu Tue Aug 20 14:24:43 2013 From: michael.maher at ucsf.edu (Cyrus Maher) Date: Tue, 20 Aug 2013 11:24:43 -0700 Subject: [Biopython-dev] Fwd: New Biopython (sub)module? In-Reply-To: <521354A9.6020701@brueffer.de> References: <521354A9.6020701@brueffer.de> Message-ID: Thanks for your feedback, guys!! I did a bit of general clean-up and I've made all the recommended PEP8 changes, with the exception that I kept capital letters if they were part of an acronym. I've also switched the link in the documentation over to github and configured mosaic to use the MSAProbs application wrapper if it's installed. Let me know what you think!! Docs: https://dl.dropboxusercontent.com/u/43327584/html/index.html Code: https://github.com/cyrusmaher/mosaic Cheers, -Cyrus On Tue, Aug 20, 2013 at 4:36 AM, Christian Brueffer wrote: > On 8/19/13 21:24 , Cyrus Maher wrote: > > Hi everybody!!- > > > > My name is (Michael) Cyrus Maher, and I'm a PhD student at UCSF in the > lab > > of Dr. Ryan D. Hernandez (http://bts.ucsf.edu/hernandez_lab/)... > > > > I am writing because I'm interested in submitting a new Biopython module. > > Since this is likely a one-time event, the wiki recommends proceeding > > through a developer. After speaking with Peter Cock, he recommended that > I > > open things up for discussion on the mailing list. > > > > Attached is a draft that describes a new method, termed MOSAIC, which > > integrates multiple sequence alignments from an arbitrary number number > of > > sources. We show that it greatly increases the number of orthologs that > we > > are able to detect while maintaining or improving functional-, > > phylogenetic-, and sequence identity-based measures of ortholog quality. 
> > > > Code and documentation may be found here: > > > > https://dl.dropboxusercontent.com/u/43327584/html/index.html > > > > Looking forward to hearing what you think! > > > > Hi Cyrus, > > I agree with David on the PEP8 issue. A very nice tool to use is the > pep8 checker, https://pypi.python.org/pypi/pep8 > > I see that you use MSAProbs. I have an MSAProbs application wrapper in > the works. I haven't submitted it yet due to incomplete unit tests, > but maybe it's useful to you: > > https://github.com/cbrueffer/biopython/tree/msaprobs > > Cheers, > > Chris > > > _______________________________________________ > Biopython-dev mailing list > Biopython-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython-dev > From mok at bioxray.dk Tue Aug 20 14:35:14 2013 From: mok at bioxray.dk (Morten Kjeldgaard) Date: Tue, 20 Aug 2013 20:35:14 +0200 Subject: [Biopython-dev] Bio.PDB - Missing values (was Moratorium on commits?) In-Reply-To: References: Message-ID: <43FD0A6C-ED54-4861-AADA-9F3E8FB6172A@bioxray.dk> On 15/08/2013, at 16:54, Lenna Peterson wrote: >>> I don't think writing string "None" into a fixed width field would be a >> good >>> idea. So it's probably best to change occupancy (and any other missing >>> values set to None) to blank, correct width fields for writing. >> >> I didn't mean to suggest writing the string "None" in the field, and >> I'm not sure if Jo?o did - it would certainly be an invalid PDB file. >> >> > I didn't mean anyone was suggesting we intentionally do this, but I bet > that's what the writer is doing now! I think the output should be identical to the input if a PDB file is read and then written again (apart from the fact that Bio.PDB currently doesn't save all headers.) Cheers, Morten From davidjosephcain at gmail.com Tue Aug 20 17:25:07 2013 From: davidjosephcain at gmail.com (David Cain) Date: Tue, 20 Aug 2013 17:25:07 -0400 Subject: [Biopython-dev] Fwd: New Biopython (sub)module? 
In-Reply-To: References: <521354A9.6020701@brueffer.de> Message-ID: Hi, Cyrus - I took a quick look at your code on GitHub. Did you publish a different version of MOSAIC? By my linter, there are 309 PEP8 errors on mosaic.py. Also, as a general comment, your code seems to rely on sys.exit extensively. Python's exception framework is pretty handy - maybe your module could raise its own custom exceptions (Biopython's PDB parser is a good example of this design strategy). David Cain +1 (339) 222 4452 On Tue, Aug 20, 2013 at 2:24 PM, Cyrus Maher wrote: > Thanks for your feedback, guys!! I did a bit of general clean-up and I've > made all the recommended PEP8 changes, with the exception that I kept > capital letters if they were part of an acronym. I've also switched the > link in the documentation over to github and configured mosaic to use the > MSAProbs application wrapper if it's installed. Let me know what you > think!! > > Docs: https://dl.dropboxusercontent.com/u/43327584/html/index.html > Code: https://github.com/cyrusmaher/mosaic > > Cheers, > > -Cyrus > > > On Tue, Aug 20, 2013 at 4:36 AM, Christian Brueffer > wrote: > > > On 8/19/13 21:24 , Cyrus Maher wrote: > > > Hi everybody!!- > > > > > > My name is (Michael) Cyrus Maher, and I'm a PhD student at UCSF in the > > lab > > > of Dr. Ryan D. Hernandez (http://bts.ucsf.edu/hernandez_lab/)... > > > > > > I am writing because I'm interested in submitting a new Biopython > module. > > > Since this is likely a one-time event, the wiki recommends proceeding > > > through a developer. After speaking with Peter Cock, he recommended > that > > I > > > open things up for discussion on the mailing list. > > > > > > Attached is a draft that describes a new method, termed MOSAIC, which > > > integrates multiple sequence alignments from an arbitrary number number > > of > > > sources. 
We show that it greatly increases the number of orthologs that > > we > > > are able to detect while maintaining or improving functional-, > > > phylogenetic-, and sequence identity-based measures of ortholog > quality. > > > > > > Code and documentation may be found here: > > > > > > https://dl.dropboxusercontent.com/u/43327584/html/index.html > > > > > > Looking forward to hearing what you think! > > > > > > > Hi Cyrus, > > > > I agree with David on the PEP8 issue. A very nice tool to use is the > > pep8 checker, https://pypi.python.org/pypi/pep8 > > > > I see that you use MSAProbs. I have an MSAProbs application wrapper in > > the works. I haven't submitted it yet due to incomplete unit tests, > > but maybe it's useful to you: > > > > https://github.com/cbrueffer/biopython/tree/msaprobs > > > > Cheers, > > > > Chris > > > > > > _______________________________________________ > > Biopython-dev mailing list > > Biopython-dev at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/biopython-dev > > > _______________________________________________ > Biopython-dev mailing list > Biopython-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython-dev > From arklenna at gmail.com Tue Aug 20 17:31:40 2013 From: arklenna at gmail.com (Lenna Peterson) Date: Tue, 20 Aug 2013 17:31:40 -0400 Subject: [Biopython-dev] Fwd: New Biopython (sub)module? In-Reply-To: <521354A9.6020701@brueffer.de> References: <521354A9.6020701@brueffer.de> Message-ID: Also worth noting is autopep8: https://pypi.python.org/pypi/autopep8 (it can be a bit aggressive but that's what version control is for, right?) Cheers, Lenna On Tue, Aug 20, 2013 at 7:36 AM, Christian Brueffer wrote: > On 8/19/13 21:24 , Cyrus Maher wrote: > > Hi everybody!!- > > > > My name is (Michael) Cyrus Maher, and I'm a PhD student at UCSF in the > lab > > of Dr. Ryan D. Hernandez (http://bts.ucsf.edu/hernandez_lab/)... 
> > > > I am writing because I'm interested in submitting a new Biopython module. > > Since this is likely a one-time event, the wiki recommends proceeding > > through a developer. After speaking with Peter Cock, he recommended that > I > > open things up for discussion on the mailing list. > > > > Attached is a draft that describes a new method, termed MOSAIC, which > > integrates multiple sequence alignments from an arbitrary number number > of > > sources. We show that it greatly increases the number of orthologs that > we > > are able to detect while maintaining or improving functional-, > > phylogenetic-, and sequence identity-based measures of ortholog quality. > > > > Code and documentation may be found here: > > > > https://dl.dropboxusercontent.com/u/43327584/html/index.html > > > > Looking forward to hearing what you think! > > > > Hi Cyrus, > > I agree with David on the PEP8 issue. A very nice tool to use is the > pep8 checker, https://pypi.python.org/pypi/pep8 > > I see that you use MSAProbs. I have an MSAProbs application wrapper in > the works. I haven't submitted it yet due to incomplete unit tests, > but maybe it's useful to you: > > https://github.com/cbrueffer/biopython/tree/msaprobs > > Cheers, > > Chris > > > _______________________________________________ > Biopython-dev mailing list > Biopython-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython-dev > From arklenna at gmail.com Tue Aug 20 18:16:18 2013 From: arklenna at gmail.com (Lenna Peterson) Date: Tue, 20 Aug 2013 18:16:18 -0400 Subject: [Biopython-dev] Bio.PDB - Missing values (was Moratorium on commits?) In-Reply-To: References: Message-ID: On Thu, Aug 15, 2013 at 9:23 AM, Peter Cock wrote: > > > I didn't mean to suggest writing the string "None" in the field, and > I'm not sure if Jo?o did - it would certainly be an invalid PDB file. > > I agree that where the data structure has None (e.g. 
from our parser)
> then the writer could use a blank string (of the appropriate width).
> For mandatory fields like occupancy, this should give a warning.
>

As I suspected, the writer currently fails on None (it's expecting a float).
Test-driven development!

However, I don't see a simple or elegant way to force writing of a blank
occupancy. ATOM lines are currently written using C-style string formatting,
and the occupancy field is `%6.2f`.

Off the top of my head, I'd:

1. Store the original format string
2. Modify the format string to have "%6s" at the appropriate position
3. Modify the occupancy to be an empty string or a space
4. Set the return value to the formatted string
5. Restore the original format string
6. Return the return value

However, this seems...ugly at best. I don't know that switching formatting
styles (e.g. to string.format() or others) will help. And in most
circumstances, the type checking of the format string is useful.

Any thoughts?

Cheers,

Lenna

From anaryin at gmail.com Tue Aug 20 18:25:57 2013
From: anaryin at gmail.com (João Rodrigues)
Date: Tue, 20 Aug 2013 15:25:57 -0700
Subject: [Biopython-dev] Bio.PDB - Missing values (was Moratorium on commits?)
In-Reply-To:
References:
Message-ID:

Hi,

We should probably change it to str.format() regardless of advantages. If
we indeed have None in the parser then writing becomes a bit more
complicated. But I guess it's more correct? I'd vote for having a small
check/conversion in the writer, in addition to the formatting of the string.
As a biologist, I don't care if it is None or an empty string, or whatever,
but for scripting maybe it makes more sense to be None? That's what I mean
by more correct.

Cheers,

João

From michael.maher at ucsf.edu Wed Aug 21 18:00:04 2013
From: michael.maher at ucsf.edu (Cyrus Maher)
Date: Wed, 21 Aug 2013 15:00:04 -0700
Subject: [Biopython-dev] Fwd: New Biopython (sub)module?
In-Reply-To:
References: <521354A9.6020701@brueffer.de>
Message-ID:

Thanks for sending that along Lenna! And thanks everybody for being patient
with me! This is my first experience sharing software, so it's great to
learn from you guys...

As far as updates:

-I've fixed all pep8 errors, with the exception of some finicky
continuation indent complaints.
-I've also uploaded example files so that the file "mosaic_example.py" can
be run without modification. From the mosaic directory, just type:
python mosaic_example.py testfiles.txt
-The documentation has been updated as well.

I would of course be open to any additional feedback you guys could offer
for improving the code.

That said, I was also hoping to get your thoughts on whether this seemed
like the type of project that would fit in with Biopython. Peter said that
Eric might have some good comments on this matter?

Cheers,

-Cyrus

On Tue, Aug 20, 2013 at 2:31 PM, Lenna Peterson wrote:
> Also worth noting is autopep8: https://pypi.python.org/pypi/autopep8
> (it can be a bit aggressive but that's what version control is for, right?)
>
> Cheers,
>
> Lenna
>
>
> On Tue, Aug 20, 2013 at 7:36 AM, Christian Brueffer wrote:
>
> > On 8/19/13 21:24 , Cyrus Maher wrote:
> > > Hi everybody!!-
> > >
> > > My name is (Michael) Cyrus Maher, and I'm a PhD student at UCSF in the lab
> > > of Dr. Ryan D. Hernandez (http://bts.ucsf.edu/hernandez_lab/)...
> > >
> > > I am writing because I'm interested in submitting a new Biopython module.
> > > Since this is likely a one-time event, the wiki recommends proceeding
> > > through a developer. After speaking with Peter Cock, he recommended that I
> > > open things up for discussion on the mailing list.
> > >
> > > Attached is a draft that describes a new method, termed MOSAIC, which
> > > integrates multiple sequence alignments from an arbitrary number of
> > > sources.
We show that it greatly increases the number of orthologs that > > we > > > are able to detect while maintaining or improving functional-, > > > phylogenetic-, and sequence identity-based measures of ortholog > quality. > > > > > > Code and documentation may be found here: > > > > > > https://dl.dropboxusercontent.com/u/43327584/html/index.html > > > > > > Looking forward to hearing what you think! > > > > > > > Hi Cyrus, > > > > I agree with David on the PEP8 issue. A very nice tool to use is the > > pep8 checker, https://pypi.python.org/pypi/pep8 > > > > I see that you use MSAProbs. I have an MSAProbs application wrapper in > > the works. I haven't submitted it yet due to incomplete unit tests, > > but maybe it's useful to you: > > > > https://github.com/cbrueffer/biopython/tree/msaprobs > > > > Cheers, > > > > Chris > > > > > > _______________________________________________ > > Biopython-dev mailing list > > Biopython-dev at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/biopython-dev > > > _______________________________________________ > Biopython-dev mailing list > Biopython-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython-dev > From p.j.a.cock at googlemail.com Thu Aug 22 09:01:27 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Thu, 22 Aug 2013 14:01:27 +0100 Subject: [Biopython-dev] Fwd: New Biopython (sub)module? In-Reply-To: References: <521354A9.6020701@brueffer.de> Message-ID: On Wed, Aug 21, 2013 at 11:00 PM, Cyrus Maher wrote: > > That said, I was also hoping to get your thoughts on whether this seemed > like the type of project that would fit in with Biopython. Peter said that > Eric might have some good comments on this matter? Right - I was thinking Eric and this year's phylogenetic focused GSoC students should have some good comments, e.g. about adding something like pal2nal into Biopython. 
Peter

From p.j.a.cock at googlemail.com Fri Aug 23 04:54:35 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 23 Aug 2013 09:54:35 +0100
Subject: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week?
In-Reply-To:
References: <1373620014.53407.YahooMailNeo@web164001.mail.gq1.yahoo.com> <1373680350.55044.YahooMailNeo@web164003.mail.gq1.yahoo.com> <1373712847.72527.YahooMailNeo@web164004.mail.gq1.yahoo.com>
Message-ID:

On Fri, Aug 16, 2013 at 8:14 AM, Wibowo Arindrarto wrote:
> Hi Michiel, Peter,
>
> In preparation for the 1.62 release, I've made the following changes
> to Bio.NCBIStandalone and Bio.ParserSupport:
>
> * Migrated the two modules under Bio.SearchIO._legacy
> * Upgraded their PendingDeprecationWarning to BiopythonDeprecationWarning

So basically you're proposing formally deprecating parsing plain
text BLAST output (via NCBIStandalone and Bio.ParserSupport)
but continuing to support this format via SearchIO (using a copy
of the current parser as a private module)?

This then gives you the freedom to rewrite the old text parser
more simply (e.g. assuming only recent versions of the BLAST
suite), which might be nice.

> I've pushed the changes to this branch:
> https://github.com/bow/biopython/tree/bio_blast_migrate
>
> Tests seem to be running fine still, but now there is the awkward
> situation where if users import Bio.NCBIStandalone and/or
> Bio.ParserSupport directly they will be greeted with two warnings: the
> BiopythonWarning for the modules' deprecation and the
> BiopythonExperimentalWarning for SearchIO.
>
> We could suppress the SearchIO warning in Bio.NCBIStandalone and
> Bio.ParserSupport. But before this is done, I was wondering if we have
> a defined timeline for removing a BiopythonExperimentalWarning? (i.e.
> if it will be removed in this release, then we could do that instead).

It doesn't make sense to have a defined timeline for removing a
BiopythonExperimentalWarning - it will be on a case by case basis.
Do you think SearchIO is ready for that now (or in Biopython 1.63)?

Peter

From p.j.a.cock at googlemail.com Fri Aug 23 05:05:02 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 23 Aug 2013 10:05:02 +0100
Subject: [Biopython-dev] Bio.PDB - Missing values (was Moratorium on commits?)
In-Reply-To:
References:
Message-ID:

On Tue, Aug 20, 2013 at 11:16 PM, Lenna Peterson wrote:
>
> On Thu, Aug 15, 2013 at 9:23 AM, Peter Cock wrote:
>>
>>
>> I didn't mean to suggest writing the string "None" in the field, and
>> I'm not sure if João did - it would certainly be an invalid PDB file.
>>
>> I agree that where the data structure has None (e.g. from our parser)
>> then the writer could use a blank string (of the appropriate width).
>> For mandatory fields like occupancy, this should give a warning.
>>
>
> As I suspected, the writer currently fails on None (it's expecting a float).
> Test-driven development!
>
> However, I don't see a simple or elegant way to force writing of a blank
> occupancy. ATOM lines are currently written using C-style string formatting,
> and the occupancy field is `%6.2f`.
>
> Off the top of my head, I'd:
>
> 1. Store the original format string
> 2. Modify the format string to have "%6s" at the appropriate position
> 3. Modify the occupancy to be an empty string or a space
> 4. Set the return value to the formatted string
> 5. Restore the original format string
> 6. Return the return value
>
> However, this seems...ugly at best. I don't know that switching formatting
> styles (e.g. to string.format() or others) will help. And in most
> circumstances, the type checking of the format string is useful.
>
> Any thoughts?

I would suggest something like this (untested):

$ git diff
diff --git a/Bio/PDB/PDBIO.py b/Bio/PDB/PDBIO.py
index 2f64571..11a52ca 100644
--- a/Bio/PDB/PDBIO.py
+++ b/Bio/PDB/PDBIO.py
@@ -8,7 +8,7 @@ from Bio.PDB.StructureBuilder import StructureBuilder # To allow saving of chains, residues, etc..
 from Bio.Data.IUPACData import atom_weights # Allowed Elements

-_ATOM_FORMAT_STRING="%s%5i %-4s%c%3s %c%4i%c   %8.3f%8.3f%8.3f%6.2f%6.2f      %4s%2s%2s\n"
+_ATOM_FORMAT_STRING="%s%5i %-4s%c%3s %c%4i%c   %8.3f%8.3f%8.3f%s%6.2f      %4s%2s%2s\n"


 class Select(object):
@@ -85,8 +85,21 @@ class PDBIO(object):
         x, y, z=atom.get_coord()
         bfactor=atom.get_bfactor()
         occupancy=atom.get_occupancy()
+        # Handle a missing occupancy (None) with a blank entry:
+        try:
+            occupancy_str = "%6.2f" % occupancy
+        except TypeError:
+            if occupancy is None:
+                occupancy_str = " " * 6
+                import warnings
+                from Bio import BiopythonWarning
+                # TODO - Introduce exception BiopythonWriterWarning?
+                warnings.warn("Missing occupancy will be recorded as blank",
+                              BiopythonWarning)
+            else:
+                raise TypeError("Invalid occupancy %r in atom %r" % (occupancy, atom))
         args=(record_type, atom_number, name, altloc, resname, chain_id,
-              resseq, icode, x, y, z, occupancy, bfactor, segid,
+              resseq, icode, x, y, z, occupancy_str, bfactor, segid,
               element, charge)
         return _ATOM_FORMAT_STRING % args

The error message could be improved (e.g. a more helpful identification
of the ATOM at fault)?

Peter

From w.arindrarto at gmail.com Sat Aug 24 06:22:56 2013
From: w.arindrarto at gmail.com (Wibowo Arindrarto)
Date: Sat, 24 Aug 2013 12:22:56 +0200
Subject: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week?
In-Reply-To: References: <1373620014.53407.YahooMailNeo@web164001.mail.gq1.yahoo.com> <1373680350.55044.YahooMailNeo@web164003.mail.gq1.yahoo.com> <1373712847.72527.YahooMailNeo@web164004.mail.gq1.yahoo.com> Message-ID: Hi Peter, everyone, >> In preparation for the 1.62 release, I've made the following changes >> to Bio.NCBIStandalone and Bio.ParserSupport: >> >> * Migrated the two modules under Bio.SearchIO._legacy >> * Upgraded their PendingDeprecationWarning to BiopythonDeprecationWarning > > So basically you're proposing formally deprecating parsing plain > text BLAST output (via NCBIStandalone and Bio.ParserSupport) > but continuing to support this format via SearchIO (using a copy > of the current parser as a private module)? > > This then gives you the freedom to rewrite the old text parser > more simply (e.g. assuming only recent versions of the BLAST > suite), which might be nice. Yes. This seems like a sensible thing to do now. >> I've pushed the changes to this branch: >> https://github.com/bow/biopython/tree/bio_blast_migrate >> >> Tests seem to be running fine still, but now there is the awkward >> situation where if users import Bio.NCBIStandalone and/or >> Bio.ParserSupport directly they will be greeted with two warnings: the >> BiopythonWarning for the modules' deprecation and the >> BiopythonExperimentalWarning for SearchIO. >> >> We could suppress the SearchIO warning in Bio.NCBIStandalone and >> Bio.ParserSupport. But before this is done, I was wondering if we have >> a defined timeline for removing a BiopythonExperimentalWarning? (i.e. >> if it will be removed in this release, then we could do that instead). > > It doesn't make sense to have a defined timetime for removing a > BiopythonExperimentalWarning - it will be on a case by case basis. > > Do you think SearchIO is ready for that now (or in Biopython 1.63)? 
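
One way the double warning quoted above could be avoided is to silence only the experimental category around the internal import. The following is a minimal sketch using the standard library alone, not the code in the branch: `ExperimentalWarning` and `import_experimental_module` are hypothetical stand-ins for `BiopythonExperimentalWarning` and for importing `Bio.SearchIO`.

```python
import warnings


class ExperimentalWarning(Warning):
    """Hypothetical stand-in for BiopythonExperimentalWarning."""


def import_experimental_module():
    """Stand-in for an import that warns as a side effect."""
    warnings.warn("this module is experimental", ExperimentalWarning)
    return "module"


def import_quietly():
    """Do the import with only ExperimentalWarning silenced.

    The filter change is local: catch_warnings() restores the
    previous filters when the block exits, so other warnings
    (for example a deprecation warning) are still shown.
    """
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", ExperimentalWarning)
        return import_experimental_module()
```

With this approach a user importing the deprecated module would see only the deprecation warning, while a direct import of the experimental module would still warn as before.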

Hmm... what I have in mind is actually that as soon as we lift SearchIO's
BiopythonExperimentalWarning, we give Bio.Blast a
PendingDeprecationWarning. I think this gives users a clearer / firmer
choice, since it could be confusing to have two different modules that
handle BLAST parsing in Biopython.

As for the readiness, I think the important features that we planned
have been implemented in SearchIO. I don't have any major feature
change that I would like to implement anytime soon, either. So yes, I
think it is ready.

Best,
Bow

From yeyanbo289 at gmail.com Sun Aug 25 23:53:50 2013
From: yeyanbo289 at gmail.com (Yanbo Ye)
Date: Mon, 26 Aug 2013 11:53:50 +0800
Subject: [Biopython-dev] GSOC weekly update 11
Message-ID:

Hi all,

Biopython.Phylo project update for last week is here:
http://blog.yeyanbo.com/posts/google-summer-of-code-11.html

Thanks,
Yanbo

--
Yanbo Ye
Guangzhou Institutes of Biomedicine and Health,
Chinese Academy of Sciences
190 Kaiyuan Avenue, Science Park, Guangzhou, China
Email: ye_yanbo at gibh.ac.cn
Web: http://www.yeyanbo.com
Phone: (86)-020-32093810

From p.j.a.cock at googlemail.com Mon Aug 26 10:04:35 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Mon, 26 Aug 2013 15:04:35 +0100
Subject: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week?
In-Reply-To:
References: <1373620014.53407.YahooMailNeo@web164001.mail.gq1.yahoo.com> <1373680350.55044.YahooMailNeo@web164003.mail.gq1.yahoo.com> <1373712847.72527.YahooMailNeo@web164004.mail.gq1.yahoo.com>
Message-ID:

On Sat, Aug 24, 2013 at 11:22 AM, Wibowo Arindrarto wrote:
> Hi Peter, everyone,
>
> As for the readiness, I think the important features that we planned
> have been implemented in SearchIO. I don't have any major feature
> change that I would like to implement anytime soon, too. So yes, I
> think it is ready.

So you'd be comfortable with removing the experimental warning
for SearchIO in Biopython 1.62 final (this week if the PDB occupancy
thing is resolved)?

And you would like to officially support plain text BLAST parsing
(despite it not being recommended by the NCBI, and known to have
been quite a lot of work in the past to keep the parser working)?

We should probably also give you (Bow) commit rights too, so you
can handle basic parser updates within SearchIO directly - assuming
you're happy with that?

Regards,

Peter

From w.arindrarto at gmail.com Mon Aug 26 12:04:38 2013
From: w.arindrarto at gmail.com (Wibowo Arindrarto)
Date: Mon, 26 Aug 2013 18:04:38 +0200
Subject: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week?
In-Reply-To:
References: <1373620014.53407.YahooMailNeo@web164001.mail.gq1.yahoo.com> <1373680350.55044.YahooMailNeo@web164003.mail.gq1.yahoo.com> <1373712847.72527.YahooMailNeo@web164004.mail.gq1.yahoo.com>
Message-ID:

On Mon, Aug 26, 2013 at 4:04 PM, Peter Cock wrote:
> On Sat, Aug 24, 2013 at 11:22 AM, Wibowo Arindrarto wrote:
>> Hi Peter, everyone,
>>
>> As for the readiness, I think the important features that we planned
>> have been implemented in SearchIO. I don't have any major feature
>> change that I would like to implement anytime soon, too. So yes, I
>> think it is ready.
>
> So you'd be comfortable with removing the experimental warning
> for SearchIO in Biopython 1.62 final (this week if the PDB occupancy
> thing is resolved)?

Yes. I think all public-facing modules are ok now. There are still two
issues which I consider minor, but I think should be mentioned before
we lift the warning:

1. Storing [T]FAST[X|Y] query and hit strand information (see
https://redmine.open-bio.org/issues/3419). I'm not sure yet if I
should do the commit, but Jason's patch looks sensible (and I can
probably add some more so that the parser knows whether to set the
strand on hit or query sequence).

2. Collapsing / merging overlapping HSPs. I've received one (or two)
mail(s) asking whether it is possible to merge overlapping HSPs
(apparently BLAST sometimes does this).
I haven't figured a way to cleanly implement this, so this is on hold for now. In addition, we had a discussion some months ago about the Bio._utils module that SearchIO uses (see http://lists.open-bio.org/pipermail/biopython-dev/2013-January/010219.html, http://lists.open-bio.org/pipermail/biopython-dev/2013-January/010240.html, and http://lists.open-bio.org/pipermail/biopython-dev/2013-February/010290.html). We had an extensive discussion about this last time, which went as far as considering a change on how we run our tests. Since the Bio._utils module itself is private, however, no public-facing functions in SearchIO is affected. Other than these, some planned features are implementing the HMMER3.1 parser (which I think should not interfere with lifting the warning). > And you would like to officially support plain text BLAST parsing > (despite it not being recommend by the NCBI, and known to have > been quite a lot of work in the past to keep the parser working)? Looking at http://lists.open-bio.org/pipermail/biopython/2012-September/008166.html, the most sensible approach seems to be to put the current parser under SearchIO (hence the module reorganization I did; so we can deprecate Bio.Blast as a whole without losing functionality), without actually advertising that we have full support of parsing the text output (perhaps put a disclaimer that plain text is not guaranteed to work?). I feel like some people may still want to use previous BLAST versions anyway, and we do have a functioning parser tested up to 2.2.26+, so throwing it away doesn't seem to be the best thing to do here. And in the case that someone does want to extend the parser (could be me, could be someone else) to work with the latest BLAST version, (s)he can then extend the existing parser. > We should probably also give you (Bow) commit rights too, so you > can handle basic parser updates within SearchIO directly - assuming > you're happy with that? This is fine with me. Best, Bow P.S. 
I made the pull request for the reorganization here: https://github.com/biopython/biopython/pull/223, comments are welcomed :). From p.j.a.cock at googlemail.com Tue Aug 27 04:41:39 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Tue, 27 Aug 2013 09:41:39 +0100 Subject: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week? In-Reply-To: References: <1373620014.53407.YahooMailNeo@web164001.mail.gq1.yahoo.com> <1373680350.55044.YahooMailNeo@web164003.mail.gq1.yahoo.com> <1373712847.72527.YahooMailNeo@web164004.mail.gq1.yahoo.com> Message-ID: On Mon, Aug 26, 2013 at 5:04 PM, Wibowo Arindrarto wrote: > >> So you'd be comfortable with removing the experimental warning >> for SearchIO in Biopython 1.62 final (this week if the PDB occupancy >> thing is resolved)? > > Yes. I think all public-facing modules are ok now. There are still two > issue which I consider minor, but I think should be mentioned before > we lift the warning: > > ... > > Other than these, some planned features are implementing the HMMER3.1 > parser (which I think should not interfere with lifting the warning). We'll also want to update the Tutorial as well, merging the BLAST and SearchIO chapters. Let's start work on this just after releasing Biopython 1.62 then, which I think we can now go ahead with :) Lenna has sorted out the PDB occupancy issue, and Eric has updated the PRANK unit tests. I think this means we are OK to do the release in the next day or two? Any objections? Regards, Peter From p.j.a.cock at googlemail.com Tue Aug 27 04:43:17 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Tue, 27 Aug 2013 09:43:17 +0100 Subject: [Biopython-dev] Releasing Biopython 1.62 this week? 
Message-ID: Continuing this thread under a new title, as below, I would like to do the Biopython 1.62 release in the next day or two: http://lists.open-bio.org/pipermail/biopython-dev/2013-August/010836.html Peter On Tue, Aug 27, 2013 at 9:41 AM, Peter Cock wrote: > On Mon, Aug 26, 2013 at 5:04 PM, Wibowo Arindrarto wrote: >> >>> So you'd be comfortable with removing the experimental warning >>> for SearchIO in Biopython 1.62 final (this week if the PDB occupancy >>> thing is resolved)? >> >> Yes. I think all public-facing modules are ok now. There are still two >> issue which I consider minor, but I think should be mentioned before >> we lift the warning: >> >> ... >> >> Other than these, some planned features are implementing the HMMER3.1 >> parser (which I think should not interfere with lifting the warning). > > We'll also want to update the Tutorial as well, merging the BLAST > and SearchIO chapters. Let's start work on this just after releasing > Biopython 1.62 then, which I think we can now go ahead with :) > > Lenna has sorted out the PDB occupancy issue, and Eric has > updated the PRANK unit tests. > > I think this means we are OK to do the release in the next day > or two? Any objections? > > Regards, > > Peter From w.arindrarto at gmail.com Tue Aug 27 05:41:32 2013 From: w.arindrarto at gmail.com (Wibowo Arindrarto) Date: Tue, 27 Aug 2013 11:41:32 +0200 Subject: [Biopython-dev] Releasing Biopython 1.62 this week? 
In-Reply-To: References: Message-ID: Hi Peter, everyone, On Tue, Aug 27, 2013 at 10:43 AM, Peter Cock wrote: > Continuing this thread under a new title, as below, I would > like to do the Biopython 1.62 release in the next day or two: > > http://lists.open-bio.org/pipermail/biopython-dev/2013-August/010836.html > > Peter > > On Tue, Aug 27, 2013 at 9:41 AM, Peter Cock wrote: >> On Mon, Aug 26, 2013 at 5:04 PM, Wibowo Arindrarto wrote: >>> >>>> So you'd be comfortable with removing the experimental warning >>>> for SearchIO in Biopython 1.62 final (this week if the PDB occupancy >>>> thing is resolved)? >>> >>> Yes. I think all public-facing modules are ok now. There are still two >>> issue which I consider minor, but I think should be mentioned before >>> we lift the warning: >>> >>> ... >>> >>> Other than these, some planned features are implementing the HMMER3.1 >>> parser (which I think should not interfere with lifting the warning). >> >> We'll also want to update the Tutorial as well, merging the BLAST >> and SearchIO chapters. Let's start work on this just after releasing >> Biopython 1.62 then, which I think we can now go ahead with :) Ah yes. I missed the tutorial. Then yes, it should be updated as well. If we are doing this after 1.62 is released, is worth it to aim for a larger change (I recall there was a discussion some time ago about porting the tutorial to Sphinx). >> Lenna has sorted out the PDB occupancy issue, and Eric has >> updated the PRANK unit tests. >> >> I think this means we are OK to do the release in the next day >> or two? Any objections? No objections from me :). Best, Bow From eric.talevich at gmail.com Tue Aug 27 14:45:58 2013 From: eric.talevich at gmail.com (Eric Talevich) Date: Tue, 27 Aug 2013 11:45:58 -0700 Subject: [Biopython-dev] Releasing Biopython 1.62 this week? 
In-Reply-To: References: Message-ID: On Tue, Aug 27, 2013 at 1:43 AM, Peter Cock wrote: > Continuing this thread under a new title, as below, I would > like to do the Biopython 1.62 release in the next day or two: > > http://lists.open-bio.org/pipermail/biopython-dev/2013-August/010836.html > > Peter > > On Tue, Aug 27, 2013 at 9:41 AM, Peter Cock wrote: > > On Mon, Aug 26, 2013 at 5:04 PM, Wibowo Arindrarto wrote: > >> > >>> So you'd be comfortable with removing the experimental warning > >>> for SearchIO in Biopython 1.62 final (this week if the PDB occupancy > >>> thing is resolved)? > >> > >> Yes. I think all public-facing modules are ok now. There are still two > >> issue which I consider minor, but I think should be mentioned before > >> we lift the warning: > >> > >> ... > >> > >> Other than these, some planned features are implementing the HMMER3.1 > >> parser (which I think should not interfere with lifting the warning). > > > > We'll also want to update the Tutorial as well, merging the BLAST > > and SearchIO chapters. Let's start work on this just after releasing > > Biopython 1.62 then, which I think we can now go ahead with :) > > > > Lenna has sorted out the PDB occupancy issue, and Eric has > > updated the PRANK unit tests. > > > > I think this means we are OK to do the release in the next day > > or two? Any objections? > > > > Regards, > > > > Peter > Sounds good. Mind if I sneak in a quick update to the Phylo chapter of the Tutorial to mention CDAO support? Also, has anything else noteworthy been added since the beta that we can announce in the NEWS file? Thanks, Eric From p.j.a.cock at googlemail.com Tue Aug 27 15:27:48 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Tue, 27 Aug 2013 20:27:48 +0100 Subject: [Biopython-dev] Releasing Biopython 1.62 this week? In-Reply-To: References: Message-ID: On Tue, Aug 27, 2013 at 7:45 PM, Eric Talevich wrote: > > Sounds good. 
Mind if I sneak in a quick update to the Phylo chapter of the
> Tutorial to mention CDAO support?

Go for it - I need to retest the DSSP unit test tomorrow anyway.

> Also, has anything else noteworthy been added since the beta that we can
> announce in the NEWS file?

Minor bug fixes and more tests? Perhaps the PDB occupancy change?

Peter

From w.arindrarto at gmail.com Wed Aug 28 08:12:24 2013
From: w.arindrarto at gmail.com (Wibowo Arindrarto)
Date: Wed, 28 Aug 2013 14:12:24 +0200
Subject: [Biopython-dev] Releasing Biopython 1.62 this week?
In-Reply-To:
References:
Message-ID:

Hi Peter, everyone,

On Tue, Aug 27, 2013 at 9:27 PM, Peter Cock wrote:
> On Tue, Aug 27, 2013 at 7:45 PM, Eric Talevich wrote:
>>
>> Sounds good. Mind if I sneak in a quick update to the Phylo chapter of the
>> Tutorial to mention CDAO support?
>
> Go for it - I need to retest the DSSP unit test tomorrow anyway.
>
>> Also, has anything else noteworthy been added since the beta that we can
>> announce in the NEWS file?
>
> Minor bug fixes and more tests? Perhaps the PDB occupancy change?
>
> Peter

I don't like to believe in coincidences, but just last night a user
emailed me about an issue in SearchIO's exonerate parser which I feel
should be mentioned here (exchange attached with his permission). He
stumbled on an error where an exonerate output file is unparseable
because of split codon alignments. In short, I feel we should not lift
the BiopythonExperimentalWarning for the 1.62 release.

The issue is caused by protein to genome alignments in exonerate (in
the protein2genome alignment mode) that have split codons in them. When
split codons are present, SearchIO splits these HSPs into fragments,
each of which is basically a single contiguous sequence alignment. These
fragments have their own Seq objects (representing hit and query
sequences).
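
To make the constraint on these fragments concrete, here is a minimal sketch; it is not SearchIO's actual code. It assumes the protein row of the alignment is given in three-letter residue codes and is converted to one-letter codes before a Seq object is built. `THREE_TO_ONE` and `three_to_one` are hypothetical names introduced only for illustration.

```python
# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def three_to_one(seq3):
    """Convert e.g. 'MetLysAla' to 'MKA' (hypothetical helper)."""
    if len(seq3) % 3 != 0:
        # A fragment that was cut inside a residue ends up here.
        raise ValueError("length %d is not a multiple of three" % len(seq3))
    # Walk the string three characters at a time and look up each residue.
    return "".join(THREE_TO_ONE[seq3[i:i + 3]] for i in range(0, len(seq3), 3))
```

A fragment that ends on a partial residue, such as "MetLy", fails the length check, which is roughly the situation a split codon creates for a fragment that must hold a complete sequence.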

The problem is, these Seq objects have to be full sequences and the
query sequence fragment (protein) does not represent a full sequence
here (since the underlying codon is split).

Currently, SearchIO raises an AssertionError when this type of
alignment is found and simply says it cannot deal with it. This should
not remain the case, though. A test case was actually put up for this
(https://github.com/biopython/biopython/blob/master/Tests/Exonerate/exn_22_m_protein2genome.exn#L173).
However, since I have yet to find a way to properly represent these
fragments with Seq objects, the actual test has not been written (and
I missed this when doing the last review).

I have thought of several alternatives:

* I saw a ThreeLetterProtein Alphabet in
https://github.com/biopython/biopython/blob/master/Bio/Alphabet/__init__.py#L136,
maybe we could use this to create Seq objects that allow partial
codons?
* Change HSPFragment to not use full Seq objects anymore (which may
require some rework on the HSP objects as well)

But I have not explored them thoroughly. I should note that Zheng Ruan's
GSoC project on Codon alignments
(http://zruanweb.com/category/gsoc.html) may prove useful as well
here.

While I don't expect the issue to pop up often (it shows up only when
exonerate is used with the protein2genome mode out of the many modes
it has and when the alignment hits a split codon), I feel like it
should be discussed (or at least mentioned) here first since dealing
with the issue may require some more reworking.

So I'm sorry for the late warning and for missing this. I hope this is
not too late :).

Best,
Bow

-------------- next part --------------
On Wed, Aug 28, 2013 at 10:31 AM, Wibowo Arindrarto wrote:
> Hi Somak,
>
>> Do you have any idea whether Bioperl based Exonerate parser can handle such cases?
>> I'm yet to try Bioperl.
> > I tried your file with Bioperl's parser, and while it can parse the > entire file without errors, I don't know whether all the information > in the file (sequence, sequence coordinates) are parsed properly. But > maybe that's just me being less familiar with Bioperl. I suggest > posting to their mailing list > (http://lists.open-bio.org/pipermail/bioperl-l/) or searching the list > archive if you have any questions regarding this. The library also > have an active community behind it. > >> And please feel free to forward this mail to Biopythonlist or any other discussion forum you >> think is appropriate, > > Ok, thanks :). > >> Thanks again >> >> Somak Ray > > Best, > Bow > >> ________________________________________ >> From: w.arindrarto at gmail.com [w.arindrarto at gmail.com] on behalf of Wibowo Arindrarto [bow at bow.web.id] >> Sent: Tuesday, August 27, 2013 8:01 PM >> To: Ray, Somak >> Subject: Re: On parsing of exonerate output >> >> Hi Somak, >> >>> Dear Dr. Arindrarto, >>> >>> I came across your blog about parsing outputs from Exonerate . I have some >>> generated some files using exonarates protein2dna model. However when >>> running your scripts on them I'm getting some assertion error in python 2.7. >>> I'm attaching two of such exonerate outputs.The "Result_goodfile.txt" can >>> be passed by the parser whereas "Result_badfile.txt" can't be parsed. >>> >>> Please let me know if there's any solution to the problem. >>> >>> Thanks in advance >> >> Hmm..looking at the files, it seems that this is caused by a split >> codon in the alignment (Results_badfile.txt, line 25). The problem is, >> the three-letter amino acid sequence needs to be translated into a >> single-letter amino acid sequence since Biopython could not create Seq >> objects with three-letter amino acid codes. 
However, this conversion >> means that codons that span introns (as the one on line 25) could not >> be dealt with properly since a single fragment expects a full Seq >> object (hence the error you're seeing; it expects the three-letter >> amino acid sequence length to be multiples of three). >> >> So the short answer is no, there is not yet an immediate solution to this issue. >> >> I should mention that this came at an appropriate time, though, so >> thanks for the email :). I am reviewing known SearchIO issues and this >> was apparently an issue that I have lost track of (checking at the >> test suite, there is a test for this case but it has not been included >> in the test suite). >> >> Do you mind if I forward this email to the Biopython list >> (http://biopython.org/wiki/Mailing_lists)? I think other developers / >> users may be interested in this. >> >> Best, >> Bow From p.j.a.cock at googlemail.com Wed Aug 28 13:31:19 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 28 Aug 2013 18:31:19 +0100 Subject: [Biopython-dev] Releasing Biopython 1.62 this week? 
In-Reply-To: References: Message-ID: Hello all, I'm starting the release 1.62 process now, getting the new DSSP test working cross platform was more work than I expected - thank goodness for the BuildBot server yet again :) Please don't commit anything to the master branch until further notice, Thanks, Peter From p.j.a.cock at googlemail.com Wed Aug 28 14:28:43 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 28 Aug 2013 19:28:43 +0100 Subject: [Biopython-dev] Biopython 1.62 release in progress Message-ID: On Wed, Aug 28, 2013 at 6:31 PM, Peter Cock wrote: > Hello all, > > I'm starting the release 1.62 process now, getting the new DSSP > test working cross platform was more work than I expected - > thank goodness for the BuildBot server yet again :) > > Please don't commit anything to the master branch until further > notice, > > Thanks, > > Peter While I finish off the Windows installers etc, and have dinner, would anyone like to volunteer to write a draft for the release announcement to go out on the mailing lists and news blog? http://news.open-bio.org/news/category/obf-projects/biopython/ These are usually based on the rather dry NEWS file information, and the previous announcement for style/links/etc. Thanks, Peter From p.j.a.cock at googlemail.com Wed Aug 28 14:53:21 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 28 Aug 2013 19:53:21 +0100 Subject: [Biopython-dev] Post Biopython 1.62 release, clean-up after dropping Python 2.5 Message-ID: Hello all - especially newcomers, There are going to be several boring but useful things to do to the Biopython code base once we're finished with Python 2.5 (the imminent release of Biopython 1.62 has been clearly described as the final Biopython release to support it). 
Some of these tasks are quite easy, and might tempt some of our non-core contributors or newcomers to have a go. However, to avoid too much duplication of effort I'd suggest **replying in this thread if you want to tackle anything** - and then start working out how to send us your first pull request.

Things which will need doing:

(0) Disable the Python 2.5 and Jython 2.5 buildbot (this will be done by me or Tiago)

(1) Disable the Python 2.5 target in TravisCI, see https://travis-ci.org/biopython/biopython/ (this is a simple one line edit to the .travis.yml file)

(2) Remove all the with statement imports (and any comment lines associated with them):

from __future__ import with_statement

(3) Remove Bio/_py3k/_namedtuple.py and adjust import lines accordingly

(4) Scan over the code base looking for any comments about Python 2.5 (e.g. using the grep command), and review them one by one to see if there is an old workaround we can now remove.

(5) More advanced code review, for example looking for places we can better take advantage of context managers (with statements) for file handles.

Of this list, (1), (2) and (3) are certainly things suitable for relative newcomers - and assuming I'm not away I will happily do the pull request reviews. For the more advanced issues (4) and (5) we may need more eyes on the code...
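As an illustration of the kind of change task (5) has in mind, here is a minimal sketch. The function names and the record-counting job are hypothetical and are not taken from the Biopython code base; the point is only the before/after shape of the file handling.

```python
import os
import tempfile


def write_example_fasta():
    """Write a tiny two-record FASTA file and return its path (demo helper)."""
    fd, path = tempfile.mkstemp(suffix=".fasta")
    with os.fdopen(fd, "w") as handle:
        handle.write(">seq1\nACGT\n>seq2\nGGCC\n")
    return path


def count_records_manual(path):
    """Older style: the handle has to be closed by hand."""
    handle = open(path)
    try:
        return sum(1 for line in handle if line.startswith(">"))
    finally:
        handle.close()


def count_records_with(path):
    """Same job with a context manager; the handle is closed on exit."""
    with open(path) as handle:
        return sum(1 for line in handle if line.startswith(">"))
```

Both functions return the same count; the second one closes the file even if an exception is raised inside the block, without needing the explicit try/finally.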
Thank you, Peter From p.j.a.cock at googlemail.com Wed Aug 28 15:01:36 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 28 Aug 2013 20:01:36 +0100 Subject: [Biopython-dev] Biopython 1.62 release in progress In-Reply-To: References: Message-ID: On Wed, Aug 28, 2013 at 7:28 PM, Peter Cock wrote: > On Wed, Aug 28, 2013 at 6:31 PM, Peter Cock wrote: >> Hello all, >> >> I'm starting the release 1.62 process now, getting the new DSSP >> test working cross platform was more work than I expected - >> thank goodness for the BuildBot server yet again :) >> >> Please don't commit anything to the master branch until further >> notice, >> >> Thanks, >> >> Peter > > While I finish off the Windows installers etc, and have dinner, > would anyone like to volunteer to write a draft for the release > announcement to go out on the mailing lists and news blog? > http://news.open-bio.org/news/category/obf-projects/biopython/ > > These are usually based on the rather dry NEWS file information, > and the previous announcement for style/links/etc. > > Thanks, > > Peter A provisional tar-ball, zip file, and four Windows installers are up now (but deliberately not yet listed on the download wiki page): http://biopython.org/DIST/ If anyone would care to sanity test those in the next hour or two, that would be great. Thanks, Peter From p.j.a.cock at googlemail.com Wed Aug 28 16:43:58 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 28 Aug 2013 21:43:58 +0100 Subject: [Biopython-dev] Post Biopython 1.62 release, clean-up after dropping Python 2.5 In-Reply-To: References: Message-ID: On Wed, Aug 28, 2013 at 7:53 PM, Peter Cock wrote: > Hello all - especially newcomers, > > There are going to be several boring but useful things to do to > the Biopython code base once we're finished with Python 2.5 > (the imminent release of Biopython 1.62 has been clearly > described as the final Biopython release to support it). 
> > Some of these tasks are quite easy, and might tempt some > of our non-core contributors or new-comers to have a go, > however to avoid too much duplication of effort I'd suggest > **replying in this thread if you want to tackle anything** - and > then start working out how to send us your first pull request. I tweeted this earlier, https://twitter.com/pjacock/status/372796602760855552 > Things which will need doing: > > ... > > (1) Disable the Python 2.5 target in TravisCI, see > https://travis-ci.org/biopython/biopython/ > (this is a simple one line edit to the .travis.yml file) The first easy task has been claimed already: https://github.com/biopython/biopython/pull/226 Wayne wrote: >> Via Twitter, I saw your note" >> (1) Disable the Python 2.5 target in TravisCI, see >> https://travis-ci.org/biopython/biopython/ >> (this is a simple one line edit to the .travis.yml file)" >> >> Turned out it really was as easy as you said. Once the release is out, that fix can go in - thanks :) Wayne (BCC'd), please sign up to the biopython-dev list if you haven't already: http://lists.open-bio.org/mailman/listinfo/biopython-dev Thank you, Peter From arklenna at gmail.com Wed Aug 28 16:57:10 2013 From: arklenna at gmail.com (Lenna Peterson) Date: Wed, 28 Aug 2013 16:57:10 -0400 Subject: [Biopython-dev] Post Biopython 1.62 release, clean-up after dropping Python 2.5 In-Reply-To: References: Message-ID: On Wed, Aug 28, 2013 at 2:53 PM, Peter Cock wrote: > > (2) Remove all the with statement imports (and any > comment lines associated with them): > > from __future__ import with_statement > As I demonstrated, I regularly forget that `with` is "new"! > > (4) Scan over the code base looking for any comments > about Python 2.5 (e.g. using the grep command), and > reviewing them one by one to see if there is an old > workaround we can now remove. > If I count: find Bio -name "*.py" -exec grep -H -n ".*#.*2\.5" {} \; I only see 24 - not too bad. Many are `with` related. 
> > (5) More advanced code review, for example looking > for places we can better take advantage of context > managers (with statements) for file handles. > For this one: find Bio -name "*.py" -exec grep -H -n -P "= ?open\(" {} \; I find 145...although not all `open()` statements can be easily swapped for `with`. I'm currently prepping for my UK trip so I may not be able to do any of this before I get back mid-September. Cheers, Lenna From p.j.a.cock at googlemail.com Wed Aug 28 16:58:58 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 28 Aug 2013 21:58:58 +0100 Subject: [Biopython-dev] Post Biopython 1.62 release, clean-up after dropping Python 2.5 In-Reply-To: References: Message-ID: On Wed, Aug 28, 2013 at 9:43 PM, Peter Cock wrote: > On Wed, Aug 28, 2013 at 7:53 PM, Peter Cock wrote: >> Hello all - especially newcomers, >> >> There are going to be several boring but useful things to do to >> the Biopython code base once we're finished with Python 2.5 >> (the imminent release of Biopython 1.62 has been clearly >> described as the final Biopython release to support it). >> >> Some of these tasks are quite easy, and might tempt some >> of our non-core contributors or new-comers to have a go, >> however to avoid too much duplication of effort I'd suggest >> **replying in this thread if you want to tackle anything** - and >> then start working out how to send us your first pull request. > > I tweeted this earlier, > https://twitter.com/pjacock/status/372796602760855552 > >> Things which will need doing: >> >> ... >> >> (1) Disable the Python 2.5 target in TravisCI, see >> https://travis-ci.org/biopython/biopython/ >> (this is a simple one line edit to the .travis.yml file) > > The first easy task has been claimed already: > https://github.com/biopython/biopython/pull/226 And task (2) as well on the same pull request - keen! 
Wayne (BCC'd), could you delay trying task (3) for a few days to give someone else a chance please ;) Maybe have a look for things under (4) instead, Lenna's quick count suggests plenty of things need looking at... Peter From w.arindrarto at gmail.com Wed Aug 28 17:17:57 2013 From: w.arindrarto at gmail.com (Wibowo Arindrarto) Date: Wed, 28 Aug 2013 23:17:57 +0200 Subject: [Biopython-dev] Biopython 1.62 release in progress In-Reply-To: References: Message-ID: Hi everyone, I've written a draft of our 1.62 release (below). I'd appreciate it if somebody gives it another look (for typos, etc.). Also, if I miss somebody in the contributors list, please let me know :). --- Biopython 1.62 released ======================= Source distributions and Windows installers for **Biopython** 1.62 are now available from the [downloads page](http://biopython.org/wiki/Download) on the [official Biopython website](http://biopython.org/wiki/Main_Page) and from the [Python Package Index (PyPI)](https://pypi.python.org/pypi/biopython). # Python support This is our first official release that supports Python 3. Specifically, we tested under Python 3.3. Other versions of Python 3 may still work albeit with some issues. We still fully support Python 2.5, 2.6, and 2.7. Support under [Jython](http://www.jython.org/) is available for versions 2.5 and 2.7 and under [PyPy](http://pypy.org/) for versions 1.9 and 2.0. However, unlike CPython, Jython and PyPy support is partial: NumPy and our C extensions are not covered. Please note that this release marks our last official support Python 2.5. Beginning from Biopython 1.63, the minimum supported Python version will be 2.6. # Highlights * The translation functions will give a warning on any partial codons (and this will probably become an error in a future release). If you know you are dealing with partial sequences, either pad with N to extend the sequence length to a multiple of three, or explicitly trim the sequence. 
* The handling of joins and related complex features in Genbank/EMBL files has been changed with the introduction of a CompoundLocation object. Previously a SeqFeature for something like a multi-exon CDS would have a child SeqFeature (under the sub_features attribute) for each exon. The sub_features property will still be populated for now, but is deprecated and will in future be removed. Please consult the examples in the help (docstrings) and Tutorial. * Thanks to the efforts of Ben Morris, the Phylo module now supports the file formats NeXML and CDAO. The Newick parser is also significantly faster, and can now optionally extract bootstrap values from the Newick comment field (like Molphy and Archaeopteryx do). Nate Sutton added a wrapper for FastTree to Bio.Phylo.Applications. * New module Bio.UniProt adds parsers for the GAF, GPA and GPI formats from UniProt-GOA. * The BioSQL module is now supported in Jython. MySQL and PostgreSQL databases can be used. The relevant JDBC driver should be available in the CLASSPATH. * Feature labels on circular GenomeDiagram figures now support the label_position argument (start, middle or end) in addition to the current default placement, and in a change to prior releases these labels are outside the features which is now consistent with the linear diagrams. * The code for parsing 3D structures in mmCIF files was updated to use the Python standard library's shlex module instead of C code using flex. * The Bio.Sequencing.Applications module now includes a BWA command line wrapper. * Bio.motifs supports JASPAR format files with multiple position-frequence matrices. Additionally there have been other minor bug fixes and more unit tests. 
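For the partial codon warning in the first highlight, the two suggested workarounds (padding with N or trimming) can be sketched as below. This is only an illustration using plain Python strings; the helper names are hypothetical and are not part of Biopython.

```python
def trim_to_codons(seq):
    """Drop trailing letters so the length is a multiple of three."""
    return seq[: len(seq) - len(seq) % 3]


def pad_to_codons(seq, pad="N"):
    """Append pad letters so the length is a multiple of three."""
    missing = (3 - len(seq) % 3) % 3
    return seq + pad * missing


partial = "ATGGCCA"  # seven letters, so one letter beyond two full codons
print(trim_to_codons(partial))
print(pad_to_codons(partial))
```

Trimming discards the incomplete codon, while padding keeps it and would normally yield an ambiguous residue on translation, so which one is appropriate depends on the data.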
# Contributors

Many thanks to the Biopython developers and community for making this release possible, especially the following contributors:

Alexander Campbell (first contribution)
Andrea Rizzi (first contribution)
Anthony Mathelier (first contribution)
Ben Morris (first contribution)
Brad Chapman
Christian Brueffer
David Arenillas (first contribution)
David Martin (first contribution)
Eric Talevich
Iddo Friedberg
Jian-Long Huang (first contribution)
Joao Rodrigues
Kai Blin
Michiel de Hoon
Nate Sutton (first contribution)
Peter Cock
Petra Kubincová (first contribution)
Phillip Garland
Saket Choudhary (first contribution)
Tiago Antao
Wibowo 'Bow' Arindrarto
Xabier Bello (first contribution)

----

Best,
Bow

On Wed, Aug 28, 2013 at 8:28 PM, Peter Cock wrote: > On Wed, Aug 28, 2013 at 6:31 PM, Peter Cock wrote: >> Hello all, >> >> I'm starting the release 1.62 process now, getting the new DSSP >> test working cross platform was more work than I expected - >> thank goodness for the BuildBot server yet again :) >> >> Please don't commit anything to the master branch until further >> notice, >> >> Thanks, >> >> Peter > > While I finish off the Windows installers etc, and have dinner, > would anyone like to volunteer to write a draft for the release > announcement to go out on the mailing lists and news blog? > http://news.open-bio.org/news/category/obf-projects/biopython/ > > These are usually based on the rather dry NEWS file information, > and the previous announcement for style/links/etc.
> > Thanks, > > Peter > _______________________________________________ > Biopython-dev mailing list > Biopython-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython-dev From p.j.a.cock at googlemail.com Wed Aug 28 17:30:33 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 28 Aug 2013 22:30:33 +0100 Subject: [Biopython-dev] Biopython 1.62 release in progress In-Reply-To: References: Message-ID: On Wed, Aug 28, 2013 at 10:17 PM, Wibowo Arindrarto wrote: > Hi everyone, > > I've written a draft of our 1.62 release (below). I'd appreciate it if > somebody gives it another look (for typos, etc.). Also, if I miss > somebody in the contributors list, please let me know :). Thanks Bow - I don't think the WordPress blog understands markdown style markup, but bonus marks anyway :) I'm about to update the tar-ball and zip file to include the NEWS file updated with the two names Bow spotted as missing - hopefully there are no more and this commit will get the release tag: https://github.com/biopython/biopython/commit/73f8483f23910c8205cd9a4ff1283f2747d4f4ff (The Windows installers I prepared earlier should not be affected as they don't include the NEWS file) > # Python support > > This is our first official release that supports Python 3. > Specifically, we tested under Python 3.3. Other versions > of Python 3 may still work albeit with some issues. I'd be a bit more explicit: Specifically, this is supported under Python 3.3. Older versions of Python 3 may still work albeit with some issues, but are *not* supported. > Please note that this release marks our last official support Python > 2.5. Beginning from Biopython 1.63, the minimum supported Python > version will be 2.6. Minor typo, needs a for/of, e.g. 
Please note that this release marks our last official support for Python 2.5 Thanks Bow, Peter From w.arindrarto at gmail.com Wed Aug 28 18:17:44 2013 From: w.arindrarto at gmail.com (Wibowo Arindrarto) Date: Thu, 29 Aug 2013 00:17:44 +0200 Subject: [Biopython-dev] Biopython 1.62 release in progress In-Reply-To: References: Message-ID: Hi Peter, > Thanks Bow - I don't think the WordPress blog understands > markdown style markup, but bonus marks anyway :) Ah yes, I was planning to convert it later to HTML (I find writing markdown first easier ~ and also more mailing-list friendly). > I'm about to update the tar-ball and zip file to include the > NEWS file updated with the two names Bow spotted as > missing - hopefully there are no more and this commit > will get the release tag: > > https://github.com/biopython/biopython/commit/73f8483f23910c8205cd9a4ff1283f2747d4f4ff > > (The Windows installers I prepared earlier should not be > affected as they don't include the NEWS file) > >> # Python support >> >> This is our first official release that supports Python 3. >> Specifically, we tested under Python 3.3. Other versions >> of Python 3 may still work albeit with some issues. > > I'd be a bit more explicit: > > Specifically, this is supported under Python 3.3. Older > versions of Python 3 may still work albeit with some > issues, but are *not* supported. > >> Please note that this release marks our last official support Python >> 2.5. Beginning from Biopython 1.63, the minimum supported Python >> version will be 2.6. > > Minor typo, needs a for/of, e.g. > > Please note that this release marks our last official support for > Python 2.5 > > Thanks Bow, > > Peter Fixes applied, thanks too :). 
Best, Bow From p.j.a.cock at googlemail.com Wed Aug 28 18:21:54 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 28 Aug 2013 23:21:54 +0100 Subject: [Biopython-dev] Biopython 1.62 release in progress In-Reply-To: References: Message-ID: On Wed, Aug 28, 2013 at 11:17 PM, Wibowo Arindrarto wrote: > Hi Peter, > >> Thanks Bow - I don't think the WordPress blog understands >> markdown style markup, but bonus marks anyway :) > > Ah yes, I was planning to convert it later to HTML (I find writing > markdown first easier ~ and also more mailing-list friendly). Thank you :) This is live now but can be edited - so we can fix any remaining issues before sending round the emails: http://news.open-bio.org/news/2013/08/biopython-1-62-released/ Tagged on GitHub too, https://github.com/biopython/biopython/tree/biopython-162 Note I have not yet pushed to PyPI - I'd like one or two positive reports first before doing that (just in case). Thanks all, Peter From p.j.a.cock at googlemail.com Wed Aug 28 18:47:04 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 28 Aug 2013 23:47:04 +0100 Subject: [Biopython-dev] Biopython 1.62 released Message-ID: Dear Biopythoneers, Source distributions and Windows installers for Biopython 1.62 are now available from the downloads page on the official Biopython website and (soon) from the Python Package Index (PyPI). Python support This is our first release of Biopython which officially supports Python 3. Specifically, this is supported under Python 3.3. Older versions of Python 3 may still work albeit with some issues, but are not supported. We still fully support Python 2.5, 2.6, and 2.7. Support under Jython is available for versions 2.5 and 2.7 and under PyPy for versions 1.9 and 2.0. However, unlike CPython, Jython and PyPy support is partial: NumPy and our C extensions are not covered. Please note that this release marks our last official for support Python 2.5. 
Beginning from Biopython 1.63, the minimum supported Python version will be 2.6.

Highlights

The translation functions will give a warning on any partial codons (and this will probably become an error in a future release). If you know you are dealing with partial sequences, either pad with 'N' to extend the sequence length to a multiple of three, or explicitly trim the sequence.

The handling of joins and related complex features in Genbank/EMBL files has been changed with the introduction of a CompoundLocation object. Previously a SeqFeature for something like a multi-exon CDS would have a child SeqFeature (under the sub_features attribute) for each exon. The sub_features property will still be populated for now, but is deprecated and will in future be removed. Please consult the examples in the help (docstrings) and Tutorial.

Thanks to the efforts of Ben Morris, the Phylo module now supports the file formats NeXML and CDAO. The Newick parser is also significantly faster, and can now optionally extract bootstrap values from the Newick comment field (like Molphy and Archaeopteryx do). Nate Sutton added a wrapper for FastTree to Bio.Phylo.Applications.

New module Bio.UniProt adds parsers for the GAF, GPA and GPI formats from UniProt-GOA.

The BioSQL module is now supported in Jython. MySQL and PostgreSQL databases can be used. The relevant JDBC driver should be available in the CLASSPATH.

Feature labels on circular GenomeDiagram figures now support the label_position argument (start, middle or end) in addition to the current default placement, and in a change to prior releases these labels are outside the features which is now consistent with the linear diagrams.

The code for parsing 3D structures in mmCIF files was updated to use the Python standard library's shlex module instead of C code using flex.

The Bio.Sequencing.Applications module now includes a BWA command line wrapper.

Bio.motifs supports JASPAR format files with multiple position-frequency matrices.
Additionally there have been other minor bug fixes and more unit tests.

Contributors

Many thanks to the Biopython developers and community for making this release possible, especially the following contributors:

Alexander Campbell (first contribution)
Andrea Rizzi (first contribution)
Anthony Mathelier (first contribution)
Ben Morris (first contribution)
Brad Chapman
Christian Brueffer
David Arenillas (first contribution)
David Martin (first contribution)
Eric Talevich
Iddo Friedberg
Jian-Long Huang (first contribution)
Joao Rodrigues
Kai Blin
Lenna Peterson
Michiel de Hoon
Matsuyuki Shirota (first contribution)
Nate Sutton (first contribution)
Peter Cock
Petra Kubincová (first contribution)
Phillip Garland
Saket Choudhary (first contribution)
Tiago Antao
Wibowo 'Bow' Arindrarto
Xabier Bello (first contribution)

Thank you all.

Release announcement here (RSS feed available):
http://news.open-bio.org/news/2013/08/biopython-1-62-released/

P.S. You can follow @Biopython on Twitter https://twitter.com/Biopython

From p.j.a.cock at googlemail.com Thu Aug 29 05:04:59 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Thu, 29 Aug 2013 10:04:59 +0100 Subject: [Biopython-dev] Post Biopython 1.62 release, clean-up after dropping Python 2.5 In-Reply-To: References: Message-ID: On Wed, Aug 28, 2013 at 7:53 PM, Peter Cock wrote: > Hello all - especially newcomers, > > There are going to be several boring but useful things to do to > the Biopython code base once we're finished with Python 2.5 > (the imminent release of Biopython 1.62 has been clearly > described as the final Biopython release to support it). > > Some of these tasks are quite easy, and might tempt some > of our non-core contributors or new-comers to have a go, > however to avoid too much duplication of effort I'd suggest > **replying in this thread if you want to tackle anything** - and > then start working out how to send us your first pull request.
> > Things which will need doing: > > (0) Disable the Python 2.5 and Jython 2.5 buildbot > (this will be done by me or Tiago) Done. > (1) Disable the Python 2.5 target in TravisCI, see > https://travis-ci.org/biopython/biopython/ > (this is a simple one line edit to the .travis.yml file) Done by Wayne, https://github.com/biopython/biopython/commit/d134b3ae6d963b81510c40c621d640ee00b6f3de > (2) Remove all the with statement imports (and any > comment lines associated with them): > > from __future__ import with_statement Done by Wayne, https://github.com/biopython/biopython/commit/eeab501987de61ae5935153e1b1a0b225878cb84 > (3) Remove Bio/_py3k/_namedtuple.py and adjust > import lines accordingly Any new volunteer want to try this? > (4) Scan over the code base looking for any comments > about Python 2.5 (e.g. using the grep command), and > reviewing them one by one to see if there is an old > workaround we can now remove. Lenna had a quick look, there should be some easy one here. > (5) More advanced code review, for example looking > for places we can better take advantage of context > managers (with statements) for file handles. Another new one, related to (5), and fairly easy: (6) Reviewing examples in the docstrings and Tutorial where it would make sense to use a 'with' for file handles. This should also solve many of the ResourceWarning: unclosed file ... warnings visible running the full test suite under Python 3, e.g. see: http://testing.open-bio.org/biopython/builders/Linux%2064%20-%20Python%203.3/builds/298/steps/shell/logs/stdio Peter From chris.mit7 at gmail.com Thu Aug 29 11:20:09 2013 From: chris.mit7 at gmail.com (Chris Mitchell) Date: Thu, 29 Aug 2013 11:20:09 -0400 Subject: [Biopython-dev] Post Biopython 1.62 release, clean-up after dropping Python 2.5 In-Reply-To: References: Message-ID: I was going to take a stab at (3), but it seems that _namedtuple.py doesn't exist. Looking under _py3k as well as grep -Ri namedtuple ./* fails to find it. 
I'm pulling from https://github.com/biopython/biopython.git On Thu, Aug 29, 2013 at 5:04 AM, Peter Cock wrote: > On Wed, Aug 28, 2013 at 7:53 PM, Peter Cock > wrote: > > Hello all - especially newcomers, > > > > There are going to be several boring but useful things to do to > > the Biopython code base once we're finished with Python 2.5 > > (the imminent release of Biopython 1.62 has been clearly > > described as the final Biopython release to support it). > > > > Some of these tasks are quite easy, and might tempt some > > of our non-core contributors or new-comers to have a go, > > however to avoid too much duplication of effort I'd suggest > > **replying in this thread if you want to tackle anything** - and > > then start working out how to send us your first pull request. > > > > Things which will need doing: > > > > (0) Disable the Python 2.5 and Jython 2.5 buildbot > > (this will be done by me or Tiago) > > Done. > > > (1) Disable the Python 2.5 target in TravisCI, see > > https://travis-ci.org/biopython/biopython/ > > (this is a simple one line edit to the .travis.yml file) > > Done by Wayne, > > https://github.com/biopython/biopython/commit/d134b3ae6d963b81510c40c621d640ee00b6f3de > > > (2) Remove all the with statement imports (and any > > comment lines associated with them): > > > > from __future__ import with_statement > > Done by Wayne, > > https://github.com/biopython/biopython/commit/eeab501987de61ae5935153e1b1a0b225878cb84 > > > (3) Remove Bio/_py3k/_namedtuple.py and adjust > > import lines accordingly > > Any new volunteer want to try this? > > > (4) Scan over the code base looking for any comments > > about Python 2.5 (e.g. using the grep command), and > > reviewing them one by one to see if there is an old > > workaround we can now remove. > > Lenna had a quick look, there should be some easy one here. 
> > > (5) More advanced code review, for example looking > > for places we can better take advantage of context > > managers (with statements) for file handles. > > Another new one, related to (5), and fairly easy: > > (6) Reviewing examples in the docstrings and Tutorial > where it would make sense to use a 'with' for file handles. > > This should also solve many of the ResourceWarning: > unclosed file ... warnings visible running the full test > suite under Python 3, e.g. see: > > http://testing.open-bio.org/biopython/builders/Linux%2064%20-%20Python%203.3/builds/298/steps/shell/logs/stdio > > Peter > _______________________________________________ > Biopython-dev mailing list > Biopython-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython-dev > From p.j.a.cock at googlemail.com Thu Aug 29 11:30:51 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Thu, 29 Aug 2013 16:30:51 +0100 Subject: [Biopython-dev] Post Biopython 1.62 release, clean-up after dropping Python 2.5 In-Reply-To: References: Message-ID: On Thu, Aug 29, 2013 at 4:20 PM, Chris Mitchell wrote: > I was going to take a stab at (3), but it seems that _namedtuple.py doesn't > exist. > > Looking under _py3k as well as grep -Ri namedtuple ./* > > fails to find it. I'm pulling from > https://github.com/biopython/biopython.git Oops. I wrote that email on my latop - it was a file never checked into source code control. Looking back it was a plan for allowing us to use named tuples on older versions of Python. Sorry! But I have come up with another easy task instead, (7) Update exception style from this, except ErrorClass, variable_name: to this: except ErrorClass as variable_name: The second form is the only allowed syntax in Python 3, but was not possible under Python 2.5. 
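To make task (7) concrete, here is a small sketch of the newer form in use. The function name and the conversion job are made up for illustration; the older comma form is shown only in a comment because it is a syntax error on Python 3.

```python
def safe_int(text, default=None):
    """Convert text to an int, returning default if it is not a number."""
    try:
        return int(text)
    # Older comma form, which Python 3 rejects:
    #     except ValueError, err:
    except ValueError as err:
        # The 'as' form binds the exception instance to err
        print("Could not convert %r: %s" % (text, err))
        return default
```

The `as` form is accepted from Python 2.6 onwards, which is why this clean-up only becomes possible once Python 2.5 support is dropped.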
Regards, Peter From chris.mit7 at gmail.com Thu Aug 29 12:03:51 2013 From: chris.mit7 at gmail.com (Chris Mitchell) Date: Thu, 29 Aug 2013 12:03:51 -0400 Subject: [Biopython-dev] Post Biopython 1.62 release, clean-up after dropping Python 2.5 In-Reply-To: References: Message-ID: Sounds good. Just took care of (7), running the test suite and will send a pull request when that passes. Chris On Thu, Aug 29, 2013 at 11:30 AM, Peter Cock wrote: > On Thu, Aug 29, 2013 at 4:20 PM, Chris Mitchell > wrote: > > I was going to take a stab at (3), but it seems that _namedtuple.py > doesn't > > exist. > > > > Looking under _py3k as well as grep -Ri namedtuple ./* > > > > fails to find it. I'm pulling from > > https://github.com/biopython/biopython.git > > Oops. I wrote that email on my latop - it was a file never checked > into source code control. Looking back it was a plan for allowing > us to use named tuples on older versions of Python. Sorry! > > But I have come up with another easy task instead, > > (7) Update exception style from this, > > except ErrorClass, variable_name: > > to this: > > except ErrorClass as variable_name: > > The second form is the only allowed syntax in Python 3, > but was not possible under Python 2.5. > > Regards, > > Peter > From p.j.a.cock at googlemail.com Thu Aug 29 12:20:51 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Thu, 29 Aug 2013 17:20:51 +0100 Subject: [Biopython-dev] Post Biopython 1.62 release, clean-up after dropping Python 2.5 In-Reply-To: References: Message-ID: On Thu, Aug 29, 2013 at 5:03 PM, Chris Mitchell wrote: > Sounds good. Just took care of (7), running the test suite and will send a > pull request when that passes. > > Chris https://github.com/biopython/biopython/pull/227 looks good, but has highlighted a bug in Scripts/debug/debug_blast_parser.py (see my comment on GitHub). 
Good work, Peter From p.j.a.cock at googlemail.com Thu Aug 29 12:33:43 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Thu, 29 Aug 2013 17:33:43 +0100 Subject: [Biopython-dev] Post Biopython 1.62 release, clean-up after dropping Python 2.5 In-Reply-To: References: Message-ID: > On Wed, Aug 28, 2013 at 7:53 PM, Peter Cock wrote: >> Hello all - especially newcomers, >> >> There are going to be several boring but useful things to do to >> the Biopython code base once we're finished with Python 2.5 >> (the imminent release of Biopython 1.62 has been clearly >> described as the final Biopython release to support it). >> >> Some of these tasks are quite easy, and might tempt some >> of our non-core contributors or new-comers to have a go, >> however to avoid too much duplication of effort I'd suggest >> **replying in this thread if you want to tackle anything** - and >> then start working out how to send us your first pull request. >> >> Things which will need doing: >> >> (0) Disable the Python 2.5 and Jython 2.5 buildbot >> (this will be done by me or Tiago) > > Done. > >> (1) Disable the Python 2.5 target in TravisCI, see >> https://travis-ci.org/biopython/biopython/ >> (this is a simple one line edit to the .travis.yml file) > > Done by Wayne, > https://github.com/biopython/biopython/commit/d134b3ae6d963b81510c40c621d640ee00b6f3de > >> (2) Remove all the with statement imports (and any >> comment lines associated with them): >> >> from __future__ import with_statement > > Done by Wayne, > https://github.com/biopython/biopython/commit/eeab501987de61ae5935153e1b1a0b225878cb84 > >> (3) Remove Bio/_py3k/_namedtuple.py and adjust >> import lines accordingly (3) was a false alarm, just an old file on my laptop confusing me. >> (4) Scan over the code base looking for any comments >> about Python 2.5 (e.g. using the grep command), and >> reviewing them one by one to see if there is an old >> workaround we can now remove.
> > Lenna had a quick look, there should be some easy one here. > >> (5) More advanced code review, for example looking >> for places we can better take advantage of context >> managers (with statements) for file handles. > > Another new one, related to (5), and fairly easy: > > (6) Reviewing examples in the docstrings and Tutorial > where it would make sense to use a 'with' for file handles. > > This should also solve many of the ResourceWarning: > unclosed file ... warnings visible running the full test > suite under Python 3, e.g. see: > http://testing.open-bio.org/biopython/builders/Linux%2064%20-%20Python%203.3/builds/298/steps/shell/logs/stdio On Thu, Aug 29, 2013 at 11:30 AM, Peter Cock wrote: > ... I have come up with another easy task instead, > > (7) Update exception style from this, > > except ErrorClass, variable_name: > > to this: > > except ErrorClass as variable_name: > > The second form is the only allowed syntax in Python 3, > but was not possible under Python 2.5. (7) is being tackled by Chris Mitchell, https://github.com/biopython/biopython/pull/227 Here's another fairly easy task for another new volunteer?: (8) Excluding doctests and the Tutorial, use print function rather than print statement. e.g. replace this: print variable1, variable2 with this: from __future__ import print_function ... print(variable1, variable2) Note that I am deliberately not suggesting we switch the user visible examples on our documentation yet - that deserves some discussion first. Peter From p.j.a.cock at googlemail.com Thu Aug 29 13:03:24 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Thu, 29 Aug 2013 18:03:24 +0100 Subject: [Biopython-dev] Python 2.6+ support for __dir__ method Message-ID: Hi all, I was reading over the list of what's new in Python 2.6 and wondered about this: > The built-in dir() function now checks for a __dir__() method on the > objects it receives. 
This method must return a list of strings containing > the names of valid attributes for the object, and lets the object control > the value that dir() produces. Objects that have __getattr__() or > __getattribute__() methods can use this to advertise pseudo-attributes > they will honor. (issue 1591665) http://docs.python.org/2/whatsnew/2.6.html Does that sound useful for some of our more dynamic objects? Peter From arklenna at gmail.com Thu Aug 29 13:18:16 2013 From: arklenna at gmail.com (Lenna Peterson) Date: Thu, 29 Aug 2013 13:18:16 -0400 Subject: [Biopython-dev] Post Biopython 1.62 release, clean-up after dropping Python 2.5 In-Reply-To: References: Message-ID: On Thu, Aug 29, 2013 at 12:33 PM, Peter Cock wrote: > > Here's another fairly easy task for another new volunteer?: > > (8) Excluding doctests and the Tutorial, use print function > rather than print statement. e.g. replace this: > > print variable1, variable2 > > with this: > > from __future__ import print_function > ... > print(variable1, variable2) > > Note that I am deliberately not suggesting we switch the > user visible examples on our documentation yet - that > deserves some discussion first. > > >From the docs: "When using the 2to3 source-to-source conversion tool, all print statements are automatically converted to print() function calls, so this is mostly a non-issue for larger projects." http://docs.python.org/3.0/whatsnew/3.0.html#print-is-a-function Which suggests either doing it with the tool or just waiting until the full 3.0 changeover? 
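For reference, the replacement proposed in (8) can be sketched as below. The function `report` and its arguments are invented for illustration; the only real feature shown is `from __future__ import print_function`, which gives Python 2.6+ the same print() function as Python 3 and so avoids depending on 2to3 for this construct.

```python
from __future__ import print_function  # no effect on Python 3, needed on Python 2.6/2.7

import sys


def report(variable1, variable2, handle=sys.stdout):
    """Hypothetical example function, not Biopython code."""
    # Old statement form would be: print >> handle, variable1, variable2
    print(variable1, variable2, file=handle)
    # Keyword arguments replace the trailing-comma and >> tricks
    print("done", end="\n", file=handle)


report("seq1", 42)
```

Because the future import applies to the whole module, every print statement in that module has to be converted at the same time.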
From p.j.a.cock at googlemail.com Thu Aug 29 13:35:16 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Thu, 29 Aug 2013 18:35:16 +0100 Subject: [Biopython-dev] Post Biopython 1.62 release, clean-up after dropping Python 2.5 In-Reply-To: References: Message-ID: On Thursday, August 29, 2013, Lenna Peterson wrote: > > > On Thu, Aug 29, 2013 at 12:33 PM, Peter Cock > > wrote: > >> >> Here's another fairly easy task for another new volunteer?: >> >> (8) Excluding doctests and the Tutorial, use print function >> rather than print statement. e.g. replace this: >> >> print variable1, variable2 >> >> with this: >> >> from __future__ import print_function >> ... >> print(variable1, variable2) >> >> Note that I am deliberately not suggesting we switch the >> user visible examples on our documentation yet - that >> deserves some discussion first. >> >> > From the docs: "When using the 2to3 source-to-source conversion tool, all > print statements are automatically converted to print() function calls, so > this is mostly a non-issue for larger projects." > > http://docs.python.org/3.0/whatsnew/3.0.html#print-is-a-function > > Which suggests either doing it with the tool or just waiting until the > full 3.0 changeover? > My motivation is a step towards a single codebase for both Python 2 and Python 3 without needing 2to3, see: http://lists.open-bio.org/pipermail/biopython-dev/2013-May/010633.html http://www.slideshare.net/pjacock/biopython-update-bosc2013/ Peter From superbobry at gmail.com Thu Aug 29 16:34:59 2013 From: superbobry at gmail.com (Sergei Lebedev) Date: Fri, 30 Aug 2013 00:34:59 +0400 Subject: [Biopython-dev] Post Biopython 1.62 release, clean-up after dropping Python 2.5 In-Reply-To: References: Message-ID: On Thu, Aug 29, 2013 at 8:33 PM, Peter Cock wrote: > Here's another fairly easy task for another new volunteer?: > > (8) Excluding doctests and the Tutorial, use print function > rather than print statement. e.g. 
replace this: > > print variable1, variable2 > > with this: > > from __future__ import print_function > ... > print(variable1, variable2) > > Note that I am deliberately not suggesting we switch the > user visible examples on our documentation yet - that > deserves some discussion first. So the task is to remove print statement from the code only, right? I think I can do this, should I use a separate branch? Sergei From p.j.a.cock at googlemail.com Thu Aug 29 16:44:49 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Thu, 29 Aug 2013 21:44:49 +0100 Subject: [Biopython-dev] Post Biopython 1.62 release, clean-up after dropping Python 2.5 In-Reply-To: References: Message-ID: On Thu, Aug 29, 2013 at 9:34 PM, Sergei Lebedev wrote: > On Thu, Aug 29, 2013 at 8:33 PM, Peter Cock > wrote: >> >> Here's another fairly easy task for another new volunteer?: >> >> (8) Excluding doctests and the Tutorial, use print function >> rather than print statement. e.g. replace this: >> >> print variable1, variable2 >> >> with this: >> >> from __future__ import print_function >> ... >> print(variable1, variable2) >> >> Note that I am deliberately not suggesting we switch the >> user visible examples on our documentation yet - that >> deserves some discussion first. > > > So the task is to remove print statement from the code only, right? Replacing them with print functions, and testing this worked OK under both Python 2 and Python 3, yes :) > I think I can do this, should I use a separate branch? > > Sergei Yes, I would certainly recommend keeping the default 'master' branch as a copy of the official one, and creating a new 'print-function' branch (or whatever name you prefer) for this work. 
We probably need to improve this wiki page - so any comments about what is unclear would be great (on a new email thread): http://biopython.org/wiki/GitUsage Thanks, Peter From p.j.a.cock at googlemail.com Fri Aug 30 06:49:23 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Fri, 30 Aug 2013 11:49:23 +0100 Subject: [Biopython-dev] Post Biopython 1.62 release, clean-up after dropping Python 2.5 In-Reply-To: References: Message-ID: Hello Biopythoneers, I've outlined another relatively simple improvement for potential new contributors to try below.... On Thu, Aug 29, 2013 at 5:33 PM, Peter Cock wrote: >> On Wed, Aug 28, 2013 at 7:53 PM, Peter Cock wrote: >>> Hello all - especially newcomers, >>> >>> There are going to be several boring but useful things to do to >>> the Biopython code base once we're finished with Python 2.5 >>> (the imminent release of Biopython 1.62 has been clearly >>> described as the final Biopython release to support it). >>> >>> ... >>> >>> (4) Scan over the code base looking for any comments >>> about Python 2.5 (e.g. using the grep command), and >>> reviewing them one by one to see if there is an old >>> workaround we can now remove. >> >> Lenna had a quick look, there should be some easy one here. >> >>> (5) More advanced code review, for example looking >>> for places we can better take advantage of context >>> managers (with statements) for file handles. >> >> Another new one, related to (5), and fairly easy: >> >> (6) Reviewing examples in the docstrings and Tutorial >> where it would make sense to use a 'with' for file handles. >> >> This should also solve many of the ResourceWarning: >> unclosed file ... warnings visible running the full test >> suite under Python 3, e.g. see: >> http://testing.open-bio.org/biopython/builders/Linux%2064%20-%20Python%203.3/builds/298/steps/shell/logs/stdio > > On Thu, Aug 29, 2013 at 11:30 AM, Peter Cock wrote: >> ... 
I have come up with another easy task instead, >> >> (7) Update exception style

(7) was done by Chris Mitchell,
https://github.com/biopython/biopython/commit/1d42f4dc07c8203a162d635b9bca5acb90204942

> (8) Excluding doctests and the Tutorial, use print function
> rather than print statement. e.g. replace this:

(8) is being looked at by Sergei Lebedev.

----

Here's another idea, under the general issue (5) of taking
advantage of context managers (with statements), which
I would judge to be fairly easy (but not trivial).

(9) Use context managers (with statements) for temporary
warning filters in the unit tests.

Currently many of our unit tests add simple filters to ignore
a warning, and then restore the old filters using pop(). This
mostly works, but is fragile and the filter list is global so this
can have strange side effects. See:

$ grep "warnings." Tests/*.py

The idea here is to replace this:

    warnings.simplefilter('ignore', PDBConstructionWarning)
    #some code which may trigger the warning
    warnings.filters.pop()

with this:

    with warnings.catch_warnings():
        warnings.simplefilter("ignore", PDBConstructionWarning)
        #some code which may trigger the warning

Note the indentation - these changes will not give nice
clean diffs, so this will not be so easy to review.

I would therefore suggest editing just one test file at a
time (i.e. limit each commit to changing a single file), as
that makes it easier to selectively apply your changes.

Please make sure you test this on Python 2.6, which is most
likely to have problems with this "new" style ;)

(Again, if anyone plans to work on this, please let the list
know to minimise duplicated effort.)
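The pattern in (9) can be tried in isolation with the sketch below. `ExampleWarning` and `noisy` are stand-ins invented for this example (a real test would use something like PDBConstructionWarning and the code under test); only `warnings.catch_warnings()` and `warnings.simplefilter()` are the real standard library calls.

```python
import warnings


class ExampleWarning(UserWarning):
    """Stand-in for a library warning such as PDBConstructionWarning."""


def noisy():
    """Hypothetical function that always issues a warning."""
    warnings.warn("something looks odd", ExampleWarning)
    return "result"


# Snapshot of the global filter list before the temporary filter is added
filters_before = list(warnings.filters)

with warnings.catch_warnings():
    # This filter only lives until the end of the with block
    warnings.simplefilter("ignore", ExampleWarning)
    value = noisy()

# The global filter list is put back on exit, even if noisy() had raised
filters_restored = (list(warnings.filters) == filters_before)
print("filters restored: %s" % filters_restored)
```

Compared with calling warnings.filters.pop() by hand, the context manager does not rely on the temporary filter still being at the expected position in the global list when the test finishes.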
If you're not familiar with our test suite, there is a chapter introducing this in the main Tutorial & Cookbook, http://biopython.org/DIST/docs/tutorial/Tutorial.html Thanks, Peter From superbobry at gmail.com Fri Aug 30 08:58:31 2013 From: superbobry at gmail.com (Sergei Lebedev) Date: Fri, 30 Aug 2013 16:58:31 +0400 Subject: [Biopython-dev] Post Biopython 1.62 release, clean-up after dropping Python 2.5 In-Reply-To: References: Message-ID: > (8) Excluding doctests and the Tutorial, use print function > rather than print statement. e.g. replace this: Unfortunately we cannot exclude doctests, because 'from __future__' import is module wide, thus the 'doctest.testmod()' will raise a SyntaxError on docstrings with print statement. Sergei On Fri, Aug 30, 2013 at 12:44 AM, Peter Cock wrote: > On Thu, Aug 29, 2013 at 9:34 PM, Sergei Lebedev > wrote: > > On Thu, Aug 29, 2013 at 8:33 PM, Peter Cock > > wrote: > >> > >> Here's another fairly easy task for another new volunteer?: > >> > >> (8) Excluding doctests and the Tutorial, use print function > >> rather than print statement. e.g. replace this: > >> > >> print variable1, variable2 > >> > >> with this: > >> > >> from __future__ import print_function > >> ... > >> print(variable1, variable2) > >> > >> Note that I am deliberately not suggesting we switch the > >> user visible examples on our documentation yet - that > >> deserves some discussion first. > > > > > > So the task is to remove print statement from the code only, right? > > Replacing them with print functions, and testing this > worked OK under both Python 2 and Python 3, yes :) > > > I think I can do this, should I use a separate branch? > > > > Sergei > > Yes, I would certainly recommend keeping the > default 'master' branch as a copy of the official one, > and creating a new 'print-function' branch (or whatever > name you prefer) for this work. 
> > We probably need to improve this wiki page - so any > comments about what is unclear would be great (on > a new email thread): http://biopython.org/wiki/GitUsage > > Thanks, > > Peter > From p.j.a.cock at googlemail.com Fri Aug 30 09:14:14 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Fri, 30 Aug 2013 14:14:14 +0100 Subject: [Biopython-dev] Post Biopython 1.62 release, clean-up after dropping Python 2.5 In-Reply-To: References: Message-ID: On Fri, Aug 30, 2013 at 1:58 PM, Sergei Lebedev wrote: >> (8) Excluding doctests and the Tutorial, use print function >> rather than print statement. e.g. replace this: > > Unfortunately we cannot exclude doctests, because 'from __future__' import > is module wide, thus the 'doctest.testmod()' will raise a SyntaxError on > docstrings with print statement. > > Sergei Could you clarify this? Does this cause a problem via: [Tests]$ python run_tests.py doctest If you have a small example, copy & paste the "git diff" output here. Peter

From superbobry at gmail.com Fri Aug 30 09:28:50 2013 From: superbobry at gmail.com (Sergei Lebedev) Date: Fri, 30 Aug 2013 17:28:50 +0400 Subject: [Biopython-dev] Re: Post Biopython 1.62 release, clean-up after dropping Python 2.5 In-Reply-To: References: Message-ID:

Sure, a common pattern for a lot of BioPython modules seems to be:

    # +from __future__ import print_function


    def foo():
        """A docstring with print statement.

        >>> print "foo"
        foo
        """
        print "Running foo ..."
        # +print("Running foo ...")


    if __name__ == "__main__":
        import doctest
        doctest.testmod()

where foo is some function, which uses print statement in its body. Since we
want to switch from print statements to print function we replace
print "Running foo ..." with a print() call and add from __future__ import ...
to the beginning of the module.

What happens if we try to run the doctests after we've switched to
print_function?

    $ python /tmp/foo.py
    **********************************************************************
    File "/tmp/foo.py", line 7, in __main__.foo
    Failed example:
        print "foo"
    Exception raised:
        Traceback (most recent call last):
          File ".../doctest.py", line 1254, in __run
            compileflags, 1) in test.globs
          File "", line 1
            print "foo"
                      ^
        SyntaxError: invalid syntax
    **********************************************************************
    1 items had failures:
       1 of   1 in __main__.foo
    ***Test Failed*** 1 failures.

So, enabling print_function makes doctests using print statement fail with a
SyntaxError, as shown by the example above. Thus, if we want to get rid of
print statement in the code we have no other choice but to do the same in
the doctests.

Sergei

On August 30, 2013 at 5:14:14 PM, Peter Cock (p.j.a.cock at googlemail.com) wrote: On Fri, Aug 30, 2013 at 1:58 PM, Sergei Lebedev wrote: >> (8) Excluding doctests and the Tutorial, use print function >> rather than print statement. e.g. replace this: > > Unfortunately we cannot exclude doctests, because 'from __future__' import > is module wide, thus the 'doctest.testmod()' will raise a SyntaxError on > docstrings with print statement. > > Sergei Could you clarify this? Does this cause a problem via: [Tests]$ python run_tests.py doctest If you have a small example, copy & paste the "git diff" output here. Peter

From p.j.a.cock at googlemail.com Fri Aug 30 10:22:26 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Fri, 30 Aug 2013 15:22:26 +0100 Subject: [Biopython-dev] Post Biopython 1.62 release, clean-up after dropping Python 2.5 In-Reply-To: References: Message-ID: Thanks Sergei - that clarified things.
Unfortunately this doesn't just break our convenience __main__ trick for running the doctests in any single module, it also breaks doing it via: $ python run_tests.py doctest This means we'd have to update the doctests to also use Python 3 style print functions... which may be premature (we'll need to do this at some point though). How about the less ambitious plan of replacing lines like this: print variable with: print(variable) This will be understood as a print function call on Python 3 (and work), and will also work on Python 2 (without the future import) where it will be parsed as redundant parentheses. Note you can't use this trick where more than one variable is printed, because then on Python 2 the brackets will create a tuple instead. Peter On Fri, Aug 30, 2013 at 2:28 PM, Sergei Lebedev wrote: > Sure, a common pattern for a lot of BioPython modules seems to be: > > # +from __future__ import print_function > > > def foo(): > """A docstring with print statement. > > >>> print "foo" > foo > """ > print "Running foo ..." > # +print("Running foo ...") > > > if __name__ == "__main__": > import doctest > doctest.testmod() > > where foo is some function, which uses print statement in its body. Since we > want to switch from print statements to print function we replace print > "Running foo ..." with a print() call and add from __future__ import ... to > the beginning of the module. > > What happens if we try to run the doctests after we've switched to > print_function? 
> > $ python /tmp/foo.py > ********************************************************************** > File "/tmp/foo.py", line 7, in __main__.foo > Failed example: > print "foo" > Exception raised: > Traceback (most recent call last): > File ".../doctest.py", line 1254, in __run > compileflags, 1) in test.globs > File "", line 1 > print "foo" > ^ > SyntaxError: invalid syntax > ********************************************************************** > 1 items had failures: > 1 of 1 in __main__.foo > ***Test Failed*** 1 failures. > > So, enabling print_function makes doctests using print statement fail with a > SyntaxError, as shown by the example above. Thus, if we want to get rid of > print statement in the code we have no other choice but to do the same it in > the doctests. > > Sergei > > > > On August 30, 2013 at 5:14:14 PM, Peter Cock (p.j.a.cock at googlemail.com) > wrote: > > On Fri, Aug 30, 2013 at 1:58 PM, Sergei Lebedev > wrote: >>> (8) Excluding doctests and the Tutorial, use print function >>> rather than print statement. e.g. replace this: >> >> Unfortunately we cannot exclude doctests, because 'from __future__' import >> is module wide, thus the 'doctest.testmod()' will raise a SyntaxError on >> docstrings with print statement. >> >> Sergei > > Could you clarify this? Does this cause a problem via: > > [Tests]$ python run_tests.py doctest > > If you have a small example, copy & paste the "git diff" output here. > > Peter From p.j.a.cock at googlemail.com Fri Aug 30 11:46:59 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Fri, 30 Aug 2013 16:46:59 +0100 Subject: [Biopython-dev] Fwd: [biopython] Potential error in mass calculations for RNA/DNA? (#229) In-Reply-To: References: Message-ID: Who are our sequence mass experts? https://github.com/biopython/biopython/issues/229 ---------- Forwarded message ---------- From: nruggero Date: Thu, Aug 29, 2013 at 11:03 PM Subject: [biopython] Potential error in mass calculations for RNA/DNA? 
(#229) To: biopython/biopython In Bio/Data/IUPACData.py the molecular weights of unambiguous DNA are listed as: unambiguous_dna_weights = { "A": 347., "C": 323., "G": 363., "T": 322., } As far as I can tell these are the molecular weights for the non-deoxy bases instead of the deoxy bases. For example, AMP (347.22) instead of dAMP (331.22) is listed. I've looked at the original BioPerl code that these numbers were taken from and I think they were just copied incorrectly. I have also looked at the code which uses this dict in Bio/SeqUtils/__init__.py called molecular_weight() and it just takes the sum of these values over the sequence (no correction made). So, is this an error or am I missing something basic? Thanks From superbobry at gmail.com Fri Aug 30 18:53:53 2013 From: superbobry at gmail.com (Sergei Lebedev) Date: Sat, 31 Aug 2013 02:53:53 +0400 Subject: [Biopython-dev] Post Biopython 1.62 release, clean-up after dropping Python 2.5 In-Reply-To: References: Message-ID: Peter, I've just submitted a PR [*] for #8 along with a 2to3 fixer which does all the job, so I think I can take #9. Sergei [*] https://github.com/biopython/biopython/pull/230 On Fri, Aug 30, 2013 at 2:49 PM, Peter Cock wrote: > Hello Biopythoneers, > > I've outlined another relatively simple improvement for potential > new contributors to try below.... > > On Thu, Aug 29, 2013 at 5:33 PM, Peter Cock > wrote: > >> On Wed, Aug 28, 2013 at 7:53 PM, Peter Cock > wrote: > >>> Hello all - especially newcomers, > >>> > >>> There are going to be several boring but useful things to do to > >>> the Biopython code base once we're finished with Python 2.5 > >>> (the imminent release of Biopython 1.62 has been clearly > >>> described as the final Biopython release to support it). > >>> > >>> ... > >>> > >>> (4) Scan over the code base looking for any comments > >>> about Python 2.5 (e.g.
using the grep command), and > >>> reviewing them one by one to see if there is an old > >>> workaround we can now remove. > >> > >> Lenna had a quick look, there should be some easy one here. > >> > >>> (5) More advanced code review, for example looking > >>> for places we can better take advantage of context > >>> managers (with statements) for file handles. > >> > >> Another new one, related to (5), and fairly easy: > >> > >> (6) Reviewing examples in the docstrings and Tutorial > >> where it would make sense to use a 'with' for file handles. > >> > >> This should also solve many of the ResourceWarning: > >> unclosed file ... warnings visible running the full test > >> suite under Python 3, e.g. see: > >> > http://testing.open-bio.org/biopython/builders/Linux%2064%20-%20Python%203.3/builds/298/steps/shell/logs/stdio > > > > On Thu, Aug 29, 2013 at 11:30 AM, Peter Cock > wrote: > >> ... I have come up with another easy task instead, > >> > >> (7) Update exception style > > (7) was done by Chris Mitchell, > > https://github.com/biopython/biopython/commit/1d42f4dc07c8203a162d635b9bca5acb90204942 > > > (8) Excluding doctests and the Tutorial, use print function > > rather than print statement. e.g. replace this: > > (8) is being looked at by Sergei Lebedev. > > ---- > > Here's another idea, under the general issue (5) of taking > advantage of context managers (with statements), which > I would judge to be fairly easy (but not trivial). > > (9) Use context managers (with statements) for temporary > warning filters in the unit tests. > > Currently many of our unit tests add simple filters to ignore > a warning, and then restore the old filters using pop(). This > mostly works, but is fragile and the filter list is global so this > can have strange side effects. See: > > $ grep "warnings." 
Tests/*.py > > The idea here is to replace this: > > warnings.simplefilter('ignore', PDBConstructionWarning) > #some code which may trigger the warning > warnings.filters.pop() > > with this: > > with warnings.catch_warnings(): > warnings.simplefilter("ignore", PDBConstructionWarning) > #some code which may trigger the warning > > Note the indentation - these changes will not give nice > clean diffs, so this will not be so easy to review. > > I would therefore suggest editing just one test file at a > time (i.e. limit each commit to changing a single file), as > that makes it easier to selectively apply your changes > > Please make sure you test this Python 2.6 which is most > likely to have problems with this "new" style ;) > > (Again, if anyone plans to work on this, please let the list > know to minimised duplicated effort.) > > If you're not familiar with our test suite, there is a chapter > introducing this in the main Tutorial & Cookbook, > http://biopython.org/DIST/docs/tutorial/Tutorial.html > > Thanks, > > Peter > _______________________________________________ > Biopython-dev mailing list > Biopython-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython-dev > From p.j.a.cock at googlemail.com Sat Aug 31 05:31:53 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Sat, 31 Aug 2013 10:31:53 +0100 Subject: [Biopython-dev] Post Biopython 1.62 release, clean-up after dropping Python 2.5 In-Reply-To: References: Message-ID: On Fri, Aug 30, 2013 at 11:53 PM, Sergei Lebedev wrote: > Peter, I've just submitted a PR [*] for #8 along with a 2to3 fixer which > does all the job, so I think I can take #9. > > Sergei > > [*] https://github.com/biopython/biopython/pull/230 Print-function-like syntax committed for (8), thank you. We'll need to come back to this later as there are still lots of print statements left in the codebase... time for a more general discussion about what people would prefer to see in the user-facing documentation. 
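The "print-function-like syntax" used for (8) relies on parentheses around a single expression being redundant, whereas parentheses around comma-separated values build a tuple. A small sketch of that distinction follows; the variable names are illustrative only, and it is written to behave the same with or without the print_function future import.

```python
variable1 = "ACGT"
variable2 = 4

single = (variable1)            # just the string; the parentheses are redundant
pair = (variable1, variable2)   # a two-element tuple

# Same output whether print is a statement (Python 2) or a function (Python 3)
print(single)

# With more than one value, format into a single string first so the
# Python 2 print statement is not handed a tuple.
print("%s %i" % (variable1, variable2))
```

This is why the single-argument form is safe to commit before the doctests are converted, while multi-argument prints need either string formatting or the future import.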
If you'd like to try some context managers for the warnings in the unit tests (9), that would be great. Note some of the tests will require you to install a command line tool - it should be clear, but if we need to add more documentation (e.g. URLs) please let us know. Thanks, Peter
From glenveegee at gmail.com Fri Aug 2 08:17:14 2013 From: glenveegee at gmail.com (Glen van Ginkel) Date: Fri, 2 Aug 2013 09:17:14 +0100 Subject: [Biopython-dev] Fwd: pdb-l: Announcement: wwPDB Workshop on mmCIF/PDBx for Programmers, 20/21 Nov-13, Cambridge (UK) In-Reply-To: <51FB69C6.3040200@ebi.ac.uk> References: <51FB69C6.3040200@ebi.ac.uk> Message-ID: Hi all, Given Lenna's recent work on the mmCIF parser I thought this might be of interest. Kind regards, Glen wwPDB Workshop on mmCIF/PDBx for Programmers -------------------------------------------- What, why and how? ------------------ The world of the PDB will be changing rapidly and profoundly over the next few years.
A major change will involve the transition from PDB to mmCIF/PDBx as the principal deposition and dissemination format (see http://www.wwpdb.org/news/news_2013.html#22-May-2013 and http://wwpdb.org/workshop/wgroup.html). To help software developers in the area of structural biology to make the transition and begin supporting the mmCIF/PDBx format in their own programs, wwPDB (http://wwpdb.org/) is organising a programmers workshop. This two-day event will include lectures by experts in mmCIF/PDBx (http://mmcif.rcsb.org/) and developers of language-specific libraries or packages (C/C++, Java, Python). Ample time will be devoted to tutorials and individual "code hacking", with the experts available to assist the workshop participants. Confirmed tutors include Paul Adams (Phenix), Eugene Krissinel (CCP4), Garib Murshudov (Refmac), Andreas Prlic (RCSB), Sameer Velankar (PDBe) and John Westbrook (RCSB). When and where? --------------- The workshop will be held at the EMBL-EBI (http://ebi.ac.uk/) in Hinxton, Cambridge, UK, on 20 and 21 November 2013. How much? --------- If you are selected as a participant, we expect you to pay for your own travel to and from Cambridge. However, there is no fee for this workshop, and we will provide accommodation (at the HolidayInn Express in nearby Duxford; http://www.hiexpresscambridgeduxford.co.uk/), lunches and a workshop dinner on the 20th (all thanks to generous funding from the Wellcome Trust to PDBe). Who can apply and how? ---------------------- This workshop is intended for "high-powered" software developers in any area of structural biology and structural bioinformatics whose products process (read/write) PDB data - e.g., X-ray, NMR, 3DEM, SAXS/SANS, hybrid methods, visualisation, validation, modelling, docking, structure prediction, etc. To ensure a high ratio of tutors to workshop participants, the number of participants is limited to 15. 
You can apply for the workshop by sending an e-mail to Sameer Velankar at PDBe (sameer at ebi.ac.uk) no later than 31 August 2013. Please include: - a brief description of the software program(s) or package(s) you have developed or are developing, what it does, in which field, how many users, relevant publications, etc.; - what programming language(s) you are specifically interested in; - how you would benefit from this workshop; - any specific topics or questions you would like to see addressed in the workshop. If the workshop is oversubscribed, we will use the information and motivation provided by the applicants to select the participants. Participants are expected to bring their own laptop with compilers etc. installed. No previous knowledge of mmCIF/PDBx is strictly needed, but participants who are aware of the basic principles of the format will probably gain more from the workshop. Applicants will be informed by mid-September if they have been selected or not, or if they are on the stand-by list. For informal inquiries about the workshop, please contact Sameer Velankar at PDBe (sameer at ebi.ac.uk). Please feel free to distribute this announcement to other interested people or fora! --Gerard Kleywegt & Sameer Velankar Protein Data Bank in Europe A member of the Worldwide Protein Data Bank --- Gerard J. Kleywegt, PDBe, EMBL-EBI, Hinxton, UK gerard at ebi.ac.uk ..................... pdbe.org Secretary: Pauline Haslam pdbe_admin at ebi.ac.uk TO UNSUBSCRIBE OR CHANGE YOUR SUBSCRIPTION OPTIONS, please see https://lists.sdsc.edu/mailman/listinfo/pdb-l . 
From p.j.a.cock at googlemail.com Fri Aug 2 09:16:53 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 2 Aug 2013 10:16:53 +0100
Subject: [Biopython-dev] Fwd: pdb-l: Announcement: wwPDB Workshop on mmCIF/PDBx for Programmers, 20/21 Nov-13, Cambridge (UK)
In-Reply-To:
References: <51FB69C6.3040200@ebi.ac.uk>
Message-ID:

Thanks for forwarding that Glen - it would be great if any of our structural Biopython folk could go.

Is anyone interested & reasonably close to Cambridge UK?

Peter

On Fri, Aug 2, 2013 at 9:17 AM, Glen van Ginkel wrote:
> Hi all,
>
> Given Lenna's recent work on the mmCIF parser I thought this might be of
> interest.
>
> Kind regards,
>
> Glen
>
> [...]

From p.j.a.cock at googlemail.com Fri Aug 2 09:31:27 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 2 Aug 2013 10:31:27 +0100
Subject: [Biopython-dev] 1.62b test coverage report
In-Reply-To:
References:
Message-ID:

Thanks for these details Ben - it sounds like a mixture of real test failures, and mere warnings that an external tool wasn't found.

On Fri, Aug 2, 2013 at 3:20 AM, Ben Fulton wrote:
> My test machine was running Ubuntu 12.04.
>
> For fasttree I installed version 2.1.4-1~ubuntu12.04.1 using apt-get, and
> got this error:
> ApplicationError: Command 'fasttree -out temp_test.tree
> Quality/example.fasta' returned non-zero exit status 1, 'Unknown or
> incorrect use of option -out'

I don't seem to have fasttree installed at all, and the test and wrapper are not explicit about which version they were originally written for.

> The NCBI_BLAST error involves rpsblast not being in the install.
> Version 2.2.25-7 using apt-get.
I believe this is down to an NCBI stupidity with binary name clashes: both the old 'legacy' C BLAST and the new C++ BLAST+ suite have a binary called rpsblast.

Our test code copes with this by searching the path and checking each rpsblast binary found - looking for the new version only.

However, Debian policy is to resolve ambiguities like this with a unilateral renaming - in this case I *think* they called the new binary rpsblast+ instead. Can you confirm that? I don't have access to a Debian machine right now.

So, strictly speaking the Biopython test is correct - you don't have the new rpsblast installed. However, it would be more helpful if we also checked for the Debian alias rpsblast+.

That shouldn't be too complicated to do - especially if you could rerun the tests using Biopython from git for me?

> Dialign is version 2.2.1-5 using apt-get. I got two errors: first,
> DIALIGN2_DIR not being set. It was installed to /usr/bin so I set
> DIALIGN2_DIR to that directory; then I got "Environment variable
> DIALIGN2_DIR directory missing BLOSUM file." I'm not sure either of these
> items are needed, though I may have missed them in the documentation.

This again looks like a Debian packaging issue versus the manual install instructions for Dialign. Perhaps they have fixed Dialign to find its matrix under a data folder...

You could try simply commenting out the check on the environment variable in test_Dialign_tool.py and seeing if the tests pass or not.

> I downloaded version 130708 of Prank from
> http://code.google.com/p/prank-msa/downloads/list. The error is on line 165
> of the test file:
>
> AssertionError:
> -----------------
> PRANK v.130708:
> -----------------
>
> Input for the analysis
> - converting 'Quality/example.fasta' to 'temp with space.phy'

This sounds like a minor change in the stdout with recent versions of PRANK.

> EmbossPhylipNew I tried to install from source, but it was complicated and I
> didn't get it finished.
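As a rough illustration of the path search described above (a sketch under assumptions, not the code in Biopython's test suite): the helper names below are made up for this example, and the idea that a BLAST+ binary run with -version reports a version ending in "+" is an assumption that should be checked against the installed tools.

```python
import os
import subprocess


def find_candidates(names, search_path=None):
    """Return every executable on the search path whose file name is in names."""
    if search_path is None:
        search_path = os.environ.get("PATH", "")
    found = []
    for folder in search_path.split(os.pathsep):
        if not folder:
            continue
        for name in names:
            candidate = os.path.join(folder, name)
            if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
                found.append(candidate)
    return found


def looks_like_blast_plus(version_text):
    """Assumption: BLAST+ prints something like 'rpsblast: 2.2.25+' first."""
    lines = version_text.strip().splitlines()
    return bool(lines) and lines[0].rstrip().endswith("+")


def run_version(executable):
    """Run 'executable -version' and return its output ('' on any failure)."""
    try:
        result = subprocess.run(
            [executable, "-version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
            timeout=30,
        )
    except (OSError, subprocess.SubprocessError):
        return ""
    return result.stdout


def pick_rpsblast(names=("rpsblast", "rpsblast+"), search_path=None,
                  get_version=run_version):
    """Return the first candidate that looks like the BLAST+ rpsblast, or None."""
    for candidate in find_candidates(names, search_path):
        if looks_like_blast_plus(get_version(candidate)):
            return candidate
    return None
```

Here pick_rpsblast() returns None when nothing suitable is found, which a test could treat as a reason to skip rather than fail.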
>
> I'll send some notes on the other errors when I get a few minutes.

Thanks,

Peter

From p.j.a.cock at googlemail.com Fri Aug 2 12:00:54 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 2 Aug 2013 13:00:54 +0100
Subject: [Biopython-dev] 1.62b test coverage report
In-Reply-To:
References:
Message-ID:

On Fri, Aug 2, 2013 at 10:31 AM, Peter Cock wrote:
>
>> The NCBI_BLAST error involves rpsblast not being in the install.
>> Version 2.2.25-7 using apt-get.
>
> I believe this is down to an NCBI stupidity with binary name
> clashes, both the old 'legacy' C BLAST and the new C++
> BLAST+ suite have a binary called rpsblast.
>
> Our test code copes with this by searching the path and checking
> each rpsblast binary found - looking for the new version only.
>
> However, Debian policy is to resolve ambiguities like this with
> a unilateral renaming - in this case I *think* they called the new
> binary rpsblast+ instead. Can you confirm that? I don't have
> access to a Debian machine right now.

Certainly this was their plan and was done on Bio-Linux,
http://lists.debian.org/debian-med/2011/05/msg00025.html

> So, strictly speaking the Biopython test is correct - you don't
> have the new rpsblast installed. However, it would be more
> helpful if we also checked for the Debian alias rpsblast+ too.
>
> That shouldn't be too complicated to do - especially if you
> could rerun the tests using Biopython from git for me?

This commit is now on our master branch,
https://github.com/biopython/biopython/commit/148b681a66061cc03d70f940a2efdede29adc64a

Thanks,

Peter

From anaryin at gmail.com Fri Aug 2 16:13:04 2013
From: anaryin at gmail.com (João Rodrigues)
Date: Fri, 2 Aug 2013 09:13:04 -0700
Subject: [Biopython-dev] Fwd: pdb-l: Announcement: wwPDB Workshop on mmCIF/PDBx for Programmers, 20/21 Nov-13, Cambridge (UK)
In-Reply-To:
References: <51FB69C6.3040200@ebi.ac.uk>
Message-ID:

Hi Peter, Glen,

I'll be going (or trying to at least).

Cheers,

João

2013/8/2 Peter Cock
> Thanks for forwarding that Glen - it would be great if any of
> our structural Biopython folk could go.
>
> Is anyone interested & reasonably close to Cambridge UK?
>
> Peter
>
> [...]

From p.j.a.cock at googlemail.com Fri Aug 2 16:20:02 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 2 Aug 2013 17:20:02 +0100
Subject: [Biopython-dev] Fwd: pdb-l: Announcement: wwPDB Workshop on mmCIF/PDBx for Programmers, 20/21 Nov-13, Cambridge (UK)
In-Reply-To:
References: <51FB69C6.3040200@ebi.ac.uk>
Message-ID:

That's good news João - thanks!

Peter.

On Fri, Aug 2, 2013 at 5:13 PM, João Rodrigues wrote:
> Hi Peter, Glen,
>
> I'll be going (or trying to at least).
>
> Cheers,
>
> João
>
> [...]

From ben at benfulton.net Mon Aug 5 01:28:34 2013
From: ben at benfulton.net (Ben Fulton)
Date: Sun, 4 Aug 2013 21:28:34 -0400
Subject: [Biopython-dev] 1.62b test coverage report
In-Reply-To:
References:
Message-ID:

Fixed the following:

I had installed Mafft version 6.850-1 from apt-get, which apparently is more than a year old and doesn't work. The tests ran after I installed it from source.

I had not gotten a path set up properly for XXMotif; once I did the tests all ran.

The DiAlign tests passed after I removed the precondition checks.

Did not fix:

The site http://www.rubic.rdg.ac.uk/~mab/software.html is down, and I can't find anywhere else to install the PopGen software from.

So with all of those modifications, I ran coverage against the latest code from GitHub.
Results are once again available on my website, http://benfulton.net/BioPython162_Coverage , and the following issues remain:

EmbossPhylipNew - skipped, too hard to install
Fasttree - error, apparently a versioning issue
PopGen_FDist and PopGen_DFdist - skipped, unavailable
Prank - failed, recent versions of the tool have some kind of output change

On Fri, Aug 2, 2013 at 8:00 AM, Peter Cock wrote:
> [...]
> This commit is now on our master branch,
>
> https://github.com/biopython/biopython/commit/148b681a66061cc03d70f940a2efdede29adc64a

From yeyanbo289 at gmail.com Mon Aug 5 08:57:34 2013
From: yeyanbo289 at gmail.com (Yanbo Ye)
Date: Mon, 5 Aug 2013 16:57:34 +0800
Subject: [Biopython-dev] GSOC weekly update 8
Message-ID:

Hi all,

I have posted an update for the Biopython.Phylo project here:
http://blog.yeyanbo.com/posts/google-summer-of-code-8.html

Thanks,
Yanbo

--
Yanbo Ye
Guangzhou Institutes of Biomedicine and Health,
Chinese Academy of Sciences
190 Kaiyuan Avenue, Science Park, Guangzhou, China
Email: ye_yanbo at gibh.ac.cn
Web: http://www.yeyanbo.com
Phone: (86)-020-32093810

From p.j.a.cock at googlemail.com Mon Aug 5 11:46:00 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Mon, 5 Aug 2013 12:46:00 +0100
Subject: [Biopython-dev] 1.62b test coverage report
In-Reply-To:
References:
Message-ID:

On Mon, Aug 5, 2013 at 2:28 AM, Ben Fulton wrote:
>
> The site http://www.rubic.rdg.ac.uk/~mab/software.html is down, and I can't
> find anywhere else to install the PopGen software from.
>

There seems to be a fairly recent snapshot on archive.org,
http://web.archive.org/web/20120510013219/http://www.rubic.rdg.ac.uk/~mab/software.html

Meanwhile, I have emailed Dr. Mark Beaumont at Reading University to ask about the server status.

Regards,

Peter

From p.j.a.cock at googlemail.com Mon Aug 5 12:14:04 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Mon, 5 Aug 2013 13:14:04 +0100
Subject: [Biopython-dev] 1.62b test coverage report
In-Reply-To:
References:
Message-ID:

On Mon, Aug 5, 2013 at 12:46 PM, Peter Cock wrote:
> On Mon, Aug 5, 2013 at 2:28 AM, Ben Fulton wrote:
>>
>> The site http://www.rubic.rdg.ac.uk/~mab/software.html is down, and I can't
>> find anywhere else to install the PopGen software from.
>>
>
> There seems to be a fairly recent snapshot on archive.org,
> http://web.archive.org/web/20120510013219/http://www.rubic.rdg.ac.uk/~mab/software.html
>
> Meanwhile, I have emailed Dr. Mark Beaumont at Reading
> University to ask about the server status.

Mark has moved to Bristol:
http://www.maths.bris.ac.uk/people/profile/mamab

FDist and DFDist are available here now:
http://www.maths.bris.ac.uk/~mamab/

We need to update the Biopython documentation (and check those versions from Bristol still work with our tests).

Tiago, could you handle that?

Thanks,

Peter

From arklenna at gmail.com Mon Aug 5 13:11:19 2013
From: arklenna at gmail.com (Lenna Peterson)
Date: Mon, 5 Aug 2013 09:11:19 -0400
Subject: [Biopython-dev] Bugzilla --> RedMine --> GitHub issues?
In-Reply-To:
References:
Message-ID:

Peter,

It's been a few days that I can't connect to redmine. I just got an error page saying RoR couldn't start or connect to the MySQL server.

Cheers,

Lenna

On Mon, Jul 22, 2013 at 10:36 AM, Peter Cock wrote:
> On Mon, Jul 22, 2013 at 12:43 PM, Peter Cock
> wrote:
> >
> > Well this isn't tomorrow - but I'm back from BOSC 2013 in Germany now.
> >
> > In the absence of any dissenting views, and the fact that RedMine is
> > also offline right now (which I've raised with the OBF admin volunteers),
>
> Fixed again :)
>
> > I've enabled GitHub issues & linked to this from the main page:
> >
> > https://github.com/biopython/biopython/issues
> >
> > You'll notice there are already lots of issues there - all pull request
> > related. This is one reason why an automated import of the old
> > Bugzilla/RedMine issues could be complicated.
> >
> > Various other bits of our documentation will need to be updated...
>
> Hopefully done now, e.g.
> > https://github.com/biopython/biopython/commit/e836f4fadde494a8253b4a4114a36ff3259eb079 > > https://github.com/biopython/biopython/commit/e836f4fadde494a8253b4a4114a36ff3259eb079 > > Note that there doesn't seem to be a way to turn off new issues in > a RedMine project - there are hacks via removing the ability from > the roles, but I fear that would affect the other projects still using > the RedMine server (e.g. BioPerl). > > Instead we may just have to do the triage/migration and then > drop the links to the old RedMine server from the website etc. > > Peter > _______________________________________________ > Biopython-dev mailing list > Biopython-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython-dev > From p.j.a.cock at googlemail.com Mon Aug 5 13:43:19 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Mon, 5 Aug 2013 14:43:19 +0100 Subject: [Biopython-dev] 1.62b test coverage report In-Reply-To: References: Message-ID: On Mon, Aug 5, 2013 at 1:14 PM, Peter Cock wrote: > On Mon, Aug 5, 2013 at 12:46 PM, Peter Cock wrote: >> On Mon, Aug 5, 2013 at 2:28 AM, Ben Fulton wrote: >>> >>> The site http://www.rubic.rdg.ac.uk/~mab/software.html is down, and I can't >>> find anywhere else to install the PopGen software from. >>> >> >> There seems to be a fairly recent snapshot on archive.org, >> http://web.archive.org/web/20120510013219/http://www.rubic.rdg.ac.uk/~mab/software.html >> >> Meanwhile, I have emailed Dr. Mark Beaumont at Reading >> University to ask about the server status. > > Mark has moved to Bristol: > http://www.maths.bris.ac.uk/people/profile/mamab > > FDist and DFDist are available here now: > http://www.maths.bris.ac.uk/~mamab/ > > We need to update the Biopython documentation (and check > those versions from Bristol still work with our tests). > > Tiago, could you handle that? According to his email auto-reply, Tiago is away right now. 
I've updated a couple of URLs in the source code: https://github.com/biopython/biopython/commit/70667063701041b73147c502c933fa8bfde1d850 Ben - did you see anything else which needs updating here? Thanks, Peter From p.j.a.cock at googlemail.com Mon Aug 5 14:01:12 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Mon, 5 Aug 2013 15:01:12 +0100 Subject: [Biopython-dev] Bugzilla --> RedMine --> GitHub issues? In-Reply-To: References: Message-ID: On Mon, Aug 5, 2013 at 2:11 PM, Lenna Peterson wrote: > Peter, > > It's been a few days that I can't connect to redmine. I just got a error > page saying RoR couldn't start or connect to the MySQL server. > > Cheers, > > Lenna OK, Chris Dag has got RedMine to work again, and told me what he did in case I need to restart if this happens again. If any RedMine guru is reading and has some thoughts on the cause and long term solution, drop us an email please. As to issue triage - I suggest you start with anything you filed or commented on, then things you are familiar with. But any order is fine really. I suggest for "moving" an issue, we file the new GitHub issue (linking to the old issue, but also trying to capture any relevant information from the old bug tracker to be self sufficient), and then close the old RedMine issue with a link to its replacement. Thanks, Peter From p.j.a.cock at googlemail.com Mon Aug 5 14:26:32 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Mon, 5 Aug 2013 15:26:32 +0100 Subject: [Biopython-dev] Bio.XXX.Applications vs Bio.motifs.applications Message-ID: Hi all, I've noticed that as part of migrating from Bio.Motif to Bio.motifs, the Applications module has acquired a lower case name. Lower case module names are in principle a good thing (PEP8) but elsewhere in Biopython the Applications modules are all using title case. Would a lower case shorter name be better, such as apps (i.e. Bio.motifs.apps in this case)? 
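For what it's worth, a lower-case name could be added alongside the existing one rather than replacing it. A minimal sketch, assuming nothing about Biopython's actual layout: the "demo" package and TOOL_NAME below are made up, and the modules are built in memory only so that the example is self-contained. In a real package the alias would more likely be a small apps module that re-exports from Applications.

```python
import sys
import types

# Hypothetical stand-in for an existing title-case module
# (think of something like SomePackage.Applications).
package = types.ModuleType("demo")
package.__path__ = []  # mark it as a package so submodule imports work
applications = types.ModuleType("demo.Applications")
applications.TOOL_NAME = "example-tool"

sys.modules["demo"] = package
sys.modules["demo.Applications"] = applications
package.Applications = applications

# The alias: register the very same module object under the new
# lower-case name, so both spellings keep working during a transition.
sys.modules["demo.apps"] = applications
package.apps = applications

import demo.Applications
import demo.apps

# Old and new names refer to one module, so nothing is duplicated.
same_module = demo.apps is demo.Applications
```

Existing imports of the title-case name are untouched, and new code can start using the short name straight away.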
This could also be adopted in other modules for a gradual conversion if desired (e.g. introduce Bio.Phylo.apps as an alias for Bio.Phylo.Applications). What do people think? Thanks, Peter From dalke at dalkescientific.com Tue Aug 6 01:18:06 2013 From: dalke at dalkescientific.com (Andrew Dalke) Date: Tue, 6 Aug 2013 03:18:06 +0200 Subject: [Biopython-dev] Adopting BSD 3-Clause license for Biopython? In-Reply-To: References: Message-ID: <9B34F2CB-2D39-40C5-A462-3C99CFB317D3@dalkescientific.com> On Jul 24, 2013, at 11:13 AM, Peter Cock wrote: > The current Biopython License is very short and liberal, and I have > long described it as an MIT/BSD type licence. However the actual > wording matches neither of these exactly (as far as I could tell): That's my doing. When Jeff and I started Biopython in 1999 we needed to choose a license. We started with the Python license, which (for 1.5.2) was: Permission to use, copy, modify, and distribute this software and its documentation for any purpose and without fee is hereby granted, provided that the above copyright notice appear in all copies and that both that copyright notice and this permission notice appear in supporting documentation, and that the names of Stichting Mathematisch Centrum or CWI or Corporation for National Research Initiatives or CNRI not be used in advertising or publicity pertaining to distribution of the software without specific, written prior permission. While CWI is the initial source for this software, a modified version is made available by the Corporation for National Research Initiatives (CNRI) at the Internet address ftp://ftp.python.org. 
STICHTING MATHEMATISCH CENTRUM AND CNRI DISCLAIM ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS, IN NO EVENT SHALL STICHTING MATHEMATISCH CENTRUM OR CNRI BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. Compare that to the Biopython license, with the alterations marked: Permission to use, copy, modify, and distribute this software and its documentation >>>with or without modifications<< and for any purpose and without fee is hereby granted, provided that >>any copyright notices<<< appear in all copies and that both >>>those copyright notices<<< and this permission notice appear in supporting documentation, and that the names of >>>the contributors or copyright holders<<< not be used in advertising or publicity pertaining to distribution of the software without specific prior permission. [2nd paragraph of original Python license omitted] >>>THE CONTRIBUTORS AND COPYRIGHT HOLDERS OF THIS SOFTWARE<<< DISCLAIM ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS, IN NO EVENT SHALL >>>THE CONTRIBUTORS OR COPYRIGHT HOLDERS<<< BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. This was called a "Python-style license", and you can see an example at http://effbot.org/zone/copyright.htm . 
Indeed, his PIL package is an example of a current Python module which still uses that license: http://www.pythonware.com/products/pil/license.htm You'll see that Fredrik Lundh refers to it as the "Historical Permission Notice and Disclaimer", and points to: http://opensource.org/licenses/historical.php Further note that the OSI comments that "This License has been voluntarily deprecated by its author" .. whatever that means ... and that that http://opensource.org/proliferation-report describes it as "redundant with more popular licenses", and more specifically the BSD. > In theory we could ask the OSI to approve our current license, but as > they explain "yet another license" is not a good thing to encourage: > http://opensource.org/proliferation It wouldn't be a "yet another license" as it's already registered with the OSI ... almost. The one odd alteration I made was to add "with or without modifications", because some people on comp.lang.python expressed concern that "use, copy, modify, and distribute" could be interpreted to be restrictive, as in "you can modify it original source code, or distribute the original source code, but you can't distribute the modified source code. I've since learned that this is a hyper-picky interpretation with no legal bearing. I don't know if that "with or without modifications" is enough different that the OSI would say it's doesn't fall under the 'Historical Permission Notice and Disclaimer', In any case, I agree with a relicensing. The current license is from a bygone era. Nowadays I just pick the MIT license. If there's anything copyright by me still remaining in Biopython, I hereby relicense it under the MIT and/or one of the standard n-clause BSD licenses, at your choice. 
Cheers, Andrew dalke at dalkescientific.com From p.j.a.cock at googlemail.com Tue Aug 6 09:11:33 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Tue, 6 Aug 2013 10:11:33 +0100 Subject: [Biopython-dev] Adopting BSD 3-Clause license for Biopython? In-Reply-To: <9B34F2CB-2D39-40C5-A462-3C99CFB317D3@dalkescientific.com> References: <9B34F2CB-2D39-40C5-A462-3C99CFB317D3@dalkescientific.com> Message-ID: On Tue, Aug 6, 2013 at 2:18 AM, Andrew Dalke wrote: > On Jul 24, 2013, at 11:13 AM, Peter Cock wrote: >> The current Biopython License is very short and liberal, and I have >> long described it as an MIT/BSD type licence. However the actual >> wording matches neither of these exactly (as far as I could tell): > > That's my doing. When Jeff and I started Biopython in 1999 we > needed to choose a license. We started with the Python license, > which (for 1.5.2) was: > > ... Ah - with hindsight I should have checked the older Python licenses, but I was thinking more of their current very long version. > You'll see that Fredrik Lundh refers to it as the "Historical > Permission Notice and Disclaimer", and points to: > > http://opensource.org/licenses/historical.php > > Further note that the OSI comments that "This License has been > voluntarily deprecated by its author" .. whatever that > means ... and that that http://opensource.org/proliferation-report > describes it as "redundant with more popular licenses", and > more specifically the BSD. > >> In theory we could ask the OSI to approve our current license, but as >> they explain "yet another license" is not a good thing to encourage: >> http://opensource.org/proliferation > > It wouldn't be a "yet another license" as it's already > registered with the OSI ... almost. 
> > The one odd alteration I made was to add "with or without > modifications", because some people on comp.lang.python > expressed concern that "use, copy, modify, and distribute" > could be interpreted to be restrictive, as in "you can > modify it original source code, or distribute the original > source code, but you can't distribute the modified source > code. I've since learned that this is a hyper-picky > interpretation with no legal bearing. > > I don't know if that "with or without modifications" is > enough different that the OSI would say it's doesn't fall > under the 'Historical Permission Notice and Disclaimer', Thanks for that background information. Educational. > In any case, I agree with a relicensing. The current > license is from a bygone era. Nowadays I just pick the MIT > license. > > If there's anything copyright by me still remaining in > Biopython, I hereby relicense it under the MIT and/or one > of the standard n-clause BSD licenses, at your choice. That's great Andrew - thank you, Peter From p.j.a.cock at googlemail.com Tue Aug 6 22:51:22 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Tue, 6 Aug 2013 23:51:22 +0100 Subject: [Biopython-dev] Adjusting the xxMotif wrapper / Bio.Application plans Message-ID: Hi Christian et al., I've just noticed something in the XXmotif wrapper which I should have raised back in November 2012 when it was committed. This is to do with the way the options were define, e.g. _Option(["--negSet", "negSet", "negset", "NEGSET"], "sequence set which has to be used as a reference set", filename = True, equate = False), The first argument is a list of names, aliases which can be used via the (legacy) set_parameter method. Of these the first is what goes in the actual command string, and the last must be a valid Python identifier and becomes a property and a keyword argument for the __init__ method (and ideally follow PEP8 guidelines). 
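The naming rule described above can be illustrated with a small stand-in class. This is only a sketch: `OptionSketch`, its `flag` and `attribute` properties, and the example file name are hypothetical names invented here, not the `_Option` class from Bio.Application. It models just the stated convention, that the first name is what goes into the command string and the last name is what becomes the Python property and keyword argument.

```python
# Hypothetical stand-in for illustration only; NOT Bio.Application._Option.
class OptionSketch(object):
    """Model of an option whose names list follows the described convention."""

    def __init__(self, names, description, equate=True):
        if not names:
            raise ValueError("at least one name is required")
        self.names = list(names)
        self.description = description
        self.equate = equate  # True: flag=value, False: flag value
        self.value = None

    @property
    def flag(self):
        # First name: the text used in the actual command string.
        return self.names[0]

    @property
    def attribute(self):
        # Last name: the Python property / keyword argument name.
        return self.names[-1]

    def __str__(self):
        if self.value is None:
            return ""
        separator = "=" if self.equate else " "
        return "%s%s%s" % (self.flag, separator, self.value)


# Four names, as in the wrapper quoted above: the property is "NEGSET".
four = OptionSketch(["--negSet", "negSet", "negset", "NEGSET"],
                    "sequence set which has to be used as a reference set",
                    equate=False)

# Two names: the property is the PEP8-friendly "negset".
two = OptionSketch(["--negSet", "negset"],
                   "sequence set which has to be used as a reference set",
                   equate=False)
two.value = "background.fasta"  # made-up file name
```

With only two names there is no ambiguity about which spelling ends up as the keyword, which is the point being made about the order of the list.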
Normally the _Option would just have two aliases; in this case ["--negSet", "negset"] would seem best. Clearly I'd not documented this well enough, but I've tried to make this more explicit now: https://github.com/biopython/biopython/commit/39a88714ab7ee7a8dc4ed2b7a7ea71569fdd4293 Was there a special reason for all these case variants in the XXmotif options? We could perhaps just change this now in the newer Bio.motifs module, despite this being live in the Biopython 1.61 release, since right now the nasty all upper case aliases are being used as the property names and keyword names. But that could break a few scripts already using Bio.motifs.application's XXmotif wrapper. Looking ahead, other than set_parameter, all the other legacy bits in Bio.Application have been removed, so we could take a fresh look at whether we can transition to a more explicit application definition, which I hope is possible with the class files defining these properties explicitly (perhaps with decorators for things like validation methods), rather than implicitly as now via the __init__ method, which doesn't suit things like autogenerated API docs. There may be a catch in how best to make the parameter order explicit (currently done via the parameters being in a list), which can be vital for many command line tools. Regards, Peter
From christian at brueffer.de Thu Aug 8 10:37:19 2013 From: christian at brueffer.de (Christian Brueffer) Date: Thu, 08 Aug 2013 12:37:19 +0200 Subject: [Biopython-dev] Adjusting the xxMotif wrapper / Bio.Application plans In-Reply-To: References: Message-ID: <520374DF.9070301@brueffer.de> On 8/7/13 0:51 , Peter Cock wrote: > Hi Christian et al., > > I've just noticed something in the XXmotif wrapper which > I should have raised back in November 2012 when it was > committed. This is to do with the way the options were > define, e.g.
> > _Option(["--negSet", "negSet", "negset", "NEGSET"], > "sequence set which has to be used as a reference set", > filename = True, > equate = False), > > The first argument is a list of names, aliases which can > be used via the (legacy) set_parameter method. Of > these the first is what goes in the actual command > string, and the last must be a valid Python identifier > and becomes a property and a keyword argument > for the __init__ method (and ideally follow PEP8 > guidelines). > Yeah, unfortunately I wasn't aware of this detail. > Normally the _Option would just have TWO alias, > in this case ["--negSeq, "negset"] would seem best. > > Clearly I'd not documented this well enough, but > I've tried to make this more explicit now: > https://github.com/biopython/biopython/commit/39a88714ab7ee7a8dc4ed2b7a7ea71569fdd4293 > > Was there a special reason for all these case variants > in the XXmotif options?? > I basically followed the example set by Bio/Align/Applications/_Clustalw.py. The "rationale" was to allow for people to use their favourite spelling variety. I guess it was bad luck this happened to serve as an example, as it was the first piece of code I ever touched in BioPython. It would be nice to streamline all application wrappers in this regard sometime... Chris From p.j.a.cock at googlemail.com Thu Aug 8 11:00:22 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Thu, 8 Aug 2013 12:00:22 +0100 Subject: [Biopython-dev] Adjusting the xxMotif wrapper / Bio.Application plans In-Reply-To: <520374DF.9070301@brueffer.de> References: <520374DF.9070301@brueffer.de> Message-ID: On Thu, Aug 8, 2013 at 11:37 AM, Christian Brueffer wrote: >> >> Was there a special reason for all these case variants >> in the XXmotif options?? > > I basically followed the example set by > Bio/Align/Applications/_Clustalw.py. Ah. 
Without checking I think maybe the ClustalW documentation used both cases - but the order was deliberately with the lower case one last as that was used in the Python object as the property name and keyword. > The "rationale" was to allow for people to use their favourite > spelling variety. > > I guess it was bad luck this happened to serve as an example, as it > was the first piece of code I ever touched in BioPython. > > It would be nice to streamline all application wrappers in this regard > sometime... Yeah, perhaps we can formally deprecate set_parameter in the next release which means all the aliases 'go away' and that leaves us with just the final entry exposed as the usable property name and keyword. Peter From arklenna at gmail.com Thu Aug 8 19:54:58 2013 From: arklenna at gmail.com (Lenna Peterson) Date: Thu, 8 Aug 2013 15:54:58 -0400 Subject: [Biopython-dev] PDB occupancy behavior Message-ID: Hi all, I just submitted a pull request I'd like wider feedback on. https://github.com/biopython/biopython/pull/207 In summary, I am using software-produced PDB files that simply stop after the coordinate data, so occupancy data is missing. Currently, the Biopython PDBParser sets missing or blank occupancy to 0.0. I am suggesting changing this to 1.0. I would like to see if anyone knows of situations in which this would be a bad idea. Cheers, Lenna From anaryin at gmail.com Thu Aug 8 20:02:39 2013 From: anaryin at gmail.com (=?UTF-8?Q?Jo=C3=A3o_Rodrigues?=) Date: Thu, 8 Aug 2013 13:02:39 -0700 Subject: [Biopython-dev] [Biopython] PDB occupancy behavior In-Reply-To: References: Message-ID: Hi Lenna, As I mentioned in the Github email, I think it's fine. It doesn't matter if the occupancy is 0 or 1 in case of a model most of the time. I agree with it. The only bad thing I can think about is having occupancy for a certain atom larger than 1 in some bogus cases but to be honest, no software that I know of bothers checking that... 
Cheers, João 2013/8/8 Lenna Peterson > Hi all, > > I just submitted a pull request I'd like wider feedback on. > > https://github.com/biopython/biopython/pull/207 > > In summary, I am using software-produced PDB files that simply stop after > the coordinate data, so occupancy data is missing. Currently, the Biopython > PDBParser sets missing or blank occupancy to 0.0. I am suggesting changing > this to 1.0. > > I would like to see if anyone knows of situations in which this would be a > bad idea. > > Cheers, > > Lenna > _______________________________________________ > Biopython mailing list - Biopython at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython >
From p.j.a.cock at googlemail.com Thu Aug 8 22:37:27 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Thu, 8 Aug 2013 23:37:27 +0100 Subject: [Biopython-dev] [Biopython] PDB occupancy behavior In-Reply-To: References: Message-ID: Thanks everyone - that seems like a clear consensus, patch applied :) Peter On Thu, Aug 8, 2013 at 9:30 PM, Sampson, Jared wrote: > Thanks, Lenna and João - > > I also agree, 1.0 is a better default occupancy value. For most > structural manipulation purposes, unless specified otherwise, we must assume > the atoms listed are present in the structure at full occupancy. Setting a > reduced occupancy can be useful for partially bound ligands, disordered > loops, and so forth, but doing so is the exception, not the rule. > > Cheers, > Jared > > -- > Jared Sampson > Xiangpeng Kong Lab > NYU Langone Medical Center > Old Public Health Building, Room 610 > 341 East 25th Street > New York, NY 10016 > 212-263-7898 > http://kong.med.nyu.edu/ > > > > > On Aug 8, 2013, at 4:02 PM, João Rodrigues > > wrote: > > Hi Lenna, > > As I mentioned in the Github email, I think it's fine. It doesn't matter > if the occupancy is 0 or 1 in case of a model most of the time. I agree > with it.
The only bad thing I can think about is having occupancy for > a certain atom larger than 1 in some bogus cases but to be honest, > no software that I know of bothers checking that... > > Cheers, > > João > > > 2013/8/8 Lenna Peterson > > > Hi all, > > I just submitted a pull request I'd like wider feedback on. > > https://github.com/biopython/biopython/pull/207 > > In summary, I am using software-produced PDB files that simply stop after > the coordinate data, so occupancy data is missing. Currently, the > Biopython PDBParser sets missing or blank occupancy to 0.0. I am > suggesting changing this to 1.0. > > I would like to see if anyone knows of situations in which this would be a > bad idea. > > Cheers, > > Lenna
From ben at benfulton.net Fri Aug 9 01:03:10 2013 From: ben at benfulton.net (Ben Fulton) Date: Thu, 8 Aug 2013 21:03:10 -0400 Subject: [Biopython-dev] 1.62b test coverage report In-Reply-To: References: Message-ID: Everything else is passing. The PopGen files pass as well after installing them from source. On Mon, Aug 5, 2013 at 9:43 AM, Peter Cock wrote: > On Mon, Aug 5, 2013 at 1:14 PM, Peter Cock > wrote: > > On Mon, Aug 5, 2013 at 12:46 PM, Peter Cock > wrote: > >> On Mon, Aug 5, 2013 at 2:28 AM, Ben Fulton wrote: > >>> > >>> The site http://www.rubic.rdg.ac.uk/~mab/software.html is down, and I > can't > >>> find anywhere else to install the PopGen software from. > >>> > >> > >> There seems to be a fairly recent snapshot on archive.org, > >> > http://web.archive.org/web/20120510013219/http://www.rubic.rdg.ac.uk/~mab/software.html > >> > >> Meanwhile, I have emailed Dr. Mark Beaumont at Reading > >> University to ask about the server status. > > > > Mark has moved to Bristol: > > http://www.maths.bris.ac.uk/people/profile/mamab > > > > FDist and DFDist are available here now: > > http://www.maths.bris.ac.uk/~mamab/ > > > > We need to update the Biopython documentation (and check > > those versions from Bristol still work with our tests).
> > > > Tiago, could you handle that? > > According to his email auto-reply, Tiago is away right now. > > I've updated a couple of URLs in the source code: > > https://github.com/biopython/biopython/commit/70667063701041b73147c502c933fa8bfde1d850 > > Ben - did you see anything else which needs updating here? > > Thanks, > > Peter > From mok at bioxray.dk Fri Aug 9 08:39:55 2013 From: mok at bioxray.dk (Morten Kjeldgaard) Date: Fri, 9 Aug 2013 10:39:55 +0200 Subject: [Biopython-dev] PDB occupancy behavior Message-ID: Lenna wrote: > In summary, I am using software-produced PDB files that simply stop after > the coordinate data, so occupancy data is missing. Currently, the Biopython > PDBParser sets missing or blank occupancy to 0.0. I am suggesting changing > this to 1.0. I think it is an incorrect default behaviour to set the occupancy to 1 if it's not present in the file. If the occupancy is not there, you can't say anything about it, and it should be set to 0, so the current defaults are correct IMO. If, for some reason, you NEED the occupancy to be 1, and it is not, it is very simple to write a loop modifying it. I.e. special needs should be taken care of in the users program, not Bio.PDB. Cheers, Morten -- Morten Kjeldgaard, asc. professor, MSc, PhD Dept. of Molecular Biology and Genetics, Aarhus University Gustav Wieds Vej 10C, Building 3135, DK-8000 Aarhus C, Denmark. From mok at bioxray.dk Fri Aug 9 08:33:37 2013 From: mok at bioxray.dk (Morten Kjeldgaard) Date: Fri, 9 Aug 2013 10:33:37 +0200 Subject: [Biopython-dev] Redmine issue 2727 ready for pull Message-ID: <0743AFDE-D1B2-4348-AFFE-3CE5CC227FE4@bioxray.dk> Hi, I've finally gotten around to following up to a very old patch I sent to the redmine bug tracker [1]. The patch addresses the problem that Bio.PDB does not parse the important CRYST1 record. In the bug comments, Peter Cock asked to include the explanation of the new keys in the docstring. That has now been done. 
Peter also asks about the default values chosen (if the CRYST1 header is not present). These are probably universally chosen default values in various crystallographic programs, and these values are also used in PDB entries containing NMR entries, for example. My github branch containing the patch #2727 is in [2]. I am using Bio.PDB quite a lot, and I would like to contribute more to it in the future. Cheers, Morten [1] https://redmine.open-bio.org/issues/2727 [2] https://github.com/mok0/biopython/tree/pdbwork
From p.j.a.cock at googlemail.com Fri Aug 9 08:47:15 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Fri, 9 Aug 2013 09:47:15 +0100 Subject: [Biopython-dev] PDB occupancy behavior In-Reply-To: References: Message-ID: On Fri, Aug 9, 2013 at 9:39 AM, Morten Kjeldgaard wrote: > Lenna wrote: > > > In summary, I am using software-produced PDB files that simply stop > after > > the coordinate data, so occupancy data is missing. Currently, the > Biopython > > PDBParser sets missing or blank occupancy to 0.0. I am suggesting > changing > > this to 1.0. > > I think it is an incorrect default behaviour to set the occupancy to 1 if it's not present in the file. If the occupancy is not there, you can't say anything about it, and it should be set to 0, so the current defaults are correct IMO. > > If, for some reason, you NEED the occupancy to be 1, and it is not, it is very simple to write a loop modifying it. I.e. special needs should be taken care of in the users program, not Bio.PDB. > > Cheers, > Morten > > How about the special float values NaN or NA instead? Or the Python special value None? Peter
From mok at bioxray.dk Fri Aug 9 09:07:13 2013 From: mok at bioxray.dk (Morten Kjeldgaard) Date: Fri, 9 Aug 2013 11:07:13 +0200 Subject: [Biopython-dev] PDB occupancy behavior In-Reply-To: References: Message-ID: <3626CAF5-41E2-43C7-8C0E-49FC83786EE0@bioxray.dk> On 09/08/2013, at 10:47, Peter Cock wrote: > How about the special float values NaN or NA instead? > Or the Python special value None? TBH I don't think there is any good reason to change the current defaults. On the contrary, we should be careful when changing default values since this might break users' programs. My point is, that Lenna wants to read files that does not follow the PDB standard, and so she needs to make provisions for that in her own program, not the toolkit. Putting None in the value of a field that isn't there, but should be according the format specification is more reasonable, since it alerts the user to the fact that something is fishy. However, it should only be done this way if that is a philosophy used throughout the Biopython toolkit. Is it? I would warn against using NaN since it is non-pythonic and a nightmare to deal with in practice.
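The practical difficulty with NaN as a "missing" marker can be seen in a few lines of standard Python. This is a generic sketch, not code from Bio.PDB, and the variable names are made up for illustration: NaN compares unequal to everything, including itself, and quietly propagates through arithmetic, whereas None supports a plain identity test and fails loudly when used in a calculation.

```python
import math

missing_nan = float("nan")
missing_none = None

# NaN is not equal to itself, so an ordinary equality check cannot find it.
nan_equals_itself = (missing_nan == missing_nan)

# A dedicated function is needed to detect NaN ...
nan_detected = math.isnan(missing_nan)

# ... while None is detected with a simple identity test.
none_detected = missing_none is None

# NaN silently contaminates any arithmetic it takes part in.
mean_with_nan = (1.0 + missing_nan) / 2

# None raises immediately instead, which makes the problem visible.
try:
    mean_with_none = (1.0 + missing_none) / 2
    none_raised = False
except TypeError:
    none_raised = True
```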
Cheers, Morten From p.j.a.cock at googlemail.com Fri Aug 9 11:06:46 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Fri, 9 Aug 2013 12:06:46 +0100 Subject: [Biopython-dev] PDB occupancy behavior In-Reply-To: <3626CAF5-41E2-43C7-8C0E-49FC83786EE0@bioxray.dk> References: <3626CAF5-41E2-43C7-8C0E-49FC83786EE0@bioxray.dk> Message-ID: On Fri, Aug 9, 2013 at 10:07 AM, Morten Kjeldgaard wrote: > On 09/08/2013, at 10:47, Peter Cock wrote: > > > How about the special float values NaN or NA instead? > > Or the Python special value None? > > TBH I don't think there is any good reason to change the current defaults. > On the contrary, we should be careful when changing default values since > this might break users' programs. > > My point is, that Lenna wants to read files that does not follow the PDB > standard, and so she needs to make provisions for that in her own program, > not the toolkit. > > Do you think this should be something handled differently in strict and permissive mode? Should missing occupancy give a warning or error in strict mode? Peter From arklenna at gmail.com Fri Aug 9 13:07:41 2013 From: arklenna at gmail.com (Lenna Peterson) Date: Fri, 9 Aug 2013 09:07:41 -0400 Subject: [Biopython-dev] PDB occupancy behavior In-Reply-To: References: <3626CAF5-41E2-43C7-8C0E-49FC83786EE0@bioxray.dk> Message-ID: On Friday, 9 August 2013, Peter Cock wrote: > On Fri, Aug 9, 2013 at 10:07 AM, Morten Kjeldgaard > > wrote: > > > On 09/08/2013, at 10:47, Peter Cock > > wrote: > > > > > How about the special float values NaN or NA instead? > > > Or the Python special value None? > > > > TBH I don't think there is any good reason to change the current > defaults. > > On the contrary, we should be careful when changing default values since > > this might break users' programs. > > > > My point is, that Lenna wants to read files that does not follow the PDB > > standard, and so she needs to make provisions for that in her own > program, > > not the toolkit. 
> > > > > Do you think this should be something handled differently in strict and > permissive mode? Should missing occupancy give a warning or error in strict > mode? (Resending to dev list) None in permissive mode makes a lot of sense to me. Missing occupancy is a fatal error in strict mode. Lenna From p.j.a.cock at googlemail.com Fri Aug 9 13:14:44 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Fri, 9 Aug 2013 14:14:44 +0100 Subject: [Biopython-dev] PDB occupancy behavior In-Reply-To: References: <3626CAF5-41E2-43C7-8C0E-49FC83786EE0@bioxray.dk> Message-ID: On Fri, Aug 9, 2013 at 2:07 PM, Lenna Peterson wrote: > On Friday, 9 August 2013, Peter Cock wrote: > >> On Fri, Aug 9, 2013 at 10:07 AM, Morten Kjeldgaard > >> wrote: >> >> > On 09/08/2013, at 10:47, Peter Cock > >> wrote: >> > >> > > How about the special float values NaN or NA instead? >> > > Or the Python special value None? >> > >> > TBH I don't think there is any good reason to change the current >> defaults. >> > On the contrary, we should be careful when changing default values since >> > this might break users' programs. >> > >> > My point is, that Lenna wants to read files that does not follow the PDB >> > standard, and so she needs to make provisions for that in her own >> > program, not the toolkit. >> > >> > >> Do you think this should be something handled differently in strict and >> permissive mode? Should missing occupancy give a warning or error in strict >> mode? > > (Resending to dev list) > > None in permissive mode makes a lot of sense to me. > > Missing occupancy is a fatal error in strict mode. > > Lenna Good (error in strict mode). Do you think a warning in permissive mode for missing occupancy is also worth adding, or would using None as the value indicate that nicely? 
Peter From arklenna at gmail.com Fri Aug 9 13:46:54 2013 From: arklenna at gmail.com (Lenna Peterson) Date: Fri, 9 Aug 2013 09:46:54 -0400 Subject: [Biopython-dev] PDB occupancy behavior In-Reply-To: References: <3626CAF5-41E2-43C7-8C0E-49FC83786EE0@bioxray.dk> Message-ID: On Friday, 9 August 2013, Peter Cock wrote: > On Fri, Aug 9, 2013 at 2:07 PM, Lenna Peterson > > wrote: > > On Friday, 9 August 2013, Peter Cock wrote: > > > >> On Fri, Aug 9, 2013 at 10:07 AM, Morten Kjeldgaard > > > >> wrote: > >> > >> > On 09/08/2013, at 10:47, Peter Cock > > > >> wrote: > >> > > >> > > How about the special float values NaN or NA instead? > >> > > Or the Python special value None? > >> > > >> > TBH I don't think there is any good reason to change the current > >> defaults. > >> > On the contrary, we should be careful when changing default values > since > >> > this might break users' programs. > >> > > >> > My point is, that Lenna wants to read files that does not follow the > PDB > >> > standard, and so she needs to make provisions for that in her own > >> > program, not the toolkit. > >> > > >> > > >> Do you think this should be something handled differently in strict and > >> permissive mode? Should missing occupancy give a warning or error in > strict > >> mode? > > > > (Resending to dev list) > > > > None in permissive mode makes a lot of sense to me. > > > > Missing occupancy is a fatal error in strict mode. > > > > Lenna > > Good (error in strict mode). > > Do you think a warning in permissive mode for missing occupancy > is also worth adding, or would using None as the value indicate > that nicely? > > Peter > I have some concern about changing the type of an attribute but I imagine any end user who cares about occupancy doesn't want spurious values of either 1.0 or 0.0 anyway. I'm not at a computer right now but I believe most problems in the PDB parser are fatal in strict and warnings in permissive. So there should already be a warning in place. 
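The behaviour being discussed (None plus a warning in permissive mode, a fatal error in strict mode) could look roughly like the sketch below. It is an assumption-laden illustration, not the PDBParser code: `read_occupancy` and `MissingOccupancyWarning` are hypothetical names, and the only fact taken from the PDB format is that the occupancy field of an ATOM record occupies columns 55 to 60.

```python
import warnings


class MissingOccupancyWarning(UserWarning):
    """Hypothetical warning class, used only in this sketch."""


def read_occupancy(line, permissive=True):
    """Return the occupancy of an ATOM record, or None if it is absent.

    Hypothetical helper: a blank or missing field gives None plus a
    warning when permissive, and raises ValueError otherwise.
    """
    field = line[54:60].strip()  # columns 55-60, zero-based slice
    if field:
        return float(field)
    if not permissive:
        raise ValueError("ATOM record has no occupancy field")
    warnings.warn("Missing occupancy, using None", MissingOccupancyWarning)
    return None


# Build the records programmatically so the fixed-width columns line up.
# Only the coordinate columns matter here; the atom details are left blank.
coords_only = "ATOM".ljust(30) + "%8.3f%8.3f%8.3f" % (27.340, 24.430, 2.614)
complete = coords_only + "%6.2f%6.2f" % (1.00, 9.67)
```

A record that stops after the coordinates, like `coords_only`, yields None in permissive mode; callers who do not want the message can filter `MissingOccupancyWarning` with the standard `warnings` module.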
It occurred to me it would also be possible to create an "ultra-permissive" mode designed for parsing computationally produced files, and suppress some of the warnings (e.g. missing occupancy and B-factor). That way, the current behavior could be left unchanged. Possibly a permissiveness level (0 for strict, 1 for current permissive, 2 for even more permissive). Anyway, I'd be happy to implement any of these options (current parser to None, restore previous behavior and None in a new permissiveness level, other?) and of course update the unit test. Cheers, Lenna
From p.j.a.cock at googlemail.com Fri Aug 9 14:22:29 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Fri, 9 Aug 2013 15:22:29 +0100 Subject: [Biopython-dev] PDB occupancy behavior In-Reply-To: References: <3626CAF5-41E2-43C7-8C0E-49FC83786EE0@bioxray.dk> Message-ID: On Fri, Aug 9, 2013 at 2:46 PM, Lenna Peterson wrote: > On Friday, 9 August 2013, Peter Cock wrote: >> >> Good (error in strict mode). >> >> Do you think a warning in permissive mode for missing occupancy >> is also worth adding, or would using None as the value indicate >> that nicely? >> >> Peter > > > > I have some concern about changing the type of an attribute but I imagine > any end user who cares about occupancy doesn't want spurious values of > either 1.0 or 0.0 anyway. > > I'm not at a computer right now but I believe most problems in the PDB > parser are fatal in strict and warnings in permissive. So there should > already be a warning in place. > > It occurred to me it would also be possible to create an "ultra-permissive" > mode designed for parsing computationally produced files, and suppress some > of the warnings (e.g. missing occupancy and B-factor). That way, the current > behavior could be left unchanged. Possibly a permissiveness level (0 for > strict, 1 for current permissive, 2 for even more permissive).
> > Anyway, I'd be happy to implement any of these options (current parser to > None, restore previous behavior and None in a new permissiveness level, > other?) and of course update the unit test. You should be able to silence the PDB warnings in two lines anyway, so I don't think we really need an ultra-permissive no-warnings mode. Peter From anaryin at gmail.com Fri Aug 9 17:26:59 2013 From: anaryin at gmail.com (João Rodrigues) Date: Fri, 9 Aug 2013 10:26:59 -0700 Subject: [Biopython-dev] Moratorium on commits? Message-ID: Dear all, The situation with the occupancy in the PDBParser led me to think of one thing. Since not everybody is in the same timezone, has the same availability, etc, what about we introduce a brief moratorium over commits of say 3 days (except for critical bug fixes)? This will give everybody probably enough time to read the email and give their opinion. The downside is that it will make things roll a bit slower but then again, 3 days is not so much.. Cheers, João From p.j.a.cock at googlemail.com Fri Aug 9 19:06:21 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Fri, 9 Aug 2013 20:06:21 +0100 Subject: [Biopython-dev] Moratorium on commits? In-Reply-To: References: Message-ID: On Fri, Aug 9, 2013 at 6:26 PM, João Rodrigues wrote: > Dear all, > > The situation with the occupancy in the PDBParser led me to think of one > thing. > > Since not everybody is in the same timezone, has the same availability, > etc, what about we introduce a brief moratorium over commits of say 3 days > (except for critical bug fixes)? This will give everybody probably enough > time to read the email and give their opinion. > > The downside is that it will make things roll a bit slower but then again, > 3 days is not so much.. > > Cheers, > > João I don't think that's really needed for small commits like this which are simple to interpret.
In this case there were three opinions in favour of the idea, with a fourth counter view appearing later, resulting in a further tweak. Longer periods of discussion are far more important on large code additions or major changes. Peter From arklenna at gmail.com Sun Aug 11 00:43:36 2013 From: arklenna at gmail.com (Lenna Peterson) Date: Sat, 10 Aug 2013 20:43:36 -0400 Subject: [Biopython-dev] Redmine issue 2727 ready for pull In-Reply-To: <0743AFDE-D1B2-4348-AFFE-3CE5CC227FE4@bioxray.dk> References: <0743AFDE-D1B2-4348-AFFE-3CE5CC227FE4@bioxray.dk> Message-ID: Hi Morten, I think this looks great. Why not submit a pull request? Cheers, Lenna On Fri, Aug 9, 2013 at 4:33 AM, Morten Kjeldgaard wrote: > Hi, > > I've finally gotten around to following up on a very old patch I sent to > the redmine bug tracker [1]. The patch addresses the problem that Bio.PDB > does not parse the important CRYST1 record. In the bug comments, Peter > Cock asked to include the explanation of the new keys in the docstring. > That has now been done. > > Peter also asks about the default values chosen (if the CRYST1 header is > not present). These are probably universally chosen default values in > various crystallographic programs, and these values are also used in PDB > entries containing NMR entries, for example. > > My github branch containing the patch #2727 is in [2]. I am using Bio.PDB > quite a lot, and I would like to contribute more to it in the future.
> > Cheers, > Morten > > > [1] https://redmine.open-bio.org/issues/2727 > [2] https://github.com/mok0/biopython/tree/pdbwork > From mok at bioxray.dk Sun Aug 11 18:33:05 2013 From: mok at bioxray.dk (Morten Kjeldgaard) Date: Sun, 11 Aug 2013 20:33:05 +0200 Subject: [Biopython-dev] Redmine issue 2727 ready for pull In-Reply-To: References: <0743AFDE-D1B2-4348-AFFE-3CE5CC227FE4@bioxray.dk> Message-ID: On 11/08/2013, at 02:43, Lenna Peterson wrote: > I think this looks great. Why not submit a pull request? Thanks! Excuse me for my ignorance, but how do I submit a pull request? (I thought that is what I did by posting to the -dev list). Cheers, Morten From mok at bioxray.dk Sun Aug 11 18:28:36 2013 From: mok at bioxray.dk (Morten Kjeldgaard) Date: Sun, 11 Aug 2013 20:28:36 +0200 Subject: [Biopython-dev] Moratorium on commits? In-Reply-To: References: Message-ID: On 09/08/2013, at 21:06, Peter Cock wrote: > On Fri, Aug 9, 2013 at 6:26 PM, João Rodrigues wrote: >> Dear all, >> >> The situation with the occupancy in the PDBParser led me to think of one >> thing. >> >> Since not everybody is in the same timezone, has the same availability, >> etc, what about we introduce a brief moratorium over commits of say 3 days >> (except for critical bug fixes)? This will give everybody probably enough >> time to read the email and give their opinion. >> >> The downside is that it will make things roll a bit slower but then again, >> 3 days is not so much.. >> >> Cheers, >> >> João > > I don't think that's really needed for small commits like > this which are simple to interpret. In this case there were > three opinions in favour of the idea, with a fourth counter > view appearing later, resulting in a further tweak.
> > Longer periods of discussion are far more important on > large code additions or major changes. Sorry, but I don't agree that this is a "small commit". It may not be large in terms of number of bytes, but it is large in terms of impact, since it affects users' programs in unpredictable ways. Whenever a change is made that affects values returned to the user, it is worth spending a few days discussing it, to let people have a chance to think through the consequences of the change. Cheers, Morten From arklenna at gmail.com Sun Aug 11 18:40:38 2013 From: arklenna at gmail.com (Lenna Peterson) Date: Sun, 11 Aug 2013 14:40:38 -0400 Subject: [Biopython-dev] Redmine issue 2727 ready for pull In-Reply-To: References: <0743AFDE-D1B2-4348-AFFE-3CE5CC227FE4@bioxray.dk> Message-ID: On Sun, Aug 11, 2013 at 2:33 PM, Morten Kjeldgaard wrote: > > On 11/08/2013, at 02:43, Lenna Peterson wrote: > > > I think this looks great. Why not submit a pull request? > > Thanks! Excuse me for my ignorance, but how do I submit a pull request? (I > thought that is what I did by posting to the -dev list). > > Cheers, > Morten Hey Morten, It's good to let the dev list know you have code ready to merge in, but if you do it on GitHub, it will show up here too: https://github.com/biopython/biopython/pulls Here are GitHub's instructions: https://help.github.com/articles/creating-a-pull-request Cheers, Lenna From p.j.a.cock at googlemail.com Sun Aug 11 20:50:46 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Sun, 11 Aug 2013 21:50:46 +0100 Subject: [Biopython-dev] Moratorium on commits? In-Reply-To: References: Message-ID: On Sun, Aug 11, 2013 at 7:28 PM, Morten Kjeldgaard wrote: > > On 09/08/2013, at 21:06, Peter Cock wrote: > >> On Fri, Aug 9, 2013 at 6:26 PM, João Rodrigues wrote: >>> Dear all, >>> >>> The situation with the occupancy in the PDBParser led me to think of one >>> thing.
>>> >>> Since not everybody is in the same timezone, has the same availability, >>> etc, what about we introduce a brief moratorium over commits of say 3 >>> days (except for critical bug fixes)? This will give everybody probably >>> enough time to read the email and give their opinion. >>> >>> The downside is that it will make things roll a bit slower but then >>> again, 3 days is not so much.. >>> >>> Cheers, >>> >>> João >> >> I don't think that's really needed for small commits like >> this which are simple to interpret. In this case there were >> three opinions in favour of the idea, with a fourth counter >> view appearing later, resulting in a further tweak. >> >> Longer periods of discussion are far more important on >> large code additions or major changes. > > Sorry, but I don't agree that this is a "small commit". It may > not be large in terms of number of bytes, but it is large in > terms of impact, since it affects users' programs in > unpredictable ways. Hello again Morten, I did mean small in terms of the amount of code changed, which I tried to make clear in the rest of the email, but as discussed below, I also think the PDB occupancy change was small in terms of behaviour. > Whenever a change is made that affects values > returned to the user, it is worth spending a few days > discussing it, to let people have a chance to think > through the consequences of the change. Almost any change impacts the user in some way. I still feel this was a minor change (although of course important to some, including you). This is parsing of malformed PDB files where the user ALREADY gets a warning (or error in strict mode, where there would be no functional change) that there is a problem with the occupancy data.
One reason why I specifically talked about small commits (in the sense of a simple diff) above is they are trivial to revert if the need arises, or as in this case, modify: https://github.com/biopython/biopython/commit/500c3c2ea900fd8c8f5123f571d4d9a244ee898e This change was suggested and supported by people who've been actively contributing to the Biopython structural module for some time, so I had reason to trust their good judgement, and as I wrote at the time there was a clear consensus with three people in all happy with the idea: http://lists.open-bio.org/pipermail/biopython-dev/2013-August/010773.html Changes where there isn't clear agreement are generally discussed over a longer time period. Note that Biopython is already relatively strict about not breaking things and preserving backwards compatibility (to the point where it does delay new features). We do care about not breaking existing scripts without warning - so when people speak up on the list that something is likely to cause them trouble, we do listen. Is that any clearer? Regards, Peter From zruan1991 at gmail.com Sun Aug 11 22:04:10 2013 From: zruan1991 at gmail.com (Zheng Ruan) Date: Sun, 11 Aug 2013 18:04:10 -0400 Subject: [Biopython-dev] Codon Alignment GSoC Update Message-ID: Hi all, An update of Codon Alignment Project can be found at (http://zruanweb.com/). In the next week, I will be implementing the Maximum Likelihood method for dN/dS ratio estimation. I do not anticipate to write any code for the optimization and Scipy's functionality is most suitable to be used here. This might be a new dependency for Biopython. Is it okay to add this? Or are there some other functions in Biopython for optimization problems? Thanks! Best, Zheng Ruan From kai.blin at biotech.uni-tuebingen.de Mon Aug 12 10:53:17 2013 From: kai.blin at biotech.uni-tuebingen.de (Kai Blin) Date: Mon, 12 Aug 2013 12:53:17 +0200 Subject: [Biopython-dev] Moratorium on commits? 
In-Reply-To: References: Message-ID: <5208BE9D.1090900@biotech.uni-tuebingen.de> On 2013-08-09 19:26, João Rodrigues wrote: Dear biopython devs, > Since not everybody is in the same timezone, has the same availability, > etc, what about we introduce a brief moratorium over commits of say 3 days > (except for critical bug fixes)? This will give everybody probably enough > time to read the email and give their opinion. I've been through discussions like this before, in a lot of open source projects I'm involved in. I don't think this is a good step to take. Saying that "all patches need to wait unless they're special" will eventually lead to a dilution of what is considered special, and then lead to a point where most patches by core contributors happen to be special and patches by new contributors aren't. Because the policy doesn't explicitly state this, you then create a very unwelcoming atmosphere for the project. I would recommend considering whether avoiding the occasional revert is worth that cost. Personally, one of the things I like about BioPython is how fast I'm able to get bugfixes in. My two cents, Kai -- Dipl.-Inform. Kai Blin kai.blin at biotech.uni-tuebingen.de Institute for Microbiology and Infection Medicine Division of Microbiology/Biotechnology Eberhard-Karls-Universität Tübingen Auf der Morgenstelle 28 Phone : ++49 7071 29-78841 D-72076 Tübingen Fax : ++49 7071 29-5979 Germany Homepage: http://www.mikrobio.uni-tuebingen.de/ag_wohlleben From tiagoantao at gmail.com Mon Aug 12 11:33:40 2013 From: tiagoantao at gmail.com (Tiago Antão) Date: Mon, 12 Aug 2013 12:33:40 +0100 Subject: [Biopython-dev] Moratorium on commits? In-Reply-To: <5208BE9D.1090900@biotech.uni-tuebingen.de> References: <5208BE9D.1090900@biotech.uni-tuebingen.de> Message-ID: Hi, On 12 August 2013 11:53, Kai Blin wrote: > Personally, one of the things I like about BioPython is how fast I'm able > to get bugfixes in.
> > I agree that the light approach to process is great. 99% of the patches are uncontroversial and would suffer from a heavier process. For the rare cases where there are problems, revert can be used. My code has been reverted a couple of times and I am fine with that (when one commits to a public project with shared ownership one should expect peer-review, sometimes heated discussion and corrections - it is normal). If one thinks a change can be problematic, an initial discussion would be a good idea. Of course, sometimes we do not know until after the fact; then again, the good thing about version control is that we can undo things... Generally things have been working very well and I would not change the process to something heavier just because of a single case. Single cases should be sorted on a case-by-case basis, with no stress. My 2p, Tiago From yeyanbo289 at gmail.com Mon Aug 12 13:25:22 2013 From: yeyanbo289 at gmail.com (Yanbo Ye) Date: Mon, 12 Aug 2013 21:25:22 +0800 Subject: [Biopython-dev] GSOC weekly update 8 Message-ID: Hi all, My update about the Biopython.Phylo project can be found here: http://blog.yeyanbo.com/posts/google-summer-of-code-9.html Best, Yanbo -- *Yanbo Ye* *Guangzhou Institutes of Biomedicine and Health,* *Chinese Academy of Sciences* *190 Kaiyuan Avenue, Science Park, Guangzhou, China* *Email: ye_yanbo at gibh.ac.cn* *Web: http://www.yeyanbo.com* *Phone: (86)-020-32093810* From mok at bioxray.dk Mon Aug 12 18:33:26 2013 From: mok at bioxray.dk (Morten Kjeldgaard) Date: Mon, 12 Aug 2013 20:33:26 +0200 Subject: [Biopython-dev] Moratorium on commits? In-Reply-To: References: Message-ID: <677A1A76-6B62-43E4-A54E-695A834D6088@bioxray.dk> On 11/08/2013, at 22:50, Peter Cock wrote: > I still feel this was a minor change (although of > course important to some, including you).
This is > parsing of malformed PDB files where the user > ALREADY gets a warning (or error in strict mode, > where there would be no functional change) that > there is a problem with the occupancy data. > > One reason why I specifically talked about small > commits (in the sense of a simple diff) above is > they are trivial to revert if the need arises, or as > in this case, modify: > https://github.com/biopython/biopython/commit/500c3c2ea900fd8c8f5123f571d4d9a244ee898e > > This change was suggested and supported by > people who've been actively contributing to the > Biopython structural module for some time, so I > had reason to trust their good judgement, and as > I wrote at the time there was a clear consensus > with three people in all happy with the idea: > http://lists.open-bio.org/pipermail/biopython-dev/2013-August/010773.html I respect that you listen more to developers that have been contributing for a long time. That is quite understandable, but I hope that does not prevent me from contributing my opinions. What prompted my response was the suggestion that the occupancy should be set to 1.0 if it is absent from the file, i.e. if the PDB file is malformed. I think that is incorrect behavior, and I say that not as a core developer, but as a crystallographer. If invalid data is present in the file, you do not want the toolkit transforming it to valid data. After thinking about it, the suggestion to set values to None when they are not defined in a malformed file now appears quite reasonable, but if it is done this way with occupancies, it should also be done this way with B-factors, chain identifiers and other values that are mandatory in the file according to the format specs. From the user's perspective, if the values returned are None, you are alerted to the fact that something is wrong, and you should make an appropriate choice, whatever that may be.
Cheers, Morten From arklenna at gmail.com Mon Aug 12 19:25:20 2013 From: arklenna at gmail.com (Lenna Peterson) Date: Mon, 12 Aug 2013 15:25:20 -0400 Subject: [Biopython-dev] Moratorium on commits? In-Reply-To: <677A1A76-6B62-43E4-A54E-695A834D6088@bioxray.dk> References: <677A1A76-6B62-43E4-A54E-695A834D6088@bioxray.dk> Message-ID: On Mon, Aug 12, 2013 at 2:33 PM, Morten Kjeldgaard wrote: > > On 11/08/2013, at 22:50, Peter Cock wrote: > > > I still feel this was a minor change (although of > > course important to some, including you). This is > > parsing of malformed PDB files where the user > > ALREADY gets a warning (or error in strict mode, > > where there would be no functional change) that > > there is a problem with the occupancy data. > > > > One reason why I specifically talked about small > > commits (in the sense of a simple diff) above is > > they are trivial to revert if the need arises, or as > > in this case, modify: > > > https://github.com/biopython/biopython/commit/500c3c2ea900fd8c8f5123f571d4d9a244ee898e > > > > This change was suggested and supported by > > people who've been actively contributing to the > > Biopython structural module for some time, so I > > had reason to trust their good judgement, and as > > I wrote at the time there was a clear consensus > > with three people in all happy with the idea: > > > http://lists.open-bio.org/pipermail/biopython-dev/2013-August/010773.html > > > I respect that you listen more to developers that have been contributing > for a long time. That is quite understandable, but I hope that does not > prevent me from contributing my opinions. > > What prompted my response was the suggestion that the occupancy should be > set to 1.0 if it is absent from the file, i.e. if the PDB file is > malformed. I think that is incorrect behavior, and I say that not as a > core developer, but as a crystallographer. If invalid data is present in > the file, you do not want the toolkit transforming it to valid data.
> I appreciate the physical/practical feedback about the commits. > After thinking about it, the suggestion to set values to None when they are > not defined in a malformed file now appears quite reasonable, but if it is > done this way with occupancies, it should also be done this way with > B-factors, chain identifiers and other values that are mandatory in the > file according to the format specs. From the user's perspective, if the > values returned are None, you are alerted to the fact that something is > wrong, and you should make an appropriate choice, whatever that may be. > > I agree that `None` is a good warning value for missing data. I just skimmed the code and summarized how some of the missing values are handled:
* Serial number: 0
* Chain: fatal in both strict and permissive modes (i.e. no try/except)
* Coordinates: fatal in both strict and permissive modes
* Occupancy: we recently decided to set as None in permissive
* B-factor: 0.0 in permissive (code comment states this is PDB default)
* Model seq id: 0
The StructureBuilder class also has certain ways of handling duplicate residues and atoms that I'm not particularly familiar with. For example, I'm not quite sure what will happen if successive atoms have missing serial numbers. PDB is a format where there's always a balance between absolute adherence to the format and enough flexibility to deal with the wide range of malformed files. Lenna From mok at bioxray.dk Mon Aug 12 19:42:28 2013 From: mok at bioxray.dk (Morten Kjeldgaard) Date: Mon, 12 Aug 2013 21:42:28 +0200 Subject: [Biopython-dev] Moratorium on commits?
In-Reply-To: References: <677A1A76-6B62-43E4-A54E-695A834D6088@bioxray.dk> Message-ID: <0F6D9BF5-BFAA-4118-8D90-936AC44A29FA@bioxray.dk> On 12/08/2013, at 21:25, Lenna Peterson wrote: > * B-factor: 0.0 in permissive (code comment states this is PDB default) The default referred to in that code comment is what the PDB annotators put in that field if the information is not provided by the depositor (which could be the case for e.g. an NMR model). From the PDB Atomic Coordinate Entry Format Description, Version 3.30: * If the depositor provides the data, then the isotropic B value is given for the temperature factor. * If there are neither isotropic B values from the depositor, nor anisotropic temperature factors in ANISOU, then the default value of 0.0 is used for the temperature factor. In other words, the PDB format specification has no recommendations for what default values should be used if the field is blank in a malformed file, only what their staff should put in the entry when they receive it from the depositor. So IMO Biopython is free to use None if the B-value is missing in a malformed file. (I haven't checked the other items that Lenna mentions.) Cheers, Morten From anaryin at gmail.com Mon Aug 12 19:51:03 2013 From: anaryin at gmail.com (João Rodrigues) Date: Mon, 12 Aug 2013 12:51:03 -0700 Subject: [Biopython-dev] Bio.PDB - Missing values (was Moratorium on commits?) Message-ID: Hi all, Moving to a new thread because this is a very specific issue. I think that, from a programming point of view (but I'm a biologist so correct me if I'm wrong) having None values upon parsing is probably a better idea. Then, when writing, these should be translated to whatever default there is in the PDB documentation. Cheers, João
From p.j.a.cock at googlemail.com Mon Aug 12 20:36:15 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Mon, 12 Aug 2013 21:36:15 +0100 Subject: [Biopython-dev] Bio.PDB - Missing values (was Moratorium on commits?) In-Reply-To: References: Message-ID: On Monday, August 12, 2013, João Rodrigues wrote: > Hi all, > > Moving to a new thread because this is a very specific issue. > > I think that, from a programming point of view (but I'm a biologist so > correct me if I'm wrong) having None values upon parsing is probably a > better idea. Then, when writing, these should be translated to whatever > default there is in the PDB documentation. > Or throw an error to force the user to fix it? Or write a blank occupancy to allow preservation of the (flawed) input? (Thank you for raising the output question now, it is a logical consequence of putting None in the parsed structure) Peter From anaryin at gmail.com Mon Aug 12 20:39:30 2013 From: anaryin at gmail.com (João Rodrigues) Date: Mon, 12 Aug 2013 13:39:30 -0700 Subject: [Biopython-dev] Bio.PDB - Missing values (was Moratorium on commits?) In-Reply-To: References: Message-ID: Throwing an error might not be a good idea because when dealing with models they sometimes have missing fields... then we'd have to fix them all somehow before parsing them. The None value seems a good indicator that something is amiss, while not putting any value there. There should also be a warning upon writing that the value is being replaced by a default value. Blank is also good actually, maybe we could add an option to the writer/parser to "preserve" values?
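The two writing options raised in this thread (substitute a default with a warning, or preserve the gap as a blank field) could be sketched roughly as follows. This is only an illustration under stated assumptions: `format_occupancy` and its `default` parameter are hypothetical names, not part of Bio.PDB's writer, and the only format detail relied on is that the occupancy field of an ATOM record is six columns wide with two decimals.

```python
import warnings

OCCUPANCY_WIDTH = 6  # occupancy occupies columns 55-60 of an ATOM record


def format_occupancy(occupancy, default=None):
    """Return the six-character occupancy field for an ATOM record.

    None is written as a blank field, preserving the gap, unless a
    default is supplied; either way a warning tells the user about it.
    """
    if occupancy is not None:
        return "%6.2f" % occupancy
    if default is None:
        warnings.warn("Missing occupancy written as a blank field.")
        return " " * OCCUPANCY_WIDTH
    warnings.warn("Missing occupancy replaced by default %.2f." % default)
    return "%6.2f" % default
```

For example, `format_occupancy(None)` yields six spaces, so the string "None" never ends up in the fixed-width record, while `format_occupancy(None, default=1.0)` writes an explicit value chosen by the caller.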
Cheers, João 2013/8/12 Peter Cock > > > On Monday, August 12, 2013, João Rodrigues wrote: > >> Hi all, >> >> Moving to a new thread because this is a very specific issue. >> >> I think that, from a programming point of view (but I'm a biologist so >> correct me if I'm wrong) having None values upon parsing is probably a >> better idea. Then, when writing, these should be translated to whatever >> default there is in the PDB documentation. >> > > Or throw an error to force the user to fix it? > > Or write a blank occupancy to allow preservation of the > (flawed) input? > > (Thank you for raising the output question now, it is a logical > consequence of putting None in the parsed structure) > > Peter > > From p.j.a.cock at googlemail.com Mon Aug 12 20:40:24 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Mon, 12 Aug 2013 21:40:24 +0100 Subject: [Biopython-dev] Moratorium on commits? In-Reply-To: <677A1A76-6B62-43E4-A54E-695A834D6088@bioxray.dk> References: <677A1A76-6B62-43E4-A54E-695A834D6088@bioxray.dk> Message-ID: On Monday, August 12, 2013, Morten Kjeldgaard wrote: > > On 11/08/2013, at 22:50, Peter Cock > > wrote: > > > I still feel this was a minor change (although of > > course important to some, including you). This is > > parsing of malformed PDB files where the user > > ALREADY gets a warning (or error in strict mode, > > where there would be no functional change) that > > there is a problem with the occupancy data.
> > > > One reason why I specifically talked about small > > commits (in the sense of a simple diff) above is > > they are trivial to revert if the need arises, or as > > in this case, modify: > > > https://github.com/biopython/biopython/commit/500c3c2ea900fd8c8f5123f571d4d9a244ee898e > > > > This change was suggested and supported by > > people who've been actively contributing to the > > Biopython structural module for some time, so I > > had reason to trust their good judgement, and as > > I wrote at the time there was a clear consensus > > with three people in all happy with the idea: > > > http://lists.open-bio.org/pipermail/biopython-dev/2013-August/010773.html > > > I respect that you listen more to developers that have been contributing for a long time. That is quite understandable, but I hope that does not prevent me from contributing my opinions. Of course not - your input (which was after the initial change) has already resulted in a review of that change and the adoption of None instead. So thank you for speaking up, Peter From eric.talevich at gmail.com Mon Aug 12 22:35:05 2013 From: eric.talevich at gmail.com (Eric Talevich) Date: Mon, 12 Aug 2013 15:35:05 -0700 Subject: [Biopython-dev] Codon Alignment GSoC Update In-Reply-To: References: Message-ID: Hi Zheng, Nice work this week. For the next tasks: 1. It's probably not a high priority to implement all of the dN/dS approaches described in Yang's book (i.e. LWL85m, LPB93, Ina95), beyond the simple early methods (NG86, LWL85) and the finale, YN00. If you get around to doing them all, cool, but if you only have time to do one more I'd pick YN00. 2. SciPy is a relatively large dependency, so I recommend making it a runtime import -- do the import from within the function that needs it, rather than at the top-level scope of the module. E.g.: Bio.Phylo._utils.to_networkx 3. Where are you focusing your documentation efforts? 
If you're keeping most of the descriptions in the docstrings, it would be convenient to format the text as reStructuredText for processing with Epydoc and Sphinx. Time permitting, it would also be nice to have a chapter on this work in the Tutorial, see Doc/Tutorial.tex (also fine to write this up as a separate LaTeX document first and roll it in later). Cheers, Eric On Sun, Aug 11, 2013 at 3:04 PM, Zheng Ruan wrote: > Hi all, > > An update of Codon Alignment Project can be found at (http://zruanweb.com/). > In the next week, I will be implementing the Maximum Likelihood method for > dN/dS ratio estimation. I do not anticipate to write any code for the > optimization and Scipy's functionality is most suitable to be used here. > This might be a new dependency for Biopython. Is it okay to add this? Or > are there some other functions in Biopython for optimization problems? > Thanks! > > Best, > Zheng Ruan > From eric.talevich at gmail.com Mon Aug 12 23:03:07 2013 From: eric.talevich at gmail.com (Eric Talevich) Date: Mon, 12 Aug 2013 16:03:07 -0700 Subject: [Biopython-dev] GSOC weekly update 8 In-Reply-To: References: Message-ID: Hi Yanbo, Looks like excellent progress. At some point, would you mind documenting how the bit array operations are used to represent trees, e.g. how a bit array (BitString instance) should be interpreted in terms of taxa and tree topologies? 
Thanks, Eric On Mon, Aug 12, 2013 at 6:25 AM, Yanbo Ye wrote: > Hi all, > > My update about Biopython.Phylo project can be found here: > http://blog.yeyanbo.com/posts/google-summer-of-code-9.html > > Best, > Yanbo > > -- > > *Yanbo Ye* > *Guangzhou Institutes of Biomedicine and Health, * > *Chinese Academy of Sciences* > *190 Kaiyuan Avenue, Science Park, Guangzhou, China** > * > * > * > *Email: ye_yanbo at gibh.ac.cn* > *Web: http://www.yeyanbo.com* > *Phone: (86)-020-32093810* > From p.j.a.cock at googlemail.com Wed Aug 14 09:44:24 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 14 Aug 2013 10:44:24 +0100 Subject: [Biopython-dev] setuptools breaking biopython-1.62b installation In-Reply-To: <1374797351.81889.YahooMailNeo@web164002.mail.gq1.yahoo.com> References: <1374651068.98742.YahooMailNeo@web164005.mail.gq1.yahoo.com> <86a9lcl1nt.fsf@fastmail.fm> <1374797351.81889.YahooMailNeo@web164002.mail.gq1.yahoo.com> Message-ID: On Friday, July 26, 2013 Peter wrote: > On Wed, Jul 24, 2013 Peter Cock wrote: >> On Wed, Jul 24, 2013 Brad Chapman wrote: >>> >>> Peter and Michiel; >>> >>>>> Do we actually need setuptools? >>>>> Looking at setup.py, it seems that distutils is sufficient for our >>>>> needs. >>>>> If so, let's remove the dependency on setuptools. >>> >>> We used setuptools/distribute to install dependencies, although >>> practically this doesn't work well since pip doesn't finish NumPy >>> installation before installing Biopython. So I'm fine with taking it out >>> if you want to simplify the setup and avoid the extra dependency. >> >> Sounds like a plan - but we should all test this change, especially >> users of PIP, easy_install, virtual env etc. >> > > So who's going to do the commit - Brad or Michiel? > > Peter > On Fri, Jul 26, 2013 at 1:09 AM, Michiel de Hoon wrote: > Brad, can you do it? > Best, > -Michiel. 
I've done it: https://github.com/biopython/biopython/commit/f8e51906709d0c85be9f2b921eb3f68eed5524f9 This needs some more testing now - particularly with the non-standard install options like pip, easy_install, etc. Peter From p.j.a.cock at googlemail.com Thu Aug 15 11:28:47 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Thu, 15 Aug 2013 12:28:47 +0100 Subject: [Biopython-dev] Releasing Biopython 1.62 next week? Message-ID: Hello all, Are there any remaining issues people think need to be resolved prior to releasing Biopython 1.62? If not, unless anyone else volunteers, I will make time for this next week. Possible issues worth reviewing - please reply on the existing threads: Changes to setup.py to remove use of setuptools, this would benefit from wider testing: https://github.com/biopython/biopython/commit/f8e51906709d0c85be9f2b921eb3f68eed5524f9 http://lists.open-bio.org/pipermail/biopython-dev/2013-August/010806.html Changes to PDB occupancy, do we need to change PDB writing in light of this? http://lists.open-bio.org/pipermail/biopython-dev/2013-August/010802.html Update the Prank tool test to work with recent versions: http://lists.open-bio.org/pipermail/biopython-dev/2013-August/010757.html Note that PyPy now has a beta out supporting Python 3, it would be nice to fully test with that as well... http://morepypy.blogspot.co.uk/2013/07/pypy3-21-beta-1.html Thanks, Peter From arklenna at gmail.com Thu Aug 15 13:18:35 2013 From: arklenna at gmail.com (Lenna Peterson) Date: Thu, 15 Aug 2013 09:18:35 -0400 Subject: [Biopython-dev] Bio.PDB - Missing values (was Moratorium on commits?) In-Reply-To: References: Message-ID: On Monday, 12 August 2013, João Rodrigues wrote: > Throwing an error might not be a good idea because when dealing with models > they sometimes have missing fields... then we'd have to fix them all > somehow before parsing them. > > The None value seems a good indicator that something is amiss, while not > putting any value there.
There should also be a warning upon writing that > the value is being replaced by a default value. Blank is also good > actually, maybe we could add an option to the writer/parser to "preserve" > values? > > I don't think writing string "None" into a fixed width field would be a good idea. So it's probably best to change occupancy (and any other missing values set to None) to blank, correct width fields for writing. I've never tangled with the writer and I have incoming PhD students this week but I can attempt to add this functionality early next week. Lenna From p.j.a.cock at googlemail.com Thu Aug 15 13:23:50 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Thu, 15 Aug 2013 14:23:50 +0100 Subject: [Biopython-dev] Bio.PDB - Missing values (was Moratorium on commits?) In-Reply-To: References: Message-ID: On Thu, Aug 15, 2013 at 2:18 PM, Lenna Peterson wrote: > On Monday, 12 August 2013, João Rodrigues wrote: >> >> Throwing an error might not be a good idea because when dealing with >> models >> they sometimes have missing fields... then we'd have to fix them all >> somehow before parsing them. >> >> The None value seems a good indicator that something is amiss, while not >> putting any value there. There should also be a warning upon writing that >> the value is being replaced by a default value. Blank is also good >> actually, maybe we could add an option to the writer/parser to "preserve" >> values? >> > > I don't think writing string "None" into a fixed width field would be a good > idea. So it's probably best to change occupancy (and any other missing > values set to None) to blank, correct width fields for writing. I didn't mean to suggest writing the string "None" in the field, and I'm not sure if João did - it would certainly be an invalid PDB file. I agree that where the data structure has None (e.g. from our parser) then the writer could use a blank string (of the appropriate width).
For mandatory fields like occupancy, this should give a warning. > I've never tangled with the writer and I have incoming PhD students this > week but I can attempt to add this functionality early next week. That would be great (assuming no-one else wants to tackle it sooner). Thanks, Peter From arklenna at gmail.com Thu Aug 15 14:54:53 2013 From: arklenna at gmail.com (Lenna Peterson) Date: Thu, 15 Aug 2013 10:54:53 -0400 Subject: [Biopython-dev] Bio.PDB - Missing values (was Moratorium on commits?) In-Reply-To: References: Message-ID: > > I don't think writing string "None" into a fixed width field would be a > good > > idea. So it's probably best to change occupancy (and any other missing > > values set to None) to blank, correct width fields for writing. > > I didn't mean to suggest writing the string "None" in the field, and > I'm not sure if João did - it would certainly be an invalid PDB file. > > I didn't mean anyone was suggesting we intentionally do this, but I bet that's what the writer is doing now! From eric.talevich at gmail.com Thu Aug 15 17:35:00 2013 From: eric.talevich at gmail.com (Eric Talevich) Date: Thu, 15 Aug 2013 10:35:00 -0700 Subject: [Biopython-dev] 1.62b test coverage report In-Reply-To: References: Message-ID: On Fri, Aug 2, 2013 at 2:31 AM, Peter Cock wrote: > Thanks for these details Ben - it sounds like a mixture of real > test failures, and mere warnings that an external tool wasn't > found. > > On Fri, Aug 2, 2013 at 3:20 AM, Ben Fulton wrote: > > My test machine was running Ubuntu 12.04. > > > > For fasttree I installed version 2.1.4-1~ubuntu12.04.1 using apt-get, and > > got this error: > > ApplicationError: Command 'fasttree -out temp_test.tree > > Quality/example.fasta' returned non-zero exit status 1, 'Unknown or > > incorrect use of option -out' > > I don't seem to have fasttree installed at all, and from the > test and wrapper it is not explicit about which version it > was originally written for.
> I pushed a patch to not use the potentially problematic '-out' flag: https://github.com/biopython/biopython/commit/771c1ed23bbb39dcf37805b4cb7bb23ffcb0c46a According to FastTree's changelog ( http://www.microbesonline.org/fasttree/ChangeLog), the -out option was added in version 2.1.5, released August 30, 2012. So the 'fasttree' package on the stable Ubuntu (12.04) does not have the -out flag, but the package in subsequent Ubuntus and other Debian derivatives does. -Eric From eric.talevich at gmail.com Thu Aug 15 23:44:38 2013 From: eric.talevich at gmail.com (Eric Talevich) Date: Thu, 15 Aug 2013 16:44:38 -0700 Subject: [Biopython-dev] 1.62b test coverage report In-Reply-To: References: Message-ID: On Fri, Aug 2, 2013 at 2:31 AM, Peter Cock wrote: > Thanks for these details Ben - it sounds like a mixture of real > test failures, and mere warnings that an external tool wasn't > found. > > On Fri, Aug 2, 2013 at 3:20 AM, Ben Fulton wrote: > > My test machine was running Ubuntu 12.04. > [...] > > I downloaded version 130708 of Prank from > > http://code.google.com/p/prank-msa/downloads/list. The error is on line > 165 > > of the test file: > > > > AssertionError: > > ----------------- > > PRANK v.130708: > > ----------------- > > > > Input for the analysis > > - converting 'Quality/example.fasta' to 'temp with space.phy' > > This sounds like a minor change in the stdout with recent > versions of PRANK. > > It's more exciting than that: Old versions of Prank created .xml and .dnd files by default, and had "-noxml" and "-notree" options to avoid creating them (or clean them up, whichever). New Pranks do not create these files by default, but do have "-showxml" and "-showtree" flags if you want them. I removed the use of these flags in the unit test. One of the tests used the set_parameter method, so I substituted the "-dots" flag for "-notree". 
It passes on my machine now: https://github.com/biopython/biopython/commit/30d7bcfb6eab8283a53372b2ad64b59be7461eb3 The doctests in Bio/Align/Applications/_Prank.py should probably change, too, since the same flags are used there. (I have not done this.) -Eric From w.arindrarto at gmail.com Fri Aug 16 07:14:24 2013 From: w.arindrarto at gmail.com (Wibowo Arindrarto) Date: Fri, 16 Aug 2013 09:14:24 +0200 Subject: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week? In-Reply-To: <1373712847.72527.YahooMailNeo@web164004.mail.gq1.yahoo.com> References: <1373620014.53407.YahooMailNeo@web164001.mail.gq1.yahoo.com> <1373680350.55044.YahooMailNeo@web164003.mail.gq1.yahoo.com> <1373712847.72527.YahooMailNeo@web164004.mail.gq1.yahoo.com> Message-ID: Hi Michiel, Peter, In preparation for the 1.62 release, I've made the following changes to Bio.NCBIStandalone and Bio.ParserSupport: * Migrated the two modules under Bio.SearchIO._legacy * Upgraded their PendingDeprecationWarning to BiopythonDeprecationWarning I've pushed the changes to this branch: https://github.com/bow/biopython/tree/bio_blast_migrate Tests seem to be running fine still, but now there is the awkward situation where if users import Bio.NCBIStandalone and/or Bio.ParserSupport directly they will be greeted with two warnings: the BiopythonWarning for the modules' deprecation and the BiopythonExperimentalWarning for SearchIO. We could suppress the SearchIO warning in Bio.NCBIStandalone and Bio.ParserSupport. But before this is done, I was wondering if we have a defined timeline for removing a BiopythonExperimentalWarning? (i.e. if it will be removed in this release, then we could do that instead). Any opinions on this :)? Cheers, Bow On Sat, Jul 13, 2013 at 12:54 PM, Michiel de Hoon wrote: > Hi Bow, > > >> Would it be ok if we move parts that are used by SearchIO into their own >> private classes in >> Bio.SearchIO, while putting the BiopythonDeprecationWarning on the current >> files? 
> > That sounds fine to me. Any other opinions, anybody? > > Best, > -Michiel. > > ________________________________ > From: Wibowo Arindrarto > To: Michiel de Hoon > Cc: Peter Cock ; Eric Talevich > ; Zheng Ruan ; Biopython-Dev > Mailing List > Sent: Saturday, July 13, 2013 3:58 PM > Subject: Re: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week? > > Hi Michiel, > > There are two classes from Bio.Blast.NCBIStandalone still being used > by Bio.SearchIO internally (for the BLAST text parser): the > BlastParser and the Iterator classes. The BlastParser class itself > still relies on Bio.ParserSupport. Would it be ok if we move parts > that are used by SearchIO into their own private classes in > Bio.SearchIO, while putting the BiopythonDeprecationWarning on the > current files? > > Best regards, > Bow > > On Sat, Jul 13, 2013 at 3:52 AM, Michiel de Hoon > wrote: >> The following pieces of code had a PendingDeprecationWarning in Biopython >> release 1.61, and can be upgraded to a BiopythonDeprecationWarning: >> >> Bio.Blast.NCBIStandalone (entire module). This module has had a >> PendingDeprecationWarning since September 2010. >> >> Bio.Motif (entire module). Its functionality is available from Bio.motifs, >> so Bio.Motif can be deprecated. >> >> Bio.ParserSupport (entire module). This module is currently only being >> used by Bio.Blast.NCBIStandalone, and has had a PendingDeprecationWarning >> since September 2011. >> >> Any final objections? 
>> >> Best, >> -Michiel >> _______________________________________________ >> Biopython-dev mailing list >> Biopython-dev at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/biopython-dev > > From p.j.a.cock at googlemail.com Fri Aug 16 09:31:13 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Fri, 16 Aug 2013 10:31:13 +0100 Subject: [Biopython-dev] 1.62b test coverage report In-Reply-To: References: Message-ID: On Fri, Aug 16, 2013 at 12:44 AM, Eric Talevich wrote: > On Fri, Aug 2, 2013 at 2:31 AM, Peter Cock wrote: >> On Fri, Aug 2, 2013 at 3:20 AM, Ben Fulton wrote: >> > I downloaded version 130708 of Prank from >> > http://code.google.com/p/prank-msa/downloads/list. >> > The error is on line 165 of the test file: >> > >> > AssertionError: >> > ----------------- >> > PRANK v.130708: >> > ----------------- >> > >> > Input for the analysis >> > - converting 'Quality/example.fasta' to 'temp with space.phy' >> >> This sounds like a minor change in the stdout with recent >> versions of PRANK. >> > > It's more exciting than that: Old versions of Prank created .xml and .dnd > files by default, and had "-noxml" and "-notree" options to avoid creating > them (or clean them up, whichever). New Pranks do not create these files by > default, but do have "-showxml" and "-showtree" flags if you want them. Well that API break is a bit annoying, but your test changes make sense. Do we need to add these new switches to the wrapper itself? 
Peter From eric.talevich at gmail.com Sun Aug 18 18:14:13 2013 From: eric.talevich at gmail.com (Eric Talevich) Date: Sun, 18 Aug 2013 11:14:13 -0700 Subject: [Biopython-dev] 1.62b test coverage report In-Reply-To: References: Message-ID: On Fri, Aug 16, 2013 at 2:31 AM, Peter Cock wrote: > On Fri, Aug 16, 2013 at 12:44 AM, Eric Talevich wrote: > > On Fri, Aug 2, 2013 at 2:31 AM, Peter Cock wrote: > >> On Fri, Aug 2, 2013 at 3:20 AM, Ben Fulton wrote: > >> > I downloaded version 130708 of Prank from > >> > http://code.google.com/p/prank-msa/downloads/list. > >> > The error is on line 165 of the test file: > >> > > >> > AssertionError: > >> > ----------------- > >> > PRANK v.130708: > >> > ----------------- > >> > > >> > Input for the analysis > >> > - converting 'Quality/example.fasta' to 'temp with space.phy' > >> > >> This sounds like a minor change in the stdout with recent > >> versions of PRANK. > >> > > > > It's more exciting than that: Old versions of Prank created .xml and .dnd > > files by default, and had "-noxml" and "-notree" options to avoid > creating > > them (or clean them up, whichever). New Pranks do not create these files > by > > default, but do have "-showxml" and "-showtree" flags if you want them. > > Well that API break is a bit annoying, but your test changes make sense. > > Do we need to add these new switches to the wrapper itself? > Here's the commit to add those switches to the wrapper: https://github.com/biopython/biopython/commit/cc234b75e6e82cf9f51e3384a4fbfa1e888a3af1 I suppose it would be helpful if the wrapper detected the version of Prank and handled the show(tree|xml) flags appropriately to avoid errors. But that would require running the executable first, I think, which is not something our wrappers normally do. (And then it would make sense to cache the result for the duration of the running process.) 
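A minimal sketch of that idea, for discussion only. It leans on the "PRANK v.130708" banner from Ben's output; the "-version" argument, the helper names, and the choice of functools.lru_cache for the per-process cache are all assumptions for illustration, not Prank's documented interface and not code from the Biopython wrapper:

```python
import functools
import re
import subprocess


def parse_prank_version(output):
    """Return the integer following 'PRANK v.' in a banner, or None."""
    match = re.search(r"PRANK v\.(\d+)", output)
    return int(match.group(1)) if match else None


@functools.lru_cache(maxsize=None)
def detect_prank_version(executable="prank"):
    """Run the executable once per process and cache the parsed version.

    Assumption: invoking the tool with "-version" prints the banner.
    The flag is a guess and may need changing for a real installation.
    """
    result = subprocess.run(
        [executable, "-version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return parse_prank_version(result.stdout)
```

Repeated calls with the same executable name would reuse the cached value rather than launching the tool again, which is the "cache for the duration of the running process" part. Which version number should switch between the no(tree|xml) and show(tree|xml) flags is deliberately left out, since I don't know where that cutoff is.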
-Eric From p.j.a.cock at googlemail.com Sun Aug 18 18:39:08 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Sun, 18 Aug 2013 19:39:08 +0100 Subject: [Biopython-dev] 1.62b test coverage report In-Reply-To: References: Message-ID: On Sun, Aug 18, 2013 at 7:14 PM, Eric Talevich wrote: > On Fri, Aug 16, 2013 at 2:31 AM, Peter Cock wrote: >> >> Well that API break is a bit annoying, but your test changes make sense. >> >> Do we need to add these new switches to the wrapper itself? > > > Here's the commit to add those switches to the wrapper: > https://github.com/biopython/biopython/commit/cc234b75e6e82cf9f51e3384a4fbfa1e888a3af1 > > I suppose it would be helpful if the wrapper detected the version of Prank > and handled the show(tree|xml) flags appropriately to avoid errors. But that > would require running the executable first, I think, which is not something > our wrappers normally do. (And then it would make sense to cache the result > for the duration of the running process.) > > -Eric Historically we've just documented this kind of issue in the parameter docstring - the idea of auto-running the tool in the background to check the version just sounds like Trouble. 
Peter From yeyanbo289 at gmail.com Mon Aug 19 07:36:00 2013 From: yeyanbo289 at gmail.com (Yanbo Ye) Date: Mon, 19 Aug 2013 15:36:00 +0800 Subject: [Biopython-dev] GSOC weekly update 10 Message-ID: Hi all, Biopython.Phylo project update of last week is here: http://blog.yeyanbo.com/posts/google-summer-of-code-10.html Thanks, Yanbo -- *Yanbo Ye* *Guangzhou Institutes of Biomedicine and Health, * *Chinese Academy of Sciences* *190 Kaiyuan Avenue, Science Park, Guangzhou, China** * * * *Email: ye_yanbo at gibh.ac.cn* *Web: http://www.yeyanbo.com* *Phone: (86)-020-32093810* From zruan1991 at gmail.com Mon Aug 19 15:06:05 2013 From: zruan1991 at gmail.com (Zheng Ruan) Date: Mon, 19 Aug 2013 11:06:05 -0400 Subject: [Biopython-dev] Codon Alignment GSoC Weekly Update Message-ID: Hi all, An update of CodonAlignment GSoC can be found at (http://zruanweb.com/). Thanks for your comments and suggestions. Best, Zheng Ruan From michael.maher at ucsf.edu Mon Aug 19 19:24:04 2013 From: michael.maher at ucsf.edu (Cyrus Maher) Date: Mon, 19 Aug 2013 12:24:04 -0700 Subject: [Biopython-dev] Fwd: New Biopython (sub)module? In-Reply-To: References: Message-ID: Hi everybody!!- My name is (Michael) Cyrus Maher, and I'm a PhD student at UCSF in the lab of Dr. Ryan D. Hernandez (http://bts.ucsf.edu/hernandez_lab/)... I am writing because I'm interested in submitting a new Biopython module. Since this is likely a one-time event, the wiki recommends proceeding through a developer. After speaking with Peter Cock, he recommended that I open things up for discussion on the mailing list. Attached is a draft that describes a new method, termed MOSAIC, which integrates multiple sequence alignments from an arbitrary number of sources. We show that it greatly increases the number of orthologs that we are able to detect while maintaining or improving functional-, phylogenetic-, and sequence identity-based measures of ortholog quality.
Code and documentation may be found here: https://dl.dropboxusercontent.com/u/43327584/html/index.html Looking forward to hearing what you think! Best, -Cyrus -------------- next part -------------- A non-text attachment was scrubbed... Name: OD_fullpaper_8_5_13.docx Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document Size: 1666812 bytes Desc: not available URL: From davidjosephcain at gmail.com Mon Aug 19 21:18:48 2013 From: davidjosephcain at gmail.com (David Cain) Date: Mon, 19 Aug 2013 17:18:48 -0400 Subject: [Biopython-dev] Fwd: New Biopython (sub)module? In-Reply-To: References: Message-ID: Hi, Cyrus! Before the constructive criticism, I just wanted to say your module looks excellent and thank you for opening it up as free software! I'm by no means a developer (just interested in Biopython's development), but I noticed your code generally doesn't adhere to PEP8. If you're interested in getting feedback from others, it's quite valuable to format your code by the standards. (Proper PEP 8 code has a look and feel that's easier for the trained eye to view). Key things that detract from your module's readability: - CamelCase method, module, and field names (when a Python developer sees these, they're prone to assuming the name is for a class). Of course, Biopython doesn't provide the best example here, but there are reasons for that (it'll be fixed eventually). All-caps names are either refrained from use, or used for constants (i.e. you may wish to rename your module `mosaic`). - Very long line wrapping - you should really try to keep your lines to 79 characters - Using integers as booleans (you should stick to True/False, e.g. `while True` in lieu of `while 1`) - module renamings: it's much easier to see `random.shuffle` over `r.shuffle`, as one can assume `random` is the standard module, whereas `r` might be completely different. 
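To make the points above concrete, here is a small made-up example (the function and constant names are invented for illustration and are not taken from MOSAIC) written the way PEP 8 would suggest:

```python
import random

MAX_ATTEMPTS = 100  # all-caps names are kept for constants


def pick_distinct_pair(sequence_ids, rng=random):
    """Return two different IDs chosen at random from sequence_ids."""
    if len(set(sequence_ids)) < 2:
        raise ValueError("need at least two distinct IDs")
    shuffled = list(sequence_ids)
    attempts = 0
    while True:  # a real boolean instead of `while 1`
        attempts += 1
        if attempts > MAX_ATTEMPTS:
            raise RuntimeError("no distinct pair found")
        # The module name is spelled out, so the reader knows this is
        # the standard library's shuffle and not some local helper.
        rng.shuffle(shuffled)
        first, second = shuffled[0], shuffled[1]
        if first != second:
            return first, second
```

Every line stays under 79 characters, the function name is lowercase_with_underscores, and `random.shuffle` is recognisable at a glance.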
Also, your module should definitely remove usage of pdb if you wish to publish it as part of an official Python package. Would you be open to hosting a development branch of your code on GitHub or a similar community-editable resource? Any acceptance to the official Biopython distribution would of course be up to the main devs, but I'd be more than happy to test your code and make suggestions, regardless of its integration to a third-party package. David From christian at brueffer.de Tue Aug 20 11:36:09 2013 From: christian at brueffer.de (Christian Brueffer) Date: Tue, 20 Aug 2013 13:36:09 +0200 Subject: [Biopython-dev] Fwd: New Biopython (sub)module? In-Reply-To: References: Message-ID: <521354A9.6020701@brueffer.de> On 8/19/13 21:24 , Cyrus Maher wrote: > Hi everybody!!- > > My name is (Michael) Cyrus Maher, and I'm a PhD student at UCSF in the lab > of Dr. Ryan D. Hernandez (http://bts.ucsf.edu/hernandez_lab/)... > > I am writing because I'm interested in submitting a new Biopython module. > Since this is likely a one-time event, the wiki recommends proceeding > through a developer. After speaking with Peter Cock, he recommended that I > open things up for discussion on the mailing list. > > Attached is a draft that describes a new method, termed MOSAIC, which > integrates multiple sequence alignments from an arbitrary number number of > sources. We show that it greatly increases the number of orthologs that we > are able to detect while maintaining or improving functional-, > phylogenetic-, and sequence identity-based measures of ortholog quality. > > Code and documentation may be found here: > > https://dl.dropboxusercontent.com/u/43327584/html/index.html > > Looking forward to hearing what you think! > Hi Cyrus, I agree with David on the PEP8 issue. A very nice tool to use is the pep8 checker, https://pypi.python.org/pypi/pep8 I see that you use MSAProbs. I have an MSAProbs application wrapper in the works. 
I haven't submitted it yet due to incomplete unit tests, but maybe it's useful to you: https://github.com/cbrueffer/biopython/tree/msaprobs Cheers, Chris From michael.maher at ucsf.edu Tue Aug 20 18:24:43 2013 From: michael.maher at ucsf.edu (Cyrus Maher) Date: Tue, 20 Aug 2013 11:24:43 -0700 Subject: [Biopython-dev] Fwd: New Biopython (sub)module? In-Reply-To: <521354A9.6020701@brueffer.de> References: <521354A9.6020701@brueffer.de> Message-ID: Thanks for your feedback, guys!! I did a bit of general clean-up and I've made all the recommended PEP8 changes, with the exception that I kept capital letters if they were part of an acronym. I've also switched the link in the documentation over to github and configured mosaic to use the MSAProbs application wrapper if it's installed. Let me know what you think!! Docs: https://dl.dropboxusercontent.com/u/43327584/html/index.html Code: https://github.com/cyrusmaher/mosaic Cheers, -Cyrus On Tue, Aug 20, 2013 at 4:36 AM, Christian Brueffer wrote: > On 8/19/13 21:24 , Cyrus Maher wrote: > > Hi everybody!!- > > > > My name is (Michael) Cyrus Maher, and I'm a PhD student at UCSF in the > lab > > of Dr. Ryan D. Hernandez (http://bts.ucsf.edu/hernandez_lab/)... > > > > I am writing because I'm interested in submitting a new Biopython module. > > Since this is likely a one-time event, the wiki recommends proceeding > > through a developer. After speaking with Peter Cock, he recommended that > I > > open things up for discussion on the mailing list. > > > > Attached is a draft that describes a new method, termed MOSAIC, which > > integrates multiple sequence alignments from an arbitrary number number > of > > sources. We show that it greatly increases the number of orthologs that > we > > are able to detect while maintaining or improving functional-, > > phylogenetic-, and sequence identity-based measures of ortholog quality. 
> > > > Code and documentation may be found here: > > > > https://dl.dropboxusercontent.com/u/43327584/html/index.html > > > > Looking forward to hearing what you think! > > > > Hi Cyrus, > > I agree with David on the PEP8 issue. A very nice tool to use is the > pep8 checker, https://pypi.python.org/pypi/pep8 > > I see that you use MSAProbs. I have an MSAProbs application wrapper in > the works. I haven't submitted it yet due to incomplete unit tests, > but maybe it's useful to you: > > https://github.com/cbrueffer/biopython/tree/msaprobs > > Cheers, > > Chris > > > _______________________________________________ > Biopython-dev mailing list > Biopython-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython-dev > From mok at bioxray.dk Tue Aug 20 18:35:14 2013 From: mok at bioxray.dk (Morten Kjeldgaard) Date: Tue, 20 Aug 2013 20:35:14 +0200 Subject: [Biopython-dev] Bio.PDB - Missing values (was Moratorium on commits?) In-Reply-To: References: Message-ID: <43FD0A6C-ED54-4861-AADA-9F3E8FB6172A@bioxray.dk> On 15/08/2013, at 16:54, Lenna Peterson wrote: >>> I don't think writing string "None" into a fixed width field would be a >> good >>> idea. So it's probably best to change occupancy (and any other missing >>> values set to None) to blank, correct width fields for writing. >> >> I didn't mean to suggest writing the string "None" in the field, and >> I'm not sure if Jo?o did - it would certainly be an invalid PDB file. >> >> > I didn't mean anyone was suggesting we intentionally do this, but I bet > that's what the writer is doing now! I think the output should be identical to the input if a PDB file is read and then written again (apart from the fact that Bio.PDB currently doesn't save all headers.) Cheers, Morten From davidjosephcain at gmail.com Tue Aug 20 21:25:07 2013 From: davidjosephcain at gmail.com (David Cain) Date: Tue, 20 Aug 2013 17:25:07 -0400 Subject: [Biopython-dev] Fwd: New Biopython (sub)module? 
In-Reply-To: References: <521354A9.6020701@brueffer.de> Message-ID: Hi, Cyrus - I took a quick look at your code on GitHub. Did you publish a different version of MOSAIC? By my linter, there are 309 PEP8 errors on mosaic.py. Also, as a general comment, your code seems to rely on sys.exit extensively. Python's exception framework is pretty handy - maybe your module could raise its own custom exceptions (Biopython's PDB parser is a good example of this design strategy). David Cain +1 (339) 222 4452 On Tue, Aug 20, 2013 at 2:24 PM, Cyrus Maher wrote: > Thanks for your feedback, guys!! I did a bit of general clean-up and I've > made all the recommended PEP8 changes, with the exception that I kept > capital letters if they were part of an acronym. I've also switched the > link in the documentation over to github and configured mosaic to use the > MSAProbs application wrapper if it's installed. Let me know what you > think!! > > Docs: https://dl.dropboxusercontent.com/u/43327584/html/index.html > Code: https://github.com/cyrusmaher/mosaic > > Cheers, > > -Cyrus > > > On Tue, Aug 20, 2013 at 4:36 AM, Christian Brueffer > wrote: > > > On 8/19/13 21:24 , Cyrus Maher wrote: > > > Hi everybody!!- > > > > > > My name is (Michael) Cyrus Maher, and I'm a PhD student at UCSF in the > > lab > > > of Dr. Ryan D. Hernandez (http://bts.ucsf.edu/hernandez_lab/)... > > > > > > I am writing because I'm interested in submitting a new Biopython > module. > > > Since this is likely a one-time event, the wiki recommends proceeding > > > through a developer. After speaking with Peter Cock, he recommended > that > > I > > > open things up for discussion on the mailing list. > > > > > > Attached is a draft that describes a new method, termed MOSAIC, which > > > integrates multiple sequence alignments from an arbitrary number number > > of > > > sources. 
We show that it greatly increases the number of orthologs that > > we > > > are able to detect while maintaining or improving functional-, > > > phylogenetic-, and sequence identity-based measures of ortholog > quality. > > > > > > Code and documentation may be found here: > > > > > > https://dl.dropboxusercontent.com/u/43327584/html/index.html > > > > > > Looking forward to hearing what you think! > > > > > > > Hi Cyrus, > > > > I agree with David on the PEP8 issue. A very nice tool to use is the > > pep8 checker, https://pypi.python.org/pypi/pep8 > > > > I see that you use MSAProbs. I have an MSAProbs application wrapper in > > the works. I haven't submitted it yet due to incomplete unit tests, > > but maybe it's useful to you: > > > > https://github.com/cbrueffer/biopython/tree/msaprobs > > > > Cheers, > > > > Chris > > > > > > _______________________________________________ > > Biopython-dev mailing list > > Biopython-dev at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/biopython-dev > > > _______________________________________________ > Biopython-dev mailing list > Biopython-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython-dev > From arklenna at gmail.com Tue Aug 20 21:31:40 2013 From: arklenna at gmail.com (Lenna Peterson) Date: Tue, 20 Aug 2013 17:31:40 -0400 Subject: [Biopython-dev] Fwd: New Biopython (sub)module? In-Reply-To: <521354A9.6020701@brueffer.de> References: <521354A9.6020701@brueffer.de> Message-ID: Also worth noting is autopep8: https://pypi.python.org/pypi/autopep8 (it can be a bit aggressive but that's what version control is for, right?) Cheers, Lenna On Tue, Aug 20, 2013 at 7:36 AM, Christian Brueffer wrote: > On 8/19/13 21:24 , Cyrus Maher wrote: > > Hi everybody!!- > > > > My name is (Michael) Cyrus Maher, and I'm a PhD student at UCSF in the > lab > > of Dr. Ryan D. Hernandez (http://bts.ucsf.edu/hernandez_lab/)... 
> > > > I am writing because I'm interested in submitting a new Biopython module. > > Since this is likely a one-time event, the wiki recommends proceeding > > through a developer. After speaking with Peter Cock, he recommended that > I > > open things up for discussion on the mailing list. > > > > Attached is a draft that describes a new method, termed MOSAIC, which > > integrates multiple sequence alignments from an arbitrary number number > of > > sources. We show that it greatly increases the number of orthologs that > we > > are able to detect while maintaining or improving functional-, > > phylogenetic-, and sequence identity-based measures of ortholog quality. > > > > Code and documentation may be found here: > > > > https://dl.dropboxusercontent.com/u/43327584/html/index.html > > > > Looking forward to hearing what you think! > > > > Hi Cyrus, > > I agree with David on the PEP8 issue. A very nice tool to use is the > pep8 checker, https://pypi.python.org/pypi/pep8 > > I see that you use MSAProbs. I have an MSAProbs application wrapper in > the works. I haven't submitted it yet due to incomplete unit tests, > but maybe it's useful to you: > > https://github.com/cbrueffer/biopython/tree/msaprobs > > Cheers, > > Chris > > > _______________________________________________ > Biopython-dev mailing list > Biopython-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython-dev > From arklenna at gmail.com Tue Aug 20 22:16:18 2013 From: arklenna at gmail.com (Lenna Peterson) Date: Tue, 20 Aug 2013 18:16:18 -0400 Subject: [Biopython-dev] Bio.PDB - Missing values (was Moratorium on commits?) In-Reply-To: References: Message-ID: On Thu, Aug 15, 2013 at 9:23 AM, Peter Cock wrote: > > > I didn't mean to suggest writing the string "None" in the field, and > I'm not sure if Jo?o did - it would certainly be an invalid PDB file. > > I agree that where the data structure has None (e.g. 
from our parser) > then the writer could use a blank string (of the appropriate width). > For mandatory fields like occupancy, this should give a warning. > > As I suspected, the writer currently fails on None (it's expecting a float). Test-driven development! However, I don't see a simple or elegant way to force writing of a blank occupancy. ATOM lines are currently written using C-style string formatting, and the occupancy field is `%6.2f`. Off the top of my head, I'd: 1. Store the original format string 2. Modify the format string to have "%6s" at the appropriate position 3. Modify the occupancy to be an empty string or a space 4. Set the return value to the formatted string 5. Restore the original format string 6. Return the return value However, this seems...ugly at best. I don't know that switching formatting styles (e.g. to string.format() or others) will help. And in most circumstances, the type checking of the format string is useful. Any thoughts? Cheers, Lenna From anaryin at gmail.com Tue Aug 20 22:25:57 2013 From: anaryin at gmail.com (João Rodrigues) Date: Tue, 20 Aug 2013 15:25:57 -0700 Subject: [Biopython-dev] Bio.PDB - Missing values (was Moratorium on commits?) In-Reply-To: References: Message-ID: Hi, We should probably change it to str.format() regardless of advantages. If we indeed have None in the parser then writing becomes a bit more complicated. But I guess it's more correct? I'd vote for having a small check/conversion on the writer, besides on the formatting of the string. As a biologist, I don't care if it is None or an empty string, or whatever, but for scripting maybe it makes more sense to be None? That's what I mean by more correct. Cheers, João From michael.maher at ucsf.edu Wed Aug 21 22:00:04 2013 From: michael.maher at ucsf.edu (Cyrus Maher) Date: Wed, 21 Aug 2013 15:00:04 -0700 Subject: [Biopython-dev] Fwd: New Biopython (sub)module?
In-Reply-To: References: <521354A9.6020701@brueffer.de> Message-ID: Thanks for sending that along Lenna! And thanks everybody for being patient with me! This is my first experience sharing software, so it's great to learn from you guys... As far as updates: -I've fixed all pep8 errors, with the exception of some finicky continuation indent complaints. -I've also uploaded example files so that the file "mosaic_example.py" can be run without modification. From the mosaic directory, just type: python mosaic_example.py testfiles.txt -The documentation has been updated as well. I would of course be open to any additional feedback you guys could offer for improving the code. That said, I was also hoping to get your thoughts on whether this seemed like the type of project that would fit in with Biopython. Peter said that Eric might have some good comments on this matter? Cheers, -Cyrus On Tue, Aug 20, 2013 at 2:31 PM, Lenna Peterson wrote: > Also worth noting is autopep8: https://pypi.python.org/pypi/autopep8 > (it can be a bit aggressive but that's what version control is for, right?) > > Cheers, > > Lenna > > > On Tue, Aug 20, 2013 at 7:36 AM, Christian Brueffer > wrote: > > > On 8/19/13 21:24 , Cyrus Maher wrote: > > > Hi everybody!!- > > > > > > My name is (Michael) Cyrus Maher, and I'm a PhD student at UCSF in the > > lab > > > of Dr. Ryan D. Hernandez (http://bts.ucsf.edu/hernandez_lab/)... > > > > > > I am writing because I'm interested in submitting a new Biopython > module. > > > Since this is likely a one-time event, the wiki recommends proceeding > > > through a developer. After speaking with Peter Cock, he recommended > that > > I > > > open things up for discussion on the mailing list. > > > > > > Attached is a draft that describes a new method, termed MOSAIC, which > > > integrates multiple sequence alignments from an arbitrary number number > > of > > > sources.
We show that it greatly increases the number of orthologs that > > we > > > are able to detect while maintaining or improving functional-, > > > phylogenetic-, and sequence identity-based measures of ortholog > quality. > > > > > > Code and documentation may be found here: > > > > > > https://dl.dropboxusercontent.com/u/43327584/html/index.html > > > > > > Looking forward to hearing what you think! > > > > > > > Hi Cyrus, > > > > I agree with David on the PEP8 issue. A very nice tool to use is the > > pep8 checker, https://pypi.python.org/pypi/pep8 > > > > I see that you use MSAProbs. I have an MSAProbs application wrapper in > > the works. I haven't submitted it yet due to incomplete unit tests, > > but maybe it's useful to you: > > > > https://github.com/cbrueffer/biopython/tree/msaprobs > > > > Cheers, > > > > Chris > > > > > > _______________________________________________ > > Biopython-dev mailing list > > Biopython-dev at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/biopython-dev > > > _______________________________________________ > Biopython-dev mailing list > Biopython-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython-dev > From p.j.a.cock at googlemail.com Thu Aug 22 13:01:27 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Thu, 22 Aug 2013 14:01:27 +0100 Subject: [Biopython-dev] Fwd: New Biopython (sub)module? In-Reply-To: References: <521354A9.6020701@brueffer.de> Message-ID: On Wed, Aug 21, 2013 at 11:00 PM, Cyrus Maher wrote: > > That said, I was also hoping to get your thoughts on whether this seemed > like the type of project that would fit in with Biopython. Peter said that > Eric might have some good comments on this matter? Right - I was thinking Eric and this year's phylogenetic focused GSoC students should have some good comments, e.g. about adding something like pal2nal into Biopython. 
Peter From p.j.a.cock at googlemail.com Fri Aug 23 08:54:35 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Fri, 23 Aug 2013 09:54:35 +0100 Subject: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week? In-Reply-To: References: <1373620014.53407.YahooMailNeo@web164001.mail.gq1.yahoo.com> <1373680350.55044.YahooMailNeo@web164003.mail.gq1.yahoo.com> <1373712847.72527.YahooMailNeo@web164004.mail.gq1.yahoo.com> Message-ID: On Fri, Aug 16, 2013 at 8:14 AM, Wibowo Arindrarto wrote: > Hi Michiel, Peter, > > In preparation for the 1.62 release, I've made the following changes > to Bio.NCBIStandalone and Bio.ParserSupport: > > * Migrated the two modules under Bio.SearchIO._legacy > * Upgraded their PendingDeprecationWarning to BiopythonDeprecationWarning So basically you're proposing formally deprecating parsing plain text BLAST output (via NCBIStandalone and Bio.ParserSupport) but continuing to support this format via SearchIO (using a copy of the current parser as a private module)? This then gives you the freedom to rewrite the old text parser more simply (e.g. assuming only recent versions of the BLAST suite), which might be nice. > I've pushed the changes to this branch: > https://github.com/bow/biopython/tree/bio_blast_migrate > > Tests seem to be running fine still, but now there is the awkward > situation where if users import Bio.NCBIStandalone and/or > Bio.ParserSupport directly they will be greeted with two warnings: the > BiopythonWarning for the modules' deprecation and the > BiopythonExperimentalWarning for SearchIO. > > We could suppress the SearchIO warning in Bio.NCBIStandalone and > Bio.ParserSupport. But before this is done, I was wondering if we have > a defined timeline for removing a BiopythonExperimentalWarning? (i.e. > if it will be removed in this release, then we could do that instead). It doesn't make sense to have a defined timetime for removing a BiopythonExperimentalWarning - it will be on a case by case basis. 
Do you think SearchIO is ready for that now (or in Biopython 1.63)? Peter From p.j.a.cock at googlemail.com Fri Aug 23 09:05:02 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Fri, 23 Aug 2013 10:05:02 +0100 Subject: [Biopython-dev] Bio.PDB - Missing values (was Moratorium on commits?) In-Reply-To: References: Message-ID: On Tue, Aug 20, 2013 at 11:16 PM, Lenna Peterson wrote: > > On Thu, Aug 15, 2013 at 9:23 AM, Peter Cock > wrote: >> >> >> I didn't mean to suggest writing the string "None" in the field, and >> I'm not sure if João did - it would certainly be an invalid PDB file. >> >> I agree that where the data structure has None (e.g. from our parser) >> then the writer could use a blank string (of the appropriate width). >> For mandatory fields like occupancy, this should give a warning. >> > > As I suspected, the writer currently fails on None (it's expecting a float). > Test-driven development! > > However, I don't see a simple or elegant way to force writing of a blank > occupancy. ATOM lines are currently written using C-style string formatting, > and the occupancy field is `%6.2f`. > > Off the top of my head, I'd: > > 1. Store the original format string > 2. Modify the format string to have "%6s" at the appropriate position > 3. Modify the occupancy to be an empty string or a space > 4. Set the return value to the formatted string > 5. Restore the original format string > 6. Return the return value > > However, this seems...ugly at best. I don't know that switching formatting > styles (e.g. to string.format() or others) will help. And in most > circumstances, the type checking of the format string is useful. > > Any thoughts? I would suggest something like this (untested): $ git diff diff --git a/Bio/PDB/PDBIO.py b/Bio/PDB/PDBIO.py index 2f64571..11a52ca 100644 --- a/Bio/PDB/PDBIO.py +++ b/Bio/PDB/PDBIO.py @@ -8,7 +8,7 @@ from Bio.PDB.StructureBuilder import StructureBuilder # To allow saving of chains, residues, etc..
from Bio.Data.IUPACData import atom_weights # Allowed Elements -_ATOM_FORMAT_STRING="%s%5i %-4s%c%3s %c%4i%c %8.3f%8.3f%8.3f%6.2f%6.2f %4s%2s%2s\n" +_ATOM_FORMAT_STRING="%s%5i %-4s%c%3s %c%4i%c %8.3f%8.3f%8.3f%s%6.2f %4s%2s%2s\n" class Select(object): @@ -85,8 +85,21 @@ class PDBIO(object): x, y, z=atom.get_coord() bfactor=atom.get_bfactor() occupancy=atom.get_occupancy() + # Handle a missing occupancy (None) with a blank entry: + try: + occupancy_str = "%6.2f" % occupancy + except TypeError: + if occupancy is None: + occupancy_str = " " * 6 + import warnings + from Bio import BiopythonWarning + # TODO - Introduce exception BiopythonWriterWarning? + warnings.warn("Missing occupancy will be recorded as blank", + BiopythonWarning) + else: + raise TypeError("Invalid occupancy %r in atom %r" % (occupancy, atom)) args=(record_type, atom_number, name, altloc, resname, chain_id, - resseq, icode, x, y, z, occupancy, bfactor, segid, + resseq, icode, x, y, z, occupancy_str, bfactor, segid, element, charge) return _ATOM_FORMAT_STRING % args The error message could be improved (e.g. a more helpful identification of the ATOM at fault)? Peter From w.arindrarto at gmail.com Sat Aug 24 10:22:56 2013 From: w.arindrarto at gmail.com (Wibowo Arindrarto) Date: Sat, 24 Aug 2013 12:22:56 +0200 Subject: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week?
In-Reply-To: References: <1373620014.53407.YahooMailNeo@web164001.mail.gq1.yahoo.com> <1373680350.55044.YahooMailNeo@web164003.mail.gq1.yahoo.com> <1373712847.72527.YahooMailNeo@web164004.mail.gq1.yahoo.com> Message-ID: Hi Peter, everyone, >> In preparation for the 1.62 release, I've made the following changes >> to Bio.NCBIStandalone and Bio.ParserSupport: >> >> * Migrated the two modules under Bio.SearchIO._legacy >> * Upgraded their PendingDeprecationWarning to BiopythonDeprecationWarning > > So basically you're proposing formally deprecating parsing plain > text BLAST output (via NCBIStandalone and Bio.ParserSupport) > but continuing to support this format via SearchIO (using a copy > of the current parser as a private module)? > > This then gives you the freedom to rewrite the old text parser > more simply (e.g. assuming only recent versions of the BLAST > suite), which might be nice. Yes. This seems like a sensible thing to do now. >> I've pushed the changes to this branch: >> https://github.com/bow/biopython/tree/bio_blast_migrate >> >> Tests seem to be running fine still, but now there is the awkward >> situation where if users import Bio.NCBIStandalone and/or >> Bio.ParserSupport directly they will be greeted with two warnings: the >> BiopythonWarning for the modules' deprecation and the >> BiopythonExperimentalWarning for SearchIO. >> >> We could suppress the SearchIO warning in Bio.NCBIStandalone and >> Bio.ParserSupport. But before this is done, I was wondering if we have >> a defined timeline for removing a BiopythonExperimentalWarning? (i.e. >> if it will be removed in this release, then we could do that instead). > > It doesn't make sense to have a defined timetime for removing a > BiopythonExperimentalWarning - it will be on a case by case basis. > > Do you think SearchIO is ready for that now (or in Biopython 1.63)? 
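The idea quoted above, silencing one warning category while a legacy module loads the experimental code, could look roughly like the sketch below. It is only an illustration: `ExperimentalWarning` and `load_experimental_module()` are hypothetical stand-ins for `BiopythonExperimentalWarning` and the real import, so that the example runs without Biopython installed.

```python
import warnings


class ExperimentalWarning(Warning):
    """Hypothetical stand-in for BiopythonExperimentalWarning."""


def load_experimental_module():
    """Hypothetical stand-in for importing a module that warns on load."""
    warnings.warn("This module is experimental.", ExperimentalWarning)
    return "module loaded"


def load_quietly():
    """Run the loader with only ExperimentalWarning filtered out.

    The filter change is local: catch_warnings() restores the previous
    filters on exit, so other callers still see the warning.
    """
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", ExperimentalWarning)
        return load_experimental_module()
```

Filtering by category rather than ignoring everything means a deprecation warning raised in the same block would still reach the user.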
Hmm..what I have in mind is actually as soon as we lift SearchIO's BiopythonExperimentalWarning, we give Bio.Blast a PendingDeprecationWarning. I think this gives users a clearer / firmer choice, since it could be confusing to have two different modules that handle BLAST parsing in Biopython. As for the readiness, I think the important features that we planned have been implemented in SearchIO. I don't have any major feature change that I would like to implement anytime soon, too. So yes, I think it is ready. Best, Bow From yeyanbo289 at gmail.com Mon Aug 26 03:53:50 2013 From: yeyanbo289 at gmail.com (Yanbo Ye) Date: Mon, 26 Aug 2013 11:53:50 +0800 Subject: [Biopython-dev] GSOC weekly update 11 Message-ID: Hi all, Biopython.Phylo project update for last week is here: http://blog.yeyanbo.com/posts/google-summer-of-code-11.html Thanks, Yanbo -- *Yanbo Ye* *Guangzhou Institutes of Biomedicine and Health, * *Chinese Academy of Sciences* *190 Kaiyuan Avenue, Science Park, Guangzhou, China** * * * *Email: ye_yanbo at gibh.ac.cn* *Web: http://www.yeyanbo.com* *Phone: (86)-020-32093810* From p.j.a.cock at googlemail.com Mon Aug 26 14:04:35 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Mon, 26 Aug 2013 15:04:35 +0100 Subject: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week? In-Reply-To: References: <1373620014.53407.YahooMailNeo@web164001.mail.gq1.yahoo.com> <1373680350.55044.YahooMailNeo@web164003.mail.gq1.yahoo.com> <1373712847.72527.YahooMailNeo@web164004.mail.gq1.yahoo.com> Message-ID: On Sat, Aug 24, 2013 at 11:22 AM, Wibowo Arindrarto wrote: > Hi Peter, everyone, > > As for the readiness, I think the important features that we planned > have been implemented in SearchIO. I don't have any major feature > change that I would like to implement anytime soon, too. So yes, I > think it is ready. So you'd be comfortable with removing the experimental warning for SearchIO in Biopython 1.62 final (this week if the PDB occupancy thing is resolved)? 
And you would like to officially support plain text BLAST parsing (despite it not being recommend by the NCBI, and known to have been quite a lot of work in the past to keep the parser working)? We should probably also give you (Bow) commit rights too, so you can handle basic parser updates within SearchIO directly - assuming you're happy with that? Regards, Peter From w.arindrarto at gmail.com Mon Aug 26 16:04:38 2013 From: w.arindrarto at gmail.com (Wibowo Arindrarto) Date: Mon, 26 Aug 2013 18:04:38 +0200 Subject: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week? In-Reply-To: References: <1373620014.53407.YahooMailNeo@web164001.mail.gq1.yahoo.com> <1373680350.55044.YahooMailNeo@web164003.mail.gq1.yahoo.com> <1373712847.72527.YahooMailNeo@web164004.mail.gq1.yahoo.com> Message-ID: On Mon, Aug 26, 2013 at 4:04 PM, Peter Cock wrote: > On Sat, Aug 24, 2013 at 11:22 AM, Wibowo Arindrarto > wrote: >> Hi Peter, everyone, >> >> As for the readiness, I think the important features that we planned >> have been implemented in SearchIO. I don't have any major feature >> change that I would like to implement anytime soon, too. So yes, I >> think it is ready. > > So you'd be comfortable with removing the experimental warning > for SearchIO in Biopython 1.62 final (this week if the PDB occupancy > thing is resolved)? Yes. I think all public-facing modules are ok now. There are still two issue which I consider minor, but I think should be mentioned before we lift the warning: 1. Storing [T]FAST[X|Y] query and hit strand information (see https://redmine.open-bio.org/issues/3419). I'm not sure yet if I should do the commit, but Jason's patch look sensible (and I can probably add some more so that the parser knows whether to set the strand on hit or query sequence). 2. Collapsing / merging overlapping HSPs. I've received one (or two) mail(s) asking whether it is possible to merge overlapping HSPs (apparently BLAST sometimes do this). 
I haven't figured a way to cleanly implement this, so this is on hold for now. In addition, we had a discussion some months ago about the Bio._utils module that SearchIO uses (see http://lists.open-bio.org/pipermail/biopython-dev/2013-January/010219.html, http://lists.open-bio.org/pipermail/biopython-dev/2013-January/010240.html, and http://lists.open-bio.org/pipermail/biopython-dev/2013-February/010290.html). We had an extensive discussion about this last time, which went as far as considering a change on how we run our tests. Since the Bio._utils module itself is private, however, no public-facing functions in SearchIO is affected. Other than these, some planned features are implementing the HMMER3.1 parser (which I think should not interfere with lifting the warning). > And you would like to officially support plain text BLAST parsing > (despite it not being recommend by the NCBI, and known to have > been quite a lot of work in the past to keep the parser working)? Looking at http://lists.open-bio.org/pipermail/biopython/2012-September/008166.html, the most sensible approach seems to be to put the current parser under SearchIO (hence the module reorganization I did; so we can deprecate Bio.Blast as a whole without losing functionality), without actually advertising that we have full support of parsing the text output (perhaps put a disclaimer that plain text is not guaranteed to work?). I feel like some people may still want to use previous BLAST versions anyway, and we do have a functioning parser tested up to 2.2.26+, so throwing it away doesn't seem to be the best thing to do here. And in the case that someone does want to extend the parser (could be me, could be someone else) to work with the latest BLAST version, (s)he can then extend the existing parser. > We should probably also give you (Bow) commit rights too, so you > can handle basic parser updates within SearchIO directly - assuming > you're happy with that? This is fine with me. Best, Bow P.S. 
I made the pull request for the reorganization here: https://github.com/biopython/biopython/pull/223, comments are welcomed :). From p.j.a.cock at googlemail.com Tue Aug 27 08:41:39 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Tue, 27 Aug 2013 09:41:39 +0100 Subject: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week? In-Reply-To: References: <1373620014.53407.YahooMailNeo@web164001.mail.gq1.yahoo.com> <1373680350.55044.YahooMailNeo@web164003.mail.gq1.yahoo.com> <1373712847.72527.YahooMailNeo@web164004.mail.gq1.yahoo.com> Message-ID: On Mon, Aug 26, 2013 at 5:04 PM, Wibowo Arindrarto wrote: > >> So you'd be comfortable with removing the experimental warning >> for SearchIO in Biopython 1.62 final (this week if the PDB occupancy >> thing is resolved)? > > Yes. I think all public-facing modules are ok now. There are still two > issue which I consider minor, but I think should be mentioned before > we lift the warning: > > ... > > Other than these, some planned features are implementing the HMMER3.1 > parser (which I think should not interfere with lifting the warning). We'll also want to update the Tutorial as well, merging the BLAST and SearchIO chapters. Let's start work on this just after releasing Biopython 1.62 then, which I think we can now go ahead with :) Lenna has sorted out the PDB occupancy issue, and Eric has updated the PRANK unit tests. I think this means we are OK to do the release in the next day or two? Any objections? Regards, Peter From p.j.a.cock at googlemail.com Tue Aug 27 08:43:17 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Tue, 27 Aug 2013 09:43:17 +0100 Subject: [Biopython-dev] Releasing Biopython 1.62 this week? 
Message-ID: Continuing this thread under a new title, as below, I would like to do the Biopython 1.62 release in the next day or two: http://lists.open-bio.org/pipermail/biopython-dev/2013-August/010836.html Peter On Tue, Aug 27, 2013 at 9:41 AM, Peter Cock wrote: > On Mon, Aug 26, 2013 at 5:04 PM, Wibowo Arindrarto wrote: >> >>> So you'd be comfortable with removing the experimental warning >>> for SearchIO in Biopython 1.62 final (this week if the PDB occupancy >>> thing is resolved)? >> >> Yes. I think all public-facing modules are ok now. There are still two >> issue which I consider minor, but I think should be mentioned before >> we lift the warning: >> >> ... >> >> Other than these, some planned features are implementing the HMMER3.1 >> parser (which I think should not interfere with lifting the warning). > > We'll also want to update the Tutorial as well, merging the BLAST > and SearchIO chapters. Let's start work on this just after releasing > Biopython 1.62 then, which I think we can now go ahead with :) > > Lenna has sorted out the PDB occupancy issue, and Eric has > updated the PRANK unit tests. > > I think this means we are OK to do the release in the next day > or two? Any objections? > > Regards, > > Peter From w.arindrarto at gmail.com Tue Aug 27 09:41:32 2013 From: w.arindrarto at gmail.com (Wibowo Arindrarto) Date: Tue, 27 Aug 2013 11:41:32 +0200 Subject: [Biopython-dev] Releasing Biopython 1.62 this week? 
In-Reply-To: References: Message-ID: Hi Peter, everyone, On Tue, Aug 27, 2013 at 10:43 AM, Peter Cock wrote: > Continuing this thread under a new title, as below, I would > like to do the Biopython 1.62 release in the next day or two: > > http://lists.open-bio.org/pipermail/biopython-dev/2013-August/010836.html > > Peter > > On Tue, Aug 27, 2013 at 9:41 AM, Peter Cock wrote: >> On Mon, Aug 26, 2013 at 5:04 PM, Wibowo Arindrarto wrote: >>> >>>> So you'd be comfortable with removing the experimental warning >>>> for SearchIO in Biopython 1.62 final (this week if the PDB occupancy >>>> thing is resolved)? >>> >>> Yes. I think all public-facing modules are ok now. There are still two >>> issue which I consider minor, but I think should be mentioned before >>> we lift the warning: >>> >>> ... >>> >>> Other than these, some planned features are implementing the HMMER3.1 >>> parser (which I think should not interfere with lifting the warning). >> >> We'll also want to update the Tutorial as well, merging the BLAST >> and SearchIO chapters. Let's start work on this just after releasing >> Biopython 1.62 then, which I think we can now go ahead with :) Ah yes. I missed the tutorial. Then yes, it should be updated as well. If we are doing this after 1.62 is released, is it worth aiming for a larger change (I recall there was a discussion some time ago about porting the tutorial to Sphinx)? >> Lenna has sorted out the PDB occupancy issue, and Eric has >> updated the PRANK unit tests. >> >> I think this means we are OK to do the release in the next day >> or two? Any objections? No objections from me :). Best, Bow From eric.talevich at gmail.com Tue Aug 27 18:45:58 2013 From: eric.talevich at gmail.com (Eric Talevich) Date: Tue, 27 Aug 2013 11:45:58 -0700 Subject: [Biopython-dev] Releasing Biopython 1.62 this week?
In-Reply-To: References: Message-ID: On Tue, Aug 27, 2013 at 1:43 AM, Peter Cock wrote: > Continuing this thread under a new title, as below, I would > like to do the Biopython 1.62 release in the next day or two: > > http://lists.open-bio.org/pipermail/biopython-dev/2013-August/010836.html > > Peter > > On Tue, Aug 27, 2013 at 9:41 AM, Peter Cock wrote: > > On Mon, Aug 26, 2013 at 5:04 PM, Wibowo Arindrarto wrote: > >> > >>> So you'd be comfortable with removing the experimental warning > >>> for SearchIO in Biopython 1.62 final (this week if the PDB occupancy > >>> thing is resolved)? > >> > >> Yes. I think all public-facing modules are ok now. There are still two > >> issue which I consider minor, but I think should be mentioned before > >> we lift the warning: > >> > >> ... > >> > >> Other than these, some planned features are implementing the HMMER3.1 > >> parser (which I think should not interfere with lifting the warning). > > > > We'll also want to update the Tutorial as well, merging the BLAST > > and SearchIO chapters. Let's start work on this just after releasing > > Biopython 1.62 then, which I think we can now go ahead with :) > > > > Lenna has sorted out the PDB occupancy issue, and Eric has > > updated the PRANK unit tests. > > > > I think this means we are OK to do the release in the next day > > or two? Any objections? > > > > Regards, > > > > Peter > Sounds good. Mind if I sneak in a quick update to the Phylo chapter of the Tutorial to mention CDAO support? Also, has anything else noteworthy been added since the beta that we can announce in the NEWS file? Thanks, Eric From p.j.a.cock at googlemail.com Tue Aug 27 19:27:48 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Tue, 27 Aug 2013 20:27:48 +0100 Subject: [Biopython-dev] Releasing Biopython 1.62 this week? In-Reply-To: References: Message-ID: On Tue, Aug 27, 2013 at 7:45 PM, Eric Talevich wrote: > > Sounds good. 
Mind if I sneak in a quick update to the Phylo chapter of the > Tutorial to mention CDAO support? Go for it - I need to retest the DSSP unit test tomorrow anyway. > Also, has anything else noteworthy been added since the beta that we can > announce in the NEWS file? Minor bug fixes and more tests? Perhaps the PDB occupancy change? Peter From w.arindrarto at gmail.com Wed Aug 28 12:12:24 2013 From: w.arindrarto at gmail.com (Wibowo Arindrarto) Date: Wed, 28 Aug 2013 14:12:24 +0200 Subject: [Biopython-dev] Releasing Biopython 1.62 this week? In-Reply-To: References: Message-ID: Hi Peter, everyone, On Tue, Aug 27, 2013 at 9:27 PM, Peter Cock wrote: > On Tue, Aug 27, 2013 at 7:45 PM, Eric Talevich wrote: >> >> Sounds good. Mind if I sneak in a quick update to the Phylo chapter of the >> Tutorial to mention CDAO support? > > Go for it - I need to retest the DSSP unit test tomorrow anyway. > >> Also, has anything else noteworthy been added since the beta that we can >> announce in the NEWS file? > > Minor bug fixes and more tests? Perhaps the PDB occupancy change? > > Peter I don't like to believe in coincidences, but just last night a user emailed me about an issue in SearchIO's exonerate parser which I feel should be mentioned here (exchange attached on his permission). He stumbled on an error where an exonerate output file is unparseable because of split codon alignments. In short, I feel we should not lift the BiopythonExperimentalWarning for the 1.62 release. The issue is caused by protein to genome alignments in exonerate (in the protein2genome alignment mode) that has split codons in it. When split codons are present, SearchIO splits these HSPs into fragments which are basically a single contiguous sequence alignment. These fragments have their own Seq objects (representing hit and query sequences). 
The problem is, these Seq objects have to be full sequences and the query sequence fragment (protein) do not represent a full sequence here (since the underlying codon is split). Currently, SearchIO raises an AssertionError when this type of alignment is found and simply says it can not deal with it. This should not remain the case, though. A test case was actually put up for this (https://github.com/biopython/biopython/blob/master/Tests/Exonerate/exn_22_m_protein2genome.exn#L173). However, since I have yet to find a way to properly represent these fragments with Seq objects, the actual test have not been written (and I missed this when doing the last review). I have thought of several alternatives: * I saw a ThreeLetterProtein Alphabet in https://github.com/biopython/biopython/blob/master/Bio/Alphabet/__init__.py#L136, maybe we could use this to create Seq objects that allows partial codons? * Change HSPFragment to not use full Seq objects anymore (which may require some rework on the HSP objects as well) But have not explored them thoroughly. I should note that Zheng Ruan's GSoC project on Codon alignments (http://zruanweb.com/category/gsoc.html) may prove useful as well here. While I don't expect the issue to pop up often (it shows up only when exonerate is used with the protein2genome mode out of the many modes it has and when the alignment hits a split codon), I feel like it should be discussed (if not, mentioned) here first since dealing with the issue may require some more reworking. So I'm sorry for the late warning and missing this. I hope this is not too late :). Best, Bow -------------- next part -------------- On Wed, Aug 28, 2013 at 10:31 AM, Wibowo Arindrarto wrote: > Hi Somak, > >> Do you have any idea whether Bioperl based Exonerate parser can handle such cases? >> I'm yet to try Bioperl. 
> > I tried your file with Bioperl's parser, and while it can parse the > entire file without errors, I don't know whether all the information > in the file (sequence, sequence coordinates) are parsed properly. But > maybe that's just me being less familiar with Bioperl. I suggest > posting to their mailing list > (http://lists.open-bio.org/pipermail/bioperl-l/) or searching the list > archive if you have any questions regarding this. The library also > have an active community behind it. > >> And please feel free to forward this mail to Biopythonlist or any other discussion forum you >> think is appropriate, > > Ok, thanks :). > >> Thanks again >> >> Somak Ray > > Best, > Bow > >> ________________________________________ >> From: w.arindrarto at gmail.com [w.arindrarto at gmail.com] on behalf of Wibowo Arindrarto [bow at bow.web.id] >> Sent: Tuesday, August 27, 2013 8:01 PM >> To: Ray, Somak >> Subject: Re: On parsing of exonerate output >> >> Hi Somak, >> >>> Dear Dr. Arindrarto, >>> >>> I came across your blog about parsing outputs from Exonerate . I have some >>> generated some files using exonarates protein2dna model. However when >>> running your scripts on them I'm getting some assertion error in python 2.7. >>> I'm attaching two of such exonerate outputs.The "Result_goodfile.txt" can >>> be passed by the parser whereas "Result_badfile.txt" can't be parsed. >>> >>> Please let me know if there's any solution to the problem. >>> >>> Thanks in advance >> >> Hmm..looking at the files, it seems that this is caused by a split >> codon in the alignment (Results_badfile.txt, line 25). The problem is, >> the three-letter amino acid sequence needs to be translated into a >> single-letter amino acid sequence since Biopython could not create Seq >> objects with three-letter amino acid codes. 
However, this conversion >> means that codons that span introns (as the one on line 25) could not >> be dealt with properly since a single fragment expects a full Seq >> object (hence the error you're seeing; it expects the three-letter >> amino acid sequence length to be multiples of three). >> >> So the short answer is no, there is not yet an immediate solution to this issue. >> >> I should mention that this came at an appropriate time, though, so >> thanks for the email :). I am reviewing known SearchIO issues and this >> was apparently an issue that I have lost track of (checking at the >> test suite, there is a test for this case but it has not been included >> in the test suite). >> >> Do you mind if I forward this email to the Biopython list >> (http://biopython.org/wiki/Mailing_lists)? I think other developers / >> users may be interested in this. >> >> Best, >> Bow From p.j.a.cock at googlemail.com Wed Aug 28 17:31:19 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 28 Aug 2013 18:31:19 +0100 Subject: [Biopython-dev] Releasing Biopython 1.62 this week? 
In-Reply-To: References: Message-ID: Hello all, I'm starting the release 1.62 process now, getting the new DSSP test working cross platform was more work than I expected - thank goodness for the BuildBot server yet again :) Please don't commit anything to the master branch until further notice, Thanks, Peter From p.j.a.cock at googlemail.com Wed Aug 28 18:28:43 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 28 Aug 2013 19:28:43 +0100 Subject: [Biopython-dev] Biopython 1.62 release in progress Message-ID: On Wed, Aug 28, 2013 at 6:31 PM, Peter Cock wrote: > Hello all, > > I'm starting the release 1.62 process now, getting the new DSSP > test working cross platform was more work than I expected - > thank goodness for the BuildBot server yet again :) > > Please don't commit anything to the master branch until further > notice, > > Thanks, > > Peter While I finish off the Windows installers etc, and have dinner, would anyone like to volunteer to write a draft for the release announcement to go out on the mailing lists and news blog? http://news.open-bio.org/news/category/obf-projects/biopython/ These are usually based on the rather dry NEWS file information, and the previous announcement for style/links/etc. Thanks, Peter From p.j.a.cock at googlemail.com Wed Aug 28 18:53:21 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 28 Aug 2013 19:53:21 +0100 Subject: [Biopython-dev] Post Biopython 1.62 release, clean-up after dropping Python 2.5 Message-ID: Hello all - especially newcomers, There are going to be several boring but useful things to do to the Biopython code base once we're finished with Python 2.5 (the imminent release of Biopython 1.62 has been clearly described as the final Biopython release to support it). 

Some of these tasks are quite easy, and might tempt some of our non-core contributors or newcomers to have a go. However, to avoid too much duplication of effort, I'd suggest **replying in this thread if you want to tackle anything** - and then start working out how to send us your first pull request.

Things which will need doing:

(0) Disable the Python 2.5 and Jython 2.5 buildbot (this will be done by me or Tiago)

(1) Disable the Python 2.5 target in TravisCI, see https://travis-ci.org/biopython/biopython/ (this is a simple one line edit to the .travis.yml file)

(2) Remove all the with statement imports (and any comment lines associated with them):

from __future__ import with_statement

(3) Remove Bio/_py3k/_namedtuple.py and adjust import lines accordingly

(4) Scan over the code base looking for any comments about Python 2.5 (e.g. using the grep command), and reviewing them one by one to see if there is an old workaround we can now remove.

(5) More advanced code review, for example looking for places we can better take advantage of context managers (with statements) for file handles.

Of this list, (1), (2) and (3) are certainly things suitable for relative newcomers - and assuming I'm not away I will happily do the pull request reviews. For the more advanced issues (4) and (5) we may need more eyes on the code...

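As a rough illustration of what task (5) might involve, here is a minimal before/after sketch. The function names and the FASTA-counting logic are invented for the example and are not taken from the Biopython code base; the point is only the change in how the file handle is managed.

```python
import os
import tempfile


def count_fasta_records_old(path):
    """Older style: explicit open/close.

    If the loop raises, close() is never reached and the handle leaks.
    """
    handle = open(path)
    count = 0
    for line in handle:
        if line.startswith(">"):
            count += 1
    handle.close()
    return count


def count_fasta_records(path):
    """Same logic with a context manager; the handle is closed on any exit."""
    with open(path) as handle:
        return sum(1 for line in handle if line.startswith(">"))


if __name__ == "__main__":
    # Small self-check on a throwaway file.
    fd, example = tempfile.mkstemp(suffix=".fasta")
    os.write(fd, b">seq1\nACGT\n>seq2\nGGCC\n")
    os.close(fd)
    try:
        print(count_fasta_records_old(example), count_fasta_records(example))
    finally:
        os.remove(example)
```

Once Python 2.5 is dropped, the second form needs no `from __future__ import with_statement` line, which is why tasks (2) and (5) go together.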
Thank you, Peter From p.j.a.cock at googlemail.com Wed Aug 28 19:01:36 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 28 Aug 2013 20:01:36 +0100 Subject: [Biopython-dev] Biopython 1.62 release in progress In-Reply-To: References: Message-ID: On Wed, Aug 28, 2013 at 7:28 PM, Peter Cock wrote: > On Wed, Aug 28, 2013 at 6:31 PM, Peter Cock wrote: >> Hello all, >> >> I'm starting the release 1.62 process now, getting the new DSSP >> test working cross platform was more work than I expected - >> thank goodness for the BuildBot server yet again :) >> >> Please don't commit anything to the master branch until further >> notice, >> >> Thanks, >> >> Peter > > While I finish off the Windows installers etc, and have dinner, > would anyone like to volunteer to write a draft for the release > announcement to go out on the mailing lists and news blog? > http://news.open-bio.org/news/category/obf-projects/biopython/ > > These are usually based on the rather dry NEWS file information, > and the previous announcement for style/links/etc. > > Thanks, > > Peter A provisional tar-ball, zip file, and four Windows installers are up now (but deliberately not yet listed on the download wiki page): http://biopython.org/DIST/ If anyone would care to sanity test those in the next hour or two, that would be great. Thanks, Peter From p.j.a.cock at googlemail.com Wed Aug 28 20:43:58 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 28 Aug 2013 21:43:58 +0100 Subject: [Biopython-dev] Post Biopython 1.62 release, clean-up after dropping Python 2.5 In-Reply-To: References: Message-ID: On Wed, Aug 28, 2013 at 7:53 PM, Peter Cock wrote: > Hello all - especially newcomers, > > There are going to be several boring but useful things to do to > the Biopython code base once we're finished with Python 2.5 > (the imminent release of Biopython 1.62 has been clearly > described as the final Biopython release to support it). 
> > Some of these tasks are quite easy, and might tempt some > of our non-core contributors or new-comers to have a go, > however to avoid too much duplication of effort I'd suggest > **replying in this thread if you want to tackle anything** - and > then start working out how to send us your first pull request. I tweeted this earlier, https://twitter.com/pjacock/status/372796602760855552 > Things which will need doing: > > ... > > (1) Disable the Python 2.5 target in TravisCI, see > https://travis-ci.org/biopython/biopython/ > (this is a simple one line edit to the .travis.yml file) The first easy task has been claimed already: https://github.com/biopython/biopython/pull/226 Wayne wrote: >> Via Twitter, I saw your note" >> (1) Disable the Python 2.5 target in TravisCI, see >> https://travis-ci.org/biopython/biopython/ >> (this is a simple one line edit to the .travis.yml file)" >> >> Turned out it really was as easy as you said. Once the release is out, that fix can go in - thanks :) Wayne (BCC'd), please sign up to the biopython-dev list if you haven't already: http://lists.open-bio.org/mailman/listinfo/biopython-dev Thank you, Peter From arklenna at gmail.com Wed Aug 28 20:57:10 2013 From: arklenna at gmail.com (Lenna Peterson) Date: Wed, 28 Aug 2013 16:57:10 -0400 Subject: [Biopython-dev] Post Biopython 1.62 release, clean-up after dropping Python 2.5 In-Reply-To: References: Message-ID: On Wed, Aug 28, 2013 at 2:53 PM, Peter Cock wrote: > > (2) Remove all the with statement imports (and any > comment lines associated with them): > > from __future__ import with_statement > As I demonstrated, I regularly forget that `with` is "new"! > > (4) Scan over the code base looking for any comments > about Python 2.5 (e.g. using the grep command), and > reviewing them one by one to see if there is an old > workaround we can now remove. > If I count: find Bio -name "*.py" -exec grep -H -n ".*#.*2\.5" {} \; I only see 24 - not too bad. Many are `with` related. 
> > (5) More advanced code review, for example looking > for places we can better take advantage of context > managers (with statements) for file handles. > For this one: find Bio -name "*.py" -exec grep -H -n -P "= ?open\(" {} \; I find 145...although not all `open()` statements can be easily swapped for `with`. I'm currently prepping for my UK trip so I may not be able to do any of this before I get back mid-September. Cheers, Lenna From p.j.a.cock at googlemail.com Wed Aug 28 20:58:58 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 28 Aug 2013 21:58:58 +0100 Subject: [Biopython-dev] Post Biopython 1.62 release, clean-up after dropping Python 2.5 In-Reply-To: References: Message-ID: On Wed, Aug 28, 2013 at 9:43 PM, Peter Cock wrote: > On Wed, Aug 28, 2013 at 7:53 PM, Peter Cock wrote: >> Hello all - especially newcomers, >> >> There are going to be several boring but useful things to do to >> the Biopython code base once we're finished with Python 2.5 >> (the imminent release of Biopython 1.62 has been clearly >> described as the final Biopython release to support it). >> >> Some of these tasks are quite easy, and might tempt some >> of our non-core contributors or new-comers to have a go, >> however to avoid too much duplication of effort I'd suggest >> **replying in this thread if you want to tackle anything** - and >> then start working out how to send us your first pull request. > > I tweeted this earlier, > https://twitter.com/pjacock/status/372796602760855552 > >> Things which will need doing: >> >> ... >> >> (1) Disable the Python 2.5 target in TravisCI, see >> https://travis-ci.org/biopython/biopython/ >> (this is a simple one line edit to the .travis.yml file) > > The first easy task has been claimed already: > https://github.com/biopython/biopython/pull/226 And task (2) as well on the same pull request - keen! 
Wayne (BCC'd), could you delay trying task (3) for a few days to give someone else a chance please ;) Maybe have a look for things under (4) instead, Lenna's quick count suggests plenty of things need looking at... Peter From w.arindrarto at gmail.com Wed Aug 28 21:17:57 2013 From: w.arindrarto at gmail.com (Wibowo Arindrarto) Date: Wed, 28 Aug 2013 23:17:57 +0200 Subject: [Biopython-dev] Biopython 1.62 release in progress In-Reply-To: References: Message-ID: Hi everyone, I've written a draft of our 1.62 release (below). I'd appreciate it if somebody gives it another look (for typos, etc.). Also, if I miss somebody in the contributors list, please let me know :). --- Biopython 1.62 released ======================= Source distributions and Windows installers for **Biopython** 1.62 are now available from the [downloads page](http://biopython.org/wiki/Download) on the [official Biopython website](http://biopython.org/wiki/Main_Page) and from the [Python Package Index (PyPI)](https://pypi.python.org/pypi/biopython). # Python support This is our first official release that supports Python 3. Specifically, we tested under Python 3.3. Other versions of Python 3 may still work albeit with some issues. We still fully support Python 2.5, 2.6, and 2.7. Support under [Jython](http://www.jython.org/) is available for versions 2.5 and 2.7 and under [PyPy](http://pypy.org/) for versions 1.9 and 2.0. However, unlike CPython, Jython and PyPy support is partial: NumPy and our C extensions are not covered. Please note that this release marks our last official support Python 2.5. Beginning from Biopython 1.63, the minimum supported Python version will be 2.6. # Highlights * The translation functions will give a warning on any partial codons (and this will probably become an error in a future release). If you know you are dealing with partial sequences, either pad with N to extend the sequence length to a multiple of three, or explicitly trim the sequence. 
* The handling of joins and related complex features in Genbank/EMBL files has been changed with the introduction of a CompoundLocation object. Previously a SeqFeature for something like a multi-exon CDS would have a child SeqFeature (under the sub_features attribute) for each exon. The sub_features property will still be populated for now, but is deprecated and will in future be removed. Please consult the examples in the help (docstrings) and Tutorial.
* Thanks to the efforts of Ben Morris, the Phylo module now supports the file formats NeXML and CDAO. The Newick parser is also significantly faster, and can now optionally extract bootstrap values from the Newick comment field (like Molphy and Archaeopteryx do). Nate Sutton added a wrapper for FastTree to Bio.Phylo.Applications.
* New module Bio.UniProt adds parsers for the GAF, GPA and GPI formats from UniProt-GOA.
* The BioSQL module is now supported in Jython. MySQL and PostgreSQL databases can be used. The relevant JDBC driver should be available in the CLASSPATH.
* Feature labels on circular GenomeDiagram figures now support the label_position argument (start, middle or end) in addition to the current default placement, and in a change to prior releases these labels are outside the features which is now consistent with the linear diagrams.
* The code for parsing 3D structures in mmCIF files was updated to use the Python standard library's shlex module instead of C code using flex.
* The Bio.Sequencing.Applications module now includes a BWA command line wrapper.
* Bio.motifs supports JASPAR format files with multiple position-frequency matrices.

Additionally there have been other minor bug fixes and more unit tests.
# Contributors Many thanks to the Biopython developers and community for making this release possible, especially the following contributors: Alexander Campbell (first contribution) Andrea Rizzi (first contribution) Anthony Mathelier (first contribution) Ben Morris (first contribution) Brad Chapman Christian Brueffer David Arenillas (first contribution) David Martin (first contribution) Eric Talevich Iddo Friedberg Jian-Long Huang (first contribution) Joao Rodrigues Kai Blin Michiel de Hoon Nate Sutton (first contribution) Peter Cock Petra Kubincová (first contribution) Phillip Garland Saket Choudhary (first contribution) Tiago Antao Wibowo 'Bow' Arindrarto Xabier Bello (first contribution) ---- Best, Bow On Wed, Aug 28, 2013 at 8:28 PM, Peter Cock wrote: > On Wed, Aug 28, 2013 at 6:31 PM, Peter Cock wrote: >> Hello all, >> >> I'm starting the release 1.62 process now, getting the new DSSP >> test working cross platform was more work than I expected - >> thank goodness for the BuildBot server yet again :) >> >> Please don't commit anything to the master branch until further >> notice, >> >> Thanks, >> >> Peter > > While I finish off the Windows installers etc, and have dinner, > would anyone like to volunteer to write a draft for the release > announcement to go out on the mailing lists and news blog? > http://news.open-bio.org/news/category/obf-projects/biopython/ > > These are usually based on the rather dry NEWS file information, > and the previous announcement for style/links/etc.
> > Thanks, > > Peter > _______________________________________________ > Biopython-dev mailing list > Biopython-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython-dev From p.j.a.cock at googlemail.com Wed Aug 28 21:30:33 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 28 Aug 2013 22:30:33 +0100 Subject: [Biopython-dev] Biopython 1.62 release in progress In-Reply-To: References: Message-ID: On Wed, Aug 28, 2013 at 10:17 PM, Wibowo Arindrarto wrote: > Hi everyone, > > I've written a draft of our 1.62 release (below). I'd appreciate it if > somebody gives it another look (for typos, etc.). Also, if I miss > somebody in the contributors list, please let me know :). Thanks Bow - I don't think the WordPress blog understands markdown style markup, but bonus marks anyway :) I'm about to update the tar-ball and zip file to include the NEWS file updated with the two names Bow spotted as missing - hopefully there are no more and this commit will get the release tag: https://github.com/biopython/biopython/commit/73f8483f23910c8205cd9a4ff1283f2747d4f4ff (The Windows installers I prepared earlier should not be affected as they don't include the NEWS file) > # Python support > > This is our first official release that supports Python 3. > Specifically, we tested under Python 3.3. Other versions > of Python 3 may still work albeit with some issues. I'd be a bit more explicit: Specifically, this is supported under Python 3.3. Older versions of Python 3 may still work albeit with some issues, but are *not* supported. > Please note that this release marks our last official support Python > 2.5. Beginning from Biopython 1.63, the minimum supported Python > version will be 2.6. Minor typo, needs a for/of, e.g. 
Please note that this release marks our last official support for Python 2.5 Thanks Bow, Peter From w.arindrarto at gmail.com Wed Aug 28 22:17:44 2013 From: w.arindrarto at gmail.com (Wibowo Arindrarto) Date: Thu, 29 Aug 2013 00:17:44 +0200 Subject: [Biopython-dev] Biopython 1.62 release in progress In-Reply-To: References: Message-ID: Hi Peter, > Thanks Bow - I don't think the WordPress blog understands > markdown style markup, but bonus marks anyway :) Ah yes, I was planning to convert it later to HTML (I find writing markdown first easier ~ and also more mailing-list friendly). > I'm about to update the tar-ball and zip file to include the > NEWS file updated with the two names Bow spotted as > missing - hopefully there are no more and this commit > will get the release tag: > > https://github.com/biopython/biopython/commit/73f8483f23910c8205cd9a4ff1283f2747d4f4ff > > (The Windows installers I prepared earlier should not be > affected as they don't include the NEWS file) > >> # Python support >> >> This is our first official release that supports Python 3. >> Specifically, we tested under Python 3.3. Other versions >> of Python 3 may still work albeit with some issues. > > I'd be a bit more explicit: > > Specifically, this is supported under Python 3.3. Older > versions of Python 3 may still work albeit with some > issues, but are *not* supported. > >> Please note that this release marks our last official support Python >> 2.5. Beginning from Biopython 1.63, the minimum supported Python >> version will be 2.6. > > Minor typo, needs a for/of, e.g. > > Please note that this release marks our last official support for > Python 2.5 > > Thanks Bow, > > Peter Fixes applied, thanks too :). 
Best, Bow From p.j.a.cock at googlemail.com Wed Aug 28 22:21:54 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 28 Aug 2013 23:21:54 +0100 Subject: [Biopython-dev] Biopython 1.62 release in progress In-Reply-To: References: Message-ID: On Wed, Aug 28, 2013 at 11:17 PM, Wibowo Arindrarto wrote: > Hi Peter, > >> Thanks Bow - I don't think the WordPress blog understands >> markdown style markup, but bonus marks anyway :) > > Ah yes, I was planning to convert it later to HTML (I find writing > markdown first easier ~ and also more mailing-list friendly). Thank you :) This is live now but can be edited - so we can fix any remaining issues before sending round the emails: http://news.open-bio.org/news/2013/08/biopython-1-62-released/ Tagged on GitHub too, https://github.com/biopython/biopython/tree/biopython-162 Note I have not yet pushed to PyPI - I'd like one or two positive reports first before doing that (just in case). Thanks all, Peter From p.j.a.cock at googlemail.com Wed Aug 28 22:47:04 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 28 Aug 2013 23:47:04 +0100 Subject: [Biopython-dev] Biopython 1.62 released Message-ID: Dear Biopythoneers, Source distributions and Windows installers for Biopython 1.62 are now available from the downloads page on the official Biopython website and (soon) from the Python Package Index (PyPI). Python support This is our first release of Biopython which officially supports Python 3. Specifically, this is supported under Python 3.3. Older versions of Python 3 may still work albeit with some issues, but are not supported. We still fully support Python 2.5, 2.6, and 2.7. Support under Jython is available for versions 2.5 and 2.7 and under PyPy for versions 1.9 and 2.0. However, unlike CPython, Jython and PyPy support is partial: NumPy and our C extensions are not covered. Please note that this release marks our last official support for Python 2.5.
Beginning from Biopython 1.63, the minimum supported Python version will be 2.6. Highlights The translation functions will give a warning on any partial codons (and this will probably become an error in a future release). If you know you are dealing with partial sequences, either pad with 'N' to extend the sequence length to a multiple of three, or explicitly trim the sequence. The handling of joins and related complex features in Genbank/EMBL files has been changed with the introduction of a CompoundLocation object. Previously a SeqFeature for something like a multi-exon CDS would have a child SeqFeature (under the sub_features attribute) for each exon. The sub_features property will still be populated for now, but is deprecated and will in future be removed. Please consult the examples in the help (docstrings) and Tutorial. Thanks to the efforts of Ben Morris, the Phylo module now supports the file formats NeXML and CDAO. The Newick parser is also significantly faster, and can now optionally extract bootstrap values from the Newick comment field (like Molphy and Archaeopteryx do). Nate Sutton added a wrapper for FastTree to Bio.Phylo.Applications. New module Bio.UniProt adds parsers for the GAF, GPA and GPI formats from UniProt-GOA. The BioSQL module is now supported in Jython. MySQL and PostgreSQL databases can be used. The relevant JDBC driver should be available in the CLASSPATH. Feature labels on circular GenomeDiagram figures now support the label_position argument (start, middle or end) in addition to the current default placement, and in a change to prior releases these labels are outside the features which is now consistent with the linear diagrams. The code for parsing 3D structures in mmCIF files was updated to use the Python standard library's shlex module instead of C code using flex. The Bio.Sequencing.Applications module now includes a BWA command line wrapper. Bio.motifs supports JASPAR format files with multiple position-frequency matrices.
Additionally there have been other minor bug fixes and more unit tests. Contributors Many thanks to the Biopython developers and community for making this release possible, especially the following contributors: Alexander Campbell (first contribution) Andrea Rizzi (first contribution) Anthony Mathelier (first contribution) Ben Morris (first contribution) Brad Chapman Christian Brueffer David Arenillas (first contribution) David Martin (first contribution) Eric Talevich Iddo Friedberg Jian-Long Huang (first contribution) Joao Rodrigues Kai Blin Lenna Peterson Michiel de Hoon Matsuyuki Shirota (first contribution) Nate Sutton (first contribution) Peter Cock Petra Kubincová (first contribution) Phillip Garland Saket Choudhary (first contribution) Tiago Antao Wibowo 'Bow' Arindrarto Xabier Bello (first contribution) Thank you all. Release announcement here (RSS feed available): http://news.open-bio.org/news/2013/08/biopython-1-62-released/ P.S. You can follow @Biopython on Twitter https://twitter.com/Biopython From p.j.a.cock at googlemail.com Thu Aug 29 09:04:59 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Thu, 29 Aug 2013 10:04:59 +0100 Subject: [Biopython-dev] Post Biopython 1.62 release, clean-up after dropping Python 2.5 In-Reply-To: References: Message-ID: On Wed, Aug 28, 2013 at 7:53 PM, Peter Cock wrote: > Hello all - especially newcomers, > > There are going to be several boring but useful things to do to > the Biopython code base once we're finished with Python 2.5 > (the imminent release of Biopython 1.62 has been clearly > described as the final Biopython release to support it). > > Some of these tasks are quite easy, and might tempt some > of our non-core contributors or new-comers to have a go, > however to avoid too much duplication of effort I'd suggest > **replying in this thread if you want to tackle anything** - and > then start working out how to send us your first pull request.
> > Things which will need doing: > > (0) Disable the Python 2.5 and Jython 2.5 buildbot > (this will be done by me or Tiago) Done. > (1) Disable the Python 2.5 target in TravisCI, see > https://travis-ci.org/biopython/biopython/ > (this is a simple one line edit to the .travis.yml file) Done by Wayne, https://github.com/biopython/biopython/commit/d134b3ae6d963b81510c40c621d640ee00b6f3de > (2) Remove all the with statement imports (and any > comment lines associated with them): > > from __future__ import with_statement Done by Wayne, https://github.com/biopython/biopython/commit/eeab501987de61ae5935153e1b1a0b225878cb84 > (3) Remove Bio/_py3k/_namedtuple.py and adjust > import lines accordingly Any new volunteer want to try this? > (4) Scan over the code base looking for any comments > about Python 2.5 (e.g. using the grep command), and > reviewing them one by one to see if there is an old > workaround we can now remove. Lenna had a quick look, there should be some easy one here. > (5) More advanced code review, for example looking > for places we can better take advantage of context > managers (with statements) for file handles. Another new one, related to (5), and fairly easy: (6) Reviewing examples in the docstrings and Tutorial where it would make sense to use a 'with' for file handles. This should also solve many of the ResourceWarning: unclosed file ... warnings visible running the full test suite under Python 3, e.g. see: http://testing.open-bio.org/biopython/builders/Linux%2064%20-%20Python%203.3/builds/298/steps/shell/logs/stdio Peter From chris.mit7 at gmail.com Thu Aug 29 15:20:09 2013 From: chris.mit7 at gmail.com (Chris Mitchell) Date: Thu, 29 Aug 2013 11:20:09 -0400 Subject: [Biopython-dev] Post Biopython 1.62 release, clean-up after dropping Python 2.5 In-Reply-To: References: Message-ID: I was going to take a stab at (3), but it seems that _namedtuple.py doesn't exist. Looking under _py3k as well as grep -Ri namedtuple ./* fails to find it. 
I'm pulling from https://github.com/biopython/biopython.git On Thu, Aug 29, 2013 at 5:04 AM, Peter Cock wrote: > On Wed, Aug 28, 2013 at 7:53 PM, Peter Cock > wrote: > > Hello all - especially newcomers, > > > > There are going to be several boring but useful things to do to > > the Biopython code base once we're finished with Python 2.5 > > (the imminent release of Biopython 1.62 has been clearly > > described as the final Biopython release to support it). > > > > Some of these tasks are quite easy, and might tempt some > > of our non-core contributors or new-comers to have a go, > > however to avoid too much duplication of effort I'd suggest > > **replying in this thread if you want to tackle anything** - and > > then start working out how to send us your first pull request. > > > > Things which will need doing: > > > > (0) Disable the Python 2.5 and Jython 2.5 buildbot > > (this will be done by me or Tiago) > > Done. > > > (1) Disable the Python 2.5 target in TravisCI, see > > https://travis-ci.org/biopython/biopython/ > > (this is a simple one line edit to the .travis.yml file) > > Done by Wayne, > > https://github.com/biopython/biopython/commit/d134b3ae6d963b81510c40c621d640ee00b6f3de > > > (2) Remove all the with statement imports (and any > > comment lines associated with them): > > > > from __future__ import with_statement > > Done by Wayne, > > https://github.com/biopython/biopython/commit/eeab501987de61ae5935153e1b1a0b225878cb84 > > > (3) Remove Bio/_py3k/_namedtuple.py and adjust > > import lines accordingly > > Any new volunteer want to try this? > > > (4) Scan over the code base looking for any comments > > about Python 2.5 (e.g. using the grep command), and > > reviewing them one by one to see if there is an old > > workaround we can now remove. > > Lenna had a quick look, there should be some easy one here. 
> > > (5) More advanced code review, for example looking > > for places we can better take advantage of context > > managers (with statements) for file handles. > > Another new one, related to (5), and fairly easy: > > (6) Reviewing examples in the docstrings and Tutorial > where it would make sense to use a 'with' for file handles. > > This should also solve many of the ResourceWarning: > unclosed file ... warnings visible running the full test > suite under Python 3, e.g. see: > > http://testing.open-bio.org/biopython/builders/Linux%2064%20-%20Python%203.3/builds/298/steps/shell/logs/stdio > > Peter > _______________________________________________ > Biopython-dev mailing list > Biopython-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython-dev > From p.j.a.cock at googlemail.com Thu Aug 29 15:30:51 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Thu, 29 Aug 2013 16:30:51 +0100 Subject: [Biopython-dev] Post Biopython 1.62 release, clean-up after dropping Python 2.5 In-Reply-To: References: Message-ID: On Thu, Aug 29, 2013 at 4:20 PM, Chris Mitchell wrote: > I was going to take a stab at (3), but it seems that _namedtuple.py doesn't > exist. > > Looking under _py3k as well as grep -Ri namedtuple ./* > > fails to find it. I'm pulling from > https://github.com/biopython/biopython.git Oops. I wrote that email on my laptop - it was a file never checked into source code control. Looking back it was a plan for allowing us to use named tuples on older versions of Python. Sorry! But I have come up with another easy task instead, (7) Update exception style from this, except ErrorClass, variable_name: to this: except ErrorClass as variable_name: The second form is the only allowed syntax in Python 3, but was not possible under Python 2.5.
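To make task (7) concrete, here is a small sketch of the newer spelling in use. The function parse_count and its inputs are hypothetical and only serve to show the syntax; the old Python 2 only form is left in a comment for comparison.

```python
def parse_count(text):
    """Return int(text), or None if text is not a valid integer."""
    try:
        return int(text)
    except ValueError as err:  # accepted by Python 2.6+ and Python 3
        # The Python 2 only spelling was:  except ValueError, err:
        print("Could not parse %r: %s" % (text, err))
        return None


good = parse_count("42")
bad = parse_count("forty-two")
```

The same source file should then run unchanged on Python 2.6, 2.7 and 3.x, which is the point of the clean-up.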
Regards, Peter From chris.mit7 at gmail.com Thu Aug 29 16:03:51 2013 From: chris.mit7 at gmail.com (Chris Mitchell) Date: Thu, 29 Aug 2013 12:03:51 -0400 Subject: [Biopython-dev] Post Biopython 1.62 release, clean-up after dropping Python 2.5 In-Reply-To: References: Message-ID: Sounds good. Just took care of (7), running the test suite and will send a pull request when that passes. Chris On Thu, Aug 29, 2013 at 11:30 AM, Peter Cock wrote: > On Thu, Aug 29, 2013 at 4:20 PM, Chris Mitchell > wrote: > > I was going to take a stab at (3), but it seems that _namedtuple.py > doesn't > > exist. > > > > Looking under _py3k as well as grep -Ri namedtuple ./* > > > > fails to find it. I'm pulling from > > https://github.com/biopython/biopython.git > > Oops. I wrote that email on my latop - it was a file never checked > into source code control. Looking back it was a plan for allowing > us to use named tuples on older versions of Python. Sorry! > > But I have come up with another easy task instead, > > (7) Update exception style from this, > > except ErrorClass, variable_name: > > to this: > > except ErrorClass as variable_name: > > The second form is the only allowed syntax in Python 3, > but was not possible under Python 2.5. > > Regards, > > Peter > From p.j.a.cock at googlemail.com Thu Aug 29 16:20:51 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Thu, 29 Aug 2013 17:20:51 +0100 Subject: [Biopython-dev] Post Biopython 1.62 release, clean-up after dropping Python 2.5 In-Reply-To: References: Message-ID: On Thu, Aug 29, 2013 at 5:03 PM, Chris Mitchell wrote: > Sounds good. Just took care of (7), running the test suite and will send a > pull request when that passes. > > Chris https://github.com/biopython/biopython/pull/227 looks good, but has highlighted a bug in Scripts/debug/debug_blast_parser.py (see my comment on GitHub). 
Good work, Peter From p.j.a.cock at googlemail.com Thu Aug 29 16:33:43 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Thu, 29 Aug 2013 17:33:43 +0100 Subject: [Biopython-dev] Post Biopython 1.62 release, clean-up after dropping Python 2.5 In-Reply-To: References: Message-ID: > On Wed, Aug 28, 2013 at 7:53 PM, Peter Cock wrote: >> Hello all - especially newcomers, >> >> There are going to be several boring but useful things to do to >> the Biopython code base once we're finished with Python 2.5 >> (the imminent release of Biopython 1.62 has been clearly >> described as the final Biopython release to support it). >> >> Some of these tasks are quite easy, and might tempt some >> of our non-core contributors or new-comers to have a go, >> however to avoid too much duplication of effort I'd suggest >> **replying in this thread if you want to tackle anything** - and >> then start working out how to send us your first pull request. >> >> Things which will need doing: >> >> (0) Disable the Python 2.5 and Jython 2.5 buildbot >> (this will be done by me or Tiago) > > Done. > >> (1) Disable the Python 2.5 target in TravisCI, see >> https://travis-ci.org/biopython/biopython/ >> (this is a simple one line edit to the .travis.yml file) > > Done by Wayne, > https://github.com/biopython/biopython/commit/d134b3ae6d963b81510c40c621d640ee00b6f3de > >> (2) Remove all the with statement imports (and any >> comment lines associated with them): >> >> from __future__ import with_statement > > Done by Wayne, > https://github.com/biopython/biopython/commit/eeab501987de61ae5935153e1b1a0b225878cb84 > >> (3) Remove Bio/_py3k/_namedtuple.py and adjust >> import lines accordingly (3) was a false alarm, just an old file on my laptop confusing me. >> (4) Scan over the code base looking for any comments >> about Python 2.5 (e.g. using the grep command), and >> reviewing them one by one to see if there is an old >> workaround we can now remove.
> > Lenna had a quick look, there should be some easy one here. > >> (5) More advanced code review, for example looking >> for places we can better take advantage of context >> managers (with statements) for file handles. > > Another new one, related to (5), and fairly easy: > > (6) Reviewing examples in the docstrings and Tutorial > where it would make sense to use a 'with' for file handles. > > This should also solve many of the ResourceWarning: > unclosed file ... warnings visible running the full test > suite under Python 3, e.g. see: > http://testing.open-bio.org/biopython/builders/Linux%2064%20-%20Python%203.3/builds/298/steps/shell/logs/stdio On Thu, Aug 29, 2013 at 11:30 AM, Peter Cock wrote: > ... I have come up with another easy task instead, > > (7) Update exception style from this, > > except ErrorClass, variable_name: > > to this: > > except ErrorClass as variable_name: > > The second form is the only allowed syntax in Python 3, > but was not possible under Python 2.5. (7) is being tackled by Chris Mitchell, https://github.com/biopython/biopython/pull/227 Here's another fairly easy task for another new volunteer?: (8) Excluding doctests and the Tutorial, use print function rather than print statement. e.g. replace this: print variable1, variable2 with this: from __future__ import print_function ... print(variable1, variable2) Note that I am deliberately not suggesting we switch the user visible examples on our documentation yet - that deserves some discussion first. Peter From p.j.a.cock at googlemail.com Thu Aug 29 17:03:24 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Thu, 29 Aug 2013 18:03:24 +0100 Subject: [Biopython-dev] Python 2.6+ support for __dir__ method Message-ID: Hi all, I was reading over the list of what's new in Python 2.6 and wondered about this: > The built-in dir() function now checks for a __dir__() method on the > objects it receives. 
This method must return a list of strings containing > the names of valid attributes for the object, and lets the object control > the value that dir() produces. Objects that have __getattr__() or > __getattribute__() methods can use this to advertise pseudo-attributes > they will honor. (issue 1591665) http://docs.python.org/2/whatsnew/2.6.html Does that sound useful for some of our more dynamic objects? Peter From arklenna at gmail.com Thu Aug 29 17:18:16 2013 From: arklenna at gmail.com (Lenna Peterson) Date: Thu, 29 Aug 2013 13:18:16 -0400 Subject: [Biopython-dev] Post Biopython 1.62 release, clean-up after dropping Python 2.5 In-Reply-To: References: Message-ID: On Thu, Aug 29, 2013 at 12:33 PM, Peter Cock wrote: > > Here's another fairly easy task for another new volunteer?: > > (8) Excluding doctests and the Tutorial, use print function > rather than print statement. e.g. replace this: > > print variable1, variable2 > > with this: > > from __future__ import print_function > ... > print(variable1, variable2) > > Note that I am deliberately not suggesting we switch the > user visible examples on our documentation yet - that > deserves some discussion first. > > From the docs: "When using the 2to3 source-to-source conversion tool, all print statements are automatically converted to print() function calls, so this is mostly a non-issue for larger projects." http://docs.python.org/3.0/whatsnew/3.0.html#print-is-a-function Which suggests either doing it with the tool or just waiting until the full 3.0 changeover?
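For readers following the thread, the change described in task (8) might look like the sketch below, which is meant to run unchanged on Python 2.6+ and Python 3. The variable values and the small Collector class are hypothetical, added only so the output can be inspected; they are not from the Biopython code base.

```python
from __future__ import print_function


class Collector(object):
    """Minimal file-like object (hypothetical) that records what print() writes."""

    def __init__(self):
        self.chunks = []

    def write(self, text):
        self.chunks.append(text)


variable1 = "alpha"
variable2 = 2

# The Python 2 only statement form was:  print variable1, variable2
print(variable1, variable2)

# As a function, print also takes keyword arguments such as sep and file.
collector = Collector()
print(variable1, variable2, sep="\t", file=collector)
captured = "".join(collector.chunks)
```

Because the __future__ import is a no-op on Python 3, code written this way does not need 2to3 for its print calls.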
From p.j.a.cock at googlemail.com Thu Aug 29 17:35:16 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Thu, 29 Aug 2013 18:35:16 +0100 Subject: [Biopython-dev] Post Biopython 1.62 release, clean-up after dropping Python 2.5 In-Reply-To: References: Message-ID: On Thursday, August 29, 2013, Lenna Peterson wrote: > > > On Thu, Aug 29, 2013 at 12:33 PM, Peter Cock > > wrote: > >> >> Here's another fairly easy task for another new volunteer?: >> >> (8) Excluding doctests and the Tutorial, use print function >> rather than print statement. e.g. replace this: >> >> print variable1, variable2 >> >> with this: >> >> from __future__ import print_function >> ... >> print(variable1, variable2) >> >> Note that I am deliberately not suggesting we switch the >> user visible examples on our documentation yet - that >> deserves some discussion first. >> >> > From the docs: "When using the 2to3 source-to-source conversion tool, all > print statements are automatically converted to print() function calls, so > this is mostly a non-issue for larger projects." > > http://docs.python.org/3.0/whatsnew/3.0.html#print-is-a-function > > Which suggests either doing it with the tool or just waiting until the > full 3.0 changeover? > My motivation is a step towards a single codebase for both Python 2 and Python 3 without needing 2to3, see: http://lists.open-bio.org/pipermail/biopython-dev/2013-May/010633.html http://www.slideshare.net/pjacock/biopython-update-bosc2013/ Peter From superbobry at gmail.com Thu Aug 29 20:34:59 2013 From: superbobry at gmail.com (Sergei Lebedev) Date: Fri, 30 Aug 2013 00:34:59 +0400 Subject: [Biopython-dev] Post Biopython 1.62 release, clean-up after dropping Python 2.5 In-Reply-To: References: Message-ID: On Thu, Aug 29, 2013 at 8:33 PM, Peter Cock wrote: > Here's another fairly easy task for another new volunteer?: > > (8) Excluding doctests and the Tutorial, use print function > rather than print statement. e.g. 
replace this: > > print variable1, variable2 > > with this: > > from __future__ import print_function > ... > print(variable1, variable2) > > Note that I am deliberately not suggesting we switch the > user visible examples on our documentation yet - that > deserves some discussion first. So the task is to remove print statement from the code only, right? I think I can do this, should I use a separate branch? Sergei From p.j.a.cock at googlemail.com Thu Aug 29 20:44:49 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Thu, 29 Aug 2013 21:44:49 +0100 Subject: [Biopython-dev] Post Biopython 1.62 release, clean-up after dropping Python 2.5 In-Reply-To: References: Message-ID: On Thu, Aug 29, 2013 at 9:34 PM, Sergei Lebedev wrote: > On Thu, Aug 29, 2013 at 8:33 PM, Peter Cock > wrote: >> >> Here's another fairly easy task for another new volunteer?: >> >> (8) Excluding doctests and the Tutorial, use print function >> rather than print statement. e.g. replace this: >> >> print variable1, variable2 >> >> with this: >> >> from __future__ import print_function >> ... >> print(variable1, variable2) >> >> Note that I am deliberately not suggesting we switch the >> user visible examples on our documentation yet - that >> deserves some discussion first. > > > So the task is to remove print statement from the code only, right? Replacing them with print functions, and testing this worked OK under both Python 2 and Python 3, yes :) > I think I can do this, should I use a separate branch? > > Sergei Yes, I would certainly recommend keeping the default 'master' branch as a copy of the official one, and creating a new 'print-function' branch (or whatever name you prefer) for this work. 
We probably need to improve this wiki page - so any comments about what is unclear would be great (on a new email thread): http://biopython.org/wiki/GitUsage Thanks, Peter From p.j.a.cock at googlemail.com Fri Aug 30 10:49:23 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Fri, 30 Aug 2013 11:49:23 +0100 Subject: [Biopython-dev] Post Biopython 1.62 release, clean-up after dropping Python 2.5 In-Reply-To: References: Message-ID: Hello Biopythoneers, I've outlined another relatively simple improvement for potential new contributors to try below.... On Thu, Aug 29, 2013 at 5:33 PM, Peter Cock wrote: >> On Wed, Aug 28, 2013 at 7:53 PM, Peter Cock wrote: >>> Hello all - especially newcomers, >>> >>> There are going to be several boring but useful things to do to >>> the Biopython code base once we're finished with Python 2.5 >>> (the imminent release of Biopython 1.62 has been clearly >>> described as the final Biopython release to support it). >>> >>> ... >>> >>> (4) Scan over the code base looking for any comments >>> about Python 2.5 (e.g. using the grep command), and >>> reviewing them one by one to see if there is an old >>> workaround we can now remove. >> >> Lenna had a quick look, there should be some easy one here. >> >>> (5) More advanced code review, for example looking >>> for places we can better take advantage of context >>> managers (with statements) for file handles. >> >> Another new one, related to (5), and fairly easy: >> >> (6) Reviewing examples in the docstrings and Tutorial >> where it would make sense to use a 'with' for file handles. >> >> This should also solve many of the ResourceWarning: >> unclosed file ... warnings visible running the full test >> suite under Python 3, e.g. see: >> http://testing.open-bio.org/biopython/builders/Linux%2064%20-%20Python%203.3/builds/298/steps/shell/logs/stdio > > On Thu, Aug 29, 2013 at 11:30 AM, Peter Cock wrote: >> ... 
I have come up with another easy task instead,
>>
>> (7) Update exception style

(7) was done by Chris Mitchell,
https://github.com/biopython/biopython/commit/1d42f4dc07c8203a162d635b9bca5acb90204942

> (8) Excluding doctests and the Tutorial, use print function
> rather than print statement. e.g. replace this:

(8) is being looked at by Sergei Lebedev.

----

Here's another idea, under the general issue (5) of taking advantage of context managers (with statements), which I would judge to be fairly easy (but not trivial).

(9) Use context managers (with statements) for temporary warning filters in the unit tests.

Currently many of our unit tests add simple filters to ignore a warning, and then restore the old filters using pop(). This mostly works, but is fragile and the filter list is global so this can have strange side effects. See:

    $ grep "warnings." Tests/*.py

The idea here is to replace this:

    warnings.simplefilter('ignore', PDBConstructionWarning)
    #some code which may trigger the warning
    warnings.filters.pop()

with this:

    with warnings.catch_warnings():
        warnings.simplefilter("ignore", PDBConstructionWarning)
        #some code which may trigger the warning

Note the indentation - these changes will not give nice clean diffs, so this will not be so easy to review.

I would therefore suggest editing just one test file at a time (i.e. limit each commit to changing a single file), as that makes it easier to selectively apply your changes.

Please make sure you test this on Python 2.6, which is most likely to have problems with this "new" style ;)

(Again, if anyone plans to work on this, please let the list know to minimise duplicated effort.)

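As a minimal sketch of the pattern in (9), the following shows that a filter added inside warnings.catch_warnings() is discarded when the block exits, so nothing has to be popped by hand. It assumes a hypothetical DemoConstructionWarning class and noisy_parse() function standing in for PDBConstructionWarning and the real parser, so that it runs without Biopython installed. One detail that may be worth checking against the warnings documentation: simplefilter() inserts at the front of the filter list by default, while list.pop() removes the last entry, which could be one reason the older style is fragile.

```python
import warnings


class DemoConstructionWarning(UserWarning):
    """Hypothetical stand-in for Bio.PDB's PDBConstructionWarning."""


def noisy_parse():
    """Hypothetical parser: emits a warning, then returns a result."""
    warnings.warn("discontinuous chain", DemoConstructionWarning)
    return "structure"


def parse_quietly():
    """Call noisy_parse() with the warning ignored only inside the block."""
    with warnings.catch_warnings():
        # This filter exists only until the with-block exits.
        warnings.simplefilter("ignore", DemoConstructionWarning)
        return noisy_parse()


if __name__ == "__main__":
    saved = list(warnings.filters)
    print(parse_quietly())
    # The global filter list is back to what it was before the call.
    print(list(warnings.filters) == saved)
```

Because the context manager saves and restores the whole filter list, the restore also happens if the code inside the block raises an exception.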
If you're not familiar with our test suite, there is a chapter introducing this in the main Tutorial & Cookbook, http://biopython.org/DIST/docs/tutorial/Tutorial.html Thanks, Peter From superbobry at gmail.com Fri Aug 30 12:58:31 2013 From: superbobry at gmail.com (Sergei Lebedev) Date: Fri, 30 Aug 2013 16:58:31 +0400 Subject: [Biopython-dev] Post Biopython 1.62 release, clean-up after dropping Python 2.5 In-Reply-To: References: Message-ID: > (8) Excluding doctests and the Tutorial, use print function > rather than print statement. e.g. replace this: Unfortunately we cannot exclude doctests, because 'from __future__' import is module wide, thus the 'doctest.testmod()' will raise a SyntaxError on docstrings with print statement. Sergei On Fri, Aug 30, 2013 at 12:44 AM, Peter Cock wrote: > On Thu, Aug 29, 2013 at 9:34 PM, Sergei Lebedev > wrote: > > On Thu, Aug 29, 2013 at 8:33 PM, Peter Cock > > wrote: > >> > >> Here's another fairly easy task for another new volunteer?: > >> > >> (8) Excluding doctests and the Tutorial, use print function > >> rather than print statement. e.g. replace this: > >> > >> print variable1, variable2 > >> > >> with this: > >> > >> from __future__ import print_function > >> ... > >> print(variable1, variable2) > >> > >> Note that I am deliberately not suggesting we switch the > >> user visible examples on our documentation yet - that > >> deserves some discussion first. > > > > > > So the task is to remove print statement from the code only, right? > > Replacing them with print functions, and testing this > worked OK under both Python 2 and Python 3, yes :) > > > I think I can do this, should I use a separate branch? > > > > Sergei > > Yes, I would certainly recommend keeping the > default 'master' branch as a copy of the official one, > and creating a new 'print-function' branch (or whatever > name you prefer) for this work. 
>
> We probably need to improve this wiki page - so any
> comments about what is unclear would be great (on
> a new email thread): http://biopython.org/wiki/GitUsage
>
> Thanks,
>
> Peter
>

From p.j.a.cock at googlemail.com Fri Aug 30 13:14:14 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 30 Aug 2013 14:14:14 +0100
Subject: [Biopython-dev] Post Biopython 1.62 release, clean-up after dropping Python 2.5
In-Reply-To:
References:
Message-ID:

On Fri, Aug 30, 2013 at 1:58 PM, Sergei Lebedev wrote:
>> (8) Excluding doctests and the Tutorial, use print function
>> rather than print statement. e.g. replace this:
>
> Unfortunately we cannot exclude doctests, because 'from __future__' import
> is module wide, thus the 'doctest.testmod()' will raise a SyntaxError on
> docstrings with print statement.
>
> Sergei

Could you clarify this? Does this cause a problem via:

    [Tests]$ python run_tests.py doctest

If you have a small example, copy & paste the "git diff" output here.

Peter

From superbobry at gmail.com Fri Aug 30 13:28:50 2013
From: superbobry at gmail.com (Sergei Lebedev)
Date: Fri, 30 Aug 2013 17:28:50 +0400
Subject: [Biopython-dev] Re: Post Biopython 1.62 release, clean-up after dropping Python 2.5
In-Reply-To:
References:
Message-ID:

Sure, a common pattern for a lot of BioPython modules seems to be:

    # +from __future__ import print_function


    def foo():
        """A docstring with print statement.

        >>> print "foo"
        foo
        """
        print "Running foo ..."
        # +print("Running foo ...")


    if __name__ == "__main__":
        import doctest
        doctest.testmod()

where foo is some function, which uses print statement in its body. Since we want to switch from print statements to print function we replace print "Running foo ..." with a print() call and add from __future__ import ... to the beginning of the module.

What happens if we try to run the doctests after we've switched to print_function?

    $ python /tmp/foo.py
    **********************************************************************
    File "/tmp/foo.py", line 7, in __main__.foo
    Failed example:
        print "foo"
    Exception raised:
        Traceback (most recent call last):
          File ".../doctest.py", line 1254, in __run
            compileflags, 1) in test.globs
          File "", line 1
            print "foo"
                      ^
        SyntaxError: invalid syntax
    **********************************************************************
    1 items had failures:
       1 of   1 in __main__.foo
    ***Test Failed*** 1 failures.

So, enabling print_function makes doctests using print statement fail with a SyntaxError, as shown by the example above. Thus, if we want to get rid of print statement in the code we have no other choice but to do the same in the doctests.

Sergei

On August 30, 2013 at 5:14:14 PM, Peter Cock (p.j.a.cock at googlemail.com) wrote:

On Fri, Aug 30, 2013 at 1:58 PM, Sergei Lebedev wrote:
>> (8) Excluding doctests and the Tutorial, use print function
>> rather than print statement. e.g. replace this:
>
> Unfortunately we cannot exclude doctests, because 'from __future__' import
> is module wide, thus the 'doctest.testmod()' will raise a SyntaxError on
> docstrings with print statement.
>
> Sergei

Could you clarify this? Does this cause a problem via:

    [Tests]$ python run_tests.py doctest

If you have a small example, copy & paste the "git diff" output here.

Peter

From p.j.a.cock at googlemail.com Fri Aug 30 14:22:26 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 30 Aug 2013 15:22:26 +0100
Subject: [Biopython-dev] Post Biopython 1.62 release, clean-up after dropping Python 2.5
In-Reply-To:
References:
Message-ID:

Thanks Sergei - that clarified things.
Unfortunately this doesn't just break our convenience __main__ trick for running the doctests in any single module, it also breaks doing it via: $ python run_tests.py doctest This means we'd have to update the doctests to also use Python 3 style print functions... which may be premature (we'll need to do this at some point though). How about the less ambitious plan of replacing lines like this: print variable with: print(variable) This will be understood as a print function call on Python 3 (and work), and will also work on Python 2 (without the future import) where it will be parsed as redundant parentheses. Note you can't use this trick where more than one variable is printed, because then on Python 2 the brackets will create a tuple instead. Peter On Fri, Aug 30, 2013 at 2:28 PM, Sergei Lebedev wrote: > Sure, a common pattern for a lot of BioPython modules seems to be: > > # +from __future__ import print_function > > > def foo(): > """A docstring with print statement. > > >>> print "foo" > foo > """ > print "Running foo ..." > # +print("Running foo ...") > > > if __name__ == "__main__": > import doctest > doctest.testmod() > > where foo is some function, which uses print statement in its body. Since we > want to switch from print statements to print function we replace print > "Running foo ..." with a print() call and add from __future__ import ... to > the beginning of the module. > > What happens if we try to run the doctests after we've switched to > print_function? 
> > $ python /tmp/foo.py > ********************************************************************** > File "/tmp/foo.py", line 7, in __main__.foo > Failed example: > print "foo" > Exception raised: > Traceback (most recent call last): > File ".../doctest.py", line 1254, in __run > compileflags, 1) in test.globs > File "", line 1 > print "foo" > ^ > SyntaxError: invalid syntax > ********************************************************************** > 1 items had failures: > 1 of 1 in __main__.foo > ***Test Failed*** 1 failures. > > So, enabling print_function makes doctests using print statement fail with a > SyntaxError, as shown by the example above. Thus, if we want to get rid of > print statement in the code we have no other choice but to do the same it in > the doctests. > > Sergei > > > > On August 30, 2013 at 5:14:14 PM, Peter Cock (p.j.a.cock at googlemail.com) > wrote: > > On Fri, Aug 30, 2013 at 1:58 PM, Sergei Lebedev > wrote: >>> (8) Excluding doctests and the Tutorial, use print function >>> rather than print statement. e.g. replace this: >> >> Unfortunately we cannot exclude doctests, because 'from __future__' import >> is module wide, thus the 'doctest.testmod()' will raise a SyntaxError on >> docstrings with print statement. >> >> Sergei > > Could you clarify this? Does this cause a problem via: > > [Tests]$ python run_tests.py doctest > > If you have a small example, copy & paste the "git diff" output here. > > Peter From p.j.a.cock at googlemail.com Fri Aug 30 15:46:59 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Fri, 30 Aug 2013 16:46:59 +0100 Subject: [Biopython-dev] Fwd: [biopython] Potential error in mass calculations for RNA/DNA? (#229) In-Reply-To: References: Message-ID: Who are our sequence mass experts? https://github.com/biopython/biopython/issues/229 ---------- Forwarded message ---------- From: nruggero Date: Thu, Aug 29, 2013 at 11:03 PM Subject: [biopython] Potential error in mass calculations for RNA/DNA? 
(#229)
To: biopython/biopython

In Bio/Data/IUPACData.py the molecular weights of unambiguous DNA are listed as:

    unambiguous_dna_weights = {
        "A": 347.,
        "C": 323.,
        "G": 363.,
        "T": 322.,
    }

As far as I can tell these are the molecular weights for the non-deoxy bases instead of the deoxy bases. For example, AMP (347.22) instead of dAMP (331.22) is listed. I've looked at the original BioPerl code that these numbers were taken from and I think they were just copied incorrectly.

I have also looked at the code which uses this dict in Bio/SeqUtils/__init__.py called molecular_weight() and it just takes the sum of these values over the sequence (no correction made).

So, is this an error or am I missing something basic?

Thanks

From superbobry at gmail.com Fri Aug 30 22:53:53 2013
From: superbobry at gmail.com (Sergei Lebedev)
Date: Sat, 31 Aug 2013 02:53:53 +0400
Subject: [Biopython-dev] Post Biopython 1.62 release, clean-up after dropping Python 2.5
In-Reply-To:
References:
Message-ID:

Peter, I've just submitted a PR [*] for #8 along with a 2to3 fixer which does all the job, so I think I can take #9.

Sergei

[*] https://github.com/biopython/biopython/pull/230

On Fri, Aug 30, 2013 at 2:49 PM, Peter Cock wrote:

> Hello Biopythoneers,
>
> I've outlined another relatively simple improvement for potential
> new contributors to try below....
>
> On Thu, Aug 29, 2013 at 5:33 PM, Peter Cock
> wrote:
> >> On Wed, Aug 28, 2013 at 7:53 PM, Peter Cock
> wrote:
> >>> Hello all - especially newcomers,
> >>>
> >>> There are going to be several boring but useful things to do to
> >>> the Biopython code base once we're finished with Python 2.5
> >>> (the imminent release of Biopython 1.62 has been clearly
> >>> described as the final Biopython release to support it).
> >>>
> >>> ...
> >>>
> >>> (4) Scan over the code base looking for any comments
> >>> about Python 2.5 (e.g.
using the grep command), and > >>> reviewing them one by one to see if there is an old > >>> workaround we can now remove. > >> > >> Lenna had a quick look, there should be some easy one here. > >> > >>> (5) More advanced code review, for example looking > >>> for places we can better take advantage of context > >>> managers (with statements) for file handles. > >> > >> Another new one, related to (5), and fairly easy: > >> > >> (6) Reviewing examples in the docstrings and Tutorial > >> where it would make sense to use a 'with' for file handles. > >> > >> This should also solve many of the ResourceWarning: > >> unclosed file ... warnings visible running the full test > >> suite under Python 3, e.g. see: > >> > http://testing.open-bio.org/biopython/builders/Linux%2064%20-%20Python%203.3/builds/298/steps/shell/logs/stdio > > > > On Thu, Aug 29, 2013 at 11:30 AM, Peter Cock > wrote: > >> ... I have come up with another easy task instead, > >> > >> (7) Update exception style > > (7) was done by Chris Mitchell, > > https://github.com/biopython/biopython/commit/1d42f4dc07c8203a162d635b9bca5acb90204942 > > > (8) Excluding doctests and the Tutorial, use print function > > rather than print statement. e.g. replace this: > > (8) is being looked at by Sergei Lebedev. > > ---- > > Here's another idea, under the general issue (5) of taking > advantage of context managers (with statements), which > I would judge to be fairly easy (but not trivial). > > (9) Use context managers (with statements) for temporary > warning filters in the unit tests. > > Currently many of our unit tests add simple filters to ignore > a warning, and then restore the old filters using pop(). This > mostly works, but is fragile and the filter list is global so this > can have strange side effects. See: > > $ grep "warnings." 
Tests/*.py > > The idea here is to replace this: > > warnings.simplefilter('ignore', PDBConstructionWarning) > #some code which may trigger the warning > warnings.filters.pop() > > with this: > > with warnings.catch_warnings(): > warnings.simplefilter("ignore", PDBConstructionWarning) > #some code which may trigger the warning > > Note the indentation - these changes will not give nice > clean diffs, so this will not be so easy to review. > > I would therefore suggest editing just one test file at a > time (i.e. limit each commit to changing a single file), as > that makes it easier to selectively apply your changes > > Please make sure you test this Python 2.6 which is most > likely to have problems with this "new" style ;) > > (Again, if anyone plans to work on this, please let the list > know to minimised duplicated effort.) > > If you're not familiar with our test suite, there is a chapter > introducing this in the main Tutorial & Cookbook, > http://biopython.org/DIST/docs/tutorial/Tutorial.html > > Thanks, > > Peter > _______________________________________________ > Biopython-dev mailing list > Biopython-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython-dev > From p.j.a.cock at googlemail.com Sat Aug 31 09:31:53 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Sat, 31 Aug 2013 10:31:53 +0100 Subject: [Biopython-dev] Post Biopython 1.62 release, clean-up after dropping Python 2.5 In-Reply-To: References: Message-ID: On Fri, Aug 30, 2013 at 11:53 PM, Sergei Lebedev wrote: > Peter, I've just submitted a PR [*] for #8 along with a 2to3 fixer which > does all the job, so I think I can take #9. > > Sergei > > [*] https://github.com/biopython/biopython/pull/230 Print-function-like syntax committed for (8), thank you. We'll need to come back to this later as there are still lots of print statements left in the codebase... time for a more general discussion about what people would prefer to see in the user-facing documentation. 
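For illustration, the print-function-like syntax can be sketched as below. This is not taken from the pull request, and format_row and format_fields are hypothetical helper names. The assumption is simply that building one string first means print() receives a single argument, which Python 2 (without the __future__ import) parses as a parenthesised expression and Python 3 as a function call.

```python
def format_row(name, length):
    """Build the whole output line first, so print() gets one argument."""
    return "%s has length %i" % (name, length)


def format_fields(*fields):
    """Join several values into one string, mimicking 'print a, b, c'."""
    return " ".join(str(field) for field in fields)


if __name__ == "__main__":
    # A single parenthesised argument behaves the same under Python 2
    # (print statement) and Python 3 (print function).
    print(format_row("example", 42))
    # print("A", 1) would show a tuple on Python 2 without the future
    # import, so the fields are joined into one string beforehand.
    print(format_fields("A", 1, 2.5))
```

Lines printing more than one value need this kind of rewrite, or the __future__ import, before they can share a codebase with Python 3.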
If you'd like to try some context managers for the warnings in the unit tests (9), that would be great. Note some of the tests will require you to install a command line tool - it should be clear, but if we need to add more documentation (e.g. URLs) please let us know. Thanks, Peter