From eric.talevich at gmail.com  Thu Aug  1 16:04:29 2013
From: eric.talevich at gmail.com (Eric Talevich)
Date: Thu, 1 Aug 2013 13:04:29 -0700
Subject: [Biopython-dev] 1.62b test coverage report
In-Reply-To: <CAKVJ-_5+mSW5NkmxN94w-8qu+e=q4COyZaAx6UzCmrwuJdU9aQ@mail.gmail.com>
References: <CA+ijMs=yT-=CFr+qwkOZ107oBN0wEFdjC9uMFCh+j1YfDD4DZw@mail.gmail.com>
	<CAKVJ-_5+mSW5NkmxN94w-8qu+e=q4COyZaAx6UzCmrwuJdU9aQ@mail.gmail.com>
Message-ID: <CAMC681mkn3FBBdyEMTkba1T60KoQmcR1NHLKHNLc1pUo9A5rWw@mail.gmail.com>

On Wed, Jul 31, 2013 at 12:40 AM, Peter Cock <p.j.a.cock at googlemail.com> wrote:

> On Wednesday, July 31, 2013, Ben Fulton wrote:
>
> > I ran Ned Batchelder's coverage tool against the 1.62 beta code to see how
> > much code is covered by tests. The overall total was 74% which is pretty
> > respectable.
> >
> > I ran the tests on a fairly fresh machine, which meant I had to install a
> > lot of software, some of which I either didn't get installed properly, or
> > the tests are out of date, or there were failures for some other reason. I
> > ended up having to skip seven test files:
> >
> > Dialign_Tool
> > EmbossPhylipNew
> > Mafft
> > PopGen_DFDist
> > PopGen_FDist
> > XXMotif
> > phyml
>
>
> I'm pretty sure I have some or all of those set up on at least one
> of my test machines, so with a little more work together we
> can try to resolve those (which may mean updating the docs).
>

I just fixed the error in test_phyml_tool.py; it was a simple one:
https://github.com/biopython/biopython/commit/90da547f0a85c00d3ca300bdf52bdb96ddeb449f


> > There were three tests I managed to get running but still had failures:
> >
> > FastTree
> > NCBI_BLAST
> > Prank_tool
>

The FastTree test is not based on the unittest framework, so the output
contains the word "Failed" in three places to describe error-handling tests
that worked correctly. Can we see the output for this one? (It works on my
machine.)

The test is also fairly new, so there could be some version-compatibility
issues there too.

Thanks,
Eric

From ben at benfulton.net  Thu Aug  1 22:20:49 2013
From: ben at benfulton.net (Ben Fulton)
Date: Thu, 1 Aug 2013 22:20:49 -0400
Subject: [Biopython-dev] 1.62b test coverage report
In-Reply-To: <CAMC681mkn3FBBdyEMTkba1T60KoQmcR1NHLKHNLc1pUo9A5rWw@mail.gmail.com>
References: <CA+ijMs=yT-=CFr+qwkOZ107oBN0wEFdjC9uMFCh+j1YfDD4DZw@mail.gmail.com>
	<CAKVJ-_5+mSW5NkmxN94w-8qu+e=q4COyZaAx6UzCmrwuJdU9aQ@mail.gmail.com>
	<CAMC681mkn3FBBdyEMTkba1T60KoQmcR1NHLKHNLc1pUo9A5rWw@mail.gmail.com>
Message-ID: <CA+ijMs=XOe5Q6cE5vCg_OdnjcTGA=ZjuCJVCjcWY+reW1=jnnQ@mail.gmail.com>

My test machine was running Ubuntu 12.04.

For fasttree I installed version 2.1.4-1~ubuntu12.04.1 using apt-get, and
got this error:
ApplicationError: Command 'fasttree -out temp_test.tree
Quality/example.fasta' returned non-zero exit status 1, 'Unknown or
incorrect use of option -out'

The NCBI_BLAST error is that rpsblast is missing from the install (version
2.2.25-7, installed using apt-get).

Dialign is version 2.2.1-5 using apt-get. I got two errors: first,
DIALIGN2_DIR not being set. It was installed to /usr/bin so I set
DIALIGN2_DIR to that directory; then I got "Environment variable
DIALIGN2_DIR directory missing BLOSUM file." I'm not sure either of these
items is needed, though I may have missed them in the documentation.

I downloaded version 130708 of Prank from
http://code.google.com/p/prank-msa/downloads/list. The error is on line 165
of the test file:

AssertionError:
-----------------
 PRANK v.130708:
-----------------

Input for the analysis
 - converting 'Quality/example.fasta' to 'temp with space.phy'

EmbossPhylipNew I tried to install from source, but it was complicated and
I didn't get it finished.

I'll send some notes on the other errors when I get a few minutes.



From glenveegee at gmail.com  Fri Aug  2 04:17:14 2013
From: glenveegee at gmail.com (Glen van Ginkel)
Date: Fri, 2 Aug 2013 09:17:14 +0100
Subject: [Biopython-dev] Fwd: pdb-l: Announcement: wwPDB Workshop on
 mmCIF/PDBx for Programmers, 20/21 Nov-13, Cambridge (UK)
In-Reply-To: <51FB69C6.3040200@ebi.ac.uk>
References: <alpine.LRH.1.10.1308011834090.22795@struktbio205.bmc.uu.se>
	<51FB69C6.3040200@ebi.ac.uk>
Message-ID: <CABrdYJKQ4QxHs1k3V1-t0--rF4EcBvFNs4OGMz5HotDSSpDo=A@mail.gmail.com>

Hi all,

Given Lenna's recent work on the mmCIF parser I thought this might be of
interest.

Kind regards,

Glen

wwPDB Workshop on mmCIF/PDBx for Programmers
--------------------------------------------

What, why and how?
------------------
The world of the PDB will be changing rapidly and profoundly over the next
few years. A major change will involve the transition from PDB to mmCIF/PDBx
as the principal deposition and dissemination format (see
http://www.wwpdb.org/news/news_2013.html#22-May-2013 and
http://wwpdb.org/workshop/wgroup.html). To help software developers in the
area of structural biology to make the transition and begin supporting the
mmCIF/PDBx format in their own programs, wwPDB (http://wwpdb.org/) is
organising a programmers workshop. This two-day event will include lectures
by experts in mmCIF/PDBx (http://mmcif.rcsb.org/) and developers of
language-specific libraries or packages (C/C++, Java, Python). Ample time
will be devoted to tutorials and individual "code hacking", with the experts
available to assist the workshop participants. Confirmed tutors include Paul
Adams (Phenix), Eugene Krissinel (CCP4), Garib Murshudov (Refmac), Andreas
Prlic (RCSB), Sameer Velankar (PDBe) and John Westbrook (RCSB).

When and where?
---------------
The workshop will be held at the EMBL-EBI (http://ebi.ac.uk/) in Hinxton,
Cambridge, UK, on 20 and 21 November 2013.

How much?
---------
If you are selected as a participant, we expect you to pay for your own
travel to and from Cambridge. However, there is no fee for this workshop,
and we will provide accommodation (at the HolidayInn Express in nearby
Duxford; http://www.hiexpresscambridgeduxford.co.uk/), lunches and a
workshop dinner on the 20th (all thanks to generous funding from the
Wellcome Trust to PDBe).

Who can apply and how?
----------------------
This workshop is intended for "high-powered" software developers in any area
of structural biology and structural bioinformatics whose products process
(read/write) PDB data - e.g., X-ray, NMR, 3DEM, SAXS/SANS, hybrid methods,
visualisation, validation, modelling, docking, structure prediction, etc. To
ensure a high ratio of tutors to workshop participants, the number of
participants is limited to 15.

You can apply for the workshop by sending an e-mail to Sameer Velankar at
PDBe (sameer at ebi.ac.uk) no later than 31 August 2013. Please include:

- a brief description of the software program(s) or package(s) you have
developed or are developing, what it does, in which field, how many users,
relevant publications, etc.;
- what programming language(s) you are specifically interested in;
- how you would benefit from this workshop;
- any specific topics or questions you would like to see addressed in the
workshop.

If the workshop is oversubscribed, we will use the information and
motivation provided by the applicants to select the participants.

Participants are expected to bring their own laptop with compilers etc.
installed. No previous knowledge of mmCIF/PDBx is strictly needed, but
participants who are aware of the basic principles of the format will
probably gain more from the workshop.

Applicants will be informed by mid-September if they have been selected or
not, or if they are on the stand-by list.

For informal inquiries about the workshop, please contact Sameer Velankar at
PDBe (sameer at ebi.ac.uk).

Please feel free to distribute this announcement to other interested people
or fora!


--Gerard Kleywegt & Sameer Velankar
   Protein Data Bank in Europe
   A member of the Worldwide Protein Data Bank

---
Gerard J. Kleywegt, PDBe, EMBL-EBI, Hinxton, UK
gerard at ebi.ac.uk ..................... pdbe.org
Secretary: Pauline Haslam  pdbe_admin at ebi.ac.uk
TO UNSUBSCRIBE OR CHANGE YOUR SUBSCRIPTION OPTIONS, please see
https://lists.sdsc.edu/mailman/listinfo/pdb-l .

From p.j.a.cock at googlemail.com  Fri Aug  2 05:16:53 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 2 Aug 2013 10:16:53 +0100
Subject: [Biopython-dev] Fwd: pdb-l: Announcement: wwPDB Workshop on
 mmCIF/PDBx for Programmers, 20/21 Nov-13, Cambridge (UK)
In-Reply-To: <CABrdYJKQ4QxHs1k3V1-t0--rF4EcBvFNs4OGMz5HotDSSpDo=A@mail.gmail.com>
References: <alpine.LRH.1.10.1308011834090.22795@struktbio205.bmc.uu.se>
	<51FB69C6.3040200@ebi.ac.uk>
	<CABrdYJKQ4QxHs1k3V1-t0--rF4EcBvFNs4OGMz5HotDSSpDo=A@mail.gmail.com>
Message-ID: <CAKVJ-_555M7sGPoyr9NCiUoH+5-dDdqd18SXpX6+yM520h+Ohg@mail.gmail.com>

Thanks for forwarding that Glen - it would be great if any of
our structural Biopython folk could go.

Is anyone interested & reasonably close to Cambridge UK?

Peter


From p.j.a.cock at googlemail.com  Fri Aug  2 05:31:27 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 2 Aug 2013 10:31:27 +0100
Subject: [Biopython-dev] 1.62b test coverage report
In-Reply-To: <CA+ijMs=XOe5Q6cE5vCg_OdnjcTGA=ZjuCJVCjcWY+reW1=jnnQ@mail.gmail.com>
References: <CA+ijMs=yT-=CFr+qwkOZ107oBN0wEFdjC9uMFCh+j1YfDD4DZw@mail.gmail.com>
	<CAKVJ-_5+mSW5NkmxN94w-8qu+e=q4COyZaAx6UzCmrwuJdU9aQ@mail.gmail.com>
	<CAMC681mkn3FBBdyEMTkba1T60KoQmcR1NHLKHNLc1pUo9A5rWw@mail.gmail.com>
	<CA+ijMs=XOe5Q6cE5vCg_OdnjcTGA=ZjuCJVCjcWY+reW1=jnnQ@mail.gmail.com>
Message-ID: <CAKVJ-_6Rp5RC4pM1wMJ4k2qyZKhOCqWmd8e-x+ak7dQqHOyXqw@mail.gmail.com>

Thanks for these details Ben - it sounds like a mixture of real
test failures, and mere warnings that an external tool wasn't
found.

On Fri, Aug 2, 2013 at 3:20 AM, Ben Fulton <ben at benfulton.net> wrote:
> My test machine was running Ubuntu 12.04.
>
> For fasttree I installed version 2.1.4-1~ubuntu12.04.1 using apt-get, and
> got this error:
> ApplicationError: Command 'fasttree -out temp_test.tree
> Quality/example.fasta' returned non-zero exit status 1, 'Unknown or
> incorrect use of option -out'

I don't seem to have fasttree installed at all, and neither the
test nor the wrapper is explicit about which version it was
originally written for.

> The NCBI_BLAST error involves rpsblast not being in the install.
> Version 2.2.25-7 using apt-get.

I believe this is down to an NCBI stupidity with binary name
clashes: both the old 'legacy' C BLAST and the new C++
BLAST+ suite have a binary called rpsblast.

Our test code copes with this by searching the path and checking
each rpsblast binary found - looking for the new version only.

However, Debian policy is to resolve ambiguities like this with
a unilateral renaming - in this case I *think* they called the new
binary rpsblast+ instead. Can you confirm that? I don't have
access to a Debian machine right now.

So, strictly speaking, the Biopython test is correct - you don't
have the new rpsblast installed. However, it would be more
helpful if we also checked for the Debian alias rpsblast+.

That shouldn't be too complicated to do - especially if you
could rerun the tests using Biopython from git for me?
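
As a rough sketch of that idea (not the actual code in our test suite):
search each folder on the PATH for any of several candidate names, and hand
every hit to a caller-supplied check that decides whether it is the BLAST+
build. The names find_tool and accept are made up for illustration, and how
a binary is vetted is deliberately left open here.

```python
import os


def find_tool(names, accept=lambda exe: True, path=None):
    """Return the first acceptable executable on the search path, or None.

    names  - candidate file names, e.g. ["rpsblast", "rpsblast+"]
    accept - hypothetical callback that vets one full path (for example by
             running the binary and inspecting its output)
    path   - search path string; defaults to the PATH environment variable
    """
    if path is None:
        path = os.environ.get("PATH", "")
    for folder in path.split(os.pathsep):
        if not folder:
            continue
        for name in names:
            candidate = os.path.join(folder, name)
            if not os.path.isfile(candidate):
                continue
            if not os.access(candidate, os.X_OK):
                continue
            if accept(candidate):
                return candidate
    return None


# Within each folder the upstream name is tried before the Debian-style alias.
rpsblast_exe = find_tool(["rpsblast", "rpsblast+"])
```

With something like this, a machine that only has the renamed binary would
still get the tests run rather than skipped.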

> Dialign is version 2.2.1-5 using apt-get. I got two errors: first,
> DIALIGN2_DIR not being set. It was installed to /usr/bin so I set
> DIALIGN2_DIR to that directory; then I got "Environment variable
> DIALIGN2_DIR directory missing BLOSUM file." I'm not sure either of these
> items are needed, though I may have missed them in the documentation.

This again looks like a Debian packaging issue versus the
manual install instructions for Dialign. Perhaps they have
fixed Dialign to find its matrix under a data folder...

You could try simply commenting out the check on the
environment variable in test_Dialign_tool.py and seeing
whether the tests pass.
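
For reference, the kind of precondition being discussed amounts to roughly
the following. This is a sketch reconstructed from the two error messages
you quoted, not a copy of the test file; the function name
dialign_skip_reason and the assumption that the matrix file is literally
called "BLOSUM" are mine.

```python
import os


def dialign_skip_reason(environ=None):
    """Return why the Dialign tests would be skipped, or None to run them."""
    if environ is None:
        environ = os.environ
    folder = environ.get("DIALIGN2_DIR")
    if folder is None:
        return "Environment variable DIALIGN2_DIR not set."
    if not os.path.isdir(folder):
        return "Environment variable DIALIGN2_DIR is not a directory."
    # Assumed file name, taken from the wording of the quoted error message.
    if not os.path.isfile(os.path.join(folder, "BLOSUM")):
        return "Environment variable DIALIGN2_DIR directory missing BLOSUM file."
    return None
```

If the Debian package finds its matrices without DIALIGN2_DIR, all three
checks are stricter than the tool itself needs.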

> I downloaded version 130708 of Prank from
> http://code.google.com/p/prank-msa/downloads/list. The error is on line 165
> of the test file:
>
> AssertionError:
> -----------------
>  PRANK v.130708:
> -----------------
>
> Input for the analysis
>  - converting 'Quality/example.fasta' to 'temp with space.phy'

This sounds like a minor change in the stdout with recent
versions of PRANK.
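
If the banner layout is what changed, one option would be for the test to
search the output for the version line instead of comparing against a fixed
prefix. A minimal sketch, assuming the "PRANK v.<digits>" line in your
output is stable across versions (I have not checked older releases);
prank_version is a made-up helper, not something in the current test:

```python
import re

# Matches the banner line seen in the quoted output, e.g. " PRANK v.130708:"
PRANK_BANNER = re.compile(r"PRANK v\.(\d+)")


def prank_version(stdout):
    """Return the version digits from PRANK's banner, or None if absent."""
    match = PRANK_BANNER.search(stdout)
    return match.group(1) if match else None
```

The test could then assert that prank_version(stdout) is not None, which
would tolerate extra dashes or blank lines around the banner.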

> EmbossPhylipNew I tried to install from source, but it was complicated and I
> didn't get it finished.
>
> I'll send some notes on the other errors when I get a few minutes.

Thanks,

Peter

From p.j.a.cock at googlemail.com  Fri Aug  2 08:00:54 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 2 Aug 2013 13:00:54 +0100
Subject: [Biopython-dev] 1.62b test coverage report
In-Reply-To: <CAKVJ-_6Rp5RC4pM1wMJ4k2qyZKhOCqWmd8e-x+ak7dQqHOyXqw@mail.gmail.com>
References: <CA+ijMs=yT-=CFr+qwkOZ107oBN0wEFdjC9uMFCh+j1YfDD4DZw@mail.gmail.com>
	<CAKVJ-_5+mSW5NkmxN94w-8qu+e=q4COyZaAx6UzCmrwuJdU9aQ@mail.gmail.com>
	<CAMC681mkn3FBBdyEMTkba1T60KoQmcR1NHLKHNLc1pUo9A5rWw@mail.gmail.com>
	<CA+ijMs=XOe5Q6cE5vCg_OdnjcTGA=ZjuCJVCjcWY+reW1=jnnQ@mail.gmail.com>
	<CAKVJ-_6Rp5RC4pM1wMJ4k2qyZKhOCqWmd8e-x+ak7dQqHOyXqw@mail.gmail.com>
Message-ID: <CAKVJ-_4Cc2VQQBe0Jf_n0kNC9nEozPAXp75Ltv86s5kwqGiSdQ@mail.gmail.com>

On Fri, Aug 2, 2013 at 10:31 AM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
>
>> The NCBI_BLAST error involves rpsblast not being in the install.
>> Version 2.2.25-7 using apt-get.
>
> I believe this is down to an NCBI stupidity with binary name
> clashes, both the old 'legacy' C BLAST and the new C++
> BLAST+ suite have a binary called rpsblast.
>
> Our test code copes with this by searching the path and checking
> each rpsblast binary found - looking for the new version only.
>
> However, Debian policy is to resolve ambiguities like this with
> a unilateral renaming - in this case I *think* they called the new
> binary rpsblast+ instead. Can you confirm that? I don't have
> access to a Debian machine right now.

Certainly this was their plan and was done on Bio-Linux,
http://lists.debian.org/debian-med/2011/05/msg00025.html

> So, strictly speaking the Biopython test is correct - you don't
> have the new rpsblast installed. However, it would be more
> helpful if we also checked for the Debian alias rpsblast+ too.
>
> That shouldn't be too complicated to do - especially if you
> could rerun the tests using Biopython from git for me?

This commit is now on our master branch,

https://github.com/biopython/biopython/commit/148b681a66061cc03d70f940a2efdede29adc64a

Thanks,

Peter

From anaryin at gmail.com  Fri Aug  2 12:13:04 2013
From: anaryin at gmail.com (João Rodrigues)
Date: Fri, 2 Aug 2013 09:13:04 -0700
Subject: [Biopython-dev] Fwd: pdb-l: Announcement: wwPDB Workshop on
 mmCIF/PDBx for Programmers, 20/21 Nov-13, Cambridge (UK)
In-Reply-To: <CAKVJ-_555M7sGPoyr9NCiUoH+5-dDdqd18SXpX6+yM520h+Ohg@mail.gmail.com>
References: <alpine.LRH.1.10.1308011834090.22795@struktbio205.bmc.uu.se>
	<51FB69C6.3040200@ebi.ac.uk>
	<CABrdYJKQ4QxHs1k3V1-t0--rF4EcBvFNs4OGMz5HotDSSpDo=A@mail.gmail.com>
	<CAKVJ-_555M7sGPoyr9NCiUoH+5-dDdqd18SXpX6+yM520h+Ohg@mail.gmail.com>
Message-ID: <CAJ9sUYNs+LYfpDcX-1eQVGry9SuKXqe=Q5yg-7HiXZbEWjsTaA@mail.gmail.com>

Hi Peter, Glen,

I'll be going (or trying to at least).

Cheers,

João




From p.j.a.cock at googlemail.com  Fri Aug  2 12:20:02 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 2 Aug 2013 17:20:02 +0100
Subject: [Biopython-dev] Fwd: pdb-l: Announcement: wwPDB Workshop on
 mmCIF/PDBx for Programmers, 20/21 Nov-13, Cambridge (UK)
In-Reply-To: <CAJ9sUYNs+LYfpDcX-1eQVGry9SuKXqe=Q5yg-7HiXZbEWjsTaA@mail.gmail.com>
References: <alpine.LRH.1.10.1308011834090.22795@struktbio205.bmc.uu.se>
	<51FB69C6.3040200@ebi.ac.uk>
	<CABrdYJKQ4QxHs1k3V1-t0--rF4EcBvFNs4OGMz5HotDSSpDo=A@mail.gmail.com>
	<CAKVJ-_555M7sGPoyr9NCiUoH+5-dDdqd18SXpX6+yM520h+Ohg@mail.gmail.com>
	<CAJ9sUYNs+LYfpDcX-1eQVGry9SuKXqe=Q5yg-7HiXZbEWjsTaA@mail.gmail.com>
Message-ID: <CAKVJ-_4j_9Bih3isF=q=pzAUVFyAWLuwWZ7H_xCQJ_EC+b_6CA@mail.gmail.com>

That's good news, João - thanks! Peter.



From ben at benfulton.net  Sun Aug  4 21:28:34 2013
From: ben at benfulton.net (Ben Fulton)
Date: Sun, 4 Aug 2013 21:28:34 -0400
Subject: [Biopython-dev] 1.62b test coverage report
In-Reply-To: <CAKVJ-_4Cc2VQQBe0Jf_n0kNC9nEozPAXp75Ltv86s5kwqGiSdQ@mail.gmail.com>
References: <CA+ijMs=yT-=CFr+qwkOZ107oBN0wEFdjC9uMFCh+j1YfDD4DZw@mail.gmail.com>
	<CAKVJ-_5+mSW5NkmxN94w-8qu+e=q4COyZaAx6UzCmrwuJdU9aQ@mail.gmail.com>
	<CAMC681mkn3FBBdyEMTkba1T60KoQmcR1NHLKHNLc1pUo9A5rWw@mail.gmail.com>
	<CA+ijMs=XOe5Q6cE5vCg_OdnjcTGA=ZjuCJVCjcWY+reW1=jnnQ@mail.gmail.com>
	<CAKVJ-_6Rp5RC4pM1wMJ4k2qyZKhOCqWmd8e-x+ak7dQqHOyXqw@mail.gmail.com>
	<CAKVJ-_4Cc2VQQBe0Jf_n0kNC9nEozPAXp75Ltv86s5kwqGiSdQ@mail.gmail.com>
Message-ID: <CA+ijMsm-hCNoekmFY5gPOMb_TpfUXj4Mkb2TE7bhOHR-DSEOgA@mail.gmail.com>

Fixed the following:

I had installed Mafft version 6.850-1 from apt-get, which apparently is
more than a year old and doesn't work. The tests ran after I installed it
from source.

I had not gotten a path set up properly for XXMotif; once I did the tests
all ran.

The DiAlign tests passed after I removed the precondition checks.

Did not fix:

The site http://www.rubic.rdg.ac.uk/~mab/software.html is down, and I can't
find anywhere else to install the PopGen software from.


So with all of those modifications, I ran coverage against the latest code
from GitHub. Results are once again available on my website,
http://benfulton.net/BioPython162_Coverage , and the following issues
remain:

EmbossPhylipNew - skipped, too hard to install
FastTree - error, apparently a versioning issue
PopGen_FDist and PopGen_DFDist - skipped, unavailable
Prank - failed, recent versions of the tool have some kind of output change


On Fri, Aug 2, 2013 at 8:00 AM, Peter Cock <p.j.a.cock at googlemail.com> wrote:

> On Fri, Aug 2, 2013 at 10:31 AM, Peter Cock <p.j.a.cock at googlemail.com>
> wrote:
> >
> >> The NCBI_BLAST error involves rpsblast not being in the install.
> >> Version 2.2.25-7 using apt-get.
> >
> > I believe this is down to an NCBI stupidity with binary name
> > clashes, both the old 'legacy' C BLAST and the new C++
> > BLAST+ suite have a binary called rpsblast.
> >
> > Our test code copes with this by searching the path and checking
> > each rpsblast binary found - looking for the new version only.
> >
> > However, Debian policy is to resolve ambiguities like this with
> > a unilateral renaming - in this case I *think* they called the new
> > binary rpsblast+ instead. Can you confirm that? I don't have
> > access to a Debian machine right now.
>
> Certainly this was their plan and was done on Bio-Linux,
> http://lists.debian.org/debian-med/2011/05/msg00025.html
>
> > So, strictly speaking the Biopython test is correct - you don't
> > have the new rpsblast installed. However, it would be more
> > helpful if we also checked for the Debian alias rpsblast+ too.
> >
> > That shouldn't be too complicated to do - especially if you
> > could rerun the tests using Biopython from git for me?
>
> This commit is now on our master branch,
>
>
> https://github.com/biopython/biopython/commit/148b681a66061cc03d70f940a2efdede29adc64a
>
> Thanks,
>
> Peter
>
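
The lookup Peter describes in the quoted text (search the path, test
each candidate, accept the Debian name as well) might look roughly like
the sketch below. This is not the code from the commit. The function
names are made up for illustration, and the test for "is this BLAST+"
assumes that BLAST+ binaries answer "-version" with a version string
ending in "+", which is worth checking against the tools you have.

```python
import os
import subprocess


def blast_plus_version(exe):
    """Return the version line if exe looks like BLAST+, else None.

    Assumption: BLAST+ tools accept "-version" and print a line ending
    in "+"; the legacy C tools are assumed not to.
    """
    try:
        result = subprocess.run(
            [exe, "-version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
            timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    lines = result.stdout.splitlines()
    if result.returncode == 0 and lines and lines[0].rstrip().endswith("+"):
        return lines[0].strip()
    return None


def find_rpsblast(names=("rpsblast", "rpsblast+"), path=None,
                  probe=blast_plus_version):
    """Return the first binary on the path that the probe accepts."""
    if path is None:
        path = os.environ.get("PATH", "")
    for folder in path.split(os.pathsep):
        for name in names:
            exe = os.path.join(folder, name)
            # Skip names that are missing or not executable.
            if os.path.isfile(exe) and os.access(exe, os.X_OK) and probe(exe):
                return exe
    return None


if __name__ == "__main__":
    # Prints a path, or None if no BLAST+ rpsblast is installed.
    print(find_rpsblast())
```

Passing the probe in as an argument keeps the path walk separate from
the version check, so either can be changed or tested on its own.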

From yeyanbo289 at gmail.com  Mon Aug  5 04:57:34 2013
From: yeyanbo289 at gmail.com (Yanbo Ye)
Date: Mon, 5 Aug 2013 16:57:34 +0800
Subject: [Biopython-dev] GSOC weekly update 8
Message-ID: <CADoMHjxT7pQ81T8KSkTfb+-LKtOM-5dATVcf5EACdxiN0TU4Qw@mail.gmail.com>

Hi all,

I've posted an update for the Biopython.Phylo project here:
http://blog.yeyanbo.com/posts/google-summer-of-code-8.html

Thanks,
Yanbo

-- 

*Yanbo Ye*
*Guangzhou Institutes of Biomedicine and Health, *
*Chinese Academy of Sciences*
*190 Kaiyuan Avenue, Science Park, Guangzhou, China*
*Email: ye_yanbo at gibh.ac.cn*
*Web: http://www.yeyanbo.com*
*Phone: (86)-020-32093810*

From p.j.a.cock at googlemail.com  Mon Aug  5 07:46:00 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Mon, 5 Aug 2013 12:46:00 +0100
Subject: [Biopython-dev] 1.62b test coverage report
In-Reply-To: <CA+ijMsm-hCNoekmFY5gPOMb_TpfUXj4Mkb2TE7bhOHR-DSEOgA@mail.gmail.com>
References: <CA+ijMs=yT-=CFr+qwkOZ107oBN0wEFdjC9uMFCh+j1YfDD4DZw@mail.gmail.com>
	<CAKVJ-_5+mSW5NkmxN94w-8qu+e=q4COyZaAx6UzCmrwuJdU9aQ@mail.gmail.com>
	<CAMC681mkn3FBBdyEMTkba1T60KoQmcR1NHLKHNLc1pUo9A5rWw@mail.gmail.com>
	<CA+ijMs=XOe5Q6cE5vCg_OdnjcTGA=ZjuCJVCjcWY+reW1=jnnQ@mail.gmail.com>
	<CAKVJ-_6Rp5RC4pM1wMJ4k2qyZKhOCqWmd8e-x+ak7dQqHOyXqw@mail.gmail.com>
	<CAKVJ-_4Cc2VQQBe0Jf_n0kNC9nEozPAXp75Ltv86s5kwqGiSdQ@mail.gmail.com>
	<CA+ijMsm-hCNoekmFY5gPOMb_TpfUXj4Mkb2TE7bhOHR-DSEOgA@mail.gmail.com>
Message-ID: <CAKVJ-_6ZTQBeScaDOsOEUq3rm7GaR-KE0zV4NAvDP+WKcxY=WQ@mail.gmail.com>

On Mon, Aug 5, 2013 at 2:28 AM, Ben Fulton <ben at benfulton.net> wrote:
>
> The site http://www.rubic.rdg.ac.uk/~mab/software.html is down, and I can't
> find anywhere else to install the PopGen software from.
>

There seems to be a fairly recent snapshot on archive.org,
http://web.archive.org/web/20120510013219/http://www.rubic.rdg.ac.uk/~mab/software.html

Meanwhile, I have emailed Dr. Mark Beaumont at Reading
University to ask about the server status.

Regards,

Peter

From p.j.a.cock at googlemail.com  Mon Aug  5 08:14:04 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Mon, 5 Aug 2013 13:14:04 +0100
Subject: [Biopython-dev] 1.62b test coverage report
In-Reply-To: <CAKVJ-_6ZTQBeScaDOsOEUq3rm7GaR-KE0zV4NAvDP+WKcxY=WQ@mail.gmail.com>
References: <CA+ijMs=yT-=CFr+qwkOZ107oBN0wEFdjC9uMFCh+j1YfDD4DZw@mail.gmail.com>
	<CAKVJ-_5+mSW5NkmxN94w-8qu+e=q4COyZaAx6UzCmrwuJdU9aQ@mail.gmail.com>
	<CAMC681mkn3FBBdyEMTkba1T60KoQmcR1NHLKHNLc1pUo9A5rWw@mail.gmail.com>
	<CA+ijMs=XOe5Q6cE5vCg_OdnjcTGA=ZjuCJVCjcWY+reW1=jnnQ@mail.gmail.com>
	<CAKVJ-_6Rp5RC4pM1wMJ4k2qyZKhOCqWmd8e-x+ak7dQqHOyXqw@mail.gmail.com>
	<CAKVJ-_4Cc2VQQBe0Jf_n0kNC9nEozPAXp75Ltv86s5kwqGiSdQ@mail.gmail.com>
	<CA+ijMsm-hCNoekmFY5gPOMb_TpfUXj4Mkb2TE7bhOHR-DSEOgA@mail.gmail.com>
	<CAKVJ-_6ZTQBeScaDOsOEUq3rm7GaR-KE0zV4NAvDP+WKcxY=WQ@mail.gmail.com>
Message-ID: <CAKVJ-_55Mbhsqz17PHEVBm5w0zR=Uf+TNM4Acq8FNxGDSpcNDw@mail.gmail.com>

On Mon, Aug 5, 2013 at 12:46 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> On Mon, Aug 5, 2013 at 2:28 AM, Ben Fulton <ben at benfulton.net> wrote:
>>
>> The site http://www.rubic.rdg.ac.uk/~mab/software.html is down, and I can't
>> find anywhere else to install the PopGen software from.
>>
>
> There seems to be a fairly recent snapshot on archive.org,
> http://web.archive.org/web/20120510013219/http://www.rubic.rdg.ac.uk/~mab/software.html
>
> Meanwhile, I have emailed Dr. Mark Beaumont at Reading
> University to ask about the server status.

Mark has moved to Bristol:
http://www.maths.bris.ac.uk/people/profile/mamab

FDist and DFDist are available here now:
http://www.maths.bris.ac.uk/~mamab/

We need to update the Biopython documentation (and check
those versions from Bristol still work with our tests).

Tiago, could you handle that?

Thanks,

Peter

From arklenna at gmail.com  Mon Aug  5 09:11:19 2013
From: arklenna at gmail.com (Lenna Peterson)
Date: Mon, 5 Aug 2013 09:11:19 -0400
Subject: [Biopython-dev] Bugzilla --> RedMine --> GitHub issues?
In-Reply-To: <CAKVJ-_7oiLZM0EOEE_Y_Z6Ob-sYdk2KM518DoaQbohRxyhiXyA@mail.gmail.com>
References: <CAKVJ-_7U8HW4wa657oEYsR=vC=+cXV1O1nREps118O6F1uYjTQ@mail.gmail.com>
	<CADEGkF6LAVmLd1SmVp5UNexaWe5irzxD+9NHm2kAvR7r0KmxXA@mail.gmail.com>
	<CAKVJ-_7Ae2wsZurbNDjTHnAEEwtESDjqKMDbqVOcy66-emeW3w@mail.gmail.com>
	<CAHQkFddA8VHDvAmt_ThwfhRHTjF5HptZCE6xayvJH1aW2nLqYg@mail.gmail.com>
	<CAKVJ-_4C-Qf4qSsWCCasc5Mv6r93rDgiZX1f-imyF5joN+PjvA@mail.gmail.com>
	<CAKVJ-_4DLafSQ6NPnda_BUf1eRQhVLUHGdi834K-4RpBfS9uEg@mail.gmail.com>
	<CAKVJ-_7oiLZM0EOEE_Y_Z6Ob-sYdk2KM518DoaQbohRxyhiXyA@mail.gmail.com>
Message-ID: <CAHQkFdc=7NUYsaCjy5cgAHL1Bo14idGonxcETRSo04gaOzjjZA@mail.gmail.com>

Peter,

I haven't been able to connect to RedMine for a few days. I just get an
error page saying RoR couldn't start or connect to the MySQL server.

Cheers,

Lenna


On Mon, Jul 22, 2013 at 10:36 AM, Peter Cock <p.j.a.cock at googlemail.com> wrote:

> On Mon, Jul 22, 2013 at 12:43 PM, Peter Cock <p.j.a.cock at googlemail.com>
> wrote:
> >
> > Well this isn't tomorrow - but I'm back from BOSC 2013 in Germany now.
> >
> > In the absence of any dissenting views, and the fact that RedMine is
> > also offline right now (which I've raised with the OBF admin volunteers),
>
> Fixed again :)
>
> > I've enabled GitHub issues & linked to this from the main page:
> >
> > https://github.com/biopython/biopython/issues
> >
> > You'll notice there are already lots of issues there - all pull request
> > related. This is one reason why an automated import of the old
> > Bugzilla/RedMine issues could be complicated.
> >
> > Various other bits of our documentation will need to be updated...
>
> Hopefully done now, e.g.
>
> https://github.com/biopython/biopython/commit/e836f4fadde494a8253b4a4114a36ff3259eb079
>
> Note that there doesn't seem to be a way to turn off new issues in
> a RedMine project - there are hacks via removing the ability from
> the roles, but I fear that would affect the other projects still using
> the RedMine server (e.g. BioPerl).
>
> Instead we may just have to do the triage/migration and then
> drop the links to the old RedMine server from the website etc.
>
> Peter
>

From p.j.a.cock at googlemail.com  Mon Aug  5 09:43:19 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Mon, 5 Aug 2013 14:43:19 +0100
Subject: [Biopython-dev] 1.62b test coverage report
In-Reply-To: <CAKVJ-_55Mbhsqz17PHEVBm5w0zR=Uf+TNM4Acq8FNxGDSpcNDw@mail.gmail.com>
References: <CA+ijMs=yT-=CFr+qwkOZ107oBN0wEFdjC9uMFCh+j1YfDD4DZw@mail.gmail.com>
	<CAKVJ-_5+mSW5NkmxN94w-8qu+e=q4COyZaAx6UzCmrwuJdU9aQ@mail.gmail.com>
	<CAMC681mkn3FBBdyEMTkba1T60KoQmcR1NHLKHNLc1pUo9A5rWw@mail.gmail.com>
	<CA+ijMs=XOe5Q6cE5vCg_OdnjcTGA=ZjuCJVCjcWY+reW1=jnnQ@mail.gmail.com>
	<CAKVJ-_6Rp5RC4pM1wMJ4k2qyZKhOCqWmd8e-x+ak7dQqHOyXqw@mail.gmail.com>
	<CAKVJ-_4Cc2VQQBe0Jf_n0kNC9nEozPAXp75Ltv86s5kwqGiSdQ@mail.gmail.com>
	<CA+ijMsm-hCNoekmFY5gPOMb_TpfUXj4Mkb2TE7bhOHR-DSEOgA@mail.gmail.com>
	<CAKVJ-_6ZTQBeScaDOsOEUq3rm7GaR-KE0zV4NAvDP+WKcxY=WQ@mail.gmail.com>
	<CAKVJ-_55Mbhsqz17PHEVBm5w0zR=Uf+TNM4Acq8FNxGDSpcNDw@mail.gmail.com>
Message-ID: <CAKVJ-_5dL-Fsd3UtDneGgLR2PJVDfHaFFbi6+tSeYqV1DZYNNw@mail.gmail.com>

On Mon, Aug 5, 2013 at 1:14 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> On Mon, Aug 5, 2013 at 12:46 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
>> On Mon, Aug 5, 2013 at 2:28 AM, Ben Fulton <ben at benfulton.net> wrote:
>>>
>>> The site http://www.rubic.rdg.ac.uk/~mab/software.html is down, and I can't
>>> find anywhere else to install the PopGen software from.
>>>
>>
>> There seems to be a fairly recent snapshot on archive.org,
>> http://web.archive.org/web/20120510013219/http://www.rubic.rdg.ac.uk/~mab/software.html
>>
>> Meanwhile, I have emailed Dr. Mark Beaumont at Reading
>> University to ask about the server status.
>
> Mark has moved to Bristol:
> http://www.maths.bris.ac.uk/people/profile/mamab
>
> FDist and DFDist are available here now:
> http://www.maths.bris.ac.uk/~mamab/
>
> We need to update the Biopython documentation (and check
> those versions from Bristol still work with our tests).
>
> Tiago, could you handle that?

According to his email auto-reply, Tiago is away right now.

I've updated a couple of URLs in the source code:
https://github.com/biopython/biopython/commit/70667063701041b73147c502c933fa8bfde1d850

Ben - did you see anything else which needs updating here?

Thanks,

Peter

From p.j.a.cock at googlemail.com  Mon Aug  5 10:01:12 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Mon, 5 Aug 2013 15:01:12 +0100
Subject: [Biopython-dev] Bugzilla --> RedMine --> GitHub issues?
In-Reply-To: <CAHQkFdc=7NUYsaCjy5cgAHL1Bo14idGonxcETRSo04gaOzjjZA@mail.gmail.com>
References: <CAKVJ-_7U8HW4wa657oEYsR=vC=+cXV1O1nREps118O6F1uYjTQ@mail.gmail.com>
	<CADEGkF6LAVmLd1SmVp5UNexaWe5irzxD+9NHm2kAvR7r0KmxXA@mail.gmail.com>
	<CAKVJ-_7Ae2wsZurbNDjTHnAEEwtESDjqKMDbqVOcy66-emeW3w@mail.gmail.com>
	<CAHQkFddA8VHDvAmt_ThwfhRHTjF5HptZCE6xayvJH1aW2nLqYg@mail.gmail.com>
	<CAKVJ-_4C-Qf4qSsWCCasc5Mv6r93rDgiZX1f-imyF5joN+PjvA@mail.gmail.com>
	<CAKVJ-_4DLafSQ6NPnda_BUf1eRQhVLUHGdi834K-4RpBfS9uEg@mail.gmail.com>
	<CAKVJ-_7oiLZM0EOEE_Y_Z6Ob-sYdk2KM518DoaQbohRxyhiXyA@mail.gmail.com>
	<CAHQkFdc=7NUYsaCjy5cgAHL1Bo14idGonxcETRSo04gaOzjjZA@mail.gmail.com>
Message-ID: <CAKVJ-_6EHt1Y-hjYSVg5HkCd=N-6Yacym4qRGAE+w6m+5svjxA@mail.gmail.com>

On Mon, Aug 5, 2013 at 2:11 PM, Lenna Peterson <arklenna at gmail.com> wrote:
> Peter,
>
> I haven't been able to connect to RedMine for a few days. I just get an
> error page saying RoR couldn't start or connect to the MySQL server.
>
> Cheers,
>
> Lenna

OK, Chris Dag has got RedMine to work again, and told
me what he did in case I need to restart if this happens
again. If any RedMine guru is reading and has some
thoughts on the cause and long term solution, drop us
an email please.

As to issue triage - I suggest you start with anything you
filed or commented on, then things you are familiar with.
But any order is fine really.

I suggest for "moving" an issue, we file the new GitHub
issue (linking to the old issue, but also trying to capture
any relevant information from the old bug tracker to be
self sufficient), and then close the old RedMine issue
with a link to its replacement.

Thanks,

Peter

From p.j.a.cock at googlemail.com  Mon Aug  5 10:26:32 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Mon, 5 Aug 2013 15:26:32 +0100
Subject: [Biopython-dev] Bio.XXX.Applications vs Bio.motifs.applications
Message-ID: <CAKVJ-_7LGhWHBrT1JDSkB2GyC9f-mToNVs=TD2nitP5FLskZtQ@mail.gmail.com>

Hi all,

I've noticed that as part of migrating from Bio.Motif to Bio.motifs,
the Applications module has acquired a lower case name.

Lower case module names are in principle a good thing (PEP8)
but elsewhere in Biopython the Applications modules are all
using title case.

Would a lower case shorter name be better, such as apps
(i.e. Bio.motifs.apps in this case)? This could also be adopted
in other modules for a gradual conversion if desired (e.g.
introduce Bio.Phylo.apps as an alias for Bio.Phylo.Applications).
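
For illustration only, an alias like this can be made by registering
the same module object under a second name. The sketch below builds a
throwaway in-memory package called demo_pkg (a made-up name) so that it
runs without Biopython. In a real package the last two assignments
would presumably live in the package's __init__.py, which is an
assumption about how it would be done rather than a settled plan.

```python
import importlib
import sys
import types

# Stand-in package and submodule, built in memory for the demo.
pkg = types.ModuleType("demo_pkg")
pkg.__path__ = []  # an (empty) __path__ marks the module as a package
applications = types.ModuleType("demo_pkg.Applications")
applications.ExampleCommandline = type("ExampleCommandline", (), {})
sys.modules["demo_pkg"] = pkg
sys.modules["demo_pkg.Applications"] = applications
pkg.Applications = applications

# The alias itself: a second name bound to the same module object.
pkg.apps = applications
sys.modules["demo_pkg.apps"] = applications

old_style = importlib.import_module("demo_pkg.Applications")
new_style = importlib.import_module("demo_pkg.apps")
print(old_style is new_style)  # True
```

Because both names point at one module object, classes imported through
either spelling are identical, so isinstance checks keep working during
a gradual conversion.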

What do people think?

Thanks,

Peter

From dalke at dalkescientific.com  Mon Aug  5 21:18:06 2013
From: dalke at dalkescientific.com (Andrew Dalke)
Date: Tue, 6 Aug 2013 03:18:06 +0200
Subject: [Biopython-dev] Adopting BSD 3-Clause license for Biopython?
In-Reply-To: <CAKVJ-_61GFD3QFEYKwhwyNXfvPnGQc9EHPk3CUs=7CbqErFjnw@mail.gmail.com>
References: <CAKVJ-_5i0M0LHWpR=eWcDEP-X-Dmm9jeggWY7aYdDFXhxO01xQ@mail.gmail.com>
	<CAKVJ-_61GFD3QFEYKwhwyNXfvPnGQc9EHPk3CUs=7CbqErFjnw@mail.gmail.com>
Message-ID: <9B34F2CB-2D39-40C5-A462-3C99CFB317D3@dalkescientific.com>

On Jul 24, 2013, at 11:13 AM, Peter Cock wrote:
> The current Biopython License is very short and liberal, and I have
> long described it as an MIT/BSD type licence. However the actual
> wording matches neither of these exactly (as far as I could tell):

That's my doing. When Jeff and I started Biopython in 1999 we
needed to choose a license. We started with the Python license,
which (for 1.5.2) was:

  Permission to use, copy, modify, and distribute this software and its
  documentation for any purpose and without fee is hereby granted,
  provided that the above copyright notice appear in all copies and that
  both that copyright notice and this permission notice appear in
  supporting documentation, and that the names of Stichting Mathematisch
  Centrum or CWI or Corporation for National Research Initiatives or
  CNRI not be used in advertising or publicity pertaining to
  distribution of the software without specific, written prior
  permission.

  While CWI is the initial source for this software, a modified version
  is made available by the Corporation for National Research Initiatives
  (CNRI) at the Internet address ftp://ftp.python.org.

  STICHTING MATHEMATISCH CENTRUM AND CNRI DISCLAIM ALL WARRANTIES WITH
  REGARD TO THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF
  MERCHANTABILITY AND FITNESS, IN NO EVENT SHALL STICHTING MATHEMATISCH
  CENTRUM OR CNRI BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL
  DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR
  PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER
  TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR
  PERFORMANCE OF THIS SOFTWARE.

Compare that to the Biopython license, with the alterations marked:

  Permission to use, copy, modify, and distribute this software
  and its documentation >>>with or without modifications<<< and for
  any purpose and without fee is hereby granted, provided that
  >>>any copyright notices<<< appear in all copies and that both
  >>>those copyright notices<<< and this permission notice appear
  in supporting documentation, and that the names of >>>the
  contributors or copyright holders<<< not be used in advertising
  or publicity pertaining to distribution of the software without
  specific prior permission.

  [2nd paragraph of original Python license omitted]

  >>>THE CONTRIBUTORS AND COPYRIGHT HOLDERS OF THIS SOFTWARE<<<
  DISCLAIM ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING
  ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS, IN NO EVENT
  SHALL >>>THE CONTRIBUTORS OR COPYRIGHT HOLDERS<<< BE LIABLE FOR
  ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
  WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER
  IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION,
  ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF
  THIS SOFTWARE.

This was called a "Python-style license", and you can see an
example at http://effbot.org/zone/copyright.htm . Indeed, his
PIL package is an example of a current Python module which
still uses that license:
  http://www.pythonware.com/products/pil/license.htm


You'll see that Fredrik Lundh refers to it as the "Historical
Permission Notice and Disclaimer", and points to:

  http://opensource.org/licenses/historical.php

Further note that the OSI comments that "This License has been
voluntarily deprecated by its author" .. whatever that
means ... and that http://opensource.org/proliferation-report
describes it as "redundant with more popular licenses", and
more specifically the BSD.


> In theory we could ask the OSI to approve our current license, but as
> they explain "yet another license" is not a good thing to encourage:
> http://opensource.org/proliferation

It wouldn't be a "yet another license" as it's already
registered with the OSI ... almost.

The one odd alteration I made was to add "with or without
modifications", because some people on comp.lang.python
expressed concern that "use, copy, modify, and distribute"
could be interpreted to be restrictive, as in "you can
modify the original source code, or distribute the original
source code, but you can't distribute the modified source
code". I've since learned that this is a hyper-picky
interpretation with no legal bearing.

I don't know if that "with or without modifications" is
different enough that the OSI would say it doesn't fall
under the 'Historical Permission Notice and Disclaimer'.


In any case, I agree with a relicensing. The current
license is from a bygone era. Nowadays I just pick the MIT
license.

If there's anything copyright by me still remaining in
Biopython, I hereby relicense it under the MIT and/or one
of the standard n-clause BSD licenses, at your choice.


Cheers,

				Andrew
				dalke at dalkescientific.com


From p.j.a.cock at googlemail.com  Tue Aug  6 05:11:33 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Tue, 6 Aug 2013 10:11:33 +0100
Subject: [Biopython-dev] Adopting BSD 3-Clause license for Biopython?
In-Reply-To: <9B34F2CB-2D39-40C5-A462-3C99CFB317D3@dalkescientific.com>
References: <CAKVJ-_5i0M0LHWpR=eWcDEP-X-Dmm9jeggWY7aYdDFXhxO01xQ@mail.gmail.com>
	<CAKVJ-_61GFD3QFEYKwhwyNXfvPnGQc9EHPk3CUs=7CbqErFjnw@mail.gmail.com>
	<9B34F2CB-2D39-40C5-A462-3C99CFB317D3@dalkescientific.com>
Message-ID: <CAKVJ-_4Vtp6+_vNmH0wP9r3z=Egm-u-e_+bqGkdgQHsr+BGEHg@mail.gmail.com>

On Tue, Aug 6, 2013 at 2:18 AM, Andrew Dalke <dalke at dalkescientific.com> wrote:
> On Jul 24, 2013, at 11:13 AM, Peter Cock wrote:
>> The current Biopython License is very short and liberal, and I have
>> long described it as an MIT/BSD type licence. However the actual
>> wording matches neither of these exactly (as far as I could tell):
>
> That's my doing. When Jeff and I started Biopython in 1999 we
> needed to choose a license. We started with the Python license,
> which (for 1.5.2) was:
>
> ...

Ah - with hindsight I should have checked the older Python
licenses, but I was thinking more of their current very long
version.

> You'll see that Fredrik Lundh refers to it as the "Historical
> Permission Notice and Disclaimer", and points to:
>
>   http://opensource.org/licenses/historical.php
>
> Further note that the OSI comments that "This License has been
> voluntarily deprecated by its author" .. whatever that
> means ... and that http://opensource.org/proliferation-report
> describes it as "redundant with more popular licenses", and
> more specifically the BSD.
>
>> In theory we could ask the OSI to approve our current license, but as
>> they explain "yet another license" is not a good thing to encourage:
>> http://opensource.org/proliferation
>
> It wouldn't be a "yet another license" as it's already
> registered with the OSI ... almost.
>
> The one odd alteration I made was to add "with or without
> modifications", because some people on comp.lang.python
> expressed concern that "use, copy, modify, and distribute"
> could be interpreted to be restrictive, as in "you can
> modify the original source code, or distribute the original
> source code, but you can't distribute the modified source
> code". I've since learned that this is a hyper-picky
> interpretation with no legal bearing.
>
> I don't know if that "with or without modifications" is
> different enough that the OSI would say it doesn't fall
> under the 'Historical Permission Notice and Disclaimer'.

Thanks for that background information. Educational.

> In any case, I agree with a relicensing. The current
> license is from a bygone era. Nowadays I just pick the MIT
> license.
>
> If there's anything copyright by me still remaining in
> Biopython, I hereby relicense it under the MIT and/or one
> of the standard n-clause BSD licenses, at your choice.

That's great Andrew - thank you,

Peter

From p.j.a.cock at googlemail.com  Tue Aug  6 18:51:22 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Tue, 6 Aug 2013 23:51:22 +0100
Subject: [Biopython-dev] Adjusting the xxMotif wrapper / Bio.Application
	plans
Message-ID: <CAKVJ-_4S7NeyFmjss5hEUN+2NVBGT4Lmd0AD_FgWPeO9LOymLg@mail.gmail.com>

Hi Christian et al.,

I've just noticed something in the XXmotif wrapper which
I should have raised back in November 2012 when it was
committed. This is to do with the way the options were
defined, e.g.

      _Option(["--negSet", "negSet", "negset", "NEGSET"],
                   "sequence set which has to be used as a reference set",
                   filename = True,
                   equate = False),

The first argument is a list of names, aliases which can
be used via the (legacy) set_parameter method. Of
these the first is what goes in the actual command
string, and the last must be a valid Python identifier
and becomes a property and a keyword argument
for the __init__ method (and ideally follow PEP8
guidelines).
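
To make the naming rule concrete, here is a toy stand-in for _Option.
It is not the real Bio.Application class, and the class name, the
properties and the file name are invented for the example. It only
models the rule described above: the first name is what appears on the
command line, and the last name is the Python-facing one.

```python
class ToyOption:
    """Simplified stand-in for an application option (illustration only)."""

    def __init__(self, names, description, equate=True):
        self.names = names
        self.description = description
        self.equate = equate
        self.value = None

    @property
    def flag(self):
        # First alias: the text that goes into the command string.
        return self.names[0]

    @property
    def attribute(self):
        # Last alias: the property / keyword argument name in Python.
        return self.names[-1]

    def __str__(self):
        if self.value is None:
            return ""
        separator = "=" if self.equate else " "
        return "%s%s%s" % (self.flag, separator, self.value)


# Four aliases, as in the wrapper quoted above: the upper case name wins.
current = ToyOption(["--negSet", "negSet", "negset", "NEGSET"],
                    "sequence set which has to be used as a reference set",
                    equate=False)
print(current.attribute)  # NEGSET

# Two aliases: the command line flag and a PEP8 style identifier.
tidy = ToyOption(["--negSet", "negset"],
                 "sequence set which has to be used as a reference set",
                 equate=False)
tidy.value = "background.fasta"
print(tidy.attribute)  # negset
print(str(tidy))       # --negSet background.fasta
```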

Normally the _Option would just have TWO aliases,
in this case ["--negSet", "negset"] would seem best.

Clearly I'd not documented this well enough, but
I've tried to make this more explicit now:
https://github.com/biopython/biopython/commit/39a88714ab7ee7a8dc4ed2b7a7ea71569fdd4293

Was there a special reason for all these case variants
in the XXmotif options??

We could perhaps just change this now in the newer
Bio.motifs module, despite this being live in the
Biopython 1.61 release... since right now the nasty
all upper case aliases are being used as the property
names and keyword names. But that could break a
few scripts already using Bio.motifs.applications'
XXmotif wrapper.

Looking ahead, other than set_parameter, all the other
legacy bits in Bio.Application have all been removed -
so we could take a fresh look at whether we can transition to
a more explicit application definition, which I hope is
possible with the class files defining these properties
explicitly (perhaps with decorators for things like
validation methods) - rather than implicitly as now
via the __init__ method which doesn't suit things
like autogenerated API docs.

There may be a catch in how to best make the
parameter order explicit (currently done via the
parameters being in a list) which can be vital for
many command line tools.
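
One possible shape for such an explicit definition, purely as a sketch
and not a proposal for the actual API: each option is a descriptor
declared in the class body, with an optional validation function, and
the declaration order supplies the command line order (class bodies
keep definition order in Python 3.6 and later). All names below,
including the "example" program and its "--threads" flag, are
hypothetical.

```python
class Parameter:
    """Descriptor for one command line option (hypothetical design)."""

    def __init__(self, flag, doc, validate=None):
        self.flag = flag
        self.__doc__ = doc  # visible to help() and API doc generators
        self.validate = validate

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return instance.__dict__.get(self.name)

    def __set__(self, instance, value):
        if self.validate is not None and not self.validate(value):
            raise ValueError("invalid value for %s: %r" % (self.name, value))
        instance.__dict__[self.name] = value


class Commandline:
    program = None

    def __init__(self, **kwargs):
        for name, value in kwargs.items():
            if not isinstance(getattr(type(self), name, None), Parameter):
                raise TypeError("unknown option %r" % name)
            setattr(self, name, value)

    @classmethod
    def parameters(cls):
        # Only this class's own options, in the order they were declared.
        return [v for v in vars(cls).values() if isinstance(v, Parameter)]

    def __str__(self):
        parts = [self.program]
        for param in self.parameters():
            value = getattr(self, param.name)
            if value is not None:
                parts.extend([param.flag, str(value)])
        return " ".join(parts)


class ExampleCommandline(Commandline):
    program = "example"
    negset = Parameter("--negSet", "reference sequence set",
                       validate=lambda v: isinstance(v, str))
    threads = Parameter("--threads", "number of worker threads",
                        validate=lambda v: isinstance(v, int) and v > 0)


cmd = ExampleCommandline(threads=4, negset="bg.fasta")
print(cmd)  # example --negSet bg.fasta --threads 4
```

The keyword arguments are given in the "wrong" order on purpose: the
output follows the class body, not the call. Options inherited from a
parent class are not collected by this sketch.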

Regards,

Peter

From christian at brueffer.de  Thu Aug  8 06:37:19 2013
From: christian at brueffer.de (Christian Brueffer)
Date: Thu, 08 Aug 2013 12:37:19 +0200
Subject: [Biopython-dev] Adjusting the xxMotif wrapper / Bio.Application
	plans
In-Reply-To: <CAKVJ-_4S7NeyFmjss5hEUN+2NVBGT4Lmd0AD_FgWPeO9LOymLg@mail.gmail.com>
References: <CAKVJ-_4S7NeyFmjss5hEUN+2NVBGT4Lmd0AD_FgWPeO9LOymLg@mail.gmail.com>
Message-ID: <520374DF.9070301@brueffer.de>

On 8/7/13 0:51 , Peter Cock wrote:
> Hi Christian et al.,
> 
> I've just noticed something in the XXmotif wrapper which
> I should have raised back in November 2012 when it was
> committed. This is to do with the way the options were
> defined, e.g.
> 
>       _Option(["--negSet", "negSet", "negset", "NEGSET"],
>                    "sequence set which has to be used as a reference set",
>                    filename = True,
>                    equate = False),
> 
> The first argument is a list of names, aliases which can
> be used via the (legacy) set_parameter method. Of
> these the first is what goes in the actual command
> string, and the last must be a valid Python identifier
> and becomes a property and a keyword argument
> for the __init__ method (and ideally follow PEP8
> guidelines).
> 

Yeah, unfortunately I wasn't aware of this detail.

> Normally the _Option would just have TWO aliases,
> in this case ["--negSet", "negset"] would seem best.
> 
> Clearly I'd not documented this well enough, but
> I've tried to make this more explicit now:
> https://github.com/biopython/biopython/commit/39a88714ab7ee7a8dc4ed2b7a7ea71569fdd4293
> 
> Was there a special reason for all these case variants
> in the XXmotif options??
> 

I basically followed the example set by
Bio/Align/Applications/_Clustalw.py.  The "rationale" was to allow for
people to use their favourite spelling variety.

I guess it was bad luck this happened to serve as an example, as it
was the first piece of code I ever touched in BioPython.

It would be nice to streamline all application wrappers in this regard
sometime...

Chris

From p.j.a.cock at googlemail.com  Thu Aug  8 07:00:22 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Thu, 8 Aug 2013 12:00:22 +0100
Subject: [Biopython-dev] Adjusting the xxMotif wrapper / Bio.Application
	plans
In-Reply-To: <520374DF.9070301@brueffer.de>
References: <CAKVJ-_4S7NeyFmjss5hEUN+2NVBGT4Lmd0AD_FgWPeO9LOymLg@mail.gmail.com>
	<520374DF.9070301@brueffer.de>
Message-ID: <CAKVJ-_5BRKC9Du28dNxZRkSsjKKFqn0vcVRfgD6eBZd0oNr+CQ@mail.gmail.com>

On Thu, Aug 8, 2013 at 11:37 AM, Christian Brueffer
<christian at brueffer.de> wrote:
>>
>> Was there a special reason for all these case variants
>> in the XXmotif options??
>
> I basically followed the example set by
> Bio/Align/Applications/_Clustalw.py.

Ah. Without checking I think maybe the ClustalW documentation
used both cases - but the order was deliberately with the lower
case one last as that was used in the Python object as the
property name and keyword.

> The "rationale" was to allow for people to use their favourite
> spelling variety.
>
> I guess it was bad luck this happened to serve as an example, as it
> was the first piece of code I ever touched in BioPython.
>
> It would be nice to streamline all application wrappers in this regard
> sometime...

Yeah, perhaps we can formally deprecate set_parameter in
the next release which means all the aliases 'go away' and
that leaves us with just the final entry exposed as the usable
property name and keyword.

Peter

From arklenna at gmail.com  Thu Aug  8 15:54:58 2013
From: arklenna at gmail.com (Lenna Peterson)
Date: Thu, 8 Aug 2013 15:54:58 -0400
Subject: [Biopython-dev] PDB occupancy behavior
Message-ID: <CAHQkFdd9W+OAMuiKUWz7LoPKo_BudzyZgu_5gaGesuwfV2NSZw@mail.gmail.com>

Hi all,

I just submitted a pull request I'd like wider feedback on.

https://github.com/biopython/biopython/pull/207

In summary, I am using software-produced PDB files that simply stop after
the coordinate data, so occupancy data is missing. Currently, the Biopython
PDBParser sets missing or blank occupancy to 0.0. I am suggesting changing
this to 1.0.
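
As a rough illustration of the behaviour being proposed (this is a
standalone sketch, not the PDBParser code or the patch in the pull
request), the occupancy field can be read from its fixed columns and
replaced by a default when the line is too short or the field is blank.
The column positions follow the PDB format's ATOM record layout, where
occupancy occupies columns 55-60. The function name is made up.

```python
def parse_occupancy(line, default=1.0):
    """Return the occupancy from an ATOM/HETATM line, or the default.

    Columns 55-60 (1-based) hold the occupancy, which is line[54:60]
    in Python slicing. Truncated or blank fields yield the default.
    """
    field = line[54:60].strip()
    if not field:
        return default
    return float(field)


# Build the fixed-width part up to the z coordinate (54 characters).
coords = "ATOM  %5d %-4s%1s%3s %1s%4d%1s   %8.3f%8.3f%8.3f" % (
    1, " N", " ", "MET", "A", 1, " ", 27.340, 24.430, 2.614)

full_line = coords + "%6.2f%6.2f" % (0.50, 9.67)
truncated_line = coords              # file stops after the coordinates
blank_line = coords + " " * 12       # fields present but empty

print(parse_occupancy(full_line))       # 0.5
print(parse_occupancy(truncated_line))  # 1.0
print(parse_occupancy(blank_line))      # 1.0
```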

I would like to see if anyone knows of situations in which this would be a
bad idea.

Cheers,

Lenna

From anaryin at gmail.com  Thu Aug  8 16:02:39 2013
From: anaryin at gmail.com (João Rodrigues)
Date: Thu, 8 Aug 2013 13:02:39 -0700
Subject: [Biopython-dev] [Biopython] PDB occupancy behavior
In-Reply-To: <CAHQkFdd9W+OAMuiKUWz7LoPKo_BudzyZgu_5gaGesuwfV2NSZw@mail.gmail.com>
References: <CAHQkFdd9W+OAMuiKUWz7LoPKo_BudzyZgu_5gaGesuwfV2NSZw@mail.gmail.com>
Message-ID: <CAJ9sUYNLHwpGsSWXC4k-bEdGeao60ZvCf+hnKKgTH3GrZF+8SA@mail.gmail.com>

Hi Lenna,

As I mentioned in the Github email, I think it's fine. It doesn't matter if
the occupancy is 0 or 1 in case of a model most of the time. I agree with
it. The only bad thing I can think about is having occupancy for a certain
atom larger than 1 in some bogus cases but to be honest, no software that I
know of bothers checking that...

Cheers,

João


2013/8/8 Lenna Peterson <arklenna at gmail.com>

> Hi all,
>
> I just submitted a pull request I'd like wider feedback on.
>
> https://github.com/biopython/biopython/pull/207
>
> In summary, I am using software-produced PDB files that simply stop after
> the coordinate data, so occupancy data is missing. Currently, the Biopython
> PDBParser sets missing or blank occupancy to 0.0. I am suggesting changing
> this to 1.0.
>
> I would like to see if anyone knows of situations in which this would be a
> bad idea.
>
> Cheers,
>
> Lenna
> _______________________________________________
> Biopython mailing list  -  Biopython at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython
>


From p.j.a.cock at googlemail.com  Thu Aug  8 18:37:27 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Thu, 8 Aug 2013 23:37:27 +0100
Subject: [Biopython-dev] [Biopython] PDB occupancy behavior
In-Reply-To: <AA0AA81E-B2C5-4CF3-97B3-AC9A51D25999@nyumc.org>
References: <CAHQkFdd9W+OAMuiKUWz7LoPKo_BudzyZgu_5gaGesuwfV2NSZw@mail.gmail.com>
	<CAJ9sUYNLHwpGsSWXC4k-bEdGeao60ZvCf+hnKKgTH3GrZF+8SA@mail.gmail.com>
	<AA0AA81E-B2C5-4CF3-97B3-AC9A51D25999@nyumc.org>
Message-ID: <CAKVJ-_5o_sef+fUo6e-=Hfo=wiS4FrB8hS72CN31Yh2vdw4waw@mail.gmail.com>

Thanks everyone - that seems like a clear consensus, patch applied :)

Peter

On Thu, Aug 8, 2013 at 9:30 PM, Sampson, Jared <Jared.Sampson at nyumc.org> wrote:
> Thanks, Lenna and João -
>
> I also agree, 1.0 is a better default occupancy value.  For most
> structural manipulation purposes, unless specified otherwise, we must assume
> the atoms listed are present in the structure at full occupancy.  Setting a
> reduced occupancy can be useful for partially bound ligands, disordered
> loops, and so forth, but doing so is the exception, not the rule.
>
> Cheers,
> Jared
>
> --
> Jared Sampson
> Xiangpeng Kong Lab
> NYU Langone Medical Center
> Old Public Health Building, Room 610
> 341 East 25th Street
> New York, NY 10016
> 212-263-7898
> http://kong.med.nyu.edu/
>
>
>
>
> On Aug 8, 2013, at 4:02 PM, João Rodrigues <anaryin at gmail.com> wrote:
>
> Hi Lenna,
>
> As I mentioned in the Github email, I think it's fine. It doesn't matter
> if the occupancy is 0 or 1 in case of a model most of the time. I agree
> with it. The only bad thing I can think about is having occupancy for
> a certain atom larger than 1 in some bogus cases but to be honest,
> no software that I know of bothers checking that...
>
> Cheers,
>
> João
>
>
> 2013/8/8 Lenna Peterson <arklenna at gmail.com>
>
> Hi all,
>
> I just submitted a pull request I'd like wider feedback on.
>
> https://github.com/biopython/biopython/pull/207
>
> In summary, I am using software-produced PDB files that simply stop after
> the coordinate data, so occupancy data is missing. Currently, the
> Biopython PDBParser sets missing or blank occupancy to 0.0. I am
> suggesting changing this to 1.0.
>
> I would like to see if anyone knows of situations in which this would be a
> bad idea.
>
> Cheers,
>
> Lenna


From ben at benfulton.net  Thu Aug  8 21:03:10 2013
From: ben at benfulton.net (Ben Fulton)
Date: Thu, 8 Aug 2013 21:03:10 -0400
Subject: [Biopython-dev] 1.62b test coverage report
In-Reply-To: <CAKVJ-_5dL-Fsd3UtDneGgLR2PJVDfHaFFbi6+tSeYqV1DZYNNw@mail.gmail.com>
References: <CA+ijMs=yT-=CFr+qwkOZ107oBN0wEFdjC9uMFCh+j1YfDD4DZw@mail.gmail.com>
	<CAKVJ-_5+mSW5NkmxN94w-8qu+e=q4COyZaAx6UzCmrwuJdU9aQ@mail.gmail.com>
	<CAMC681mkn3FBBdyEMTkba1T60KoQmcR1NHLKHNLc1pUo9A5rWw@mail.gmail.com>
	<CA+ijMs=XOe5Q6cE5vCg_OdnjcTGA=ZjuCJVCjcWY+reW1=jnnQ@mail.gmail.com>
	<CAKVJ-_6Rp5RC4pM1wMJ4k2qyZKhOCqWmd8e-x+ak7dQqHOyXqw@mail.gmail.com>
	<CAKVJ-_4Cc2VQQBe0Jf_n0kNC9nEozPAXp75Ltv86s5kwqGiSdQ@mail.gmail.com>
	<CA+ijMsm-hCNoekmFY5gPOMb_TpfUXj4Mkb2TE7bhOHR-DSEOgA@mail.gmail.com>
	<CAKVJ-_6ZTQBeScaDOsOEUq3rm7GaR-KE0zV4NAvDP+WKcxY=WQ@mail.gmail.com>
	<CAKVJ-_55Mbhsqz17PHEVBm5w0zR=Uf+TNM4Acq8FNxGDSpcNDw@mail.gmail.com>
	<CAKVJ-_5dL-Fsd3UtDneGgLR2PJVDfHaFFbi6+tSeYqV1DZYNNw@mail.gmail.com>
Message-ID: <CA+ijMs=S9gh2Ys4ac3kLUwco+zfyTJ=SK5eBJwC4AG4MFNLt7A@mail.gmail.com>

Everything else is passing. The PopGen files pass as well after installing
them from source.


On Mon, Aug 5, 2013 at 9:43 AM, Peter Cock <p.j.a.cock at googlemail.com> wrote:

> On Mon, Aug 5, 2013 at 1:14 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> > On Mon, Aug 5, 2013 at 12:46 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> >> On Mon, Aug 5, 2013 at 2:28 AM, Ben Fulton <ben at benfulton.net> wrote:
> >>>
> >>> The site http://www.rubic.rdg.ac.uk/~mab/software.html is down, and I can't
> >>> find anywhere else to install the PopGen software from.
> >>>
> >>
> >> There seems to be a fairly recent snapshot on archive.org,
> >> http://web.archive.org/web/20120510013219/http://www.rubic.rdg.ac.uk/~mab/software.html
> >>
> >> Meanwhile, I have emailed Dr. Mark Beaumont at Reading
> >> University to ask about the server status.
> >
> > Mark has moved to Bristol:
> > http://www.maths.bris.ac.uk/people/profile/mamab
> >
> > FDist and DFDist are available here now:
> > http://www.maths.bris.ac.uk/~mamab/
> >
> > We need to update the Biopython documentation (and check
> > those versions from Bristol still work with our tests).
> >
> > Tiago, could you handle that?
>
> According to his email auto-reply, Tiago is away right now.
>
> I've updated a couple of URLs in the source code:
>
> https://github.com/biopython/biopython/commit/70667063701041b73147c502c933fa8bfde1d850
>
> Ben - did you see anything else which needs updating here?
>
> Thanks,
>
> Peter
>

From mok at bioxray.dk  Fri Aug  9 04:39:55 2013
From: mok at bioxray.dk (Morten Kjeldgaard)
Date: Fri, 9 Aug 2013 10:39:55 +0200
Subject: [Biopython-dev] PDB occupancy behavior
Message-ID: <F019C881-C8A3-4D8E-830F-0D1E30739622@bioxray.dk>

Lenna wrote:

>  In summary, I am using software-produced PDB files that simply stop after
>  the coordinate data, so occupancy data is missing. Currently, the Biopython
>  PDBParser sets missing or blank occupancy to 0.0. I am suggesting changing
>  this to 1.0.

I think it is an incorrect default behaviour to set the occupancy to 1 if it's not present in the file. If the occupancy is not there, you can't say anything about it, and it should be set to 0, so the current defaults are correct IMO.

If, for some reason, you NEED the occupancy to be 1, and it is not, it is very simple to write a loop modifying it. I.e. special needs should be taken care of in the user's program, not Bio.PDB.
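
A minimal sketch of such a loop follows. It assumes only that each atom exposes get_occupancy() and set_occupancy(), as Bio.PDB atom objects do; DemoAtom and fill_missing_occupancy are hypothetical names used so the example runs without a PDB file. With a parsed structure, the iterable would be structure.get_atoms().

```python
class DemoAtom:
    """Hypothetical stand-in exposing the two accessors used below."""

    def __init__(self, name, occupancy):
        self.name = name
        self._occupancy = occupancy

    def get_occupancy(self):
        return self._occupancy

    def set_occupancy(self, value):
        self._occupancy = value


def fill_missing_occupancy(atoms, placeholder=0.0, value=1.0):
    """Set occupancy to `value` on atoms still carrying `placeholder`.

    Returns the number of atoms that were changed.
    """
    changed = 0
    for atom in atoms:
        if atom.get_occupancy() == placeholder:
            atom.set_occupancy(value)
            changed += 1
    return changed


# Two atoms carry the 0.0 placeholder, one has a real partial occupancy.
atoms = [DemoAtom("N", 0.0), DemoAtom("CA", 0.5), DemoAtom("C", 0.0)]
n_changed = fill_missing_occupancy(atoms)
occupancies = [atom.get_occupancy() for atom in atoms]
```

Note that a loop like this cannot tell a placeholder 0.0 from an occupancy of 0.0 that was actually written in the file.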

Cheers,
Morten

-- 
Morten Kjeldgaard, asc. professor, MSc, PhD
Dept. of Molecular Biology and Genetics, Aarhus University
Gustav Wieds Vej 10C, Building 3135, DK-8000 Aarhus C, Denmark.


From mok at bioxray.dk  Fri Aug  9 04:33:37 2013
From: mok at bioxray.dk (Morten Kjeldgaard)
Date: Fri, 9 Aug 2013 10:33:37 +0200
Subject: [Biopython-dev] Redmine issue 2727 ready for pull
Message-ID: <0743AFDE-D1B2-4348-AFFE-3CE5CC227FE4@bioxray.dk>

Hi,

I've finally gotten around to following up to a very old patch I sent to the redmine bug tracker [1]. The patch addresses the problem that Bio.PDB does not parse the important CRYST1 record.  In the bug comments, Peter Cock asked to include the explanation of the new keys in the docstring. That has now been done.

Peter also asks about the default values chosen (if the CRYST1 header is not present). These are probably universally chosen default values in various crystallographic programs, and these values are also used in PDB entries containing NMR structures, for example.
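
For readers unfamiliar with the record, here is a rough sketch of reading a CRYST1 line by fixed columns. It is not the code from the patch: parse_cryst1 and the dictionary keys are hypothetical, and the column positions and the unit-cell placeholder (1.0 cell edges, 90 degree angles, space group P 1, Z of 1) follow my reading of the PDB format description, so please check them against the specification.

```python
def parse_cryst1(line):
    """Parse a fixed-column CRYST1 record into a dict (hypothetical keys)."""
    if not line.startswith("CRYST1"):
        raise ValueError("not a CRYST1 record")
    # Pad so that slicing a short line yields blanks instead of missing text.
    line = line.rstrip("\n").ljust(70)
    z_field = line[66:70].strip()
    return {
        "a": float(line[6:15]),       # columns 7-15
        "b": float(line[15:24]),      # columns 16-24
        "c": float(line[24:33]),      # columns 25-33
        "alpha": float(line[33:40]),  # columns 34-40
        "beta": float(line[40:47]),   # columns 41-47
        "gamma": float(line[47:54]),  # columns 48-54
        "space_group": line[55:66].strip(),  # columns 56-66
        "z": int(z_field) if z_field else None,  # columns 67-70
    }


# Build the placeholder record with a format string so the columns line up.
placeholder_line = "CRYST1%9.3f%9.3f%9.3f%7.2f%7.2f%7.2f %-11s%4d" % (
    1.0, 1.0, 1.0, 90.0, 90.0, 90.0, "P 1", 1)
cell = parse_cryst1(placeholder_line)
```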

My github branch containing the patch #2727 is in [2]. I am using Bio.PDB quite a lot, and I would like to contribute more to it in the future.

Cheers,
Morten


[1] https://redmine.open-bio.org/issues/2727
[2] https://github.com/mok0/biopython/tree/pdbwork

From p.j.a.cock at googlemail.com  Fri Aug  9 04:47:15 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 9 Aug 2013 09:47:15 +0100
Subject: [Biopython-dev] PDB occupancy behavior
In-Reply-To: <F019C881-C8A3-4D8E-830F-0D1E30739622@bioxray.dk>
References: <F019C881-C8A3-4D8E-830F-0D1E30739622@bioxray.dk>
Message-ID: <CAKVJ-_5=Z6aQ4gsCT3MXa3Rk-9RaGbt5vt7pAPdS6i=ufuiPyg@mail.gmail.com>

On Fri, Aug 9, 2013 at 9:39 AM, Morten Kjeldgaard <mok at bioxray.dk> wrote:

> Lenna wrote:
>
> >  In summary, I am using software-produced PDB files that simply stop after
> >  the coordinate data, so occupancy data is missing. Currently, the Biopython
> >  PDBParser sets missing or blank occupancy to 0.0. I am suggesting changing
> >  this to 1.0.
>
> I think it is an incorrect default behaviour to set the occupancy
> to 1 if it's not present in the file. If the occupancy is not there,
> you can't say anything about it, and it should be set to 0, so the
> current defaults are correct IMO.
>
> If, for some reason, you NEED the occupancy to be 1, and it
> is not, it is very simple to write a loop modifying it. I.e. special
> needs should be taken care of in the user's program, not Bio.PDB.
>
> Cheers,
> Morten
>
>
How about the special float values NaN or NA instead?
Or the Python special value None?

Peter

From mok at bioxray.dk  Fri Aug  9 05:07:13 2013
From: mok at bioxray.dk (Morten Kjeldgaard)
Date: Fri, 9 Aug 2013 11:07:13 +0200
Subject: [Biopython-dev] PDB occupancy behavior
In-Reply-To: <CAKVJ-_5=Z6aQ4gsCT3MXa3Rk-9RaGbt5vt7pAPdS6i=ufuiPyg@mail.gmail.com>
References: <F019C881-C8A3-4D8E-830F-0D1E30739622@bioxray.dk>
	<CAKVJ-_5=Z6aQ4gsCT3MXa3Rk-9RaGbt5vt7pAPdS6i=ufuiPyg@mail.gmail.com>
Message-ID: <3626CAF5-41E2-43C7-8C0E-49FC83786EE0@bioxray.dk>

On 09/08/2013, at 10:47, Peter Cock <p.j.a.cock at googlemail.com> wrote:

> How about the special float values NaN or NA instead?
> Or the Python special value None?

TBH I don't think there is any good reason to change the current defaults. On the contrary, we should be careful when changing default values since this might break users' programs.

My point is that Lenna wants to read files that do not follow the PDB standard, and so she needs to make provisions for that in her own program, not the toolkit.

Putting None in the value of a field that isn't there, but should be according to the format specification, is more reasonable, since it alerts the user to the fact that something is fishy. However, it should only be done this way if that is a philosophy used throughout the Biopython toolkit. Is it?

I would warn against using NaN since it is non-pythonic and a nightmare to deal with in practice.
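
A small illustration of the practical difference, using only the standard library (the variable names are just for this example): NaN slips through comparisons and arithmetic silently, while None is easy to test for and fails loudly when used as a number.

```python
import math

missing_nan = float("nan")
missing_none = None

# NaN never compares equal, not even to itself, so equality checks fail quietly.
nan_equals_itself = missing_nan == missing_nan

# NaN also propagates through arithmetic without raising anything.
mean_with_nan = (1.0 + 0.5 + missing_nan) / 3

# Detecting it needs an explicit helper call.
nan_detected = math.isnan(missing_nan)

# None is found with a plain identity check ...
none_detected = missing_none is None

# ... and raises TypeError as soon as it is used in arithmetic.
try:
    1.0 + missing_none
    none_raises = False
except TypeError:
    none_raises = True
```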

Cheers,
Morten


From p.j.a.cock at googlemail.com  Fri Aug  9 07:06:46 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 9 Aug 2013 12:06:46 +0100
Subject: [Biopython-dev] PDB occupancy behavior
In-Reply-To: <3626CAF5-41E2-43C7-8C0E-49FC83786EE0@bioxray.dk>
References: <F019C881-C8A3-4D8E-830F-0D1E30739622@bioxray.dk>
	<CAKVJ-_5=Z6aQ4gsCT3MXa3Rk-9RaGbt5vt7pAPdS6i=ufuiPyg@mail.gmail.com>
	<3626CAF5-41E2-43C7-8C0E-49FC83786EE0@bioxray.dk>
Message-ID: <CAKVJ-_7chOdtn+H+947QG+j1j_++btNqTO326QByuT5Cx95Xng@mail.gmail.com>

On Fri, Aug 9, 2013 at 10:07 AM, Morten Kjeldgaard <mok at bioxray.dk> wrote:

> On 09/08/2013, at 10:47, Peter Cock <p.j.a.cock at googlemail.com> wrote:
>
> > How about the special float values NaN or NA instead?
> > Or the Python special value None?
>
> TBH I don't think there is any good reason to change the current defaults.
> On the contrary, we should be careful when changing default values since
> this might break users' programs.
>
> My point is that Lenna wants to read files that do not follow the PDB
> standard, and so she needs to make provisions for that in her own program,
> not the toolkit.
>
>
Do you think this should be something handled differently in strict and
permissive mode? Should missing occupancy give a warning or error in strict
mode?

Peter

From arklenna at gmail.com  Fri Aug  9 09:07:41 2013
From: arklenna at gmail.com (Lenna Peterson)
Date: Fri, 9 Aug 2013 09:07:41 -0400
Subject: [Biopython-dev] PDB occupancy behavior
In-Reply-To: <CAKVJ-_7chOdtn+H+947QG+j1j_++btNqTO326QByuT5Cx95Xng@mail.gmail.com>
References: <F019C881-C8A3-4D8E-830F-0D1E30739622@bioxray.dk>
	<CAKVJ-_5=Z6aQ4gsCT3MXa3Rk-9RaGbt5vt7pAPdS6i=ufuiPyg@mail.gmail.com>
	<3626CAF5-41E2-43C7-8C0E-49FC83786EE0@bioxray.dk>
	<CAKVJ-_7chOdtn+H+947QG+j1j_++btNqTO326QByuT5Cx95Xng@mail.gmail.com>
Message-ID: <CAHQkFddQ2H4BV8+MpYL7QxBKWFo7C-1+cno8p6-AAxPaPzRErg@mail.gmail.com>

On Friday, 9 August 2013, Peter Cock wrote:

> On Fri, Aug 9, 2013 at 10:07 AM, Morten Kjeldgaard <mok at bioxray.dk> wrote:
>
> > On 09/08/2013, at 10:47, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> >
> > > How about the special float values NaN or NA instead?
> > > Or the Python special value None?
> >
> > TBH I don't think there is any good reason to change the current defaults.
> > On the contrary, we should be careful when changing default values since
> > this might break users' programs.
> >
> > My point is that Lenna wants to read files that do not follow the PDB
> > standard, and so she needs to make provisions for that in her own program,
> > not the toolkit.
> >
> >
> Do you think this should be something handled differently in strict and
> permissive mode? Should missing occupancy give a warning or error in strict
> mode?


(Resending to dev list)

None in permissive mode makes a lot of sense to me.

Missing occupancy is a fatal error in strict mode.
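
To make the proposal concrete, here is a rough sketch of the behaviour being discussed. It is not the Bio.PDB implementation: parse_occupancy and OccupancyError are hypothetical names, and the only format fact relied on is that occupancy sits in columns 55-60 of an ATOM record.

```python
import warnings


class OccupancyError(Exception):
    """Hypothetical error type used only in this sketch."""


def parse_occupancy(line, permissive=True):
    """Return the occupancy from columns 55-60 of an ATOM line, or None."""
    field = line[54:60].strip()
    if field:
        return float(field)
    if not permissive:
        # Strict mode: a missing field is fatal.
        raise OccupancyError("missing occupancy")
    # Permissive mode: warn and report the value as unknown.
    warnings.warn("missing occupancy, using None")
    return None


# Build the lines with format strings so the fixed columns line up.
prefix = "%-6s%5d %-4s%1s%3s %1s%4d%1s   " % (
    "ATOM", 1, " N  ", " ", "MET", "A", 1, " ")
truncated_line = prefix + "%8.3f%8.3f%8.3f" % (27.340, 24.430, 2.614)
complete_line = truncated_line + "%6.2f%6.2f" % (1.0, 9.67)

full_value = parse_occupancy(complete_line)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    permissive_value = parse_occupancy(truncated_line, permissive=True)

try:
    parse_occupancy(truncated_line, permissive=False)
    strict_failed = False
except OccupancyError:
    strict_failed = True
```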

Lenna

From p.j.a.cock at googlemail.com  Fri Aug  9 09:14:44 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 9 Aug 2013 14:14:44 +0100
Subject: [Biopython-dev] PDB occupancy behavior
In-Reply-To: <CAHQkFddQ2H4BV8+MpYL7QxBKWFo7C-1+cno8p6-AAxPaPzRErg@mail.gmail.com>
References: <F019C881-C8A3-4D8E-830F-0D1E30739622@bioxray.dk>
	<CAKVJ-_5=Z6aQ4gsCT3MXa3Rk-9RaGbt5vt7pAPdS6i=ufuiPyg@mail.gmail.com>
	<3626CAF5-41E2-43C7-8C0E-49FC83786EE0@bioxray.dk>
	<CAKVJ-_7chOdtn+H+947QG+j1j_++btNqTO326QByuT5Cx95Xng@mail.gmail.com>
	<CAHQkFddQ2H4BV8+MpYL7QxBKWFo7C-1+cno8p6-AAxPaPzRErg@mail.gmail.com>
Message-ID: <CAKVJ-_60-5tCa+dPwHEmtyMcr5O2v4xpZzNLnEwCwnRP4jfV+Q@mail.gmail.com>

On Fri, Aug 9, 2013 at 2:07 PM, Lenna Peterson <arklenna at gmail.com> wrote:
> On Friday, 9 August 2013, Peter Cock wrote:
>
>> On Fri, Aug 9, 2013 at 10:07 AM, Morten Kjeldgaard <mok at bioxray.dk> wrote:
>>
>> > On 09/08/2013, at 10:47, Peter Cock <p.j.a.cock at googlemail.com> wrote:
>> >
>> > > How about the special float values NaN or NA instead?
>> > > Or the Python special value None?
>> >
>> > TBH I don't think there is any good reason to change the current defaults.
>> > On the contrary, we should be careful when changing default values since
>> > this might break users' programs.
>> >
>> > My point is that Lenna wants to read files that do not follow the PDB
>> > standard, and so she needs to make provisions for that in her own
>> > program, not the toolkit.
>> >
>> >
>> Do you think this should be something handled differently in strict and
>> permissive mode? Should missing occupancy give a warning or error in strict
>> mode?
>
> (Resending to dev list)
>
> None in permissive mode makes a lot of sense to me.
>
> Missing occupancy is a fatal error in strict mode.
>
> Lenna

Good (error in strict mode).

Do you think a warning in permissive mode for missing occupancy
is also worth adding, or would using None as the value indicate
that nicely?

Peter

From arklenna at gmail.com  Fri Aug  9 09:46:54 2013
From: arklenna at gmail.com (Lenna Peterson)
Date: Fri, 9 Aug 2013 09:46:54 -0400
Subject: [Biopython-dev] PDB occupancy behavior
In-Reply-To: <CAKVJ-_60-5tCa+dPwHEmtyMcr5O2v4xpZzNLnEwCwnRP4jfV+Q@mail.gmail.com>
References: <F019C881-C8A3-4D8E-830F-0D1E30739622@bioxray.dk>
	<CAKVJ-_5=Z6aQ4gsCT3MXa3Rk-9RaGbt5vt7pAPdS6i=ufuiPyg@mail.gmail.com>
	<3626CAF5-41E2-43C7-8C0E-49FC83786EE0@bioxray.dk>
	<CAKVJ-_7chOdtn+H+947QG+j1j_++btNqTO326QByuT5Cx95Xng@mail.gmail.com>
	<CAHQkFddQ2H4BV8+MpYL7QxBKWFo7C-1+cno8p6-AAxPaPzRErg@mail.gmail.com>
	<CAKVJ-_60-5tCa+dPwHEmtyMcr5O2v4xpZzNLnEwCwnRP4jfV+Q@mail.gmail.com>
Message-ID: <CAHQkFdf6cbYyCcXhZixP89RtP8a4-RV8MO0_y8goyVM-wReUyw@mail.gmail.com>

On Friday, 9 August 2013, Peter Cock wrote:

> On Fri, Aug 9, 2013 at 2:07 PM, Lenna Peterson <arklenna at gmail.com> wrote:
> > On Friday, 9 August 2013, Peter Cock wrote:
> >
> >> On Fri, Aug 9, 2013 at 10:07 AM, Morten Kjeldgaard <mok at bioxray.dk> wrote:
> >>
> >> > On 09/08/2013, at 10:47, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> >> >
> >> > > How about the special float values NaN or NA instead?
> >> > > Or the Python special value None?
> >> >
> >> > TBH I don't think there is any good reason to change the current defaults.
> >> > On the contrary, we should be careful when changing default values since
> >> > this might break users' programs.
> >> >
> >> > My point is that Lenna wants to read files that do not follow the PDB
> >> > standard, and so she needs to make provisions for that in her own
> >> > program, not the toolkit.
> >> >
> >> >
> >> Do you think this should be something handled differently in strict and
> >> permissive mode? Should missing occupancy give a warning or error in strict
> >> mode?
> >
> > (Resending to dev list)
> >
> > None in permissive mode makes a lot of sense to me.
> >
> > Missing occupancy is a fatal error in strict mode.
> >
> > Lenna
>
> Good (error in strict mode).
>
> Do you think a warning in permissive mode for missing occupancy
> is also worth adding, or would using None as the value indicate
> that nicely?
>
> Peter
>


I have some concern about changing the type of an attribute but I imagine
any end user who cares about occupancy doesn't want spurious values of
either 1.0 or 0.0 anyway.

I'm not at a computer right now but I believe most problems in the PDB
parser are fatal in strict and warnings in permissive. So there should
already be a warning in place.

It occurred to me it would also be possible to create an "ultra-permissive"
mode designed for parsing computationally produced files, and suppress some
of the warnings (e.g. missing occupancy and B-factor). That way,
the current behavior could be left unchanged. Possibly a permissiveness
level (0 for strict, 1 for current permissive, 2 for even more permissive).

Anyway, I'd be happy to implement any of these options (current parser to
None, restore previous behavior and None in a new permissiveness level,
other?) and of course update the unit test.

Cheers,

Lenna

From p.j.a.cock at googlemail.com  Fri Aug  9 10:22:29 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 9 Aug 2013 15:22:29 +0100
Subject: [Biopython-dev] PDB occupancy behavior
In-Reply-To: <CAHQkFdf6cbYyCcXhZixP89RtP8a4-RV8MO0_y8goyVM-wReUyw@mail.gmail.com>
References: <F019C881-C8A3-4D8E-830F-0D1E30739622@bioxray.dk>
	<CAKVJ-_5=Z6aQ4gsCT3MXa3Rk-9RaGbt5vt7pAPdS6i=ufuiPyg@mail.gmail.com>
	<3626CAF5-41E2-43C7-8C0E-49FC83786EE0@bioxray.dk>
	<CAKVJ-_7chOdtn+H+947QG+j1j_++btNqTO326QByuT5Cx95Xng@mail.gmail.com>
	<CAHQkFddQ2H4BV8+MpYL7QxBKWFo7C-1+cno8p6-AAxPaPzRErg@mail.gmail.com>
	<CAKVJ-_60-5tCa+dPwHEmtyMcr5O2v4xpZzNLnEwCwnRP4jfV+Q@mail.gmail.com>
	<CAHQkFdf6cbYyCcXhZixP89RtP8a4-RV8MO0_y8goyVM-wReUyw@mail.gmail.com>
Message-ID: <CAKVJ-_7yb2pexpJqTbGmTGs3aW5goJBxBZAq2EJOhxCugOursQ@mail.gmail.com>

On Fri, Aug 9, 2013 at 2:46 PM, Lenna Peterson <arklenna at gmail.com> wrote:
> On Friday, 9 August 2013, Peter Cock wrote:
>>
>> Good (error in strict mode).
>>
>> Do you think a warning in permissive mode for missing occupancy
>> is also worth adding, or would using None as the value indicate
>> that nicely?
>>
>> Peter
>
>
>
> I have some concern about changing the type of an attribute but I imagine
> any end user who cares about occupancy doesn't want spurious values of
> either 1.0 or 0.0 anyway.
>
> I'm not at a computer right now but I believe most problems in the PDB
> parser are fatal in strict and warnings in permissive. So there should
> already be a warning in place.
>
> It occurred to me it would also be possible to create an "ultra-permissive"
> mode designed for parsing computationally produced files, and suppress some
> of the warnings (e.g. missing occupancy and B-factor). That way, the current
> behavior could be left unchanged. Possibly a permissiveness level (0 for
> strict, 1 for current permissive, 2 for even more permissive).
>
> Anyway, I'd be happy to implement any of these options (current parser to
> None, restore previous behavior and None in a new permissiveness level,
> other?) and of course update the unit test.

You should be able to silence the PDB warnings in two lines anyway,
so I don't think we really need an ultra-permissive no-warnings mode.
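
For reference, a sketch of what I mean. The two lines are the import of the warning category and the simplefilter call; noisy_parse is a hypothetical stand-in for a parser call, and the fallback class exists only so the example also runs on a machine without Biopython installed.

```python
import warnings

try:
    from Bio.PDB.PDBExceptions import PDBConstructionWarning
except ImportError:
    class PDBConstructionWarning(Warning):
        """Stand-in so the sketch also runs without Biopython installed."""


def noisy_parse():
    """Hypothetical stand-in for a parser call that emits a PDB warning."""
    warnings.warn("missing occupancy", PDBConstructionWarning)
    return "structure"


# catch_warnings keeps the filter change local and records what gets through.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Together with the import above, this is the line that silences them.
    warnings.simplefilter("ignore", PDBConstructionWarning)
    result = noisy_parse()
```

Because the filter matches on category, other warnings are still shown.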

Peter

From anaryin at gmail.com  Fri Aug  9 13:26:59 2013
From: anaryin at gmail.com (=?UTF-8?Q?Jo=C3=A3o_Rodrigues?=)
Date: Fri, 9 Aug 2013 10:26:59 -0700
Subject: [Biopython-dev] Moratorium on commits?
Message-ID: <CAJ9sUYO2sHeQRFsa1A-NYt-_9wSmZ6v7S4knMek_eyoZhhAzZw@mail.gmail.com>

Dear all,

The situation with the occupancy in the PDBParser led me to think of one
thing.

Since not everybody is in the same timezone, has the same availability,
etc, what about we introduce a brief moratorium over commits of say 3 days
(except for critical bug fixes)? This will give everybody probably enough
time to read the email and give their opinion.

The downside is that it will make things roll a bit slower but then again,
3 days is not so much..

Cheers,

João


From p.j.a.cock at googlemail.com  Fri Aug  9 15:06:21 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 9 Aug 2013 20:06:21 +0100
Subject: [Biopython-dev] Moratorium on commits?
In-Reply-To: <CAJ9sUYO2sHeQRFsa1A-NYt-_9wSmZ6v7S4knMek_eyoZhhAzZw@mail.gmail.com>
References: <CAJ9sUYO2sHeQRFsa1A-NYt-_9wSmZ6v7S4knMek_eyoZhhAzZw@mail.gmail.com>
Message-ID: <CAKVJ-_4YeZycGLpXarODFK8Uyy-P8RiA7x23v-a7LT0V9CzPVA@mail.gmail.com>

On Fri, Aug 9, 2013 at 6:26 PM, João Rodrigues <anaryin at gmail.com> wrote:
> Dear all,
>
> The situation with the occupancy in the PDBParser led me to think of one
> thing.
>
> Since not everybody is in the same timezone, has the same availability,
> etc, what about we introduce a brief moratorium over commits of say 3 days
> (except for critical bug fixes)? This will give everybody probably enough
> time to read the email and give their opinion.
>
> The downside is that it will make things roll a bit slower but then again,
> 3 days is not so much..
>
> Cheers,
>
> João

I don't think that's really needed for small commits like
this which are simple to interpret. In this case there were
three opinions in favour of the idea, with a fourth counter
view appearing later, resulting in a further tweak.

Longer periods of discussion are far more important on
large code additions or major changes.

Peter


From arklenna at gmail.com  Sat Aug 10 20:43:36 2013
From: arklenna at gmail.com (Lenna Peterson)
Date: Sat, 10 Aug 2013 20:43:36 -0400
Subject: [Biopython-dev] Redmine issue 2727 ready for pull
In-Reply-To: <0743AFDE-D1B2-4348-AFFE-3CE5CC227FE4@bioxray.dk>
References: <0743AFDE-D1B2-4348-AFFE-3CE5CC227FE4@bioxray.dk>
Message-ID: <CAHQkFdcWA+f40nRXioD-_-gO=n0FHEs=BU5zVjbjy_rb8gJ0HA@mail.gmail.com>

Hi Morten,

I think this looks great. Why not submit a pull request?

Cheers,
Lenna


On Fri, Aug 9, 2013 at 4:33 AM, Morten Kjeldgaard <mok at bioxray.dk> wrote:

> Hi,
>
> I've finally gotten around to following up to a very old patch I sent to
> the redmine bug tracker [1]. The patch addresses the problem that Bio.PDB
> does not parse the important CRYST1 record.  In the bug comments, Peter
> Cock asked to include the explanation of the new keys in the docstring.
> That has now been done.
>
> Peter also asks about the default values chosen (if the CRYST1 header is
> not present). These are probably universally chosen default values in
> various crystallographic programs, and these values are also used in PDB
> entries containinging NMR entries, for example.
>
> My github branch containing the patch #2727 is in [2]. I am using Bio.PDB
> quite a lot, and I would like to contribute more to it in the future.
>
> Cheers,
> Morten
>
>
> [1] https://redmine.open-bio.org/issues/2727
> [2] https://github.com/mok0/biopython/tree/pdbwork
> _______________________________________________
> Biopython-dev mailing list
> Biopython-dev at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython-dev
>

From mok at bioxray.dk  Sun Aug 11 14:33:05 2013
From: mok at bioxray.dk (Morten Kjeldgaard)
Date: Sun, 11 Aug 2013 20:33:05 +0200
Subject: [Biopython-dev] Redmine issue 2727 ready for pull
In-Reply-To: <CAHQkFdcWA+f40nRXioD-_-gO=n0FHEs=BU5zVjbjy_rb8gJ0HA@mail.gmail.com>
References: <0743AFDE-D1B2-4348-AFFE-3CE5CC227FE4@bioxray.dk>
	<CAHQkFdcWA+f40nRXioD-_-gO=n0FHEs=BU5zVjbjy_rb8gJ0HA@mail.gmail.com>
Message-ID: <BCB0465E-7439-4562-B746-7585D39E3C48@bioxray.dk>


On 11/08/2013, at 02:43, Lenna Peterson <arklenna at gmail.com> wrote:

> I think this looks great. Why not submit a pull request?

Thanks! Excuse me for my ignorance, but how do I submit a pull request? (I thought that is what I did by posting to the -dev list). 

Cheers,
Morten

From mok at bioxray.dk  Sun Aug 11 14:28:36 2013
From: mok at bioxray.dk (Morten Kjeldgaard)
Date: Sun, 11 Aug 2013 20:28:36 +0200
Subject: [Biopython-dev] Moratorium on commits?
In-Reply-To: <CAKVJ-_4YeZycGLpXarODFK8Uyy-P8RiA7x23v-a7LT0V9CzPVA@mail.gmail.com>
References: <CAJ9sUYO2sHeQRFsa1A-NYt-_9wSmZ6v7S4knMek_eyoZhhAzZw@mail.gmail.com>
	<CAKVJ-_4YeZycGLpXarODFK8Uyy-P8RiA7x23v-a7LT0V9CzPVA@mail.gmail.com>
Message-ID: <CEAB55B5-AC7B-46D6-A865-2CFBCD3DC93B@bioxray.dk>


On 09/08/2013, at 21:06, Peter Cock <p.j.a.cock at googlemail.com> wrote:

> On Fri, Aug 9, 2013 at 6:26 PM, João Rodrigues <anaryin at gmail.com> wrote:
>> Dear all,
>> 
>> The situation with the occupancy in the PDBParser led me to think of one
>> thing.
>> 
>> Since not everybody is in the same timezone, has the same availability,
>> etc, what about we introduce a brief moratorium over commits of say 3 days
>> (except for critical bug fixes)? This will give everybody probably enough
>> time to read the email and give their opinion.
>> 
>> The downside is that it will make things roll a bit slower but then again,
>> 3 days is not so much..
>> 
>> Cheers,
>> 
>> João
> 
> I don't think that's really needed for small commits like
> this which are simple to interpret. In this case there were
> three opinions in favour of the idea, with a fourth counter
> view appearing later, resulting in a further tweak.
> 
> Longer periods of discussion are far more important on
> large code additions or major changes.

Sorry, but I don't agree that this is a "small commit". It may not be large in terms of number of bytes, but it is large in terms of impact, since it affects users' programs in unpredictable ways. Whenever a change is made that affects values returned to the user, it is worth spending a few days discussing it,  to let people have a chance to think through the consequences of the change.

Cheers,
Morten


From arklenna at gmail.com  Sun Aug 11 14:40:38 2013
From: arklenna at gmail.com (Lenna Peterson)
Date: Sun, 11 Aug 2013 14:40:38 -0400
Subject: [Biopython-dev] Redmine issue 2727 ready for pull
In-Reply-To: <BCB0465E-7439-4562-B746-7585D39E3C48@bioxray.dk>
References: <0743AFDE-D1B2-4348-AFFE-3CE5CC227FE4@bioxray.dk>
	<CAHQkFdcWA+f40nRXioD-_-gO=n0FHEs=BU5zVjbjy_rb8gJ0HA@mail.gmail.com>
	<BCB0465E-7439-4562-B746-7585D39E3C48@bioxray.dk>
Message-ID: <CAHQkFdfE2GJK-5Pc2JZhZSL17oPZSQLpXAKZ87QAAqmtqZdvSQ@mail.gmail.com>

On Sun, Aug 11, 2013 at 2:33 PM, Morten Kjeldgaard <mok at bioxray.dk> wrote:

>
> On 11/08/2013, at 02:43, Lenna Peterson <arklenna at gmail.com> wrote:
>
> > I think this looks great. Why not submit a pull request?
>
> Thanks! Excuse me for my ignorance, but how do I submit a pull request? (I
> thought that is what I did by posting to the -dev list).
>
> Cheers,
> Morten


Hey Morten,

It's good to let the dev list know you have code ready to merge in, but if
you do it on github, it will show up here too:
https://github.com/biopython/biopython/pulls

Here's github's instructions:

https://help.github.com/articles/creating-a-pull-request

Cheers,

Lenna

From p.j.a.cock at googlemail.com  Sun Aug 11 16:50:46 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Sun, 11 Aug 2013 21:50:46 +0100
Subject: [Biopython-dev] Moratorium on commits?
In-Reply-To: <CEAB55B5-AC7B-46D6-A865-2CFBCD3DC93B@bioxray.dk>
References: <CAJ9sUYO2sHeQRFsa1A-NYt-_9wSmZ6v7S4knMek_eyoZhhAzZw@mail.gmail.com>
	<CAKVJ-_4YeZycGLpXarODFK8Uyy-P8RiA7x23v-a7LT0V9CzPVA@mail.gmail.com>
	<CEAB55B5-AC7B-46D6-A865-2CFBCD3DC93B@bioxray.dk>
Message-ID: <CAKVJ-_6OGVBrLrKgrs4Xo1Y7mzS1B01ZYd5V3em7vhp1NSGOPQ@mail.gmail.com>

On Sun, Aug 11, 2013 at 7:28 PM, Morten Kjeldgaard <mok at bioxray.dk> wrote:
>
> On 09/08/2013, at 21:06, Peter Cock <p.j.a.cock at googlemail.com> wrote:
>
>> On Fri, Aug 9, 2013 at 6:26 PM, João Rodrigues <anaryin at gmail.com> wrote:
>>> Dear all,
>>>
>>> The situation with the occupancy in the PDBParser led me to think of one
>>> thing.
>>>
>>> Since not everybody is in the same timezone, has the same availability,
>>> etc, what about we introduce a brief moratorium over commits of say 3
>>> days (except for critical bug fixes)? This will give everybody probably
>>> enough time to read the email and give their opinion.
>>>
>>> The downside is that it will make things roll a bit slower but then
>>> again, 3 days is not so much..
>>>
>>> Cheers,
>>>
>>> João
>>
>> I don't think that's really needed for small commits like
>> this which are simple to interpret. In this case there were
>> three opinions in favour of the idea, with a fourth counter
>> view appearing later, resulting in a further tweak.
>>
>> Longer periods of discussion are far more important on
>> large code additions or major changes.
>
> Sorry, but I don't agree that this is a "small commit". It may
> not be large in terms of number of bytes, but it is large in
> terms of impact, since it affects users' programs in
> unpredictable ways.

Hello again Morten,

I did mean small in terms of the size of the code change,
which I tried to make clear in the rest of the email, but
as discussed below, I also think the PDB occupancy
change was small in terms of behaviour.

> Whenever a change is made that affects values
> returned to the user, it is worth spending a few days
> discussing it,  to let people have a chance to think
> through the consequences of the change.

Almost any change impacts the user in some way.

I still feel this was a minor change (although of
course important to some, including you). This is
parsing of malformed PDB files where the user
ALREADY gets a warning (or error in strict mode,
where there would be no functional change) that
there is a problem with the occupancy data.

One reason why I specifically talked about small
commits (in the sense of a simple diff) above is
they are trivial to revert if the need arises, or as
in this case, modify:
https://github.com/biopython/biopython/commit/500c3c2ea900fd8c8f5123f571d4d9a244ee898e

This change was suggested and supported by
people who've been actively contributing to the
Biopython structural module for some time, so I
had reason to trust their good judgement, and as
I wrote at the time there was a clear consensus
with three people in all happy with the idea:
http://lists.open-bio.org/pipermail/biopython-dev/2013-August/010773.html

Changes where there isn't clear agreement are
generally discussed over a longer time period.

Note that Biopython is already relatively strict
about not breaking things and preserving backwards
compatibility (to the point where it does delay new
features). We do care about not breaking existing
scripts without warning - so when people speak up
on the list that something is likely to cause them
trouble, we do listen.

Is that any clearer?

Regards,

Peter


From zruan1991 at gmail.com  Sun Aug 11 18:04:10 2013
From: zruan1991 at gmail.com (Zheng Ruan)
Date: Sun, 11 Aug 2013 18:04:10 -0400
Subject: [Biopython-dev] Codon Alignment GSoC Update
Message-ID: <CABM7aFpcMT+BapVya1ESpfKNvvUpdFFTodEo250xkk00yKQZQw@mail.gmail.com>

Hi all,

An update on the Codon Alignment project can be found at http://zruanweb.com/.
Next week I will be implementing the Maximum Likelihood method for
dN/dS ratio estimation. I do not plan to write the optimization code
myself, since SciPy's optimization functions look most suitable here.
This would be a new dependency for Biopython. Is it okay to add this? Or
are there other functions in Biopython for optimization problems?
Thanks!

Best,
Zheng Ruan

From kai.blin at biotech.uni-tuebingen.de  Mon Aug 12 06:53:17 2013
From: kai.blin at biotech.uni-tuebingen.de (Kai Blin)
Date: Mon, 12 Aug 2013 12:53:17 +0200
Subject: [Biopython-dev] Moratorium on commits?
In-Reply-To: <CAJ9sUYO2sHeQRFsa1A-NYt-_9wSmZ6v7S4knMek_eyoZhhAzZw@mail.gmail.com>
References: <CAJ9sUYO2sHeQRFsa1A-NYt-_9wSmZ6v7S4knMek_eyoZhhAzZw@mail.gmail.com>
Message-ID: <5208BE9D.1090900@biotech.uni-tuebingen.de>

On 2013-08-09 19:26, João Rodrigues wrote:

Dear biopython devs,

> Since not everybody is in the same timezone, has the same availability,
> etc, what about we introduce a brief moratorium over commits of say 3 days
> (except for critical bug fixes)? This will give everybody probably enough
> time to read the email and give their opinion.

I've been through discussions like this before, in a lot of open source 
projects I'm involved in. I don't think this is a good step to take. 
Saying that "all patches need to wait unless they're special" will 
eventually lead to a dilution of what is considered special, and then 
lead to a point where most patches by core contributors happen to be 
special and patches by new contributors aren't. Because the policy 
doesn't explicitly state this, you then create a very unwelcoming 
atmosphere for the project. I would recommend considering whether avoiding
the occasional revert is worth that cost.

Personally, one of the things I like about BioPython is how fast I'm 
able to get bugfixes in.

My two cents,
Kai

-- 
Dipl.-Inform. Kai Blin         kai.blin at biotech.uni-tuebingen.de
Institute for Microbiology and Infection Medicine
Division of Microbiology/Biotechnology
Eberhard-Karls-Universität Tübingen
Auf der Morgenstelle 28                 Phone : ++49 7071 29-78841
D-72076 Tübingen                        Fax :   ++49 7071 29-5979
Germany
Homepage: http://www.mikrobio.uni-tuebingen.de/ag_wohlleben

From tiagoantao at gmail.com  Mon Aug 12 07:33:40 2013
From: tiagoantao at gmail.com (=?ISO-8859-1?Q?Tiago_Ant=E3o?=)
Date: Mon, 12 Aug 2013 12:33:40 +0100
Subject: [Biopython-dev] Moratorium on commits?
In-Reply-To: <5208BE9D.1090900@biotech.uni-tuebingen.de>
References: <CAJ9sUYO2sHeQRFsa1A-NYt-_9wSmZ6v7S4knMek_eyoZhhAzZw@mail.gmail.com>
	<5208BE9D.1090900@biotech.uni-tuebingen.de>
Message-ID: <CAA9RGEMdWw2NXC3vWBmUouC6YD=4sNAbaS3LzQSQr5ULJT0yFg@mail.gmail.com>

Hi,


On 12 August 2013 11:53, Kai Blin <kai.blin at biotech.uni-tuebingen.de> wrote:

> Personally, one of the things I like about BioPython is how fast I'm able
> to get bugfixes in.
>
>

I agree that the light approach to process is great. 99% of the patches are
uncontroversial and would suffer from a heavier process.

For the rare cases where there are problems, revert can be used. My code
has been reverted a couple of times and I am fine with that (when one
commits to a public project with shared ownership one should expect
peer-review, sometimes heated discussion and corrections - it is normal).

If one thinks a change can be problematic, an initial discussion would be a
good idea. Of course, sometimes we do not know until after the fact; then
again, the good thing about version control is that we can undo things...

Generally things have been working very well and I would not change the
process to something heavier just because of a single case. Single cases
should be sorted on a case-by-case basis, with no stress.

My 2p,
Tiago

From yeyanbo289 at gmail.com  Mon Aug 12 09:25:22 2013
From: yeyanbo289 at gmail.com (Yanbo Ye)
Date: Mon, 12 Aug 2013 21:25:22 +0800
Subject: [Biopython-dev] GSOC weekly update 8
Message-ID: <CADoMHjzscM52YLsW0HiYRSMPvxbO=11gLnKgJBApMr-ChGKA1w@mail.gmail.com>

Hi all,

My update about Biopython.Phylo project can be found here:
http://blog.yeyanbo.com/posts/google-summer-of-code-9.html

Best,
Yanbo

-- 

*Yanbo Ye*
*Guangzhou Institutes of Biomedicine and Health, *
*Chinese Academy of Sciences*
*190 Kaiyuan Avenue, Science Park, Guangzhou, China**
*
*
*
*Email: ye_yanbo at gibh.ac.cn*
*Web: http://www.yeyanbo.com*
*Phone: (86)-020-32093810*

From mok at bioxray.dk  Mon Aug 12 14:33:26 2013
From: mok at bioxray.dk (Morten Kjeldgaard)
Date: Mon, 12 Aug 2013 20:33:26 +0200
Subject: [Biopython-dev] Moratorium on commits?
In-Reply-To: <CAKVJ-_6OGVBrLrKgrs4Xo1Y7mzS1B01ZYd5V3em7vhp1NSGOPQ@mail.gmail.com>
References: <CAJ9sUYO2sHeQRFsa1A-NYt-_9wSmZ6v7S4knMek_eyoZhhAzZw@mail.gmail.com>
	<CAKVJ-_4YeZycGLpXarODFK8Uyy-P8RiA7x23v-a7LT0V9CzPVA@mail.gmail.com>
	<CEAB55B5-AC7B-46D6-A865-2CFBCD3DC93B@bioxray.dk>
	<CAKVJ-_6OGVBrLrKgrs4Xo1Y7mzS1B01ZYd5V3em7vhp1NSGOPQ@mail.gmail.com>
Message-ID: <677A1A76-6B62-43E4-A54E-695A834D6088@bioxray.dk>


On 11/08/2013, at 22:50, Peter Cock <p.j.a.cock at googlemail.com> wrote:

> I still feel this was a minor change (although of
> course important to some, including you). This is
> parsing of malformed PDB files where the user
> ALREADY gets a warning (or error in strict mode,
> where there would be no functional change) that
> there is a problem with the occupancy data.
> 
> One reason why I specifically talked about small
> commits (in the sense of a simple diff) above is
> they are trivial to revert if the need arises, or as
> in this case, modify:
> https://github.com/biopython/biopython/commit/500c3c2ea900fd8c8f5123f571d4d9a244ee898e
> 
> This change was suggested and supported by
> people who've been actively contributing to the
> Biopython structural module for some time, so I
> had reason to trust their good judgement, and as
> I wrote at the time there was a clear consensus
> with three people in all happy with the idea:
> http://lists.open-bio.org/pipermail/biopython-dev/2013-August/010773.html


I respect that you listen more to developers that have been contributing for a long time. That is quite understandable, but I hope that does not prevent me from contributing my opinions.

What prompted my response was the suggestion that the occupancy should be set to 1.0 if it is absent from the file, i.e. if the PDB file is malformed. I think that is incorrect behavior, and I say that not as a core developer, but as a crystallographer. If invalid data is present in the file, you do not want the toolkit transforming it into valid data.

After thinking about it, the suggestion to set values to None when they are not defined in a malformed file now appears quite reasonable, but if it is done this way with occupancies, it should also be done this way with B-factors, chain identifiers and other values that are mandatory in the file according to the format specs. From the user's perspective, if the values returned are None, you are alerted to the fact that something is wrong, and you should make an appropriate choice, whatever that may be.

Cheers,
Morten


From arklenna at gmail.com  Mon Aug 12 15:25:20 2013
From: arklenna at gmail.com (Lenna Peterson)
Date: Mon, 12 Aug 2013 15:25:20 -0400
Subject: [Biopython-dev] Moratorium on commits?
In-Reply-To: <677A1A76-6B62-43E4-A54E-695A834D6088@bioxray.dk>
References: <CAJ9sUYO2sHeQRFsa1A-NYt-_9wSmZ6v7S4knMek_eyoZhhAzZw@mail.gmail.com>
	<CAKVJ-_4YeZycGLpXarODFK8Uyy-P8RiA7x23v-a7LT0V9CzPVA@mail.gmail.com>
	<CEAB55B5-AC7B-46D6-A865-2CFBCD3DC93B@bioxray.dk>
	<CAKVJ-_6OGVBrLrKgrs4Xo1Y7mzS1B01ZYd5V3em7vhp1NSGOPQ@mail.gmail.com>
	<677A1A76-6B62-43E4-A54E-695A834D6088@bioxray.dk>
Message-ID: <CAHQkFddu6WGWwrRrC_1M50k24-1AiYAV6azmBP+EtOmA1DOSkg@mail.gmail.com>

On Mon, Aug 12, 2013 at 2:33 PM, Morten Kjeldgaard <mok at bioxray.dk> wrote:

>
> On 11/08/2013, at 22:50, Peter Cock <p.j.a.cock at googlemail.com> wrote:
>
> > I still feel this was a minor change (although of
> > course important to some, including you). This is
> > parsing of malformed PDB files where the user
> > ALREADY gets a warning (or error in strict mode,
> > where there would be no functional change) that
> > there is a problem with the occupancy data.
> >
> > One reason why I specifically talked about small
> > commits (in the sense of a simple diff) above is
> > they are trivial to revert if the need arises, or as
> > in this case, modify:
> >
> https://github.com/biopython/biopython/commit/500c3c2ea900fd8c8f5123f571d4d9a244ee898e
> >
> > This change was suggested and supported by
> > people who've been actively contributing to the
> > Biopython structural module for some time, so I
> > had reason to trust their good judgement, and as
> > I wrote at the time there was a clear consensus
> > with three people in all happy with the idea:
> >
> http://lists.open-bio.org/pipermail/biopython-dev/2013-August/010773.html
>
>
> I respect that you listen more to developers that have been contributing
> for a long time. That is quite understandable, but I hope that does not
> prevent me from contributing my opinions.
>
> What prompted my response was the suggestion that the occupancy should be
> set to 1.0 if it is absent from the file, i.e. if the PDB file is
> malformed. I think that is an incorrect behavior, and I say that not as a
> core developer, but as a crystallographer. If invalid data is present in
> the file, you do not want the toolkit transforming it to valid data.
>

 I appreciate the physical/practical feedback about the commits.

> After thinking about it, the suggestion to set values to None when they are
> not defined in a malformed file now appears quite reasonable, but if it is
> done this way with occupancies, it should also be done this way with
> B-factors, chain identifiers and other values that are mandatory in the
> file according to the format specs. From the user's perspective, if the
> values returned are None, you are alerted to the fact that something is
> wrong, and you should make an appropriate choice, whatever that may be.
>
>
I agree that `None` is a good warning value for missing data.

I just skimmed the code and summarized how some of the missing values are
handled:

* Serial number: 0
* Chain: fatal in both strict and permissive modes (i.e. no try/except)
* Coordinates: fatal in both strict and permissive modes
* Occupancy: we recently decided to set as None in permissive
* B-factor: 0.0 in permissive (code comment states this is PDB default)
* Model seq id: 0
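
To make the permissive/strict distinction in the list above concrete, here is
a minimal sketch of how a fixed-width numeric field could be read so that a
blank or malformed value becomes None with a warning (permissive) or an error
(strict). The helper name parse_optional_float and its arguments are
hypothetical and only for illustration; this is not the actual Bio.PDB parser
code. The column positions (occupancy in columns 55-60, B-factor in 61-66)
follow the PDB ATOM record layout.

```python
import warnings


def parse_optional_float(line, start, end, name, strict=False):
    """Return the float in line[start:end], or None if blank or invalid.

    Hypothetical helper for illustration; not Bio.PDB's actual code.
    """
    field = line[start:end]
    try:
        return float(field)
    except ValueError:
        message = "Missing or invalid %s: %r" % (name, field)
        if strict:
            # Strict mode: a malformed record is fatal.
            raise ValueError(message)
        # Permissive mode: warn, and flag the gap with None.
        warnings.warn(message)
        return None


# Build an ATOM record whose occupancy field (columns 55-60) is blank.
# The first 54 columns are padded so the later fields line up.
atom_line = "%-54s%6s%6.2f" % (
    "ATOM      1  N   MET A   1      27.340  24.430   2.614", "", 9.67)

occupancy = parse_optional_float(atom_line, 54, 60, "occupancy")
bfactor = parse_optional_float(atom_line, 60, 66, "B-factor")
```

With this approach the caller sees None for the occupancy and has to decide
what to do with it, rather than silently receiving 0.0 or 1.0.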

The StructureBuilder class also has certain ways of handling duplicate
residues and atoms that I'm not particularly familiar with. For example,
I'm not quite sure what will happen if successive atoms have missing serial
numbers.

PDB is a format where there's always a balance between absolute adherence
to the format and enough flexibility to deal with the wide range of
malformed files.

Lenna

From mok at bioxray.dk  Mon Aug 12 15:42:28 2013
From: mok at bioxray.dk (Morten Kjeldgaard)
Date: Mon, 12 Aug 2013 21:42:28 +0200
Subject: [Biopython-dev] Moratorium on commits?
In-Reply-To: <CAHQkFddu6WGWwrRrC_1M50k24-1AiYAV6azmBP+EtOmA1DOSkg@mail.gmail.com>
References: <CAJ9sUYO2sHeQRFsa1A-NYt-_9wSmZ6v7S4knMek_eyoZhhAzZw@mail.gmail.com>
	<CAKVJ-_4YeZycGLpXarODFK8Uyy-P8RiA7x23v-a7LT0V9CzPVA@mail.gmail.com>
	<CEAB55B5-AC7B-46D6-A865-2CFBCD3DC93B@bioxray.dk>
	<CAKVJ-_6OGVBrLrKgrs4Xo1Y7mzS1B01ZYd5V3em7vhp1NSGOPQ@mail.gmail.com>
	<677A1A76-6B62-43E4-A54E-695A834D6088@bioxray.dk>
	<CAHQkFddu6WGWwrRrC_1M50k24-1AiYAV6azmBP+EtOmA1DOSkg@mail.gmail.com>
Message-ID: <0F6D9BF5-BFAA-4118-8D90-936AC44A29FA@bioxray.dk>


On 12/08/2013, at 21:25, Lenna Peterson <arklenna at gmail.com> wrote:

> * B-factor: 0.0 in permissive (code comment states this is PDB default)

The default referred to in that code comment is what the PDB annotators put in that field if the information is not provided by the depositor (which could be the case for e.g. an NMR model). From the PDB Atomic Coordinate Entry Format Description, Version 3.30:

	* If the depositor provides the data, then the isotropic B value is given for the temperature factor.
	
	* If there are neither isotropic B values from the depositor, nor anisotropic temperature factors in ANISOU, then the default value of 0.0 is used for the temperature factor.

In other words, the PDB format specification has no recommendations for what default values should be used if the field is blank in a malformed file, only what their staff should put in the entry when they receive it from the depositor.

So IMO Biopython is free to use None if the B-value is missing in a malformed file.

(I haven't checked the other items that Lenna mentions.)

Cheers,
Morten

From anaryin at gmail.com  Mon Aug 12 15:51:03 2013
From: anaryin at gmail.com (=?UTF-8?Q?Jo=C3=A3o_Rodrigues?=)
Date: Mon, 12 Aug 2013 12:51:03 -0700
Subject: [Biopython-dev] Bio.PDB - Missing values (was Moratorium on
	commits?)
Message-ID: <CAJ9sUYOtO8xd2mVSnUNhbLMnhWXtd4z67vEuADwK8ChmHOLLLw@mail.gmail.com>

Hi all,

Moving to a new thread because this is a very specific issue.

I think that, from a programming point of view (but I'm a biologist so
correct me if I'm wrong) having None values upon parsing is probably a
better idea. Then, when writing, these should be translated to whatever
default there is in the PDB documentation.

Cheers,

João



From p.j.a.cock at googlemail.com  Mon Aug 12 16:36:15 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Mon, 12 Aug 2013 21:36:15 +0100
Subject: [Biopython-dev] Bio.PDB - Missing values (was Moratorium on
	commits?)
In-Reply-To: <CAJ9sUYOtO8xd2mVSnUNhbLMnhWXtd4z67vEuADwK8ChmHOLLLw@mail.gmail.com>
References: <CAJ9sUYOtO8xd2mVSnUNhbLMnhWXtd4z67vEuADwK8ChmHOLLLw@mail.gmail.com>
Message-ID: <CAKVJ-_47xSbDKwv85i5TtVVNfmaSHX0Vt3xq8JSyQrx2kKOAMA@mail.gmail.com>

On Monday, August 12, 2013, João Rodrigues wrote:

> Hi all,
>
> Moving to a new thread because this is a very specific issue.
>
> I think that, from a programming point of view (but I'm a biologist so
> correct me if I'm wrong) having None values upon parsing is probably a
> better idea. Then, when writing, these should be translated to whatever
> default there is in the PDB documentation.
>

Or throw an error to force the user to fix it?

Or write a blank occupancy to allow preservation of the
(flawed) input?

(Thank you for raising the output question now, it is a logical
consequence of putting None in the parsed structure)

Peter


From anaryin at gmail.com  Mon Aug 12 16:39:30 2013
From: anaryin at gmail.com (=?UTF-8?Q?Jo=C3=A3o_Rodrigues?=)
Date: Mon, 12 Aug 2013 13:39:30 -0700
Subject: [Biopython-dev] Bio.PDB - Missing values (was Moratorium on
	commits?)
In-Reply-To: <CAKVJ-_47xSbDKwv85i5TtVVNfmaSHX0Vt3xq8JSyQrx2kKOAMA@mail.gmail.com>
References: <CAJ9sUYOtO8xd2mVSnUNhbLMnhWXtd4z67vEuADwK8ChmHOLLLw@mail.gmail.com>
	<CAKVJ-_47xSbDKwv85i5TtVVNfmaSHX0Vt3xq8JSyQrx2kKOAMA@mail.gmail.com>
Message-ID: <CAJ9sUYOnjg5SqxikOL9_fYDSi2P46J8SKs5tQwKA8Mcf5PtxKg@mail.gmail.com>

Throwing an error might not be a good idea because when dealing with models
they sometimes have missing fields... then we'd have to fix them all
somehow before parsing them.

The None value seems a good indicator that something is amiss, while not
putting any value there. There should also be a warning upon writing that
the value is being replaced by a default value. Blank is also good
actually, maybe we could add an option to the writer/parser to "preserve"
values?

Cheers,

João


2013/8/12 Peter Cock <p.j.a.cock at googlemail.com>

>
>
> On Monday, August 12, 2013, João Rodrigues wrote:
>
>> Hi all,
>>
>> Moving to a new thread because this is a very specific issue.
>>
>> I think that, from a programming point of view (but I'm a biologist so
>> correct me if I'm wrong) having None values upon parsing is probably a
>> better idea. Then, when writing, these should be translated to whatever
>> default there is in the PDB documentation.
>>
>
> Or throw an error to force the user to fix it?
>
> Or write a blank occupancy to allow preservation of the
> (flawed) input?
>
> (Thank you for raising the output question now, it is a logically
> consequence of putting None in the parsed structure)
>
> Peter
>
>


From p.j.a.cock at googlemail.com  Mon Aug 12 16:40:24 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Mon, 12 Aug 2013 21:40:24 +0100
Subject: [Biopython-dev] Moratorium on commits?
In-Reply-To: <677A1A76-6B62-43E4-A54E-695A834D6088@bioxray.dk>
References: <CAJ9sUYO2sHeQRFsa1A-NYt-_9wSmZ6v7S4knMek_eyoZhhAzZw@mail.gmail.com>
	<CAKVJ-_4YeZycGLpXarODFK8Uyy-P8RiA7x23v-a7LT0V9CzPVA@mail.gmail.com>
	<CEAB55B5-AC7B-46D6-A865-2CFBCD3DC93B@bioxray.dk>
	<CAKVJ-_6OGVBrLrKgrs4Xo1Y7mzS1B01ZYd5V3em7vhp1NSGOPQ@mail.gmail.com>
	<677A1A76-6B62-43E4-A54E-695A834D6088@bioxray.dk>
Message-ID: <CAKVJ-_5BbciDv7qMyJTcs=Z73Zxcj3YdFM74zQy+-jyt1m=7gw@mail.gmail.com>

On Monday, August 12, 2013, Morten Kjeldgaard wrote:

>
> On 11/08/2013, at 22:50, Peter Cock <p.j.a.cock at googlemail.com> wrote:
>
> > I still feel this was a minor change (although of
> > course important to some, including you). This is
> > parsing of malformed PDB files where the user
> > ALREADY gets a warning (or error in strict mode,
> > where there would be no functional change) that
> > there is a problem with the occupancy data.
> >
> > One reason why I specifically talked about small
> > commits (in the sense of a simple diff) above is
> > they are trivial to revert if the need arises, or as
> > in this case, modify:
> >
> https://github.com/biopython/biopython/commit/500c3c2ea900fd8c8f5123f571d4d9a244ee898e
> >
> > This change was suggested and supported by
> > people who've been actively contributing to the
> > Biopython structural module for some time, so I
> > had reason to trust their good judgement, and as
> > I wrote at the time there was a clear consensus
> > with three people in all happy with the idea:
> >
> http://lists.open-bio.org/pipermail/biopython-dev/2013-August/010773.html
>
>
> I respect that you listen more to developers that
> have been contributing for a long time. That is quite
> understandable, but I hope that does not prevent
> me from contributing my opinions.


Of course not - your input (which was after the initial
change) has already resulted in a review of that
change and the adoption of None instead.

So thank you for speaking up,

Peter

From eric.talevich at gmail.com  Mon Aug 12 18:35:05 2013
From: eric.talevich at gmail.com (Eric Talevich)
Date: Mon, 12 Aug 2013 15:35:05 -0700
Subject: [Biopython-dev] Codon Alignment GSoC Update
In-Reply-To: <CABM7aFpcMT+BapVya1ESpfKNvvUpdFFTodEo250xkk00yKQZQw@mail.gmail.com>
References: <CABM7aFpcMT+BapVya1ESpfKNvvUpdFFTodEo250xkk00yKQZQw@mail.gmail.com>
Message-ID: <CAMC681=HhPimOqV3Rmmm6jDtt9aJhVdQEisiOp_-sL4g1YfjOQ@mail.gmail.com>

Hi Zheng,

Nice work this week. For the next tasks:

1. It's probably not a high priority to implement all of the dN/dS
approaches described in Yang's book (i.e. LWL85m, LPB93, Ina95), beyond the
simple early methods (NG86, LWL85)  and the finale, YN00. If you get around
to doing them all, cool, but if you only have time to do one more I'd pick
YN00.

2. SciPy is a relatively large dependency, so I recommend making it a
runtime import -- do the import from within the function that needs it,
rather than at the top-level scope of the module. E.g.:
Bio.Phylo._utils.to_networkx

3. Where are you focusing your documentation efforts? If you're keeping
most of the descriptions in the docstrings, it would be convenient to
format the text as reStructuredText for processing with Epydoc and Sphinx.
Time permitting, it would also be nice to have a chapter on this work in
the Tutorial, see Doc/Tutorial.tex (also fine to write this up as a
separate LaTeX document first and roll it in later).
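
As a rough sketch of the runtime-import pattern from point 2: the import
lives inside the function, so importing the module itself never requires
SciPy, and users only hit the dependency when they call the function. The
function name fit_parameter, the error message and the choice of the
Nelder-Mead method are assumptions made for this example, not part of the
codon alignment code.

```python
def fit_parameter(neg_log_likelihood, start):
    """Minimise a one-parameter function, importing SciPy only when called.

    Hypothetical example function; not part of Biopython.
    """
    try:
        # Deferred import: SciPy is needed only if this function is used.
        from scipy.optimize import minimize
    except ImportError:
        raise ImportError("Install SciPy if you want to use fit_parameter.")
    # The objective receives a sequence of parameters, here of length one.
    result = minimize(neg_log_likelihood, [start], method="Nelder-Mead")
    return result.x[0]
```

For example, fit_parameter(lambda x: (x[0] - 2.0) ** 2, 1.0) should return a
value close to 2.0 when SciPy is installed, and raise ImportError with a
readable message when it is not.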

Cheers,
Eric


On Sun, Aug 11, 2013 at 3:04 PM, Zheng Ruan <zruan1991 at gmail.com> wrote:

> Hi all,
>
> An update of Codon Alignment Project can be found at (http://zruanweb.com/).
> In the next week, I will be implementing the Maximum Likelihood method for
> dN/dS ratio estimation. I do not anticipate to write any code for the
> optimization and Scipy's functionality is most suitable to be used here.
> This might be a new dependency for Biopython. Is it okay to add this? Or
> are there some other functions in Biopython for optimization problems?
> Thanks!
>
> Best,
> Zheng Ruan
>

From eric.talevich at gmail.com  Mon Aug 12 19:03:07 2013
From: eric.talevich at gmail.com (Eric Talevich)
Date: Mon, 12 Aug 2013 16:03:07 -0700
Subject: [Biopython-dev] GSOC weekly update 8
In-Reply-To: <CADoMHjzscM52YLsW0HiYRSMPvxbO=11gLnKgJBApMr-ChGKA1w@mail.gmail.com>
References: <CADoMHjzscM52YLsW0HiYRSMPvxbO=11gLnKgJBApMr-ChGKA1w@mail.gmail.com>
Message-ID: <CAMC681m+W1gTT1hUJUthnbnjKyDg106BJdmWQuS+rQKApvx5=g@mail.gmail.com>

Hi Yanbo,

Looks like excellent progress.

At some point, would you mind documenting how the bit array operations are
used to represent trees, e.g. how a bit array (BitString instance) should
be interpreted in terms of taxa and tree topologies?

Thanks,
Eric


On Mon, Aug 12, 2013 at 6:25 AM, Yanbo Ye <yeyanbo289 at gmail.com> wrote:

> Hi all,
>
> My update about Biopython.Phylo project can be found here:
> http://blog.yeyanbo.com/posts/google-summer-of-code-9.html
>
> Best,
> Yanbo
>
> --
>
> *Yanbo Ye*
> *Guangzhou Institutes of Biomedicine and Health, *
> *Chinese Academy of Sciences*
> *190 Kaiyuan Avenue, Science Park, Guangzhou, China**
> *
> *
> *
> *Email: ye_yanbo at gibh.ac.cn*
> *Web: http://www.yeyanbo.com*
> *Phone: (86)-020-32093810*
>

From p.j.a.cock at googlemail.com  Wed Aug 14 05:44:24 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 14 Aug 2013 10:44:24 +0100
Subject: [Biopython-dev] setuptools breaking biopython-1.62b installation
In-Reply-To: <1374797351.81889.YahooMailNeo@web164002.mail.gq1.yahoo.com>
References: <1374651068.98742.YahooMailNeo@web164005.mail.gq1.yahoo.com>
	<CAKVJ-_6SYCoL+2d=Cogkf4ws=iDTyxc0qdAfZZRJi-5jYc3TqA@mail.gmail.com>
	<86a9lcl1nt.fsf@fastmail.fm>
	<CAKVJ-_4kvUGeOW4CZ15swd_e48t8B_MrWCaO4mESAqC9_uLdYA@mail.gmail.com>
	<CAKVJ-_5ABuwE59WOfXW26yyEWff8ob-8KE+vMipMigv4bFLZfQ@mail.gmail.com>
	<1374797351.81889.YahooMailNeo@web164002.mail.gq1.yahoo.com>
Message-ID: <CAKVJ-_4yYku2hbPATWOjjRc7J6sbpeqKhM8q8woBfedax54w7Q@mail.gmail.com>

On Friday, July 26, 2013 Peter wrote:
> On Wed, Jul 24, 2013  Peter Cock wrote:
>> On Wed, Jul 24, 2013 Brad Chapman wrote:
>>>
>>> Peter and Michiel;
>>>
>>>>> Do we actually need setuptools?
>>>>> Looking at setup.py, it seems that distutils is sufficient for our
>>>>> needs.
>>>>> If so, let's remove the dependency on setuptools.
>>>
>>> We used setuptools/distribute to install dependencies, although
>>> practically this doesn't work well since pip doesn't finish NumPy
>>> installation before installing Biopython. So I'm fine with taking it out
>>> if you want to simplify the setup and avoid the extra dependency.
>>
>> Sounds like a plan - but we should all test this change, especially
>> users of PIP, easy_install, virtual env etc.
>>
>
> So who's going to do the commit - Brad or Michiel?
>
> Peter
>

On Fri, Jul 26, 2013 at 1:09 AM, Michiel de Hoon <mjldehoon at yahoo.com> wrote:
> Brad, can you do it?
> Best,
> -Michiel.

I've done it:
https://github.com/biopython/biopython/commit/f8e51906709d0c85be9f2b921eb3f68eed5524f9

This needs some more testing now - particularly with the
non-standard install options like pip, easy_install, etc.

Peter

From p.j.a.cock at googlemail.com  Thu Aug 15 07:28:47 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Thu, 15 Aug 2013 12:28:47 +0100
Subject: [Biopython-dev] Releasing Biopython 1.62 next week?
Message-ID: <CAKVJ-_4+Qh6HQcWPPeb-yFmdDchCY2oQ2kths=GtJfK2srx+7w@mail.gmail.com>

Hello all,

Are there any remaining issues people think need to be
resolved prior to releasing Biopython 1.62? If not, unless
anyone else volunteers, I will make time for this next week.

Possible issues worth reviewing - please reply on the
existing threads:

Changes to setup.py to remove use of setuptools,
this would benefit from wider testing:
https://github.com/biopython/biopython/commit/f8e51906709d0c85be9f2b921eb3f68eed5524f9
http://lists.open-bio.org/pipermail/biopython-dev/2013-August/010806.html

Changes to PDB occupancy, do we need to change
PDB writing in light of this?
http://lists.open-bio.org/pipermail/biopython-dev/2013-August/010802.html

Update the Prank tool test to work with recent versions:
http://lists.open-bio.org/pipermail/biopython-dev/2013-August/010757.html

Note that PyPy now has a beta out supporting Python 3;
it would be nice to fully test with that as well...
http://morepypy.blogspot.co.uk/2013/07/pypy3-21-beta-1.html

Thanks,

Peter

From arklenna at gmail.com  Thu Aug 15 09:18:35 2013
From: arklenna at gmail.com (Lenna Peterson)
Date: Thu, 15 Aug 2013 09:18:35 -0400
Subject: [Biopython-dev] Bio.PDB - Missing values (was Moratorium on
	commits?)
In-Reply-To: <CAJ9sUYOnjg5SqxikOL9_fYDSi2P46J8SKs5tQwKA8Mcf5PtxKg@mail.gmail.com>
References: <CAJ9sUYOtO8xd2mVSnUNhbLMnhWXtd4z67vEuADwK8ChmHOLLLw@mail.gmail.com>
	<CAKVJ-_47xSbDKwv85i5TtVVNfmaSHX0Vt3xq8JSyQrx2kKOAMA@mail.gmail.com>
	<CAJ9sUYOnjg5SqxikOL9_fYDSi2P46J8SKs5tQwKA8Mcf5PtxKg@mail.gmail.com>
Message-ID: <CAHQkFdf0kHMkMaTipz7Odr_1=6qwPNyv2DHYwO3X4=fDnmcwKg@mail.gmail.com>

On Monday, 12 August 2013, João Rodrigues wrote:

> Throwing an error might not be a good idea because when dealing with models
> they sometimes have missing fields... then we'd have to fix them all
> somehow before parsing them.
>
> The None value seems a good indicator that something is amiss, while not
> putting any value there. There should also be a warning upon writing that
> the value is being replaced by a default value. Blank is also good
> actually, maybe we could add an option to the writer/parser to "preserve"
> values?
>
>
I don't think writing the string "None" into a fixed-width field would be a
good idea. So it's probably best to write occupancy (and any other missing
values set to None) as blank fields of the correct width.

I've never tangled with the writer and I have incoming PhD students this
week but I can attempt to add this functionality early next week.

Lenna


From p.j.a.cock at googlemail.com  Thu Aug 15 09:23:50 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Thu, 15 Aug 2013 14:23:50 +0100
Subject: [Biopython-dev] Bio.PDB - Missing values (was Moratorium on
	commits?)
In-Reply-To: <CAHQkFdf0kHMkMaTipz7Odr_1=6qwPNyv2DHYwO3X4=fDnmcwKg@mail.gmail.com>
References: <CAJ9sUYOtO8xd2mVSnUNhbLMnhWXtd4z67vEuADwK8ChmHOLLLw@mail.gmail.com>
	<CAKVJ-_47xSbDKwv85i5TtVVNfmaSHX0Vt3xq8JSyQrx2kKOAMA@mail.gmail.com>
	<CAJ9sUYOnjg5SqxikOL9_fYDSi2P46J8SKs5tQwKA8Mcf5PtxKg@mail.gmail.com>
	<CAHQkFdf0kHMkMaTipz7Odr_1=6qwPNyv2DHYwO3X4=fDnmcwKg@mail.gmail.com>
Message-ID: <CAKVJ-_4VkgXev6ni5n_oVLwJvyUGMuUttF-oKek3-6xLFQDvXw@mail.gmail.com>

On Thu, Aug 15, 2013 at 2:18 PM, Lenna Peterson <arklenna at gmail.com> wrote:
> On Monday, 12 August 2013, João Rodrigues wrote:
>>
>> Throwing an error might not be a good idea because when dealing with
>> models
>> they sometimes have missing fields... then we'd have to fix them all
>> somehow before parsing them.
>>
>> The None value seems a good indicator that something is amiss, while not
>> putting any value there. There should also be a warning upon writing that
>> the value is being replaced by a default value. Blank is also good
>> actually, maybe we could add an option to the writer/parser to "preserve"
>> values?
>>
>
> I don't think writing string "None" into a fixed width field would be a good
> idea. So it's probably best to change occupancy (and any other missing
> values set to None) to blank, correct width fields for writing.

I didn't mean to suggest writing the string "None" in the field, and
I'm not sure if João did - it would certainly be an invalid PDB file.

I agree that where the data structure has None (e.g. from our parser)
then the writer could use a blank string (of the appropriate width).
For mandatory fields like occupancy, this should give a warning.
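
A minimal sketch of that writer behaviour, assuming a hypothetical helper
named format_optional_float (this is not the existing PDBIO code): a None
value is written as blanks of the field width with a warning, and anything
else is formatted as a right-justified fixed-point number. The last line
shows what plain string formatting does with None, which is the case to
avoid.

```python
import warnings


def format_optional_float(value, width=6, decimals=2, name="occupancy"):
    """Format a float for a fixed-width column; None becomes blanks.

    Hypothetical helper for illustration; not the PDBIO implementation.
    """
    if value is None:
        # Mandatory field is missing: preserve the gap, but tell the user.
        warnings.warn("Missing %s written as a blank field" % name)
        return " " * width
    return "%*.*f" % (width, decimals, value)


print(repr(format_optional_float(1.0)))   # '  1.00'
print(repr(format_optional_float(None)))  # '      '
print(repr("%6s" % None))                 # '  None', which breaks the column
```

The defaults of width 6 and two decimals match the occupancy and B-factor
columns of an ATOM record; other fields would need their own widths.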

> I've never tangled with the writer and I have incoming PhD students this
> week but I can attempt to add this functionality early next week.

That would be great (assuming no-one else wants to tackle it sooner).

Thanks,

Peter


From arklenna at gmail.com  Thu Aug 15 10:54:53 2013
From: arklenna at gmail.com (Lenna Peterson)
Date: Thu, 15 Aug 2013 10:54:53 -0400
Subject: [Biopython-dev] Bio.PDB - Missing values (was Moratorium on
	commits?)
In-Reply-To: <CAKVJ-_4VkgXev6ni5n_oVLwJvyUGMuUttF-oKek3-6xLFQDvXw@mail.gmail.com>
References: <CAJ9sUYOtO8xd2mVSnUNhbLMnhWXtd4z67vEuADwK8ChmHOLLLw@mail.gmail.com>
	<CAKVJ-_47xSbDKwv85i5TtVVNfmaSHX0Vt3xq8JSyQrx2kKOAMA@mail.gmail.com>
	<CAJ9sUYOnjg5SqxikOL9_fYDSi2P46J8SKs5tQwKA8Mcf5PtxKg@mail.gmail.com>
	<CAHQkFdf0kHMkMaTipz7Odr_1=6qwPNyv2DHYwO3X4=fDnmcwKg@mail.gmail.com>
	<CAKVJ-_4VkgXev6ni5n_oVLwJvyUGMuUttF-oKek3-6xLFQDvXw@mail.gmail.com>
Message-ID: <CAHQkFdenX9S8d6de6RYBW1wAHpxXi4aP7qTtzNm3Hy6H-nkn5Q@mail.gmail.com>

> > I don't think writing string "None" into a fixed width field would be a
> good
> > idea. So it's probably best to change occupancy (and any other missing
> > values set to None) to blank, correct width fields for writing.
>
> I didn't mean to suggest writing the string "None" in the field, and
> I'm not sure if João did - it would certainly be an invalid PDB file.
>
>
I didn't mean anyone was suggesting we intentionally do this, but I bet
that's what the writer is doing now!


From eric.talevich at gmail.com  Thu Aug 15 13:35:00 2013
From: eric.talevich at gmail.com (Eric Talevich)
Date: Thu, 15 Aug 2013 10:35:00 -0700
Subject: [Biopython-dev] 1.62b test coverage report
In-Reply-To: <CAKVJ-_6Rp5RC4pM1wMJ4k2qyZKhOCqWmd8e-x+ak7dQqHOyXqw@mail.gmail.com>
References: <CA+ijMs=yT-=CFr+qwkOZ107oBN0wEFdjC9uMFCh+j1YfDD4DZw@mail.gmail.com>
	<CAKVJ-_5+mSW5NkmxN94w-8qu+e=q4COyZaAx6UzCmrwuJdU9aQ@mail.gmail.com>
	<CAMC681mkn3FBBdyEMTkba1T60KoQmcR1NHLKHNLc1pUo9A5rWw@mail.gmail.com>
	<CA+ijMs=XOe5Q6cE5vCg_OdnjcTGA=ZjuCJVCjcWY+reW1=jnnQ@mail.gmail.com>
	<CAKVJ-_6Rp5RC4pM1wMJ4k2qyZKhOCqWmd8e-x+ak7dQqHOyXqw@mail.gmail.com>
Message-ID: <CAMC681k-zTdyM4_WEoF2dH70tCBxfVm=a=0iB+F0gHxdWfWDRA@mail.gmail.com>

On Fri, Aug 2, 2013 at 2:31 AM, Peter Cock <p.j.a.cock at googlemail.com> wrote:

> Thanks for these details Ben - it sounds like a mixture of real
> test failures, and mere warnings that an external tool wasn't
> found.
>
> On Fri, Aug 2, 2013 at 3:20 AM, Ben Fulton <ben at benfulton.net> wrote:
> > My test machine was running Ubuntu 12.04.
> >
> > For fasttree I installed version 2.1.4-1~ubuntu12.04.1 using apt-get, and
> > got this error:
> > ApplicationError: Command 'fasttree -out temp_test.tree
> > Quality/example.fasta' returned non-zero exit status 1, 'Unknown or
> > incorrect use of option -out'
>
> I don't seem to have fasttree installed at all, and from the
> test and wrapper it is not explicit about which version it
> was originally written for.
>

I pushed a patch to not use the potentially problematic '-out' flag:
https://github.com/biopython/biopython/commit/771c1ed23bbb39dcf37805b4cb7bb23ffcb0c46a

According to FastTree's changelog (
http://www.microbesonline.org/fasttree/ChangeLog), the -out option was
added in version 2.1.5, released August 30, 2012. So the 'fasttree' package
on the stable Ubuntu (12.04) does not have the -out flag, but the package
in subsequent Ubuntus and other Debian derivatives does.
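
For anyone who wants to check a package version string against that
2.1.5 cut-off by hand, something like the following would do. It is only
a sketch: the helper name is hypothetical and it is not part of the
Biopython wrapper or the patch.

```python
def supports_out_flag(version):
    """Return True if a FastTree version string is 2.1.5 or newer."""
    # Drop any packaging suffix, e.g. "2.1.4-1~ubuntu12.04.1" -> "2.1.4".
    numeric = version.split("-")[0]
    parts = tuple(int(piece) for piece in numeric.split("."))
    # Tuples compare element by element, so (2, 1, 4) < (2, 1, 5).
    return parts >= (2, 1, 5)
```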

-Eric

From eric.talevich at gmail.com  Thu Aug 15 19:44:38 2013
From: eric.talevich at gmail.com (Eric Talevich)
Date: Thu, 15 Aug 2013 16:44:38 -0700
Subject: [Biopython-dev] 1.62b test coverage report
In-Reply-To: <CAKVJ-_6Rp5RC4pM1wMJ4k2qyZKhOCqWmd8e-x+ak7dQqHOyXqw@mail.gmail.com>
References: <CA+ijMs=yT-=CFr+qwkOZ107oBN0wEFdjC9uMFCh+j1YfDD4DZw@mail.gmail.com>
	<CAKVJ-_5+mSW5NkmxN94w-8qu+e=q4COyZaAx6UzCmrwuJdU9aQ@mail.gmail.com>
	<CAMC681mkn3FBBdyEMTkba1T60KoQmcR1NHLKHNLc1pUo9A5rWw@mail.gmail.com>
	<CA+ijMs=XOe5Q6cE5vCg_OdnjcTGA=ZjuCJVCjcWY+reW1=jnnQ@mail.gmail.com>
	<CAKVJ-_6Rp5RC4pM1wMJ4k2qyZKhOCqWmd8e-x+ak7dQqHOyXqw@mail.gmail.com>
Message-ID: <CAMC681m5_VXZr1psoqTozuwN=5XcENkDDxP48DVKs-fPydfrKw@mail.gmail.com>

On Fri, Aug 2, 2013 at 2:31 AM, Peter Cock <p.j.a.cock at googlemail.com> wrote:

> Thanks for these details Ben - it sounds like a mixture of real
> test failures, and mere warnings that an external tool wasn't
> found.
>
> On Fri, Aug 2, 2013 at 3:20 AM, Ben Fulton <ben at benfulton.net> wrote:
> > My test machine was running Ubuntu 12.04.
>
[...]
> > I downloaded version 130708 of Prank from
> > http://code.google.com/p/prank-msa/downloads/list. The error is on line
> 165
> > of the test file:
> >
> > AssertionError:
> > -----------------
> >  PRANK v.130708:
> > -----------------
> >
> > Input for the analysis
> >  - converting 'Quality/example.fasta' to 'temp with space.phy'
>
> This sounds like a minor change in the stdout with recent
> versions of PRANK.
>
>
It's more exciting than that: Old versions of Prank created .xml and .dnd
files by default, and had "-noxml" and "-notree" options to avoid creating
them (or clean them up, whichever). New Pranks do not create these files by
default, but do have "-showxml" and "-showtree" flags if you want them.

I removed the use of these flags in the unit test. One of the tests used
the set_parameter method, so I substituted the "-dots" flag for "-notree".
It passes on my machine now:
https://github.com/biopython/biopython/commit/30d7bcfb6eab8283a53372b2ad64b59be7461eb3

The doctests in Bio/Align/Applications/_Prank.py should probably change,
too, since the same flags are used there. (I have not done this.)
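
To spell out the difference between the two generations of flags, here
is a rough sketch. The function and its `new_style` argument are
hypothetical (I am not sure at which Prank version the behaviour
changed), and this is not the wrapper's code.

```python
def prank_output_flags(new_style, want_xml=False, want_tree=False):
    """Return the Prank flags controlling the .xml and .dnd output."""
    flags = []
    if new_style:
        # New Pranks: nothing extra is written unless asked for.
        if want_xml:
            flags.append("-showxml")
        if want_tree:
            flags.append("-showtree")
    else:
        # Old Pranks: both files are written unless switched off.
        if not want_xml:
            flags.append("-noxml")
        if not want_tree:
            flags.append("-notree")
    return flags
```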

-Eric

From w.arindrarto at gmail.com  Fri Aug 16 03:14:24 2013
From: w.arindrarto at gmail.com (Wibowo Arindrarto)
Date: Fri, 16 Aug 2013 09:14:24 +0200
Subject: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week?
In-Reply-To: <1373712847.72527.YahooMailNeo@web164004.mail.gq1.yahoo.com>
References: <CAKVJ-_6Osh2DyicHg7GoLu=Q1UssjYJzgjFV3K2A9TBJSjxYyg@mail.gmail.com>
	<1373620014.53407.YahooMailNeo@web164001.mail.gq1.yahoo.com>
	<CAKVJ-_7wmyZLj5iDd2ThhBsUHMw3Yy75jr1xR7vN1ObvaRLxMA@mail.gmail.com>
	<1373680350.55044.YahooMailNeo@web164003.mail.gq1.yahoo.com>
	<CADEGkF4FfoszNWaPEa=S4KwVFwV9vVOGGb6Wz+yBAVnmnD3rzw@mail.gmail.com>
	<1373712847.72527.YahooMailNeo@web164004.mail.gq1.yahoo.com>
Message-ID: <CADEGkF5nk1qmpu4vdG1dueYYS5xNxufaX+RQkf8am0pjDbmmfA@mail.gmail.com>

Hi Michiel, Peter,

In preparation for the 1.62 release, I've made the following changes
to Bio.NCBIStandalone and Bio.ParserSupport:

* Migrated the two modules under Bio.SearchIO._legacy
* Upgraded their PendingDeprecationWarning to BiopythonDeprecationWarning

I've pushed the changes to this branch:
https://github.com/bow/biopython/tree/bio_blast_migrate

Tests still seem to be running fine, but now there is the awkward
situation where if users import Bio.NCBIStandalone and/or
Bio.ParserSupport directly they will be greeted with two warnings: the
BiopythonDeprecationWarning for the modules' deprecation and the
BiopythonExperimentalWarning for SearchIO.

We could suppress the SearchIO warning in Bio.NCBIStandalone and
Bio.ParserSupport. But before this is done, I was wondering if we have
a defined timeline for removing a BiopythonExperimentalWarning? (i.e.
if it will be removed in this release, then we could do that instead).
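
For reference, suppressing just one warning category could look roughly
like this. It is a sketch using only the standard library: the warning
class and the function standing in for the import are hypothetical
placeholders, not the real Biopython names.

```python
import warnings


class ExperimentalWarning(Warning):
    """Stand-in for the experimental warning category (hypothetical)."""


def import_experimental():
    """Stand-in for importing a module that warns at import time."""
    warnings.warn("this code is experimental", ExperimentalWarning)
    return "parser"


def quiet_import():
    # Silence only the experimental category; any other warning,
    # including a deprecation warning, still reaches the user.
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", ExperimentalWarning)
        return import_experimental()
```

Because the filter is installed inside `catch_warnings()`, the user's
own warning settings are restored as soon as the block exits.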

Any opinions on this :)?

Cheers,
Bow

On Sat, Jul 13, 2013 at 12:54 PM, Michiel de Hoon <mjldehoon at yahoo.com> wrote:
> Hi Bow,
>
>
>> Would it be ok if we move parts that are used by SearchIO into their own
>> private classes in
>> Bio.SearchIO, while putting the BiopythonDeprecationWarning on the current
>> files?
>
> That sounds fine to me. Any other opinions, anybody?
>
> Best,
> -Michiel.
>
> ________________________________
> From: Wibowo Arindrarto <w.arindrarto at gmail.com>
> To: Michiel de Hoon <mjldehoon at yahoo.com>
> Cc: Peter Cock <p.j.a.cock at googlemail.com>; Eric Talevich
> <eric.talevich at gmail.com>; Zheng Ruan <zruan1991 at gmail.com>; Biopython-Dev
> Mailing List <biopython-dev at biopython.org>
> Sent: Saturday, July 13, 2013 3:58 PM
> Subject: Re: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week?
>
> Hi Michiel,
>
> There are two classes from Bio.Blast.NCBIStandalone still being used
> by Bio.SearchIO internally (for the BLAST text parser): the
> BlastParser and the Iterator classes. The BlastParser class itself
> still relies on Bio.ParserSupport. Would it be ok if we move parts
> that are used by SearchIO into their own private classes in
> Bio.SearchIO, while putting the BiopythonDeprecationWarning on the
> current files?
>
> Best regards,
> Bow
>
> On Sat, Jul 13, 2013 at 3:52 AM, Michiel de Hoon <mjldehoon at yahoo.com>
> wrote:
>> The following pieces of code had a PendingDeprecationWarning in Biopython
>> release 1.61, and can be upgraded to a BiopythonDeprecationWarning:
>>
>> Bio.Blast.NCBIStandalone (entire module). This module has had a
>> PendingDeprecationWarning since September 2010.
>>
>> Bio.Motif (entire module). Its functionality is available from Bio.motifs,
>> so Bio.Motif can be deprecated.
>>
>> Bio.ParserSupport (entire module). This module is currently only being
>> used by Bio.Blast.NCBIStandalone, and has had a PendingDeprecationWarning
>> since September 2011.
>>
>> Any final objections?
>>
>> Best,
>> -Michiel
>> _______________________________________________
>> Biopython-dev mailing list
>> Biopython-dev at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/biopython-dev
>
>

From p.j.a.cock at googlemail.com  Fri Aug 16 05:31:13 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 16 Aug 2013 10:31:13 +0100
Subject: [Biopython-dev] 1.62b test coverage report
In-Reply-To: <CAMC681m5_VXZr1psoqTozuwN=5XcENkDDxP48DVKs-fPydfrKw@mail.gmail.com>
References: <CA+ijMs=yT-=CFr+qwkOZ107oBN0wEFdjC9uMFCh+j1YfDD4DZw@mail.gmail.com>
	<CAKVJ-_5+mSW5NkmxN94w-8qu+e=q4COyZaAx6UzCmrwuJdU9aQ@mail.gmail.com>
	<CAMC681mkn3FBBdyEMTkba1T60KoQmcR1NHLKHNLc1pUo9A5rWw@mail.gmail.com>
	<CA+ijMs=XOe5Q6cE5vCg_OdnjcTGA=ZjuCJVCjcWY+reW1=jnnQ@mail.gmail.com>
	<CAKVJ-_6Rp5RC4pM1wMJ4k2qyZKhOCqWmd8e-x+ak7dQqHOyXqw@mail.gmail.com>
	<CAMC681m5_VXZr1psoqTozuwN=5XcENkDDxP48DVKs-fPydfrKw@mail.gmail.com>
Message-ID: <CAKVJ-_6k5NjNhOMhSUbxYo6-fN1zitwEsr0Xm1kBd57PTsAVuA@mail.gmail.com>

On Fri, Aug 16, 2013 at 12:44 AM, Eric Talevich wrote:
> On Fri, Aug 2, 2013 at 2:31 AM, Peter Cock wrote:
>> On Fri, Aug 2, 2013 at 3:20 AM, Ben Fulton wrote:
>> > I downloaded version 130708 of Prank from
>> > http://code.google.com/p/prank-msa/downloads/list.
>> >  The error is on line 165 of the test file:
>> >
>> > AssertionError:
>> > -----------------
>> >  PRANK v.130708:
>> > -----------------
>> >
>> > Input for the analysis
>> >  - converting 'Quality/example.fasta' to 'temp with space.phy'
>>
>> This sounds like a minor change in the stdout with recent
>> versions of PRANK.
>>
>
> It's more exciting than that: Old versions of Prank created .xml and .dnd
> files by default, and had "-noxml" and "-notree" options to avoid creating
> them (or clean them up, whichever). New Pranks do not create these files by
> default, but do have "-showxml" and "-showtree" flags if you want them.

Well that API break is a bit annoying, but your test changes make sense.

Do we need to add these new switches to the wrapper itself?

Peter

From eric.talevich at gmail.com  Sun Aug 18 14:14:13 2013
From: eric.talevich at gmail.com (Eric Talevich)
Date: Sun, 18 Aug 2013 11:14:13 -0700
Subject: [Biopython-dev] 1.62b test coverage report
In-Reply-To: <CAKVJ-_6k5NjNhOMhSUbxYo6-fN1zitwEsr0Xm1kBd57PTsAVuA@mail.gmail.com>
References: <CA+ijMs=yT-=CFr+qwkOZ107oBN0wEFdjC9uMFCh+j1YfDD4DZw@mail.gmail.com>
	<CAKVJ-_5+mSW5NkmxN94w-8qu+e=q4COyZaAx6UzCmrwuJdU9aQ@mail.gmail.com>
	<CAMC681mkn3FBBdyEMTkba1T60KoQmcR1NHLKHNLc1pUo9A5rWw@mail.gmail.com>
	<CA+ijMs=XOe5Q6cE5vCg_OdnjcTGA=ZjuCJVCjcWY+reW1=jnnQ@mail.gmail.com>
	<CAKVJ-_6Rp5RC4pM1wMJ4k2qyZKhOCqWmd8e-x+ak7dQqHOyXqw@mail.gmail.com>
	<CAMC681m5_VXZr1psoqTozuwN=5XcENkDDxP48DVKs-fPydfrKw@mail.gmail.com>
	<CAKVJ-_6k5NjNhOMhSUbxYo6-fN1zitwEsr0Xm1kBd57PTsAVuA@mail.gmail.com>
Message-ID: <CAMC681=tq26Y+M4zE=F++nHHc+jsC1mwZ-pW37Gp3B2BHa-SPA@mail.gmail.com>

On Fri, Aug 16, 2013 at 2:31 AM, Peter Cock <p.j.a.cock at googlemail.com> wrote:

> On Fri, Aug 16, 2013 at 12:44 AM, Eric Talevich wrote:
> > On Fri, Aug 2, 2013 at 2:31 AM, Peter Cock wrote:
> >> On Fri, Aug 2, 2013 at 3:20 AM, Ben Fulton wrote:
> >> > I downloaded version 130708 of Prank from
> >> > http://code.google.com/p/prank-msa/downloads/list.
> >> >  The error is on line 165 of the test file:
> >> >
> >> > AssertionError:
> >> > -----------------
> >> >  PRANK v.130708:
> >> > -----------------
> >> >
> >> > Input for the analysis
> >> >  - converting 'Quality/example.fasta' to 'temp with space.phy'
> >>
> >> This sounds like a minor change in the stdout with recent
> >> versions of PRANK.
> >>
> >
> > It's more exciting than that: Old versions of Prank created .xml and .dnd
> > files by default, and had "-noxml" and "-notree" options to avoid
> creating
> > them (or clean them up, whichever). New Pranks do not create these files
> by
> > default, but do have "-showxml" and "-showtree" flags if you want them.
>
> Well that API break is a bit annoying, but your test changes make sense.
>
> Do we need to add these new switches to the wrapper itself?
>

Here's the commit to add those switches to the wrapper:
https://github.com/biopython/biopython/commit/cc234b75e6e82cf9f51e3384a4fbfa1e888a3af1

I suppose it would be helpful if the wrapper detected the version of Prank
and handled the show(tree|xml) flags appropriately to avoid errors. But
that would require running the executable first, I think, which is not
something our wrappers normally do. (And then it would make sense to cache
the result for the duration of the running process.)
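
Roughly what I have in mind for the caching part, as a sketch only. The
names are hypothetical, and the flag that makes a given tool print its
version is left to the caller because I have not checked it.

```python
import subprocess


def run_tool(executable, flag):
    """Run `executable flag` and return whatever it prints."""
    proc = subprocess.run(
        [executable, flag],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return proc.stdout


def make_cached_lookup(run):
    """Wrap `run` so each distinct argument tuple is executed only once."""
    cache = {}

    def lookup(*args):
        if args not in cache:
            cache[args] = run(*args)
        return cache[args]

    return lookup
```

With `lookup = make_cached_lookup(run_tool)`, repeated calls with the
same executable and flag would start the external program once per
Python process.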

-Eric

From p.j.a.cock at googlemail.com  Sun Aug 18 14:39:08 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Sun, 18 Aug 2013 19:39:08 +0100
Subject: [Biopython-dev] 1.62b test coverage report
In-Reply-To: <CAMC681=tq26Y+M4zE=F++nHHc+jsC1mwZ-pW37Gp3B2BHa-SPA@mail.gmail.com>
References: <CA+ijMs=yT-=CFr+qwkOZ107oBN0wEFdjC9uMFCh+j1YfDD4DZw@mail.gmail.com>
	<CAKVJ-_5+mSW5NkmxN94w-8qu+e=q4COyZaAx6UzCmrwuJdU9aQ@mail.gmail.com>
	<CAMC681mkn3FBBdyEMTkba1T60KoQmcR1NHLKHNLc1pUo9A5rWw@mail.gmail.com>
	<CA+ijMs=XOe5Q6cE5vCg_OdnjcTGA=ZjuCJVCjcWY+reW1=jnnQ@mail.gmail.com>
	<CAKVJ-_6Rp5RC4pM1wMJ4k2qyZKhOCqWmd8e-x+ak7dQqHOyXqw@mail.gmail.com>
	<CAMC681m5_VXZr1psoqTozuwN=5XcENkDDxP48DVKs-fPydfrKw@mail.gmail.com>
	<CAKVJ-_6k5NjNhOMhSUbxYo6-fN1zitwEsr0Xm1kBd57PTsAVuA@mail.gmail.com>
	<CAMC681=tq26Y+M4zE=F++nHHc+jsC1mwZ-pW37Gp3B2BHa-SPA@mail.gmail.com>
Message-ID: <CAKVJ-_6eMHt-qbC7G7HZ9oafYW4i8+xDiq24AUKapKjtFnPq-A@mail.gmail.com>

On Sun, Aug 18, 2013 at 7:14 PM, Eric Talevich <eric.talevich at gmail.com> wrote:
> On Fri, Aug 16, 2013 at 2:31 AM, Peter Cock wrote:
>>
>> Well that API break is a bit annoying, but your test changes make sense.
>>
>> Do we need to add these new switches to the wrapper itself?
>
>
> Here's the commit to add those switches to the wrapper:
> https://github.com/biopython/biopython/commit/cc234b75e6e82cf9f51e3384a4fbfa1e888a3af1
>
> I suppose it would be helpful if the wrapper detected the version of Prank
> and handled the show(tree|xml) flags appropriately to avoid errors. But that
> would require running the executable first, I think, which is not something
> our wrappers normally do. (And then it would make sense to cache the result
> for the duration of the running process.)
>
> -Eric

Historically we've just documented this kind of issue in the
parameter docstring - the idea of auto-running the tool in
the background to check the version just sounds like Trouble.

Peter

From yeyanbo289 at gmail.com  Mon Aug 19 03:36:00 2013
From: yeyanbo289 at gmail.com (Yanbo Ye)
Date: Mon, 19 Aug 2013 15:36:00 +0800
Subject: [Biopython-dev] GSOC weekly update 10
Message-ID: <CADoMHjwSFvZnRcY7i_GTwHtj84rrchJVo96uPAQai0Tej600nw@mail.gmail.com>

Hi all,

Biopython.Phylo project update of last week is here:
http://blog.yeyanbo.com/posts/google-summer-of-code-10.html

Thanks,
Yanbo

-- 

*Yanbo Ye*
*Guangzhou Institutes of Biomedicine and Health, *
*Chinese Academy of Sciences*
*190 Kaiyuan Avenue, Science Park, Guangzhou, China*
*Email: ye_yanbo at gibh.ac.cn*
*Web: http://www.yeyanbo.com*
*Phone: (86)-020-32093810*

From zruan1991 at gmail.com  Mon Aug 19 11:06:05 2013
From: zruan1991 at gmail.com (Zheng Ruan)
Date: Mon, 19 Aug 2013 11:06:05 -0400
Subject: [Biopython-dev] Codon Alignment GSoC Weekly Update
Message-ID: <CABM7aFohv-E-MzWGbVNrnpPS4BHWoui_9MaNZuEYH8YTFwLqfA@mail.gmail.com>

Hi all,

An update of CodonAlignment GSoC can be found at (http://zruanweb.com/).
Thanks for your comments and suggestions.

Best,
Zheng Ruan

From michael.maher at ucsf.edu  Mon Aug 19 15:24:04 2013
From: michael.maher at ucsf.edu (Cyrus Maher)
Date: Mon, 19 Aug 2013 12:24:04 -0700
Subject: [Biopython-dev] Fwd: New Biopython (sub)module?
In-Reply-To: <CAME4z04Se6sQ73wgWqqWD7+_ZUxArHGuqzZFAGMx2SqhoGoYJw@mail.gmail.com>
References: <CAME4z04Se6sQ73wgWqqWD7+_ZUxArHGuqzZFAGMx2SqhoGoYJw@mail.gmail.com>
Message-ID: <CAME4z05YS0c-A0m-2-_y+Et1GJcnZ+_Nm3vigbWPSKGgC=ju4w@mail.gmail.com>

Hi everybody!!-

My name is (Michael) Cyrus Maher, and I'm a PhD student at UCSF in the lab
of Dr. Ryan D. Hernandez (http://bts.ucsf.edu/hernandez_lab/)...

I am writing because I'm interested in submitting a new Biopython module.
Since this is likely a one-time event, the wiki recommends proceeding
through a developer. After speaking with Peter Cock, he recommended that I
open things up for discussion on the mailing list.

Attached is a draft that describes a new method, termed MOSAIC, which
integrates multiple sequence alignments from an arbitrary number of
sources. We show that it greatly increases the number of orthologs that we
are able to detect while maintaining or improving functional-,
phylogenetic-, and sequence identity-based measures of ortholog quality.

Code and documentation may be found here:

https://dl.dropboxusercontent.com/u/43327584/html/index.html

Looking forward to hearing what you think!

Best,

-Cyrus
-------------- next part --------------
A non-text attachment was scrubbed...
Name: OD_fullpaper_8_5_13.docx
Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document
Size: 1666812 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/biopython-dev/attachments/20130819/85ee506a/attachment-0001.bin>

From davidjosephcain at gmail.com  Mon Aug 19 17:18:48 2013
From: davidjosephcain at gmail.com (David Cain)
Date: Mon, 19 Aug 2013 17:18:48 -0400
Subject: [Biopython-dev] Fwd: New Biopython (sub)module?
In-Reply-To: <CAME4z05YS0c-A0m-2-_y+Et1GJcnZ+_Nm3vigbWPSKGgC=ju4w@mail.gmail.com>
References: <CAME4z04Se6sQ73wgWqqWD7+_ZUxArHGuqzZFAGMx2SqhoGoYJw@mail.gmail.com>
	<CAME4z05YS0c-A0m-2-_y+Et1GJcnZ+_Nm3vigbWPSKGgC=ju4w@mail.gmail.com>
Message-ID: <CAPyP4u+jjFx6pwNC8406BR9QvUq7OHov-jKGdoORmT_=TcdVPA@mail.gmail.com>

Hi, Cyrus! Before the constructive criticism, I just wanted to say your
module looks excellent and thank you for opening it up as free software!

I'm by no means a developer (just interested in Biopython's development),
but I noticed your code generally doesn't adhere to
PEP8 <http://www.python.org/dev/peps/pep-0008/>.
If you're interested in getting feedback from others, it's quite valuable
to format your code by the standards. (Proper PEP 8 code has a look and
feel that's easier for the trained eye to view).

Key things that detract from your module's readability:
- CamelCase method, module, and field names (when a Python developer sees
these, they're prone to assuming the name is for a class). Of course,
Biopython doesn't provide the best example here, but there are reasons for
that <http://www.biopython.org/pipermail/biopython-dev/2012-September/009938.html>
(it'll be fixed eventually). All-caps names are either avoided or reserved
for constants (i.e. you may wish to rename your module `mosaic`).
- Very long line wrapping - you should really try to keep your lines to 79
characters
- Using integers as booleans (you should stick to True/False, e.g. `while
True` in lieu of `while 1`)
- module renamings: it's much easier to see `random.shuffle` over
`r.shuffle`, as one can assume `random` is the standard module, whereas `r`
might be completely different.
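
To make the points above concrete, here is a small made-up example
written the PEP 8 way. None of it is taken from your module; the
function names are purely illustrative.

```python
import random  # full module name, rather than `import random as r`


def shuffled_copy(items, seed=None):
    """Return a shuffled copy of items, leaving the input untouched."""
    rng = random.Random(seed)
    result = list(items)
    rng.shuffle(result)
    return result


def first_positive(values):
    """Return the first value greater than zero, or None if there is none."""
    iterator = iter(values)
    while True:  # a real boolean, rather than `while 1`
        try:
            value = next(iterator)
        except StopIteration:
            return None
        if value > 0:
            return value
```

Function names are lowercase with underscores, and every line stays
under 79 characters.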

Also, your module should definitely remove usage of pdb if you wish to
publish it as part of an official Python package.

Would you be open to hosting a development branch of your code on GitHub or
a similar community-editable resource? Any acceptance to the official
Biopython distribution would of course be up to the main devs, but I'd be
more than happy to test your code and make suggestions, regardless of its
integration to a third-party package.

David

From christian at brueffer.de  Tue Aug 20 07:36:09 2013
From: christian at brueffer.de (Christian Brueffer)
Date: Tue, 20 Aug 2013 13:36:09 +0200
Subject: [Biopython-dev] Fwd: New Biopython (sub)module?
In-Reply-To: <CAME4z05YS0c-A0m-2-_y+Et1GJcnZ+_Nm3vigbWPSKGgC=ju4w@mail.gmail.com>
References: <CAME4z04Se6sQ73wgWqqWD7+_ZUxArHGuqzZFAGMx2SqhoGoYJw@mail.gmail.com>
	<CAME4z05YS0c-A0m-2-_y+Et1GJcnZ+_Nm3vigbWPSKGgC=ju4w@mail.gmail.com>
Message-ID: <521354A9.6020701@brueffer.de>

On 8/19/13 21:24 , Cyrus Maher wrote:
> Hi everybody!!-
> 
> My name is (Michael) Cyrus Maher, and I'm a PhD student at UCSF in the lab
> of Dr. Ryan D. Hernandez (http://bts.ucsf.edu/hernandez_lab/)...
> 
> I am writing because I'm interested in submitting a new Biopython module.
> Since this is likely a one-time event, the wiki recommends proceeding
> through a developer. After speaking with Peter Cock, he recommended that I
> open things up for discussion on the mailing list.
> 
> Attached is a draft that describes a new method, termed MOSAIC, which
> integrates multiple sequence alignments from an arbitrary number of
> sources. We show that it greatly increases the number of orthologs that we
> are able to detect while maintaining or improving functional-,
> phylogenetic-, and sequence identity-based measures of ortholog quality.
> 
> Code and documentation may be found here:
> 
> https://dl.dropboxusercontent.com/u/43327584/html/index.html
> 
> Looking forward to hearing what you think!
> 

Hi Cyrus,

I agree with David on the PEP8 issue.  A very nice tool to use is the
pep8 checker, https://pypi.python.org/pypi/pep8

I see that you use MSAProbs.  I have an MSAProbs application wrapper in
the works.  I haven't submitted it yet due to incomplete unit tests,
but maybe it's useful to you:

https://github.com/cbrueffer/biopython/tree/msaprobs

Cheers,

Chris


From michael.maher at ucsf.edu  Tue Aug 20 14:24:43 2013
From: michael.maher at ucsf.edu (Cyrus Maher)
Date: Tue, 20 Aug 2013 11:24:43 -0700
Subject: [Biopython-dev] Fwd: New Biopython (sub)module?
In-Reply-To: <521354A9.6020701@brueffer.de>
References: <CAME4z04Se6sQ73wgWqqWD7+_ZUxArHGuqzZFAGMx2SqhoGoYJw@mail.gmail.com>
	<CAME4z05YS0c-A0m-2-_y+Et1GJcnZ+_Nm3vigbWPSKGgC=ju4w@mail.gmail.com>
	<521354A9.6020701@brueffer.de>
Message-ID: <CAME4z058aywiZJmAVnXqDosfSsgOmgWPJaoDURL11s_eHR3saA@mail.gmail.com>

Thanks for your feedback, guys!! I did a bit of general clean-up and I've
made all the recommended PEP8 changes, with the exception that I kept
capital letters if they were part of an acronym. I've also switched the
link in the documentation over to github and configured mosaic to use the
MSAProbs application wrapper if it's installed. Let me know what you think!!

Docs: https://dl.dropboxusercontent.com/u/43327584/html/index.html
Code: https://github.com/cyrusmaher/mosaic

Cheers,

-Cyrus


On Tue, Aug 20, 2013 at 4:36 AM, Christian Brueffer
<christian at brueffer.de>wrote:

> On 8/19/13 21:24 , Cyrus Maher wrote:
> > Hi everybody!!-
> >
> > My name is (Michael) Cyrus Maher, and I'm a PhD student at UCSF in the
> lab
> > of Dr. Ryan D. Hernandez (http://bts.ucsf.edu/hernandez_lab/)...
> >
> > I am writing because I'm interested in submitting a new Biopython module.
> > Since this is likely a one-time event, the wiki recommends proceeding
> > through a developer. After speaking with Peter Cock, he recommended that
> I
> > open things up for discussion on the mailing list.
> >
> > Attached is a draft that describes a new method, termed MOSAIC, which
> > integrates multiple sequence alignments from an arbitrary number
> of
> > sources. We show that it greatly increases the number of orthologs that
> we
> > are able to detect while maintaining or improving functional-,
> > phylogenetic-, and sequence identity-based measures of ortholog quality.
> >
> > Code and documentation may be found here:
> >
> > https://dl.dropboxusercontent.com/u/43327584/html/index.html
> >
> > Looking forward to hearing what you think!
> >
>
> Hi Cyrus,
>
> I agree with David on the PEP8 issue.  A very nice tool to use is the
> pep8 checker, https://pypi.python.org/pypi/pep8
>
> I see that you use MSAProbs.  I have an MSAProbs application wrapper in
> the works.  I haven't submitted it yet due to incomplete unit tests,
> but maybe it's useful to you:
>
> https://github.com/cbrueffer/biopython/tree/msaprobs
>
> Cheers,
>
> Chris
>

From mok at bioxray.dk  Tue Aug 20 14:35:14 2013
From: mok at bioxray.dk (Morten Kjeldgaard)
Date: Tue, 20 Aug 2013 20:35:14 +0200
Subject: [Biopython-dev] Bio.PDB - Missing values (was Moratorium on
	commits?)
In-Reply-To: <CAHQkFdenX9S8d6de6RYBW1wAHpxXi4aP7qTtzNm3Hy6H-nkn5Q@mail.gmail.com>
References: <CAJ9sUYOtO8xd2mVSnUNhbLMnhWXtd4z67vEuADwK8ChmHOLLLw@mail.gmail.com>
	<CAKVJ-_47xSbDKwv85i5TtVVNfmaSHX0Vt3xq8JSyQrx2kKOAMA@mail.gmail.com>
	<CAJ9sUYOnjg5SqxikOL9_fYDSi2P46J8SKs5tQwKA8Mcf5PtxKg@mail.gmail.com>
	<CAHQkFdf0kHMkMaTipz7Odr_1=6qwPNyv2DHYwO3X4=fDnmcwKg@mail.gmail.com>
	<CAKVJ-_4VkgXev6ni5n_oVLwJvyUGMuUttF-oKek3-6xLFQDvXw@mail.gmail.com>
	<CAHQkFdenX9S8d6de6RYBW1wAHpxXi4aP7qTtzNm3Hy6H-nkn5Q@mail.gmail.com>
Message-ID: <43FD0A6C-ED54-4861-AADA-9F3E8FB6172A@bioxray.dk>


On 15/08/2013, at 16:54, Lenna Peterson <arklenna at gmail.com> wrote:

>>> I don't think writing string "None" into a fixed width field would be a
>> good
>>> idea. So it's probably best to change occupancy (and any other missing
>>> values set to None) to blank, correct width fields for writing.
>> 
>> I didn't mean to suggest writing the string "None" in the field, and
>> I'm not sure if João did - it would certainly be an invalid PDB file.
>> 
>> 
> I didn't mean anyone was suggesting we intentionally do this, but I bet
> that's what the writer is doing now!

I think the output should be identical to the input if a PDB file is read and then written again (apart from the fact that Bio.PDB currently doesn't save all headers).

Cheers,
Morten


From davidjosephcain at gmail.com  Tue Aug 20 17:25:07 2013
From: davidjosephcain at gmail.com (David Cain)
Date: Tue, 20 Aug 2013 17:25:07 -0400
Subject: [Biopython-dev] Fwd: New Biopython (sub)module?
In-Reply-To: <CAME4z058aywiZJmAVnXqDosfSsgOmgWPJaoDURL11s_eHR3saA@mail.gmail.com>
References: <CAME4z04Se6sQ73wgWqqWD7+_ZUxArHGuqzZFAGMx2SqhoGoYJw@mail.gmail.com>
	<CAME4z05YS0c-A0m-2-_y+Et1GJcnZ+_Nm3vigbWPSKGgC=ju4w@mail.gmail.com>
	<521354A9.6020701@brueffer.de>
	<CAME4z058aywiZJmAVnXqDosfSsgOmgWPJaoDURL11s_eHR3saA@mail.gmail.com>
Message-ID: <CAPyP4uLWAR4MVkeZ9ZafsW+zTgp_CYdZ-ZQYWmkCQnEeB_=tTg@mail.gmail.com>

Hi, Cyrus - I took a quick look at your code on GitHub. Did you publish a
different version of MOSAIC? By my linter, there are 309 PEP8 errors on
mosaic.py.

Also, as a general comment, your code seems to rely on sys.exit
extensively. Python's exception framework is pretty handy - maybe your
module could raise its own custom exceptions (Biopython's PDB parser is a
good example of this design strategy).
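
A bare-bones sketch of what I mean, with hypothetical class and function
names rather than anything from your code:

```python
class MosaicError(Exception):
    """Base class for errors raised by the module."""


class MissingAlignmentError(MosaicError):
    """Raised when no input alignments are supplied."""


def load_alignments(paths):
    """Return the list of alignment paths, or raise if it is empty."""
    if not paths:
        # Raise instead of calling sys.exit(), so the caller decides
        # whether to stop, retry, or report the problem.
        raise MissingAlignmentError("no alignment files were given")
    return list(paths)
```

Someone importing the module can then catch `MosaicError` and carry on,
which is not practical when the library calls sys.exit itself.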


David Cain
+1 (339) 222 4452


On Tue, Aug 20, 2013 at 2:24 PM, Cyrus Maher <michael.maher at ucsf.edu> wrote:

> Thanks for your feedback, guys!! I did a bit of general clean-up and I've
> made all the recommended PEP8 changes, with the exception that I kept
> capital letters if they were part of an acronym. I've also switched the
> link in the documentation over to github and configured mosaic to use the
> MSAProbs application wrapper if it's installed. Let me know what you
> think!!
>
> Docs: https://dl.dropboxusercontent.com/u/43327584/html/index.html
> Code: https://github.com/cyrusmaher/mosaic
>
> Cheers,
>
> -Cyrus
>
>
> On Tue, Aug 20, 2013 at 4:36 AM, Christian Brueffer
> <christian at brueffer.de>wrote:
>
> > On 8/19/13 21:24 , Cyrus Maher wrote:
> > > Hi everybody!!-
> > >
> > > My name is (Michael) Cyrus Maher, and I'm a PhD student at UCSF in the
> > lab
> > > of Dr. Ryan D. Hernandez (http://bts.ucsf.edu/hernandez_lab/)...
> > >
> > > I am writing because I'm interested in submitting a new Biopython
> module.
> > > Since this is likely a one-time event, the wiki recommends proceeding
> > > through a developer. After speaking with Peter Cock, he recommended
> that
> > I
> > > open things up for discussion on the mailing list.
> > >
> > > Attached is a draft that describes a new method, termed MOSAIC, which
> > > integrates multiple sequence alignments from an arbitrary number
> > of
> > > sources. We show that it greatly increases the number of orthologs that
> > we
> > > are able to detect while maintaining or improving functional-,
> > > phylogenetic-, and sequence identity-based measures of ortholog
> quality.
> > >
> > > Code and documentation may be found here:
> > >
> > > https://dl.dropboxusercontent.com/u/43327584/html/index.html
> > >
> > > Looking forward to hearing what you think!
> > >
> >
> > Hi Cyrus,
> >
> > I agree with David on the PEP8 issue.  A very nice tool to use is the
> > pep8 checker, https://pypi.python.org/pypi/pep8
> >
> > I see that you use MSAProbs.  I have an MSAProbs application wrapper in
> > the works.  I haven't submitted it yet due to incomplete unit tests,
> > but maybe it's useful to you:
> >
> > https://github.com/cbrueffer/biopython/tree/msaprobs
> >
> > Cheers,
> >
> > Chris
> >

From arklenna at gmail.com  Tue Aug 20 17:31:40 2013
From: arklenna at gmail.com (Lenna Peterson)
Date: Tue, 20 Aug 2013 17:31:40 -0400
Subject: [Biopython-dev] Fwd: New Biopython (sub)module?
In-Reply-To: <521354A9.6020701@brueffer.de>
References: <CAME4z04Se6sQ73wgWqqWD7+_ZUxArHGuqzZFAGMx2SqhoGoYJw@mail.gmail.com>
	<CAME4z05YS0c-A0m-2-_y+Et1GJcnZ+_Nm3vigbWPSKGgC=ju4w@mail.gmail.com>
	<521354A9.6020701@brueffer.de>
Message-ID: <CAHQkFdcG4dgm3JgwW4F3SYKsaKOWdKwS3Thwa4KxNsgxsDxwfQ@mail.gmail.com>

Also worth noting is autopep8: https://pypi.python.org/pypi/autopep8
(it can be a bit aggressive but that's what version control is for, right?)

Cheers,

Lenna


On Tue, Aug 20, 2013 at 7:36 AM, Christian Brueffer
<christian at brueffer.de>wrote:

> On 8/19/13 21:24 , Cyrus Maher wrote:
> > Hi everybody!!-
> >
> > My name is (Michael) Cyrus Maher, and I'm a PhD student at UCSF in the
> lab
> > of Dr. Ryan D. Hernandez (http://bts.ucsf.edu/hernandez_lab/)...
> >
> > I am writing because I'm interested in submitting a new Biopython module.
> > Since this is likely a one-time event, the wiki recommends proceeding
> > through a developer. After speaking with Peter Cock, he recommended that
> I
> > open things up for discussion on the mailing list.
> >
> > Attached is a draft that describes a new method, termed MOSAIC, which
> > integrates multiple sequence alignments from an arbitrary number
> of
> > sources. We show that it greatly increases the number of orthologs that
> we
> > are able to detect while maintaining or improving functional-,
> > phylogenetic-, and sequence identity-based measures of ortholog quality.
> >
> > Code and documentation may be found here:
> >
> > https://dl.dropboxusercontent.com/u/43327584/html/index.html
> >
> > Looking forward to hearing what you think!
> >
>
> Hi Cyrus,
>
> I agree with David on the PEP8 issue.  A very nice tool to use is the
> pep8 checker, https://pypi.python.org/pypi/pep8
>
> I see that you use MSAProbs.  I have an MSAProbs application wrapper in
> the works.  I haven't submitted it yet due to incomplete unit tests,
> but maybe it's useful to you:
>
> https://github.com/cbrueffer/biopython/tree/msaprobs
>
> Cheers,
>
> Chris
>
>
> _______________________________________________
> Biopython-dev mailing list
> Biopython-dev at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython-dev
>

From arklenna at gmail.com  Tue Aug 20 18:16:18 2013
From: arklenna at gmail.com (Lenna Peterson)
Date: Tue, 20 Aug 2013 18:16:18 -0400
Subject: [Biopython-dev] Bio.PDB - Missing values (was Moratorium on
	commits?)
In-Reply-To: <CAKVJ-_4VkgXev6ni5n_oVLwJvyUGMuUttF-oKek3-6xLFQDvXw@mail.gmail.com>
References: <CAJ9sUYOtO8xd2mVSnUNhbLMnhWXtd4z67vEuADwK8ChmHOLLLw@mail.gmail.com>
	<CAKVJ-_47xSbDKwv85i5TtVVNfmaSHX0Vt3xq8JSyQrx2kKOAMA@mail.gmail.com>
	<CAJ9sUYOnjg5SqxikOL9_fYDSi2P46J8SKs5tQwKA8Mcf5PtxKg@mail.gmail.com>
	<CAHQkFdf0kHMkMaTipz7Odr_1=6qwPNyv2DHYwO3X4=fDnmcwKg@mail.gmail.com>
	<CAKVJ-_4VkgXev6ni5n_oVLwJvyUGMuUttF-oKek3-6xLFQDvXw@mail.gmail.com>
Message-ID: <CAHQkFdcXrRj6LUBLMpMFtG47KAKo2JDcMYbwsXnEZe0aESONxA@mail.gmail.com>

On Thu, Aug 15, 2013 at 9:23 AM, Peter Cock <p.j.a.cock at googlemail.com>wrote:
>
>
> I didn't mean to suggest writing the string "None" in the field, and
> I'm not sure if João did - it would certainly be an invalid PDB file.
>
> I agree that where the data structure has None (e.g. from our parser)
> then the writer could use a blank string (of the appropriate width).
> For mandatory fields like occupancy, this should give a warning.
>
>
As I suspected, the writer currently fails on None (it's expecting a
float). Test-driven development!

However, I don't see a simple or elegant way to force writing of a blank
occupancy. ATOM lines are currently written using C-style string
formatting, and the occupancy field is `%6.2f`.

Off the top of my head, I'd:

1. Store the original format string
2. Modify the format string to have "%6s" at the appropriate position
3. Modify the occupancy to be an empty string or a space
4. Set the return value to the formatted string
5. Restore the original format string
6. Return the return value

However, this seems...ugly at best. I don't know that switching formatting
styles (e.g. to string.format() or others) will help. And in most
circumstances, the type checking of the format string is useful.
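
The six steps above can be sketched roughly as follows. This is only an
illustration: the format string is a simplified stand-in for the one in
Bio.PDB.PDBIO (exact spacing is an assumption), and format_atom_line,
BLANK_OCC_FORMAT and OCCUPANCY_INDEX are hypothetical names. Because Python
strings are immutable, a second format string can be derived once instead of
storing and restoring the original:

```python
# Simplified stand-in for Bio.PDB's ATOM format string (spacing is assumed).
ATOM_FORMAT = "%s%5i %-4s%c%3s %c%4i%c   %8.3f%8.3f%8.3f%6.2f%6.2f      %4s%2s%2s\n"

# Derived variant: the first "%6.2f" (the occupancy field) becomes "%6s".
BLANK_OCC_FORMAT = ATOM_FORMAT.replace("%6.2f", "%6s", 1)

# Position of occupancy in the argument tuple (hypothetical constant).
OCCUPANCY_INDEX = 11


def format_atom_line(args):
    """Format one ATOM line, writing a blank field when occupancy is None."""
    if args[OCCUPANCY_INDEX] is None:
        # Swap None for an empty string so "%6s" pads it to six spaces.
        blanked = args[:OCCUPANCY_INDEX] + ("",) + args[OCCUPANCY_INDEX + 1:]
        return BLANK_OCC_FORMAT % blanked
    return ATOM_FORMAT % args


# Made-up atom record: record type, serial, name, altloc, resname, chain,
# resseq, icode, x, y, z, occupancy, bfactor, segid, element, charge.
fields = ("ATOM  ", 1, " CA ", " ", "GLY", "A", 1, " ",
          1.0, 2.0, 3.0, None, 20.0, "    ", " C", "  ")
line = format_atom_line(fields)
```

The numeric fields keep their "%f" type checking; only the occupancy field is
relaxed, and only when the value is None.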

Any thoughts?

Cheers,

Lenna


From anaryin at gmail.com  Tue Aug 20 18:25:57 2013
From: anaryin at gmail.com (=?UTF-8?Q?Jo=C3=A3o_Rodrigues?=)
Date: Tue, 20 Aug 2013 15:25:57 -0700
Subject: [Biopython-dev] Bio.PDB - Missing values (was Moratorium on
	commits?)
In-Reply-To: <CAHQkFdcXrRj6LUBLMpMFtG47KAKo2JDcMYbwsXnEZe0aESONxA@mail.gmail.com>
References: <CAJ9sUYOtO8xd2mVSnUNhbLMnhWXtd4z67vEuADwK8ChmHOLLLw@mail.gmail.com>
	<CAKVJ-_47xSbDKwv85i5TtVVNfmaSHX0Vt3xq8JSyQrx2kKOAMA@mail.gmail.com>
	<CAJ9sUYOnjg5SqxikOL9_fYDSi2P46J8SKs5tQwKA8Mcf5PtxKg@mail.gmail.com>
	<CAHQkFdf0kHMkMaTipz7Odr_1=6qwPNyv2DHYwO3X4=fDnmcwKg@mail.gmail.com>
	<CAKVJ-_4VkgXev6ni5n_oVLwJvyUGMuUttF-oKek3-6xLFQDvXw@mail.gmail.com>
	<CAHQkFdcXrRj6LUBLMpMFtG47KAKo2JDcMYbwsXnEZe0aESONxA@mail.gmail.com>
Message-ID: <CAJ9sUYP4CVc5uA1La2b8hKRzmGX4NcnoEEOR+ZSPA8NQ_yu_jg@mail.gmail.com>

Hi,

We should probably change it to str.format() regardless of advantages.

If we indeed have None in the parser then writing becomes a bit more
complicated. But I guess it's more correct? I'd vote for having a small
check/conversion on the writer, besides on the formatting of the string.

As a biologist, I don't care if it is None or an empty string, or whatever,
but for scripting maybe it makes more sense to be None? That's what I mean
by more correct.
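
To make the "small check/conversion on the writer" concrete, here is a
minimal sketch using str.format(). The function name occupancy_field is
hypothetical and not part of Bio.PDB; the point is only that switching
formatting style does not remove the need for an explicit None check:

```python
def occupancy_field(occupancy, width=6):
    """Return the occupancy as a fixed-width PDB field (blank if None)."""
    if occupancy is None:
        # Right-align an empty string, which yields `width` spaces.
        return "{:>{w}}".format("", w=width)
    # Two decimal places, right-aligned in `width` columns.
    return "{:{w}.2f}".format(occupancy, w=width)


blank = occupancy_field(None)
half = occupancy_field(0.5)
```

With this, the data structure can keep None for scripting purposes while the
written file still gets a field of the correct width.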

Cheers,

João


From michael.maher at ucsf.edu  Wed Aug 21 18:00:04 2013
From: michael.maher at ucsf.edu (Cyrus Maher)
Date: Wed, 21 Aug 2013 15:00:04 -0700
Subject: [Biopython-dev] Fwd: New Biopython (sub)module?
In-Reply-To: <CAHQkFdcG4dgm3JgwW4F3SYKsaKOWdKwS3Thwa4KxNsgxsDxwfQ@mail.gmail.com>
References: <CAME4z04Se6sQ73wgWqqWD7+_ZUxArHGuqzZFAGMx2SqhoGoYJw@mail.gmail.com>
	<CAME4z05YS0c-A0m-2-_y+Et1GJcnZ+_Nm3vigbWPSKGgC=ju4w@mail.gmail.com>
	<521354A9.6020701@brueffer.de>
	<CAHQkFdcG4dgm3JgwW4F3SYKsaKOWdKwS3Thwa4KxNsgxsDxwfQ@mail.gmail.com>
Message-ID: <CAME4z078NxcRxMFVs4trJHmz5VQ_0FGYDDLtKz3o_o5nmFZn2g@mail.gmail.com>

Thanks for sending that along Lenna! And thanks everybody for being patient
with me! This is my first experience sharing software, so it's great to
learn from you guys...

As far as updates:
-I've fixed all pep8 errors, with the exception of some finicky
continuation indent complaints.
-I've also uploaded example files so that the file "mosaic_example.py" can
be run without modification. From the mosaic directory, just type:
    python mosaic_example.py testfiles.txt
-The documentation has been updated as well.

I would of course be open to any additional feedback you guys could offer
for improving the code.

That said, I was also hoping to get your thoughts on whether this seemed
like the type of project that would fit in with Biopython. Peter said that
Eric might have some good comments on this matter?


Cheers,

-Cyrus


On Tue, Aug 20, 2013 at 2:31 PM, Lenna Peterson <arklenna at gmail.com> wrote:

> Also worth noting is autopep8: https://pypi.python.org/pypi/autopep8
> (it can be a bit aggressive but that's what version control is for, right?)
>
> Cheers,
>
> Lenna
>
>
> On Tue, Aug 20, 2013 at 7:36 AM, Christian Brueffer
> <christian at brueffer.de>wrote:
>
> > On 8/19/13 21:24 , Cyrus Maher wrote:
> > > Hi everybody!!-
> > >
> > > My name is (Michael) Cyrus Maher, and I'm a PhD student at UCSF in the
> > lab
> > > of Dr. Ryan D. Hernandez (http://bts.ucsf.edu/hernandez_lab/)...
> > >
> > > I am writing because I'm interested in submitting a new Biopython
> module.
> > > Since this is likely a one-time event, the wiki recommends proceeding
> > > through a developer. After speaking with Peter Cock, he recommended
> that
> > I
> > > open things up for discussion on the mailing list.
> > >
> > > Attached is a draft that describes a new method, termed MOSAIC, which
> > > integrates multiple sequence alignments from an arbitrary number of
> > > sources. We show that it greatly increases the number of orthologs that
> > we
> > > are able to detect while maintaining or improving functional-,
> > > phylogenetic-, and sequence identity-based measures of ortholog
> quality.
> > >
> > > Code and documentation may be found here:
> > >
> > > https://dl.dropboxusercontent.com/u/43327584/html/index.html
> > >
> > > Looking forward to hearing what you think!
> > >
> >
> > Hi Cyrus,
> >
> > I agree with David on the PEP8 issue.  A very nice tool to use is the
> > pep8 checker, https://pypi.python.org/pypi/pep8
> >
> > I see that you use MSAProbs.  I have an MSAProbs application wrapper in
> > the works.  I haven't submitted it yet due to incomplete unit tests,
> > but maybe it's useful to you:
> >
> > https://github.com/cbrueffer/biopython/tree/msaprobs
> >
> > Cheers,
> >
> > Chris
> >
> >

From p.j.a.cock at googlemail.com  Thu Aug 22 09:01:27 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Thu, 22 Aug 2013 14:01:27 +0100
Subject: [Biopython-dev] Fwd: New Biopython (sub)module?
In-Reply-To: <CAME4z078NxcRxMFVs4trJHmz5VQ_0FGYDDLtKz3o_o5nmFZn2g@mail.gmail.com>
References: <CAME4z04Se6sQ73wgWqqWD7+_ZUxArHGuqzZFAGMx2SqhoGoYJw@mail.gmail.com>
	<CAME4z05YS0c-A0m-2-_y+Et1GJcnZ+_Nm3vigbWPSKGgC=ju4w@mail.gmail.com>
	<521354A9.6020701@brueffer.de>
	<CAHQkFdcG4dgm3JgwW4F3SYKsaKOWdKwS3Thwa4KxNsgxsDxwfQ@mail.gmail.com>
	<CAME4z078NxcRxMFVs4trJHmz5VQ_0FGYDDLtKz3o_o5nmFZn2g@mail.gmail.com>
Message-ID: <CAKVJ-_4JOLva-8j9xoD2LDMNcsLTPGxn867FdVihK4e+m0y77w@mail.gmail.com>

On Wed, Aug 21, 2013 at 11:00 PM, Cyrus Maher <michael.maher at ucsf.edu> wrote:
>
> That said, I was also hoping to get your thoughts on whether this seemed
> like the type of project that would fit in with Biopython. Peter said that
> Eric might have some good comments on this matter?

Right - I was thinking Eric and this year's phylogenetic focused GSoC
students should have some good comments, e.g. about adding
something like pal2nal into Biopython.

Peter

From p.j.a.cock at googlemail.com  Fri Aug 23 04:54:35 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 23 Aug 2013 09:54:35 +0100
Subject: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week?
In-Reply-To: <CADEGkF5nk1qmpu4vdG1dueYYS5xNxufaX+RQkf8am0pjDbmmfA@mail.gmail.com>
References: <CAKVJ-_6Osh2DyicHg7GoLu=Q1UssjYJzgjFV3K2A9TBJSjxYyg@mail.gmail.com>
	<1373620014.53407.YahooMailNeo@web164001.mail.gq1.yahoo.com>
	<CAKVJ-_7wmyZLj5iDd2ThhBsUHMw3Yy75jr1xR7vN1ObvaRLxMA@mail.gmail.com>
	<1373680350.55044.YahooMailNeo@web164003.mail.gq1.yahoo.com>
	<CADEGkF4FfoszNWaPEa=S4KwVFwV9vVOGGb6Wz+yBAVnmnD3rzw@mail.gmail.com>
	<1373712847.72527.YahooMailNeo@web164004.mail.gq1.yahoo.com>
	<CADEGkF5nk1qmpu4vdG1dueYYS5xNxufaX+RQkf8am0pjDbmmfA@mail.gmail.com>
Message-ID: <CAKVJ-_6TzOuMfmPSTvrdA2nica7n9saT3UY3-33DV8Wov9jXUA@mail.gmail.com>

On Fri, Aug 16, 2013 at 8:14 AM, Wibowo Arindrarto
<w.arindrarto at gmail.com> wrote:
> Hi Michiel, Peter,
>
> In preparation for the 1.62 release, I've made the following changes
> to Bio.NCBIStandalone and Bio.ParserSupport:
>
> * Migrated the two modules under Bio.SearchIO._legacy
> * Upgraded their PendingDeprecationWarning to BiopythonDeprecationWarning

So basically you're proposing formally deprecating parsing plain
text BLAST output (via NCBIStandalone and Bio.ParserSupport)
but continuing to support this format via SearchIO (using a copy
of the current parser as a private module)?

This then gives you the freedom to rewrite the old text parser
more simply (e.g. assuming only recent versions of the BLAST
suite), which might be nice.

> I've pushed the changes to this branch:
> https://github.com/bow/biopython/tree/bio_blast_migrate
>
> Tests seem to be running fine still, but now there is the awkward
> situation where if users import Bio.NCBIStandalone and/or
> Bio.ParserSupport directly they will be greeted with two warnings: the
> BiopythonWarning for the modules' deprecation and the
> BiopythonExperimentalWarning for SearchIO.
>
> We could suppress the SearchIO warning in Bio.NCBIStandalone and
> Bio.ParserSupport. But before this is done, I was wondering if we have
> a defined timeline for removing a BiopythonExperimentalWarning? (i.e.
> if it will be removed in this release, then we could do that instead).

It doesn't make sense to have a defined timeline for removing a
BiopythonExperimentalWarning - it will be on a case by case basis.
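
For reference, the suppression option Bow mentions could look roughly like
the sketch below. It uses only the standard warnings module; the two warning
classes and both functions are hypothetical stand-ins, not the real Biopython
warning classes or modules:

```python
import warnings


class ExperimentalNotice(Warning):
    """Stand-in for an 'experimental code' warning category."""


class DeprecationNotice(Warning):
    """Stand-in for a 'deprecated module' warning category."""


def load_experimental_backend():
    # Plays the role of importing the experimental package.
    warnings.warn("This code is experimental", ExperimentalNotice)
    return "parser"


def load_legacy_module():
    # Silence only the experimental warning while pulling in the backend...
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", ExperimentalNotice)
        backend = load_experimental_backend()
    # ...so that users of the legacy module see just the deprecation.
    warnings.warn("This module is deprecated", DeprecationNotice)
    return backend


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    load_legacy_module()
categories = [w.category for w in caught]
```

The filter is scoped to the with-block, so code that uses the experimental
package directly would still get its warning.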

Do you think SearchIO is ready for that now (or in Biopython 1.63)?

Peter

From p.j.a.cock at googlemail.com  Fri Aug 23 05:05:02 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 23 Aug 2013 10:05:02 +0100
Subject: [Biopython-dev] Bio.PDB - Missing values (was Moratorium on
	commits?)
In-Reply-To: <CAHQkFdcXrRj6LUBLMpMFtG47KAKo2JDcMYbwsXnEZe0aESONxA@mail.gmail.com>
References: <CAJ9sUYOtO8xd2mVSnUNhbLMnhWXtd4z67vEuADwK8ChmHOLLLw@mail.gmail.com>
	<CAKVJ-_47xSbDKwv85i5TtVVNfmaSHX0Vt3xq8JSyQrx2kKOAMA@mail.gmail.com>
	<CAJ9sUYOnjg5SqxikOL9_fYDSi2P46J8SKs5tQwKA8Mcf5PtxKg@mail.gmail.com>
	<CAHQkFdf0kHMkMaTipz7Odr_1=6qwPNyv2DHYwO3X4=fDnmcwKg@mail.gmail.com>
	<CAKVJ-_4VkgXev6ni5n_oVLwJvyUGMuUttF-oKek3-6xLFQDvXw@mail.gmail.com>
	<CAHQkFdcXrRj6LUBLMpMFtG47KAKo2JDcMYbwsXnEZe0aESONxA@mail.gmail.com>
Message-ID: <CAKVJ-_7Xx24v0ekLX_dMBU_+dymVG4VyryKuwrk2vpoi60g8Pg@mail.gmail.com>

On Tue, Aug 20, 2013 at 11:16 PM, Lenna Peterson <arklenna at gmail.com> wrote:
>
> On Thu, Aug 15, 2013 at 9:23 AM, Peter Cock <p.j.a.cock at googlemail.com>
> wrote:
>>
>>
>> I didn't mean to suggest writing the string "None" in the field, and
>> I'm not sure if João did - it would certainly be an invalid PDB file.
>>
>> I agree that where the data structure has None (e.g. from our parser)
>> then the writer could use a blank string (of the appropriate width).
>> For mandatory fields like occupancy, this should give a warning.
>>
>
> As I suspected, the writer currently fails on None (it's expecting a float).
> Test-driven development!
>
> However, I don't see a simple or elegant way to force writing of a blank
> occupancy. ATOM lines are currently written using C-style string formatting,
> and the occupancy field is `%6.2f`.
>
> Off the top of my head, I'd:
>
> 1. Store the original format string
> 2. Modify the format string to have "%6s" at the appropriate position
> 3. Modify the occupancy to be an empty string or a space
> 4. Set the return value to the formatted string
> 5. Restore the original format string
> 6. Return the return value
>
> However, this seems...ugly at best. I don't know that switching formatting
> styles (e.g. to string.format() or others) will help. And in most
> circumstances, the type checking of the format string is useful.
>
> Any thoughts?

I would suggest something like this (untested):

$ git diff
diff --git a/Bio/PDB/PDBIO.py b/Bio/PDB/PDBIO.py
index 2f64571..11a52ca 100644
--- a/Bio/PDB/PDBIO.py
+++ b/Bio/PDB/PDBIO.py
@@ -8,7 +8,7 @@
 from Bio.PDB.StructureBuilder import StructureBuilder # To allow saving of chains, residues, etc..
 from Bio.Data.IUPACData import atom_weights # Allowed Elements

-_ATOM_FORMAT_STRING="%s%5i %-4s%c%3s %c%4i%c   %8.3f%8.3f%8.3f%6.2f%6.2f      %4s%2s%2s\n"
+_ATOM_FORMAT_STRING="%s%5i %-4s%c%3s %c%4i%c   %8.3f%8.3f%8.3f%s%6.2f      %4s%2s%2s\n"


 class Select(object):
@@ -85,8 +85,21 @@ class PDBIO(object):
         x, y, z=atom.get_coord()
         bfactor=atom.get_bfactor()
         occupancy=atom.get_occupancy()
+        # Handle a missing occupancy (None) with a blank entry:
+        try:
+            occupancy_str = "%6.2f" % occupancy
+        except TypeError:
+            if occupancy is None:
+                occupancy_str = " " * 6
+                import warnings
+                from Bio import BiopythonWarning
+                # TODO - Introduce exception BiopythonWriterWarning?
+                warnings.warn("Missing occupancy will be recorded as blank",
+                             BiopythonWarning)
+            else:
+                raise TypeError("Invalid occupancy %r in atom %r" % (occupancy, atom))
         args=(record_type, atom_number, name, altloc, resname, chain_id,
-            resseq, icode, x, y, z, occupancy, bfactor, segid,
+            resseq, icode, x, y, z, occupancy_str, bfactor, segid,
             element, charge)
         return _ATOM_FORMAT_STRING % args


The error message could be improved (e.g. a more helpful identification
of the ATOM at fault)?
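
A standalone way to try out this logic without touching Bio.PDB is sketched
below. It assumes nothing beyond the standard library: UserWarning stands in
for BiopythonWarning, and occupancy_to_str is a hypothetical helper name. It
also shows how a unit test could check that the warning is actually issued:

```python
import warnings


def occupancy_to_str(occupancy):
    """Format occupancy as a 6-wide field; blank (with a warning) if None."""
    try:
        return "%6.2f" % occupancy
    except TypeError:
        if occupancy is None:
            # UserWarning is a stand-in for BiopythonWarning here.
            warnings.warn("Missing occupancy will be recorded as blank",
                          UserWarning)
            return " " * 6
        raise TypeError("Invalid occupancy %r" % (occupancy,))


# Record warnings instead of printing them, as a test might do.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    blank = occupancy_to_str(None)

normal = occupancy_to_str(1)
```

Anything that is neither a number nor None (for example the string "None")
still raises TypeError, so invalid data is not silently written out.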

Peter


From w.arindrarto at gmail.com  Sat Aug 24 06:22:56 2013
From: w.arindrarto at gmail.com (Wibowo Arindrarto)
Date: Sat, 24 Aug 2013 12:22:56 +0200
Subject: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week?
In-Reply-To: <CAKVJ-_6TzOuMfmPSTvrdA2nica7n9saT3UY3-33DV8Wov9jXUA@mail.gmail.com>
References: <CAKVJ-_6Osh2DyicHg7GoLu=Q1UssjYJzgjFV3K2A9TBJSjxYyg@mail.gmail.com>
	<1373620014.53407.YahooMailNeo@web164001.mail.gq1.yahoo.com>
	<CAKVJ-_7wmyZLj5iDd2ThhBsUHMw3Yy75jr1xR7vN1ObvaRLxMA@mail.gmail.com>
	<1373680350.55044.YahooMailNeo@web164003.mail.gq1.yahoo.com>
	<CADEGkF4FfoszNWaPEa=S4KwVFwV9vVOGGb6Wz+yBAVnmnD3rzw@mail.gmail.com>
	<1373712847.72527.YahooMailNeo@web164004.mail.gq1.yahoo.com>
	<CADEGkF5nk1qmpu4vdG1dueYYS5xNxufaX+RQkf8am0pjDbmmfA@mail.gmail.com>
	<CAKVJ-_6TzOuMfmPSTvrdA2nica7n9saT3UY3-33DV8Wov9jXUA@mail.gmail.com>
Message-ID: <CADEGkF6hAouo5xA3uSVZ8GrpEyA=LB0TKNCke8RQ0kt0W9Kcww@mail.gmail.com>

Hi Peter, everyone,

>> In preparation for the 1.62 release, I've made the following changes
>> to Bio.NCBIStandalone and Bio.ParserSupport:
>>
>> * Migrated the two modules under Bio.SearchIO._legacy
>> * Upgraded their PendingDeprecationWarning to BiopythonDeprecationWarning
>
> So basically you're proposing formally deprecating parsing plain
> text BLAST output (via NCBIStandalone and Bio.ParserSupport)
> but continuing to support this format via SearchIO (using a copy
> of the current parser as a private module)?
>
> This then gives you the freedom to rewrite the old text parser
> more simply (e.g. assuming only recent versions of the BLAST
> suite), which might be nice.

Yes. This seems like a sensible thing to do now.

>> I've pushed the changes to this branch:
>> https://github.com/bow/biopython/tree/bio_blast_migrate
>>
>> Tests seem to be running fine still, but now there is the awkward
>> situation where if users import Bio.NCBIStandalone and/or
>> Bio.ParserSupport directly they will be greeted with two warnings: the
>> BiopythonWarning for the modules' deprecation and the
>> BiopythonExperimentalWarning for SearchIO.
>>
>> We could suppress the SearchIO warning in Bio.NCBIStandalone and
>> Bio.ParserSupport. But before this is done, I was wondering if we have
>> a defined timeline for removing a BiopythonExperimentalWarning? (i.e.
>> if it will be removed in this release, then we could do that instead).
>
> It doesn't make sense to have a defined timeline for removing a
> BiopythonExperimentalWarning - it will be on a case by case basis.
>
> Do you think SearchIO is ready for that now (or in Biopython 1.63)?

Hmm..what I have in mind is actually as soon as we lift SearchIO's
BiopythonExperimentalWarning, we give Bio.Blast a
PendingDeprecationWarning. I think this gives users a clearer / firmer
choice, since it could be confusing to have two different modules that
handle BLAST parsing in Biopython.

As for the readiness, I think the important features that we planned
have been implemented in SearchIO. I don't have any major feature
change that I would like to implement anytime soon, too. So yes, I
think it is ready.

Best,
Bow

From yeyanbo289 at gmail.com  Sun Aug 25 23:53:50 2013
From: yeyanbo289 at gmail.com (Yanbo Ye)
Date: Mon, 26 Aug 2013 11:53:50 +0800
Subject: [Biopython-dev] GSOC weekly update 11
Message-ID: <CADoMHjw0e9-orsE+qrbe9S5YeB38oad2k+UsiaEnVT9=2xiZoQ@mail.gmail.com>

Hi all,

Biopython.Phylo project update for last week is here:
http://blog.yeyanbo.com/posts/google-summer-of-code-11.html

Thanks,
Yanbo

-- 

Yanbo Ye
Guangzhou Institutes of Biomedicine and Health,
Chinese Academy of Sciences
190 Kaiyuan Avenue, Science Park, Guangzhou, China

Email: ye_yanbo at gibh.ac.cn
Web: http://www.yeyanbo.com
Phone: (86)-020-32093810

From p.j.a.cock at googlemail.com  Mon Aug 26 10:04:35 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Mon, 26 Aug 2013 15:04:35 +0100
Subject: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week?
In-Reply-To: <CADEGkF6hAouo5xA3uSVZ8GrpEyA=LB0TKNCke8RQ0kt0W9Kcww@mail.gmail.com>
References: <CAKVJ-_6Osh2DyicHg7GoLu=Q1UssjYJzgjFV3K2A9TBJSjxYyg@mail.gmail.com>
	<1373620014.53407.YahooMailNeo@web164001.mail.gq1.yahoo.com>
	<CAKVJ-_7wmyZLj5iDd2ThhBsUHMw3Yy75jr1xR7vN1ObvaRLxMA@mail.gmail.com>
	<1373680350.55044.YahooMailNeo@web164003.mail.gq1.yahoo.com>
	<CADEGkF4FfoszNWaPEa=S4KwVFwV9vVOGGb6Wz+yBAVnmnD3rzw@mail.gmail.com>
	<1373712847.72527.YahooMailNeo@web164004.mail.gq1.yahoo.com>
	<CADEGkF5nk1qmpu4vdG1dueYYS5xNxufaX+RQkf8am0pjDbmmfA@mail.gmail.com>
	<CAKVJ-_6TzOuMfmPSTvrdA2nica7n9saT3UY3-33DV8Wov9jXUA@mail.gmail.com>
	<CADEGkF6hAouo5xA3uSVZ8GrpEyA=LB0TKNCke8RQ0kt0W9Kcww@mail.gmail.com>
Message-ID: <CAKVJ-_6pPaDgZkVV7RTp3mXYZCPn1ufi-gUAYQADAQUoR=2ADg@mail.gmail.com>

On Sat, Aug 24, 2013 at 11:22 AM, Wibowo Arindrarto
<w.arindrarto at gmail.com> wrote:
> Hi Peter, everyone,
>
> As for the readiness, I think the important features that we planned
> have been implemented in SearchIO. I don't have any major feature
> change that I would like to implement anytime soon, too. So yes, I
> think it is ready.

So you'd be comfortable with removing the experimental warning
for SearchIO in Biopython 1.62 final (this week if the PDB occupancy
thing is resolved)?

And you would like to officially support plain text BLAST parsing
(despite it not being recommended by the NCBI, and known to have
been quite a lot of work in the past to keep the parser working)?

We should probably also give you (Bow) commit rights too, so you
can handle basic parser updates within SearchIO directly - assuming
you're happy with that?

Regards,

Peter

From w.arindrarto at gmail.com  Mon Aug 26 12:04:38 2013
From: w.arindrarto at gmail.com (Wibowo Arindrarto)
Date: Mon, 26 Aug 2013 18:04:38 +0200
Subject: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week?
In-Reply-To: <CAKVJ-_6pPaDgZkVV7RTp3mXYZCPn1ufi-gUAYQADAQUoR=2ADg@mail.gmail.com>
References: <CAKVJ-_6Osh2DyicHg7GoLu=Q1UssjYJzgjFV3K2A9TBJSjxYyg@mail.gmail.com>
	<1373620014.53407.YahooMailNeo@web164001.mail.gq1.yahoo.com>
	<CAKVJ-_7wmyZLj5iDd2ThhBsUHMw3Yy75jr1xR7vN1ObvaRLxMA@mail.gmail.com>
	<1373680350.55044.YahooMailNeo@web164003.mail.gq1.yahoo.com>
	<CADEGkF4FfoszNWaPEa=S4KwVFwV9vVOGGb6Wz+yBAVnmnD3rzw@mail.gmail.com>
	<1373712847.72527.YahooMailNeo@web164004.mail.gq1.yahoo.com>
	<CADEGkF5nk1qmpu4vdG1dueYYS5xNxufaX+RQkf8am0pjDbmmfA@mail.gmail.com>
	<CAKVJ-_6TzOuMfmPSTvrdA2nica7n9saT3UY3-33DV8Wov9jXUA@mail.gmail.com>
	<CADEGkF6hAouo5xA3uSVZ8GrpEyA=LB0TKNCke8RQ0kt0W9Kcww@mail.gmail.com>
	<CAKVJ-_6pPaDgZkVV7RTp3mXYZCPn1ufi-gUAYQADAQUoR=2ADg@mail.gmail.com>
Message-ID: <CADEGkF6otBZncQxhGq3iHNjGkOsoXcso4JTkbTmBfsVJMga1DA@mail.gmail.com>

On Mon, Aug 26, 2013 at 4:04 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> On Sat, Aug 24, 2013 at 11:22 AM, Wibowo Arindrarto
> <w.arindrarto at gmail.com> wrote:
>> Hi Peter, everyone,
>>
>> As for the readiness, I think the important features that we planned
>> have been implemented in SearchIO. I don't have any major feature
>> change that I would like to implement anytime soon, too. So yes, I
>> think it is ready.
>
> So you'd be comfortable with removing the experimental warning
> for SearchIO in Biopython 1.62 final (this week if the PDB occupancy
> thing is resolved)?

Yes. I think all public-facing modules are ok now. There are still two
issues which I consider minor, but I think should be mentioned before
we lift the warning:

1. Storing [T]FAST[X|Y] query and hit strand information (see
https://redmine.open-bio.org/issues/3419). I'm not sure yet if I
should do the commit, but Jason's patch looks sensible (and I can
probably add some more so that the parser knows whether to set the
strand on hit or query sequence).

2. Collapsing / merging overlapping HSPs. I've received one (or two)
mail(s) asking whether it is possible to merge overlapping HSPs
(apparently BLAST sometimes does this). I haven't figured out a way to
cleanly implement this, so this is on hold for now.
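
To illustrate just the coordinate side of point 2, here is a rough sketch of
merging overlapping ranges. It is not a SearchIO feature: merge_ranges is a
hypothetical helper working on plain (start, end) tuples, and it says nothing
about how the aligned sequences or scores of merged HSPs would be combined,
which is the hard part:

```python
def merge_ranges(ranges):
    """Merge overlapping (start, end) ranges into non-overlapping ones."""
    merged = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous range: extend it if needed.
            previous_start, previous_end = merged[-1]
            merged[-1] = (previous_start, max(previous_end, end))
        else:
            merged.append((start, end))
    return merged


# Made-up HSP coordinates on a single hit sequence.
hsp_ranges = [(40, 80), (10, 50), (100, 120)]
merged_ranges = merge_ranges(hsp_ranges)
```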

In addition, we had a discussion some months ago about the Bio._utils
module that SearchIO uses (see
http://lists.open-bio.org/pipermail/biopython-dev/2013-January/010219.html,
http://lists.open-bio.org/pipermail/biopython-dev/2013-January/010240.html,
and http://lists.open-bio.org/pipermail/biopython-dev/2013-February/010290.html).
We had an extensive discussion about this last time, which went as far
as considering a change in how we run our tests. Since the Bio._utils
module itself is private, however, no public-facing function in
SearchIO is affected.

Other than these, some planned features are implementing the HMMER3.1
parser (which I think should not interfere with lifting the warning).

> And you would like to officially support plain text BLAST parsing
> (despite it not being recommended by the NCBI, and known to have
> been quite a lot of work in the past to keep the parser working)?

Looking at http://lists.open-bio.org/pipermail/biopython/2012-September/008166.html,
the most sensible approach seems to be to put the current parser under
SearchIO (hence the module reorganization I did; so we can deprecate
Bio.Blast as a whole without losing functionality), without actually
advertising that we have full support of parsing the text output
(perhaps put a disclaimer that plain text is not guaranteed to work?).
I feel like some people may still want to use previous BLAST versions
anyway, and we do have a functioning parser tested up to 2.2.26+, so
throwing it away doesn't seem to be the best thing to do here. And in
the case that someone does want to extend the parser (could be me,
could be someone else) to work with the latest BLAST version, (s)he
can then extend the existing parser.

> We should probably also give you (Bow) commit rights too, so you
> can handle basic parser updates within SearchIO directly - assuming
> you're happy with that?

This is fine with me.

Best,
Bow

P.S. I made the pull request for the reorganization here:
https://github.com/biopython/biopython/pull/223. Comments are welcome :)

From p.j.a.cock at googlemail.com  Tue Aug 27 04:41:39 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Tue, 27 Aug 2013 09:41:39 +0100
Subject: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week?
In-Reply-To: <CADEGkF6otBZncQxhGq3iHNjGkOsoXcso4JTkbTmBfsVJMga1DA@mail.gmail.com>
References: <CAKVJ-_6Osh2DyicHg7GoLu=Q1UssjYJzgjFV3K2A9TBJSjxYyg@mail.gmail.com>
	<1373620014.53407.YahooMailNeo@web164001.mail.gq1.yahoo.com>
	<CAKVJ-_7wmyZLj5iDd2ThhBsUHMw3Yy75jr1xR7vN1ObvaRLxMA@mail.gmail.com>
	<1373680350.55044.YahooMailNeo@web164003.mail.gq1.yahoo.com>
	<CADEGkF4FfoszNWaPEa=S4KwVFwV9vVOGGb6Wz+yBAVnmnD3rzw@mail.gmail.com>
	<1373712847.72527.YahooMailNeo@web164004.mail.gq1.yahoo.com>
	<CADEGkF5nk1qmpu4vdG1dueYYS5xNxufaX+RQkf8am0pjDbmmfA@mail.gmail.com>
	<CAKVJ-_6TzOuMfmPSTvrdA2nica7n9saT3UY3-33DV8Wov9jXUA@mail.gmail.com>
	<CADEGkF6hAouo5xA3uSVZ8GrpEyA=LB0TKNCke8RQ0kt0W9Kcww@mail.gmail.com>
	<CAKVJ-_6pPaDgZkVV7RTp3mXYZCPn1ufi-gUAYQADAQUoR=2ADg@mail.gmail.com>
	<CADEGkF6otBZncQxhGq3iHNjGkOsoXcso4JTkbTmBfsVJMga1DA@mail.gmail.com>
Message-ID: <CAKVJ-_5dEA9daJLnK9VDqHs2dma8gT2sFM0SSEjmrR4DVVRhjA@mail.gmail.com>

On Mon, Aug 26, 2013 at 5:04 PM, Wibowo Arindrarto
<w.arindrarto at gmail.com> wrote:
>
>> So you'd be comfortable with removing the experimental warning
>> for SearchIO in Biopython 1.62 final (this week if the PDB occupancy
>> thing is resolved)?
>
> Yes. I think all public-facing modules are ok now. There are still two
> issues which I consider minor, but I think should be mentioned before
> we lift the warning:
>
> ...
>
> Other than these, some planned features are implementing the HMMER3.1
> parser (which I think should not interfere with lifting the warning).

We'll also want to update the Tutorial as well, merging the BLAST
and SearchIO chapters. Let's start work on this just after releasing
Biopython 1.62 then, which I think we can now go ahead with :)

Lenna has sorted out the PDB occupancy issue, and Eric has
updated the PRANK unit tests.

I think this means we are OK to do the release in the next day
or two? Any objections?

Regards,

Peter

From p.j.a.cock at googlemail.com  Tue Aug 27 04:43:17 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Tue, 27 Aug 2013 09:43:17 +0100
Subject: [Biopython-dev] Releasing Biopython 1.62 this week?
Message-ID: <CAKVJ-_4-ucrZUCkvsURVO19B2-d4d_DdxGzv2ehbbW6z5ucFsw@mail.gmail.com>

Continuing this thread under a new title, as below, I would
like to do the Biopython 1.62 release in the next day or two:

http://lists.open-bio.org/pipermail/biopython-dev/2013-August/010836.html

Peter

On Tue, Aug 27, 2013 at 9:41 AM, Peter Cock wrote:
> On Mon, Aug 26, 2013 at 5:04 PM, Wibowo Arindrarto wrote:
>>
>>> So you'd be comfortable with removing the experimental warning
>>> for SearchIO in Biopython 1.62 final (this week if the PDB occupancy
>>> thing is resolved)?
>>
>> Yes. I think all public-facing modules are ok now. There are still two
>> issues which I consider minor, but I think should be mentioned before
>> we lift the warning:
>>
>> ...
>>
>> Other than these, some planned features are implementing the HMMER3.1
>> parser (which I think should not interfere with lifting the warning).
>
> We'll also want to update the Tutorial as well, merging the BLAST
> and SearchIO chapters. Let's start work on this just after releasing
> Biopython 1.62 then, which I think we can now go ahead with :)
>
> Lenna has sorted out the PDB occupancy issue, and Eric has
> updated the PRANK unit tests.
>
> I think this means we are OK to do the release in the next day
> or two? Any objections?
>
> Regards,
>
> Peter

From w.arindrarto at gmail.com  Tue Aug 27 05:41:32 2013
From: w.arindrarto at gmail.com (Wibowo Arindrarto)
Date: Tue, 27 Aug 2013 11:41:32 +0200
Subject: [Biopython-dev] Releasing Biopython 1.62 this week?
In-Reply-To: <CAKVJ-_4-ucrZUCkvsURVO19B2-d4d_DdxGzv2ehbbW6z5ucFsw@mail.gmail.com>
References: <CAKVJ-_4-ucrZUCkvsURVO19B2-d4d_DdxGzv2ehbbW6z5ucFsw@mail.gmail.com>
Message-ID: <CADEGkF4-QCSOeHO6HO0Jqz=5j2XmCQvJ6xjxY92uCTGg0fJrjQ@mail.gmail.com>

Hi Peter, everyone,

On Tue, Aug 27, 2013 at 10:43 AM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> Continuing this thread under a new title, as below, I would
> like to do the Biopython 1.62 release in the next day or two:
>
> http://lists.open-bio.org/pipermail/biopython-dev/2013-August/010836.html
>
> Peter
>
> On Tue, Aug 27, 2013 at 9:41 AM, Peter Cock wrote:
>> On Mon, Aug 26, 2013 at 5:04 PM, Wibowo Arindrarto wrote:
>>>
>>>> So you'd be comfortable with removing the experimental warning
>>>> for SearchIO in Biopython 1.62 final (this week if the PDB occupancy
>>>> thing is resolved)?
>>>
>>> Yes. I think all public-facing modules are ok now. There are still two
>>> issues which I consider minor, but I think should be mentioned before
>>> we lift the warning:
>>>
>>> ...
>>>
>>> Other than these, some planned features are implementing the HMMER3.1
>>> parser (which I think should not interfere with lifting the warning).
>>
>> We'll also want to update the Tutorial as well, merging the BLAST
>> and SearchIO chapters. Let's start work on this just after releasing
>> Biopython 1.62 then, which I think we can now go ahead with :)

Ah yes. I missed the tutorial. Then yes, it should be updated as well.
If we are doing this after 1.62 is released, is it worth aiming for a
larger change? (I recall there was a discussion some time ago about
porting the tutorial to Sphinx.)

>> Lenna has sorted out the PDB occupancy issue, and Eric has
>> updated the PRANK unit tests.
>>
>> I think this means we are OK to do the release in the next day
>> or two? Any objections?

No objections from me :).

Best,
Bow

From eric.talevich at gmail.com  Tue Aug 27 14:45:58 2013
From: eric.talevich at gmail.com (Eric Talevich)
Date: Tue, 27 Aug 2013 11:45:58 -0700
Subject: [Biopython-dev] Releasing Biopython 1.62 this week?
In-Reply-To: <CAKVJ-_4-ucrZUCkvsURVO19B2-d4d_DdxGzv2ehbbW6z5ucFsw@mail.gmail.com>
References: <CAKVJ-_4-ucrZUCkvsURVO19B2-d4d_DdxGzv2ehbbW6z5ucFsw@mail.gmail.com>
Message-ID: <CAMC681n1Tv=C2BtDG-gvDGSU=e_bqGrmKSFzQzSAh8coFSKhdg@mail.gmail.com>

On Tue, Aug 27, 2013 at 1:43 AM, Peter Cock <p.j.a.cock at googlemail.com>wrote:

> Continuing this thread under a new title, as below, I would
> like to do the Biopython 1.62 release in the next day or two:
>
> http://lists.open-bio.org/pipermail/biopython-dev/2013-August/010836.html
>
> Peter
>
> On Tue, Aug 27, 2013 at 9:41 AM, Peter Cock wrote:
> > On Mon, Aug 26, 2013 at 5:04 PM, Wibowo Arindrarto wrote:
> >>
> >>> So you'd be comfortable with removing the experimental warning
> >>> for SearchIO in Biopython 1.62 final (this week if the PDB occupancy
> >>> thing is resolved)?
> >>
> >> Yes. I think all public-facing modules are ok now. There are still two
> >> issue which I consider minor, but I think should be mentioned before
> >> we lift the warning:
> >>
> >> ...
> >>
> >> Other than these, some planned features are implementing the HMMER3.1
> >> parser (which I think should not interfere with lifting the warning).
> >
> > We'll also want to update the Tutorial as well, merging the BLAST
> > and SearchIO chapters. Let's start work on this just after releasing
> > Biopython 1.62 then, which I think we can now go ahead with :)
> >
> > Lenna has sorted out the PDB occupancy issue, and Eric has
> > updated the PRANK unit tests.
> >
> > I think this means we are OK to do the release in the next day
> > or two? Any objections?
> >
> > Regards,
> >
> > Peter
>


Sounds good. Mind if I sneak in a quick update to the Phylo chapter of the
Tutorial to mention CDAO support?

Also, has anything else noteworthy been added since the beta that we can
announce in the NEWS file?

Thanks,
Eric

From p.j.a.cock at googlemail.com  Tue Aug 27 15:27:48 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Tue, 27 Aug 2013 20:27:48 +0100
Subject: [Biopython-dev] Releasing Biopython 1.62 this week?
In-Reply-To: <CAMC681n1Tv=C2BtDG-gvDGSU=e_bqGrmKSFzQzSAh8coFSKhdg@mail.gmail.com>
References: <CAKVJ-_4-ucrZUCkvsURVO19B2-d4d_DdxGzv2ehbbW6z5ucFsw@mail.gmail.com>
	<CAMC681n1Tv=C2BtDG-gvDGSU=e_bqGrmKSFzQzSAh8coFSKhdg@mail.gmail.com>
Message-ID: <CAKVJ-_5xsn3uhONzog7-YsmN_kXyzcSpfHfytzkABcfk96m-hg@mail.gmail.com>

On Tue, Aug 27, 2013 at 7:45 PM, Eric Talevich <eric.talevich at gmail.com> wrote:
>
> Sounds good. Mind if I sneak in a quick update to the Phylo chapter of the
> Tutorial to mention CDAO support?

Go for it - I need to retest the DSSP unit test tomorrow anyway.

> Also, has anything else noteworthy been added since the beta that we can
> announce in the NEWS file?

Minor bug fixes and more tests? Perhaps the PDB occupancy change?

Peter

From w.arindrarto at gmail.com  Wed Aug 28 08:12:24 2013
From: w.arindrarto at gmail.com (Wibowo Arindrarto)
Date: Wed, 28 Aug 2013 14:12:24 +0200
Subject: [Biopython-dev] Releasing Biopython 1.62 this week?
In-Reply-To: <CAKVJ-_5xsn3uhONzog7-YsmN_kXyzcSpfHfytzkABcfk96m-hg@mail.gmail.com>
References: <CAKVJ-_4-ucrZUCkvsURVO19B2-d4d_DdxGzv2ehbbW6z5ucFsw@mail.gmail.com>
	<CAMC681n1Tv=C2BtDG-gvDGSU=e_bqGrmKSFzQzSAh8coFSKhdg@mail.gmail.com>
	<CAKVJ-_5xsn3uhONzog7-YsmN_kXyzcSpfHfytzkABcfk96m-hg@mail.gmail.com>
Message-ID: <CADEGkF4nxEBwJkNh1H75mYX3ga2SJZmcHRDRCwwf7AZQ_C+4kw@mail.gmail.com>

Hi Peter, everyone,

On Tue, Aug 27, 2013 at 9:27 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> On Tue, Aug 27, 2013 at 7:45 PM, Eric Talevich <eric.talevich at gmail.com> wrote:
>>
>> Sounds good. Mind if I sneak in a quick update to the Phylo chapter of the
>> Tutorial to mention CDAO support?
>
> Go for it - I need to retest the DSSP unit test tomorrow anyway.
>
>> Also, has anything else noteworthy been added since the beta that we can
>> announce in the NEWS file?
>
> Minor bug fixes and more tests? Perhaps the PDB occupancy change?
>
> Peter

I don't like to believe in coincidences, but just last night a user
emailed me about an issue in SearchIO's exonerate parser which I feel
should be mentioned here (exchange attached with his permission). He
stumbled on an error where an exonerate output file is unparseable
because of split codon alignments. In short, I feel we should not lift
the BiopythonExperimentalWarning for the 1.62 release.

The issue is caused by protein-to-genome alignments in exonerate (in
the protein2genome alignment mode) that have split codons in them. When
split codons are present, SearchIO splits these HSPs into fragments,
each of which is basically a single contiguous sequence alignment. These
fragments have their own Seq objects (representing hit and query
sequences). The problem is, these Seq objects have to be full
sequences, and the query sequence fragment (protein) does not represent
a full sequence here (since the underlying codon is split).

Currently, SearchIO raises an AssertionError when this type of
alignment is found and simply says it cannot deal with it. This
should not remain the case, though. A test case was actually put up
for this (https://github.com/biopython/biopython/blob/master/Tests/Exonerate/exn_22_m_protein2genome.exn#L173).
However, since I have yet to find a way to properly represent these
fragments with Seq objects, the actual test has not been written (and
I missed this when doing the last review).
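
To make the length problem concrete, here is a minimal sketch in plain
Python. It is not the SearchIO code: the function name, the tiny lookup
table and the use of ValueError (SearchIO itself raises an
AssertionError) are all assumptions made for illustration. It only shows
why a three-letter fragment that stops in the middle of a residue cannot
be turned into one-letter codes.

```python
# Hypothetical, minimal three-letter -> one-letter table (only the
# residues used in this illustration).
THREE_TO_ONE = {"Met": "M", "Lys": "K", "Gly": "G", "Ala": "A"}


def three_to_one(aligned):
    """Convert a three-letter amino acid string to one-letter codes.

    Raises ValueError when the length is not a multiple of three, which
    is the situation a split codon produces.
    """
    if len(aligned) % 3 != 0:
        raise ValueError(
            "length %d is not a multiple of three (split codon?)" % len(aligned)
        )
    return "".join(
        THREE_TO_ONE[aligned[i:i + 3]] for i in range(0, len(aligned), 3)
    )


# A fragment made of whole residues converts cleanly.
whole = three_to_one("MetLysGly")

# A fragment that ends inside a residue cannot be converted.
try:
    three_to_one("MetLysGl")
    split_ok = True
except ValueError:
    split_ok = False
```

Any real fix would have to decide how to represent the leftover letters,
which is what the alternatives below are about.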

I have thought of several alternatives:

* I saw a ThreeLetterProtein Alphabet in
https://github.com/biopython/biopython/blob/master/Bio/Alphabet/__init__.py#L136,
maybe we could use this to create Seq objects that allow partial
codons?

* Change HSPFragment to not use full Seq objects anymore (which may
require some rework on the HSP objects as well)

But I have not explored them thoroughly. I should note that Zheng Ruan's
GSoC project on Codon alignments
(http://zruanweb.com/category/gsoc.html) may prove useful as well
here.

While I don't expect the issue to pop up often (it shows up only when
exonerate is used with the protein2genome mode, out of the many modes
it has, and when the alignment hits a split codon), I feel it should be
discussed (or at least mentioned) here first, since dealing with the
issue may require some more reworking.

So I'm sorry for the late warning and missing this. I hope this is not
too late :).

Best,
Bow
-------------- next part --------------
On Wed, Aug 28, 2013 at 10:31 AM, Wibowo Arindrarto <w.arindrarto at gmail.com> wrote:
> Hi Somak,
>
>> Do you have any idea whether Bioperl based Exonerate parser can handle such cases?
>> I'm yet to try Bioperl.
>
> I tried your file with Bioperl's parser, and while it can parse the
> entire file without errors, I don't know whether all the information
> in the file (sequence, sequence coordinates) is parsed properly. But
> maybe that's just me being less familiar with Bioperl. I suggest
> posting to their mailing list
> (http://lists.open-bio.org/pipermail/bioperl-l/) or searching the list
> archive if you have any questions regarding this. The library also
> has an active community behind it.
>
>> And please feel free to forward this mail to Biopythonlist or any other discussion forum you
>> think is appropriate,
>
> Ok, thanks :).
>
>> Thanks again
>>
>> Somak Ray
>
> Best,
> Bow
>
>> ________________________________________
>> From: w.arindrarto at gmail.com [w.arindrarto at gmail.com] on behalf of Wibowo Arindrarto [bow at bow.web.id]
>> Sent: Tuesday, August 27, 2013 8:01 PM
>> To: Ray, Somak
>> Subject: Re: On parsing of exonerate output
>>
>> Hi Somak,
>>
>>> Dear Dr. Arindrarto,
>>>
>>> I came across your blog about parsing outputs from Exonerate. I have
>>> generated some files using exonerate's protein2dna model. However, when
>>> running your scripts on them I'm getting an assertion error in Python 2.7.
>>> I'm attaching two such exonerate outputs. The "Result_goodfile.txt" can
>>> be parsed by the parser whereas "Result_badfile.txt" can't be parsed.
>>>
>>> Please let me know if there's any solution to the problem.
>>>
>>> Thanks in advance
>>
>> Hmm..looking at the files, it seems that this is caused by a split
>> codon in the alignment (Results_badfile.txt, line 25). The problem is,
>> the three-letter amino acid sequence needs to be translated into a
>> single-letter amino acid sequence since Biopython could not create Seq
>> objects with three-letter amino acid codes. However, this conversion
>> means that codons that span introns (as the one on line 25) could not
>> be dealt with properly since a single fragment expects a full Seq
>> object (hence the error you're seeing;  it expects the three-letter
>> amino acid sequence length to be multiples of three).
>>
>> So the short answer is no, there is not yet an immediate solution to this issue.
>>
>> I should mention that this came at an appropriate time, though, so
>> thanks for the email :). I am reviewing known SearchIO issues and this
>> was apparently an issue that I have lost track of (checking at the
>> test suite, there is a test for this case but it has not been included
>> in the test suite).
>>
>> Do you mind if I forward this email to the Biopython list
>> (http://biopython.org/wiki/Mailing_lists)? I think other developers /
>> users may be interested in this.
>>
>> Best,
>> Bow

From p.j.a.cock at googlemail.com  Wed Aug 28 13:31:19 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 28 Aug 2013 18:31:19 +0100
Subject: [Biopython-dev] Releasing Biopython 1.62 this week?
In-Reply-To: <CAKVJ-_46o8X9==60xrB0mZCsApJZKQS9GdaxWgPVVRW05OnXdA@mail.gmail.com>
References: <CAKVJ-_4-ucrZUCkvsURVO19B2-d4d_DdxGzv2ehbbW6z5ucFsw@mail.gmail.com>
	<CAMC681n1Tv=C2BtDG-gvDGSU=e_bqGrmKSFzQzSAh8coFSKhdg@mail.gmail.com>
	<CAKVJ-_5xsn3uhONzog7-YsmN_kXyzcSpfHfytzkABcfk96m-hg@mail.gmail.com>
	<CADEGkF4nxEBwJkNh1H75mYX3ga2SJZmcHRDRCwwf7AZQ_C+4kw@mail.gmail.com>
	<CAKVJ-_46o8X9==60xrB0mZCsApJZKQS9GdaxWgPVVRW05OnXdA@mail.gmail.com>
Message-ID: <CAKVJ-_6J-j2S5rWGs-v51UK3fxXQQqvLDqqK6i85+KARX3UYrQ@mail.gmail.com>

Hello all,

I'm starting the 1.62 release process now. Getting the new DSSP
test working cross platform was more work than I expected -
thank goodness for the BuildBot server yet again :)

Please don't commit anything to the master branch until further
notice,

Thanks,

Peter

From p.j.a.cock at googlemail.com  Wed Aug 28 14:28:43 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 28 Aug 2013 19:28:43 +0100
Subject: [Biopython-dev] Biopython 1.62 release in progress
Message-ID: <CAKVJ-_72_azX9SZfN-9P6M7hT5pA8Tvi2AZZ5FFO9G+VwbPo=g@mail.gmail.com>

On Wed, Aug 28, 2013 at 6:31 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> Hello all,
>
> I'm starting the release 1.62 process now, getting the new DSSP
> test working cross platform was more work than I expected -
> thank goodness for the BuildBot server yet again :)
>
> Please don't commit anything to the master branch until further
> notice,
>
> Thanks,
>
> Peter

While I finish off the Windows installers etc, and have dinner,
would anyone like to volunteer to write a draft for the release
announcement to go out on the mailing lists and news blog?
http://news.open-bio.org/news/category/obf-projects/biopython/

These are usually based on the rather dry NEWS file information,
and the previous announcement for style/links/etc.

Thanks,

Peter

From p.j.a.cock at googlemail.com  Wed Aug 28 14:53:21 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 28 Aug 2013 19:53:21 +0100
Subject: [Biopython-dev] Post Biopython 1.62 release,
	clean-up after dropping Python 2.5
Message-ID: <CAKVJ-_4QkWKT33cutT5cpTXw7nyqs18KFj+=8gvT_S4854nUOg@mail.gmail.com>

Hello all - especially newcomers,

There are going to be several boring but useful things to do to
the Biopython code base once we're finished with Python 2.5
(the imminent release of Biopython 1.62 has been clearly
described as the final Biopython release to support it).

Some of these tasks are quite easy, and might tempt some
of our non-core contributors or new-comers to have a go,
however to avoid too much duplication of effort I'd suggest
**replying in this thread if you want to tackle anything** - and
then start working out how to send us your first pull request.

Things which will need doing:

(0) Disable the Python 2.5 and Jython 2.5 buildbot
(this will be done by me or Tiago)

(1) Disable the Python 2.5 target in TravisCI, see
https://travis-ci.org/biopython/biopython/
(this is a simple one line edit to the .travis.yml file)

(2) Remove all the with statement imports (and any
comment lines associated with them):

from __future__ import with_statement

(3) Remove Bio/_py3k/_namedtuple.py and adjust
import lines accordingly

(4) Scan over the code base looking for any comments
about Python 2.5 (e.g. using the grep command), and
reviewing them one by one to see if there is an old
workaround we can now remove.

(5) More advanced code review, for example looking
for places we can better take advantage of context
managers (with statements) for file handles.
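
As a rough illustration of tasks (2) and (5), the sketch below contrasts
an explicit open/close pair with the context manager form. The function
names and the demo file are made up for this example and are not taken
from the Biopython code base.

```python
import os
import tempfile


def read_first_line_old(path):
    # Older style: explicit open/close. If readline() raises, the
    # handle is never closed.
    handle = open(path)
    line = handle.readline()
    handle.close()
    return line


def read_first_line(path):
    # Context manager: the handle is closed even if an exception is
    # raised. From Python 2.6 onwards this needs no
    # "from __future__ import with_statement" line.
    with open(path) as handle:
        return handle.readline()


# Small demonstration on a temporary file.
fd, demo_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as out:
    out.write(">seq1\nACGT\n")
first_old = read_first_line_old(demo_path)
first = read_first_line(demo_path)
os.remove(demo_path)
```

Both functions return the same line; only the second guarantees the
handle is closed on every code path.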

Of this list, (1), (2) and (3) are certainly things suitable
for relative newcomers - and assuming I'm not away I
will happily do the pull request reviews.

For the more advanced issues (4) and (5) we may need
more eyes on the code...

Thank you,

Peter

From p.j.a.cock at googlemail.com  Wed Aug 28 15:01:36 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 28 Aug 2013 20:01:36 +0100
Subject: [Biopython-dev] Biopython 1.62 release in progress
In-Reply-To: <CAKVJ-_72_azX9SZfN-9P6M7hT5pA8Tvi2AZZ5FFO9G+VwbPo=g@mail.gmail.com>
References: <CAKVJ-_72_azX9SZfN-9P6M7hT5pA8Tvi2AZZ5FFO9G+VwbPo=g@mail.gmail.com>
Message-ID: <CAKVJ-_7yBM0znHk-N91mzBOe-=3gFExzO9N4dXaBnaW7uWzG3Q@mail.gmail.com>

On Wed, Aug 28, 2013 at 7:28 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> On Wed, Aug 28, 2013 at 6:31 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
>> Hello all,
>>
>> I'm starting the release 1.62 process now, getting the new DSSP
>> test working cross platform was more work than I expected -
>> thank goodness for the BuildBot server yet again :)
>>
>> Please don't commit anything to the master branch until further
>> notice,
>>
>> Thanks,
>>
>> Peter
>
> While I finish off the Windows installers etc, and have dinner,
> would anyone like to volunteer to write a draft for the release
> announcement to go out on the mailing lists and news blog?
> http://news.open-bio.org/news/category/obf-projects/biopython/
>
> These are usually based on the rather dry NEWS file information,
> and the previous announcement for style/links/etc.
>
> Thanks,
>
> Peter

A provisional tar-ball, zip file, and four Windows installers are
up now (but deliberately not yet listed on the download wiki page):
http://biopython.org/DIST/

If anyone would care to sanity test those in the next hour or two,
that would be great.

Thanks,

Peter

From p.j.a.cock at googlemail.com  Wed Aug 28 16:43:58 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 28 Aug 2013 21:43:58 +0100
Subject: [Biopython-dev] Post Biopython 1.62 release,
	clean-up after dropping Python 2.5
In-Reply-To: <CAKVJ-_4QkWKT33cutT5cpTXw7nyqs18KFj+=8gvT_S4854nUOg@mail.gmail.com>
References: <CAKVJ-_4QkWKT33cutT5cpTXw7nyqs18KFj+=8gvT_S4854nUOg@mail.gmail.com>
Message-ID: <CAKVJ-_5Ybyo7oi5atbjm9fyFjNZiP635WDX_V-cKwt+nBo517Q@mail.gmail.com>

On Wed, Aug 28, 2013 at 7:53 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> Hello all - especially newcomers,
>
> There are going to be several boring but useful things to do to
> the Biopython code base once we're finished with Python 2.5
> (the imminent release of Biopython 1.62 has been clearly
> described as the final Biopython release to support it).
>
> Some of these tasks are quite easy, and might tempt some
> of our non-core contributors or new-comers to have a go,
> however to avoid too much duplication of effort I'd suggest
> **replying in this thread if you want to tackle anything** - and
> then start working out how to send us your first pull request.

I tweeted this earlier,
https://twitter.com/pjacock/status/372796602760855552

> Things which will need doing:
>
> ...
>
> (1) Disable the Python 2.5 target in TravisCI, see
> https://travis-ci.org/biopython/biopython/
> (this is a simple one line edit to the .travis.yml file)

The first easy task has been claimed already:
https://github.com/biopython/biopython/pull/226

Wayne wrote:
>> Via Twitter, I saw your note"
>> (1) Disable the Python 2.5 target in TravisCI, see
>> https://travis-ci.org/biopython/biopython/
>> (this is a simple one line edit to the .travis.yml file)"
>>
>> Turned out it really was as easy as you said.

Once the release is out, that fix can go in - thanks :)

Wayne (BCC'd), please sign up to the biopython-dev
list if you haven't already:

http://lists.open-bio.org/mailman/listinfo/biopython-dev

Thank you,

Peter

From arklenna at gmail.com  Wed Aug 28 16:57:10 2013
From: arklenna at gmail.com (Lenna Peterson)
Date: Wed, 28 Aug 2013 16:57:10 -0400
Subject: [Biopython-dev] Post Biopython 1.62 release,
 clean-up after dropping Python 2.5
In-Reply-To: <CAKVJ-_4QkWKT33cutT5cpTXw7nyqs18KFj+=8gvT_S4854nUOg@mail.gmail.com>
References: <CAKVJ-_4QkWKT33cutT5cpTXw7nyqs18KFj+=8gvT_S4854nUOg@mail.gmail.com>
Message-ID: <CAHQkFdeLZSrX4MVJhNvnLOoeX_1_b+nuX9ULKBm-fz-y=hRbsQ@mail.gmail.com>

On Wed, Aug 28, 2013 at 2:53 PM, Peter Cock <p.j.a.cock at googlemail.com>wrote:

>
> (2) Remove all the with statement imports (and any
> comment lines associated with them):
>
> from __future__ import with_statement
>

As I demonstrated, I regularly forget that `with` is "new"!


>
> (4) Scan over the code base looking for any comments
> about Python 2.5 (e.g. using the grep command), and
> reviewing them one by one to see if there is an old
> workaround we can now remove.
>

If I count:

    find Bio -name "*.py" -exec grep -H -n ".*#.*2\.5" {} \;

I only see 24 - not too bad. Many are `with` related.


>
> (5) More advanced code review, for example looking
> for places we can better take advantage of context
> managers (with statements) for file handles.
>

For this one:

    find Bio -name "*.py" -exec grep -H -n -P "= ?open\(" {} \;

I find 145...although not all `open()` statements can be easily swapped for
`with`.
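
For anyone without find/grep to hand (e.g. on Windows), the two counts
above can be approximated in Python. This is only a sketch: the function
name is made up, and the demo tree below is a stand-in for the real Bio
directory, so the numbers it gives are not the 24 and 145 quoted above.

```python
import os
import re
import shutil
import tempfile


def count_matching_lines(root, pattern):
    """Count lines in *.py files under root that match the regex."""
    regex = re.compile(pattern)
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".py"):
                continue
            path = os.path.join(dirpath, name)
            # errors="replace" avoids failing on stray non-UTF-8 bytes.
            with open(path, encoding="utf-8", errors="replace") as handle:
                total += sum(1 for line in handle if regex.search(line))
    return total


# Demo on a throwaway directory holding one two-line Python file.
demo_root = tempfile.mkdtemp()
with open(os.path.join(demo_root, "example.py"), "w") as out:
    out.write("handle = open('x.txt')  # Python 2.5 workaround\n")
    out.write("print('no match here')\n")

py25_comments = count_matching_lines(demo_root, r"#.*2\.5")
bare_opens = count_matching_lines(demo_root, r"= ?open\(")
shutil.rmtree(demo_root)
```

Run from a Biopython checkout, the equivalent calls would be
count_matching_lines("Bio", r"#.*2\.5") and
count_matching_lines("Bio", r"= ?open\(").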

I'm currently prepping for my UK trip so I may not be able to do any of
this before I get back mid-September.

Cheers,

Lenna

From p.j.a.cock at googlemail.com  Wed Aug 28 16:58:58 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 28 Aug 2013 21:58:58 +0100
Subject: [Biopython-dev] Post Biopython 1.62 release,
	clean-up after dropping Python 2.5
In-Reply-To: <CAKVJ-_5Ybyo7oi5atbjm9fyFjNZiP635WDX_V-cKwt+nBo517Q@mail.gmail.com>
References: <CAKVJ-_4QkWKT33cutT5cpTXw7nyqs18KFj+=8gvT_S4854nUOg@mail.gmail.com>
	<CAKVJ-_5Ybyo7oi5atbjm9fyFjNZiP635WDX_V-cKwt+nBo517Q@mail.gmail.com>
Message-ID: <CAKVJ-_77G9pqtJieJXSsaUuGwj-jnU4tA7JMVmgp_O1ca4qmAA@mail.gmail.com>

On Wed, Aug 28, 2013 at 9:43 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> On Wed, Aug 28, 2013 at 7:53 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
>> Hello all - especially newcomers,
>>
>> There are going to be several boring but useful things to do to
>> the Biopython code base once we're finished with Python 2.5
>> (the imminent release of Biopython 1.62 has been clearly
>> described as the final Biopython release to support it).
>>
>> Some of these tasks are quite easy, and might tempt some
>> of our non-core contributors or new-comers to have a go,
>> however to avoid too much duplication of effort I'd suggest
>> **replying in this thread if you want to tackle anything** - and
>> then start working out how to send us your first pull request.
>
> I tweeted this earlier,
> https://twitter.com/pjacock/status/372796602760855552
>
>> Things which will need doing:
>>
>> ...
>>
>> (1) Disable the Python 2.5 target in TravisCI, see
>> https://travis-ci.org/biopython/biopython/
>> (this is a simple one line edit to the .travis.yml file)
>
> The first easy task has been claimed already:
> https://github.com/biopython/biopython/pull/226

And task (2) as well on the same pull request - keen!

Wayne (BCC'd), could you delay trying task (3) for a
few days to give someone else a chance please ;)

Maybe have a look for things under (4) instead,
Lenna's quick count suggests plenty of things
need looking at...

Peter

From w.arindrarto at gmail.com  Wed Aug 28 17:17:57 2013
From: w.arindrarto at gmail.com (Wibowo Arindrarto)
Date: Wed, 28 Aug 2013 23:17:57 +0200
Subject: [Biopython-dev] Biopython 1.62 release in progress
In-Reply-To: <CAKVJ-_72_azX9SZfN-9P6M7hT5pA8Tvi2AZZ5FFO9G+VwbPo=g@mail.gmail.com>
References: <CAKVJ-_72_azX9SZfN-9P6M7hT5pA8Tvi2AZZ5FFO9G+VwbPo=g@mail.gmail.com>
Message-ID: <CADEGkF5=-reKxXf+WkLOXsJqcxA1tqMyhFKEEeDgs4p9DCQiWg@mail.gmail.com>

Hi everyone,

I've written a draft of our 1.62 release (below). I'd appreciate it if
somebody gives it another look (for typos, etc.). Also, if I miss
somebody in the contributors list, please let me know :).

---

Biopython 1.62 released
=======================


Source distributions and Windows installers for **Biopython** 1.62 are
now available from the [downloads
page](http://biopython.org/wiki/Download) on the [official Biopython
website](http://biopython.org/wiki/Main_Page) and from the [Python
Package Index (PyPI)](https://pypi.python.org/pypi/biopython).


# Python support

This is our first official release that supports Python 3.
Specifically, we tested under Python 3.3. Other versions of Python 3
may still work albeit with some issues.

We still fully support Python 2.5, 2.6, and 2.7. Support under
[Jython](http://www.jython.org/) is available for versions 2.5 and 2.7
and under [PyPy](http://pypy.org/) for versions 1.9 and 2.0. However,
unlike CPython, Jython and PyPy support is partial: NumPy and our C
extensions are not covered.

Please note that this release marks our last official support Python
2.5. Beginning from Biopython 1.63, the minimum supported Python
version will be 2.6.


# Highlights

* The translation functions will give a warning on any partial codons
(and this will probably become an error in a future release). If you
know you are dealing with partial sequences, either pad with N to
extend the sequence length to a multiple of three, or explicitly trim
the sequence.

* The handling of joins and related complex features in Genbank/EMBL
files has been changed with the introduction of a CompoundLocation
object. Previously a SeqFeature for something like a multi-exon CDS
would have a child SeqFeature (under the sub_features attribute) for
each exon. The sub_features property will still be populated for now,
but is deprecated and will in future be removed. Please consult the
examples in the help (docstrings) and Tutorial.

* Thanks to the efforts of Ben Morris, the Phylo module now supports
the file formats NeXML and CDAO. The Newick parser is also
significantly faster, and can now optionally extract bootstrap values
from the Newick comment field (like Molphy and Archaeopteryx do). Nate
Sutton added a wrapper for FastTree to Bio.Phylo.Applications.

* New module Bio.UniProt adds parsers for the GAF, GPA and GPI formats
from UniProt-GOA.

* The BioSQL module is now supported in Jython. MySQL and PostgreSQL
databases can be used. The relevant JDBC driver should be available in
the CLASSPATH.

* Feature labels on circular GenomeDiagram figures now support the
label_position argument (start, middle or end) in addition to the
current default placement, and in a change to prior releases these
labels are outside the features which is now consistent with the
linear diagrams.

* The code for parsing 3D structures in mmCIF files was updated to use
the Python standard library's shlex module instead of C code using
flex.

* The Bio.Sequencing.Applications module now includes a BWA command
line wrapper.

* Bio.motifs supports JASPAR format files with multiple
position-frequency matrices.

Additionally there have been other minor bug fixes and more unit tests.


# Contributors

Many thanks to the Biopython developers and community for making this release
possible, especially the following contributors:


Alexander Campbell (first contribution)
Andrea Rizzi (first contribution)
Anthony Mathelier (first contribution)
Ben Morris (first contribution)
Brad Chapman
Christian Brueffer
David Arenillas (first contribution)
David Martin (first contribution)
Eric Talevich
Iddo Friedberg
Jian-Long Huang (first contribution)
Joao Rodrigues
Kai Blin
Michiel de Hoon
Nate Sutton (first contribution)
Peter Cock
Petra Kubincová (first contribution)
Phillip Garland
Saket Choudhary (first contribution)
Tiago Antao
Wibowo 'Bow' Arindrarto
Xabier Bello (first contribution)

----

Best,
Bow

On Wed, Aug 28, 2013 at 8:28 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> On Wed, Aug 28, 2013 at 6:31 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
>> Hello all,
>>
>> I'm starting the release 1.62 process now, getting the new DSSP
>> test working cross platform was more work than I expected -
>> thank goodness for the BuildBot server yet again :)
>>
>> Please don't commit anything to the master branch until further
>> notice,
>>
>> Thanks,
>>
>> Peter
>
> While I finish off the Windows installers etc, and have dinner,
> would anyone like to volunteer to write a draft for the release
> announcement to go out on the mailing lists and news blog?
> http://news.open-bio.org/news/category/obf-projects/biopython/
>
> These are usually based on the rather dry NEWS file information,
> and the previous announcement for style/links/etc.
>
> Thanks,
>
> Peter
> _______________________________________________
> Biopython-dev mailing list
> Biopython-dev at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython-dev


From p.j.a.cock at googlemail.com  Wed Aug 28 17:30:33 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 28 Aug 2013 22:30:33 +0100
Subject: [Biopython-dev] Biopython 1.62 release in progress
In-Reply-To: <CADEGkF5=-reKxXf+WkLOXsJqcxA1tqMyhFKEEeDgs4p9DCQiWg@mail.gmail.com>
References: <CAKVJ-_72_azX9SZfN-9P6M7hT5pA8Tvi2AZZ5FFO9G+VwbPo=g@mail.gmail.com>
	<CADEGkF5=-reKxXf+WkLOXsJqcxA1tqMyhFKEEeDgs4p9DCQiWg@mail.gmail.com>
Message-ID: <CAKVJ-_55NdYZ1nFgyYSPetsCi=9T+UbE6z5-RBG0K_knbJeJwg@mail.gmail.com>

On Wed, Aug 28, 2013 at 10:17 PM, Wibowo Arindrarto
<w.arindrarto at gmail.com> wrote:
> Hi everyone,
>
> I've written a draft of our 1.62 release (below). I'd appreciate it if
> somebody gives it another look (for typos, etc.). Also, if I miss
> somebody in the contributors list, please let me know :).

Thanks Bow - I don't think the WordPress blog understands
markdown style markup, but bonus marks anyway :)

I'm about to update the tar-ball and zip file to include the
NEWS file updated with the two names Bow spotted as
missing - hopefully there are no more and this commit
will get the release tag:

https://github.com/biopython/biopython/commit/73f8483f23910c8205cd9a4ff1283f2747d4f4ff

(The Windows installers I prepared earlier should not be
affected as they don't include the NEWS file)

> # Python support
>
> This is our first official release that supports Python 3.
> Specifically, we tested under Python 3.3. Other versions
> of Python 3 may still work albeit with some issues.

I'd be a bit more explicit:

Specifically, this is supported under Python 3.3. Older
versions of Python 3 may still work albeit with some
issues, but are *not* supported.

> Please note that this release marks our last official support Python
> 2.5. Beginning from Biopython 1.63, the minimum supported Python
> version will be 2.6.

Minor typo, needs a for/of, e.g.

Please note that this release marks our last official support for
Python 2.5

Thanks Bow,

Peter

From w.arindrarto at gmail.com  Wed Aug 28 18:17:44 2013
From: w.arindrarto at gmail.com (Wibowo Arindrarto)
Date: Thu, 29 Aug 2013 00:17:44 +0200
Subject: [Biopython-dev] Biopython 1.62 release in progress
In-Reply-To: <CAKVJ-_55NdYZ1nFgyYSPetsCi=9T+UbE6z5-RBG0K_knbJeJwg@mail.gmail.com>
References: <CAKVJ-_72_azX9SZfN-9P6M7hT5pA8Tvi2AZZ5FFO9G+VwbPo=g@mail.gmail.com>
	<CADEGkF5=-reKxXf+WkLOXsJqcxA1tqMyhFKEEeDgs4p9DCQiWg@mail.gmail.com>
	<CAKVJ-_55NdYZ1nFgyYSPetsCi=9T+UbE6z5-RBG0K_knbJeJwg@mail.gmail.com>
Message-ID: <CADEGkF4SN5-VDSvhiRXEnKPNwdUU7vbrUCPWP-0-qfU0PtQkfg@mail.gmail.com>

Hi Peter,

> Thanks Bow - I don't think the WordPress blog understands
> markdown style markup, but bonus marks anyway :)

Ah yes, I was planning to convert it later to HTML (I find writing
markdown first easier ~ and also more mailing-list friendly).

> I'm about to update the tar-ball and zip file to include the
> NEWS file updated with the two names Bow spotted as
> missing - hopefully there are no more and this commit
> will get the release tag:
>
> https://github.com/biopython/biopython/commit/73f8483f23910c8205cd9a4ff1283f2747d4f4ff
>
> (The Windows installers I prepared earlier should not be
> affected as they don't include the NEWS file)
>
>> # Python support
>>
>> This is our first official release that supports Python 3.
>> Specifically, we tested under Python 3.3. Other versions
>> of Python 3 may still work albeit with some issues.
>
> I'd be a bit more explicit:
>
> Specifically, this is supported under Python 3.3. Older
> versions of Python 3 may still work albeit with some
> issues, but are *not* supported.
>
>> Please note that this release marks our last official support Python
>> 2.5. Beginning from Biopython 1.63, the minimum supported Python
>> version will be 2.6.
>
> Minor typo, needs a for/of, e.g.
>
> Please note that this release marks our last official support for
> Python 2.5
>
> Thanks Bow,
>
> Peter

Fixes applied, thanks too :).

Best,
Bow

From p.j.a.cock at googlemail.com  Wed Aug 28 18:21:54 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 28 Aug 2013 23:21:54 +0100
Subject: [Biopython-dev] Biopython 1.62 release in progress
In-Reply-To: <CADEGkF4SN5-VDSvhiRXEnKPNwdUU7vbrUCPWP-0-qfU0PtQkfg@mail.gmail.com>
References: <CAKVJ-_72_azX9SZfN-9P6M7hT5pA8Tvi2AZZ5FFO9G+VwbPo=g@mail.gmail.com>
	<CADEGkF5=-reKxXf+WkLOXsJqcxA1tqMyhFKEEeDgs4p9DCQiWg@mail.gmail.com>
	<CAKVJ-_55NdYZ1nFgyYSPetsCi=9T+UbE6z5-RBG0K_knbJeJwg@mail.gmail.com>
	<CADEGkF4SN5-VDSvhiRXEnKPNwdUU7vbrUCPWP-0-qfU0PtQkfg@mail.gmail.com>
Message-ID: <CAKVJ-_4CuA_OoahrMomHra7qMnbuCcEyFH1cpBNQuxsrWnXEXQ@mail.gmail.com>

On Wed, Aug 28, 2013 at 11:17 PM, Wibowo Arindrarto
<w.arindrarto at gmail.com> wrote:
> Hi Peter,
>
>> Thanks Bow - I don't think the WordPress blog understands
>> markdown style markup, but bonus marks anyway :)
>
> Ah yes, I was planning to convert it later to HTML (I find writing
> markdown first easier ~ and also more mailing-list friendly).

Thank you :)

This is live now but can be edited - so we can fix any
remaining issues before sending round the emails:
http://news.open-bio.org/news/2013/08/biopython-1-62-released/

Tagged on GitHub too,
https://github.com/biopython/biopython/tree/biopython-162

Note I have not yet pushed to PyPI - I'd like one or two
positive reports first before doing that (just in case).

Thanks all,

Peter

From p.j.a.cock at googlemail.com  Wed Aug 28 18:47:04 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 28 Aug 2013 23:47:04 +0100
Subject: [Biopython-dev] Biopython 1.62 released
Message-ID: <CAKVJ-_750vVWnooCx4zPqQf+KOBjFOzF48Zp+F66=MS6o-1c2A@mail.gmail.com>

Dear Biopythoneers,

Source distributions and Windows installers for Biopython 1.62 are now
available from the downloads page on the official Biopython website
and (soon) from the Python Package Index (PyPI).

Python support

This is our first release of Biopython which officially supports
Python 3. Specifically, this is supported under Python 3.3. Older
versions of Python 3 may still work albeit with some issues, but are
not supported.

We still fully support Python 2.5, 2.6, and 2.7. Support under Jython
is available for versions 2.5 and 2.7 and under PyPy for versions 1.9
and 2.0. However, unlike CPython, Jython and PyPy support is partial:
NumPy and our C extensions are not covered.

Please note that this release marks our last official support for
Python 2.5. Beginning from Biopython 1.63, the minimum supported
Python version will be 2.6.

Highlights

The translation functions will give a warning on any partial codons
(and this will probably become an error in a future release). If you
know you are dealing with partial sequences, either pad with "N" to
extend the sequence length to a multiple of three, or explicitly trim
the sequence.

The handling of joins and related complex features in Genbank/EMBL
files has been changed with the introduction of a CompoundLocation
object. Previously a SeqFeature for something like a multi-exon CDS
would have a child SeqFeature (under the sub_features attribute) for
each exon. The sub_features property will still be populated for now,
but is deprecated and will in future be removed. Please consult the
examples in the help (docstrings) and Tutorial.

Thanks to the efforts of Ben Morris, the Phylo module now supports the
file formats NeXML and CDAO. The Newick parser is also significantly
faster, and can now optionally extract bootstrap values from the
Newick comment field (like Molphy and Archaeopteryx do). Nate Sutton
added a wrapper for FastTree to Bio.Phylo.Applications.

New module Bio.UniProt adds parsers for the GAF, GPA and GPI formats
from UniProt-GOA.

The BioSQL module is now supported in Jython. MySQL and PostgreSQL
databases can be used. The relevant JDBC driver should be available in
the CLASSPATH.

Feature labels on circular GenomeDiagram figures now support the
label_position argument (start, middle or end) in addition to the
current default placement, and in a change to prior releases these
labels are outside the features which is now consistent with the
linear diagrams.

The code for parsing 3D structures in mmCIF files was updated to use
the Python standard library's shlex module instead of C code using
flex.

The Bio.Sequencing.Applications module now includes a BWA command
line wrapper.

Bio.motifs supports JASPAR format files with multiple
position-frequency matrices.

Additionally there have been other minor bug fixes and more unit tests.

Contributors

Many thanks to the Biopython developers and community for making this
release possible, especially the following contributors:

Alexander Campbell (first contribution)
Andrea Rizzi (first contribution)
Anthony Mathelier (first contribution)
Ben Morris (first contribution)
Brad Chapman
Christian Brueffer
David Arenillas (first contribution)
David Martin (first contribution)
Eric Talevich
Iddo Friedberg
Jian-Long Huang (first contribution)
Joao Rodrigues
Kai Blin
Lenna Peterson
Michiel de Hoon
Matsuyuki Shirota (first contribution)
Nate Sutton (first contribution)
Peter Cock
Petra Kubincová (first contribution)
Phillip Garland
Saket Choudhary (first contribution)
Tiago Antao
Wibowo 'Bow' Arindrarto
Xabier Bello (first contribution)

Thank you all.

Release announcement here (RSS feed available):
http://news.open-bio.org/news/2013/08/biopython-1-62-released/

P.S. You can follow @Biopython on Twitter
https://twitter.com/Biopython


From p.j.a.cock at googlemail.com  Thu Aug 29 05:04:59 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Thu, 29 Aug 2013 10:04:59 +0100
Subject: [Biopython-dev] Post Biopython 1.62 release,
	clean-up after dropping Python 2.5
In-Reply-To: <CAKVJ-_4QkWKT33cutT5cpTXw7nyqs18KFj+=8gvT_S4854nUOg@mail.gmail.com>
References: <CAKVJ-_4QkWKT33cutT5cpTXw7nyqs18KFj+=8gvT_S4854nUOg@mail.gmail.com>
Message-ID: <CAKVJ-_7V3mpE-d+myKfHXzmJ26UFqz24OR6wnFOfrS55=FHJLg@mail.gmail.com>

On Wed, Aug 28, 2013 at 7:53 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> Hello all - especially newcomers,
>
> There are going to be several boring but useful things to do to
> the Biopython code base once we're finished with Python 2.5
> (the imminent release of Biopython 1.62 has been clearly
> described as the final Biopython release to support it).
>
> Some of these tasks are quite easy, and might tempt some
> of our non-core contributors or new-comers to have a go,
> however to avoid too much duplication of effort I'd suggest
> **replying in this thread if you want to tackle anything** - and
> then start working out how to send us your first pull request.
>
> Things which will need doing:
>
> (0) Disable the Python 2.5 and Jython 2.5 buildbot
> (this will be done by me or Tiago)

Done.

> (1) Disable the Python 2.5 target in TravisCI, see
> https://travis-ci.org/biopython/biopython/
> (this is a simple one line edit to the .travis.yml file)

Done by Wayne,
https://github.com/biopython/biopython/commit/d134b3ae6d963b81510c40c621d640ee00b6f3de

> (2) Remove all the with statement imports (and any
> comment lines associated with them):
>
> from __future__ import with_statement

Done by Wayne,
https://github.com/biopython/biopython/commit/eeab501987de61ae5935153e1b1a0b225878cb84

> (3) Remove Bio/_py3k/_namedtuple.py and adjust
> import lines accordingly

Any new volunteer want to try this?

> (4) Scan over the code base looking for any comments
> about Python 2.5 (e.g. using the grep command), and
> reviewing them one by one to see if there is an old
> workaround we can now remove.

Lenna had a quick look, there should be some easy ones here.

> (5) More advanced code review, for example looking
> for places we can better take advantage of context
> managers (with statements) for file handles.

Another new one, related to (5), and fairly easy:

(6) Reviewing examples in the docstrings and Tutorial
where it would make sense to use a 'with' for file handles.

This should also solve many of the ResourceWarning:
unclosed file ... warnings visible running the full test
suite under Python 3, e.g. see:
http://testing.open-bio.org/biopython/builders/Linux%2064%20-%20Python%203.3/builds/298/steps/shell/logs/stdio
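
A minimal sketch of the 'with' style for file handles described in (6).
The temporary file and its FASTA-like contents are made up purely so the
example is self-contained.

```python
import os
import tempfile

# Create a small throwaway file so the sketch is self-contained.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
with open(path, "w") as handle:
    handle.write(">seq1\nACGT\n")

# Old style, where the handle stays open until close() is called:
#   handle = open(path); data = handle.read(); handle.close()

# New style: the handle is closed when the block exits, even on error.
with open(path) as handle:
    lines = handle.read().splitlines()

handle_closed = handle.closed
os.remove(path)
print(lines, handle_closed)
```

Because the handle is closed as soon as the block ends, nothing is left
for Python 3 to report as an unclosed file.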

Peter

From chris.mit7 at gmail.com  Thu Aug 29 11:20:09 2013
From: chris.mit7 at gmail.com (Chris Mitchell)
Date: Thu, 29 Aug 2013 11:20:09 -0400
Subject: [Biopython-dev] Post Biopython 1.62 release,
 clean-up after dropping Python 2.5
In-Reply-To: <CAKVJ-_7V3mpE-d+myKfHXzmJ26UFqz24OR6wnFOfrS55=FHJLg@mail.gmail.com>
References: <CAKVJ-_4QkWKT33cutT5cpTXw7nyqs18KFj+=8gvT_S4854nUOg@mail.gmail.com>
	<CAKVJ-_7V3mpE-d+myKfHXzmJ26UFqz24OR6wnFOfrS55=FHJLg@mail.gmail.com>
Message-ID: <CAK_U6OBzHMFco3Pe3y=nsvJ8=ut4AkwhPkTmHAHkaXUXuE0r4Q@mail.gmail.com>

I was going to take a stab at (3), but it seems that _namedtuple.py doesn't
exist.

Looking under _py3k as well as grep -Ri namedtuple ./*

fails to find it. I'm pulling from
https://github.com/biopython/biopython.git



From p.j.a.cock at googlemail.com  Thu Aug 29 11:30:51 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Thu, 29 Aug 2013 16:30:51 +0100
Subject: [Biopython-dev] Post Biopython 1.62 release,
 clean-up after dropping Python 2.5
In-Reply-To: <CAK_U6OBzHMFco3Pe3y=nsvJ8=ut4AkwhPkTmHAHkaXUXuE0r4Q@mail.gmail.com>
References: <CAKVJ-_4QkWKT33cutT5cpTXw7nyqs18KFj+=8gvT_S4854nUOg@mail.gmail.com>
	<CAKVJ-_7V3mpE-d+myKfHXzmJ26UFqz24OR6wnFOfrS55=FHJLg@mail.gmail.com>
	<CAK_U6OBzHMFco3Pe3y=nsvJ8=ut4AkwhPkTmHAHkaXUXuE0r4Q@mail.gmail.com>
Message-ID: <CAKVJ-_7DLe-o_-fbZhe8iXfKN5-46bMd3y59opigMQnK__Ux0A@mail.gmail.com>

On Thu, Aug 29, 2013 at 4:20 PM, Chris Mitchell <chris.mit7 at gmail.com> wrote:
> I was going to take a stab at (3), but it seems that _namedtuple.py doesn't
> exist.
>
> Looking under _py3k as well as grep -Ri namedtuple ./*
>
> fails to find it. I'm pulling from
> https://github.com/biopython/biopython.git

Oops. I wrote that email on my laptop - it was a file never checked
into source code control. Looking back it was a plan for allowing
us to use named tuples on older versions of Python. Sorry!

But I have come up with another easy task instead,

(7) Update exception style from this,

except ErrorClass, variable_name:

to this:

except ErrorClass as variable_name:

The second form is the only allowed syntax in Python 3,
but was not possible under Python 2.5.
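
A minimal sketch of the newer form in use. The function parse_count is a
hypothetical example and is not taken from the Biopython code base.

```python
def parse_count(text):
    """Return text as an int, or None if it is not a valid integer."""
    try:
        return int(text)
    except ValueError as err:  # was: except ValueError, err:
        print("Could not parse %r: %s" % (text, err))
        return None


print(parse_count("42"))
print(parse_count("forty-two"))
```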

Regards,

Peter

From chris.mit7 at gmail.com  Thu Aug 29 12:03:51 2013
From: chris.mit7 at gmail.com (Chris Mitchell)
Date: Thu, 29 Aug 2013 12:03:51 -0400
Subject: [Biopython-dev] Post Biopython 1.62 release,
 clean-up after dropping Python 2.5
In-Reply-To: <CAKVJ-_7DLe-o_-fbZhe8iXfKN5-46bMd3y59opigMQnK__Ux0A@mail.gmail.com>
References: <CAKVJ-_4QkWKT33cutT5cpTXw7nyqs18KFj+=8gvT_S4854nUOg@mail.gmail.com>
	<CAKVJ-_7V3mpE-d+myKfHXzmJ26UFqz24OR6wnFOfrS55=FHJLg@mail.gmail.com>
	<CAK_U6OBzHMFco3Pe3y=nsvJ8=ut4AkwhPkTmHAHkaXUXuE0r4Q@mail.gmail.com>
	<CAKVJ-_7DLe-o_-fbZhe8iXfKN5-46bMd3y59opigMQnK__Ux0A@mail.gmail.com>
Message-ID: <CAK_U6OCThFBe3OYVOSjKr+h1s0s5kJgijq3ie4ZkXhQX4Zc06Q@mail.gmail.com>

Sounds good. Just took care of (7), running the test suite and will send a
pull request when that passes.

Chris



From p.j.a.cock at googlemail.com  Thu Aug 29 12:20:51 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Thu, 29 Aug 2013 17:20:51 +0100
Subject: [Biopython-dev] Post Biopython 1.62 release,
 clean-up after dropping Python 2.5
In-Reply-To: <CAK_U6OCThFBe3OYVOSjKr+h1s0s5kJgijq3ie4ZkXhQX4Zc06Q@mail.gmail.com>
References: <CAKVJ-_4QkWKT33cutT5cpTXw7nyqs18KFj+=8gvT_S4854nUOg@mail.gmail.com>
	<CAKVJ-_7V3mpE-d+myKfHXzmJ26UFqz24OR6wnFOfrS55=FHJLg@mail.gmail.com>
	<CAK_U6OBzHMFco3Pe3y=nsvJ8=ut4AkwhPkTmHAHkaXUXuE0r4Q@mail.gmail.com>
	<CAKVJ-_7DLe-o_-fbZhe8iXfKN5-46bMd3y59opigMQnK__Ux0A@mail.gmail.com>
	<CAK_U6OCThFBe3OYVOSjKr+h1s0s5kJgijq3ie4ZkXhQX4Zc06Q@mail.gmail.com>
Message-ID: <CAKVJ-_5QV1G-gWO8ftqa82GTJ-YW3z1AocM-ttNO_co0c=5ZsQ@mail.gmail.com>

On Thu, Aug 29, 2013 at 5:03 PM, Chris Mitchell <chris.mit7 at gmail.com> wrote:
> Sounds good. Just took care of (7), running the test suite and will send a
> pull request when that passes.
>
> Chris

https://github.com/biopython/biopython/pull/227 looks good, but
has highlighted a bug in Scripts/debug/debug_blast_parser.py
(see my comment on GitHub).

Good work,

Peter

From p.j.a.cock at googlemail.com  Thu Aug 29 12:33:43 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Thu, 29 Aug 2013 17:33:43 +0100
Subject: [Biopython-dev] Post Biopython 1.62 release,
	clean-up after dropping Python 2.5
In-Reply-To: <CAKVJ-_7V3mpE-d+myKfHXzmJ26UFqz24OR6wnFOfrS55=FHJLg@mail.gmail.com>
References: <CAKVJ-_4QkWKT33cutT5cpTXw7nyqs18KFj+=8gvT_S4854nUOg@mail.gmail.com>
	<CAKVJ-_7V3mpE-d+myKfHXzmJ26UFqz24OR6wnFOfrS55=FHJLg@mail.gmail.com>
Message-ID: <CAKVJ-_6x9ztCVZNn-W+qpxkAWsy1Rdt_QCYwXAd-8-=nsWjSpA@mail.gmail.com>

> On Wed, Aug 28, 2013 at 7:53 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
>> Hello all - especially newcomers,
>>
>> There are going to be several boring but useful things to do to
>> the Biopython code base once we're finished with Python 2.5
>> (the imminent release of Biopython 1.62 has been clearly
>> described as the final Biopython release to support it).
>>
>> Some of these tasks are quite easy, and might tempt some
>> of our non-core contributors or new-comers to have a go,
>> however to avoid too much duplication of effort I'd suggest
>> **replying in this thread if you want to tackle anything** - and
>> then start working out how to send us your first pull request.
>>
>> Things which will need doing:
>>
>> (0) Disable the Python 2.5 and Jython 2.5 buildbot
>> (this will be done by me or Tiago)
>
> Done.
>
>> (1) Disable the Python 2.5 target in TravisCI, see
>> https://travis-ci.org/biopython/biopython/
>> (this is a simple one line edit to the .travis.yml file)
>
> Done by Wayne,
> https://github.com/biopython/biopython/commit/d134b3ae6d963b81510c40c621d640ee00b6f3de
>
>> (2) Remove all the with statement imports (and any
>> comment lines associated with them):
>>
>> from __future__ import with_statement
>
> Done by Wayne,
> https://github.com/biopython/biopython/commit/eeab501987de61ae5935153e1b1a0b225878cb84
>
>> (3) Remove Bio/_py3k/_namedtuple.py and adjust
>> import lines accordingly

(3) was a false alarm, just an old file on my laptop confusing me.

>> (4) Scan over the code base looking for any comments
>> about Python 2.5 (e.g. using the grep command), and
>> reviewing them one by one to see if there is an old
>> workaround we can now remove.
>
> Lenna had a quick look, there should be some easy ones here.
>
>> (5) More advanced code review, for example looking
>> for places we can better take advantage of context
>> managers (with statements) for file handles.
>
> Another new one, related to (5), and fairly easy:
>
> (6) Reviewing examples in the docstrings and Tutorial
> where it would make sense to use a 'with' for file handles.
>
> This should also solve many of the ResourceWarning:
> unclosed file ... warnings visible running the full test
> suite under Python 3, e.g. see:
> http://testing.open-bio.org/biopython/builders/Linux%2064%20-%20Python%203.3/builds/298/steps/shell/logs/stdio

On Thu, Aug 29, 2013 at 11:30 AM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> ... I have come up with another easy task instead,
>
> (7) Update exception style from this,
>
> except ErrorClass, variable_name:
>
> to this:
>
> except ErrorClass as variable_name:
>
> The second form is the only allowed syntax in Python 3,
> but was not possible under Python 2.5.

(7) is being tackled by Chris Mitchell,
https://github.com/biopython/biopython/pull/227

Here's another fairly easy task for another new volunteer?:

(8) Excluding doctests and the Tutorial, use print function
rather than print statement. e.g. replace this:

print variable1, variable2

with this:

from __future__ import print_function
...
print(variable1, variable2)

Note that I am deliberately not suggesting we switch the
user visible examples on our documentation yet - that
deserves some discussion first.
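
A minimal sketch of a module converted as in (8). The Collector class is
a hypothetical helper used only to capture output for the example; the
__future__ import has to be the first statement in the module.

```python
from __future__ import print_function  # no-op on Python 3, needed on 2.6/2.7


class Collector(object):
    """Hypothetical file-like object that records what is written to it."""

    def __init__(self):
        self.parts = []

    def write(self, text):
        self.parts.append(text)


variable1, variable2 = "alpha", 42

# Equivalent of the old statement form: print variable1, variable2
print(variable1, variable2)

# The function form also accepts keyword arguments such as sep and file.
collector = Collector()
print(variable1, variable2, sep="\t", file=collector)
captured = "".join(collector.parts)
```

The same source should then behave identically under Python 2.6+ and
Python 3 without being passed through 2to3.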

Peter

From p.j.a.cock at googlemail.com  Thu Aug 29 13:03:24 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Thu, 29 Aug 2013 18:03:24 +0100
Subject: [Biopython-dev] Python 2.6+ support for __dir__ method
Message-ID: <CAKVJ-_7gUKDAqW-B7+eFCjVX=3LZYvUbELfCgjOWzPX8TCCUhQ@mail.gmail.com>

Hi all,

I was reading over the list of what's new in Python 2.6 and wondered about this:

> The built-in dir() function now checks for a __dir__() method on the
> objects it receives. This method must return a list of strings containing
> the names of valid attributes for the object, and lets the object control
> the value that dir() produces. Objects that have __getattr__() or
> __getattribute__() methods can use this to advertise pseudo-attributes
> they will honor. (issue 1591665)

http://docs.python.org/2/whatsnew/2.6.html

Does that sound useful for some of our more dynamic objects?
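
A minimal sketch of the idea, assuming a toy class that exposes
dictionary entries as attributes. Record and its fields are hypothetical
and do not correspond to any existing Biopython object.

```python
class Record(object):
    """Toy object exposing dictionary keys as read-only attributes."""

    def __init__(self, **fields):
        self._fields = dict(fields)

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        try:
            return self.__dict__["_fields"][name]
        except KeyError:
            raise AttributeError(name)

    def __dir__(self):
        # Advertise the pseudo-attributes alongside the regular ones.
        names = set(dir(type(self))) | set(self.__dict__) | set(self._fields)
        return sorted(names)


rec = Record(organism="E. coli", length=4641652)
print(rec.organism)
print("organism" in dir(rec))
```

With __dir__ defined, the pseudo-attributes show up in dir() and so in
tab completion at an interactive prompt.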

Peter

From arklenna at gmail.com  Thu Aug 29 13:18:16 2013
From: arklenna at gmail.com (Lenna Peterson)
Date: Thu, 29 Aug 2013 13:18:16 -0400
Subject: [Biopython-dev] Post Biopython 1.62 release,
 clean-up after dropping Python 2.5
In-Reply-To: <CAKVJ-_6x9ztCVZNn-W+qpxkAWsy1Rdt_QCYwXAd-8-=nsWjSpA@mail.gmail.com>
References: <CAKVJ-_4QkWKT33cutT5cpTXw7nyqs18KFj+=8gvT_S4854nUOg@mail.gmail.com>
	<CAKVJ-_7V3mpE-d+myKfHXzmJ26UFqz24OR6wnFOfrS55=FHJLg@mail.gmail.com>
	<CAKVJ-_6x9ztCVZNn-W+qpxkAWsy1Rdt_QCYwXAd-8-=nsWjSpA@mail.gmail.com>
Message-ID: <CAHQkFdfhJsRGGq=BPpZDh=rS8UtJcqA0ELSqHwjGvPqb5=GG=g@mail.gmail.com>

On Thu, Aug 29, 2013 at 12:33 PM, Peter Cock <p.j.a.cock at googlemail.com>wrote:

>
> Here's another fairly easy task for another new volunteer?:
>
> (8) Excluding doctests and the Tutorial, use print function
> rather than print statement. e.g. replace this:
>
> print variable1, variable2
>
> with this:
>
> from __future__ import print_function
> ...
> print(variable1, variable2)
>
> Note that I am deliberately not suggesting we switch the
> user visible examples on our documentation yet - that
> deserves some discussion first.
>
>
From the docs:  "When using the 2to3 source-to-source conversion tool, all
print statements are automatically converted to print() function calls, so
this is mostly a non-issue for larger projects."

http://docs.python.org/3.0/whatsnew/3.0.html#print-is-a-function

Which suggests either doing it with the tool or just waiting until the full
3.0 changeover?

From p.j.a.cock at googlemail.com  Thu Aug 29 13:35:16 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Thu, 29 Aug 2013 18:35:16 +0100
Subject: [Biopython-dev] Post Biopython 1.62 release,
	clean-up after dropping Python 2.5
In-Reply-To: <CAHQkFdfhJsRGGq=BPpZDh=rS8UtJcqA0ELSqHwjGvPqb5=GG=g@mail.gmail.com>
References: <CAKVJ-_4QkWKT33cutT5cpTXw7nyqs18KFj+=8gvT_S4854nUOg@mail.gmail.com>
	<CAKVJ-_7V3mpE-d+myKfHXzmJ26UFqz24OR6wnFOfrS55=FHJLg@mail.gmail.com>
	<CAKVJ-_6x9ztCVZNn-W+qpxkAWsy1Rdt_QCYwXAd-8-=nsWjSpA@mail.gmail.com>
	<CAHQkFdfhJsRGGq=BPpZDh=rS8UtJcqA0ELSqHwjGvPqb5=GG=g@mail.gmail.com>
Message-ID: <CAKVJ-_5XLYbe0Axooob=wr-ynCGRPL7dR7bpHjvd6n5CALAAow@mail.gmail.com>

On Thursday, August 29, 2013, Lenna Peterson wrote:

>
>
> On Thu, Aug 29, 2013 at 12:33 PM, Peter Cock <p.j.a.cock at googlemail.com>
> wrote:
>
>>
>> Here's another fairly easy task for another new volunteer?:
>>
>> (8) Excluding doctests and the Tutorial, use print function
>> rather than print statement. e.g. replace this:
>>
>> print variable1, variable2
>>
>> with this:
>>
>> from __future__ import print_function
>> ...
>> print(variable1, variable2)
>>
>> Note that I am deliberately not suggesting we switch the
>> user visible examples on our documentation yet - that
>> deserves some discussion first.
>>
>>
> From the docs:  "When using the 2to3 source-to-source conversion tool, all
> print statements are automatically converted to print() function calls, so
> this is mostly a non-issue for larger projects."
>
> http://docs.python.org/3.0/whatsnew/3.0.html#print-is-a-function
>
> Which suggests either doing it with the tool or just waiting until the
> full 3.0 changeover?
>

My motivation is a step towards a single codebase for both
Python 2 and Python 3 without needing 2to3, see:

http://lists.open-bio.org/pipermail/biopython-dev/2013-May/010633.html
http://www.slideshare.net/pjacock/biopython-update-bosc2013/

Peter

From superbobry at gmail.com  Thu Aug 29 16:34:59 2013
From: superbobry at gmail.com (Sergei Lebedev)
Date: Fri, 30 Aug 2013 00:34:59 +0400
Subject: [Biopython-dev] Post Biopython 1.62 release,
 clean-up after dropping Python 2.5
In-Reply-To: <CAKVJ-_6x9ztCVZNn-W+qpxkAWsy1Rdt_QCYwXAd-8-=nsWjSpA@mail.gmail.com>
References: <CAKVJ-_4QkWKT33cutT5cpTXw7nyqs18KFj+=8gvT_S4854nUOg@mail.gmail.com>
	<CAKVJ-_7V3mpE-d+myKfHXzmJ26UFqz24OR6wnFOfrS55=FHJLg@mail.gmail.com>
	<CAKVJ-_6x9ztCVZNn-W+qpxkAWsy1Rdt_QCYwXAd-8-=nsWjSpA@mail.gmail.com>
Message-ID: <CAFQn-391vEPbtq-SUvBz6uv1jjQ1gWRZagTat2+kKB+nOko89g@mail.gmail.com>

On Thu, Aug 29, 2013 at 8:33 PM, Peter Cock <p.j.a.cock at googlemail.com>wrote:

> Here's another fairly easy task for another new volunteer?:
>
> (8) Excluding doctests and the Tutorial, use print function
> rather than print statement. e.g. replace this:
>
> print variable1, variable2
>
> with this:
>
> from __future__ import print_function
> ...
> print(variable1, variable2)
>
> Note that I am deliberately not suggesting we switch the
> user visible examples on our documentation yet - that
> deserves some discussion first.


So the task is to remove print statement from the code only, right? I think
I can do this, should I use a separate branch?

Sergei

From p.j.a.cock at googlemail.com  Thu Aug 29 16:44:49 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Thu, 29 Aug 2013 21:44:49 +0100
Subject: [Biopython-dev] Post Biopython 1.62 release,
 clean-up after dropping Python 2.5
In-Reply-To: <CAFQn-391vEPbtq-SUvBz6uv1jjQ1gWRZagTat2+kKB+nOko89g@mail.gmail.com>
References: <CAKVJ-_4QkWKT33cutT5cpTXw7nyqs18KFj+=8gvT_S4854nUOg@mail.gmail.com>
	<CAKVJ-_7V3mpE-d+myKfHXzmJ26UFqz24OR6wnFOfrS55=FHJLg@mail.gmail.com>
	<CAKVJ-_6x9ztCVZNn-W+qpxkAWsy1Rdt_QCYwXAd-8-=nsWjSpA@mail.gmail.com>
	<CAFQn-391vEPbtq-SUvBz6uv1jjQ1gWRZagTat2+kKB+nOko89g@mail.gmail.com>
Message-ID: <CAKVJ-_4P5SfZKHuO9=5CjBP_ya73-jS1OQ+kAJLX8D5XJj3CzA@mail.gmail.com>

On Thu, Aug 29, 2013 at 9:34 PM, Sergei Lebedev <superbobry at gmail.com> wrote:
> On Thu, Aug 29, 2013 at 8:33 PM, Peter Cock <p.j.a.cock at googlemail.com>
> wrote:
>>
>> Here's another fairly easy task for another new volunteer?:
>>
>> (8) Excluding doctests and the Tutorial, use print function
>> rather than print statement. e.g. replace this:
>>
>> print variable1, variable2
>>
>> with this:
>>
>> from __future__ import print_function
>> ...
>> print(variable1, variable2)
>>
>> Note that I am deliberately not suggesting we switch the
>> user visible examples on our documentation yet - that
>> deserves some discussion first.
>
>
> So the task is to remove print statement from the code only, right?

Replacing them with print functions, and testing this
worked OK under both Python 2 and Python 3, yes :)

> I think I can do this, should I use a separate branch?
>
> Sergei

Yes, I would certainly recommend keeping the
default 'master' branch as a copy of the official one,
and creating a new 'print-function' branch (or whatever
name you prefer) for this work.

We probably need to improve this wiki page - so any
comments about what is unclear would be great (on
a new email thread): http://biopython.org/wiki/GitUsage

Thanks,

Peter

From p.j.a.cock at googlemail.com  Fri Aug 30 06:49:23 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 30 Aug 2013 11:49:23 +0100
Subject: [Biopython-dev] Post Biopython 1.62 release,
	clean-up after dropping Python 2.5
In-Reply-To: <CAKVJ-_6x9ztCVZNn-W+qpxkAWsy1Rdt_QCYwXAd-8-=nsWjSpA@mail.gmail.com>
References: <CAKVJ-_4QkWKT33cutT5cpTXw7nyqs18KFj+=8gvT_S4854nUOg@mail.gmail.com>
	<CAKVJ-_7V3mpE-d+myKfHXzmJ26UFqz24OR6wnFOfrS55=FHJLg@mail.gmail.com>
	<CAKVJ-_6x9ztCVZNn-W+qpxkAWsy1Rdt_QCYwXAd-8-=nsWjSpA@mail.gmail.com>
Message-ID: <CAKVJ-_4HPwAp=6L9rfvV69bOuq-Z5vwy2m_CD2_ROiBguJDLHA@mail.gmail.com>

Hello Biopythoneers,

I've outlined another relatively simple improvement for potential
new contributors to try below....

On Thu, Aug 29, 2013 at 5:33 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
>> On Wed, Aug 28, 2013 at 7:53 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
>>> Hello all - especially newcomers,
>>>
>>> There are going to be several boring but useful things to do to
>>> the Biopython code base once we're finished with Python 2.5
>>> (the imminent release of Biopython 1.62 has been clearly
>>> described as the final Biopython release to support it).
>>>
>>> ...
>>>
>>> (4) Scan over the code base looking for any comments
>>> about Python 2.5 (e.g. using the grep command), and
>>> reviewing them one by one to see if there is an old
>>> workaround we can now remove.
>>
>> Lenna had a quick look, there should be some easy ones here.
>>
>>> (5) More advanced code review, for example looking
>>> for places we can better take advantage of context
>>> managers (with statements) for file handles.
>>
>> Another new one, related to (5), and fairly easy:
>>
>> (6) Reviewing examples in the docstrings and Tutorial
>> where it would make sense to use a 'with' for file handles.
>>
>> This should also solve many of the ResourceWarning:
>> unclosed file ... warnings visible running the full test
>> suite under Python 3, e.g. see:
>> http://testing.open-bio.org/biopython/builders/Linux%2064%20-%20Python%203.3/builds/298/steps/shell/logs/stdio
>
> On Thu, Aug 29, 2013 at 11:30 AM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
>> ... I have come up with another easy task instead,
>>
>> (7) Update exception style

(7) was done by Chris Mitchell,
https://github.com/biopython/biopython/commit/1d42f4dc07c8203a162d635b9bca5acb90204942

> (8) Excluding doctests and the Tutorial, use print function
> rather than print statement. e.g. replace this:

(8) is being looked at by Sergei Lebedev.

----

Here's another idea, under the general issue (5) of taking
advantage of context managers (with statements), which
I would judge to be fairly easy (but not trivial).

(9) Use context managers (with statements) for temporary
warning filters in the unit tests.

Currently many of our unit tests add simple filters to ignore
a warning, and then restore the old filters using pop(). This
mostly works, but is fragile and the filter list is global so this
can have strange side effects. See:

$ grep "warnings." Tests/*.py

The idea here is to replace this:

warnings.simplefilter('ignore', PDBConstructionWarning)
#some code which may trigger the warning
warnings.filters.pop()

with this:

with warnings.catch_warnings():
    warnings.simplefilter("ignore", PDBConstructionWarning)
    #some code which may trigger the warning

Note the indentation - these changes will not give nice
clean diffs, so this will not be so easy to review.

I would therefore suggest editing just one test file at a
time (i.e. limit each commit to changing a single file), as
that makes it easier to selectively apply your changes

Please make sure you test this under Python 2.6, which is most
likely to have problems with this "new" style ;)
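
A minimal self-contained sketch of the context manager pattern from (9).
DemoWarning is a hypothetical stand-in for a warning class such as
PDBConstructionWarning, and noisy() stands in for the code under test.

```python
import warnings


class DemoWarning(UserWarning):
    """Hypothetical stand-in for a warning such as PDBConstructionWarning."""


def noisy():
    """Emit a DemoWarning and return a value, like code under test might."""
    warnings.warn("something looks odd", DemoWarning)
    return "result"


filters_before = list(warnings.filters)

with warnings.catch_warnings():
    warnings.simplefilter("ignore", DemoWarning)
    value = noisy()  # the warning is silenced inside the block

# On exit the previous filter list is restored, so no pop() is needed.
filters_restored = warnings.filters == filters_before
print(value, filters_restored)
```

The filter added inside the block cannot leak into later tests, which is
the fragility the pop() approach suffers from.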

(Again, if anyone plans to work on this, please let the list
know to minimise duplicated effort.)

If you're not familiar with our test suite, there is a chapter
introducing this in the main Tutorial & Cookbook,
http://biopython.org/DIST/docs/tutorial/Tutorial.html

Thanks,

Peter

From superbobry at gmail.com  Fri Aug 30 08:58:31 2013
From: superbobry at gmail.com (Sergei Lebedev)
Date: Fri, 30 Aug 2013 16:58:31 +0400
Subject: [Biopython-dev] Post Biopython 1.62 release,
 clean-up after dropping Python 2.5
In-Reply-To: <CAKVJ-_4P5SfZKHuO9=5CjBP_ya73-jS1OQ+kAJLX8D5XJj3CzA@mail.gmail.com>
References: <CAKVJ-_4QkWKT33cutT5cpTXw7nyqs18KFj+=8gvT_S4854nUOg@mail.gmail.com>
	<CAKVJ-_7V3mpE-d+myKfHXzmJ26UFqz24OR6wnFOfrS55=FHJLg@mail.gmail.com>
	<CAKVJ-_6x9ztCVZNn-W+qpxkAWsy1Rdt_QCYwXAd-8-=nsWjSpA@mail.gmail.com>
	<CAFQn-391vEPbtq-SUvBz6uv1jjQ1gWRZagTat2+kKB+nOko89g@mail.gmail.com>
	<CAKVJ-_4P5SfZKHuO9=5CjBP_ya73-jS1OQ+kAJLX8D5XJj3CzA@mail.gmail.com>
Message-ID: <CAFQn-38TZHM6geBZkwmL5Rkg8tWEUxTV-Yvj_Dw_04UxLNy58A@mail.gmail.com>

> (8) Excluding doctests and the Tutorial, use print function
> rather than print statement. e.g. replace this:
Unfortunately we cannot exclude doctests, because 'from __future__' import
is module wide, thus the 'doctest.testmod()' will raise a SyntaxError on
docstrings with print statement.

Sergei



From p.j.a.cock at googlemail.com  Fri Aug 30 09:14:14 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 30 Aug 2013 14:14:14 +0100
Subject: [Biopython-dev] Post Biopython 1.62 release,
 clean-up after dropping Python 2.5
In-Reply-To: <CAFQn-38TZHM6geBZkwmL5Rkg8tWEUxTV-Yvj_Dw_04UxLNy58A@mail.gmail.com>
References: <CAKVJ-_4QkWKT33cutT5cpTXw7nyqs18KFj+=8gvT_S4854nUOg@mail.gmail.com>
	<CAKVJ-_7V3mpE-d+myKfHXzmJ26UFqz24OR6wnFOfrS55=FHJLg@mail.gmail.com>
	<CAKVJ-_6x9ztCVZNn-W+qpxkAWsy1Rdt_QCYwXAd-8-=nsWjSpA@mail.gmail.com>
	<CAFQn-391vEPbtq-SUvBz6uv1jjQ1gWRZagTat2+kKB+nOko89g@mail.gmail.com>
	<CAKVJ-_4P5SfZKHuO9=5CjBP_ya73-jS1OQ+kAJLX8D5XJj3CzA@mail.gmail.com>
	<CAFQn-38TZHM6geBZkwmL5Rkg8tWEUxTV-Yvj_Dw_04UxLNy58A@mail.gmail.com>
Message-ID: <CAKVJ-_7iKPcb2W3NWPaW-9cK2nPC8kciQE_n9FtsQ4emxeXQmw@mail.gmail.com>

On Fri, Aug 30, 2013 at 1:58 PM, Sergei Lebedev <superbobry at gmail.com> wrote:
>> (8) Excluding doctests and the Tutorial, use print function
>> rather than print statement. e.g. replace this:
>
> Unfortunately we cannot exclude doctests, because 'from __future__' import
> is module wide, thus the 'doctest.testmod()' will raise a SyntaxError on
> docstrings with print statement.
>
> Sergei

Could you clarify this? Does this cause a problem via:

[Tests]$ python run_tests.py doctest

If you have a small example, copy & paste the "git diff" output here.

Peter

From superbobry at gmail.com  Fri Aug 30 09:28:50 2013
From: superbobry at gmail.com (Sergei Lebedev)
Date: Fri, 30 Aug 2013 17:28:50 +0400
Subject: [Biopython-dev] =?utf-8?q?_Re=3A__Post_Biopython_1=2E62_release?=
 =?utf-8?q?=2C_clean-up_after_dropping_Python_2=2E5?=
In-Reply-To: <CAKVJ-_7iKPcb2W3NWPaW-9cK2nPC8kciQE_n9FtsQ4emxeXQmw@mail.gmail.com>
References: <CAKVJ-_4QkWKT33cutT5cpTXw7nyqs18KFj+=8gvT_S4854nUOg@mail.gmail.com>
	<CAKVJ-_7V3mpE-d+myKfHXzmJ26UFqz24OR6wnFOfrS55=FHJLg@mail.gmail.com>
	<CAKVJ-_6x9ztCVZNn-W+qpxkAWsy1Rdt_QCYwXAd-8-=nsWjSpA@mail.gmail.com>
	<CAFQn-391vEPbtq-SUvBz6uv1jjQ1gWRZagTat2+kKB+nOko89g@mail.gmail.com>
	<CAKVJ-_4P5SfZKHuO9=5CjBP_ya73-jS1OQ+kAJLX8D5XJj3CzA@mail.gmail.com>
	<CAFQn-38TZHM6geBZkwmL5Rkg8tWEUxTV-Yvj_Dw_04UxLNy58A@mail.gmail.com>
	<CAKVJ-_7iKPcb2W3NWPaW-9cK2nPC8kciQE_n9FtsQ4emxeXQmw@mail.gmail.com>
Message-ID: <etPan.52209e12.6b8b4567.5796@yooki.labs.intellij.net>

Sure, a common pattern for a lot of BioPython modules seems to be:

    # +from __future__ import print_function


    def foo():
        """A docstring with print statement.

        >>> print "foo"
        foo
        """
        print "Running foo ..."
        # +print("Running foo ...")


    if __name__ == "__main__":
        import doctest
        doctest.testmod()

where foo is some function, which uses print statement in its body. Since we
want to switch from print statements to print function we replace print
"Running foo ..." with a print() call and add from __future__ import ... to
the beginning of the module.

What happens if we try to run the doctests after we've switched to
print_function?

    $ python /tmp/foo.py
    **********************************************************************
    File "/tmp/foo.py", line 7, in __main__.foo
    Failed example:
        print "foo"
    Exception raised:
        Traceback (most recent call last):
          File ".../doctest.py", line 1254, in __run
            compileflags, 1) in test.globs
          File "<doctest __main__.foo[0]>", line 1
            print "foo"
                      ^
        SyntaxError: invalid syntax
    **********************************************************************
    1 items had failures:
       1 of   1 in __main__.foo
    ***Test Failed*** 1 failures.

So, enabling print_function makes doctests that use the print statement fail
with a SyntaxError, as shown by the example above. Thus, if we want to get rid
of the print statement in the code, we have no choice but to do the same in
the doctests.
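
The effect can be reproduced without touching any Biopython module. The
following is only a sketch under stated assumptions: the helper name
count_failures and the two sample docstrings are made up for illustration.
It relies on doctest deriving its compile flags from the globals an example
runs in, so a print_function entry there stands in for the module-wide
__future__ import.

```python
from __future__ import print_function

import doctest

# Hypothetical sample docstrings, one per print style.
OLD_STYLE = '''
>>> print "foo"
foo
'''

NEW_STYLE = '''
>>> print("foo")
foo
'''


def count_failures(docstring):
    """Run one docstring as a doctest with print_function in effect."""
    # Putting print_function in the globals mimics a module that has done
    # the __future__ import; doctest picks its compile flags up from there.
    globs = {"print_function": print_function}
    test = doctest.DocTestParser().get_doctest(
        docstring, globs, "example", None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    # Discard the failure report; only the counts matter here.
    result = runner.run(test, out=lambda text: None)
    return result.failed
```

Calling count_failures(OLD_STYLE) reports a failure (the SyntaxError is
counted as a failed example), while count_failures(NEW_STYLE) reports none.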

Sergei



From p.j.a.cock at googlemail.com  Fri Aug 30 10:22:26 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 30 Aug 2013 15:22:26 +0100
Subject: [Biopython-dev] Post Biopython 1.62 release,
 clean-up after dropping Python 2.5
In-Reply-To: <etPan.52209e12.6b8b4567.5796@yooki.labs.intellij.net>
References: <CAKVJ-_4QkWKT33cutT5cpTXw7nyqs18KFj+=8gvT_S4854nUOg@mail.gmail.com>
	<CAKVJ-_7V3mpE-d+myKfHXzmJ26UFqz24OR6wnFOfrS55=FHJLg@mail.gmail.com>
	<CAKVJ-_6x9ztCVZNn-W+qpxkAWsy1Rdt_QCYwXAd-8-=nsWjSpA@mail.gmail.com>
	<CAFQn-391vEPbtq-SUvBz6uv1jjQ1gWRZagTat2+kKB+nOko89g@mail.gmail.com>
	<CAKVJ-_4P5SfZKHuO9=5CjBP_ya73-jS1OQ+kAJLX8D5XJj3CzA@mail.gmail.com>
	<CAFQn-38TZHM6geBZkwmL5Rkg8tWEUxTV-Yvj_Dw_04UxLNy58A@mail.gmail.com>
	<CAKVJ-_7iKPcb2W3NWPaW-9cK2nPC8kciQE_n9FtsQ4emxeXQmw@mail.gmail.com>
	<etPan.52209e12.6b8b4567.5796@yooki.labs.intellij.net>
Message-ID: <CAKVJ-_5+0GeN5p5d9tvf+BxzHC+6DUnamwNJ6_N61STWa0TLFQ@mail.gmail.com>

Thanks Sergei - that clarified things.

Unfortunately this doesn't just break our convenience __main__ trick for
running the doctests in any single module, it also breaks doing it via:

$ python run_tests.py doctest

This means we'd have to update the doctests to also use Python 3
style print functions... which may be premature (we'll need to do
this at some point though).

How about the less ambitious plan of replacing lines like this:

print variable

with:

print(variable)

This will be understood as a print function call on Python 3 (and work),
and will also work on Python 2 (without the future import) where it will
be parsed as redundant parentheses.

Note you can't use this trick where more than one variable is printed,
because then on Python 2 the brackets will create a tuple instead.
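
One portable way around the tuple problem, offered here as an assumption
rather than anything agreed on the list, is to build a single string first
so that print always receives exactly one argument. The function name
format_pair is hypothetical.

```python
def format_pair(name, length):
    """Return one string so that print() is given a single argument."""
    return "%s %i" % (name, length)


# Python 3 parses this as a function call with one argument; Python 2
# (without the __future__ import) parses it as a print statement with
# redundant parentheses. Both show the same text.
print(format_pair("seq1", 42))

# By contrast, print("seq1", 42) would show the tuple ('seq1', 42) on
# Python 2 unless print_function has been imported from __future__.
```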

Peter


On Fri, Aug 30, 2013 at 2:28 PM, Sergei Lebedev <superbobry at gmail.com> wrote:
> Sure, a common pattern for a lot of BioPython modules seems to be:
>
>     # +from __future__ import print_function
>
>
>     def foo():
>         """A docstring with print statement.
>
>         >>> print "foo"
>         foo
>         """
>         print "Running foo ..."
>         # +print("Running foo ...")
>
>
>     if __name__ == "__main__":
>         import doctest
>         doctest.testmod()
>
> where foo is some function, which uses print statement in its body. Since we
> want to switch from print statements to print function we replace print
> "Running foo ..." with a print() call and add from __future__ import ... to
> the beginning of the module.
>
> What happens if we try to run the doctests after we've switched to
> print_function?
>
>     $ python /tmp/foo.py
>     **********************************************************************
>     File "/tmp/foo.py", line 7, in __main__.foo
>     Failed example:
>         print "foo"
>     Exception raised:
>         Traceback (most recent call last):
>           File ".../doctest.py", line 1254, in __run
>             compileflags, 1) in test.globs
>           File "<doctest __main__.foo[0]>", line 1
>             print "foo"
>                       ^
>         SyntaxError: invalid syntax
>     **********************************************************************
>     1 items had failures:
>        1 of   1 in __main__.foo
>     ***Test Failed*** 1 failures.
>
> So, enabling print_function makes doctests using print statement fail with a
> SyntaxError, as shown by the example above. Thus, if we want to get rid of
> print statement in the code we have no other choice but to do the same in
> the doctests.
>
> Sergei
>
>
>
> On August 30, 2013 at 5:14:14 PM, Peter Cock (p.j.a.cock at googlemail.com)
> wrote:
>
> On Fri, Aug 30, 2013 at 1:58 PM, Sergei Lebedev <superbobry at gmail.com>
> wrote:
>>> (8) Excluding doctests and the Tutorial, use print function
>>> rather than print statement. e.g. replace this:
>>
>> Unfortunately we cannot exclude doctests, because 'from __future__' import
>> is module wide, thus the 'doctest.testmod()' will raise a SyntaxError on
>> docstrings with print statement.
>>
>> Sergei
>
> Could you clarify this? Does this cause a problem via:
>
> [Tests]$ python run_tests.py doctest
>
> If you have a small example, copy & paste the "git diff" output here.
>
> Peter

From p.j.a.cock at googlemail.com  Fri Aug 30 11:46:59 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 30 Aug 2013 16:46:59 +0100
Subject: [Biopython-dev] Fwd: [biopython] Potential error in mass
	calculations for RNA/DNA? (#229)
In-Reply-To: <biopython/biopython/issues/229@github.com>
References: <biopython/biopython/issues/229@github.com>
Message-ID: <CAKVJ-_7EN2VR3qpinNUhDqOs=CD+0EFz40RSzjztS5fjH7gZYw@mail.gmail.com>

Who are our sequence mass experts?
https://github.com/biopython/biopython/issues/229

---------- Forwarded message ----------
From: nruggero <notifications at github.com>
Date: Thu, Aug 29, 2013 at 11:03 PM
Subject: [biopython] Potential error in mass calculations for RNA/DNA?
(#229)
To: biopython/biopython <biopython at noreply.github.com>


In Bio/Data/IUPACData.py the molecular weights of unambiguous DNA are
listed as:

unambiguous_dna_weights = {
    "A": 347.,
    "C": 323.,
    "G": 363.,
    "T": 322.,
    }

As far as I can tell these are the molecular weights of the ribonucleotides
(non-deoxy) rather than the deoxyribonucleotides. For example, AMP (347.22)
instead of dAMP (331.22) is listed.

I've looked at the original BioPerl code that these numbers were taken
from and I think they were just copied incorrectly. I have also looked at
the function that uses this dict, molecular_weight() in
Bio/SeqUtils/__init__.py, and it just takes the sum of these values over
the sequence (no correction made).
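
To make the concern concrete, here is a small sketch of the plain summation
described in the report. It is not the Biopython implementation:
summed_weight and ribo_vs_deoxy_gap are hypothetical helpers, and the only
numbers used are the ones quoted in the report.

```python
# Weights as quoted in the report from Bio/Data/IUPACData.py.
unambiguous_dna_weights = {"A": 347.0, "C": 323.0, "G": 363.0, "T": 322.0}

# Monophosphate weights cited in the report.
AMP_WEIGHT = 347.22
DAMP_WEIGHT = 331.22


def summed_weight(seq, weights):
    """Add up per-letter weights with no further correction."""
    return sum(weights[letter] for letter in seq)


def ribo_vs_deoxy_gap():
    """Difference between the AMP and dAMP values, to two decimals."""
    return round(AMP_WEIGHT - DAMP_WEIGHT, 2)
```

If the report is right about "A", every adenine in a sequence would add
about 16 too much to a total computed this way.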

So, is this an error or am I missing something basic?
Thanks



From superbobry at gmail.com  Fri Aug 30 18:53:53 2013
From: superbobry at gmail.com (Sergei Lebedev)
Date: Sat, 31 Aug 2013 02:53:53 +0400
Subject: [Biopython-dev] Post Biopython 1.62 release,
 clean-up after dropping Python 2.5
In-Reply-To: <CAKVJ-_4HPwAp=6L9rfvV69bOuq-Z5vwy2m_CD2_ROiBguJDLHA@mail.gmail.com>
References: <CAKVJ-_4QkWKT33cutT5cpTXw7nyqs18KFj+=8gvT_S4854nUOg@mail.gmail.com>
	<CAKVJ-_7V3mpE-d+myKfHXzmJ26UFqz24OR6wnFOfrS55=FHJLg@mail.gmail.com>
	<CAKVJ-_6x9ztCVZNn-W+qpxkAWsy1Rdt_QCYwXAd-8-=nsWjSpA@mail.gmail.com>
	<CAKVJ-_4HPwAp=6L9rfvV69bOuq-Z5vwy2m_CD2_ROiBguJDLHA@mail.gmail.com>
Message-ID: <CAFQn-3_92W5NFht51YapuhrB64hVhQOvqCXfuh6JoR7NO2ragQ@mail.gmail.com>

Peter, I've just submitted a PR [*] for #8 along with a 2to3 fixer which
does all the job, so I think I can take #9.

Sergei

[*] https://github.com/biopython/biopython/pull/230


On Fri, Aug 30, 2013 at 2:49 PM, Peter Cock <p.j.a.cock at googlemail.com>wrote:

> Hello Biopythoneers,
>
> I've outlined another relatively simple improvement for potential
> new contributors to try below....
>
> On Thu, Aug 29, 2013 at 5:33 PM, Peter Cock <p.j.a.cock at googlemail.com>
> wrote:
> >> On Wed, Aug 28, 2013 at 7:53 PM, Peter Cock <p.j.a.cock at googlemail.com>
> wrote:
> >>> Hello all - especially newcomers,
> >>>
> >>> There are going to be several boring but useful things to do to
> >>> the Biopython code base once we're finished with Python 2.5
> >>> (the imminent release of Biopython 1.62 has been clearly
> >>> described as the final Biopython release to support it).
> >>>
> >>> ...
> >>>
> >>> (4) Scan over the code base looking for any comments
> >>> about Python 2.5 (e.g. using the grep command), and
> >>> reviewing them one by one to see if there is an old
> >>> workaround we can now remove.
> >>
> >> Lenna had a quick look, there should be some easy ones here.
> >>
> >>> (5) More advanced code review, for example looking
> >>> for places we can better take advantage of context
> >>> managers (with statements) for file handles.
> >>
> >> Another new one, related to (5), and fairly easy:
> >>
> >> (6) Reviewing examples in the docstrings and Tutorial
> >> where it would make sense to use a 'with' for file handles.
> >>
> >> This should also solve many of the ResourceWarning:
> >> unclosed file ... warnings visible running the full test
> >> suite under Python 3, e.g. see:
> >>
> http://testing.open-bio.org/biopython/builders/Linux%2064%20-%20Python%203.3/builds/298/steps/shell/logs/stdio
> >
> > On Thu, Aug 29, 2013 at 11:30 AM, Peter Cock <p.j.a.cock at googlemail.com>
> wrote:
> >> ... I have come up with another easy task instead,
> >>
> >> (7) Update exception style
>
> (7) was done by Chris Mitchell,
>
> https://github.com/biopython/biopython/commit/1d42f4dc07c8203a162d635b9bca5acb90204942
>
> > (8) Excluding doctests and the Tutorial, use print function
> > rather than print statement. e.g. replace this:
>
> (8) is being looked at by Sergei Lebedev.
>
> ----
>
> Here's another idea, under the general issue (5) of taking
> advantage of context managers (with statements), which
> I would judge to be fairly easy (but not trivial).
>
> (9) Use context managers (with statements) for temporary
> warning filters in the unit tests.
>
> Currently many of our unit tests add simple filters to ignore
> a warning, and then restore the old filters using pop(). This
> mostly works, but is fragile and the filter list is global so this
> can have strange side effects. See:
>
> $ grep "warnings." Tests/*.py
>
> The idea here is to replace this:
>
> warnings.simplefilter('ignore', PDBConstructionWarning)
> #some code which may trigger the warning
> warnings.filters.pop()
>
> with this:
>
> with warnings.catch_warnings():
>     warnings.simplefilter("ignore", PDBConstructionWarning)
>     #some code which may trigger the warning
>
> Note the indentation - these changes will not give nice
> clean diffs, so this will not be so easy to review.
>
> I would therefore suggest editing just one test file at a
> time (i.e. limit each commit to changing a single file), as
> that makes it easier to selectively apply your changes.
>
> Please make sure you test this on Python 2.6, which is most
> likely to have problems with this "new" style ;)
>
> (Again, if anyone plans to work on this, please let the list
> know to minimise duplicated effort.)
>
> If you're not familiar with our test suite, there is a chapter
> introducing this in the main Tutorial & Cookbook,
> http://biopython.org/DIST/docs/tutorial/Tutorial.html
>
> Thanks,
>
> Peter
> _______________________________________________
> Biopython-dev mailing list
> Biopython-dev at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython-dev
>

From p.j.a.cock at googlemail.com  Sat Aug 31 05:31:53 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Sat, 31 Aug 2013 10:31:53 +0100
Subject: [Biopython-dev] Post Biopython 1.62 release,
 clean-up after dropping Python 2.5
In-Reply-To: <CAFQn-3_92W5NFht51YapuhrB64hVhQOvqCXfuh6JoR7NO2ragQ@mail.gmail.com>
References: <CAKVJ-_4QkWKT33cutT5cpTXw7nyqs18KFj+=8gvT_S4854nUOg@mail.gmail.com>
	<CAKVJ-_7V3mpE-d+myKfHXzmJ26UFqz24OR6wnFOfrS55=FHJLg@mail.gmail.com>
	<CAKVJ-_6x9ztCVZNn-W+qpxkAWsy1Rdt_QCYwXAd-8-=nsWjSpA@mail.gmail.com>
	<CAKVJ-_4HPwAp=6L9rfvV69bOuq-Z5vwy2m_CD2_ROiBguJDLHA@mail.gmail.com>
	<CAFQn-3_92W5NFht51YapuhrB64hVhQOvqCXfuh6JoR7NO2ragQ@mail.gmail.com>
Message-ID: <CAKVJ-_7nPcfApCLgMGjb8-jToJDcDnkk8ve7LiiNgy5w0prxrQ@mail.gmail.com>

On Fri, Aug 30, 2013 at 11:53 PM, Sergei Lebedev <superbobry at gmail.com> wrote:
> Peter, I've just submitted a PR [*] for #8 along with a 2to3 fixer which
> does all the job, so I think I can take #9.
>
> Sergei
>
> [*] https://github.com/biopython/biopython/pull/230

Print-function-like syntax committed for (8), thank you.
We'll need to come back to this later as there are still
lots of print statements left in the codebase... time for
a more general discussion about what people would
prefer to see in the user-facing documentation.

If you'd like to try some context managers for the
warnings in the unit tests (9), that would be great.
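
For anyone picking up (9), here is a minimal, self-contained sketch of the
context-manager pattern. DemoWarning is a hypothetical stand-in for
PDBConstructionWarning, and noisy and call_quietly are made-up names; the
point is only that the global filter list is restored when the with block
exits.

```python
import warnings


class DemoWarning(UserWarning):
    """Hypothetical stand-in for PDBConstructionWarning."""


def noisy():
    """Emit a DemoWarning, then return a value."""
    warnings.warn("something harmless happened", DemoWarning)
    return 42


def call_quietly(func):
    """Call func with DemoWarning ignored, restoring the filters afterwards."""
    with warnings.catch_warnings():
        # This filter only lives until the end of the with block.
        warnings.simplefilter("ignore", DemoWarning)
        return func()
```

Unlike the simplefilter()/filters.pop() pairing, the filters are also
restored if the wrapped code raises an exception.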

Note some of the tests will require you to install a
command line tool - it should be clear, but if we
need to add more documentation (e.g. URLs) please
let us know.

Thanks,

Peter

From eric.talevich at gmail.com  Thu Aug  1 20:04:29 2013
From: eric.talevich at gmail.com (Eric Talevich)
Date: Thu, 1 Aug 2013 13:04:29 -0700
Subject: [Biopython-dev] 1.62b test coverage report
In-Reply-To: <CAKVJ-_5+mSW5NkmxN94w-8qu+e=q4COyZaAx6UzCmrwuJdU9aQ@mail.gmail.com>
References: <CA+ijMs=yT-=CFr+qwkOZ107oBN0wEFdjC9uMFCh+j1YfDD4DZw@mail.gmail.com>
	<CAKVJ-_5+mSW5NkmxN94w-8qu+e=q4COyZaAx6UzCmrwuJdU9aQ@mail.gmail.com>
Message-ID: <CAMC681mkn3FBBdyEMTkba1T60KoQmcR1NHLKHNLc1pUo9A5rWw@mail.gmail.com>

On Wed, Jul 31, 2013 at 12:40 AM, Peter Cock <p.j.a.cock at googlemail.com>wrote:

> On Wednesday, July 31, 2013, Ben Fulton wrote:
>
> > I ran Ned Batchelder's coverage tool against the 1.62 beta code to see how
> > much code is covered by tests. The overall total was 74% which is pretty
> > respectable.
> >
> > I ran the tests on a fairly fresh machine, which meant I had to install a
> > lot of software, some of which I either didn't get installed properly, or
> > the tests are out of date, or there were failures for some other reason. I
> > ended up having to skip seven test files:
> >
> > Dialign_Tool
> > EmbossPhylipNew
> > Mafft
> > PopGen_DFDist
> > PopGen_FDist
> > XXMotif
> > phyml
>
>
> I'm pretty sure I have some or all of those setup on at least one
> of my test machines, so with a little more work together we
> can try to resolve those (which may mean updating the docs).
>

I just fixed the error in test_phyml_tool.py, it was a simple one:
https://github.com/biopython/biopython/commit/90da547f0a85c00d3ca300bdf52bdb96ddeb449f


> > There were three tests I managed to get running but still had failures:
> >
> > FastTree
> > NCBI_BLAST
> > Prank_tool
>

The FastTree test is not based on the unittest framework, so the output
contains the word "Failed" in three places to describe error-handling tests
that worked correctly. Can we see the output for this one? (It works on my
machine.)

The test is also fairly new, so there could be some version-compatibility
issues there too.

Thanks,
Eric


From ben at benfulton.net  Fri Aug  2 02:20:49 2013
From: ben at benfulton.net (Ben Fulton)
Date: Thu, 1 Aug 2013 22:20:49 -0400
Subject: [Biopython-dev] 1.62b test coverage report
In-Reply-To: <CAMC681mkn3FBBdyEMTkba1T60KoQmcR1NHLKHNLc1pUo9A5rWw@mail.gmail.com>
References: <CA+ijMs=yT-=CFr+qwkOZ107oBN0wEFdjC9uMFCh+j1YfDD4DZw@mail.gmail.com>
	<CAKVJ-_5+mSW5NkmxN94w-8qu+e=q4COyZaAx6UzCmrwuJdU9aQ@mail.gmail.com>
	<CAMC681mkn3FBBdyEMTkba1T60KoQmcR1NHLKHNLc1pUo9A5rWw@mail.gmail.com>
Message-ID: <CA+ijMs=XOe5Q6cE5vCg_OdnjcTGA=ZjuCJVCjcWY+reW1=jnnQ@mail.gmail.com>

My test machine was running Ubuntu 12.04.

For fasttree I installed version 2.1.4-1~ubuntu12.04.1 using apt-get, and
got this error:
ApplicationError: Command 'fasttree -out temp_test.tree
Quality/example.fasta' returned non-zero exit status 1, 'Unknown or
incorrect use of option -out'

The NCBI_BLAST error involves rpsblast not being in the install. Version
2.2.25-7 using apt-get.

Dialign is version 2.2.1-5 using apt-get. I got two errors: first,
DIALIGN2_DIR not being set. It was installed to /usr/bin so I set
DIALIGN2_DIR to that directory; then I got "Environment variable
DIALIGN2_DIR directory missing BLOSUM file." I'm not sure either of these
items are needed, though I may have missed them in the documentation.

I downloaded version 130708 of Prank from
http://code.google.com/p/prank-msa/downloads/list. The error is on line 165
of the test file:

AssertionError:
-----------------
 PRANK v.130708:
-----------------

Input for the analysis
 - converting 'Quality/example.fasta' to 'temp with space.phy'

EmbossPhylipNew I tried to install from source, but it was complicated and
I didn't get it finished.

I'll send some notes on the other errors when I get a few minutes.


On Thu, Aug 1, 2013 at 4:04 PM, Eric Talevich <eric.talevich at gmail.com>wrote:

> On Wed, Jul 31, 2013 at 12:40 AM, Peter Cock <p.j.a.cock at googlemail.com>wrote:
>
>> On Wednesday, July 31, 2013, Ben Fulton wrote:
>>
>> > I ran Ned Batchelder's coverage tool against the 1.62 beta code to see how
>> > much code is covered by tests. The overall total was 74% which is pretty
>> > respectable.
>> >
>> > I ran the tests on a fairly fresh machine, which meant I had to install a
>> > lot of software, some of which I either didn't get installed properly, or
>> > the tests are out of date, or there were failures for some other reason. I
>> > ended up having to skip seven test files:
>> >
>> > Dialign_Tool
>> > EmbossPhylipNew
>> > Mafft
>> > PopGen_DFDist
>> > PopGen_FDist
>> > XXMotif
>> > phyml
>>
>>
>> I'm pretty sure I have some or all of those setup on at least one
>> of my test machines, so with a little more work together we
>> can try to resolve those (which may mean updating the docs).
>>
>
> I just fixed the error in test_phyml_tool.py, it was a simple one:
>
> https://github.com/biopython/biopython/commit/90da547f0a85c00d3ca300bdf52bdb96ddeb449f
>
>
>> > There were three tests I managed to get running but still had failures:
>> >
>> > FastTree
>> > NCBI_BLAST
>> > Prank_tool
>>
>
> The FastTree test is not based on the unittest framework, so the output
> contains the word "Failed" in three places to describe error-handling tests
> that worked correctly. Can we see the output for this one? (It works on my
> machine.)
>
> The test is also fairly new, so there could be some version-compatibility
> issues there too.
>
> Thanks,
> Eric
>


From glenveegee at gmail.com  Fri Aug  2 08:17:14 2013
From: glenveegee at gmail.com (Glen van Ginkel)
Date: Fri, 2 Aug 2013 09:17:14 +0100
Subject: [Biopython-dev] Fwd: pdb-l: Announcement: wwPDB Workshop on
 mmCIF/PDBx for Programmers, 20/21 Nov-13, Cambridge (UK)
In-Reply-To: <51FB69C6.3040200@ebi.ac.uk>
References: <alpine.LRH.1.10.1308011834090.22795@struktbio205.bmc.uu.se>
	<51FB69C6.3040200@ebi.ac.uk>
Message-ID: <CABrdYJKQ4QxHs1k3V1-t0--rF4EcBvFNs4OGMz5HotDSSpDo=A@mail.gmail.com>

Hi all,

Given Lenna's recent work on the mmCIF parser I thought this might be of
interest.

Kind regards,

Glen

wwPDB Workshop on mmCIF/PDBx for Programmers
--------------------------------------------

What, why and how?
------------------
The world of the PDB will be changing rapidly and profoundly over the next few
years. A major change will involve the transition from PDB to mmCIF/PDBx as
the principal deposition and dissemination format (see
http://www.wwpdb.org/news/news_2013.html#22-May-2013 and
http://wwpdb.org/workshop/wgroup.html). To help software developers in the
area of structural biology to make the transition and begin supporting the
mmCIF/PDBx format in their own programs, wwPDB (http://wwpdb.org/) is
organising a programmers workshop. This two-day event will include lectures by
experts in mmCIF/PDBx (http://mmcif.rcsb.org/) and developers of
language-specific libraries or packages (C/C++, Java, Python). Ample time will
be devoted to tutorials and individual "code hacking", with the experts
available to assist the workshop participants. Confirmed tutors include Paul
Adams (Phenix), Eugene Krissinel (CCP4), Garib Murshudov (Refmac), Andreas
Prlic (RCSB), Sameer Velankar (PDBe) and John Westbrook (RCSB).

When and where?
---------------
The workshop will be held at the EMBL-EBI (http://ebi.ac.uk/) in Hinxton,
Cambridge, UK, on 20 and 21 November 2013.

How much?
---------
If you are selected as a participant, we expect you to pay for your own travel
to and from Cambridge. However, there is no fee for this workshop, and we will
provide accommodation (at the HolidayInn Express in nearby Duxford;
http://www.hiexpresscambridgeduxford.co.uk/), lunches and a workshop dinner on
the 20th (all thanks to generous funding from the Wellcome Trust to PDBe).

Who can apply and how?
----------------------
This workshop is intended for "high-powered" software developers in any area
of structural biology and structural bioinformatics whose products process
(read/write) PDB data - e.g., X-ray, NMR, 3DEM, SAXS/SANS, hybrid methods,
visualisation, validation, modelling, docking, structure prediction, etc. To
ensure a high ratio of tutors to workshop participants, the number of
participants is limited to 15.

You can apply for the workshop by sending an e-mail to Sameer Velankar at PDBe
(sameer at ebi.ac.uk) no later than 31 August 2013. Please include:

- a brief description of the software program(s) or package(s) you have
developed or are developing, what it does, in which field, how many users,
relevant publications, etc.;
- what programming language(s) you are specifically interested in;
- how you would benefit from this workshop;
- any specific topics or questions you would like to see addressed in the
workshop.

If the workshop is oversubscribed, we will use the information and motivation
provided by the applicants to select the participants.

Participants are expected to bring their own laptop with compilers etc.
installed. No previous knowledge of mmCIF/PDBx is strictly needed, but
participants who are aware of the basic principles of the format will probably
gain more from the workshop.

Applicants will be informed by mid-September if they have been selected or
not, or if they are on the stand-by list.

For informal inquiries about the workshop, please contact Sameer Velankar at
PDBe (sameer at ebi.ac.uk).

Please feel free to distribute this announcement to other interested people or
fora!


--Gerard Kleywegt & Sameer Velankar
   Protein Data Bank in Europe
   A member of the Worldwide Protein Data Bank

---
Gerard J. Kleywegt, PDBe, EMBL-EBI, Hinxton, UK
gerard at ebi.ac.uk ..................... pdbe.org
Secretary: Pauline Haslam  pdbe_admin at ebi.ac.uk


From p.j.a.cock at googlemail.com  Fri Aug  2 09:16:53 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 2 Aug 2013 10:16:53 +0100
Subject: [Biopython-dev] Fwd: pdb-l: Announcement: wwPDB Workshop on
 mmCIF/PDBx for Programmers, 20/21 Nov-13, Cambridge (UK)
In-Reply-To: <CABrdYJKQ4QxHs1k3V1-t0--rF4EcBvFNs4OGMz5HotDSSpDo=A@mail.gmail.com>
References: <alpine.LRH.1.10.1308011834090.22795@struktbio205.bmc.uu.se>
	<51FB69C6.3040200@ebi.ac.uk>
	<CABrdYJKQ4QxHs1k3V1-t0--rF4EcBvFNs4OGMz5HotDSSpDo=A@mail.gmail.com>
Message-ID: <CAKVJ-_555M7sGPoyr9NCiUoH+5-dDdqd18SXpX6+yM520h+Ohg@mail.gmail.com>

Thanks for forwarding that Glen - it would be great if any of
our structural Biopython folk could go.

Is anyone interested & reasonably close to Cambridge UK?

Peter

On Fri, Aug 2, 2013 at 9:17 AM, Glen van Ginkel <glenveegee at gmail.com> wrote:
> Hi all,
>
> Given Lenna's recent work on the mmCIF parser I thought this might be of
> interest.
>
> Kind regards,
>
> Glen
>
> wwPDB Workshop on mmCIF/PDBx for Programmers
> --------------------------------------------
>
> What, why and how?
> ------------------
> The world of the PDB will be changing rapidly and profoundly over the next few
> years. A major change will involve the transition from PDB to mmCIF/PDBx as
> the principal deposition and dissemination format (see
> http://www.wwpdb.org/news/news_2013.html#22-May-2013 and
> http://wwpdb.org/workshop/wgroup.html). To help software developers in the
> area of structural biology to make the transition and begin supporting the
> mmCIF/PDBx format in their own programs, wwPDB (http://wwpdb.org/) is
> organising a programmers workshop. This two-day event will include lectures by
> experts in mmCIF/PDBx (http://mmcif.rcsb.org/) and developers of
> language-specific libraries or packages (C/C++, Java, Python). Ample time will
> be devoted to tutorials and individual "code hacking", with the experts
> available to assist the workshop participants. Confirmed tutors include Paul
> Adams (Phenix), Eugene Krissinel (CCP4), Garib Murshudov (Refmac), Andreas
> Prlic (RCSB), Sameer Velankar (PDBe) and John Westbrook (RCSB).
>
> When and where?
> ---------------
> The workshop will be held at the EMBL-EBI (http://ebi.ac.uk/) in Hinxton,
> Cambridge, UK, on 20 and 21 November 2013.
>
> How much?
> ---------
> If you are selected as a participant, we expect you to pay for your own travel
> to and from Cambridge. However, there is no fee for this workshop, and we will
> provide accommodation (at the HolidayInn Express in nearby Duxford;
> http://www.hiexpresscambridgeduxford.co.uk/), lunches and a workshop dinner on
> the 20th (all thanks to generous funding from the Wellcome Trust to PDBe).
>
> Who can apply and how?
> ----------------------
> This workshop is intended for "high-powered" software developers in any area
> of structural biology and structural bioinformatics whose products process
> (read/write) PDB data - e.g., X-ray, NMR, 3DEM, SAXS/SANS, hybrid methods,
> visualisation, validation, modelling, docking, structure prediction, etc. To
> ensure a high ratio of tutors to workshop participants, the number of
> participants is limited to 15.
>
> You can apply for the workshop by sending an e-mail to Sameer Velankar at PDBe
> (sameer at ebi.ac.uk) no later than 31 August 2013. Please include:
>
> - a brief description of the software program(s) or package(s) you have
> developed or are developing, what it does, in which field, how many users,
> relevant publications, etc.;
> - what programming language(s) you are specifically interested in;
> - how you would benefit from this workshop;
> - any specific topics or questions you would like to see addressed in the
> workshop.
>
> If the workshop is oversubscribed, we will use the information and motivation
> provided by the applicants to select the participants.
>
> Participants are expected to bring their own laptop with compilers etc.
> installed. No previous knowledge of mmCIF/PDBx is strictly needed, but
> participants who are aware of the basic principles of the format will probably
> gain more from the workshop.
>
> Applicants will be informed by mid-September if they have been selected or
> not, or if they are on the stand-by list.
>
> For informal inquiries about the workshop, please contact Sameer Velankar at
> PDBe (sameer at ebi.ac.uk).
>
> Please feel free to distribute this announcement to other interested people or
> fora!
>
>
> --Gerard Kleywegt & Sameer Velankar
>    Protein Data Bank in Europe
>    A member of the Worldwide Protein Data Bank
>
> ---
> Gerard J. Kleywegt, PDBe, EMBL-EBI, Hinxton, UK
> gerard at ebi.ac.uk ..................... pdbe.org
> Secretary: Pauline Haslam  pdbe_admin at ebi.ac.uk


From p.j.a.cock at googlemail.com  Fri Aug  2 09:31:27 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 2 Aug 2013 10:31:27 +0100
Subject: [Biopython-dev] 1.62b test coverage report
In-Reply-To: <CA+ijMs=XOe5Q6cE5vCg_OdnjcTGA=ZjuCJVCjcWY+reW1=jnnQ@mail.gmail.com>
References: <CA+ijMs=yT-=CFr+qwkOZ107oBN0wEFdjC9uMFCh+j1YfDD4DZw@mail.gmail.com>
	<CAKVJ-_5+mSW5NkmxN94w-8qu+e=q4COyZaAx6UzCmrwuJdU9aQ@mail.gmail.com>
	<CAMC681mkn3FBBdyEMTkba1T60KoQmcR1NHLKHNLc1pUo9A5rWw@mail.gmail.com>
	<CA+ijMs=XOe5Q6cE5vCg_OdnjcTGA=ZjuCJVCjcWY+reW1=jnnQ@mail.gmail.com>
Message-ID: <CAKVJ-_6Rp5RC4pM1wMJ4k2qyZKhOCqWmd8e-x+ak7dQqHOyXqw@mail.gmail.com>

Thanks for these details Ben - it sounds like a mixture of real
test failures, and mere warnings that an external tool wasn't
found.

On Fri, Aug 2, 2013 at 3:20 AM, Ben Fulton <ben at benfulton.net> wrote:
> My test machine was running Ubuntu 12.04.
>
> For fasttree I installed version 2.1.4-1~ubuntu12.04.1 using apt-get, and
> got this error:
> ApplicationError: Command 'fasttree -out temp_test.tree
> Quality/example.fasta' returned non-zero exit status 1, 'Unknown or
> incorrect use of option -out'

I don't seem to have fasttree installed at all, and neither the
test nor the wrapper is explicit about which version it
was originally written for.

> The NCBI_BLAST error involves rpsblast not being in the install.
> Version 2.2.25-7 using apt-get.

I believe this is down to an NCBI stupidity with binary name
clashes, both the old 'legacy' C BLAST and the new C++
BLAST+ suite have a binary called rpsblast.

Our test code copes with this by searching the path and checking
each rpsblast binary found - looking for the new version only.

However, Debian policy is to resolve ambiguities like this with
a unilateral renaming - in this case I *think* they called the new
binary rpsblast+ instead. Can you confirm that? I don't have
access to a Debian machine right now.

So, strictly speaking the Biopython test is correct - you don't
have the new rpsblast installed. However, it would be more
helpful if we also checked for the Debian alias rpsblast+ too.

That shouldn't be too complicated to do - especially if you
could rerun the tests using Biopython from git for me?
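
A rough sketch of the search described above, not the actual Biopython
test code: walk each folder on the path, collect any executable named
rpsblast or rpsblast+, and keep the first one that a version check
accepts. The helper names find_on_path and first_matching are made up for
illustration, and the version check is left as a caller-supplied function
because how to recognise the new BLAST+ binary from its output is not
spelled out here.

```python
import os


def find_on_path(names, path=None):
    """Return every executable on the search path whose file name is in names.

    Results are ordered by folder (path order), then by the order of names.
    """
    if path is None:
        path = os.environ.get("PATH", "")
    found = []
    for folder in path.split(os.pathsep):
        if not folder:
            continue  # ignore empty entries such as a trailing separator
        for name in names:
            candidate = os.path.join(folder, name)
            if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
                found.append(candidate)
    return found


def first_matching(candidates, is_wanted):
    """Return the first candidate accepted by is_wanted, or None.

    is_wanted is a hypothetical callback; in practice it would run the
    binary and inspect its version output.
    """
    for exe in candidates:
        if is_wanted(exe):
            return exe
    return None
```

Listing both names, e.g. find_on_path(["rpsblast", "rpsblast+"]), is what
would let a Debian-style install be picked up without changing the rest of
the logic.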

> Dialign is version 2.2.1-5 using apt-get. I got two errors: first,
> DIALIGN2_DIR not being set. It was installed to /usr/bin so I set
> DIALIGN2_DIR to that directory; then I got "Environment variable
> DIALIGN2_DIR directory missing BLOSUM file." I'm not sure either of these
> items are needed, though I may have missed them in the documentation.

This again looks like a Debian packaging issue versus the
manual install instructions for Dialign. Perhaps they have
fixed Dialign to find its matrix under a data folder...

You could try simply commenting out the check on the
environment variable in test_Dialign_tool.py and seeing
if the tests pass or not.
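
For reference, the kind of precondition being discussed can be sketched
roughly as below. This is an illustration only, not the code in
test_Dialign_tool.py: the function name is hypothetical, the wording of
the first message is assumed, and the matrix file name "BLOSUM" is a guess
based on the error message quoted above.

```python
import os


def dialign_problem(environ, matrix_name="BLOSUM"):
    """Return a reason to skip the Dialign tests, or None if all looks fine.

    environ is any mapping of environment variables (e.g. os.environ).
    matrix_name is an assumed file name, not taken from the Dialign docs.
    """
    folder = environ.get("DIALIGN2_DIR")
    if folder is None:
        return "Environment variable DIALIGN2_DIR not set."
    if not os.path.isfile(os.path.join(folder, matrix_name)):
        return "Environment variable DIALIGN2_DIR directory missing BLOSUM file."
    return None
```

With a packaged install that finds its own matrix files, both branches
would report a problem even though the tool itself runs, which is why
disabling the check is a quick way to see whether it is still needed.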

> I downloaded version 130708 of Prank from
> http://code.google.com/p/prank-msa/downloads/list. The error is on line 165
> of the test file:
>
> AssertionError:
> -----------------
>  PRANK v.130708:
> -----------------
>
> Input for the analysis
>  - converting 'Quality/example.fasta' to 'temp with space.phy'

This sounds like a minor change in the stdout with recent
versions of PRANK.

> EmbossPhylipNew I tried to install from source, but it was complicated and I
> didn't get it finished.
>
> I'll send some notes on the other errors when I get a few minutes.

Thanks,

Peter


From p.j.a.cock at googlemail.com  Fri Aug  2 12:00:54 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 2 Aug 2013 13:00:54 +0100
Subject: [Biopython-dev] 1.62b test coverage report
In-Reply-To: <CAKVJ-_6Rp5RC4pM1wMJ4k2qyZKhOCqWmd8e-x+ak7dQqHOyXqw@mail.gmail.com>
References: <CA+ijMs=yT-=CFr+qwkOZ107oBN0wEFdjC9uMFCh+j1YfDD4DZw@mail.gmail.com>
	<CAKVJ-_5+mSW5NkmxN94w-8qu+e=q4COyZaAx6UzCmrwuJdU9aQ@mail.gmail.com>
	<CAMC681mkn3FBBdyEMTkba1T60KoQmcR1NHLKHNLc1pUo9A5rWw@mail.gmail.com>
	<CA+ijMs=XOe5Q6cE5vCg_OdnjcTGA=ZjuCJVCjcWY+reW1=jnnQ@mail.gmail.com>
	<CAKVJ-_6Rp5RC4pM1wMJ4k2qyZKhOCqWmd8e-x+ak7dQqHOyXqw@mail.gmail.com>
Message-ID: <CAKVJ-_4Cc2VQQBe0Jf_n0kNC9nEozPAXp75Ltv86s5kwqGiSdQ@mail.gmail.com>

On Fri, Aug 2, 2013 at 10:31 AM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
>
>> The NCBI_BLAST error involves rpsblast not being in the install.
>> Version 2.2.25-7 using apt-get.
>
> I believe this is down to an NCBI stupidity with binary name
> clashes, both the old 'legacy' C BLAST and the new C++
> BLAST+ suite have a binary called rpsblast.
>
> Our test code copes with this by searching the path and checking
> each rpsblast binary found - looking for the new version only.
>
> However, Debian policy is to resolve ambiguities like this with
> a unilateral renaming - in this case I *think* they called the new
> binary rpsblast+ instead. Can you confirm that? I don't have
> access to a Debian machine right now.

Certainly this was their plan and was done on Bio-Linux,
http://lists.debian.org/debian-med/2011/05/msg00025.html

> So, strictly speaking the Biopython test is correct - you don't
> have the new rpsblast installed. However, it would be more
> helpful if we also checked for the Debian alias rpsblast+ too.
>
> That shouldn't be too complicated to do - especially if you
> could rerun the tests using Biopython from git for me?

This commit is now on our master branch,

https://github.com/biopython/biopython/commit/148b681a66061cc03d70f940a2efdede29adc64a

Thanks,

Peter


From anaryin at gmail.com  Fri Aug  2 16:13:04 2013
From: anaryin at gmail.com (=?UTF-8?Q?Jo=C3=A3o_Rodrigues?=)
Date: Fri, 2 Aug 2013 09:13:04 -0700
Subject: [Biopython-dev] Fwd: pdb-l: Announcement: wwPDB Workshop on
 mmCIF/PDBx for Programmers, 20/21 Nov-13, Cambridge (UK)
In-Reply-To: <CAKVJ-_555M7sGPoyr9NCiUoH+5-dDdqd18SXpX6+yM520h+Ohg@mail.gmail.com>
References: <alpine.LRH.1.10.1308011834090.22795@struktbio205.bmc.uu.se>
	<51FB69C6.3040200@ebi.ac.uk>
	<CABrdYJKQ4QxHs1k3V1-t0--rF4EcBvFNs4OGMz5HotDSSpDo=A@mail.gmail.com>
	<CAKVJ-_555M7sGPoyr9NCiUoH+5-dDdqd18SXpX6+yM520h+Ohg@mail.gmail.com>
Message-ID: <CAJ9sUYNs+LYfpDcX-1eQVGry9SuKXqe=Q5yg-7HiXZbEWjsTaA@mail.gmail.com>

Hi Peter, Glen,

I'll be going (or trying to at least).

Cheers,

João


2013/8/2 Peter Cock <p.j.a.cock at googlemail.com>

> Thanks for forwarding that Glen - it would be great if any of
> our structural Biopython folk could go.
>
> Is anyone interested & reasonably close to Cambridge UK?
>
> Peter
>
> On Fri, Aug 2, 2013 at 9:17 AM, Glen van Ginkel <glenveegee at gmail.com>
> wrote:
> > Hi all,
> >
> > Given Lenna's recent work on the mmCIF parser I thought this might be of
> > interest.
> >
> > Kind regards,
> >
> > Glen
> >
> > wwPDB Workshop on mmCIF/PDBx for Programmers
> > --------------------------------------------
> >
> > What, why and how?
> > ------------------
> > The world of the PDB will be changing rapidly and profoundly over the
> > next few years. A major change will involve the transition from PDB to
> > mmCIF/PDBx as the principal deposition and dissemination format (see
> > http://www.wwpdb.org/news/news_2013.html#22-May-2013 and
> > http://wwpdb.org/workshop/wgroup.html). To help software developers in
> > the area of structural biology to make the transition and begin
> > supporting the mmCIF/PDBx format in their own programs, wwPDB
> > (http://wwpdb.org/) is organising a programmers workshop. This two-day
> > event will include lectures by experts in mmCIF/PDBx
> > (http://mmcif.rcsb.org/) and developers of language-specific libraries
> > or packages (C/C++, Java, Python). Ample time will be devoted to
> > tutorials and individual "code hacking", with the experts available to
> > assist the workshop participants. Confirmed tutors include Paul Adams
> > (Phenix), Eugene Krissinel (CCP4), Garib Murshudov (Refmac), Andreas
> > Prlic (RCSB), Sameer Velankar (PDBe) and John Westbrook (RCSB).
> >
> > When and where?
> > ---------------
> > The workshop will be held at the EMBL-EBI (http://ebi.ac.uk/) in
> > Hinxton, Cambridge, UK, on 20 and 21 November 2013.
> >
> > How much?
> > ---------
> > If you are selected as a participant, we expect you to pay for your own
> > travel to and from Cambridge. However, there is no fee for this
> > workshop, and we will provide accommodation (at the HolidayInn Express
> > in nearby Duxford; http://www.hiexpresscambridgeduxford.co.uk/), lunches
> > and a workshop dinner on the 20th (all thanks to generous funding from
> > the Wellcome Trust to PDBe).
> >
> > Who can apply and how?
> > ----------------------
> > This workshop is intended for "high-powered" software developers in any
> > area of structural biology and structural bioinformatics whose products
> > process (read/write) PDB data - e.g., X-ray, NMR, 3DEM, SAXS/SANS,
> > hybrid methods, visualisation, validation, modelling, docking, structure
> > prediction, etc. To ensure a high ratio of tutors to workshop
> > participants, the number of participants is limited to 15.
> >
> > You can apply for the workshop by sending an e-mail to Sameer Velankar
> > at PDBe (sameer at ebi.ac.uk) no later than 31 August 2013. Please
> > include:
> >
> > - a brief description of the software program(s) or package(s) you have
> >   developed or are developing, what it does, in which field, how many
> >   users, relevant publications, etc.;
> > - what programming language(s) you are specifically interested in;
> > - how you would benefit from this workshop;
> > - any specific topics or questions you would like to see addressed in
> >   the workshop.
> >
> > If the workshop is oversubscribed, we will use the information and
> > motivation provided by the applicants to select the participants.
> >
> > Participants are expected to bring their own laptop with compilers etc.
> > installed. No previous knowledge of mmCIF/PDBx is strictly needed, but
> > participants who are aware of the basic principles of the format will
> > probably gain more from the workshop.
> >
> > Applicants will be informed by mid-September if they have been selected
> > or not, or if they are on the stand-by list.
> >
> > For informal inquiries about the workshop, please contact Sameer
> > Velankar at PDBe (sameer at ebi.ac.uk).
> >
> > Please feel free to distribute this announcement to other interested
> > people or fora!
> >
> >
> > --Gerard Kleywegt & Sameer Velankar
> >    Protein Data Bank in Europe
> >    A member of the Worldwide Protein Data Bank
> >
> > ---
> > Gerard J. Kleywegt, PDBe, EMBL-EBI, Hinxton, UK
> > gerard at ebi.ac.uk ..................... pdbe.org
> > Secretary: Pauline Haslam  pdbe_admin at ebi.ac.uk
> > TO UNSUBSCRIBE OR CHANGE YOUR SUBSCRIPTION OPTIONS, please see
> > https://lists.sdsc.edu/mailman/listinfo/pdb-l .


From p.j.a.cock at googlemail.com  Fri Aug  2 16:20:02 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 2 Aug 2013 17:20:02 +0100
Subject: [Biopython-dev] Fwd: pdb-l: Announcement: wwPDB Workshop on
 mmCIF/PDBx for Programmers, 20/21 Nov-13, Cambridge (UK)
In-Reply-To: <CAJ9sUYNs+LYfpDcX-1eQVGry9SuKXqe=Q5yg-7HiXZbEWjsTaA@mail.gmail.com>
References: <alpine.LRH.1.10.1308011834090.22795@struktbio205.bmc.uu.se>
	<51FB69C6.3040200@ebi.ac.uk>
	<CABrdYJKQ4QxHs1k3V1-t0--rF4EcBvFNs4OGMz5HotDSSpDo=A@mail.gmail.com>
	<CAKVJ-_555M7sGPoyr9NCiUoH+5-dDdqd18SXpX6+yM520h+Ohg@mail.gmail.com>
	<CAJ9sUYNs+LYfpDcX-1eQVGry9SuKXqe=Q5yg-7HiXZbEWjsTaA@mail.gmail.com>
Message-ID: <CAKVJ-_4j_9Bih3isF=q=pzAUVFyAWLuwWZ7H_xCQJ_EC+b_6CA@mail.gmail.com>

That's good news João - thanks! Peter.

On Fri, Aug 2, 2013 at 5:13 PM, João Rodrigues <anaryin at gmail.com> wrote:
> Hi Peter, Glen,
>
> I'll be going (or trying to at least).
>
> Cheers,
>
> João
>
>
> 2013/8/2 Peter Cock <p.j.a.cock at googlemail.com>
>>
>> Thanks for forwarding that Glen - it would be great if any of
>> our structural Biopython folk could go.
>>
>> Is anyone interested & reasonably close to Cambridge UK?
>>
>> Peter
>>
>> On Fri, Aug 2, 2013 at 9:17 AM, Glen van Ginkel <glenveegee at gmail.com>
>> wrote:
>> > Hi all,
>> >
>> > Given Lenna's recent work on the mmCIF parser I thought this might be of
>> > interest.
>> >
>> > Kind regards,
>> >
>> > Glen
>> >
>
>


From ben at benfulton.net  Mon Aug  5 01:28:34 2013
From: ben at benfulton.net (Ben Fulton)
Date: Sun, 4 Aug 2013 21:28:34 -0400
Subject: [Biopython-dev] 1.62b test coverage report
In-Reply-To: <CAKVJ-_4Cc2VQQBe0Jf_n0kNC9nEozPAXp75Ltv86s5kwqGiSdQ@mail.gmail.com>
References: <CA+ijMs=yT-=CFr+qwkOZ107oBN0wEFdjC9uMFCh+j1YfDD4DZw@mail.gmail.com>
	<CAKVJ-_5+mSW5NkmxN94w-8qu+e=q4COyZaAx6UzCmrwuJdU9aQ@mail.gmail.com>
	<CAMC681mkn3FBBdyEMTkba1T60KoQmcR1NHLKHNLc1pUo9A5rWw@mail.gmail.com>
	<CA+ijMs=XOe5Q6cE5vCg_OdnjcTGA=ZjuCJVCjcWY+reW1=jnnQ@mail.gmail.com>
	<CAKVJ-_6Rp5RC4pM1wMJ4k2qyZKhOCqWmd8e-x+ak7dQqHOyXqw@mail.gmail.com>
	<CAKVJ-_4Cc2VQQBe0Jf_n0kNC9nEozPAXp75Ltv86s5kwqGiSdQ@mail.gmail.com>
Message-ID: <CA+ijMsm-hCNoekmFY5gPOMb_TpfUXj4Mkb2TE7bhOHR-DSEOgA@mail.gmail.com>

Fixed the following:

I had installed Mafft version 6.850-1 from apt-get, which apparently is
more than a year old and doesn't work. The tests ran after I installed it
from source.

I had not gotten a path set up properly for XXMotif; once I did the tests
all ran.

The DiAlign tests passed after I removed the precondition checks.

Did not fix:

The site http://www.rubic.rdg.ac.uk/~mab/software.html is down, and I can't
find anywhere else to install the PopGen software from.


So with all of those modifications, I ran coverage against the latest code
from GitHub. Results are once again available on my website,
http://benfulton.net/BioPython162_Coverage , and the following issues
remain:

EmbossPhylipNew - skipped, too hard to install
Fasttree - error, apparently a versioning issue
PopGen_FDist and PopGen_DFdist - skipped, unavailable
Prank - failed, recent versions of the tool have some kind of output change




From yeyanbo289 at gmail.com  Mon Aug  5 08:57:34 2013
From: yeyanbo289 at gmail.com (Yanbo Ye)
Date: Mon, 5 Aug 2013 16:57:34 +0800
Subject: [Biopython-dev] GSOC weekly update 8
Message-ID: <CADoMHjxT7pQ81T8KSkTfb+-LKtOM-5dATVcf5EACdxiN0TU4Qw@mail.gmail.com>

Hi all,

I've posted an update for the Biopython.Phylo project here:
http://blog.yeyanbo.com/posts/google-summer-of-code-8.html

Thanks,
Yanbo

-- 

Yanbo Ye
Guangzhou Institutes of Biomedicine and Health,
Chinese Academy of Sciences
190 Kaiyuan Avenue, Science Park, Guangzhou, China

Email: ye_yanbo at gibh.ac.cn
Web: http://www.yeyanbo.com
Phone: (86)-020-32093810


From p.j.a.cock at googlemail.com  Mon Aug  5 11:46:00 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Mon, 5 Aug 2013 12:46:00 +0100
Subject: [Biopython-dev] 1.62b test coverage report
In-Reply-To: <CA+ijMsm-hCNoekmFY5gPOMb_TpfUXj4Mkb2TE7bhOHR-DSEOgA@mail.gmail.com>
References: <CA+ijMs=yT-=CFr+qwkOZ107oBN0wEFdjC9uMFCh+j1YfDD4DZw@mail.gmail.com>
	<CAKVJ-_5+mSW5NkmxN94w-8qu+e=q4COyZaAx6UzCmrwuJdU9aQ@mail.gmail.com>
	<CAMC681mkn3FBBdyEMTkba1T60KoQmcR1NHLKHNLc1pUo9A5rWw@mail.gmail.com>
	<CA+ijMs=XOe5Q6cE5vCg_OdnjcTGA=ZjuCJVCjcWY+reW1=jnnQ@mail.gmail.com>
	<CAKVJ-_6Rp5RC4pM1wMJ4k2qyZKhOCqWmd8e-x+ak7dQqHOyXqw@mail.gmail.com>
	<CAKVJ-_4Cc2VQQBe0Jf_n0kNC9nEozPAXp75Ltv86s5kwqGiSdQ@mail.gmail.com>
	<CA+ijMsm-hCNoekmFY5gPOMb_TpfUXj4Mkb2TE7bhOHR-DSEOgA@mail.gmail.com>
Message-ID: <CAKVJ-_6ZTQBeScaDOsOEUq3rm7GaR-KE0zV4NAvDP+WKcxY=WQ@mail.gmail.com>

On Mon, Aug 5, 2013 at 2:28 AM, Ben Fulton <ben at benfulton.net> wrote:
>
> The site http://www.rubic.rdg.ac.uk/~mab/software.html is down, and I can't
> find anywhere else to install the PopGen software from.
>

There seems to be a fairly recent snapshot on archive.org,
http://web.archive.org/web/20120510013219/http://www.rubic.rdg.ac.uk/~mab/software.html

Meanwhile, I have emailed Dr. Mark Beaumont at Reading
University to ask about the server status.

Regards,

Peter


From p.j.a.cock at googlemail.com  Mon Aug  5 12:14:04 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Mon, 5 Aug 2013 13:14:04 +0100
Subject: [Biopython-dev] 1.62b test coverage report
In-Reply-To: <CAKVJ-_6ZTQBeScaDOsOEUq3rm7GaR-KE0zV4NAvDP+WKcxY=WQ@mail.gmail.com>
References: <CA+ijMs=yT-=CFr+qwkOZ107oBN0wEFdjC9uMFCh+j1YfDD4DZw@mail.gmail.com>
	<CAKVJ-_5+mSW5NkmxN94w-8qu+e=q4COyZaAx6UzCmrwuJdU9aQ@mail.gmail.com>
	<CAMC681mkn3FBBdyEMTkba1T60KoQmcR1NHLKHNLc1pUo9A5rWw@mail.gmail.com>
	<CA+ijMs=XOe5Q6cE5vCg_OdnjcTGA=ZjuCJVCjcWY+reW1=jnnQ@mail.gmail.com>
	<CAKVJ-_6Rp5RC4pM1wMJ4k2qyZKhOCqWmd8e-x+ak7dQqHOyXqw@mail.gmail.com>
	<CAKVJ-_4Cc2VQQBe0Jf_n0kNC9nEozPAXp75Ltv86s5kwqGiSdQ@mail.gmail.com>
	<CA+ijMsm-hCNoekmFY5gPOMb_TpfUXj4Mkb2TE7bhOHR-DSEOgA@mail.gmail.com>
	<CAKVJ-_6ZTQBeScaDOsOEUq3rm7GaR-KE0zV4NAvDP+WKcxY=WQ@mail.gmail.com>
Message-ID: <CAKVJ-_55Mbhsqz17PHEVBm5w0zR=Uf+TNM4Acq8FNxGDSpcNDw@mail.gmail.com>

On Mon, Aug 5, 2013 at 12:46 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> On Mon, Aug 5, 2013 at 2:28 AM, Ben Fulton <ben at benfulton.net> wrote:
>>
>> The site http://www.rubic.rdg.ac.uk/~mab/software.html is down, and I can't
>> find anywhere else to install the PopGen software from.
>>
>
> There seems to be a fairly recent snapshot on archive.org,
> http://web.archive.org/web/20120510013219/http://www.rubic.rdg.ac.uk/~mab/software.html
>
> Meanwhile, I have emailed Dr. Mark Beaumont at Reading
> University to ask about the server status.

Mark has moved to Bristol:
http://www.maths.bris.ac.uk/people/profile/mamab

FDist and DFDist are available here now:
http://www.maths.bris.ac.uk/~mamab/

We need to update the Biopython documentation (and check
those versions from Bristol still work with our tests).

Tiago, could you handle that?

Thanks,

Peter


From arklenna at gmail.com  Mon Aug  5 13:11:19 2013
From: arklenna at gmail.com (Lenna Peterson)
Date: Mon, 5 Aug 2013 09:11:19 -0400
Subject: [Biopython-dev] Bugzilla --> RedMine --> GitHub issues?
In-Reply-To: <CAKVJ-_7oiLZM0EOEE_Y_Z6Ob-sYdk2KM518DoaQbohRxyhiXyA@mail.gmail.com>
References: <CAKVJ-_7U8HW4wa657oEYsR=vC=+cXV1O1nREps118O6F1uYjTQ@mail.gmail.com>
	<CADEGkF6LAVmLd1SmVp5UNexaWe5irzxD+9NHm2kAvR7r0KmxXA@mail.gmail.com>
	<CAKVJ-_7Ae2wsZurbNDjTHnAEEwtESDjqKMDbqVOcy66-emeW3w@mail.gmail.com>
	<CAHQkFddA8VHDvAmt_ThwfhRHTjF5HptZCE6xayvJH1aW2nLqYg@mail.gmail.com>
	<CAKVJ-_4C-Qf4qSsWCCasc5Mv6r93rDgiZX1f-imyF5joN+PjvA@mail.gmail.com>
	<CAKVJ-_4DLafSQ6NPnda_BUf1eRQhVLUHGdi834K-4RpBfS9uEg@mail.gmail.com>
	<CAKVJ-_7oiLZM0EOEE_Y_Z6Ob-sYdk2KM518DoaQbohRxyhiXyA@mail.gmail.com>
Message-ID: <CAHQkFdc=7NUYsaCjy5cgAHL1Bo14idGonxcETRSo04gaOzjjZA@mail.gmail.com>

Peter,

I haven't been able to connect to RedMine for a few days. I just got an
error page saying RoR couldn't start or connect to the MySQL server.

Cheers,

Lenna


On Mon, Jul 22, 2013 at 10:36 AM, Peter Cock <p.j.a.cock at googlemail.com> wrote:

> On Mon, Jul 22, 2013 at 12:43 PM, Peter Cock <p.j.a.cock at googlemail.com>
> wrote:
> >
> > Well this isn't tomorrow - but I'm back from BOSC 2013 in Germany now.
> >
> > In the absence of any dissenting views, and the fact that RedMine is
> > also offline right now (which I've raised with the OBF admin volunteers),
>
> Fixed again :)
>
> > I've enabled GitHub issues & linked to this from the main page:
> >
> > https://github.com/biopython/biopython/issues
> >
> > You'll notice there are already lots of issues there - all pull request
> > related. This is one reason why an automated import of the old
> > Bugzilla/RedMine issues could be complicated.
> >
> > Various other bits of our documentation will need to be updated...
>
> Hopefully done now, e.g.
>
> https://github.com/biopython/biopython/commit/e836f4fadde494a8253b4a4114a36ff3259eb079
>
> Note that there doesn't seem to be a way to turn off new issues in
> a RedMine project - there are hacks via removing the ability from
> the roles, but I fear that would affect the other projects still using
> the RedMine server (e.g. BioPerl).
>
> Instead we may just have to do the triage/migration and then
> drop the links to the old RedMine server from the website etc.
>
> Peter


From p.j.a.cock at googlemail.com  Mon Aug  5 13:43:19 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Mon, 5 Aug 2013 14:43:19 +0100
Subject: [Biopython-dev] 1.62b test coverage report
In-Reply-To: <CAKVJ-_55Mbhsqz17PHEVBm5w0zR=Uf+TNM4Acq8FNxGDSpcNDw@mail.gmail.com>
References: <CA+ijMs=yT-=CFr+qwkOZ107oBN0wEFdjC9uMFCh+j1YfDD4DZw@mail.gmail.com>
	<CAKVJ-_5+mSW5NkmxN94w-8qu+e=q4COyZaAx6UzCmrwuJdU9aQ@mail.gmail.com>
	<CAMC681mkn3FBBdyEMTkba1T60KoQmcR1NHLKHNLc1pUo9A5rWw@mail.gmail.com>
	<CA+ijMs=XOe5Q6cE5vCg_OdnjcTGA=ZjuCJVCjcWY+reW1=jnnQ@mail.gmail.com>
	<CAKVJ-_6Rp5RC4pM1wMJ4k2qyZKhOCqWmd8e-x+ak7dQqHOyXqw@mail.gmail.com>
	<CAKVJ-_4Cc2VQQBe0Jf_n0kNC9nEozPAXp75Ltv86s5kwqGiSdQ@mail.gmail.com>
	<CA+ijMsm-hCNoekmFY5gPOMb_TpfUXj4Mkb2TE7bhOHR-DSEOgA@mail.gmail.com>
	<CAKVJ-_6ZTQBeScaDOsOEUq3rm7GaR-KE0zV4NAvDP+WKcxY=WQ@mail.gmail.com>
	<CAKVJ-_55Mbhsqz17PHEVBm5w0zR=Uf+TNM4Acq8FNxGDSpcNDw@mail.gmail.com>
Message-ID: <CAKVJ-_5dL-Fsd3UtDneGgLR2PJVDfHaFFbi6+tSeYqV1DZYNNw@mail.gmail.com>

On Mon, Aug 5, 2013 at 1:14 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> On Mon, Aug 5, 2013 at 12:46 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
>> On Mon, Aug 5, 2013 at 2:28 AM, Ben Fulton <ben at benfulton.net> wrote:
>>>
>>> The site http://www.rubic.rdg.ac.uk/~mab/software.html is down, and I can't
>>> find anywhere else to install the PopGen software from.
>>>
>>
>> There seems to be a fairly recent snapshot on archive.org,
>> http://web.archive.org/web/20120510013219/http://www.rubic.rdg.ac.uk/~mab/software.html
>>
>> Meanwhile, I have emailed Dr. Mark Beaumont at Reading
>> University to ask about the server status.
>
> Mark has moved to Bristol:
> http://www.maths.bris.ac.uk/people/profile/mamab
>
> FDist and DFDist are available here now:
> http://www.maths.bris.ac.uk/~mamab/
>
> We need to update the Biopython documentation (and check
> those versions from Bristol still work with our tests).
>
> Tiago, could you handle that?

According to his email auto-reply, Tiago is away right now.

I've updated a couple of URLs in the source code:
https://github.com/biopython/biopython/commit/70667063701041b73147c502c933fa8bfde1d850

Ben - did you see anything else which needs updating here?

Thanks,

Peter


From p.j.a.cock at googlemail.com  Mon Aug  5 14:01:12 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Mon, 5 Aug 2013 15:01:12 +0100
Subject: [Biopython-dev] Bugzilla --> RedMine --> GitHub issues?
In-Reply-To: <CAHQkFdc=7NUYsaCjy5cgAHL1Bo14idGonxcETRSo04gaOzjjZA@mail.gmail.com>
References: <CAKVJ-_7U8HW4wa657oEYsR=vC=+cXV1O1nREps118O6F1uYjTQ@mail.gmail.com>
	<CADEGkF6LAVmLd1SmVp5UNexaWe5irzxD+9NHm2kAvR7r0KmxXA@mail.gmail.com>
	<CAKVJ-_7Ae2wsZurbNDjTHnAEEwtESDjqKMDbqVOcy66-emeW3w@mail.gmail.com>
	<CAHQkFddA8VHDvAmt_ThwfhRHTjF5HptZCE6xayvJH1aW2nLqYg@mail.gmail.com>
	<CAKVJ-_4C-Qf4qSsWCCasc5Mv6r93rDgiZX1f-imyF5joN+PjvA@mail.gmail.com>
	<CAKVJ-_4DLafSQ6NPnda_BUf1eRQhVLUHGdi834K-4RpBfS9uEg@mail.gmail.com>
	<CAKVJ-_7oiLZM0EOEE_Y_Z6Ob-sYdk2KM518DoaQbohRxyhiXyA@mail.gmail.com>
	<CAHQkFdc=7NUYsaCjy5cgAHL1Bo14idGonxcETRSo04gaOzjjZA@mail.gmail.com>
Message-ID: <CAKVJ-_6EHt1Y-hjYSVg5HkCd=N-6Yacym4qRGAE+w6m+5svjxA@mail.gmail.com>

On Mon, Aug 5, 2013 at 2:11 PM, Lenna Peterson <arklenna at gmail.com> wrote:
> Peter,
>
> I haven't been able to connect to RedMine for a few days. I just got an
> error page saying RoR couldn't start or connect to the MySQL server.
>
> Cheers,
>
> Lenna

OK, Chris Dag has got RedMine to work again, and told
me what he did in case I need to restart if this happens
again. If any RedMine guru is reading and has some
thoughts on the cause and long term solution, drop us
an email please.

As to issue triage - I suggest you start with anything you
filed or commented on, then things you are familiar with.
But any order is fine really.

I suggest for "moving" an issue, we file the new GitHub
issue (linking to the old issue, but also trying to capture
any relevant information from the old bug tracker to be
self sufficient), and then close the old RedMine issue
with a link to its replacement.

Thanks,

Peter


From p.j.a.cock at googlemail.com  Mon Aug  5 14:26:32 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Mon, 5 Aug 2013 15:26:32 +0100
Subject: [Biopython-dev] Bio.XXX.Applications vs Bio.motifs.applications
Message-ID: <CAKVJ-_7LGhWHBrT1JDSkB2GyC9f-mToNVs=TD2nitP5FLskZtQ@mail.gmail.com>

Hi all,

I've noticed that as part of migrating from Bio.Motif to Bio.motifs,
the Applications module has acquired a lower case name.

Lower case module names are in principle a good thing (PEP8)
but elsewhere in Biopython the Applications modules are all
using title case.

Would a lower case shorter name be better, such as apps
(i.e. Bio.motifs.apps in this case)? This could also be adopted
in other modules for a gradual conversion if desired (e.g.
introduce Bio.Phylo.apps as an alias for Bio.Phylo.Applications).
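
One possible way to provide such an alias, sketched with stand-in modules
rather than the real Biopython packages: register the existing module
object under a second, lower-case name in sys.modules, so both import
paths resolve to the same object. The names demo_pkg, Applications, apps
and SomeCommandline below are placeholders for illustration only.

```python
import importlib
import sys
import types

# Stand-in parent package (hypothetical, built in memory for the demo).
pkg = types.ModuleType("demo_pkg")
pkg.__path__ = []  # mark it as a package
sys.modules["demo_pkg"] = pkg

# Stand-in for an existing title-case module with one public class.
legacy = types.ModuleType("demo_pkg.Applications")
legacy.SomeCommandline = type("SomeCommandline", (), {})
sys.modules["demo_pkg.Applications"] = legacy
pkg.Applications = legacy

# The alias: the same module object under a lower-case name.
sys.modules["demo_pkg.apps"] = legacy
pkg.apps = legacy

# Both spellings now import the identical module.
apps = importlib.import_module("demo_pkg.apps")
old = importlib.import_module("demo_pkg.Applications")
```

A thin apps.py that simply re-exports the names from Applications would
be another option; which is preferable probably depends on how long both
names are meant to coexist.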

What do people think?

Thanks,

Peter


From dalke at dalkescientific.com  Tue Aug  6 01:18:06 2013
From: dalke at dalkescientific.com (Andrew Dalke)
Date: Tue, 6 Aug 2013 03:18:06 +0200
Subject: [Biopython-dev] Adopting BSD 3-Clause license for Biopython?
In-Reply-To: <CAKVJ-_61GFD3QFEYKwhwyNXfvPnGQc9EHPk3CUs=7CbqErFjnw@mail.gmail.com>
References: <CAKVJ-_5i0M0LHWpR=eWcDEP-X-Dmm9jeggWY7aYdDFXhxO01xQ@mail.gmail.com>
	<CAKVJ-_61GFD3QFEYKwhwyNXfvPnGQc9EHPk3CUs=7CbqErFjnw@mail.gmail.com>
Message-ID: <9B34F2CB-2D39-40C5-A462-3C99CFB317D3@dalkescientific.com>

On Jul 24, 2013, at 11:13 AM, Peter Cock wrote:
> The current Biopython License is very short and liberal, and I have
> long described it as an MIT/BSD type licence. However the actual
> wording matches neither of these exactly (as far as I could tell):

That's my doing. When Jeff and I started Biopython in 1999 we
needed to choose a license. We started with the Python license,
which (for 1.5.2) was:

  Permission to use, copy, modify, and distribute this software and its
  documentation for any purpose and without fee is hereby granted,
  provided that the above copyright notice appear in all copies and that
  both that copyright notice and this permission notice appear in
  supporting documentation, and that the names of Stichting Mathematisch
  Centrum or CWI or Corporation for National Research Initiatives or
  CNRI not be used in advertising or publicity pertaining to
  distribution of the software without specific, written prior
  permission.

  While CWI is the initial source for this software, a modified version
  is made available by the Corporation for National Research Initiatives
  (CNRI) at the Internet address ftp://ftp.python.org.

  STICHTING MATHEMATISCH CENTRUM AND CNRI DISCLAIM ALL WARRANTIES WITH
  REGARD TO THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF
  MERCHANTABILITY AND FITNESS, IN NO EVENT SHALL STICHTING MATHEMATISCH
  CENTRUM OR CNRI BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL
  DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR
  PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER
  TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR
  PERFORMANCE OF THIS SOFTWARE.

Compare that to the Biopython license, with the alterations marked:

  Permission to use, copy, modify, and distribute this software
  and its documentation >>>with or without modifications<<< and for
  any purpose and without fee is hereby granted, provided that
  >>>any copyright notices<<< appear in all copies and that both
  >>>those copyright notices<<< and this permission notice appear
  in supporting documentation, and that the names of >>>the
  contributors or copyright holders<<< not be used in advertising
  or publicity pertaining to distribution of the software without
  specific prior permission.

  [2nd paragraph of original Python license omitted]

  >>>THE CONTRIBUTORS AND COPYRIGHT HOLDERS OF THIS SOFTWARE<<<
  DISCLAIM ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING
  ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS, IN NO EVENT
  SHALL >>>THE CONTRIBUTORS OR COPYRIGHT HOLDERS<<< BE LIABLE FOR
  ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
  WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER
  IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION,
  ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF
  THIS SOFTWARE.

This was called a "Python-style license", and you can see an
example at http://effbot.org/zone/copyright.htm . Indeed, his
PIL package is an example of a current Python module which
still uses that license:
  http://www.pythonware.com/products/pil/license.htm


You'll see that Fredrik Lundh refers to it as the "Historical
Permission Notice and Disclaimer", and points to:

  http://opensource.org/licenses/historical.php

Further note that the OSI comments that "This License has been
voluntarily deprecated by its author" .. whatever that
means ... and that http://opensource.org/proliferation-report
describes it as "redundant with more popular licenses", and
more specifically the BSD.


> In theory we could ask the OSI to approve our current license, but as
> they explain "yet another license" is not a good thing to encourage:
> http://opensource.org/proliferation

It wouldn't be a "yet another license" as it's already
registered with the OSI ... almost.

The one odd alteration I made was to add "with or without
modifications", because some people on comp.lang.python
expressed concern that "use, copy, modify, and distribute"
could be interpreted to be restrictive, as in "you can
modify the original source code, or distribute the original
source code, but you can't distribute the modified source
code". I've since learned that this is a hyper-picky
interpretation with no legal bearing.

I don't know if that "with or without modifications" is
different enough that the OSI would say it doesn't fall
under the 'Historical Permission Notice and Disclaimer'.


In any case, I agree with a relicensing. The current
license is from a bygone era. Nowadays I just pick the MIT
license.

If there's anything copyright by me still remaining in
Biopython, I hereby relicense it under the MIT and/or one
of the standard n-clause BSD licenses, at your choice.


Cheers,

				Andrew
				dalke at dalkescientific.com


From p.j.a.cock at googlemail.com  Tue Aug  6 09:11:33 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Tue, 6 Aug 2013 10:11:33 +0100
Subject: [Biopython-dev] Adopting BSD 3-Clause license for Biopython?
In-Reply-To: <9B34F2CB-2D39-40C5-A462-3C99CFB317D3@dalkescientific.com>
References: <CAKVJ-_5i0M0LHWpR=eWcDEP-X-Dmm9jeggWY7aYdDFXhxO01xQ@mail.gmail.com>
	<CAKVJ-_61GFD3QFEYKwhwyNXfvPnGQc9EHPk3CUs=7CbqErFjnw@mail.gmail.com>
	<9B34F2CB-2D39-40C5-A462-3C99CFB317D3@dalkescientific.com>
Message-ID: <CAKVJ-_4Vtp6+_vNmH0wP9r3z=Egm-u-e_+bqGkdgQHsr+BGEHg@mail.gmail.com>

On Tue, Aug 6, 2013 at 2:18 AM, Andrew Dalke <dalke at dalkescientific.com> wrote:
> On Jul 24, 2013, at 11:13 AM, Peter Cock wrote:
>> The current Biopython License is very short and liberal, and I have
>> long described it as an MIT/BSD type licence. However the actual
>> wording matches neither of these exactly (as far as I could tell):
>
> That's my doing. When Jeff and I started Biopython in 1999 we
> needed to choose a license. We started with the Python license,
> which (for 1.5.2) was:
>
> ...

Ah - with hindsight I should have checked the older Python
licenses, but I was thinking more of their current very long
version.

> You'll see that Fredrik Lundh refers to it as the "Historical
> Permission Notice and Disclaimer", and points to:
>
>   http://opensource.org/licenses/historical.php
>
> Further note that the OSI comments that "This License has been
> voluntarily deprecated by its author" .. whatever that
> means ... and that http://opensource.org/proliferation-report
> describes it as "redundant with more popular licenses", and
> more specifically the BSD.
>
>> In theory we could ask the OSI to approve our current license, but as
>> they explain "yet another license" is not a good thing to encourage:
>> http://opensource.org/proliferation
>
> It wouldn't be a "yet another license" as it's already
> registered with the OSI ... almost.
>
> The one odd alteration I made was to add "with or without
> modifications", because some people on comp.lang.python
> expressed concern that "use, copy, modify, and distribute"
> could be interpreted to be restrictive, as in "you can
> modify the original source code, or distribute the original
> source code, but you can't distribute the modified source
> code". I've since learned that this is a hyper-picky
> interpretation with no legal bearing.
>
> I don't know if that "with or without modifications" is
> different enough that the OSI would say it doesn't fall
> under the 'Historical Permission Notice and Disclaimer'.

Thanks for that background information. Educational.

> In any case, I agree with a relicensing. The current
> license is from a bygone era. Nowadays I just pick the MIT
> license.
>
> If there's anything copyright by me still remaining in
> Biopython, I hereby relicense it under the MIT and/or one
> of the standard n-clause BSD licenses, at your choice.

That's great Andrew - thank you,

Peter


From p.j.a.cock at googlemail.com  Tue Aug  6 22:51:22 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Tue, 6 Aug 2013 23:51:22 +0100
Subject: [Biopython-dev] Adjusting the xxMotif wrapper / Bio.Application
	plans
Message-ID: <CAKVJ-_4S7NeyFmjss5hEUN+2NVBGT4Lmd0AD_FgWPeO9LOymLg@mail.gmail.com>

Hi Christian et al.,

I've just noticed something in the XXmotif wrapper which
I should have raised back in November 2012 when it was
committed. This is to do with the way the options were
defined, e.g.

      _Option(["--negSet", "negSet", "negset", "NEGSET"],
                   "sequence set which has to be used as a reference set",
                   filename = True,
                   equate = False),

The first argument is a list of names, aliases which can
be used via the (legacy) set_parameter method. Of
these the first is what goes in the actual command
string, and the last must be a valid Python identifier
and becomes a property and a keyword argument
for the __init__ method (and ideally follow PEP8
guidelines).
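
To make the alias convention concrete, here is a minimal sketch. It
does not use the real Bio.Application classes: OptionSketch and its
attribute names (flag, attribute, value) are hypothetical stand-ins,
and the file name is made up. It only illustrates "first name goes on
the command line, last name is the Python keyword".

```python
class OptionSketch:
    """Hypothetical stand-in for an application option (illustration only)."""

    def __init__(self, names, description, equate=True):
        self.names = names            # every alias, in the documented order
        self.flag = names[0]          # first entry: used in the command string
        self.attribute = names[-1]    # last entry: property / keyword name
        self.description = description
        self.equate = equate          # True -> "--opt=value", False -> "--opt value"
        self.value = None

    def __str__(self):
        # An unset option contributes nothing to the command line.
        if self.value is None:
            return ""
        separator = "=" if self.equate else " "
        return "%s%s%s" % (self.flag, separator, self.value)


# Two names only: the command line switch and a PEP8-style identifier.
opt = OptionSketch(["--negSet", "negset"],
                   "sequence set which has to be used as a reference set",
                   equate=False)
opt.value = "background.fasta"
print(opt)  # --negSet background.fasta
```

With more than two names, only the first and last have a special role
in this sketch; anything in between is just an extra alias.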

Normally the _Option would just have TWO aliases,
in this case ["--negSet", "negset"] would seem best.

Clearly I'd not documented this well enough, but
I've tried to make this more explicit now:
https://github.com/biopython/biopython/commit/39a88714ab7ee7a8dc4ed2b7a7ea71569fdd4293

Was there a special reason for all these case variants
in the XXmotif options??

We could perhaps just change this now in the newer
Bio.motifs module, despite this being live in the
Biopython 1.61 release... since right now the nasty
all upper case aliases are being used as the property
names and keyword names. But that could break a
few scripts already using the XXmotif wrapper in
Bio.motifs.applications.

Looking ahead, other than set_parameter, all the other
legacy bits in Bio.Application have all been removed -
so we could take a fresh look at whether we can transition to
a more explicit application definition, which I hope is
possible with the class files defining these properties
explicitly (perhaps with decorators for things like
validation methods) - rather than implicitly as now
via the __init__ method which doesn't suit things
like autogenerated API docs.

There may be a catch in how to best make the
parameter order explicit (currently done via the
parameters being in a list) which can be vital for
many command line tools.

Regards,

Peter


From christian at brueffer.de  Thu Aug  8 10:37:19 2013
From: christian at brueffer.de (Christian Brueffer)
Date: Thu, 08 Aug 2013 12:37:19 +0200
Subject: [Biopython-dev] Adjusting the xxMotif wrapper / Bio.Application
	plans
In-Reply-To: <CAKVJ-_4S7NeyFmjss5hEUN+2NVBGT4Lmd0AD_FgWPeO9LOymLg@mail.gmail.com>
References: <CAKVJ-_4S7NeyFmjss5hEUN+2NVBGT4Lmd0AD_FgWPeO9LOymLg@mail.gmail.com>
Message-ID: <520374DF.9070301@brueffer.de>

On 8/7/13 0:51 , Peter Cock wrote:
> Hi Christian et al.,
> 
> I've just noticed something in the XXmotif wrapper which
> I should have raised back in November 2012 when it was
> committed. This is to do with the way the options were
> defined, e.g.
> 
>       _Option(["--negSet", "negSet", "negset", "NEGSET"],
>                    "sequence set which has to be used as a reference set",
>                    filename = True,
>                    equate = False),
> 
> The first argument is a list of names, aliases which can
> be used via the (legacy) set_parameter method. Of
> these the first is what goes in the actual command
> string, and the last must be a valid Python identifier
> and becomes a property and a keyword argument
> for the __init__ method (and ideally follow PEP8
> guidelines).
> 

Yeah, unfortunately I wasn't aware of this detail.

> Normally the _Option would just have TWO aliases,
> in this case ["--negSet", "negset"] would seem best.
> 
> Clearly I'd not documented this well enough, but
> I've tried to make this more explicit now:
> https://github.com/biopython/biopython/commit/39a88714ab7ee7a8dc4ed2b7a7ea71569fdd4293
> 
> Was there a special reason for all these case variants
> in the XXmotif options??
> 

I basically followed the example set by
Bio/Align/Applications/_Clustalw.py.  The "rationale" was to allow for
people to use their favourite spelling variety.

I guess it was bad luck this happened to serve as an example, as it
was the first piece of code I ever touched in Biopython.

It would be nice to streamline all application wrappers in this regard
sometime...

Chris


From p.j.a.cock at googlemail.com  Thu Aug  8 11:00:22 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Thu, 8 Aug 2013 12:00:22 +0100
Subject: [Biopython-dev] Adjusting the xxMotif wrapper / Bio.Application
	plans
In-Reply-To: <520374DF.9070301@brueffer.de>
References: <CAKVJ-_4S7NeyFmjss5hEUN+2NVBGT4Lmd0AD_FgWPeO9LOymLg@mail.gmail.com>
	<520374DF.9070301@brueffer.de>
Message-ID: <CAKVJ-_5BRKC9Du28dNxZRkSsjKKFqn0vcVRfgD6eBZd0oNr+CQ@mail.gmail.com>

On Thu, Aug 8, 2013 at 11:37 AM, Christian Brueffer
<christian at brueffer.de> wrote:
>>
>> Was there a special reason for all these case variants
>> in the XXmotif options??
>
> I basically followed the example set by
> Bio/Align/Applications/_Clustalw.py.

Ah. Without checking I think maybe the ClustalW documentation
used both cases - but the order was deliberately with the lower
case one last as that was used in the Python object as the
property name and keyword.

> The "rationale" was to allow for people to use their favourite
> spelling variety.
>
> I guess it was bad luck this happened to serve as an example, as it
> was the first piece of code I ever touched in BioPython.
>
> It would be nice to streamline all application wrappers in this regard
> sometime...

Yeah, perhaps we can formally deprecate set_parameter in
the next release which means all the aliases 'go away' and
that leaves us with just the final entry exposed as the usable
property name and keyword.

Peter


From arklenna at gmail.com  Thu Aug  8 19:54:58 2013
From: arklenna at gmail.com (Lenna Peterson)
Date: Thu, 8 Aug 2013 15:54:58 -0400
Subject: [Biopython-dev] PDB occupancy behavior
Message-ID: <CAHQkFdd9W+OAMuiKUWz7LoPKo_BudzyZgu_5gaGesuwfV2NSZw@mail.gmail.com>

Hi all,

I just submitted a pull request I'd like wider feedback on.

https://github.com/biopython/biopython/pull/207

In summary, I am using software-produced PDB files that simply stop after
the coordinate data, so occupancy data is missing. Currently, the Biopython
PDBParser sets missing or blank occupancy to 0.0. I am suggesting changing
this to 1.0.

I would like to see if anyone knows of situations in which this would be a
bad idea.

Cheers,

Lenna


From anaryin at gmail.com  Thu Aug  8 20:02:39 2013
From: anaryin at gmail.com (João Rodrigues)
Date: Thu, 8 Aug 2013 13:02:39 -0700
Subject: [Biopython-dev] [Biopython] PDB occupancy behavior
In-Reply-To: <CAHQkFdd9W+OAMuiKUWz7LoPKo_BudzyZgu_5gaGesuwfV2NSZw@mail.gmail.com>
References: <CAHQkFdd9W+OAMuiKUWz7LoPKo_BudzyZgu_5gaGesuwfV2NSZw@mail.gmail.com>
Message-ID: <CAJ9sUYNLHwpGsSWXC4k-bEdGeao60ZvCf+hnKKgTH3GrZF+8SA@mail.gmail.com>

Hi Lenna,

As I mentioned in the Github email, I think it's fine. It doesn't matter if
the occupancy is 0 or 1 in case of a model most of the time. I agree with
it. The only bad thing I can think about is having occupancy for a certain
atom larger than 1 in some bogus cases but to be honest, no software that I
know of bothers checking that...

Cheers,

João


2013/8/8 Lenna Peterson <arklenna at gmail.com>

> Hi all,
>
> I just submitted a pull request I'd like wider feedback on.
>
> https://github.com/biopython/biopython/pull/207
>
> In summary, I am using software-produced PDB files that simply stop after
> the coordinate data, so occupancy data is missing. Currently, the Biopython
> PDBParser sets missing or blank occupancy to 0.0. I am suggesting changing
> this to 1.0.
>
> I would like to see if anyone knows of situations in which this would be a
> bad idea.
>
> Cheers,
>
> Lenna
> _______________________________________________
> Biopython mailing list  -  Biopython at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython
>


From p.j.a.cock at googlemail.com  Thu Aug  8 22:37:27 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Thu, 8 Aug 2013 23:37:27 +0100
Subject: [Biopython-dev] [Biopython] PDB occupancy behavior
In-Reply-To: <AA0AA81E-B2C5-4CF3-97B3-AC9A51D25999@nyumc.org>
References: <CAHQkFdd9W+OAMuiKUWz7LoPKo_BudzyZgu_5gaGesuwfV2NSZw@mail.gmail.com>
	<CAJ9sUYNLHwpGsSWXC4k-bEdGeao60ZvCf+hnKKgTH3GrZF+8SA@mail.gmail.com>
	<AA0AA81E-B2C5-4CF3-97B3-AC9A51D25999@nyumc.org>
Message-ID: <CAKVJ-_5o_sef+fUo6e-=Hfo=wiS4FrB8hS72CN31Yh2vdw4waw@mail.gmail.com>

Thanks everyone - that seems like a clear consensus, patch applied :)

Peter

On Thu, Aug 8, 2013 at 9:30 PM, Sampson, Jared <Jared.Sampson at nyumc.org> wrote:
> Thanks, Lenna and João -
>
> I also agree, 1.0 is a better default occupancy value.  For most
> structural manipulation purposes, unless specified otherwise, we must assume
> the atoms listed are present in the structure at full occupancy.  Setting a
> reduced occupancy can be useful for partially bound ligands, disordered
> loops, and so forth, but doing so is the exception, not the rule.
>
> Cheers,
> Jared
>
> --
> Jared Sampson
> Xiangpeng Kong Lab
> NYU Langone Medical Center
> Old Public Health Building, Room 610
> 341 East 25th Street
> New York, NY 10016
> 212-263-7898
> http://kong.med.nyu.edu/
>
>
>
>
> On Aug 8, 2013, at 4:02 PM, João Rodrigues
> <anaryin at gmail.com> wrote:
>
> Hi Lenna,
>
> As I mentioned in the Github email, I think it's fine. It doesn't matter
> if the occupancy is 0 or 1 in case of a model most of the time. I agree
> with it. The only bad thing I can think about is having occupancy for
> a certain atom larger than 1 in some bogus cases but to be honest,
> no software that I know of bothers checking that...
>
> Cheers,
>
> João
>
>
> 2013/8/8 Lenna Peterson <arklenna at gmail.com>
>
> Hi all,
>
> I just submitted a pull request I'd like wider feedback on.
>
> https://github.com/biopython/biopython/pull/207
>
> In summary, I am using software-produced PDB files that simply stop after
> the coordinate data, so occupancy data is missing. Currently, the
> Biopython PDBParser sets missing or blank occupancy to 0.0. I am
> suggesting changing this to 1.0.
>
> I would like to see if anyone knows of situations in which this would be a
> bad idea.
>
> Cheers,
>
> Lenna


From ben at benfulton.net  Fri Aug  9 01:03:10 2013
From: ben at benfulton.net (Ben Fulton)
Date: Thu, 8 Aug 2013 21:03:10 -0400
Subject: [Biopython-dev] 1.62b test coverage report
In-Reply-To: <CAKVJ-_5dL-Fsd3UtDneGgLR2PJVDfHaFFbi6+tSeYqV1DZYNNw@mail.gmail.com>
References: <CA+ijMs=yT-=CFr+qwkOZ107oBN0wEFdjC9uMFCh+j1YfDD4DZw@mail.gmail.com>
	<CAKVJ-_5+mSW5NkmxN94w-8qu+e=q4COyZaAx6UzCmrwuJdU9aQ@mail.gmail.com>
	<CAMC681mkn3FBBdyEMTkba1T60KoQmcR1NHLKHNLc1pUo9A5rWw@mail.gmail.com>
	<CA+ijMs=XOe5Q6cE5vCg_OdnjcTGA=ZjuCJVCjcWY+reW1=jnnQ@mail.gmail.com>
	<CAKVJ-_6Rp5RC4pM1wMJ4k2qyZKhOCqWmd8e-x+ak7dQqHOyXqw@mail.gmail.com>
	<CAKVJ-_4Cc2VQQBe0Jf_n0kNC9nEozPAXp75Ltv86s5kwqGiSdQ@mail.gmail.com>
	<CA+ijMsm-hCNoekmFY5gPOMb_TpfUXj4Mkb2TE7bhOHR-DSEOgA@mail.gmail.com>
	<CAKVJ-_6ZTQBeScaDOsOEUq3rm7GaR-KE0zV4NAvDP+WKcxY=WQ@mail.gmail.com>
	<CAKVJ-_55Mbhsqz17PHEVBm5w0zR=Uf+TNM4Acq8FNxGDSpcNDw@mail.gmail.com>
	<CAKVJ-_5dL-Fsd3UtDneGgLR2PJVDfHaFFbi6+tSeYqV1DZYNNw@mail.gmail.com>
Message-ID: <CA+ijMs=S9gh2Ys4ac3kLUwco+zfyTJ=SK5eBJwC4AG4MFNLt7A@mail.gmail.com>

Everything else is passing. The PopGen files pass as well after installing
them from source.


On Mon, Aug 5, 2013 at 9:43 AM, Peter Cock <p.j.a.cock at googlemail.com> wrote:

> On Mon, Aug 5, 2013 at 1:14 PM, Peter Cock <p.j.a.cock at googlemail.com>
> wrote:
> > On Mon, Aug 5, 2013 at 12:46 PM, Peter Cock <p.j.a.cock at googlemail.com>
> wrote:
> >> On Mon, Aug 5, 2013 at 2:28 AM, Ben Fulton <ben at benfulton.net> wrote:
> >>>
> >>> The site http://www.rubic.rdg.ac.uk/~mab/software.html is down, and I
> can't
> >>> find anywhere else to install the PopGen software from.
> >>>
> >>
> >> There seems to be a fairly recent snapshot on archive.org,
> >>
> http://web.archive.org/web/20120510013219/http://www.rubic.rdg.ac.uk/~mab/software.html
> >>
> >> Meanwhile, I have emailed Dr. Mark Beaumont at Reading
> >> University to ask about the server status.
> >
> > Mark has moved to Bristol:
> > http://www.maths.bris.ac.uk/people/profile/mamab
> >
> > FDist and DFDist are available here now:
> > http://www.maths.bris.ac.uk/~mamab/
> >
> > We need to update the Biopython documentation (and check
> > those versions from Bristol still work with our tests).
> >
> > Tiago, could you handle that?
>
> According to his email auto-reply, Tiago is away right now.
>
> I've updated a couple of URLs in the source code:
>
> https://github.com/biopython/biopython/commit/70667063701041b73147c502c933fa8bfde1d850
>
> Ben - did you see anything else which needs updating here?
>
> Thanks,
>
> Peter
>


From mok at bioxray.dk  Fri Aug  9 08:39:55 2013
From: mok at bioxray.dk (Morten Kjeldgaard)
Date: Fri, 9 Aug 2013 10:39:55 +0200
Subject: [Biopython-dev] PDB occupancy behavior
Message-ID: <F019C881-C8A3-4D8E-830F-0D1E30739622@bioxray.dk>

Lenna wrote:

>  In summary, I am using software-produced PDB files that simply stop after
>  the coordinate data, so occupancy data is missing. Currently, the Biopython
>  PDBParser sets missing or blank occupancy to 0.0. I am suggesting changing
>  this to 1.0.

I think it is an incorrect default behaviour to set the occupancy to 1 if it's not present in the file. If the occupancy is not there, you can't say anything about it, and it should be set to 0, so the current defaults are correct IMO.

If, for some reason, you NEED the occupancy to be 1, and it is not, it is very simple to write a loop modifying it. I.e. special needs should be taken care of in the user's program, not Bio.PDB.
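
As a rough sketch of such a loop (not taken from any actual script):
the helper name fill_missing_occupancy and the FakeAtom class are
hypothetical, used so the example runs on its own. The only assumption
about Bio.PDB is that its atoms offer get_occupancy() and
set_occupancy(), so structure.get_atoms() could be passed in place of
the list.

```python
def fill_missing_occupancy(atoms, missing=0.0, replacement=1.0):
    """Set occupancy to `replacement` where it equals `missing`.

    Returns the number of atoms that were changed.
    """
    changed = 0
    for atom in atoms:
        if atom.get_occupancy() == missing:
            atom.set_occupancy(replacement)
            changed += 1
    return changed


class FakeAtom:
    """Hypothetical stand-in exposing the two accessor methods used above."""

    def __init__(self, occupancy):
        self._occupancy = occupancy

    def get_occupancy(self):
        return self._occupancy

    def set_occupancy(self, occupancy):
        self._occupancy = occupancy


atoms = [FakeAtom(0.0), FakeAtom(0.5), FakeAtom(0.0)]
print(fill_missing_occupancy(atoms))        # 2
print([a.get_occupancy() for a in atoms])   # [1.0, 0.5, 1.0]
```

Note that a loop like this cannot tell a genuinely reported 0.0 apart
from a parser default of 0.0, which is part of what is being debated.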

Cheers,
Morten

-- 
Morten Kjeldgaard, asc. professor, MSc, PhD
Dept. of Molecular Biology and Genetics, Aarhus University
Gustav Wieds Vej 10C, Building 3135, DK-8000 Aarhus C, Denmark.


From mok at bioxray.dk  Fri Aug  9 08:33:37 2013
From: mok at bioxray.dk (Morten Kjeldgaard)
Date: Fri, 9 Aug 2013 10:33:37 +0200
Subject: [Biopython-dev] Redmine issue 2727 ready for pull
Message-ID: <0743AFDE-D1B2-4348-AFFE-3CE5CC227FE4@bioxray.dk>

Hi,

I've finally gotten around to following up on a very old patch I sent to the redmine bug tracker [1]. The patch addresses the problem that Bio.PDB does not parse the important CRYST1 record.  In the bug comments, Peter Cock asked to include the explanation of the new keys in the docstring. That has now been done.

Peter also asks about the default values chosen (if the CRYST1 header is not present). These are probably universally chosen default values in various crystallographic programs, and these values are also used in PDB entries containing NMR entries, for example.

My github branch containing the patch #2727 is in [2]. I am using Bio.PDB quite a lot, and I would like to contribute more to it in the future.

Cheers,
Morten


[1] https://redmine.open-bio.org/issues/2727
[2] https://github.com/mok0/biopython/tree/pdbwork


From p.j.a.cock at googlemail.com  Fri Aug  9 08:47:15 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 9 Aug 2013 09:47:15 +0100
Subject: [Biopython-dev] PDB occupancy behavior
In-Reply-To: <F019C881-C8A3-4D8E-830F-0D1E30739622@bioxray.dk>
References: <F019C881-C8A3-4D8E-830F-0D1E30739622@bioxray.dk>
Message-ID: <CAKVJ-_5=Z6aQ4gsCT3MXa3Rk-9RaGbt5vt7pAPdS6i=ufuiPyg@mail.gmail.com>

On Fri, Aug 9, 2013 at 9:39 AM, Morten Kjeldgaard <mok at bioxray.dk> wrote:

> Lenna wrote:
>
> >  In summary, I am using software-produced PDB files that simply stop
> >  after the coordinate data, so occupancy data is missing. Currently,
> >  the Biopython PDBParser sets missing or blank occupancy to 0.0. I am
> >  suggesting changing this to 1.0.
>
> I think it is an incorrect default behaviour to set the occupancy
> to 1 if it's not present in the file. If the occupancy is not there,
> you can't say anything about it, and it should be set to 0, so the
> current defaults are correct IMO.
>
> If, for some reason, you NEED the occupancy to be 1, and it
> is not, it is very simple to write a loop modifying it. I.e. special
> needs should be taken care of in the users program, not Bio.PDB.
>
> Cheers,
> Morten
>

How about the special float values NaN or NA instead?
Or the Python special value None?

Peter


From mok at bioxray.dk  Fri Aug  9 09:07:13 2013
From: mok at bioxray.dk (Morten Kjeldgaard)
Date: Fri, 9 Aug 2013 11:07:13 +0200
Subject: [Biopython-dev] PDB occupancy behavior
In-Reply-To: <CAKVJ-_5=Z6aQ4gsCT3MXa3Rk-9RaGbt5vt7pAPdS6i=ufuiPyg@mail.gmail.com>
References: <F019C881-C8A3-4D8E-830F-0D1E30739622@bioxray.dk>
	<CAKVJ-_5=Z6aQ4gsCT3MXa3Rk-9RaGbt5vt7pAPdS6i=ufuiPyg@mail.gmail.com>
Message-ID: <3626CAF5-41E2-43C7-8C0E-49FC83786EE0@bioxray.dk>

On 09/08/2013, at 10:47, Peter Cock <p.j.a.cock at googlemail.com> wrote:

> How about the special float values NaN or NA instead?
> Or the Python special value None?

TBH I don't think there is any good reason to change the current defaults. On the contrary, we should be careful when changing default values since this might break users' programs.

My point is that Lenna wants to read files that do not follow the PDB standard, and so she needs to make provisions for that in her own program, not the toolkit.

Putting None in the value of a field that isn't there, but should be according to the format specification, is more reasonable, since it alerts the user to the fact that something is fishy. However, it should only be done this way if that is a philosophy used throughout the Biopython toolkit. Is it?

I would warn against using NaN since it is non-pythonic and a nightmare to deal with in practice.
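
A small sketch of the practical difference, using only the standard
library. The mean_occupancy helper is hypothetical and stands in for
whatever arithmetic a user's script might do with occupancies.

```python
import math

nan = float("nan")


def mean_occupancy(values):
    """Plain arithmetic mean, as a user's script might compute it."""
    return sum(values) / len(values)


# NaN never compares equal, not even to itself, so "== nan" tests fail.
print(nan == nan)                               # False

# NaN propagates silently through arithmetic; nothing flags the problem.
print(math.isnan(mean_occupancy([1.0, nan])))   # True

# None fails loudly at the first arithmetic use instead.
try:
    mean_occupancy([1.0, None])
except TypeError as err:
    print("TypeError:", err)
```

So NaN has to be tested with math.isnan() everywhere, while None shows
up as an exception the first time it is used as a number.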

Cheers,
Morten


From p.j.a.cock at googlemail.com  Fri Aug  9 11:06:46 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 9 Aug 2013 12:06:46 +0100
Subject: [Biopython-dev] PDB occupancy behavior
In-Reply-To: <3626CAF5-41E2-43C7-8C0E-49FC83786EE0@bioxray.dk>
References: <F019C881-C8A3-4D8E-830F-0D1E30739622@bioxray.dk>
	<CAKVJ-_5=Z6aQ4gsCT3MXa3Rk-9RaGbt5vt7pAPdS6i=ufuiPyg@mail.gmail.com>
	<3626CAF5-41E2-43C7-8C0E-49FC83786EE0@bioxray.dk>
Message-ID: <CAKVJ-_7chOdtn+H+947QG+j1j_++btNqTO326QByuT5Cx95Xng@mail.gmail.com>

On Fri, Aug 9, 2013 at 10:07 AM, Morten Kjeldgaard <mok at bioxray.dk> wrote:

> On 09/08/2013, at 10:47, Peter Cock <p.j.a.cock at googlemail.com> wrote:
>
> > How about the special float values NaN or NA instead?
> > Or the Python special value None?
>
> TBH I don't think there is any good reason to change the current defaults.
> On the contrary, we should be careful when changing default values since
> this might break users' programs.
>
> My point is, that Lenna wants to read files that does not follow the PDB
> standard, and so she needs to make provisions for that in her own program,
> not the toolkit.
>
>
Do you think this should be something handled differently in strict and
permissive mode? Should missing occupancy give a warning or error in strict
mode?

Peter


From arklenna at gmail.com  Fri Aug  9 13:07:41 2013
From: arklenna at gmail.com (Lenna Peterson)
Date: Fri, 9 Aug 2013 09:07:41 -0400
Subject: [Biopython-dev] PDB occupancy behavior
In-Reply-To: <CAKVJ-_7chOdtn+H+947QG+j1j_++btNqTO326QByuT5Cx95Xng@mail.gmail.com>
References: <F019C881-C8A3-4D8E-830F-0D1E30739622@bioxray.dk>
	<CAKVJ-_5=Z6aQ4gsCT3MXa3Rk-9RaGbt5vt7pAPdS6i=ufuiPyg@mail.gmail.com>
	<3626CAF5-41E2-43C7-8C0E-49FC83786EE0@bioxray.dk>
	<CAKVJ-_7chOdtn+H+947QG+j1j_++btNqTO326QByuT5Cx95Xng@mail.gmail.com>
Message-ID: <CAHQkFddQ2H4BV8+MpYL7QxBKWFo7C-1+cno8p6-AAxPaPzRErg@mail.gmail.com>

On Friday, 9 August 2013, Peter Cock wrote:

> On Fri, Aug 9, 2013 at 10:07 AM, Morten Kjeldgaard <mok at bioxray.dk>
> wrote:
>
> > On 09/08/2013, at 10:47, Peter Cock <p.j.a.cock at googlemail.com>
> > wrote:
> >
> > > How about the special float values NaN or NA instead?
> > > Or the Python special value None?
> >
> > TBH I don't think there is any good reason to change the current
> > defaults. On the contrary, we should be careful when changing
> > default values since this might break users' programs.
> >
> > My point is, that Lenna wants to read files that does not follow the
> > PDB standard, and so she needs to make provisions for that in her
> > own program, not the toolkit.
> >
> >
> Do you think this should be something handled differently in strict and
> permissive mode? Should missing occupancy give a warning or error in strict
> mode?


(Resending to dev list)

None in permissive mode makes a lot of sense to me.

Missing occupancy is a fatal error in strict mode.
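
A sketch of that behaviour, assuming the fixed-column PDB layout where
occupancy sits in columns 55-60 of an ATOM record. This is not the
actual PDBParser code: parse_occupancy and MissingOccupancyError are
hypothetical names, and the sample record is built with a format string
purely to get the columns right.

```python
class MissingOccupancyError(Exception):
    """Hypothetical error type for this sketch (not a Bio.PDB class)."""


def parse_occupancy(line, permissive=True):
    """Return the occupancy from an ATOM/HETATM line.

    Occupancy is columns 55-60 (1-based), i.e. line[54:60]. If the field
    is blank or the line stops early, return None in permissive mode and
    raise in strict mode.
    """
    field = line[54:60].strip()
    if field:
        return float(field)
    if permissive:
        return None
    raise MissingOccupancyError("no occupancy in line: %r" % line)


# Build a well-formed ATOM record, then cut it off after the coordinates.
full = "ATOM  %5d %-4s%1s%3s %1s%4d%1s   %8.3f%8.3f%8.3f%6.2f%6.2f" % (
    1, " N", "", "MET", "A", 1, "", 27.340, 24.430, 2.614, 1.00, 9.67)
truncated = full[:54]

print(parse_occupancy(full))        # 1.0
print(parse_occupancy(truncated))   # None
try:
    parse_occupancy(truncated, permissive=False)
except MissingOccupancyError as err:
    print("strict mode:", err)
```

Whether permissive mode should also emit a warning is left out of the
sketch on purpose.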

Lenna


From p.j.a.cock at googlemail.com  Fri Aug  9 13:14:44 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 9 Aug 2013 14:14:44 +0100
Subject: [Biopython-dev] PDB occupancy behavior
In-Reply-To: <CAHQkFddQ2H4BV8+MpYL7QxBKWFo7C-1+cno8p6-AAxPaPzRErg@mail.gmail.com>
References: <F019C881-C8A3-4D8E-830F-0D1E30739622@bioxray.dk>
	<CAKVJ-_5=Z6aQ4gsCT3MXa3Rk-9RaGbt5vt7pAPdS6i=ufuiPyg@mail.gmail.com>
	<3626CAF5-41E2-43C7-8C0E-49FC83786EE0@bioxray.dk>
	<CAKVJ-_7chOdtn+H+947QG+j1j_++btNqTO326QByuT5Cx95Xng@mail.gmail.com>
	<CAHQkFddQ2H4BV8+MpYL7QxBKWFo7C-1+cno8p6-AAxPaPzRErg@mail.gmail.com>
Message-ID: <CAKVJ-_60-5tCa+dPwHEmtyMcr5O2v4xpZzNLnEwCwnRP4jfV+Q@mail.gmail.com>

On Fri, Aug 9, 2013 at 2:07 PM, Lenna Peterson <arklenna at gmail.com> wrote:
> On Friday, 9 August 2013, Peter Cock wrote:
>
>> On Fri, Aug 9, 2013 at 10:07 AM, Morten Kjeldgaard <mok at bioxray.dk>
>> wrote:
>>
>> > On 09/08/2013, at 10:47, Peter Cock <p.j.a.cock at googlemail.com>
>> wrote:
>> >
>> > > How about the special float values NaN or NA instead?
>> > > Or the Python special value None?
>> >
>> > TBH I don't think there is any good reason to change the current
>> defaults.
>> > On the contrary, we should be careful when changing default values since
>> > this might break users' programs.
>> >
>> > My point is that Lenna wants to read files that do not follow the PDB
>> > standard, and so she needs to make provisions for that in her own
>> > program, not the toolkit.
>> >
>> >
>> Do you think this should be something handled differently in strict and
>> permissive mode? Should missing occupancy give a warning or error in strict
>> mode?
>
> (Resending to dev list)
>
> None in permissive mode makes a lot of sense to me.
>
> Missing occupancy is a fatal error in strict mode.
>
> Lenna

Good (error in strict mode).

Do you think a warning in permissive mode for missing occupancy
is also worth adding, or would using None as the value indicate
that nicely?

Peter


From arklenna at gmail.com  Fri Aug  9 13:46:54 2013
From: arklenna at gmail.com (Lenna Peterson)
Date: Fri, 9 Aug 2013 09:46:54 -0400
Subject: [Biopython-dev] PDB occupancy behavior
In-Reply-To: <CAKVJ-_60-5tCa+dPwHEmtyMcr5O2v4xpZzNLnEwCwnRP4jfV+Q@mail.gmail.com>
References: <F019C881-C8A3-4D8E-830F-0D1E30739622@bioxray.dk>
	<CAKVJ-_5=Z6aQ4gsCT3MXa3Rk-9RaGbt5vt7pAPdS6i=ufuiPyg@mail.gmail.com>
	<3626CAF5-41E2-43C7-8C0E-49FC83786EE0@bioxray.dk>
	<CAKVJ-_7chOdtn+H+947QG+j1j_++btNqTO326QByuT5Cx95Xng@mail.gmail.com>
	<CAHQkFddQ2H4BV8+MpYL7QxBKWFo7C-1+cno8p6-AAxPaPzRErg@mail.gmail.com>
	<CAKVJ-_60-5tCa+dPwHEmtyMcr5O2v4xpZzNLnEwCwnRP4jfV+Q@mail.gmail.com>
Message-ID: <CAHQkFdf6cbYyCcXhZixP89RtP8a4-RV8MO0_y8goyVM-wReUyw@mail.gmail.com>

On Friday, 9 August 2013, Peter Cock wrote:

> On Fri, Aug 9, 2013 at 2:07 PM, Lenna Peterson <arklenna at gmail.com>
> wrote:
> > On Friday, 9 August 2013, Peter Cock wrote:
> >
> >> On Fri, Aug 9, 2013 at 10:07 AM, Morten Kjeldgaard <mok at bioxray.dk>
> >> wrote:
> >>
> >> > On 09/08/2013, at 10:47, Peter Cock <p.j.a.cock at googlemail.com>
> >> wrote:
> >> >
> >> > > How about the special float values NaN or NA instead?
> >> > > Or the Python special value None?
> >> >
> >> > TBH I don't think there is any good reason to change the current
> >> defaults.
> >> > On the contrary, we should be careful when changing default values
> since
> >> > this might break users' programs.
> >> >
> >> > My point is that Lenna wants to read files that do not follow the
> PDB
> >> > standard, and so she needs to make provisions for that in her own
> >> > program, not the toolkit.
> >> >
> >> >
> >> Do you think this should be something handled differently in strict and
> >> permissive mode? Should missing occupancy give a warning or error in
> strict
> >> mode?
> >
> > (Resending to dev list)
> >
> > None in permissive mode makes a lot of sense to me.
> >
> > Missing occupancy is a fatal error in strict mode.
> >
> > Lenna
>
> Good (error in strict mode).
>
> Do you think a warning in permissive mode for missing occupancy
> is also worth adding, or would using None as the value indicate
> that nicely?
>
> Peter
>


I have some concern about changing the type of an attribute but I imagine
any end user who cares about occupancy doesn't want spurious values of
either 1.0 or 0.0 anyway.

I'm not at a computer right now but I believe most problems in the PDB
parser are fatal in strict and warnings in permissive. So there should
already be a warning in place.

It occurred to me it would also be possible to create an "ultra-permissive"
mode designed for parsing computationally produced files, and suppress some
of the warnings (e.g. missing occupancy and B-factor). That way,
the current behavior could be left unchanged. Possibly a permissiveness
level (0 for strict, 1 for current permissive, 2 for even more permissive).
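
To make the idea concrete, here is a rough sketch of what a three-level
switch could look like. None of these names exist in Bio.PDB; the level
constants, the exception and warning classes, and report_problem() are all
made up for illustration.

```python
import warnings

# Hypothetical permissiveness levels (not part of Bio.PDB).
STRICT, PERMISSIVE, ULTRA_PERMISSIVE = 0, 1, 2


class PDBFormatError(Exception):
    """Hypothetical error raised for malformed records in strict mode."""


class PDBFormatWarning(Warning):
    """Hypothetical warning issued for malformed records in permissive mode."""


def report_problem(message, level):
    """Raise, warn, or stay silent depending on the permissiveness level."""
    if level == STRICT:
        raise PDBFormatError(message)
    if level == PERMISSIVE:
        warnings.warn(message, PDBFormatWarning)
    # ULTRA_PERMISSIVE: say nothing and let the caller carry on.
```

The parser would call report_problem() wherever it currently raises or
warns, so levels 0 and 1 would behave as they do today.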

Anyway, I'd be happy to implement any of these options (current parser to
None, restore previous behavior and None in a new permissiveness level,
other?) and of course update the unit test.

Cheers,

Lenna


From p.j.a.cock at googlemail.com  Fri Aug  9 14:22:29 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 9 Aug 2013 15:22:29 +0100
Subject: [Biopython-dev] PDB occupancy behavior
In-Reply-To: <CAHQkFdf6cbYyCcXhZixP89RtP8a4-RV8MO0_y8goyVM-wReUyw@mail.gmail.com>
References: <F019C881-C8A3-4D8E-830F-0D1E30739622@bioxray.dk>
	<CAKVJ-_5=Z6aQ4gsCT3MXa3Rk-9RaGbt5vt7pAPdS6i=ufuiPyg@mail.gmail.com>
	<3626CAF5-41E2-43C7-8C0E-49FC83786EE0@bioxray.dk>
	<CAKVJ-_7chOdtn+H+947QG+j1j_++btNqTO326QByuT5Cx95Xng@mail.gmail.com>
	<CAHQkFddQ2H4BV8+MpYL7QxBKWFo7C-1+cno8p6-AAxPaPzRErg@mail.gmail.com>
	<CAKVJ-_60-5tCa+dPwHEmtyMcr5O2v4xpZzNLnEwCwnRP4jfV+Q@mail.gmail.com>
	<CAHQkFdf6cbYyCcXhZixP89RtP8a4-RV8MO0_y8goyVM-wReUyw@mail.gmail.com>
Message-ID: <CAKVJ-_7yb2pexpJqTbGmTGs3aW5goJBxBZAq2EJOhxCugOursQ@mail.gmail.com>

On Fri, Aug 9, 2013 at 2:46 PM, Lenna Peterson <arklenna at gmail.com> wrote:
> On Friday, 9 August 2013, Peter Cock wrote:
>>
>> Good (error in strict mode).
>>
>> Do you think a warning in permissive mode for missing occupancy
>> is also worth adding, or would using None as the value indicate
>> that nicely?
>>
>> Peter
>
>
>
> I have some concern about changing the type of an attribute but I imagine
> any end user who cares about occupancy doesn't want spurious values of
> either 1.0 or 0.0 anyway.
>
> I'm not at a computer right now but I believe most problems in the PDB
> parser are fatal in strict and warnings in permissive. So there should
> already be a warning in place.
>
> It occurred to me it would also be possible to create an "ultra-permissive"
> mode designed for parsing computationally produced files, and suppress some
> of the warnings (e.g. missing occupancy and B-factor). That way, the current
> behavior could be left unchanged. Possibly a permissiveness level (0 for
> strict, 1 for current permissive, 2 for even more permissive).
>
> Anyway, I'd be happy to implement any of these options (current parser to
> None, restore previous behavior and None in a new permissiveness level,
> other?) and of course update the unit test.

You should be able to silence the PDB warnings in two lines anyway,
so I don't think we really need an ultra-permissive no-warnings mode.
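
For illustration, a minimal sketch of the two-line approach using only the
standard warnings module. The warning class below is a local stand-in so the
snippet runs without Biopython; with Biopython installed I would expect you
to import PDBConstructionWarning from Bio.PDB.PDBExceptions instead. The
load_structure() function is a made-up placeholder for a parser call.

```python
import warnings


# Stand-in for Bio.PDB.PDBExceptions.PDBConstructionWarning (assumption:
# with Biopython installed you would import the real class instead).
class PDBConstructionWarning(Warning):
    pass


def load_structure():
    """Hypothetical parser call that warns about a malformed ATOM record."""
    warnings.warn("Missing occupancy in atom CA", PDBConstructionWarning)
    return "structure"


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # record anything that is not ignored
    # The "two lines": the import above plus this filter.
    warnings.simplefilter("ignore", PDBConstructionWarning)
    structure = load_structure()

# 'caught' holds the warnings that got through the filter.
print(len(caught))
```

Because the filter is keyed on the warning category, other warnings are
still reported as usual.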

Peter


From anaryin at gmail.com  Fri Aug  9 17:26:59 2013
From: anaryin at gmail.com (=?UTF-8?Q?Jo=C3=A3o_Rodrigues?=)
Date: Fri, 9 Aug 2013 10:26:59 -0700
Subject: [Biopython-dev] Moratorium on commits?
Message-ID: <CAJ9sUYO2sHeQRFsa1A-NYt-_9wSmZ6v7S4knMek_eyoZhhAzZw@mail.gmail.com>

Dear all,

The situation with the occupancy in the PDBParser led me to think of one
thing.

Since not everybody is in the same timezone, has the same availability,
etc., how about we introduce a brief moratorium on commits of, say, 3 days
(except for critical bug fixes)? This would probably give everybody enough
time to read the email and give their opinion.

The downside is that it will make things roll a bit slower, but then again,
3 days is not so much.

Cheers,

João


From p.j.a.cock at googlemail.com  Fri Aug  9 19:06:21 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 9 Aug 2013 20:06:21 +0100
Subject: [Biopython-dev] Moratorium on commits?
In-Reply-To: <CAJ9sUYO2sHeQRFsa1A-NYt-_9wSmZ6v7S4knMek_eyoZhhAzZw@mail.gmail.com>
References: <CAJ9sUYO2sHeQRFsa1A-NYt-_9wSmZ6v7S4knMek_eyoZhhAzZw@mail.gmail.com>
Message-ID: <CAKVJ-_4YeZycGLpXarODFK8Uyy-P8RiA7x23v-a7LT0V9CzPVA@mail.gmail.com>

On Fri, Aug 9, 2013 at 6:26 PM, João Rodrigues <anaryin at gmail.com> wrote:
> Dear all,
>
> The situation with the occupancy in the PDBParser led me to think of one
> thing.
>
> Since not everybody is in the same timezone, has the same availability,
> etc, what about we introduce a brief moratorium over commits of say 3 days
> (except for critical bug fixes)? This will give everybody probably enough
> time to read the email and give their opinion.
>
> The downside is that it will make things roll a bit slower but then again,
> 3 days is not so much..
>
> Cheers,
>
> João

I don't think that's really needed for small commits like
this which are simple to interpret. In this case there were
three opinions in favour of the idea, with a fourth counter
view appearing later, resulting in a further tweak.

Longer periods of discussion are far more important on
large code additions or major changes.

Peter


From arklenna at gmail.com  Sun Aug 11 00:43:36 2013
From: arklenna at gmail.com (Lenna Peterson)
Date: Sat, 10 Aug 2013 20:43:36 -0400
Subject: [Biopython-dev] Redmine issue 2727 ready for pull
In-Reply-To: <0743AFDE-D1B2-4348-AFFE-3CE5CC227FE4@bioxray.dk>
References: <0743AFDE-D1B2-4348-AFFE-3CE5CC227FE4@bioxray.dk>
Message-ID: <CAHQkFdcWA+f40nRXioD-_-gO=n0FHEs=BU5zVjbjy_rb8gJ0HA@mail.gmail.com>

Hi Morten,

I think this looks great. Why not submit a pull request?

Cheers,
Lenna


On Fri, Aug 9, 2013 at 4:33 AM, Morten Kjeldgaard <mok at bioxray.dk> wrote:

> Hi,
>
> I've finally gotten around to following up to a very old patch I sent to
> the redmine bug tracker [1]. The patch addresses the problem that Bio.PDB
> does not parse the important CRYST1 record.  In the bug comments, Peter
> Cock asked to include the explanation of the new keys in the docstring.
> That has now been done.
>
> Peter also asks about the default values chosen (if the CRYST1 header is
> not present). These are probably universally chosen default values in
> various crystallographic programs, and these values are also used in PDB
> entries containing NMR entries, for example.
>
> My github branch containing the patch #2727 is in [2]. I am using Bio.PDB
> quite a lot, and I would like to contribute more to it in the future.
>
> Cheers,
> Morten
>
>
> [1] https://redmine.open-bio.org/issues/2727
> [2] https://github.com/mok0/biopython/tree/pdbwork
> _______________________________________________
> Biopython-dev mailing list
> Biopython-dev at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython-dev
>


From mok at bioxray.dk  Sun Aug 11 18:33:05 2013
From: mok at bioxray.dk (Morten Kjeldgaard)
Date: Sun, 11 Aug 2013 20:33:05 +0200
Subject: [Biopython-dev] Redmine issue 2727 ready for pull
In-Reply-To: <CAHQkFdcWA+f40nRXioD-_-gO=n0FHEs=BU5zVjbjy_rb8gJ0HA@mail.gmail.com>
References: <0743AFDE-D1B2-4348-AFFE-3CE5CC227FE4@bioxray.dk>
	<CAHQkFdcWA+f40nRXioD-_-gO=n0FHEs=BU5zVjbjy_rb8gJ0HA@mail.gmail.com>
Message-ID: <BCB0465E-7439-4562-B746-7585D39E3C48@bioxray.dk>


On 11/08/2013, at 02:43, Lenna Peterson <arklenna at gmail.com> wrote:

> I think this looks great. Why not submit a pull request?

Thanks! Excuse me for my ignorance, but how do I submit a pull request? (I thought that is what I did by posting to the -dev list). 

Cheers,
Morten


From mok at bioxray.dk  Sun Aug 11 18:28:36 2013
From: mok at bioxray.dk (Morten Kjeldgaard)
Date: Sun, 11 Aug 2013 20:28:36 +0200
Subject: [Biopython-dev] Moratorium on commits?
In-Reply-To: <CAKVJ-_4YeZycGLpXarODFK8Uyy-P8RiA7x23v-a7LT0V9CzPVA@mail.gmail.com>
References: <CAJ9sUYO2sHeQRFsa1A-NYt-_9wSmZ6v7S4knMek_eyoZhhAzZw@mail.gmail.com>
	<CAKVJ-_4YeZycGLpXarODFK8Uyy-P8RiA7x23v-a7LT0V9CzPVA@mail.gmail.com>
Message-ID: <CEAB55B5-AC7B-46D6-A865-2CFBCD3DC93B@bioxray.dk>


On 09/08/2013, at 21:06, Peter Cock <p.j.a.cock at googlemail.com> wrote:

> On Fri, Aug 9, 2013 at 6:26 PM, João Rodrigues <anaryin at gmail.com> wrote:
>> Dear all,
>>
>> The situation with the occupancy in the PDBParser led me to think of one
>> thing.
>> 
>> Since not everybody is in the same timezone, has the same availability,
>> etc, what about we introduce a brief moratorium over commits of say 3 days
>> (except for critical bug fixes)? This will give everybody probably enough
>> time to read the email and give their opinion.
>> 
>> The downside is that it will make things roll a bit slower but then again,
>> 3 days is not so much..
>> 
>> Cheers,
>> 
>> João
> 
> I don't think that's really needed for small commits like
> this which are simple to interpret. In this case there were
> three opinions in favour of the idea, with a fourth counter
> view appearing later, resulting in a further tweak.
> 
> Longer periods of discussion are far more important on
> large code additions or major changes.

Sorry, but I don't agree that this is a "small commit". It may not be large in terms of number of bytes, but it is large in terms of impact, since it affects users' programs in unpredictable ways. Whenever a change is made that affects values returned to the user, it is worth spending a few days discussing it,  to let people have a chance to think through the consequences of the change.

Cheers,
Morten


From arklenna at gmail.com  Sun Aug 11 18:40:38 2013
From: arklenna at gmail.com (Lenna Peterson)
Date: Sun, 11 Aug 2013 14:40:38 -0400
Subject: [Biopython-dev] Redmine issue 2727 ready for pull
In-Reply-To: <BCB0465E-7439-4562-B746-7585D39E3C48@bioxray.dk>
References: <0743AFDE-D1B2-4348-AFFE-3CE5CC227FE4@bioxray.dk>
	<CAHQkFdcWA+f40nRXioD-_-gO=n0FHEs=BU5zVjbjy_rb8gJ0HA@mail.gmail.com>
	<BCB0465E-7439-4562-B746-7585D39E3C48@bioxray.dk>
Message-ID: <CAHQkFdfE2GJK-5Pc2JZhZSL17oPZSQLpXAKZ87QAAqmtqZdvSQ@mail.gmail.com>

On Sun, Aug 11, 2013 at 2:33 PM, Morten Kjeldgaard <mok at bioxray.dk> wrote:

>
> On 11/08/2013, at 02:43, Lenna Peterson <arklenna at gmail.com> wrote:
>
> > I think this looks great. Why not submit a pull request?
>
> Thanks! Excuse me for my ignorance, but how do I submit a pull request? (I
> thought that is what I did by posting to the -dev list).
>
> Cheers,
> Morten


Hey Morten,

It's good to let the dev list know you have code ready to merge in, but if
you do it on github, it will show up here too:
https://github.com/biopython/biopython/pulls

Here are GitHub's instructions:

https://help.github.com/articles/creating-a-pull-request

Cheers,

Lenna


From p.j.a.cock at googlemail.com  Sun Aug 11 20:50:46 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Sun, 11 Aug 2013 21:50:46 +0100
Subject: [Biopython-dev] Moratorium on commits?
In-Reply-To: <CEAB55B5-AC7B-46D6-A865-2CFBCD3DC93B@bioxray.dk>
References: <CAJ9sUYO2sHeQRFsa1A-NYt-_9wSmZ6v7S4knMek_eyoZhhAzZw@mail.gmail.com>
	<CAKVJ-_4YeZycGLpXarODFK8Uyy-P8RiA7x23v-a7LT0V9CzPVA@mail.gmail.com>
	<CEAB55B5-AC7B-46D6-A865-2CFBCD3DC93B@bioxray.dk>
Message-ID: <CAKVJ-_6OGVBrLrKgrs4Xo1Y7mzS1B01ZYd5V3em7vhp1NSGOPQ@mail.gmail.com>

On Sun, Aug 11, 2013 at 7:28 PM, Morten Kjeldgaard <mok at bioxray.dk> wrote:
>
> On 09/08/2013, at 21:06, Peter Cock <p.j.a.cock at googlemail.com> wrote:
>
>> On Fri, Aug 9, 2013 at 6:26 PM, João Rodrigues <anaryin at gmail.com> wrote:
>>> Dear all,
>>>
>>> The situation with the occupancy in the PDBParser led me to think of one
>>> thing.
>>>
>>> Since not everybody is in the same timezone, has the same availability,
>>> etc, what about we introduce a brief moratorium over commits of say 3
>>> days (except for critical bug fixes)? This will give everybody probably
>>> enough time to read the email and give their opinion.
>>>
>>> The downside is that it will make things roll a bit slower but then
>>> again, 3 days is not so much..
>>>
>>> Cheers,
>>>
>>> João
>>
>> I don't think that's really needed for small commits like
>> this which are simple to interpret. In this case there were
>> three opinions in favour of the idea, with a fourth counter
>> view appearing later, resulting in a further tweak.
>>
>> Longer periods of discussion are far more important on
>> large code additions or major changes.
>
> Sorry, but I don't agree that this is a "small commit". It may
> not be large in terms of number of bytes, but it is large in
> terms of impact, since it affects users' programs in
> unpredictable ways.

Hello again Morten,

I did mean small in terms of the amount of code changed, which I
tried to make clear in the rest of the email, but
as discussed below, I think the PDB occupancy
change was also small in terms of behaviour.

> Whenever a change is made that affects values
> returned to the user, it is worth spending a few days
> discussing it,  to let people have a chance to think
> through the consequences of the change.

Almost any change impacts the user in some way.

I still feel this was a minor change (although of
course important to some, including you). This is
parsing of malformed PDB files where the user
ALREADY gets a warning (or error in strict mode,
where there would be no functional change) that
there is a problem with the occupancy data.

One reason why I specifically talked about small
commits (in the sense of a simple diff) above is
they are trivial to revert if the need arises, or as
in this case, modify:
https://github.com/biopython/biopython/commit/500c3c2ea900fd8c8f5123f571d4d9a244ee898e

This change was suggested and supported by
people who've been actively contributing to the
Biopython structural module for some time, so I
had reason to trust their good judgement, and as
I wrote at the time there was a clear consensus
with three people in all happy with the idea:
http://lists.open-bio.org/pipermail/biopython-dev/2013-August/010773.html

Changes where there isn't clear agreement are
generally discussed over a longer time period.

Note that Biopython is already relatively strict
about not breaking things and preserving backwards
compatibility (to the point where it does delay new
features). We do care about not breaking existing
scripts without warning - so when people speak up
on the list that something is likely to cause them
trouble, we do listen.

Is that any clearer?

Regards,

Peter


From zruan1991 at gmail.com  Sun Aug 11 22:04:10 2013
From: zruan1991 at gmail.com (Zheng Ruan)
Date: Sun, 11 Aug 2013 18:04:10 -0400
Subject: [Biopython-dev] Codon Alignment GSoC Update
Message-ID: <CABM7aFpcMT+BapVya1ESpfKNvvUpdFFTodEo250xkk00yKQZQw@mail.gmail.com>

Hi all,

An update on the Codon Alignment project can be found at (http://zruanweb.com/).
In the next week, I will be implementing the Maximum Likelihood method for
dN/dS ratio estimation. I do not anticipate writing any optimization code
myself, as SciPy's functionality seems the most suitable to use here.
This might be a new dependency for Biopython. Is it okay to add this? Or
are there some other functions in Biopython for optimization problems?
Thanks!

Best,
Zheng Ruan


From kai.blin at biotech.uni-tuebingen.de  Mon Aug 12 10:53:17 2013
From: kai.blin at biotech.uni-tuebingen.de (Kai Blin)
Date: Mon, 12 Aug 2013 12:53:17 +0200
Subject: [Biopython-dev] Moratorium on commits?
In-Reply-To: <CAJ9sUYO2sHeQRFsa1A-NYt-_9wSmZ6v7S4knMek_eyoZhhAzZw@mail.gmail.com>
References: <CAJ9sUYO2sHeQRFsa1A-NYt-_9wSmZ6v7S4knMek_eyoZhhAzZw@mail.gmail.com>
Message-ID: <5208BE9D.1090900@biotech.uni-tuebingen.de>

On 2013-08-09 19:26, João Rodrigues wrote:

Dear biopython devs,

> Since not everybody is in the same timezone, has the same availability,
> etc, what about we introduce a brief moratorium over commits of say 3 days
> (except for critical bug fixes)? This will give everybody probably enough
> time to read the email and give their opinion.

I've been through discussions like this before, in a lot of open source 
projects I'm involved in. I don't think this is a good step to take. 
Saying that "all patches need to wait unless they're special" will 
eventually lead to a dilution of what is considered special, and then 
lead to a point where most patches by core contributors happen to be 
special and patches by new contributors aren't. Because the policy 
doesn't explicitly state this, you then create a very unwelcoming 
atmosphere for the project. I would recommend to consider if avoiding 
the occasional revert is worth that cost.

Personally, one of the things I like about BioPython is how fast I'm 
able to get bugfixes in.

My two cents,
Kai

-- 
Dipl.-Inform. Kai Blin         kai.blin at biotech.uni-tuebingen.de
Institute for Microbiology and Infection Medicine
Division of Microbiology/Biotechnology
Eberhard-Karls-Universität Tübingen
Auf der Morgenstelle 28                 Phone : ++49 7071 29-78841
D-72076 Tübingen                        Fax :   ++49 7071 29-5979
Germany
Homepage: http://www.mikrobio.uni-tuebingen.de/ag_wohlleben


From tiagoantao at gmail.com  Mon Aug 12 11:33:40 2013
From: tiagoantao at gmail.com (=?ISO-8859-1?Q?Tiago_Ant=E3o?=)
Date: Mon, 12 Aug 2013 12:33:40 +0100
Subject: [Biopython-dev] Moratorium on commits?
In-Reply-To: <5208BE9D.1090900@biotech.uni-tuebingen.de>
References: <CAJ9sUYO2sHeQRFsa1A-NYt-_9wSmZ6v7S4knMek_eyoZhhAzZw@mail.gmail.com>
	<5208BE9D.1090900@biotech.uni-tuebingen.de>
Message-ID: <CAA9RGEMdWw2NXC3vWBmUouC6YD=4sNAbaS3LzQSQr5ULJT0yFg@mail.gmail.com>

Hi,


On 12 August 2013 11:53, Kai Blin <kai.blin at biotech.uni-tuebingen.de> wrote:

> Personally, one of the things I like about BioPython is how fast I'm able
> to get bugfixes in.
>
>

I agree that the light approach to process is great. 99% of the patches are
uncontroversial and would suffer from a heavier process.

For the rare cases where there are problems, revert can be used. My code
has been reverted a couple of times and I am fine with that (when one
commits to a public project with shared ownership one should expect
peer-review, sometimes heated discussion and corrections - it is normal).

If one thinks a change can be problematic, an initial discussion would be a
good idea. Of course, sometimes we do not know until after the fact, but then
again, the good thing about version control is that we can undo things...

Generally things have been working very well and I would not change the
process to something heavier just because of a single case. Single cases
should be sorted on a case-by-case basis, with no stress.

My 2p,
Tiago


From yeyanbo289 at gmail.com  Mon Aug 12 13:25:22 2013
From: yeyanbo289 at gmail.com (Yanbo Ye)
Date: Mon, 12 Aug 2013 21:25:22 +0800
Subject: [Biopython-dev] GSOC weekly update 8
Message-ID: <CADoMHjzscM52YLsW0HiYRSMPvxbO=11gLnKgJBApMr-ChGKA1w@mail.gmail.com>

Hi all,

My update about Biopython.Phylo project can be found here:
http://blog.yeyanbo.com/posts/google-summer-of-code-9.html

Best,
Yanbo

-- 

*Yanbo Ye*
*Guangzhou Institutes of Biomedicine and Health, *
*Chinese Academy of Sciences*
*190 Kaiyuan Avenue, Science Park, Guangzhou, China*
*Email: ye_yanbo at gibh.ac.cn*
*Web: http://www.yeyanbo.com*
*Phone: (86)-020-32093810*


From mok at bioxray.dk  Mon Aug 12 18:33:26 2013
From: mok at bioxray.dk (Morten Kjeldgaard)
Date: Mon, 12 Aug 2013 20:33:26 +0200
Subject: [Biopython-dev] Moratorium on commits?
In-Reply-To: <CAKVJ-_6OGVBrLrKgrs4Xo1Y7mzS1B01ZYd5V3em7vhp1NSGOPQ@mail.gmail.com>
References: <CAJ9sUYO2sHeQRFsa1A-NYt-_9wSmZ6v7S4knMek_eyoZhhAzZw@mail.gmail.com>
	<CAKVJ-_4YeZycGLpXarODFK8Uyy-P8RiA7x23v-a7LT0V9CzPVA@mail.gmail.com>
	<CEAB55B5-AC7B-46D6-A865-2CFBCD3DC93B@bioxray.dk>
	<CAKVJ-_6OGVBrLrKgrs4Xo1Y7mzS1B01ZYd5V3em7vhp1NSGOPQ@mail.gmail.com>
Message-ID: <677A1A76-6B62-43E4-A54E-695A834D6088@bioxray.dk>


On 11/08/2013, at 22:50, Peter Cock <p.j.a.cock at googlemail.com> wrote:

> I still feel this was a minor change (although of
> course important to some, including you). This is
> parsing of malformed PDB files where the user
> ALREADY gets a warning (or error in strict mode,
> where there would be no functional change) that
> there is a problem with the occupancy data.
> 
> One reason why I specifically talked about small
> commits (in the sense of a simple diff) above is
> they are trivial to revert if the need arises, or as
> in this case, modify:
> https://github.com/biopython/biopython/commit/500c3c2ea900fd8c8f5123f571d4d9a244ee898e
> 
> This change was suggested and supported by
> people who've been actively contributing to the
> Biopython structural module for some time, so I
> had reason to trust their good judgement, and as
> I wrote at the time there was a clear consensus
> with three people in all happy with the idea:
> http://lists.open-bio.org/pipermail/biopython-dev/2013-August/010773.html


I respect that you listen more to developers that have been contributing for a long time. That is quite understandable, but I hope that does not prevent me from contributing my opinions.

What prompted my response was the suggestion that the occupancy should be set to 1.0 if it is absent from the file, i.e. if the PDB file is malformed. I think that is incorrect behavior, and I say that not as a core developer, but as a crystallographer. If invalid data is present in the file, you do not want the toolkit transforming it into valid data.

After thinking about it, the suggestion to set values to None when they are not defined in a malformed file now appears quite reasonable, but if it is done this way with occupancies, it should also be done this way with B-factors, chain identifiers and other values that are mandatory in the file according to the format specs. From the user's perspective, if the values returned are None, you are alerted to the fact that something is wrong, and you should make an appropriate choice, whatever that may be.

Cheers,
Morten


From arklenna at gmail.com  Mon Aug 12 19:25:20 2013
From: arklenna at gmail.com (Lenna Peterson)
Date: Mon, 12 Aug 2013 15:25:20 -0400
Subject: [Biopython-dev] Moratorium on commits?
In-Reply-To: <677A1A76-6B62-43E4-A54E-695A834D6088@bioxray.dk>
References: <CAJ9sUYO2sHeQRFsa1A-NYt-_9wSmZ6v7S4knMek_eyoZhhAzZw@mail.gmail.com>
	<CAKVJ-_4YeZycGLpXarODFK8Uyy-P8RiA7x23v-a7LT0V9CzPVA@mail.gmail.com>
	<CEAB55B5-AC7B-46D6-A865-2CFBCD3DC93B@bioxray.dk>
	<CAKVJ-_6OGVBrLrKgrs4Xo1Y7mzS1B01ZYd5V3em7vhp1NSGOPQ@mail.gmail.com>
	<677A1A76-6B62-43E4-A54E-695A834D6088@bioxray.dk>
Message-ID: <CAHQkFddu6WGWwrRrC_1M50k24-1AiYAV6azmBP+EtOmA1DOSkg@mail.gmail.com>

On Mon, Aug 12, 2013 at 2:33 PM, Morten Kjeldgaard <mok at bioxray.dk> wrote:

>
> On 11/08/2013, at 22:50, Peter Cock <p.j.a.cock at googlemail.com> wrote:
>
> > I still feel this was a minor change (although of
> > course important to some, including you). This is
> > parsing of malformed PDB files where the user
> > ALREADY gets a warning (or error in strict mode,
> > where there would be no functional change) that
> > there is a problem with the occupancy data.
> >
> > One reason why I specifically talked about small
> > commits (in the sense of a simple diff) above is
> > they are trivial to revert if the need arises, or as
> > in this case, modify:
> >
> https://github.com/biopython/biopython/commit/500c3c2ea900fd8c8f5123f571d4d9a244ee898e
> >
> > This change was suggested and supported by
> > people who've been actively contributing to the
> > Biopython structural module for some time, so I
> > had reason to trust their good judgement, and as
> > I wrote at the time there was a clear consensus
> > with three people in all happy with the idea:
> >
> http://lists.open-bio.org/pipermail/biopython-dev/2013-August/010773.html
>
>
> I respect that you listen more to developers that have been contributing
> for a long time. That is quite understandable, but I hope that does not
> prevent me from contributing my opinions.
>
> What prompted my response was the suggestion that the occupancy should be
> set to 1.0 if it is absent from the file, i.e. if the PDB file is
> malformed. I think that is incorrect behavior, and I say that not as a
> core developer, but as a crystallographer. If invalid data is present in
> the file, you do not want the toolkit transforming it into valid data.
>

I appreciate the physical/practical feedback about the commits.

> After thinking about it, the suggestion to set values to None when they are
> not defined in a malformed file now appears quite reasonable, but if it is
> done this way with occupancies, it should also be done this way with
> B-factors, chain identifiers and other values that are mandatory in the
> file according to the format specs. From the user's perspective, if the
> values returned are None, you are alerted to the fact that something is
> wrong, and you should make an appropriate choice, whatever that may be.
>
>
I agree that `None` is a good warning value for missing data.

I just skimmed the code and summarized how some of the missing values are
handled:

* Serial number: 0
* Chain: fatal in both strict and permissive modes (i.e. no try/except)
* Coordinates: fatal in both strict and permissive modes
* Occupancy: we recently decided to set as None in permissive
* B-factor: 0.0 in permissive (code comment states this is PDB default)
* Model seq id: 0
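
As a rough sketch of the occupancy rule in the list above (None plus a
warning in permissive mode, an error in strict mode), assuming the standard
fixed-column ATOM layout where occupancy sits in columns 55-60. This is not
the actual PDBParser code: the function names, the exception and warning
classes, and the make_atom_line() helper are all invented for illustration.

```python
import warnings


class PDBFormatError(Exception):
    """Hypothetical error for malformed records in strict mode."""


class PDBFormatWarning(Warning):
    """Hypothetical warning for malformed records in permissive mode."""


def make_atom_line(occupancy_text, bfactor_text):
    """Build a fixed-width ATOM record; empty strings leave a field blank."""
    return "%-6s%5d %-4s%1s%3s %1s%4d%1s   %8.3f%8.3f%8.3f%6s%6s" % (
        "ATOM", 1, " N", "", "MET", "A", 1, "",
        27.340, 24.430, 2.614, occupancy_text, bfactor_text)


def parse_occupancy(line, permissive=True):
    """Return the occupancy as a float, or None if it is missing."""
    text = line[54:60].strip()  # columns 55-60, zero-based slice
    try:
        return float(text)
    except ValueError:
        message = "Missing or malformed occupancy: %r" % text
        if not permissive:
            raise PDBFormatError(message)
        warnings.warn(message, PDBFormatWarning)
        return None
```

The same pattern would apply to the B-factor (columns 61-66) if we decide to
treat it the same way.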

The StructureBuilder class also has certain ways of handling duplicate
residues and atoms that I'm not particularly familiar with. For example,
I'm not quite sure what will happen if successive atoms have missing serial
numbers.

PDB is a format where there's always a balance between absolute adherence
to the format and enough flexibility to deal with the wide range of
malformed files.

Lenna


From mok at bioxray.dk  Mon Aug 12 19:42:28 2013
From: mok at bioxray.dk (Morten Kjeldgaard)
Date: Mon, 12 Aug 2013 21:42:28 +0200
Subject: [Biopython-dev] Moratorium on commits?
In-Reply-To: <CAHQkFddu6WGWwrRrC_1M50k24-1AiYAV6azmBP+EtOmA1DOSkg@mail.gmail.com>
References: <CAJ9sUYO2sHeQRFsa1A-NYt-_9wSmZ6v7S4knMek_eyoZhhAzZw@mail.gmail.com>
	<CAKVJ-_4YeZycGLpXarODFK8Uyy-P8RiA7x23v-a7LT0V9CzPVA@mail.gmail.com>
	<CEAB55B5-AC7B-46D6-A865-2CFBCD3DC93B@bioxray.dk>
	<CAKVJ-_6OGVBrLrKgrs4Xo1Y7mzS1B01ZYd5V3em7vhp1NSGOPQ@mail.gmail.com>
	<677A1A76-6B62-43E4-A54E-695A834D6088@bioxray.dk>
	<CAHQkFddu6WGWwrRrC_1M50k24-1AiYAV6azmBP+EtOmA1DOSkg@mail.gmail.com>
Message-ID: <0F6D9BF5-BFAA-4118-8D90-936AC44A29FA@bioxray.dk>


On 12/08/2013, at 21:25, Lenna Peterson <arklenna at gmail.com> wrote:

> * B-factor: 0.0 in permissive (code comment states this is PDB default)

The default referred to in that code comment is what the PDB annotators put in that field if the information is not provided by the depositor (which could be the case for, e.g., an NMR model). From the PDB Atomic Coordinate Entry Format Description, Version 3.30:

	* If the depositor provides the data, then the isotropic B value is given for the temperature factor.
	
	* If there are neither isotropic B values from the depositor, nor anisotropic temperature factors in ANISOU, then the default value of 0.0 is used for the temperature factor.

In other words, the PDB format specification has no recommendations for what default values should be used if the field is blank in a malformed file, only what their staff should put in the entry when they receive it from the depositor.

So IMO Biopython is free to use None if the B-value is missing in a malformed file.

(I haven't checked the other items that Lenna mentions.)

Cheers,
Morten


From anaryin at gmail.com  Mon Aug 12 19:51:03 2013
From: anaryin at gmail.com (=?UTF-8?Q?Jo=C3=A3o_Rodrigues?=)
Date: Mon, 12 Aug 2013 12:51:03 -0700
Subject: [Biopython-dev] Bio.PDB - Missing values (was Moratorium on
	commits?)
Message-ID: <CAJ9sUYOtO8xd2mVSnUNhbLMnhWXtd4z67vEuADwK8ChmHOLLLw@mail.gmail.com>

Hi all,

Moving to a new thread because this is a very specific issue.

I think that, from a programming point of view (but I'm a biologist so
correct me if I'm wrong) having None values upon parsing is probably a
better idea. Then, when writing, these should be translated to whatever
default there is in the PDB documentation.

Cheers,

João


From p.j.a.cock at googlemail.com  Mon Aug 12 20:36:15 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Mon, 12 Aug 2013 21:36:15 +0100
Subject: [Biopython-dev] Bio.PDB - Missing values (was Moratorium on
	commits?)
In-Reply-To: <CAJ9sUYOtO8xd2mVSnUNhbLMnhWXtd4z67vEuADwK8ChmHOLLLw@mail.gmail.com>
References: <CAJ9sUYOtO8xd2mVSnUNhbLMnhWXtd4z67vEuADwK8ChmHOLLLw@mail.gmail.com>
Message-ID: <CAKVJ-_47xSbDKwv85i5TtVVNfmaSHX0Vt3xq8JSyQrx2kKOAMA@mail.gmail.com>

On Monday, August 12, 2013, João Rodrigues wrote:

> Hi all,
>
> Moving to a new thread because this is a very specific issue.
>
> I think that, from a programming point of view (but I'm a biologist so
> correct me if I'm wrong) having None values upon parsing is probably a
> better idea. Then, when writing, these should be translated to whatever
> default there is in the PDB documentation.
>

Or throw an error to force the user to fix it?

Or write a blank occupancy to allow preservation of the
(flawed) input?

(Thank you for raising the output question now, it is a logical
consequence of putting None in the parsed structure)

Peter


From anaryin at gmail.com  Mon Aug 12 20:39:30 2013
From: anaryin at gmail.com (=?UTF-8?Q?Jo=C3=A3o_Rodrigues?=)
Date: Mon, 12 Aug 2013 13:39:30 -0700
Subject: [Biopython-dev] Bio.PDB - Missing values (was Moratorium on
	commits?)
In-Reply-To: <CAKVJ-_47xSbDKwv85i5TtVVNfmaSHX0Vt3xq8JSyQrx2kKOAMA@mail.gmail.com>
References: <CAJ9sUYOtO8xd2mVSnUNhbLMnhWXtd4z67vEuADwK8ChmHOLLLw@mail.gmail.com>
	<CAKVJ-_47xSbDKwv85i5TtVVNfmaSHX0Vt3xq8JSyQrx2kKOAMA@mail.gmail.com>
Message-ID: <CAJ9sUYOnjg5SqxikOL9_fYDSi2P46J8SKs5tQwKA8Mcf5PtxKg@mail.gmail.com>

Throwing an error might not be a good idea because when dealing with models
they sometimes have missing fields... then we'd have to fix them all
somehow before parsing them.

The None value seems a good indicator that something is amiss, while not
putting any value there. There should also be a warning upon writing that
the value is being replaced by a default value. Blank is also good
actually, maybe we could add an option to the writer/parser to "preserve"
values?

Cheers,

João


2013/8/12 Peter Cock <p.j.a.cock at googlemail.com>

>
>
> On Monday, August 12, 2013, João Rodrigues wrote:
>
>> Hi all,
>>
>> Moving to a new thread because this is a very specific issue.
>>
>> I think that, from a programming point of view (but I'm a biologist so
>> correct me if I'm wrong) having None values upon parsing is probably a
>> better idea. Then, when writing, these should be translated to whatever
>> default there is in the PDB documentation.
>>
>
> Or throw an error to force the user to fix it?
>
> Or write a blank occupancy to allow preservation of the
> (flawed) input?
>
> (Thank you for raising the output question now, it is a logical
> consequence of putting None in the parsed structure)
>
> Peter
>
>


From p.j.a.cock at googlemail.com  Mon Aug 12 20:40:24 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Mon, 12 Aug 2013 21:40:24 +0100
Subject: [Biopython-dev] Moratorium on commits?
In-Reply-To: <677A1A76-6B62-43E4-A54E-695A834D6088@bioxray.dk>
References: <CAJ9sUYO2sHeQRFsa1A-NYt-_9wSmZ6v7S4knMek_eyoZhhAzZw@mail.gmail.com>
	<CAKVJ-_4YeZycGLpXarODFK8Uyy-P8RiA7x23v-a7LT0V9CzPVA@mail.gmail.com>
	<CEAB55B5-AC7B-46D6-A865-2CFBCD3DC93B@bioxray.dk>
	<CAKVJ-_6OGVBrLrKgrs4Xo1Y7mzS1B01ZYd5V3em7vhp1NSGOPQ@mail.gmail.com>
	<677A1A76-6B62-43E4-A54E-695A834D6088@bioxray.dk>
Message-ID: <CAKVJ-_5BbciDv7qMyJTcs=Z73Zxcj3YdFM74zQy+-jyt1m=7gw@mail.gmail.com>

On Monday, August 12, 2013, Morten Kjeldgaard wrote:

>
> On 11/08/2013, at 22:50, Peter Cock <p.j.a.cock at googlemail.com> wrote:
>
> > I still feel this was a minor change (although of
> > course important to some, including you). This is
> > parsing of malformed PDB files where the user
> > ALREADY gets a warning (or error in strict mode,
> > where there would be no functional change) that
> > there is a problem with the occupancy data.
> >
> > One reason why I specifically talked about small
> > commits (in the sense of a simple diff) above is
> > they are trivial to revert if the need arises, or as
> > in this case, modify:
> >
> https://github.com/biopython/biopython/commit/500c3c2ea900fd8c8f5123f571d4d9a244ee898e
> >
> > This change was suggested and supported by
> > people who've been actively contributing to the
> > Biopython structural module for some time, so I
> > had reason to trust their good judgement, and as
> > I wrote at the time there was a clear consensus
> > with three people in all happy with the idea:
> >
> http://lists.open-bio.org/pipermail/biopython-dev/2013-August/010773.html
>
>
> I respect that you listen more to developers that
> have been contributing for a long time. That is quite
> understandable, but I hope that does not prevent
> me from contributing my opinions.


Of course not - your input (which was after the initial
change) has already resulted in a review of that
change and the adoption of None instead.

So thank you for speaking up,

Peter


From eric.talevich at gmail.com  Mon Aug 12 22:35:05 2013
From: eric.talevich at gmail.com (Eric Talevich)
Date: Mon, 12 Aug 2013 15:35:05 -0700
Subject: [Biopython-dev] Codon Alignment GSoC Update
In-Reply-To: <CABM7aFpcMT+BapVya1ESpfKNvvUpdFFTodEo250xkk00yKQZQw@mail.gmail.com>
References: <CABM7aFpcMT+BapVya1ESpfKNvvUpdFFTodEo250xkk00yKQZQw@mail.gmail.com>
Message-ID: <CAMC681=HhPimOqV3Rmmm6jDtt9aJhVdQEisiOp_-sL4g1YfjOQ@mail.gmail.com>

Hi Zheng,

Nice work this week. For the next tasks:

1. It's probably not a high priority to implement all of the dN/dS
approaches described in Yang's book (i.e. LWL85m, LPB93, Ina95), beyond the
simple early methods (NG86, LWL85)  and the finale, YN00. If you get around
to doing them all, cool, but if you only have time to do one more I'd pick
YN00.

2. SciPy is a relatively large dependency, so I recommend making it a
runtime import -- do the import from within the function that needs it,
rather than at the top-level scope of the module. E.g.:
Bio.Phylo._utils.to_networkx

3. Where are you focusing your documentation efforts? If you're keeping
most of the descriptions in the docstrings, it would be convenient to
format the text as reStructuredText for processing with Epydoc and Sphinx.
Time permitting, it would also be nice to have a chapter on this work in
the Tutorial, see Doc/Tutorial.tex (also fine to write this up as a
separate LaTeX document first and roll it in later).
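
To illustrate point 2, here is a rough sketch of a runtime import. The function name and the choice of the Nelder-Mead method are only assumptions for the example, not a proposal for the actual codon alignment API:

```python
def minimize_objective(func, x0):
    """Minimize func starting from x0 (hypothetical helper).

    SciPy is imported here, inside the function, so merely importing the
    enclosing module does not require SciPy to be installed.
    """
    try:
        from scipy.optimize import minimize
    except ImportError:
        raise ImportError("Install SciPy if you want to use this function.")
    return minimize(func, x0, method="Nelder-Mead")
```

Users without SciPy can still import the module and use everything else; only a call to this one function fails, and with a clear message.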

Cheers,
Eric


On Sun, Aug 11, 2013 at 3:04 PM, Zheng Ruan <zruan1991 at gmail.com> wrote:

> Hi all,
>
> An update of Codon Alignment Project can be found at (http://zruanweb.com/).
> In the next week, I will be implementing the Maximum Likelihood method for
> dN/dS ratio estimation. I do not anticipate writing any code for the
> optimization and SciPy's functionality is most suitable to be used here.
> This might be a new dependency for Biopython. Is it okay to add this? Or
> are there some other functions in Biopython for optimization problems?
> Thanks!
>
> Best,
> Zheng Ruan
>


From eric.talevich at gmail.com  Mon Aug 12 23:03:07 2013
From: eric.talevich at gmail.com (Eric Talevich)
Date: Mon, 12 Aug 2013 16:03:07 -0700
Subject: [Biopython-dev] GSOC weekly update 8
In-Reply-To: <CADoMHjzscM52YLsW0HiYRSMPvxbO=11gLnKgJBApMr-ChGKA1w@mail.gmail.com>
References: <CADoMHjzscM52YLsW0HiYRSMPvxbO=11gLnKgJBApMr-ChGKA1w@mail.gmail.com>
Message-ID: <CAMC681m+W1gTT1hUJUthnbnjKyDg106BJdmWQuS+rQKApvx5=g@mail.gmail.com>

Hi Yanbo,

Looks like excellent progress.

At some point, would you mind documenting how the bit array operations are
used to represent trees, e.g. how a bit array (BitString instance) should
be interpreted in terms of taxa and tree topologies?

Thanks,
Eric


On Mon, Aug 12, 2013 at 6:25 AM, Yanbo Ye <yeyanbo289 at gmail.com> wrote:

> Hi all,
>
> My update about Biopython.Phylo project can be found here:
> http://blog.yeyanbo.com/posts/google-summer-of-code-9.html
>
> Best,
> Yanbo
>
> --
>
> Yanbo Ye
> Guangzhou Institutes of Biomedicine and Health,
> Chinese Academy of Sciences
> 190 Kaiyuan Avenue, Science Park, Guangzhou, China
>
> Email: ye_yanbo at gibh.ac.cn
> Web: http://www.yeyanbo.com
> Phone: (86)-020-32093810
>


From p.j.a.cock at googlemail.com  Wed Aug 14 09:44:24 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 14 Aug 2013 10:44:24 +0100
Subject: [Biopython-dev] setuptools breaking biopython-1.62b installation
In-Reply-To: <1374797351.81889.YahooMailNeo@web164002.mail.gq1.yahoo.com>
References: <1374651068.98742.YahooMailNeo@web164005.mail.gq1.yahoo.com>
	<CAKVJ-_6SYCoL+2d=Cogkf4ws=iDTyxc0qdAfZZRJi-5jYc3TqA@mail.gmail.com>
	<86a9lcl1nt.fsf@fastmail.fm>
	<CAKVJ-_4kvUGeOW4CZ15swd_e48t8B_MrWCaO4mESAqC9_uLdYA@mail.gmail.com>
	<CAKVJ-_5ABuwE59WOfXW26yyEWff8ob-8KE+vMipMigv4bFLZfQ@mail.gmail.com>
	<1374797351.81889.YahooMailNeo@web164002.mail.gq1.yahoo.com>
Message-ID: <CAKVJ-_4yYku2hbPATWOjjRc7J6sbpeqKhM8q8woBfedax54w7Q@mail.gmail.com>

On Friday, July 26, 2013 Peter wrote:
> On Wed, Jul 24, 2013  Peter Cock wrote:
>> On Wed, Jul 24, 2013 Brad Chapman wrote:
>>>
>>> Peter and Michiel;
>>>
>>>>> Do we actually need setuptools?
>>>>> Looking at setup.py, it seems that distutils is sufficient for our
>>>>> needs.
>>>>> If so, let's remove the dependency on setuptools.
>>>
>>> We used setuptools/distribute to install dependencies, although
>>> practically this doesn't work well since pip doesn't finish NumPy
>>> installation before installing Biopython. So I'm fine with taking it out
>>> if you want to simplify the setup and avoid the extra dependency.
>>
>> Sounds like a plan - but we should all test this change, especially
>> users of PIP, easy_install, virtual env etc.
>>
>
> So who's going to do the commit - Brad or Michiel?
>
> Peter
>

On Fri, Jul 26, 2013 at 1:09 AM, Michiel de Hoon <mjldehoon at yahoo.com> wrote:
> Brad, can you do it?
> Best,
> -Michiel.

I've done it:
https://github.com/biopython/biopython/commit/f8e51906709d0c85be9f2b921eb3f68eed5524f9

This needs some more testing now - particularly with the
non-standard install options like pip, easy_install, etc.

Peter


From p.j.a.cock at googlemail.com  Thu Aug 15 11:28:47 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Thu, 15 Aug 2013 12:28:47 +0100
Subject: [Biopython-dev] Releasing Biopython 1.62 next week?
Message-ID: <CAKVJ-_4+Qh6HQcWPPeb-yFmdDchCY2oQ2kths=GtJfK2srx+7w@mail.gmail.com>

Hello all,

Are there any remaining issues people think need to be
resolved prior to releasing Biopython 1.62? If not, unless
anyone else volunteers, I will make time for this next week.

Possible issues worth reviewing - please reply on the
existing threads:

Changes to setup.py to remove use of setuptools,
this would benefit from wider testing:
https://github.com/biopython/biopython/commit/f8e51906709d0c85be9f2b921eb3f68eed5524f9
http://lists.open-bio.org/pipermail/biopython-dev/2013-August/010806.html

Changes to PDB occupancy, do we need to change
PDB writing in light of this?
http://lists.open-bio.org/pipermail/biopython-dev/2013-August/010802.html

Update the Prank tool test to work with recent versions:
http://lists.open-bio.org/pipermail/biopython-dev/2013-August/010757.html

Note that PyPy now has a beta out supporting Python 3;
it would be nice to fully test with that as well...
http://morepypy.blogspot.co.uk/2013/07/pypy3-21-beta-1.html

Thanks,

Peter


From arklenna at gmail.com  Thu Aug 15 13:18:35 2013
From: arklenna at gmail.com (Lenna Peterson)
Date: Thu, 15 Aug 2013 09:18:35 -0400
Subject: [Biopython-dev] Bio.PDB - Missing values (was Moratorium on
	commits?)
In-Reply-To: <CAJ9sUYOnjg5SqxikOL9_fYDSi2P46J8SKs5tQwKA8Mcf5PtxKg@mail.gmail.com>
References: <CAJ9sUYOtO8xd2mVSnUNhbLMnhWXtd4z67vEuADwK8ChmHOLLLw@mail.gmail.com>
	<CAKVJ-_47xSbDKwv85i5TtVVNfmaSHX0Vt3xq8JSyQrx2kKOAMA@mail.gmail.com>
	<CAJ9sUYOnjg5SqxikOL9_fYDSi2P46J8SKs5tQwKA8Mcf5PtxKg@mail.gmail.com>
Message-ID: <CAHQkFdf0kHMkMaTipz7Odr_1=6qwPNyv2DHYwO3X4=fDnmcwKg@mail.gmail.com>

On Monday, 12 August 2013, João Rodrigues wrote:

> Throwing an error might not be a good idea because when dealing with models
> they sometimes have missing fields... then we'd have to fix them all
> somehow before parsing them.
>
> The None value seems a good indicator that something is amiss, while not
> putting any value there. There should also be a warning upon writing that
> the value is being replaced by a default value. Blank is also good
> actually, maybe we could add an option to the writer/parser to "preserve"
> values?
>
>
I don't think writing string "None" into a fixed width field would be a
good idea. So it's probably best to change occupancy (and any other missing
values set to None) to blank, correct width fields for writing.

I've never tangled with the writer and I have incoming PhD students this
week but I can attempt to add this functionality early next week.

Lenna


From p.j.a.cock at googlemail.com  Thu Aug 15 13:23:50 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Thu, 15 Aug 2013 14:23:50 +0100
Subject: [Biopython-dev] Bio.PDB - Missing values (was Moratorium on
	commits?)
In-Reply-To: <CAHQkFdf0kHMkMaTipz7Odr_1=6qwPNyv2DHYwO3X4=fDnmcwKg@mail.gmail.com>
References: <CAJ9sUYOtO8xd2mVSnUNhbLMnhWXtd4z67vEuADwK8ChmHOLLLw@mail.gmail.com>
	<CAKVJ-_47xSbDKwv85i5TtVVNfmaSHX0Vt3xq8JSyQrx2kKOAMA@mail.gmail.com>
	<CAJ9sUYOnjg5SqxikOL9_fYDSi2P46J8SKs5tQwKA8Mcf5PtxKg@mail.gmail.com>
	<CAHQkFdf0kHMkMaTipz7Odr_1=6qwPNyv2DHYwO3X4=fDnmcwKg@mail.gmail.com>
Message-ID: <CAKVJ-_4VkgXev6ni5n_oVLwJvyUGMuUttF-oKek3-6xLFQDvXw@mail.gmail.com>

On Thu, Aug 15, 2013 at 2:18 PM, Lenna Peterson <arklenna at gmail.com> wrote:
> On Monday, 12 August 2013, João Rodrigues wrote:
>>
>> Throwing an error might not be a good idea because when dealing with
>> models
>> they sometimes have missing fields... then we'd have to fix them all
>> somehow before parsing them.
>>
>> The None value seems a good indicator that something is amiss, while not
>> putting any value there. There should also be a warning upon writing that
>> the value is being replaced by a default value. Blank is also good
>> actually, maybe we could add an option to the writer/parser to "preserve"
>> values?
>>
>
> I don't think writing string "None" into a fixed width field would be a good
> idea. So it's probably best to change occupancy (and any other missing
> values set to None) to blank, correct width fields for writing.

I didn't mean to suggest writing the string "None" in the field, and
I'm not sure if João did - it would certainly be an invalid PDB file.

I agree that where the data structure has None (e.g. from our parser)
then the writer could use a blank string (of the appropriate width).
For mandatory fields like occupancy, this should give a warning.
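
To make that concrete, a minimal sketch of such a formatter, assuming the ATOM record's 6-character occupancy field (Real(6.2) in the format description). The function name is made up for illustration and this is not the existing PDBIO code:

```python
import warnings


def format_occupancy(occupancy):
    """Format occupancy for the 6-column ATOM field (hypothetical helper).

    None becomes six spaces, with a warning because the field is mandatory.
    """
    if occupancy is None:
        warnings.warn("Missing occupancy written as a blank field", UserWarning)
        return " " * 6
    return "%6.2f" % occupancy
```

By contrast, naive string formatting such as "%6s" % None yields "  None", which is exactly the invalid output to avoid.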

> I've never tangled with the writer and I have incoming PhD students this
> week but I can attempt to add this functionality early next week.

That would be great (assuming no-one else wants to tackle it sooner).

Thanks,

Peter


From arklenna at gmail.com  Thu Aug 15 14:54:53 2013
From: arklenna at gmail.com (Lenna Peterson)
Date: Thu, 15 Aug 2013 10:54:53 -0400
Subject: [Biopython-dev] Bio.PDB - Missing values (was Moratorium on
	commits?)
In-Reply-To: <CAKVJ-_4VkgXev6ni5n_oVLwJvyUGMuUttF-oKek3-6xLFQDvXw@mail.gmail.com>
References: <CAJ9sUYOtO8xd2mVSnUNhbLMnhWXtd4z67vEuADwK8ChmHOLLLw@mail.gmail.com>
	<CAKVJ-_47xSbDKwv85i5TtVVNfmaSHX0Vt3xq8JSyQrx2kKOAMA@mail.gmail.com>
	<CAJ9sUYOnjg5SqxikOL9_fYDSi2P46J8SKs5tQwKA8Mcf5PtxKg@mail.gmail.com>
	<CAHQkFdf0kHMkMaTipz7Odr_1=6qwPNyv2DHYwO3X4=fDnmcwKg@mail.gmail.com>
	<CAKVJ-_4VkgXev6ni5n_oVLwJvyUGMuUttF-oKek3-6xLFQDvXw@mail.gmail.com>
Message-ID: <CAHQkFdenX9S8d6de6RYBW1wAHpxXi4aP7qTtzNm3Hy6H-nkn5Q@mail.gmail.com>

> > I don't think writing string "None" into a fixed width field would be a good
> > idea. So it's probably best to change occupancy (and any other missing
> > values set to None) to blank, correct width fields for writing.
>
> I didn't mean to suggest writing the string "None" in the field, and
> I'm not sure if João did - it would certainly be an invalid PDB file.
>
>
I didn't mean anyone was suggesting we intentionally do this, but I bet
that's what the writer is doing now!


From eric.talevich at gmail.com  Thu Aug 15 17:35:00 2013
From: eric.talevich at gmail.com (Eric Talevich)
Date: Thu, 15 Aug 2013 10:35:00 -0700
Subject: [Biopython-dev] 1.62b test coverage report
In-Reply-To: <CAKVJ-_6Rp5RC4pM1wMJ4k2qyZKhOCqWmd8e-x+ak7dQqHOyXqw@mail.gmail.com>
References: <CA+ijMs=yT-=CFr+qwkOZ107oBN0wEFdjC9uMFCh+j1YfDD4DZw@mail.gmail.com>
	<CAKVJ-_5+mSW5NkmxN94w-8qu+e=q4COyZaAx6UzCmrwuJdU9aQ@mail.gmail.com>
	<CAMC681mkn3FBBdyEMTkba1T60KoQmcR1NHLKHNLc1pUo9A5rWw@mail.gmail.com>
	<CA+ijMs=XOe5Q6cE5vCg_OdnjcTGA=ZjuCJVCjcWY+reW1=jnnQ@mail.gmail.com>
	<CAKVJ-_6Rp5RC4pM1wMJ4k2qyZKhOCqWmd8e-x+ak7dQqHOyXqw@mail.gmail.com>
Message-ID: <CAMC681k-zTdyM4_WEoF2dH70tCBxfVm=a=0iB+F0gHxdWfWDRA@mail.gmail.com>

On Fri, Aug 2, 2013 at 2:31 AM, Peter Cock <p.j.a.cock at googlemail.com> wrote:

> Thanks for these details Ben - it sounds like a mixture of real
> test failures, and mere warnings that an external tool wasn't
> found.
>
> On Fri, Aug 2, 2013 at 3:20 AM, Ben Fulton <ben at benfulton.net> wrote:
> > My test machine was running Ubuntu 12.04.
> >
> > For fasttree I installed version 2.1.4-1~ubuntu12.04.1 using apt-get, and
> > got this error:
> > ApplicationError: Command 'fasttree -out temp_test.tree
> > Quality/example.fasta' returned non-zero exit status 1, 'Unknown or
> > incorrect use of option -out'
>
> I don't seem to have fasttree installed at all, and from the
> test and wrapper it is not explicit about which version it
> was originally written for.
>

I pushed a patch to not use the potentially problematic '-out' flag:
https://github.com/biopython/biopython/commit/771c1ed23bbb39dcf37805b4cb7bb23ffcb0c46a

According to FastTree's changelog (
http://www.microbesonline.org/fasttree/ChangeLog), the -out option was
added in version 2.1.5, released August 30, 2012. So the 'fasttree' package
on the stable Ubuntu (12.04) does not have the -out flag, but the package
in subsequent Ubuntus and other Debian derivatives does.

-Eric


From eric.talevich at gmail.com  Thu Aug 15 23:44:38 2013
From: eric.talevich at gmail.com (Eric Talevich)
Date: Thu, 15 Aug 2013 16:44:38 -0700
Subject: [Biopython-dev] 1.62b test coverage report
In-Reply-To: <CAKVJ-_6Rp5RC4pM1wMJ4k2qyZKhOCqWmd8e-x+ak7dQqHOyXqw@mail.gmail.com>
References: <CA+ijMs=yT-=CFr+qwkOZ107oBN0wEFdjC9uMFCh+j1YfDD4DZw@mail.gmail.com>
	<CAKVJ-_5+mSW5NkmxN94w-8qu+e=q4COyZaAx6UzCmrwuJdU9aQ@mail.gmail.com>
	<CAMC681mkn3FBBdyEMTkba1T60KoQmcR1NHLKHNLc1pUo9A5rWw@mail.gmail.com>
	<CA+ijMs=XOe5Q6cE5vCg_OdnjcTGA=ZjuCJVCjcWY+reW1=jnnQ@mail.gmail.com>
	<CAKVJ-_6Rp5RC4pM1wMJ4k2qyZKhOCqWmd8e-x+ak7dQqHOyXqw@mail.gmail.com>
Message-ID: <CAMC681m5_VXZr1psoqTozuwN=5XcENkDDxP48DVKs-fPydfrKw@mail.gmail.com>

On Fri, Aug 2, 2013 at 2:31 AM, Peter Cock <p.j.a.cock at googlemail.com> wrote:

> Thanks for these details Ben - it sounds like a mixture of real
> test failures, and mere warnings that an external tool wasn't
> found.
>
> On Fri, Aug 2, 2013 at 3:20 AM, Ben Fulton <ben at benfulton.net> wrote:
> > My test machine was running Ubuntu 12.04.
>
[...]
> > I downloaded version 130708 of Prank from
> > http://code.google.com/p/prank-msa/downloads/list. The error is on line 165
> > of the test file:
> >
> > AssertionError:
> > -----------------
> >  PRANK v.130708:
> > -----------------
> >
> > Input for the analysis
> >  - converting 'Quality/example.fasta' to 'temp with space.phy'
>
> This sounds like a minor change in the stdout with recent
> versions of PRANK.
>
>
It's more exciting than that: Old versions of Prank created .xml and .dnd
files by default, and had "-noxml" and "-notree" options to avoid creating
them (or clean them up, whichever). New Pranks do not create these files by
default, but do have "-showxml" and "-showtree" flags if you want them.

I removed the use of these flags in the unit test. One of the tests used
the set_parameter method, so I substituted the "-dots" flag for "-notree".
It passes on my machine now:
https://github.com/biopython/biopython/commit/30d7bcfb6eab8283a53372b2ad64b59be7461eb3

The doctests in Bio/Align/Applications/_Prank.py should probably change,
too, since the same flags are used there. (I have not done this.)

-Eric


From w.arindrarto at gmail.com  Fri Aug 16 07:14:24 2013
From: w.arindrarto at gmail.com (Wibowo Arindrarto)
Date: Fri, 16 Aug 2013 09:14:24 +0200
Subject: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week?
In-Reply-To: <1373712847.72527.YahooMailNeo@web164004.mail.gq1.yahoo.com>
References: <CAKVJ-_6Osh2DyicHg7GoLu=Q1UssjYJzgjFV3K2A9TBJSjxYyg@mail.gmail.com>
	<1373620014.53407.YahooMailNeo@web164001.mail.gq1.yahoo.com>
	<CAKVJ-_7wmyZLj5iDd2ThhBsUHMw3Yy75jr1xR7vN1ObvaRLxMA@mail.gmail.com>
	<1373680350.55044.YahooMailNeo@web164003.mail.gq1.yahoo.com>
	<CADEGkF4FfoszNWaPEa=S4KwVFwV9vVOGGb6Wz+yBAVnmnD3rzw@mail.gmail.com>
	<1373712847.72527.YahooMailNeo@web164004.mail.gq1.yahoo.com>
Message-ID: <CADEGkF5nk1qmpu4vdG1dueYYS5xNxufaX+RQkf8am0pjDbmmfA@mail.gmail.com>

Hi Michiel, Peter,

In preparation for the 1.62 release, I've made the following changes
to Bio.NCBIStandalone and Bio.ParserSupport:

* Migrated the two modules under Bio.SearchIO._legacy
* Upgraded their PendingDeprecationWarning to BiopythonDeprecationWarning

I've pushed the changes to this branch:
https://github.com/bow/biopython/tree/bio_blast_migrate

Tests seem to be running fine still, but now there is the awkward
situation where if users import Bio.NCBIStandalone and/or
Bio.ParserSupport directly they will be greeted with two warnings: the
BiopythonWarning for the modules' deprecation and the
BiopythonExperimentalWarning for SearchIO.

We could suppress the SearchIO warning in Bio.NCBIStandalone and
Bio.ParserSupport. But before this is done, I was wondering if we have
a defined timeline for removing a BiopythonExperimentalWarning? (i.e.
if it will be removed in this release, then we could do that instead).
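
For reference, the suppression option could look roughly like the sketch below. ExperimentalWarning and import_experimental_code are stand-ins so that the example runs without Biopython; in the real modules the category would be BiopythonExperimentalWarning and the guarded statement would be the import of Bio.SearchIO:

```python
import warnings


class ExperimentalWarning(Warning):
    """Stand-in for BiopythonExperimentalWarning (assumption for this sketch)."""


def import_experimental_code():
    """Stand-in for importing a module that warns it is experimental."""
    warnings.warn("This code is experimental.", ExperimentalWarning)
    return "experimental module"


def load_legacy_module():
    """Emit only the deprecation warning, hiding the experimental one."""
    with warnings.catch_warnings():
        # Ignore just this category, and only inside this block.
        warnings.simplefilter("ignore", ExperimentalWarning)
        module = import_experimental_code()
    warnings.warn("This module is deprecated.", DeprecationWarning)
    return module
```

The filter is restored when the with block exits, so users importing SearchIO directly would still see the experimental warning.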

Any opinions on this :)?

Cheers,
Bow

On Sat, Jul 13, 2013 at 12:54 PM, Michiel de Hoon <mjldehoon at yahoo.com> wrote:
> Hi Bow,
>
>
>> Would it be ok if we move parts that are used by SearchIO into their own
>> private classes in
>> Bio.SearchIO, while putting the BiopythonDeprecationWarning on the current
>> files?
>
> That sounds fine to me. Any other opinions, anybody?
>
> Best,
> -Michiel.
>
> ________________________________
> From: Wibowo Arindrarto <w.arindrarto at gmail.com>
> To: Michiel de Hoon <mjldehoon at yahoo.com>
> Cc: Peter Cock <p.j.a.cock at googlemail.com>; Eric Talevich
> <eric.talevich at gmail.com>; Zheng Ruan <zruan1991 at gmail.com>; Biopython-Dev
> Mailing List <biopython-dev at biopython.org>
> Sent: Saturday, July 13, 2013 3:58 PM
> Subject: Re: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week?
>
> Hi Michiel,
>
> There are two classes from Bio.Blast.NCBIStandalone still being used
> by Bio.SearchIO internally (for the BLAST text parser): the
> BlastParser and the Iterator classes. The BlastParser class itself
> still relies on Bio.ParserSupport. Would it be ok if we move parts
> that are used by SearchIO into their own private classes in
> Bio.SearchIO, while putting the BiopythonDeprecationWarning on the
> current files?
>
> Best regards,
> Bow
>
> On Sat, Jul 13, 2013 at 3:52 AM, Michiel de Hoon <mjldehoon at yahoo.com>
> wrote:
>> The following pieces of code had a PendingDeprecationWarning in Biopython
>> release 1.61, and can be upgraded to a BiopythonDeprecationWarning:
>>
>> Bio.Blast.NCBIStandalone (entire module). This module has had a
>> PendingDeprecationWarning since September 2010.
>>
>> Bio.Motif (entire module). Its functionality is available from Bio.motifs,
>> so Bio.Motif can be deprecated.
>>
>> Bio.ParserSupport (entire module). This module is currently only being
>> used by Bio.Blast.NCBIStandalone, and has had a PendingDeprecationWarning
>> since September 2011.
>>
>> Any final objections?
>>
>> Best,
>> -Michiel
>> _______________________________________________
>> Biopython-dev mailing list
>> Biopython-dev at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/biopython-dev
>
>


From p.j.a.cock at googlemail.com  Fri Aug 16 09:31:13 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 16 Aug 2013 10:31:13 +0100
Subject: [Biopython-dev] 1.62b test coverage report
In-Reply-To: <CAMC681m5_VXZr1psoqTozuwN=5XcENkDDxP48DVKs-fPydfrKw@mail.gmail.com>
References: <CA+ijMs=yT-=CFr+qwkOZ107oBN0wEFdjC9uMFCh+j1YfDD4DZw@mail.gmail.com>
	<CAKVJ-_5+mSW5NkmxN94w-8qu+e=q4COyZaAx6UzCmrwuJdU9aQ@mail.gmail.com>
	<CAMC681mkn3FBBdyEMTkba1T60KoQmcR1NHLKHNLc1pUo9A5rWw@mail.gmail.com>
	<CA+ijMs=XOe5Q6cE5vCg_OdnjcTGA=ZjuCJVCjcWY+reW1=jnnQ@mail.gmail.com>
	<CAKVJ-_6Rp5RC4pM1wMJ4k2qyZKhOCqWmd8e-x+ak7dQqHOyXqw@mail.gmail.com>
	<CAMC681m5_VXZr1psoqTozuwN=5XcENkDDxP48DVKs-fPydfrKw@mail.gmail.com>
Message-ID: <CAKVJ-_6k5NjNhOMhSUbxYo6-fN1zitwEsr0Xm1kBd57PTsAVuA@mail.gmail.com>

On Fri, Aug 16, 2013 at 12:44 AM, Eric Talevich wrote:
> On Fri, Aug 2, 2013 at 2:31 AM, Peter Cock wrote:
>> On Fri, Aug 2, 2013 at 3:20 AM, Ben Fulton wrote:
>> > I downloaded version 130708 of Prank from
>> > http://code.google.com/p/prank-msa/downloads/list.
>> >  The error is on line 165 of the test file:
>> >
>> > AssertionError:
>> > -----------------
>> >  PRANK v.130708:
>> > -----------------
>> >
>> > Input for the analysis
>> >  - converting 'Quality/example.fasta' to 'temp with space.phy'
>>
>> This sounds like a minor change in the stdout with recent
>> versions of PRANK.
>>
>
> It's more exciting than that: Old versions of Prank created .xml and .dnd
> files by default, and had "-noxml" and "-notree" options to avoid creating
> them (or clean them up, whichever). New Pranks do not create these files by
> default, but do have "-showxml" and "-showtree" flags if you want them.

Well that API break is a bit annoying, but your test changes make sense.

Do we need to add these new switches to the wrapper itself?

Peter


From eric.talevich at gmail.com  Sun Aug 18 18:14:13 2013
From: eric.talevich at gmail.com (Eric Talevich)
Date: Sun, 18 Aug 2013 11:14:13 -0700
Subject: [Biopython-dev] 1.62b test coverage report
In-Reply-To: <CAKVJ-_6k5NjNhOMhSUbxYo6-fN1zitwEsr0Xm1kBd57PTsAVuA@mail.gmail.com>
References: <CA+ijMs=yT-=CFr+qwkOZ107oBN0wEFdjC9uMFCh+j1YfDD4DZw@mail.gmail.com>
	<CAKVJ-_5+mSW5NkmxN94w-8qu+e=q4COyZaAx6UzCmrwuJdU9aQ@mail.gmail.com>
	<CAMC681mkn3FBBdyEMTkba1T60KoQmcR1NHLKHNLc1pUo9A5rWw@mail.gmail.com>
	<CA+ijMs=XOe5Q6cE5vCg_OdnjcTGA=ZjuCJVCjcWY+reW1=jnnQ@mail.gmail.com>
	<CAKVJ-_6Rp5RC4pM1wMJ4k2qyZKhOCqWmd8e-x+ak7dQqHOyXqw@mail.gmail.com>
	<CAMC681m5_VXZr1psoqTozuwN=5XcENkDDxP48DVKs-fPydfrKw@mail.gmail.com>
	<CAKVJ-_6k5NjNhOMhSUbxYo6-fN1zitwEsr0Xm1kBd57PTsAVuA@mail.gmail.com>
Message-ID: <CAMC681=tq26Y+M4zE=F++nHHc+jsC1mwZ-pW37Gp3B2BHa-SPA@mail.gmail.com>

On Fri, Aug 16, 2013 at 2:31 AM, Peter Cock <p.j.a.cock at googlemail.com> wrote:

> On Fri, Aug 16, 2013 at 12:44 AM, Eric Talevich wrote:
> > On Fri, Aug 2, 2013 at 2:31 AM, Peter Cock wrote:
> >> On Fri, Aug 2, 2013 at 3:20 AM, Ben Fulton wrote:
> >> > I downloaded version 130708 of Prank from
> >> > http://code.google.com/p/prank-msa/downloads/list.
> >> >  The error is on line 165 of the test file:
> >> >
> >> > AssertionError:
> >> > -----------------
> >> >  PRANK v.130708:
> >> > -----------------
> >> >
> >> > Input for the analysis
> >> >  - converting 'Quality/example.fasta' to 'temp with space.phy'
> >>
> >> This sounds like a minor change in the stdout with recent
> >> versions of PRANK.
> >>
> >
> > It's more exciting than that: Old versions of Prank created .xml and .dnd
> > files by default, and had "-noxml" and "-notree" options to avoid creating
> > them (or clean them up, whichever). New Pranks do not create these files by
> > default, but do have "-showxml" and "-showtree" flags if you want them.
>
> Well that API break is a bit annoying, but your test changes make sense.
>
> Do we need to add these new switches to the wrapper itself?
>

Here's the commit to add those switches to the wrapper:
https://github.com/biopython/biopython/commit/cc234b75e6e82cf9f51e3384a4fbfa1e888a3af1

I suppose it would be helpful if the wrapper detected the version of Prank
and handled the show(tree|xml) flags appropriately to avoid errors. But
that would require running the executable first, I think, which is not
something our wrappers normally do. (And then it would make sense to cache
the result for the duration of the running process.)
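
Just to sketch what I mean (not something I'm proposing to commit): the version could be parsed from the banner the tool prints, like the "PRANK v.130708:" line in Ben's report, and cached per command. The function name, the regular expression and the idea of passing the command as a tuple are all assumptions here, and I haven't checked which invocation makes Prank print its banner without doing any work:

```python
import re
import subprocess
from functools import lru_cache

# Matches banners such as " PRANK v.130708:" (format taken from the report above).
_VERSION_RE = re.compile(r"v\.(\d+)")


@lru_cache(maxsize=None)
def detect_version(cmd):
    """Run cmd (a tuple of strings) once and return the version as an int.

    Returns None if no version banner is found. Results are cached per
    command for the lifetime of the running process.
    """
    proc = subprocess.run(list(cmd), stdout=subprocess.PIPE,
                          stderr=subprocess.STDOUT, universal_newlines=True)
    match = _VERSION_RE.search(proc.stdout)
    return int(match.group(1)) if match else None
```

The cache means the executable is launched at most once per distinct command, but it still has to be launched, which is the part our wrappers don't normally do.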

-Eric


From p.j.a.cock at googlemail.com  Sun Aug 18 18:39:08 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Sun, 18 Aug 2013 19:39:08 +0100
Subject: [Biopython-dev] 1.62b test coverage report
In-Reply-To: <CAMC681=tq26Y+M4zE=F++nHHc+jsC1mwZ-pW37Gp3B2BHa-SPA@mail.gmail.com>
References: <CA+ijMs=yT-=CFr+qwkOZ107oBN0wEFdjC9uMFCh+j1YfDD4DZw@mail.gmail.com>
	<CAKVJ-_5+mSW5NkmxN94w-8qu+e=q4COyZaAx6UzCmrwuJdU9aQ@mail.gmail.com>
	<CAMC681mkn3FBBdyEMTkba1T60KoQmcR1NHLKHNLc1pUo9A5rWw@mail.gmail.com>
	<CA+ijMs=XOe5Q6cE5vCg_OdnjcTGA=ZjuCJVCjcWY+reW1=jnnQ@mail.gmail.com>
	<CAKVJ-_6Rp5RC4pM1wMJ4k2qyZKhOCqWmd8e-x+ak7dQqHOyXqw@mail.gmail.com>
	<CAMC681m5_VXZr1psoqTozuwN=5XcENkDDxP48DVKs-fPydfrKw@mail.gmail.com>
	<CAKVJ-_6k5NjNhOMhSUbxYo6-fN1zitwEsr0Xm1kBd57PTsAVuA@mail.gmail.com>
	<CAMC681=tq26Y+M4zE=F++nHHc+jsC1mwZ-pW37Gp3B2BHa-SPA@mail.gmail.com>
Message-ID: <CAKVJ-_6eMHt-qbC7G7HZ9oafYW4i8+xDiq24AUKapKjtFnPq-A@mail.gmail.com>

On Sun, Aug 18, 2013 at 7:14 PM, Eric Talevich <eric.talevich at gmail.com> wrote:
> On Fri, Aug 16, 2013 at 2:31 AM, Peter Cock wrote:
>>
>> Well that API break is a bit annoying, but your test changes make sense.
>>
>> Do we need to add these new switches to the wrapper itself?
>
>
> Here's the commit to add those switches to the wrapper:
> https://github.com/biopython/biopython/commit/cc234b75e6e82cf9f51e3384a4fbfa1e888a3af1
>
> I suppose it would be helpful if the wrapper detected the version of Prank
> and handled the show(tree|xml) flags appropriately to avoid errors. But that
> would require running the executable first, I think, which is not something
> our wrappers normally do. (And then it would make sense to cache the result
> for the duration of the running process.)
>
> -Eric

Historically we've just documented this kind of issue in the
parameter docstring - the idea of auto-running the tool in
the background to check the version just sounds like Trouble.

Peter


From yeyanbo289 at gmail.com  Mon Aug 19 07:36:00 2013
From: yeyanbo289 at gmail.com (Yanbo Ye)
Date: Mon, 19 Aug 2013 15:36:00 +0800
Subject: [Biopython-dev] GSOC weekly update 10
Message-ID: <CADoMHjwSFvZnRcY7i_GTwHtj84rrchJVo96uPAQai0Tej600nw@mail.gmail.com>

Hi all,

Biopython.Phylo project update of last week is here:
http://blog.yeyanbo.com/posts/google-summer-of-code-10.html

Thanks,
Yanbo

-- 

Yanbo Ye
Guangzhou Institutes of Biomedicine and Health,
Chinese Academy of Sciences
190 Kaiyuan Avenue, Science Park, Guangzhou, China

Email: ye_yanbo at gibh.ac.cn
Web: http://www.yeyanbo.com
Phone: (86)-020-32093810


From zruan1991 at gmail.com  Mon Aug 19 15:06:05 2013
From: zruan1991 at gmail.com (Zheng Ruan)
Date: Mon, 19 Aug 2013 11:06:05 -0400
Subject: [Biopython-dev] Codon Alignment GSoC Weekly Update
Message-ID: <CABM7aFohv-E-MzWGbVNrnpPS4BHWoui_9MaNZuEYH8YTFwLqfA@mail.gmail.com>

Hi all,

An update of CodonAlignment GSoC can be found at (http://zruanweb.com/).
Thanks for your comments and suggestions.

Best,
Zheng Ruan


From michael.maher at ucsf.edu  Mon Aug 19 19:24:04 2013
From: michael.maher at ucsf.edu (Cyrus Maher)
Date: Mon, 19 Aug 2013 12:24:04 -0700
Subject: [Biopython-dev] Fwd: New Biopython (sub)module?
In-Reply-To: <CAME4z04Se6sQ73wgWqqWD7+_ZUxArHGuqzZFAGMx2SqhoGoYJw@mail.gmail.com>
References: <CAME4z04Se6sQ73wgWqqWD7+_ZUxArHGuqzZFAGMx2SqhoGoYJw@mail.gmail.com>
Message-ID: <CAME4z05YS0c-A0m-2-_y+Et1GJcnZ+_Nm3vigbWPSKGgC=ju4w@mail.gmail.com>

Hi everybody!!-

My name is (Michael) Cyrus Maher, and I'm a PhD student at UCSF in the lab
of Dr. Ryan D. Hernandez (http://bts.ucsf.edu/hernandez_lab/)...

I am writing because I'm interested in submitting a new Biopython module.
Since this is likely a one-time event, the wiki recommends proceeding
through a developer. After speaking with Peter Cock, he recommended that I
open things up for discussion on the mailing list.

Attached is a draft that describes a new method, termed MOSAIC, which
integrates multiple sequence alignments from an arbitrary number of
sources. We show that it greatly increases the number of orthologs that we
are able to detect while maintaining or improving functional-,
phylogenetic-, and sequence identity-based measures of ortholog quality.

Code and documentation may be found here:

https://dl.dropboxusercontent.com/u/43327584/html/index.html

Looking forward to hearing what you think!

Best,

-Cyrus
-------------- next part --------------
A non-text attachment was scrubbed...
Name: OD_fullpaper_8_5_13.docx
Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document
Size: 1666812 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/biopython-dev/attachments/20130819/85ee506a/attachment.docx>

From davidjosephcain at gmail.com  Mon Aug 19 21:18:48 2013
From: davidjosephcain at gmail.com (David Cain)
Date: Mon, 19 Aug 2013 17:18:48 -0400
Subject: [Biopython-dev] Fwd: New Biopython (sub)module?
In-Reply-To: <CAME4z05YS0c-A0m-2-_y+Et1GJcnZ+_Nm3vigbWPSKGgC=ju4w@mail.gmail.com>
References: <CAME4z04Se6sQ73wgWqqWD7+_ZUxArHGuqzZFAGMx2SqhoGoYJw@mail.gmail.com>
	<CAME4z05YS0c-A0m-2-_y+Et1GJcnZ+_Nm3vigbWPSKGgC=ju4w@mail.gmail.com>
Message-ID: <CAPyP4u+jjFx6pwNC8406BR9QvUq7OHov-jKGdoORmT_=TcdVPA@mail.gmail.com>

Hi, Cyrus! Before the constructive criticism, I just wanted to say your
module looks excellent and thank you for opening it up as free software!

I'm by no means a developer (just interested in Biopython's development),
but I noticed your code generally doesn't adhere to
PEP8 <http://www.python.org/dev/peps/pep-0008/>.
If you're interested in getting feedback from others, it's quite valuable
to format your code by the standards. (Proper PEP 8 code has a look and
feel that's easier for the trained eye to view).

Key things that detract from your module's readability:
- CamelCase method, module, and field names (when a Python developer sees
  these, they're prone to assuming the name is for a class). Of course,
  Biopython doesn't provide the best example here, but there are reasons for
  that <http://www.biopython.org/pipermail/biopython-dev/2012-September/009938.html>
  (it'll be fixed eventually). All-caps names are either avoided or reserved
  for constants (i.e. you may wish to rename your module `mosaic`).
- Very long lines: you should really try to keep your lines to 79
  characters.
- Using integers as booleans (you should stick to True/False, e.g. `while
  True` in lieu of `while 1`).
- Module renamings: it's much easier to read `random.shuffle` than
  `r.shuffle`, as one can assume `random` is the standard module, whereas `r`
  might be something completely different.
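
A minimal sketch of the points in the list above. The function and constant
names are hypothetical and are not taken from the MOSAIC code; they only
illustrate the naming and style conventions being described.

```python
import random

MAX_ATTEMPTS = 3  # an all-caps name is reserved for a constant


def shuffled_copy(items, seed=None):
    """Return a shuffled copy of items (lower_case function name)."""
    rng = random.Random(seed)  # full module name, not an alias such as `r`
    result = list(items)
    rng.shuffle(result)
    return result


def first_nonempty(batches):
    """Return the first non-empty batch, or None if there is none."""
    iterator = iter(batches)
    while True:  # `while True` rather than `while 1`
        try:
            batch = next(iterator)
        except StopIteration:
            return None
        if batch:
            return batch
```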

Also, your module should definitely remove usage of pdb if you wish to
publish it as part of an official Python package.

Would you be open to hosting a development branch of your code on GitHub or
a similar community-editable resource? Any acceptance to the official
Biopython distribution would of course be up to the main devs, but I'd be
more than happy to test your code and make suggestions, regardless of its
integration to a third-party package.

David


From christian at brueffer.de  Tue Aug 20 11:36:09 2013
From: christian at brueffer.de (Christian Brueffer)
Date: Tue, 20 Aug 2013 13:36:09 +0200
Subject: [Biopython-dev] Fwd: New Biopython (sub)module?
In-Reply-To: <CAME4z05YS0c-A0m-2-_y+Et1GJcnZ+_Nm3vigbWPSKGgC=ju4w@mail.gmail.com>
References: <CAME4z04Se6sQ73wgWqqWD7+_ZUxArHGuqzZFAGMx2SqhoGoYJw@mail.gmail.com>
	<CAME4z05YS0c-A0m-2-_y+Et1GJcnZ+_Nm3vigbWPSKGgC=ju4w@mail.gmail.com>
Message-ID: <521354A9.6020701@brueffer.de>

On 8/19/13 21:24 , Cyrus Maher wrote:
> Hi everybody!!-
> 
> My name is (Michael) Cyrus Maher, and I'm a PhD student at UCSF in the lab
> of Dr. Ryan D. Hernandez (http://bts.ucsf.edu/hernandez_lab/)...
> 
> I am writing because I'm interested in submitting a new Biopython module.
> Since this is likely a one-time event, the wiki recommends proceeding
> through a developer. After speaking with Peter Cock, he recommended that I
> open things up for discussion on the mailing list.
> 
> Attached is a draft that describes a new method, termed MOSAIC, which
> integrates multiple sequence alignments from an arbitrary number of
> sources. We show that it greatly increases the number of orthologs that we
> are able to detect while maintaining or improving functional-,
> phylogenetic-, and sequence identity-based measures of ortholog quality.
> 
> Code and documentation may be found here:
> 
> https://dl.dropboxusercontent.com/u/43327584/html/index.html
> 
> Looking forward to hearing what you think!
> 

Hi Cyrus,

I agree with David on the PEP8 issue.  A very nice tool to use is the
pep8 checker, https://pypi.python.org/pypi/pep8

I see that you use MSAProbs.  I have an MSAProbs application wrapper in
the works.  I haven't submitted it yet due to incomplete unit tests,
but maybe it's useful to you:

https://github.com/cbrueffer/biopython/tree/msaprobs

Cheers,

Chris


From michael.maher at ucsf.edu  Tue Aug 20 18:24:43 2013
From: michael.maher at ucsf.edu (Cyrus Maher)
Date: Tue, 20 Aug 2013 11:24:43 -0700
Subject: [Biopython-dev] Fwd: New Biopython (sub)module?
In-Reply-To: <521354A9.6020701@brueffer.de>
References: <CAME4z04Se6sQ73wgWqqWD7+_ZUxArHGuqzZFAGMx2SqhoGoYJw@mail.gmail.com>
	<CAME4z05YS0c-A0m-2-_y+Et1GJcnZ+_Nm3vigbWPSKGgC=ju4w@mail.gmail.com>
	<521354A9.6020701@brueffer.de>
Message-ID: <CAME4z058aywiZJmAVnXqDosfSsgOmgWPJaoDURL11s_eHR3saA@mail.gmail.com>

Thanks for your feedback, guys!! I did a bit of general clean-up and I've
made all the recommended PEP8 changes, with the exception that I kept
capital letters if they were part of an acronym. I've also switched the
link in the documentation over to github and configured mosaic to use the
MSAProbs application wrapper if it's installed. Let me know what you think!!

Docs: https://dl.dropboxusercontent.com/u/43327584/html/index.html
Code: https://github.com/cyrusmaher/mosaic

Cheers,

-Cyrus




From mok at bioxray.dk  Tue Aug 20 18:35:14 2013
From: mok at bioxray.dk (Morten Kjeldgaard)
Date: Tue, 20 Aug 2013 20:35:14 +0200
Subject: [Biopython-dev] Bio.PDB - Missing values (was Moratorium on
	commits?)
In-Reply-To: <CAHQkFdenX9S8d6de6RYBW1wAHpxXi4aP7qTtzNm3Hy6H-nkn5Q@mail.gmail.com>
References: <CAJ9sUYOtO8xd2mVSnUNhbLMnhWXtd4z67vEuADwK8ChmHOLLLw@mail.gmail.com>
	<CAKVJ-_47xSbDKwv85i5TtVVNfmaSHX0Vt3xq8JSyQrx2kKOAMA@mail.gmail.com>
	<CAJ9sUYOnjg5SqxikOL9_fYDSi2P46J8SKs5tQwKA8Mcf5PtxKg@mail.gmail.com>
	<CAHQkFdf0kHMkMaTipz7Odr_1=6qwPNyv2DHYwO3X4=fDnmcwKg@mail.gmail.com>
	<CAKVJ-_4VkgXev6ni5n_oVLwJvyUGMuUttF-oKek3-6xLFQDvXw@mail.gmail.com>
	<CAHQkFdenX9S8d6de6RYBW1wAHpxXi4aP7qTtzNm3Hy6H-nkn5Q@mail.gmail.com>
Message-ID: <43FD0A6C-ED54-4861-AADA-9F3E8FB6172A@bioxray.dk>


On 15/08/2013, at 16:54, Lenna Peterson <arklenna at gmail.com> wrote:

>>> I don't think writing string "None" into a fixed width field would be a
>> good
>>> idea. So it's probably best to change occupancy (and any other missing
>>> values set to None) to blank, correct width fields for writing.
>> 
>> I didn't mean to suggest writing the string "None" in the field, and
>> I'm not sure if João did - it would certainly be an invalid PDB file.
>> 
>> 
> I didn't mean anyone was suggesting we intentionally do this, but I bet
> that's what the writer is doing now!

I think the output should be identical to the input if a PDB file is read
and then written again (apart from the fact that Bio.PDB currently doesn't
save all headers).

Cheers,
Morten


From davidjosephcain at gmail.com  Tue Aug 20 21:25:07 2013
From: davidjosephcain at gmail.com (David Cain)
Date: Tue, 20 Aug 2013 17:25:07 -0400
Subject: [Biopython-dev] Fwd: New Biopython (sub)module?
In-Reply-To: <CAME4z058aywiZJmAVnXqDosfSsgOmgWPJaoDURL11s_eHR3saA@mail.gmail.com>
References: <CAME4z04Se6sQ73wgWqqWD7+_ZUxArHGuqzZFAGMx2SqhoGoYJw@mail.gmail.com>
	<CAME4z05YS0c-A0m-2-_y+Et1GJcnZ+_Nm3vigbWPSKGgC=ju4w@mail.gmail.com>
	<521354A9.6020701@brueffer.de>
	<CAME4z058aywiZJmAVnXqDosfSsgOmgWPJaoDURL11s_eHR3saA@mail.gmail.com>
Message-ID: <CAPyP4uLWAR4MVkeZ9ZafsW+zTgp_CYdZ-ZQYWmkCQnEeB_=tTg@mail.gmail.com>

Hi, Cyrus - I took a quick look at your code on GitHub. Did you publish a
different version of MOSAIC? By my linter, there are 309 PEP8 errors on
mosaic.py.

Also, as a general comment, your code seems to rely on sys.exit
extensively. Python's exception framework is pretty handy - maybe your
module could raise its own custom exceptions (Biopython's PDB parser is a
good example of this design strategy).
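
A minimal sketch of that pattern, assuming nothing about MOSAIC's real
internals: the exception classes and the `load_alignment_paths` helper are
hypothetical names chosen for illustration. The library code raises, and
only the command-line entry point decides whether to exit.

```python
import sys


class MosaicError(Exception):
    """Base class for errors raised by the (hypothetical) module."""


class MissingInputError(MosaicError):
    """Raised when no input alignments are listed."""


def load_alignment_paths(listing):
    """Return the non-blank lines of a file listing, one path per line."""
    paths = [line.strip() for line in listing.splitlines() if line.strip()]
    if not paths:
        # Raise instead of calling sys.exit() inside library code.
        raise MissingInputError("no alignment files listed")
    return paths


def main(listing):
    """Script entry point: the only place that turns errors into exit codes."""
    try:
        paths = load_alignment_paths(listing)
    except MosaicError as err:
        sys.stderr.write("error: %s\n" % err)
        return 1
    sys.stdout.write("%d file(s)\n" % len(paths))
    return 0
```

Callers that import the module can then catch `MosaicError` themselves,
which is not possible when the module exits the interpreter directly.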


David Cain
+1 (339) 222 4452




From arklenna at gmail.com  Tue Aug 20 21:31:40 2013
From: arklenna at gmail.com (Lenna Peterson)
Date: Tue, 20 Aug 2013 17:31:40 -0400
Subject: [Biopython-dev] Fwd: New Biopython (sub)module?
In-Reply-To: <521354A9.6020701@brueffer.de>
References: <CAME4z04Se6sQ73wgWqqWD7+_ZUxArHGuqzZFAGMx2SqhoGoYJw@mail.gmail.com>
	<CAME4z05YS0c-A0m-2-_y+Et1GJcnZ+_Nm3vigbWPSKGgC=ju4w@mail.gmail.com>
	<521354A9.6020701@brueffer.de>
Message-ID: <CAHQkFdcG4dgm3JgwW4F3SYKsaKOWdKwS3Thwa4KxNsgxsDxwfQ@mail.gmail.com>

Also worth noting is autopep8: https://pypi.python.org/pypi/autopep8
(it can be a bit aggressive but that's what version control is for, right?)

Cheers,

Lenna




From arklenna at gmail.com  Tue Aug 20 22:16:18 2013
From: arklenna at gmail.com (Lenna Peterson)
Date: Tue, 20 Aug 2013 18:16:18 -0400
Subject: [Biopython-dev] Bio.PDB - Missing values (was Moratorium on
	commits?)
In-Reply-To: <CAKVJ-_4VkgXev6ni5n_oVLwJvyUGMuUttF-oKek3-6xLFQDvXw@mail.gmail.com>
References: <CAJ9sUYOtO8xd2mVSnUNhbLMnhWXtd4z67vEuADwK8ChmHOLLLw@mail.gmail.com>
	<CAKVJ-_47xSbDKwv85i5TtVVNfmaSHX0Vt3xq8JSyQrx2kKOAMA@mail.gmail.com>
	<CAJ9sUYOnjg5SqxikOL9_fYDSi2P46J8SKs5tQwKA8Mcf5PtxKg@mail.gmail.com>
	<CAHQkFdf0kHMkMaTipz7Odr_1=6qwPNyv2DHYwO3X4=fDnmcwKg@mail.gmail.com>
	<CAKVJ-_4VkgXev6ni5n_oVLwJvyUGMuUttF-oKek3-6xLFQDvXw@mail.gmail.com>
Message-ID: <CAHQkFdcXrRj6LUBLMpMFtG47KAKo2JDcMYbwsXnEZe0aESONxA@mail.gmail.com>

On Thu, Aug 15, 2013 at 9:23 AM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
>
>
> I didn't mean to suggest writing the string "None" in the field, and
> I'm not sure if João did - it would certainly be an invalid PDB file.
>
> I agree that where the data structure has None (e.g. from our parser)
> then the writer could use a blank string (of the appropriate width).
> For mandatory fields like occupancy, this should give a warning.
>
>
As I suspected, the writer currently fails on None (it's expecting a
float). Test-driven development!

However, I don't see a simple or elegant way to force writing of a blank
occupancy. ATOM lines are currently written using C-style string
formatting, and the occupancy field is `%6.2f`.

Off the top of my head, I'd:

1. Store the original format string
2. Modify the format string to have "%6s" at the appropriate position
3. Modify the occupancy to be an empty string or a space
4. Set the return value to the formatted string
5. Restore the original format string
6. Return the return value

However, this seems...ugly at best. I don't know that switching formatting
styles (e.g. to string.format() or others) will help. And in most
circumstances, the type checking of the format string is useful.

Any thoughts?

Cheers,

Lenna


From anaryin at gmail.com  Tue Aug 20 22:25:57 2013
From: anaryin at gmail.com (João Rodrigues)
Date: Tue, 20 Aug 2013 15:25:57 -0700
Subject: [Biopython-dev] Bio.PDB - Missing values (was Moratorium on
	commits?)
In-Reply-To: <CAHQkFdcXrRj6LUBLMpMFtG47KAKo2JDcMYbwsXnEZe0aESONxA@mail.gmail.com>
References: <CAJ9sUYOtO8xd2mVSnUNhbLMnhWXtd4z67vEuADwK8ChmHOLLLw@mail.gmail.com>
	<CAKVJ-_47xSbDKwv85i5TtVVNfmaSHX0Vt3xq8JSyQrx2kKOAMA@mail.gmail.com>
	<CAJ9sUYOnjg5SqxikOL9_fYDSi2P46J8SKs5tQwKA8Mcf5PtxKg@mail.gmail.com>
	<CAHQkFdf0kHMkMaTipz7Odr_1=6qwPNyv2DHYwO3X4=fDnmcwKg@mail.gmail.com>
	<CAKVJ-_4VkgXev6ni5n_oVLwJvyUGMuUttF-oKek3-6xLFQDvXw@mail.gmail.com>
	<CAHQkFdcXrRj6LUBLMpMFtG47KAKo2JDcMYbwsXnEZe0aESONxA@mail.gmail.com>
Message-ID: <CAJ9sUYP4CVc5uA1La2b8hKRzmGX4NcnoEEOR+ZSPA8NQ_yu_jg@mail.gmail.com>

Hi,

We should probably change it to str.format() regardless of advantages.

If we indeed have None in the parser then writing becomes a bit more
complicated. But I guess it's more correct? I'd vote for having a small
check/conversion in the writer, in addition to the string formatting.

As a biologist, I don't care whether it is None, an empty string, or
whatever, but for scripting maybe it makes more sense for it to be None?
That's what I mean by more correct.
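
A rough sketch of such a check/conversion with str.format(), assuming the
six-character occupancy and B-factor fields discussed in this thread. The
helper names are hypothetical and this is not the Bio.PDB writer itself.

```python
def format_occupancy(occupancy):
    """Return the occupancy as a six-character field for an ATOM line."""
    if occupancy is None:
        return " " * 6  # blank field, same width as "{:6.2f}"
    return "{:6.2f}".format(occupancy)


def format_tail(occupancy, bfactor):
    """Format only the occupancy and B-factor columns of an ATOM line."""
    return "{}{:6.2f}".format(format_occupancy(occupancy), bfactor)
```

The conversion happens before the main format call, so the float type
check is kept for every value that is not None.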

Cheers,

João


From michael.maher at ucsf.edu  Wed Aug 21 22:00:04 2013
From: michael.maher at ucsf.edu (Cyrus Maher)
Date: Wed, 21 Aug 2013 15:00:04 -0700
Subject: [Biopython-dev] Fwd: New Biopython (sub)module?
In-Reply-To: <CAHQkFdcG4dgm3JgwW4F3SYKsaKOWdKwS3Thwa4KxNsgxsDxwfQ@mail.gmail.com>
References: <CAME4z04Se6sQ73wgWqqWD7+_ZUxArHGuqzZFAGMx2SqhoGoYJw@mail.gmail.com>
	<CAME4z05YS0c-A0m-2-_y+Et1GJcnZ+_Nm3vigbWPSKGgC=ju4w@mail.gmail.com>
	<521354A9.6020701@brueffer.de>
	<CAHQkFdcG4dgm3JgwW4F3SYKsaKOWdKwS3Thwa4KxNsgxsDxwfQ@mail.gmail.com>
Message-ID: <CAME4z078NxcRxMFVs4trJHmz5VQ_0FGYDDLtKz3o_o5nmFZn2g@mail.gmail.com>

Thanks for sending that along Lenna! And thanks everybody for being patient
with me! This is my first experience sharing software, so it's great to
learn from you guys...

As far as updates:
-I've fixed all pep8 errors, with the exception of some finicky
continuation indent complaints.
-I've also uploaded example files so that the file "mosaic_example.py" can
be run without modification. From the mosaic directory, just type:
    python mosaic_example.py testfiles.txt
-The documentation has been updated as well.

I would of course be open to any additional feedback you guys could offer
for improving the code.

That said, I was also hoping to get your thoughts on whether this seemed
like the type of project that would fit in with Biopython. Peter said that
Eric might have some good comments on this matter?


Cheers,

-Cyrus




From p.j.a.cock at googlemail.com  Thu Aug 22 13:01:27 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Thu, 22 Aug 2013 14:01:27 +0100
Subject: [Biopython-dev] Fwd: New Biopython (sub)module?
In-Reply-To: <CAME4z078NxcRxMFVs4trJHmz5VQ_0FGYDDLtKz3o_o5nmFZn2g@mail.gmail.com>
References: <CAME4z04Se6sQ73wgWqqWD7+_ZUxArHGuqzZFAGMx2SqhoGoYJw@mail.gmail.com>
	<CAME4z05YS0c-A0m-2-_y+Et1GJcnZ+_Nm3vigbWPSKGgC=ju4w@mail.gmail.com>
	<521354A9.6020701@brueffer.de>
	<CAHQkFdcG4dgm3JgwW4F3SYKsaKOWdKwS3Thwa4KxNsgxsDxwfQ@mail.gmail.com>
	<CAME4z078NxcRxMFVs4trJHmz5VQ_0FGYDDLtKz3o_o5nmFZn2g@mail.gmail.com>
Message-ID: <CAKVJ-_4JOLva-8j9xoD2LDMNcsLTPGxn867FdVihK4e+m0y77w@mail.gmail.com>

On Wed, Aug 21, 2013 at 11:00 PM, Cyrus Maher <michael.maher at ucsf.edu> wrote:
>
> That said, I was also hoping to get your thoughts on whether this seemed
> like the type of project that would fit in with Biopython. Peter said that
> Eric might have some good comments on this matter?

Right - I was thinking Eric and this year's phylogenetic focused GSoC
students should have some good comments, e.g. about adding
something like pal2nal into Biopython.

Peter


From p.j.a.cock at googlemail.com  Fri Aug 23 08:54:35 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 23 Aug 2013 09:54:35 +0100
Subject: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week?
In-Reply-To: <CADEGkF5nk1qmpu4vdG1dueYYS5xNxufaX+RQkf8am0pjDbmmfA@mail.gmail.com>
References: <CAKVJ-_6Osh2DyicHg7GoLu=Q1UssjYJzgjFV3K2A9TBJSjxYyg@mail.gmail.com>
	<1373620014.53407.YahooMailNeo@web164001.mail.gq1.yahoo.com>
	<CAKVJ-_7wmyZLj5iDd2ThhBsUHMw3Yy75jr1xR7vN1ObvaRLxMA@mail.gmail.com>
	<1373680350.55044.YahooMailNeo@web164003.mail.gq1.yahoo.com>
	<CADEGkF4FfoszNWaPEa=S4KwVFwV9vVOGGb6Wz+yBAVnmnD3rzw@mail.gmail.com>
	<1373712847.72527.YahooMailNeo@web164004.mail.gq1.yahoo.com>
	<CADEGkF5nk1qmpu4vdG1dueYYS5xNxufaX+RQkf8am0pjDbmmfA@mail.gmail.com>
Message-ID: <CAKVJ-_6TzOuMfmPSTvrdA2nica7n9saT3UY3-33DV8Wov9jXUA@mail.gmail.com>

On Fri, Aug 16, 2013 at 8:14 AM, Wibowo Arindrarto
<w.arindrarto at gmail.com> wrote:
> Hi Michiel, Peter,
>
> In preparation for the 1.62 release, I've made the following changes
> to Bio.NCBIStandalone and Bio.ParserSupport:
>
> * Migrated the two modules under Bio.SearchIO._legacy
> * Upgraded their PendingDeprecationWarning to BiopythonDeprecationWarning

So basically you're proposing formally deprecating parsing plain
text BLAST output (via NCBIStandalone and Bio.ParserSupport)
but continuing to support this format via SearchIO (using a copy
of the current parser as a private module)?

This then gives you the freedom to rewrite the old text parser
more simply (e.g. assuming only recent versions of the BLAST
suite), which might be nice.

> I've pushed the changes to this branch:
> https://github.com/bow/biopython/tree/bio_blast_migrate
>
> Tests seem to be running fine still, but now there is the awkward
> situation where if users import Bio.NCBIStandalone and/or
> Bio.ParserSupport directly they will be greeted with two warnings: the
> BiopythonWarning for the modules' deprecation and the
> BiopythonExperimentalWarning for SearchIO.
>
> We could suppress the SearchIO warning in Bio.NCBIStandalone and
> Bio.ParserSupport. But before this is done, I was wondering if we have
> a defined timeline for removing a BiopythonExperimentalWarning? (i.e.
> if it will be removed in this release, then we could do that instead).

It doesn't make sense to have a defined timeline for removing a
BiopythonExperimentalWarning - it will be on a case by case basis.
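
For reference, the suppression option mentioned in the quoted text could be
sketched with the standard warnings module as follows. The warning class and
the import function here are stand-ins so that the example is self-contained;
they are not the real Biopython names.

```python
import warnings


class ExperimentalWarning(Warning):
    """Stand-in for an experimental-code warning category (hypothetical)."""


def import_experimental_parser():
    """Stand-in for importing a module that warns at import time."""
    warnings.warn("experimental code", ExperimentalWarning)
    return "parser"


def quiet_import():
    """Run the import with only the experimental warning silenced."""
    with warnings.catch_warnings():
        # The filter is local to this block and is undone on exit.
        warnings.simplefilter("ignore", ExperimentalWarning)
        return import_experimental_parser()
```

Other categories, such as a deprecation warning raised by the wrapper
module itself, would still reach the user.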

Do you think SearchIO is ready for that now (or in Biopython 1.63)?

Peter


From p.j.a.cock at googlemail.com  Fri Aug 23 09:05:02 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 23 Aug 2013 10:05:02 +0100
Subject: [Biopython-dev] Bio.PDB - Missing values (was Moratorium on
	commits?)
In-Reply-To: <CAHQkFdcXrRj6LUBLMpMFtG47KAKo2JDcMYbwsXnEZe0aESONxA@mail.gmail.com>
References: <CAJ9sUYOtO8xd2mVSnUNhbLMnhWXtd4z67vEuADwK8ChmHOLLLw@mail.gmail.com>
	<CAKVJ-_47xSbDKwv85i5TtVVNfmaSHX0Vt3xq8JSyQrx2kKOAMA@mail.gmail.com>
	<CAJ9sUYOnjg5SqxikOL9_fYDSi2P46J8SKs5tQwKA8Mcf5PtxKg@mail.gmail.com>
	<CAHQkFdf0kHMkMaTipz7Odr_1=6qwPNyv2DHYwO3X4=fDnmcwKg@mail.gmail.com>
	<CAKVJ-_4VkgXev6ni5n_oVLwJvyUGMuUttF-oKek3-6xLFQDvXw@mail.gmail.com>
	<CAHQkFdcXrRj6LUBLMpMFtG47KAKo2JDcMYbwsXnEZe0aESONxA@mail.gmail.com>
Message-ID: <CAKVJ-_7Xx24v0ekLX_dMBU_+dymVG4VyryKuwrk2vpoi60g8Pg@mail.gmail.com>

On Tue, Aug 20, 2013 at 11:16 PM, Lenna Peterson <arklenna at gmail.com> wrote:
>
> On Thu, Aug 15, 2013 at 9:23 AM, Peter Cock <p.j.a.cock at googlemail.com>
> wrote:
>>
>>
>> I didn't mean to suggest writing the string "None" in the field, and
>> I'm not sure if João did - it would certainly be an invalid PDB file.
>>
>> I agree that where the data structure has None (e.g. from our parser)
>> then the writer could use a blank string (of the appropriate width).
>> For mandatory fields like occupancy, this should give a warning.
>>
>
> As I suspected, the writer currently fails on None (it's expecting a float).
> Test-driven development!
>
> However, I don't see a simple or elegant way to force writing of a blank
> occupancy. ATOM lines are currently written using C-style string formatting,
> and the occupancy field is `%6.2f`.
>
> Off the top of my head, I'd:
>
> 1. Store the original format string
> 2. Modify the format string to have "%6s" at the appropriate position
> 3. Modify the occupancy to be an empty string or a space
> 4. Set the return value to the formatted string
> 5. Restore the original format string
> 6. Return the return value
>
> However, this seems...ugly at best. I don't know that switching formatting
> styles (e.g. to string.format() or others) will help. And in most
> circumstances, the type checking of the format string is useful.
>
> Any thoughts?

I would suggest something like this (untested):

$ git diff
diff --git a/Bio/PDB/PDBIO.py b/Bio/PDB/PDBIO.py
index 2f64571..11a52ca 100644
--- a/Bio/PDB/PDBIO.py
+++ b/Bio/PDB/PDBIO.py
@@ -8,7 +8,7 @@
 from Bio.PDB.StructureBuilder import StructureBuilder # To allow saving of chains, residues, etc..
 from Bio.Data.IUPACData import atom_weights # Allowed Elements

-_ATOM_FORMAT_STRING="%s%5i %-4s%c%3s %c%4i%c   %8.3f%8.3f%8.3f%6.2f%6.2f      %4s%2s%2s\n"
+_ATOM_FORMAT_STRING="%s%5i %-4s%c%3s %c%4i%c   %8.3f%8.3f%8.3f%s%6.2f      %4s%2s%2s\n"


 class Select(object):
@@ -85,8 +85,21 @@ class PDBIO(object):
         x, y, z=atom.get_coord()
         bfactor=atom.get_bfactor()
         occupancy=atom.get_occupancy()
+        # Handle a missing occupancy (None) with a blank entry:
+        try:
+            occupancy_str = "%6.2f" % occupancy
+        except TypeError:
+            if occupancy is None:
+                occupancy_str = " " * 6
+                import warnings
+                from Bio import BiopythonWarning
+                # TODO - Introduce exception BiopythonWriterWarning?
+                warnings.warn("Missing occupancy will be recorded as blank",
+                              BiopythonWarning)
+            else:
+                raise TypeError("Invalid occupancy %r in atom %r" % (occupancy, atom))
         args=(record_type, atom_number, name, altloc, resname, chain_id,
-            resseq, icode, x, y, z, occupancy, bfactor, segid,
+            resseq, icode, x, y, z, occupancy_str, bfactor, segid,
             element, charge)
         return _ATOM_FORMAT_STRING % args


The error message could be improved (e.g. a more helpful identification
of the ATOM at fault)?

Peter


From w.arindrarto at gmail.com  Sat Aug 24 10:22:56 2013
From: w.arindrarto at gmail.com (Wibowo Arindrarto)
Date: Sat, 24 Aug 2013 12:22:56 +0200
Subject: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week?
In-Reply-To: <CAKVJ-_6TzOuMfmPSTvrdA2nica7n9saT3UY3-33DV8Wov9jXUA@mail.gmail.com>
References: <CAKVJ-_6Osh2DyicHg7GoLu=Q1UssjYJzgjFV3K2A9TBJSjxYyg@mail.gmail.com>
	<1373620014.53407.YahooMailNeo@web164001.mail.gq1.yahoo.com>
	<CAKVJ-_7wmyZLj5iDd2ThhBsUHMw3Yy75jr1xR7vN1ObvaRLxMA@mail.gmail.com>
	<1373680350.55044.YahooMailNeo@web164003.mail.gq1.yahoo.com>
	<CADEGkF4FfoszNWaPEa=S4KwVFwV9vVOGGb6Wz+yBAVnmnD3rzw@mail.gmail.com>
	<1373712847.72527.YahooMailNeo@web164004.mail.gq1.yahoo.com>
	<CADEGkF5nk1qmpu4vdG1dueYYS5xNxufaX+RQkf8am0pjDbmmfA@mail.gmail.com>
	<CAKVJ-_6TzOuMfmPSTvrdA2nica7n9saT3UY3-33DV8Wov9jXUA@mail.gmail.com>
Message-ID: <CADEGkF6hAouo5xA3uSVZ8GrpEyA=LB0TKNCke8RQ0kt0W9Kcww@mail.gmail.com>

Hi Peter, everyone,

>> In preparation for the 1.62 release, I've made the following changes
>> to Bio.NCBIStandalone and Bio.ParserSupport:
>>
>> * Migrated the two modules under Bio.SearchIO._legacy
>> * Upgraded their PendingDeprecationWarning to BiopythonDeprecationWarning
>
> So basically you're proposing formally deprecating parsing plain
> text BLAST output (via NCBIStandalone and Bio.ParserSupport)
> but continuing to support this format via SearchIO (using a copy
> of the current parser as a private module)?
>
> This then gives you the freedom to rewrite the old text parser
> more simply (e.g. assuming only recent versions of the BLAST
> suite), which might be nice.

Yes. This seems like a sensible thing to do now.

>> I've pushed the changes to this branch:
>> https://github.com/bow/biopython/tree/bio_blast_migrate
>>
>> Tests seem to be running fine still, but now there is the awkward
>> situation where if users import Bio.NCBIStandalone and/or
>> Bio.ParserSupport directly they will be greeted with two warnings: the
>> BiopythonWarning for the modules' deprecation and the
>> BiopythonExperimentalWarning for SearchIO.
>>
>> We could suppress the SearchIO warning in Bio.NCBIStandalone and
>> Bio.ParserSupport. But before this is done, I was wondering if we have
>> a defined timeline for removing a BiopythonExperimentalWarning? (i.e.
>> if it will be removed in this release, then we could do that instead).
>
> It doesn't make sense to have a defined timeline for removing a
> BiopythonExperimentalWarning - it will be on a case by case basis.
>
> Do you think SearchIO is ready for that now (or in Biopython 1.63)?

Hmm... what I have in mind is that as soon as we lift SearchIO's
BiopythonExperimentalWarning, we give Bio.Blast a
PendingDeprecationWarning. I think this gives users a clearer / firmer
choice, since it could be confusing to have two different modules that
handle BLAST parsing in Biopython.
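
For the double-warning situation, a rough sketch of the suppression
option (not code from the branch): the legacy module could wrap its
use of SearchIO in a warnings.catch_warnings block. The two warning
classes and both function names below are stand-ins invented for this
example so that it runs without Biopython.

```python
import warnings


class ExperimentalWarning(Warning):
    """Hypothetical stand-in for BiopythonExperimentalWarning."""


class LegacyDeprecationWarning(Warning):
    """Hypothetical stand-in for BiopythonDeprecationWarning."""


def import_experimental_backend():
    """Simulate importing a package that warns about being experimental."""
    warnings.warn("This code is experimental", ExperimentalWarning)
    return "backend"


def load_legacy_module():
    """Simulate importing a deprecated module built on the backend."""
    # The deprecation warning is still shown to the user.
    warnings.warn("This module is deprecated", LegacyDeprecationWarning)
    # Only the experimental warning is silenced, and only inside this block.
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", ExperimentalWarning)
        backend = import_experimental_backend()
    return backend
```

With something like this, users importing the legacy modules would see
only the deprecation warning, while direct SearchIO users would still
get the experimental one.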

As for the readiness, I think the important features that we planned
have been implemented in SearchIO. I also don't have any major feature
changes that I would like to implement anytime soon. So yes, I
think it is ready.

Best,
Bow


From yeyanbo289 at gmail.com  Mon Aug 26 03:53:50 2013
From: yeyanbo289 at gmail.com (Yanbo Ye)
Date: Mon, 26 Aug 2013 11:53:50 +0800
Subject: [Biopython-dev] GSOC weekly update 11
Message-ID: <CADoMHjw0e9-orsE+qrbe9S5YeB38oad2k+UsiaEnVT9=2xiZoQ@mail.gmail.com>

Hi all,

Biopython.Phylo project update for last week is here:
http://blog.yeyanbo.com/posts/google-summer-of-code-11.html

Thanks,
Yanbo

-- 

*Yanbo Ye*
*Guangzhou Institutes of Biomedicine and Health, *
*Chinese Academy of Sciences*
*190 Kaiyuan Avenue, Science Park, Guangzhou, China*
*Email: ye_yanbo at gibh.ac.cn*
*Web: http://www.yeyanbo.com*
*Phone: (86)-020-32093810*


From p.j.a.cock at googlemail.com  Mon Aug 26 14:04:35 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Mon, 26 Aug 2013 15:04:35 +0100
Subject: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week?
In-Reply-To: <CADEGkF6hAouo5xA3uSVZ8GrpEyA=LB0TKNCke8RQ0kt0W9Kcww@mail.gmail.com>
References: <CAKVJ-_6Osh2DyicHg7GoLu=Q1UssjYJzgjFV3K2A9TBJSjxYyg@mail.gmail.com>
	<1373620014.53407.YahooMailNeo@web164001.mail.gq1.yahoo.com>
	<CAKVJ-_7wmyZLj5iDd2ThhBsUHMw3Yy75jr1xR7vN1ObvaRLxMA@mail.gmail.com>
	<1373680350.55044.YahooMailNeo@web164003.mail.gq1.yahoo.com>
	<CADEGkF4FfoszNWaPEa=S4KwVFwV9vVOGGb6Wz+yBAVnmnD3rzw@mail.gmail.com>
	<1373712847.72527.YahooMailNeo@web164004.mail.gq1.yahoo.com>
	<CADEGkF5nk1qmpu4vdG1dueYYS5xNxufaX+RQkf8am0pjDbmmfA@mail.gmail.com>
	<CAKVJ-_6TzOuMfmPSTvrdA2nica7n9saT3UY3-33DV8Wov9jXUA@mail.gmail.com>
	<CADEGkF6hAouo5xA3uSVZ8GrpEyA=LB0TKNCke8RQ0kt0W9Kcww@mail.gmail.com>
Message-ID: <CAKVJ-_6pPaDgZkVV7RTp3mXYZCPn1ufi-gUAYQADAQUoR=2ADg@mail.gmail.com>

On Sat, Aug 24, 2013 at 11:22 AM, Wibowo Arindrarto
<w.arindrarto at gmail.com> wrote:
> Hi Peter, everyone,
>
> As for the readiness, I think the important features that we planned
> have been implemented in SearchIO. I don't have any major feature
> change that I would like to implement anytime soon, too. So yes, I
> think it is ready.

So you'd be comfortable with removing the experimental warning
for SearchIO in Biopython 1.62 final (this week if the PDB occupancy
thing is resolved)?

And you would like to officially support plain text BLAST parsing
(despite it not being recommended by the NCBI, and known to have
been quite a lot of work in the past to keep the parser working)?

We should probably also give you (Bow) commit rights too, so you
can handle basic parser updates within SearchIO directly - assuming
you're happy with that?

Regards,

Peter


From w.arindrarto at gmail.com  Mon Aug 26 16:04:38 2013
From: w.arindrarto at gmail.com (Wibowo Arindrarto)
Date: Mon, 26 Aug 2013 18:04:38 +0200
Subject: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week?
In-Reply-To: <CAKVJ-_6pPaDgZkVV7RTp3mXYZCPn1ufi-gUAYQADAQUoR=2ADg@mail.gmail.com>
References: <CAKVJ-_6Osh2DyicHg7GoLu=Q1UssjYJzgjFV3K2A9TBJSjxYyg@mail.gmail.com>
	<1373620014.53407.YahooMailNeo@web164001.mail.gq1.yahoo.com>
	<CAKVJ-_7wmyZLj5iDd2ThhBsUHMw3Yy75jr1xR7vN1ObvaRLxMA@mail.gmail.com>
	<1373680350.55044.YahooMailNeo@web164003.mail.gq1.yahoo.com>
	<CADEGkF4FfoszNWaPEa=S4KwVFwV9vVOGGb6Wz+yBAVnmnD3rzw@mail.gmail.com>
	<1373712847.72527.YahooMailNeo@web164004.mail.gq1.yahoo.com>
	<CADEGkF5nk1qmpu4vdG1dueYYS5xNxufaX+RQkf8am0pjDbmmfA@mail.gmail.com>
	<CAKVJ-_6TzOuMfmPSTvrdA2nica7n9saT3UY3-33DV8Wov9jXUA@mail.gmail.com>
	<CADEGkF6hAouo5xA3uSVZ8GrpEyA=LB0TKNCke8RQ0kt0W9Kcww@mail.gmail.com>
	<CAKVJ-_6pPaDgZkVV7RTp3mXYZCPn1ufi-gUAYQADAQUoR=2ADg@mail.gmail.com>
Message-ID: <CADEGkF6otBZncQxhGq3iHNjGkOsoXcso4JTkbTmBfsVJMga1DA@mail.gmail.com>

On Mon, Aug 26, 2013 at 4:04 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> On Sat, Aug 24, 2013 at 11:22 AM, Wibowo Arindrarto
> <w.arindrarto at gmail.com> wrote:
>> Hi Peter, everyone,
>>
>> As for the readiness, I think the important features that we planned
>> have been implemented in SearchIO. I don't have any major feature
>> change that I would like to implement anytime soon, too. So yes, I
>> think it is ready.
>
> So you'd be comfortable with removing the experimental warning
> for SearchIO in Biopython 1.62 final (this week if the PDB occupancy
> thing is resolved)?

Yes. I think all public-facing modules are ok now. There are still two
issues which I consider minor, but which I think should be mentioned
before we lift the warning:

1. Storing [T]FAST[X|Y] query and hit strand information (see
https://redmine.open-bio.org/issues/3419). I'm not sure yet if I
should do the commit, but Jason's patch looks sensible (and I can
probably add some more so that the parser knows whether to set the
strand on hit or query sequence).

2. Collapsing / merging overlapping HSPs. I've received one (or two)
mail(s) asking whether it is possible to merge overlapping HSPs
(apparently BLAST sometimes does this). I haven't figured out a way to
cleanly implement this, so this is on hold for now.
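
To illustrate only the coordinate side of item 2 (the hard part,
merging the alignments themselves, is not covered): a minimal sketch
of merging overlapping ranges. The function name merge_ranges and the
use of plain (start, end) tuples are assumptions for this example and
not part of SearchIO.

```python
def merge_ranges(ranges):
    """Merge overlapping (start, end) ranges and return them sorted.

    Ranges that merely touch (end of one == start of the next) are
    merged as well.
    """
    merged = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous range: extend it.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged
```

For example, merge_ranges([(10, 50), (40, 80), (100, 120)]) gives
[(10, 80), (100, 120)].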

In addition, we had a discussion some months ago about the Bio._utils
module that SearchIO uses (see
http://lists.open-bio.org/pipermail/biopython-dev/2013-January/010219.html,
http://lists.open-bio.org/pipermail/biopython-dev/2013-January/010240.html,
and http://lists.open-bio.org/pipermail/biopython-dev/2013-February/010290.html).
We had an extensive discussion about this last time, which went as far
as considering a change on how we run our tests. Since the Bio._utils
module itself is private, however, no public-facing function in
SearchIO is affected.

Other than these, one planned feature is implementing the HMMER3.1
parser (which I think should not interfere with lifting the warning).

> And you would like to officially support plain text BLAST parsing
> (despite it not being recommended by the NCBI, and known to have
> been quite a lot of work in the past to keep the parser working)?

Looking at http://lists.open-bio.org/pipermail/biopython/2012-September/008166.html,
the most sensible approach seems to be to put the current parser under
SearchIO (hence the module reorganization I did; so we can deprecate
Bio.Blast as a whole without losing functionality), without actually
advertising that we have full support of parsing the text output
(perhaps put a disclaimer that plain text is not guaranteed to work?).
I feel like some people may still want to use previous BLAST versions
anyway, and we do have a functioning parser tested up to 2.2.26+, so
throwing it away doesn't seem to be the best thing to do here. And in
the case that someone does want to extend the parser (could be me,
could be someone else) to work with the latest BLAST version, (s)he
can then extend the existing parser.

> We should probably also give you (Bow) commit rights too, so you
> can handle basic parser updates within SearchIO directly - assuming
> you're happy with that?

This is fine with me.

Best,
Bow

P.S. I made the pull request for the reorganization here:
https://github.com/biopython/biopython/pull/223, comments are welcome
:).


From p.j.a.cock at googlemail.com  Tue Aug 27 08:41:39 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Tue, 27 Aug 2013 09:41:39 +0100
Subject: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week?
In-Reply-To: <CADEGkF6otBZncQxhGq3iHNjGkOsoXcso4JTkbTmBfsVJMga1DA@mail.gmail.com>
References: <CAKVJ-_6Osh2DyicHg7GoLu=Q1UssjYJzgjFV3K2A9TBJSjxYyg@mail.gmail.com>
	<1373620014.53407.YahooMailNeo@web164001.mail.gq1.yahoo.com>
	<CAKVJ-_7wmyZLj5iDd2ThhBsUHMw3Yy75jr1xR7vN1ObvaRLxMA@mail.gmail.com>
	<1373680350.55044.YahooMailNeo@web164003.mail.gq1.yahoo.com>
	<CADEGkF4FfoszNWaPEa=S4KwVFwV9vVOGGb6Wz+yBAVnmnD3rzw@mail.gmail.com>
	<1373712847.72527.YahooMailNeo@web164004.mail.gq1.yahoo.com>
	<CADEGkF5nk1qmpu4vdG1dueYYS5xNxufaX+RQkf8am0pjDbmmfA@mail.gmail.com>
	<CAKVJ-_6TzOuMfmPSTvrdA2nica7n9saT3UY3-33DV8Wov9jXUA@mail.gmail.com>
	<CADEGkF6hAouo5xA3uSVZ8GrpEyA=LB0TKNCke8RQ0kt0W9Kcww@mail.gmail.com>
	<CAKVJ-_6pPaDgZkVV7RTp3mXYZCPn1ufi-gUAYQADAQUoR=2ADg@mail.gmail.com>
	<CADEGkF6otBZncQxhGq3iHNjGkOsoXcso4JTkbTmBfsVJMga1DA@mail.gmail.com>
Message-ID: <CAKVJ-_5dEA9daJLnK9VDqHs2dma8gT2sFM0SSEjmrR4DVVRhjA@mail.gmail.com>

On Mon, Aug 26, 2013 at 5:04 PM, Wibowo Arindrarto
<w.arindrarto at gmail.com> wrote:
>
>> So you'd be comfortable with removing the experimental warning
>> for SearchIO in Biopython 1.62 final (this week if the PDB occupancy
>> thing is resolved)?
>
> Yes. I think all public-facing modules are ok now. There are still two
> issues which I consider minor, but I think should be mentioned before
> we lift the warning:
>
> ...
>
> Other than these, some planned features are implementing the HMMER3.1
> parser (which I think should not interfere with lifting the warning).

We'll also want to update the Tutorial, merging the BLAST
and SearchIO chapters. Let's start work on this just after releasing
Biopython 1.62, which I think we can now go ahead with :)

Lenna has sorted out the PDB occupancy issue, and Eric has
updated the PRANK unit tests.

I think this means we are OK to do the release in the next day
or two? Any objections?

Regards,

Peter


From p.j.a.cock at googlemail.com  Tue Aug 27 08:43:17 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Tue, 27 Aug 2013 09:43:17 +0100
Subject: [Biopython-dev] Releasing Biopython 1.62 this week?
Message-ID: <CAKVJ-_4-ucrZUCkvsURVO19B2-d4d_DdxGzv2ehbbW6z5ucFsw@mail.gmail.com>

Continuing this thread under a new title, as below, I would
like to do the Biopython 1.62 release in the next day or two:

http://lists.open-bio.org/pipermail/biopython-dev/2013-August/010836.html

Peter

On Tue, Aug 27, 2013 at 9:41 AM, Peter Cock wrote:
> On Mon, Aug 26, 2013 at 5:04 PM, Wibowo Arindrarto wrote:
>>
>>> So you'd be comfortable with removing the experimental warning
>>> for SearchIO in Biopython 1.62 final (this week if the PDB occupancy
>>> thing is resolved)?
>>
>> Yes. I think all public-facing modules are ok now. There are still two
>> issues which I consider minor, but I think should be mentioned before
>> we lift the warning:
>>
>> ...
>>
>> Other than these, some planned features are implementing the HMMER3.1
>> parser (which I think should not interfere with lifting the warning).
>
> We'll also want to update the Tutorial as well, merging the BLAST
> and SearchIO chapters. Let's start work on this just after releasing
> Biopython 1.62 then, which I think we can now go ahead with :)
>
> Lenna has sorted out the PDB occupancy issue, and Eric has
> updated the PRANK unit tests.
>
> I think this means we are OK to do the release in the next day
> or two? Any objections?
>
> Regards,
>
> Peter


From w.arindrarto at gmail.com  Tue Aug 27 09:41:32 2013
From: w.arindrarto at gmail.com (Wibowo Arindrarto)
Date: Tue, 27 Aug 2013 11:41:32 +0200
Subject: [Biopython-dev] Releasing Biopython 1.62 this week?
In-Reply-To: <CAKVJ-_4-ucrZUCkvsURVO19B2-d4d_DdxGzv2ehbbW6z5ucFsw@mail.gmail.com>
References: <CAKVJ-_4-ucrZUCkvsURVO19B2-d4d_DdxGzv2ehbbW6z5ucFsw@mail.gmail.com>
Message-ID: <CADEGkF4-QCSOeHO6HO0Jqz=5j2XmCQvJ6xjxY92uCTGg0fJrjQ@mail.gmail.com>

Hi Peter, everyone,

On Tue, Aug 27, 2013 at 10:43 AM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> Continuing this thread under a new title, as below, I would
> like to do the Biopython 1.62 release in the next day or two:
>
> http://lists.open-bio.org/pipermail/biopython-dev/2013-August/010836.html
>
> Peter
>
> On Tue, Aug 27, 2013 at 9:41 AM, Peter Cock wrote:
>> On Mon, Aug 26, 2013 at 5:04 PM, Wibowo Arindrarto wrote:
>>>
>>>> So you'd be comfortable with removing the experimental warning
>>>> for SearchIO in Biopython 1.62 final (this week if the PDB occupancy
>>>> thing is resolved)?
>>>
>>> Yes. I think all public-facing modules are ok now. There are still two
>>> issues which I consider minor, but I think should be mentioned before
>>> we lift the warning:
>>>
>>> ...
>>>
>>> Other than these, some planned features are implementing the HMMER3.1
>>> parser (which I think should not interfere with lifting the warning).
>>
>> We'll also want to update the Tutorial as well, merging the BLAST
>> and SearchIO chapters. Let's start work on this just after releasing
>> Biopython 1.62 then, which I think we can now go ahead with :)

Ah yes. I missed the tutorial. Then yes, it should be updated as well.
If we are doing this after 1.62 is released, is it worth aiming for a
larger change? (I recall there was a discussion some time ago about
porting the tutorial to Sphinx.)

>> Lenna has sorted out the PDB occupancy issue, and Eric has
>> updated the PRANK unit tests.
>>
>> I think this means we are OK to do the release in the next day
>> or two? Any objections?

No objections from me :).

Best,
Bow


From eric.talevich at gmail.com  Tue Aug 27 18:45:58 2013
From: eric.talevich at gmail.com (Eric Talevich)
Date: Tue, 27 Aug 2013 11:45:58 -0700
Subject: [Biopython-dev] Releasing Biopython 1.62 this week?
In-Reply-To: <CAKVJ-_4-ucrZUCkvsURVO19B2-d4d_DdxGzv2ehbbW6z5ucFsw@mail.gmail.com>
References: <CAKVJ-_4-ucrZUCkvsURVO19B2-d4d_DdxGzv2ehbbW6z5ucFsw@mail.gmail.com>
Message-ID: <CAMC681n1Tv=C2BtDG-gvDGSU=e_bqGrmKSFzQzSAh8coFSKhdg@mail.gmail.com>

On Tue, Aug 27, 2013 at 1:43 AM, Peter Cock <p.j.a.cock at googlemail.com> wrote:

> Continuing this thread under a new title, as below, I would
> like to do the Biopython 1.62 release in the next day or two:
>
> http://lists.open-bio.org/pipermail/biopython-dev/2013-August/010836.html
>
> Peter
>
> On Tue, Aug 27, 2013 at 9:41 AM, Peter Cock wrote:
> > On Mon, Aug 26, 2013 at 5:04 PM, Wibowo Arindrarto wrote:
> >>
> >>> So you'd be comfortable with removing the experimental warning
> >>> for SearchIO in Biopython 1.62 final (this week if the PDB occupancy
> >>> thing is resolved)?
> >>
> >> Yes. I think all public-facing modules are ok now. There are still two
> >> issues which I consider minor, but I think should be mentioned before
> >> we lift the warning:
> >>
> >> ...
> >>
> >> Other than these, some planned features are implementing the HMMER3.1
> >> parser (which I think should not interfere with lifting the warning).
> >
> > We'll also want to update the Tutorial as well, merging the BLAST
> > and SearchIO chapters. Let's start work on this just after releasing
> > Biopython 1.62 then, which I think we can now go ahead with :)
> >
> > Lenna has sorted out the PDB occupancy issue, and Eric has
> > updated the PRANK unit tests.
> >
> > I think this means we are OK to do the release in the next day
> > or two? Any objections?
> >
> > Regards,
> >
> > Peter
>


Sounds good. Mind if I sneak in a quick update to the Phylo chapter of the
Tutorial to mention CDAO support?

Also, has anything else noteworthy been added since the beta that we can
announce in the NEWS file?

Thanks,
Eric


From p.j.a.cock at googlemail.com  Tue Aug 27 19:27:48 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Tue, 27 Aug 2013 20:27:48 +0100
Subject: [Biopython-dev] Releasing Biopython 1.62 this week?
In-Reply-To: <CAMC681n1Tv=C2BtDG-gvDGSU=e_bqGrmKSFzQzSAh8coFSKhdg@mail.gmail.com>
References: <CAKVJ-_4-ucrZUCkvsURVO19B2-d4d_DdxGzv2ehbbW6z5ucFsw@mail.gmail.com>
	<CAMC681n1Tv=C2BtDG-gvDGSU=e_bqGrmKSFzQzSAh8coFSKhdg@mail.gmail.com>
Message-ID: <CAKVJ-_5xsn3uhONzog7-YsmN_kXyzcSpfHfytzkABcfk96m-hg@mail.gmail.com>

On Tue, Aug 27, 2013 at 7:45 PM, Eric Talevich <eric.talevich at gmail.com> wrote:
>
> Sounds good. Mind if I sneak in a quick update to the Phylo chapter of the
> Tutorial to mention CDAO support?

Go for it - I need to retest the DSSP unit test tomorrow anyway.

> Also, has anything else noteworthy been added since the beta that we can
> announce in the NEWS file?

Minor bug fixes and more tests? Perhaps the PDB occupancy change?

Peter


From w.arindrarto at gmail.com  Wed Aug 28 12:12:24 2013
From: w.arindrarto at gmail.com (Wibowo Arindrarto)
Date: Wed, 28 Aug 2013 14:12:24 +0200
Subject: [Biopython-dev] Releasing Biopython 1.62 this week?
In-Reply-To: <CAKVJ-_5xsn3uhONzog7-YsmN_kXyzcSpfHfytzkABcfk96m-hg@mail.gmail.com>
References: <CAKVJ-_4-ucrZUCkvsURVO19B2-d4d_DdxGzv2ehbbW6z5ucFsw@mail.gmail.com>
	<CAMC681n1Tv=C2BtDG-gvDGSU=e_bqGrmKSFzQzSAh8coFSKhdg@mail.gmail.com>
	<CAKVJ-_5xsn3uhONzog7-YsmN_kXyzcSpfHfytzkABcfk96m-hg@mail.gmail.com>
Message-ID: <CADEGkF4nxEBwJkNh1H75mYX3ga2SJZmcHRDRCwwf7AZQ_C+4kw@mail.gmail.com>

Hi Peter, everyone,

On Tue, Aug 27, 2013 at 9:27 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> On Tue, Aug 27, 2013 at 7:45 PM, Eric Talevich <eric.talevich at gmail.com> wrote:
>>
>> Sounds good. Mind if I sneak in a quick update to the Phylo chapter of the
>> Tutorial to mention CDAO support?
>
> Go for it - I need to retest the DSSP unit test tomorrow anyway.
>
>> Also, has anything else noteworthy been added since the beta that we can
>> announce in the NEWS file?
>
> Minor bug fixes and more tests? Perhaps the PDB occupancy change?
>
> Peter

I don't like to believe in coincidences, but just last night a user
emailed me about an issue in SearchIO's exonerate parser which I feel
should be mentioned here (exchange attached with his permission). He
stumbled on an error where an exonerate output file is unparseable
because of split codon alignments. In short, I feel we should not lift
the BiopythonExperimentalWarning for the 1.62 release.

The issue is caused by protein-to-genome alignments in exonerate (in
the protein2genome alignment mode) that have split codons in them. When
split codons are present, SearchIO splits these HSPs into fragments,
each of which is basically a single contiguous sequence alignment.
These fragments have their own Seq objects (representing hit and query
sequences). The problem is that these Seq objects have to be full
sequences, and the query sequence fragment (protein) does not represent
a full sequence here (since the underlying codon is split).
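
A toy illustration of why a split codon breaks the conversion (this is
not the SearchIO code; the function name three_to_one, the small
lookup table and the example strings are all made up for this sketch):

```python
# Deliberately tiny lookup table, just enough for the example.
THREE_TO_ONE = {"Ala": "A", "Gly": "G", "Lys": "K", "Met": "M"}


def three_to_one(fragment):
    """Convert a three-letter amino acid string to one-letter codes.

    Raises ValueError if the fragment does not contain a whole number
    of residues, which is what happens when an intron splits a codon.
    """
    if len(fragment) % 3 != 0:
        raise ValueError("Fragment %r is not a whole number of residues"
                         % fragment)
    return "".join(THREE_TO_ONE[fragment[i:i + 3]]
                   for i in range(0, len(fragment), 3))
```

Here three_to_one("MetAlaGly") returns "MAG", but a fragment cut in
the middle of a residue, such as "MetAlaGl", has no one-letter
representation, so it raises ValueError.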

Currently, SearchIO raises an AssertionError when this type of
alignment is found and simply says it cannot deal with it. This
should not remain the case, though. A test case was actually put up
for this (https://github.com/biopython/biopython/blob/master/Tests/Exonerate/exn_22_m_protein2genome.exn#L173).
However, since I have yet to find a way to properly represent these
fragments with Seq objects, the actual test has not been written (and
I missed this when doing the last review).

I have thought of several alternatives:

* I saw a ThreeLetterProtein Alphabet in
https://github.com/biopython/biopython/blob/master/Bio/Alphabet/__init__.py#L136,
maybe we could use this to create Seq objects that allow partial
codons?

* Change HSPFragment to not use full Seq objects anymore (which may
require some rework on the HSP objects as well)

But I have not explored them thoroughly. I should note that Zheng Ruan's
GSoC project on Codon alignments
(http://zruanweb.com/category/gsoc.html) may prove useful as well
here.

While I don't expect the issue to pop up often (it shows up only when
exonerate is used with the protein2genome mode out of the many modes
it has and when the alignment hits a split codon), I feel like it
should be discussed (or at least mentioned) here first since dealing
with the issue may require some more reworking.

So I'm sorry for the late warning and missing this. I hope this is not
too late :).

Best,
Bow
-------------- next part --------------
On Wed, Aug 28, 2013 at 10:31 AM, Wibowo Arindrarto <w.arindrarto at gmail.com> wrote:
> Hi Somak,
>
>> Do you have any idea whether Bioperl based Exonerate parser can handle such cases?
>> I'm yet to try Bioperl.
>
> I tried your file with Bioperl's parser, and while it can parse the
> entire file without errors, I don't know whether all the information
> in the file (sequence, sequence coordinates) is parsed properly. But
> maybe that's just me being less familiar with Bioperl. I suggest
> posting to their mailing list
> (http://lists.open-bio.org/pipermail/bioperl-l/) or searching the list
> archive if you have any questions regarding this. The library also
> has an active community behind it.
>
>> And please feel free to forward this mail to Biopythonlist or any other discussion forum you
>> think is appropriate,
>
> Ok, thanks :).
>
>> Thanks again
>>
>> Somak Ray
>
> Best,
> Bow
>
>> ________________________________________
>> From: w.arindrarto at gmail.com [w.arindrarto at gmail.com] on behalf of Wibowo Arindrarto [bow at bow.web.id]
>> Sent: Tuesday, August 27, 2013 8:01 PM
>> To: Ray, Somak
>> Subject: Re: On parsing of exonerate output
>>
>> Hi Somak,
>>
>>> Dear Dr. Arindrarto,
>>>
>>> I came across your blog about parsing outputs from Exonerate. I have
>>> generated some files using exonerate's protein2dna model. However, when
>>> running your scripts on them I'm getting an assertion error in Python 2.7.
>>> I'm attaching two such exonerate outputs. The "Result_goodfile.txt" can
>>> be parsed by the parser whereas "Result_badfile.txt" can't be parsed.
>>>
>>> Please let me know if there's any solution to the problem.
>>>
>>> Thanks in advance
>>
>> Hmm..looking at the files, it seems that this is caused by a split
>> codon in the alignment (Results_badfile.txt, line 25). The problem is,
>> the three-letter amino acid sequence needs to be translated into a
>> single-letter amino acid sequence since Biopython could not create Seq
>> objects with three-letter amino acid codes. However, this conversion
>> means that codons that span introns (as the one on line 25) could not
>> be dealt with properly since a single fragment expects a full Seq
>> object (hence the error you're seeing;  it expects the three-letter
>> amino acid sequence length to be multiples of three).
>>
>> So the short answer is no, there is not yet an immediate solution to this issue.
>>
>> I should mention that this came at an appropriate time, though, so
>> thanks for the email :). I am reviewing known SearchIO issues and this
>> was apparently an issue that I have lost track of (looking at the
>> test suite, there is a test case for this but the actual test has
>> not been written).
>>
>> Do you mind if I forward this email to the Biopython list
>> (http://biopython.org/wiki/Mailing_lists)? I think other developers /
>> users may be interested in this.
>>
>> Best,
>> Bow

From p.j.a.cock at googlemail.com  Wed Aug 28 17:31:19 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 28 Aug 2013 18:31:19 +0100
Subject: [Biopython-dev] Releasing Biopython 1.62 this week?
In-Reply-To: <CAKVJ-_46o8X9==60xrB0mZCsApJZKQS9GdaxWgPVVRW05OnXdA@mail.gmail.com>
References: <CAKVJ-_4-ucrZUCkvsURVO19B2-d4d_DdxGzv2ehbbW6z5ucFsw@mail.gmail.com>
	<CAMC681n1Tv=C2BtDG-gvDGSU=e_bqGrmKSFzQzSAh8coFSKhdg@mail.gmail.com>
	<CAKVJ-_5xsn3uhONzog7-YsmN_kXyzcSpfHfytzkABcfk96m-hg@mail.gmail.com>
	<CADEGkF4nxEBwJkNh1H75mYX3ga2SJZmcHRDRCwwf7AZQ_C+4kw@mail.gmail.com>
	<CAKVJ-_46o8X9==60xrB0mZCsApJZKQS9GdaxWgPVVRW05OnXdA@mail.gmail.com>
Message-ID: <CAKVJ-_6J-j2S5rWGs-v51UK3fxXQQqvLDqqK6i85+KARX3UYrQ@mail.gmail.com>

Hello all,

I'm starting the 1.62 release process now. Getting the new DSSP
test working cross-platform was more work than I expected -
thank goodness for the BuildBot server yet again :)

Please don't commit anything to the master branch until further
notice,

Thanks,

Peter


From p.j.a.cock at googlemail.com  Wed Aug 28 18:28:43 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 28 Aug 2013 19:28:43 +0100
Subject: [Biopython-dev] Biopython 1.62 release in progress
Message-ID: <CAKVJ-_72_azX9SZfN-9P6M7hT5pA8Tvi2AZZ5FFO9G+VwbPo=g@mail.gmail.com>

On Wed, Aug 28, 2013 at 6:31 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> Hello all,
>
> I'm starting the release 1.62 process now, getting the new DSSP
> test working cross platform was more work than I expected -
> thank goodness for the BuildBot server yet again :)
>
> Please don't commit anything to the master branch until further
> notice,
>
> Thanks,
>
> Peter

While I finish off the Windows installers etc, and have dinner,
would anyone like to volunteer to write a draft for the release
announcement to go out on the mailing lists and news blog?
http://news.open-bio.org/news/category/obf-projects/biopython/

These are usually based on the rather dry NEWS file information,
and the previous announcement for style/links/etc.

Thanks,

Peter


From p.j.a.cock at googlemail.com  Wed Aug 28 18:53:21 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 28 Aug 2013 19:53:21 +0100
Subject: [Biopython-dev] Post Biopython 1.62 release,
	clean-up after dropping Python 2.5
Message-ID: <CAKVJ-_4QkWKT33cutT5cpTXw7nyqs18KFj+=8gvT_S4854nUOg@mail.gmail.com>

Hello all - especially newcomers,

There are going to be several boring but useful things to do to
the Biopython code base once we're finished with Python 2.5
(the imminent release of Biopython 1.62 has been clearly
described as the final Biopython release to support it).

Some of these tasks are quite easy, and might tempt some
of our non-core contributors or new-comers to have a go,
however to avoid too much duplication of effort I'd suggest
**replying in this thread if you want to tackle anything** - and
then start working out how to send us your first pull request.

Things which will need doing:

(0) Disable the Python 2.5 and Jython 2.5 buildbot
(this will be done by me or Tiago)

(1) Disable the Python 2.5 target in TravisCI, see
https://travis-ci.org/biopython/biopython/
(this is a simple one line edit to the .travis.yml file)

(2) Remove all the with statement imports (and any
comment lines associated with them):

from __future__ import with_statement

(3) Remove Bio/_py3k/_namedtuple.py and adjust
import lines accordingly

(4) Scan over the code base looking for any comments
about Python 2.5 (e.g. using the grep command), and
reviewing them one by one to see if there is an old
workaround we can now remove.

(5) More advanced code review, for example looking
for places we can better take advantage of context
managers (with statements) for file handles.

Of this list, (1), (2) and (3) are certainly things suitable
for relative newcomers - and assuming I'm not away I
will happily do the pull request reviews.

For the more advanced issues (4) and (5) we may need
more eyes on the code...
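
To give newcomers an idea of what (5) means in practice, here is a
small before/after sketch. Both function names are invented for the
example and are not taken from the Biopython code base.

```python
def read_first_line_old(path):
    """Old style: explicit close, guarded by try/finally."""
    handle = open(path)
    try:
        return handle.readline()
    finally:
        handle.close()


def read_first_line(path):
    """Same behaviour using a context manager (with statement)."""
    with open(path) as handle:
        # The handle is closed automatically when the block exits,
        # even if readline() raises an exception.
        return handle.readline()
```

On Python 2.5 the second form needs the __future__ import mentioned in
(2); from Python 2.6 onwards it works as is.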

Thank you,

Peter


From p.j.a.cock at googlemail.com  Wed Aug 28 19:01:36 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 28 Aug 2013 20:01:36 +0100
Subject: [Biopython-dev] Biopython 1.62 release in progress
In-Reply-To: <CAKVJ-_72_azX9SZfN-9P6M7hT5pA8Tvi2AZZ5FFO9G+VwbPo=g@mail.gmail.com>
References: <CAKVJ-_72_azX9SZfN-9P6M7hT5pA8Tvi2AZZ5FFO9G+VwbPo=g@mail.gmail.com>
Message-ID: <CAKVJ-_7yBM0znHk-N91mzBOe-=3gFExzO9N4dXaBnaW7uWzG3Q@mail.gmail.com>

On Wed, Aug 28, 2013 at 7:28 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> On Wed, Aug 28, 2013 at 6:31 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
>> Hello all,
>>
>> I'm starting the release 1.62 process now, getting the new DSSP
>> test working cross platform was more work than I expected -
>> thank goodness for the BuildBot server yet again :)
>>
>> Please don't commit anything to the master branch until further
>> notice,
>>
>> Thanks,
>>
>> Peter
>
> While I finish off the Windows installers etc, and have dinner,
> would anyone like to volunteer to write a draft for the release
> announcement to go out on the mailing lists and news blog?
> http://news.open-bio.org/news/category/obf-projects/biopython/
>
> These are usually based on the rather dry NEWS file information,
> and the previous announcement for style/links/etc.
>
> Thanks,
>
> Peter

A provisional tar-ball, zip file, and four Windows installers are
up now (but deliberately not yet listed on the download wiki page):
http://biopython.org/DIST/

If anyone would care to sanity test those in the next hour or two,
that would be great.

Thanks,

Peter


From p.j.a.cock at googlemail.com  Wed Aug 28 20:43:58 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 28 Aug 2013 21:43:58 +0100
Subject: [Biopython-dev] Post Biopython 1.62 release,
	clean-up after dropping Python 2.5
In-Reply-To: <CAKVJ-_4QkWKT33cutT5cpTXw7nyqs18KFj+=8gvT_S4854nUOg@mail.gmail.com>
References: <CAKVJ-_4QkWKT33cutT5cpTXw7nyqs18KFj+=8gvT_S4854nUOg@mail.gmail.com>
Message-ID: <CAKVJ-_5Ybyo7oi5atbjm9fyFjNZiP635WDX_V-cKwt+nBo517Q@mail.gmail.com>

On Wed, Aug 28, 2013 at 7:53 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> Hello all - especially newcomers,
>
> There are going to be several boring but useful things to do to
> the Biopython code base once we're finished with Python 2.5
> (the imminent release of Biopython 1.62 has been clearly
> described as the final Biopython release to support it).
>
> Some of these tasks are quite easy, and might tempt some
> of our non-core contributors or new-comers to have a go,
> however to avoid too much duplication of effort I'd suggest
> **replying in this thread if you want to tackle anything** - and
> then start working out how to send us your first pull request.

I tweeted this earlier,
https://twitter.com/pjacock/status/372796602760855552

> Things which will need doing:
>
> ...
>
> (1) Disable the Python 2.5 target in TravisCI, see
> https://travis-ci.org/biopython/biopython/
> (this is a simple one line edit to the .travis.yml file)

The first easy task has been claimed already:
https://github.com/biopython/biopython/pull/226

Wayne wrote:
>> Via Twitter, I saw your note
>> "(1) Disable the Python 2.5 target in TravisCI, see
>> https://travis-ci.org/biopython/biopython/
>> (this is a simple one line edit to the .travis.yml file)"
>>
>> Turned out it really was as easy as you said.

Once the release is out, that fix can go in - thanks :)

Wayne (BCC'd), please sign up to the biopython-dev
list if you haven't already:

http://lists.open-bio.org/mailman/listinfo/biopython-dev

Thank you,

Peter


From arklenna at gmail.com  Wed Aug 28 20:57:10 2013
From: arklenna at gmail.com (Lenna Peterson)
Date: Wed, 28 Aug 2013 16:57:10 -0400
Subject: [Biopython-dev] Post Biopython 1.62 release,
 clean-up after dropping Python 2.5
In-Reply-To: <CAKVJ-_4QkWKT33cutT5cpTXw7nyqs18KFj+=8gvT_S4854nUOg@mail.gmail.com>
References: <CAKVJ-_4QkWKT33cutT5cpTXw7nyqs18KFj+=8gvT_S4854nUOg@mail.gmail.com>
Message-ID: <CAHQkFdeLZSrX4MVJhNvnLOoeX_1_b+nuX9ULKBm-fz-y=hRbsQ@mail.gmail.com>

On Wed, Aug 28, 2013 at 2:53 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:

>
> (2) Remove all the with statement imports (and any
> comment lines associated with them):
>
> from __future__ import with_statement
>

As I demonstrated, I regularly forget that `with` is "new"!


>
> (4) Scan over the code base looking for any comments
> about Python 2.5 (e.g. using the grep command), and
> reviewing them one by one to see if there is an old
> workaround we can now remove.
>

If I count:

    find Bio -name "*.py" -exec grep -H -n ".*#.*2\.5" {} \;

I only see 24 - not too bad. Many are `with` related.
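
A rough Python equivalent of the grep above, in case grep is not to hand
(e.g. on Windows). This is only a sketch: the function name
find_py25_comments is made up for illustration, and it uses the same loose
pattern, so it will also return some false positives to review by eye.

```python
import os
import re

# Same loose pattern as the grep: a '#' followed later by "2.5" on the line.
PY25_COMMENT = re.compile(r"#.*2\.5")


def find_py25_comments(root):
    """Return (path, line_number, line) for each matching line under root.

    Hypothetical helper for illustration, not part of Biopython.
    """
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith(".py"):
                continue
            path = os.path.join(dirpath, name)
            # errors="replace" (Python 3) avoids stopping on odd encodings.
            with open(path, errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    if PY25_COMMENT.search(line):
                        hits.append((path, number, line.rstrip("\n")))
    return hits
```

Calling find_py25_comments("Bio") from the top of a checkout should give
roughly the same list as the find/grep one-liner.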


>
> (5) More advanced code review, for example looking
> for places we can better take advantage of context
> managers (with statements) for file handles.
>

For this one:

    find Bio -name "*.py" -exec grep -H -n -P "= ?open\(" {} \;

I find 145...although not all `open()` statements can be easily swapped for
`with`.
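
For the straightforward cases the change looks something like the sketch
below (the function names and the temporary file are invented for
illustration). The awkward ones are where the handle is returned to the
caller or stored on an object, since then it has to outlive the block.

```python
import os
import tempfile


def count_lines_old(filename):
    # Old style: the handle is only closed if nothing raises first.
    handle = open(filename)
    count = len(handle.readlines())
    handle.close()
    return count


def count_lines_new(filename):
    # Context manager: the handle is closed on leaving the block,
    # even if an exception is raised inside it.
    with open(filename) as handle:
        return sum(1 for _line in handle)


# Small self-contained demonstration using a temporary file.
fd, example = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as handle:
    handle.write(">seq1\nACGT\n")
print(count_lines_old(example), count_lines_new(example))
os.remove(example)
```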

I'm currently prepping for my UK trip so I may not be able to do any of
this before I get back mid-September.

Cheers,

Lenna


From p.j.a.cock at googlemail.com  Wed Aug 28 20:58:58 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 28 Aug 2013 21:58:58 +0100
Subject: [Biopython-dev] Post Biopython 1.62 release,
	clean-up after dropping Python 2.5
In-Reply-To: <CAKVJ-_5Ybyo7oi5atbjm9fyFjNZiP635WDX_V-cKwt+nBo517Q@mail.gmail.com>
References: <CAKVJ-_4QkWKT33cutT5cpTXw7nyqs18KFj+=8gvT_S4854nUOg@mail.gmail.com>
	<CAKVJ-_5Ybyo7oi5atbjm9fyFjNZiP635WDX_V-cKwt+nBo517Q@mail.gmail.com>
Message-ID: <CAKVJ-_77G9pqtJieJXSsaUuGwj-jnU4tA7JMVmgp_O1ca4qmAA@mail.gmail.com>

On Wed, Aug 28, 2013 at 9:43 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> On Wed, Aug 28, 2013 at 7:53 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
>> Hello all - especially newcomers,
>>
>> There are going to be several boring but useful things to do to
>> the Biopython code base once we're finished with Python 2.5
>> (the imminent release of Biopython 1.62 has been clearly
>> described as the final Biopython release to support it).
>>
>> Some of these tasks are quite easy, and might tempt some
>> of our non-core contributors or new-comers to have a go,
>> however to avoid too much duplication of effort I'd suggest
>> **replying in this thread if you want to tackle anything** - and
>> then start working out how to send us your first pull request.
>
> I tweeted this earlier,
> https://twitter.com/pjacock/status/372796602760855552
>
>> Things which will need doing:
>>
>> ...
>>
>> (1) Disable the Python 2.5 target in TravisCI, see
>> https://travis-ci.org/biopython/biopython/
>> (this is a simple one line edit to the .travis.yml file)
>
> The first easy task has been claimed already:
> https://github.com/biopython/biopython/pull/226

And task (2) as well on the same pull request - keen!

Wayne (BCC'd), could you delay trying task (3) for a
few days to give someone else a chance please ;)

Maybe have a look for things under (4) instead,
Lenna's quick count suggests plenty of things
need looking at...

Peter


From w.arindrarto at gmail.com  Wed Aug 28 21:17:57 2013
From: w.arindrarto at gmail.com (Wibowo Arindrarto)
Date: Wed, 28 Aug 2013 23:17:57 +0200
Subject: [Biopython-dev] Biopython 1.62 release in progress
In-Reply-To: <CAKVJ-_72_azX9SZfN-9P6M7hT5pA8Tvi2AZZ5FFO9G+VwbPo=g@mail.gmail.com>
References: <CAKVJ-_72_azX9SZfN-9P6M7hT5pA8Tvi2AZZ5FFO9G+VwbPo=g@mail.gmail.com>
Message-ID: <CADEGkF5=-reKxXf+WkLOXsJqcxA1tqMyhFKEEeDgs4p9DCQiWg@mail.gmail.com>

Hi everyone,

I've written a draft of our 1.62 release (below). I'd appreciate it if
somebody gives it another look (for typos, etc.). Also, if I miss
somebody in the contributors list, please let me know :).

---

Biopython 1.62 released
=======================


Source distributions and Windows installers for **Biopython** 1.62 are
now available from the [downloads
page](http://biopython.org/wiki/Download) on the [official Biopython
website](http://biopython.org/wiki/Main_Page) and from the [Python
Package Index (PyPI)](https://pypi.python.org/pypi/biopython).


# Python support

This is our first official release that supports Python 3.
Specifically, we tested under Python 3.3. Other versions of Python 3
may still work albeit with some issues.

We still fully support Python 2.5, 2.6, and 2.7. Support under
[Jython](http://www.jython.org/) is available for versions 2.5 and 2.7
and under [PyPy](http://pypy.org/) for versions 1.9 and 2.0. However,
unlike CPython, Jython and PyPy support is partial: NumPy and our C
extensions are not covered.

Please note that this release marks our last official support Python
2.5. Beginning from Biopython 1.63, the minimum supported Python
version will be 2.6.


# Highlights

* The translation functions will give a warning on any partial codons
(and this will probably become an error in a future release). If you
know you are dealing with partial sequences, either pad with N to
extend the sequence length to a multiple of three, or explicitly trim
the sequence.

* The handling of joins and related complex features in Genbank/EMBL
files has been changed with the introduction of a CompoundLocation
object. Previously a SeqFeature for something like a multi-exon CDS
would have a child SeqFeature (under the sub_features attribute) for
each exon. The sub_features property will still be populated for now,
but is deprecated and will in future be removed. Please consult the
examples in the help (docstrings) and Tutorial.

* Thanks to the efforts of Ben Morris, the Phylo module now supports
the file formats NeXML and CDAO. The Newick parser is also
significantly faster, and can now optionally extract bootstrap values
from the Newick comment field (like Molphy and Archaeopteryx do). Nate
Sutton added a wrapper for FastTree to Bio.Phylo.Applications.

* New module Bio.UniProt adds parsers for the GAF, GPA and GPI formats
from UniProt-GOA.

* The BioSQL module is now supported in Jython. MySQL and PostgreSQL
databases can be used. The relevant JDBC driver should be available in
the CLASSPATH.

* Feature labels on circular GenomeDiagram figures now support the
label_position argument (start, middle or end) in addition to the
current default placement, and in a change to prior releases these
labels are outside the features which is now consistent with the
linear diagrams.

* The code for parsing 3D structures in mmCIF files was updated to use
the Python standard library's shlex module instead of C code using
flex.

* The Bio.Sequencing.Applications module now includes a BWA command
line wrapper.

* Bio.motifs supports JASPAR format files with multiple
position-frequency matrices.

Additionally there have been other minor bug fixes and more unit tests.


# Contributors

Many thanks to the Biopython developers and community for making this release
possible, especially the following contributors:


Alexander Campbell (first contribution)
Andrea Rizzi (first contribution)
Anthony Mathelier (first contribution)
Ben Morris (first contribution)
Brad Chapman
Christian Brueffer
David Arenillas (first contribution)
David Martin (first contribution)
Eric Talevich
Iddo Friedberg
Jian-Long Huang (first contribution)
Joao Rodrigues
Kai Blin
Michiel de Hoon
Nate Sutton (first contribution)
Peter Cock
Petra Kubincová (first contribution)
Phillip Garland
Saket Choudhary (first contribution)
Tiago Antao
Wibowo 'Bow' Arindrarto
Xabier Bello (first contribution)

----

Best,
Bow

On Wed, Aug 28, 2013 at 8:28 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> On Wed, Aug 28, 2013 at 6:31 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
>> Hello all,
>>
>> I'm starting the release 1.62 process now, getting the new DSSP
>> test working cross platform was more work than I expected -
>> thank goodness for the BuildBot server yet again :)
>>
>> Please don't commit anything to the master branch until further
>> notice,
>>
>> Thanks,
>>
>> Peter
>
> While I finish off the Windows installers etc, and have dinner,
> would anyone like to volunteer to write a draft for the release
> announcement to go out on the mailing lists and news blog?
> http://news.open-bio.org/news/category/obf-projects/biopython/
>
> These are usually based on the rather dry NEWS file information,
> and the previous announcement for style/links/etc.
>
> Thanks,
>
> Peter
> _______________________________________________
> Biopython-dev mailing list
> Biopython-dev at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython-dev


From p.j.a.cock at googlemail.com  Wed Aug 28 21:30:33 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 28 Aug 2013 22:30:33 +0100
Subject: [Biopython-dev] Biopython 1.62 release in progress
In-Reply-To: <CADEGkF5=-reKxXf+WkLOXsJqcxA1tqMyhFKEEeDgs4p9DCQiWg@mail.gmail.com>
References: <CAKVJ-_72_azX9SZfN-9P6M7hT5pA8Tvi2AZZ5FFO9G+VwbPo=g@mail.gmail.com>
	<CADEGkF5=-reKxXf+WkLOXsJqcxA1tqMyhFKEEeDgs4p9DCQiWg@mail.gmail.com>
Message-ID: <CAKVJ-_55NdYZ1nFgyYSPetsCi=9T+UbE6z5-RBG0K_knbJeJwg@mail.gmail.com>

On Wed, Aug 28, 2013 at 10:17 PM, Wibowo Arindrarto
<w.arindrarto at gmail.com> wrote:
> Hi everyone,
>
> I've written a draft of our 1.62 release (below). I'd appreciate it if
> somebody gives it another look (for typos, etc.). Also, if I miss
> somebody in the contributors list, please let me know :).

Thanks Bow - I don't think the WordPress blog understands
markdown style markup, but bonus marks anyway :)

I'm about to update the tar-ball and zip file to include the
NEWS file updated with the two names Bow spotted as
missing - hopefully there are no more and this commit
will get the release tag:

https://github.com/biopython/biopython/commit/73f8483f23910c8205cd9a4ff1283f2747d4f4ff

(The Windows installers I prepared earlier should not be
affected as they don't include the NEWS file)

> # Python support
>
> This is our first official release that supports Python 3.
> Specifically, we tested under Python 3.3. Other versions
> of Python 3 may still work albeit with some issues.

I'd be a bit more explicit:

Specifically, this is supported under Python 3.3. Older
versions of Python 3 may still work albeit with some
issues, but are *not* supported.

> Please note that this release marks our last official support Python
> 2.5. Beginning from Biopython 1.63, the minimum supported Python
> version will be 2.6.

Minor typo, needs a for/of, e.g.

Please note that this release marks our last official support for
Python 2.5

Thanks Bow,

Peter


From w.arindrarto at gmail.com  Wed Aug 28 22:17:44 2013
From: w.arindrarto at gmail.com (Wibowo Arindrarto)
Date: Thu, 29 Aug 2013 00:17:44 +0200
Subject: [Biopython-dev] Biopython 1.62 release in progress
In-Reply-To: <CAKVJ-_55NdYZ1nFgyYSPetsCi=9T+UbE6z5-RBG0K_knbJeJwg@mail.gmail.com>
References: <CAKVJ-_72_azX9SZfN-9P6M7hT5pA8Tvi2AZZ5FFO9G+VwbPo=g@mail.gmail.com>
	<CADEGkF5=-reKxXf+WkLOXsJqcxA1tqMyhFKEEeDgs4p9DCQiWg@mail.gmail.com>
	<CAKVJ-_55NdYZ1nFgyYSPetsCi=9T+UbE6z5-RBG0K_knbJeJwg@mail.gmail.com>
Message-ID: <CADEGkF4SN5-VDSvhiRXEnKPNwdUU7vbrUCPWP-0-qfU0PtQkfg@mail.gmail.com>

Hi Peter,

> Thanks Bow - I don't think the WordPress blog understands
> markdown style markup, but bonus marks anyway :)

Ah yes, I was planning to convert it later to HTML (I find writing
markdown first easier ~ and also more mailing-list friendly).

> I'm about to update the tar-ball and zip file to include the
> NEWS file updated with the two names Bow spotted as
> missing - hopefully there are no more and this commit
> will get the release tag:
>
> https://github.com/biopython/biopython/commit/73f8483f23910c8205cd9a4ff1283f2747d4f4ff
>
> (The Windows installers I prepared earlier should not be
> affected as they don't include the NEWS file)
>
>> # Python support
>>
>> This is our first official release that supports Python 3.
>> Specifically, we tested under Python 3.3. Other versions
>> of Python 3 may still work albeit with some issues.
>
> I'd be a bit more explicit:
>
> Specifically, this is supported under Python 3.3. Older
> versions of Python 3 may still work albeit with some
> issues, but are *not* supported.
>
>> Please note that this release marks our last official support Python
>> 2.5. Beginning from Biopython 1.63, the minimum supported Python
>> version will be 2.6.
>
> Minor typo, needs a for/of, e.g.
>
> Please note that this release marks our last official support for
> Python 2.5
>
> Thanks Bow,
>
> Peter

Fixes applied, thanks too :).

Best,
Bow


From p.j.a.cock at googlemail.com  Wed Aug 28 22:21:54 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 28 Aug 2013 23:21:54 +0100
Subject: [Biopython-dev] Biopython 1.62 release in progress
In-Reply-To: <CADEGkF4SN5-VDSvhiRXEnKPNwdUU7vbrUCPWP-0-qfU0PtQkfg@mail.gmail.com>
References: <CAKVJ-_72_azX9SZfN-9P6M7hT5pA8Tvi2AZZ5FFO9G+VwbPo=g@mail.gmail.com>
	<CADEGkF5=-reKxXf+WkLOXsJqcxA1tqMyhFKEEeDgs4p9DCQiWg@mail.gmail.com>
	<CAKVJ-_55NdYZ1nFgyYSPetsCi=9T+UbE6z5-RBG0K_knbJeJwg@mail.gmail.com>
	<CADEGkF4SN5-VDSvhiRXEnKPNwdUU7vbrUCPWP-0-qfU0PtQkfg@mail.gmail.com>
Message-ID: <CAKVJ-_4CuA_OoahrMomHra7qMnbuCcEyFH1cpBNQuxsrWnXEXQ@mail.gmail.com>

On Wed, Aug 28, 2013 at 11:17 PM, Wibowo Arindrarto
<w.arindrarto at gmail.com> wrote:
> Hi Peter,
>
>> Thanks Bow - I don't think the WordPress blog understands
>> markdown style markup, but bonus marks anyway :)
>
> Ah yes, I was planning to convert it later to HTML (I find writing
> markdown first easier ~ and also more mailing-list friendly).

Thank you :)

This is live now but can be edited - so we can fix any
remaining issues before sending round the emails:
http://news.open-bio.org/news/2013/08/biopython-1-62-released/

Tagged on GitHub too,
https://github.com/biopython/biopython/tree/biopython-162

Note I have not yet pushed to PyPI - I'd like one or two
positive reports first before doing that (just in case).

Thanks all,

Peter


From p.j.a.cock at googlemail.com  Wed Aug 28 22:47:04 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 28 Aug 2013 23:47:04 +0100
Subject: [Biopython-dev] Biopython 1.62 released
Message-ID: <CAKVJ-_750vVWnooCx4zPqQf+KOBjFOzF48Zp+F66=MS6o-1c2A@mail.gmail.com>

Dear Biopythoneers,

Source distributions and Windows installers for Biopython 1.62 are now
available from the downloads page on the official Biopython website
and (soon) from the Python Package Index (PyPI).

Python support

This is our first release of Biopython which officially supports
Python 3. Specifically, this is supported under Python 3.3. Older
versions of Python 3 may still work albeit with some issues, but are
not supported.

We still fully support Python 2.5, 2.6, and 2.7. Support under Jython
is available for versions 2.5 and 2.7 and under PyPy for versions 1.9
and 2.0. However, unlike CPython, Jython and PyPy support is partial:
NumPy and our C extensions are not covered.

Please note that this release marks our last official support for
Python 2.5. Beginning from Biopython 1.63, the minimum supported
Python version will be 2.6.

Highlights

The translation functions will give a warning on any partial codons
(and this will probably become an error in a future release). If you
know you are dealing with partial sequences, either pad with 'N' to
extend the sequence length to a multiple of three, or explicitly trim
the sequence.
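
As a rough illustration of the two options, using plain strings (the
helper names are made up here and are not Biopython functions):

```python
def pad_to_codons(sequence, pad="N"):
    """Extend the sequence with pad characters to a multiple of three."""
    remainder = len(sequence) % 3
    if remainder:
        sequence += pad * (3 - remainder)
    return sequence


def trim_to_codons(sequence):
    """Drop any trailing partial codon."""
    return sequence[: len(sequence) - len(sequence) % 3]


partial = "ATGGCCAT"  # eight bases, so the last codon is incomplete
print(pad_to_codons(partial))   # nine bases after padding
print(trim_to_codons(partial))  # six bases after trimming
```

Either result has a length which is a multiple of three, so it should
translate without the partial codon warning.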

The handling of joins and related complex features in Genbank/EMBL
files has been changed with the introduction of a CompoundLocation
object. Previously a SeqFeature for something like a multi-exon CDS
would have a child SeqFeature (under the sub_features attribute) for
each exon. The sub_features property will still be populated for now,
but is deprecated and will in future be removed. Please consult the
examples in the help (docstrings) and Tutorial.

Thanks to the efforts of Ben Morris, the Phylo module now supports the
file formats NeXML and CDAO. The Newick parser is also significantly
faster, and can now optionally extract bootstrap values from the
Newick comment field (like Molphy and Archaeopteryx do). Nate Sutton
added a wrapper for FastTree to Bio.Phylo.Applications.

New module Bio.UniProt adds parsers for the GAF, GPA and GPI formats
from UniProt-GOA.

The BioSQL module is now supported in Jython. MySQL and PostgreSQL
databases can be used. The relevant JDBC driver should be available in
the CLASSPATH.

Feature labels on circular GenomeDiagram figures now support the
label_position argument (start, middle or end) in addition to the
current default placement, and in a change to prior releases these
labels are outside the features which is now consistent with the
linear diagrams.

The code for parsing 3D structures in mmCIF files was updated to use
the Python standard library's shlex module instead of C code using
flex.
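
To give a flavour of why shlex suits this job: mmCIF values containing
spaces are quoted, and shlex.split already understands quoting. A toy
example (the data line is made up for illustration, and this is not the
actual Bio.PDB parsing code):

```python
import shlex

# A made-up mmCIF-style line: a data name followed by a quoted value.
line = "_symmetry.space_group_name_H-M 'P 21 21 21'"

# Splits on whitespace but keeps the quoted value together as one token.
tokens = shlex.split(line)
print(tokens)
```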

The Bio.Sequencing.Applications module now includes a BWA command
line wrapper.

Bio.motifs supports JASPAR format files with multiple
position-frequency matrices.

Additionally there have been other minor bug fixes and more unit tests.

Contributors

Many thanks to the Biopython developers and community for making this
release possible, especially the following contributors:

Alexander Campbell (first contribution)
Andrea Rizzi (first contribution)
Anthony Mathelier (first contribution)
Ben Morris (first contribution)
Brad Chapman
Christian Brueffer
David Arenillas (first contribution)
David Martin (first contribution)
Eric Talevich
Iddo Friedberg
Jian-Long Huang (first contribution)
Joao Rodrigues
Kai Blin
Lenna Peterson
Michiel de Hoon
Matsuyuki Shirota (first contribution)
Nate Sutton (first contribution)
Peter Cock
Petra Kubincová (first contribution)
Phillip Garland
Saket Choudhary (first contribution)
Tiago Antao
Wibowo 'Bow' Arindrarto
Xabier Bello (first contribution)

Thank you all.

Release announcement here (RSS feed available):
http://news.open-bio.org/news/2013/08/biopython-1-62-released/

P.S. You can follow @Biopython on Twitter
https://twitter.com/Biopython


From p.j.a.cock at googlemail.com  Thu Aug 29 09:04:59 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Thu, 29 Aug 2013 10:04:59 +0100
Subject: [Biopython-dev] Post Biopython 1.62 release,
	clean-up after dropping Python 2.5
In-Reply-To: <CAKVJ-_4QkWKT33cutT5cpTXw7nyqs18KFj+=8gvT_S4854nUOg@mail.gmail.com>
References: <CAKVJ-_4QkWKT33cutT5cpTXw7nyqs18KFj+=8gvT_S4854nUOg@mail.gmail.com>
Message-ID: <CAKVJ-_7V3mpE-d+myKfHXzmJ26UFqz24OR6wnFOfrS55=FHJLg@mail.gmail.com>

On Wed, Aug 28, 2013 at 7:53 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> Hello all - especially newcomers,
>
> There are going to be several boring but useful things to do to
> the Biopython code base once we're finished with Python 2.5
> (the imminent release of Biopython 1.62 has been clearly
> described as the final Biopython release to support it).
>
> Some of these tasks are quite easy, and might tempt some
> of our non-core contributors or new-comers to have a go,
> however to avoid too much duplication of effort I'd suggest
> **replying in this thread if you want to tackle anything** - and
> then start working out how to send us your first pull request.
>
> Things which will need doing:
>
> (0) Disable the Python 2.5 and Jython 2.5 buildbot
> (this will be done by me or Tiago)

Done.

> (1) Disable the Python 2.5 target in TravisCI, see
> https://travis-ci.org/biopython/biopython/
> (this is a simple one line edit to the .travis.yml file)

Done by Wayne,
https://github.com/biopython/biopython/commit/d134b3ae6d963b81510c40c621d640ee00b6f3de

> (2) Remove all the with statement imports (and any
> comment lines associated with them):
>
> from __future__ import with_statement

Done by Wayne,
https://github.com/biopython/biopython/commit/eeab501987de61ae5935153e1b1a0b225878cb84

> (3) Remove Bio/_py3k/_namedtuple.py and adjust
> import lines accordingly

Any new volunteer want to try this?

> (4) Scan over the code base looking for any comments
> about Python 2.5 (e.g. using the grep command), and
> reviewing them one by one to see if there is an old
> workaround we can now remove.

Lenna had a quick look, there should be some easy ones here.

> (5) More advanced code review, for example looking
> for places we can better take advantage of context
> managers (with statements) for file handles.

Another new one, related to (5), and fairly easy:

(6) Reviewing examples in the docstrings and Tutorial
where it would make sense to use a 'with' for file handles.

This should also solve many of the ResourceWarning:
unclosed file ... warnings visible running the full test
suite under Python 3, e.g. see:
http://testing.open-bio.org/biopython/builders/Linux%2064%20-%20Python%203.3/builds/298/steps/shell/logs/stdio

Peter


From chris.mit7 at gmail.com  Thu Aug 29 15:20:09 2013
From: chris.mit7 at gmail.com (Chris Mitchell)
Date: Thu, 29 Aug 2013 11:20:09 -0400
Subject: [Biopython-dev] Post Biopython 1.62 release,
 clean-up after dropping Python 2.5
In-Reply-To: <CAKVJ-_7V3mpE-d+myKfHXzmJ26UFqz24OR6wnFOfrS55=FHJLg@mail.gmail.com>
References: <CAKVJ-_4QkWKT33cutT5cpTXw7nyqs18KFj+=8gvT_S4854nUOg@mail.gmail.com>
	<CAKVJ-_7V3mpE-d+myKfHXzmJ26UFqz24OR6wnFOfrS55=FHJLg@mail.gmail.com>
Message-ID: <CAK_U6OBzHMFco3Pe3y=nsvJ8=ut4AkwhPkTmHAHkaXUXuE0r4Q@mail.gmail.com>

I was going to take a stab at (3), but it seems that _namedtuple.py doesn't
exist.

Looking under _py3k as well as grep -Ri namedtuple ./*

fails to find it. I'm pulling from
https://github.com/biopython/biopython.git


On Thu, Aug 29, 2013 at 5:04 AM, Peter Cock <p.j.a.cock at googlemail.com> wrote:

> On Wed, Aug 28, 2013 at 7:53 PM, Peter Cock <p.j.a.cock at googlemail.com>
> wrote:
> > Hello all - especially newcomers,
> >
> > There are going to be several boring but useful things to do to
> > the Biopython code base once we're finished with Python 2.5
> > (the imminent release of Biopython 1.62 has been clearly
> > described as the final Biopython release to support it).
> >
> > Some of these tasks are quite easy, and might tempt some
> > of our non-core contributors or new-comers to have a go,
> > however to avoid too much duplication of effort I'd suggest
> > **replying in this thread if you want to tackle anything** - and
> > then start working out how to send us your first pull request.
> >
> > Things which will need doing:
> >
> > (0) Disable the Python 2.5 and Jython 2.5 buildbot
> > (this will be done by me or Tiago)
>
> Done.
>
> > (1) Disable the Python 2.5 target in TravisCI, see
> > https://travis-ci.org/biopython/biopython/
> > (this is a simple one line edit to the .travis.yml file)
>
> Done by Wayne,
>
> https://github.com/biopython/biopython/commit/d134b3ae6d963b81510c40c621d640ee00b6f3de
>
> > (2) Remove all the with statement imports (and any
> > comment lines associated with them):
> >
> > from __future__ import with_statement
>
> Done by Wayne,
>
> https://github.com/biopython/biopython/commit/eeab501987de61ae5935153e1b1a0b225878cb84
>
> > (3) Remove Bio/_py3k/_namedtuple.py and adjust
> > import lines accordingly
>
> Any new volunteer want to try this?
>
> > (4) Scan over the code base looking for any comments
> > about Python 2.5 (e.g. using the grep command), and
> > reviewing them one by one to see if there is an old
> > workaround we can now remove.
>
> Lenna had a quick look, there should be some easy one here.
>
> > (5) More advanced code review, for example looking
> > for places we can better take advantage of context
> > managers (with statements) for file handles.
>
> Another new one, related to (5), and fairly easy:
>
> (6) Reviewing examples in the docstrings and Tutorial
> where it would make sense to use a 'with' for file handles.
>
> This should also solve many of the ResourceWarning:
> unclosed file ... warnings visible running the full test
> suite under Python 3, e.g. see:
>
> http://testing.open-bio.org/biopython/builders/Linux%2064%20-%20Python%203.3/builds/298/steps/shell/logs/stdio
>
> Peter
>


From p.j.a.cock at googlemail.com  Thu Aug 29 15:30:51 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Thu, 29 Aug 2013 16:30:51 +0100
Subject: [Biopython-dev] Post Biopython 1.62 release,
 clean-up after dropping Python 2.5
In-Reply-To: <CAK_U6OBzHMFco3Pe3y=nsvJ8=ut4AkwhPkTmHAHkaXUXuE0r4Q@mail.gmail.com>
References: <CAKVJ-_4QkWKT33cutT5cpTXw7nyqs18KFj+=8gvT_S4854nUOg@mail.gmail.com>
	<CAKVJ-_7V3mpE-d+myKfHXzmJ26UFqz24OR6wnFOfrS55=FHJLg@mail.gmail.com>
	<CAK_U6OBzHMFco3Pe3y=nsvJ8=ut4AkwhPkTmHAHkaXUXuE0r4Q@mail.gmail.com>
Message-ID: <CAKVJ-_7DLe-o_-fbZhe8iXfKN5-46bMd3y59opigMQnK__Ux0A@mail.gmail.com>

On Thu, Aug 29, 2013 at 4:20 PM, Chris Mitchell <chris.mit7 at gmail.com> wrote:
> I was going to take a stab at (3), but it seems that _namedtuple.py doesn't
> exist.
>
> Looking under _py3k as well as grep -Ri namedtuple ./*
>
> fails to find it. I'm pulling from
> https://github.com/biopython/biopython.git

Oops. I wrote that email on my laptop - it was a file never checked
into source code control. Looking back it was a plan for allowing
us to use named tuples on older versions of Python. Sorry!

But I have come up with another easy task instead,

(7) Update exception style from this,

except ErrorClass, variable_name:

to this:

except ErrorClass as variable_name:

The second form is the only allowed syntax in Python 3,
but was not possible under Python 2.5.
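
For example, a tiny self-contained function using the new style (the
function name is just for illustration):

```python
def parse_int(text):
    """Return int(text), or None with a message if text is not an integer."""
    try:
        return int(text)
    # Python 2.5 style was "except ValueError, err:", a SyntaxError on Python 3.
    except ValueError as err:  # works on Python 2.6+ and Python 3
        print("Could not parse %r: %s" % (text, err))
        return None


print(parse_int("42"))
print(parse_int("forty-two"))
```

One thing to watch for when converting: a tuple of classes such as
"except (TypeError, ValueError), err:" becomes
"except (TypeError, ValueError) as err:", whereas the unbracketed
"except TypeError, ValueError:" only ever caught TypeError (binding it to
the name ValueError), which is usually a bug worth a second look.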

Regards,

Peter


From chris.mit7 at gmail.com  Thu Aug 29 16:03:51 2013
From: chris.mit7 at gmail.com (Chris Mitchell)
Date: Thu, 29 Aug 2013 12:03:51 -0400
Subject: [Biopython-dev] Post Biopython 1.62 release,
 clean-up after dropping Python 2.5
In-Reply-To: <CAKVJ-_7DLe-o_-fbZhe8iXfKN5-46bMd3y59opigMQnK__Ux0A@mail.gmail.com>
References: <CAKVJ-_4QkWKT33cutT5cpTXw7nyqs18KFj+=8gvT_S4854nUOg@mail.gmail.com>
	<CAKVJ-_7V3mpE-d+myKfHXzmJ26UFqz24OR6wnFOfrS55=FHJLg@mail.gmail.com>
	<CAK_U6OBzHMFco3Pe3y=nsvJ8=ut4AkwhPkTmHAHkaXUXuE0r4Q@mail.gmail.com>
	<CAKVJ-_7DLe-o_-fbZhe8iXfKN5-46bMd3y59opigMQnK__Ux0A@mail.gmail.com>
Message-ID: <CAK_U6OCThFBe3OYVOSjKr+h1s0s5kJgijq3ie4ZkXhQX4Zc06Q@mail.gmail.com>

Sounds good. Just took care of (7), running the test suite and will send a
pull request when that passes.

Chris


On Thu, Aug 29, 2013 at 11:30 AM, Peter Cock <p.j.a.cock at googlemail.com> wrote:

> On Thu, Aug 29, 2013 at 4:20 PM, Chris Mitchell <chris.mit7 at gmail.com>
> wrote:
> > I was going to take a stab at (3), but it seems that _namedtuple.py
> > doesn't exist.
> >
> > Looking under _py3k as well as grep -Ri namedtuple ./*
> >
> > fails to find it. I'm pulling from
> > https://github.com/biopython/biopython.git
>
> Oops. I wrote that email on my laptop - it was a file never checked
> into source code control. Looking back it was a plan for allowing
> us to use named tuples on older versions of Python. Sorry!
>
> But I have come up with another easy task instead,
>
> (7) Update exception style from this,
>
> except ErrorClass, variable_name:
>
> to this:
>
> except ErrorClass as variable_name:
>
> The second form is the only allowed syntax in Python 3,
> but was not possible under Python 2.5.
>
> Regards,
>
> Peter
>


From p.j.a.cock at googlemail.com  Thu Aug 29 16:20:51 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Thu, 29 Aug 2013 17:20:51 +0100
Subject: [Biopython-dev] Post Biopython 1.62 release,
 clean-up after dropping Python 2.5
In-Reply-To: <CAK_U6OCThFBe3OYVOSjKr+h1s0s5kJgijq3ie4ZkXhQX4Zc06Q@mail.gmail.com>
References: <CAKVJ-_4QkWKT33cutT5cpTXw7nyqs18KFj+=8gvT_S4854nUOg@mail.gmail.com>
	<CAKVJ-_7V3mpE-d+myKfHXzmJ26UFqz24OR6wnFOfrS55=FHJLg@mail.gmail.com>
	<CAK_U6OBzHMFco3Pe3y=nsvJ8=ut4AkwhPkTmHAHkaXUXuE0r4Q@mail.gmail.com>
	<CAKVJ-_7DLe-o_-fbZhe8iXfKN5-46bMd3y59opigMQnK__Ux0A@mail.gmail.com>
	<CAK_U6OCThFBe3OYVOSjKr+h1s0s5kJgijq3ie4ZkXhQX4Zc06Q@mail.gmail.com>
Message-ID: <CAKVJ-_5QV1G-gWO8ftqa82GTJ-YW3z1AocM-ttNO_co0c=5ZsQ@mail.gmail.com>

On Thu, Aug 29, 2013 at 5:03 PM, Chris Mitchell <chris.mit7 at gmail.com> wrote:
> Sounds good. Just took care of (7), running the test suite and will send a
> pull request when that passes.
>
> Chris

https://github.com/biopython/biopython/pull/227 looks good, but
has highlighted a bug in Scripts/debug/debug_blast_parser.py
(see my comment on GitHub).

Good work,

Peter


From p.j.a.cock at googlemail.com  Thu Aug 29 16:33:43 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Thu, 29 Aug 2013 17:33:43 +0100
Subject: [Biopython-dev] Post Biopython 1.62 release,
	clean-up after dropping Python 2.5
In-Reply-To: <CAKVJ-_7V3mpE-d+myKfHXzmJ26UFqz24OR6wnFOfrS55=FHJLg@mail.gmail.com>
References: <CAKVJ-_4QkWKT33cutT5cpTXw7nyqs18KFj+=8gvT_S4854nUOg@mail.gmail.com>
	<CAKVJ-_7V3mpE-d+myKfHXzmJ26UFqz24OR6wnFOfrS55=FHJLg@mail.gmail.com>
Message-ID: <CAKVJ-_6x9ztCVZNn-W+qpxkAWsy1Rdt_QCYwXAd-8-=nsWjSpA@mail.gmail.com>

> On Wed, Aug 28, 2013 at 7:53 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
>> Hello all - especially newcomers,
>>
>> There are going to be several boring but useful things to do to
>> the Biopython code base once we're finished with Python 2.5
>> (the imminent release of Biopython 1.62 has been clearly
>> described as the final Biopython release to support it).
>>
>> Some of these tasks are quite easy, and might tempt some
>> of our non-core contributors or new-comers to have a go,
>> however to avoid too much duplication of effort I'd suggest
>> **replying in this thread if you want to tackle anything** - and
>> then start working out how to send us your first pull request.
>>
>> Things which will need doing:
>>
>> (0) Disable the Python 2.5 and Jython 2.5 buildbot
>> (this will be done by me or Tiago)
>
> Done.
>
>> (1) Disable the Python 2.5 target in TravisCI, see
>> https://travis-ci.org/biopython/biopython/
>> (this is a simple one line edit to the .travis.yml file)
>
> Done by Wayne,
> https://github.com/biopython/biopython/commit/d134b3ae6d963b81510c40c621d640ee00b6f3de
>
>> (2) Remove all the with statement imports (and any
>> comment lines associated with them):
>>
>> from __future__ import with_statement
>
> Done by Wayne,
> https://github.com/biopython/biopython/commit/eeab501987de61ae5935153e1b1a0b225878cb84
>
>> (3) Remove Bio/_py3k/_namedtuple.py and adjust
>> import lines accordingly

(3) was a false alarm, just an old file on my laptop confusing me.

>> (4) Scan over the code base looking for any comments
>> about Python 2.5 (e.g. using the grep command), and
>> reviewing them one by one to see if there is an old
>> workaround we can now remove.
>
> Lenna had a quick look, there should be some easy one here.
>
>> (5) More advanced code review, for example looking
>> for places we can better take advantage of context
>> managers (with statements) for file handles.
>
> Another new one, related to (5), and fairly easy:
>
> (6) Reviewing examples in the docstrings and Tutorial
> where it would make sense to use a 'with' for file handles.
>
> This should also solve many of the ResourceWarning:
> unclosed file ... warnings visible running the full test
> suite under Python 3, e.g. see:
> http://testing.open-bio.org/biopython/builders/Linux%2064%20-%20Python%203.3/builds/298/steps/shell/logs/stdio

On Thu, Aug 29, 2013 at 11:30 AM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> ... I have come up with another easy task instead,
>
> (7) Update exception style from this,
>
> except ErrorClass, variable_name:
>
> to this:
>
> except ErrorClass as variable_name:
>
> The second form is the only allowed syntax in Python 3,
> but was not possible under Python 2.5.

(7) is being tackled by Chris Mitchell,
https://github.com/biopython/biopython/pull/227
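
As a small sketch of the new style (parse_count is a made-up helper,
not anything in Biopython), note that the 'as' form also removes an
old trap: with the comma syntax, "except ValueError, TypeError:"
caught only ValueError and bound it to the name TypeError.

```python
def parse_count(text):
    """Hypothetical helper: convert text to an int, reporting failures."""
    try:
        return int(text)
    except (ValueError, TypeError) as err:
        # A tuple catches either class; 'as' names the instance.
        return "could not parse %r (%s)" % (text, type(err).__name__)


print(parse_count("42"))
print(parse_count("forty-two"))
print(parse_count(None))
```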

Here's another fairly easy task for another new volunteer?:

(8) Excluding doctests and the Tutorial, use print function
rather than print statement. e.g. replace this:

print variable1, variable2

with this:

from __future__ import print_function
...
print(variable1, variable2)

Note that I am deliberately not suggesting we switch the
user visible examples on our documentation yet - that
deserves some discussion first.
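
A slightly fuller sketch of what the print function allows once the
future import is in place. The report helper is hypothetical, only
there to show the sep and file keyword arguments:

```python
from __future__ import print_function

import sys


def report(variable1, variable2, stream=None):
    """Hypothetical helper: write two values as one tab separated line."""
    if stream is None:
        stream = sys.stdout
    # sep and file replace the old "print >> stream, a, b" syntax.
    print(variable1, variable2, sep="\t", file=stream)


report("spam", 42)
```

The future import has to be the first statement in the module (after
any docstring and comments), and it applies to the whole module.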

Peter


From p.j.a.cock at googlemail.com  Thu Aug 29 17:03:24 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Thu, 29 Aug 2013 18:03:24 +0100
Subject: [Biopython-dev] Python 2.6+ support for __dir__ method
Message-ID: <CAKVJ-_7gUKDAqW-B7+eFCjVX=3LZYvUbELfCgjOWzPX8TCCUhQ@mail.gmail.com>

Hi all,

I was reading over the list of what's new in Python 2.6 and wondered about this:

> The built-in dir() function now checks for a __dir__() method on the
> objects it receives. This method must return a list of strings containing
> the names of valid attributes for the object, and lets the object control
> the value that dir() produces. Objects that have __getattr__() or
> __getattribute__() methods can use this to advertise pseudo-attributes
> they will honor. (issue 1591665)

http://docs.python.org/2/whatsnew/2.6.html

Does that sound useful for some of our more dynamic objects?
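
To illustrate, a minimal sketch of the idea. DynamicRecord and its
fields are invented for this example and do not correspond to any
existing Biopython class:

```python
class DynamicRecord(object):
    """Hypothetical object exposing dictionary keys as pseudo-attributes."""

    def __init__(self, **fields):
        self._fields = dict(fields)

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        try:
            return self.__dict__["_fields"][name]
        except KeyError:
            raise AttributeError(name)

    def __dir__(self):
        # Advertise the pseudo-attributes alongside the regular ones.
        names = set(dir(type(self))) | set(self.__dict__) | set(self._fields)
        return sorted(names)


record = DynamicRecord(organism="E. coli", length=42)
print(record.length)
print("organism" in dir(record))
```

Without the __dir__ method, dir(record) would list _fields but not
organism or length, so tab completion would not find them.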

Peter


From arklenna at gmail.com  Thu Aug 29 17:18:16 2013
From: arklenna at gmail.com (Lenna Peterson)
Date: Thu, 29 Aug 2013 13:18:16 -0400
Subject: [Biopython-dev] Post Biopython 1.62 release,
 clean-up after dropping Python 2.5
In-Reply-To: <CAKVJ-_6x9ztCVZNn-W+qpxkAWsy1Rdt_QCYwXAd-8-=nsWjSpA@mail.gmail.com>
References: <CAKVJ-_4QkWKT33cutT5cpTXw7nyqs18KFj+=8gvT_S4854nUOg@mail.gmail.com>
	<CAKVJ-_7V3mpE-d+myKfHXzmJ26UFqz24OR6wnFOfrS55=FHJLg@mail.gmail.com>
	<CAKVJ-_6x9ztCVZNn-W+qpxkAWsy1Rdt_QCYwXAd-8-=nsWjSpA@mail.gmail.com>
Message-ID: <CAHQkFdfhJsRGGq=BPpZDh=rS8UtJcqA0ELSqHwjGvPqb5=GG=g@mail.gmail.com>

On Thu, Aug 29, 2013 at 12:33 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:

>
> Here's another fairly easy task for another new volunteer?:
>
> (8) Excluding doctests and the Tutorial, use print function
> rather than print statement. e.g. replace this:
>
> print variable1, variable2
>
> with this:
>
> from __future__ import print_function
> ...
> print(variable1, variable2)
>
> Note that I am deliberately not suggesting we switch the
> user visible examples on our documentation yet - that
> deserves some discussion first.
>
>
From the docs:  "When using the 2to3 source-to-source conversion tool, all
print statements are automatically converted to print() function calls, so
this is mostly a non-issue for larger projects."

http://docs.python.org/3.0/whatsnew/3.0.html#print-is-a-function

Which suggests either doing it with the tool or just waiting until the full
3.0 changeover?


From p.j.a.cock at googlemail.com  Thu Aug 29 17:35:16 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Thu, 29 Aug 2013 18:35:16 +0100
Subject: [Biopython-dev] Post Biopython 1.62 release,
	clean-up after dropping Python 2.5
In-Reply-To: <CAHQkFdfhJsRGGq=BPpZDh=rS8UtJcqA0ELSqHwjGvPqb5=GG=g@mail.gmail.com>
References: <CAKVJ-_4QkWKT33cutT5cpTXw7nyqs18KFj+=8gvT_S4854nUOg@mail.gmail.com>
	<CAKVJ-_7V3mpE-d+myKfHXzmJ26UFqz24OR6wnFOfrS55=FHJLg@mail.gmail.com>
	<CAKVJ-_6x9ztCVZNn-W+qpxkAWsy1Rdt_QCYwXAd-8-=nsWjSpA@mail.gmail.com>
	<CAHQkFdfhJsRGGq=BPpZDh=rS8UtJcqA0ELSqHwjGvPqb5=GG=g@mail.gmail.com>
Message-ID: <CAKVJ-_5XLYbe0Axooob=wr-ynCGRPL7dR7bpHjvd6n5CALAAow@mail.gmail.com>

On Thursday, August 29, 2013, Lenna Peterson wrote:

>
>
> On Thu, Aug 29, 2013 at 12:33 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
>
>>
>> Here's another fairly easy task for another new volunteer?:
>>
>> (8) Excluding doctests and the Tutorial, use print function
>> rather than print statement. e.g. replace this:
>>
>> print variable1, variable2
>>
>> with this:
>>
>> from __future__ import print_function
>> ...
>> print(variable1, variable2)
>>
>> Note that I am deliberately not suggesting we switch the
>> user visible examples on our documentation yet - that
>> deserves some discussion first.
>>
>>
> From the docs:  "When using the 2to3 source-to-source conversion tool, all
> print statements are automatically converted to print() function calls, so
> this is mostly a non-issue for larger projects."
>
> http://docs.python.org/3.0/whatsnew/3.0.html#print-is-a-function
>
> Which suggests either doing it with the tool or just waiting until the
> full 3.0 changeover?
>

My motivation is a step towards a single codebase for both
Python 2 and Python 3 without needing 2to3, see:

http://lists.open-bio.org/pipermail/biopython-dev/2013-May/010633.html
http://www.slideshare.net/pjacock/biopython-update-bosc2013/

Peter


From superbobry at gmail.com  Thu Aug 29 20:34:59 2013
From: superbobry at gmail.com (Sergei Lebedev)
Date: Fri, 30 Aug 2013 00:34:59 +0400
Subject: [Biopython-dev] Post Biopython 1.62 release,
 clean-up after dropping Python 2.5
In-Reply-To: <CAKVJ-_6x9ztCVZNn-W+qpxkAWsy1Rdt_QCYwXAd-8-=nsWjSpA@mail.gmail.com>
References: <CAKVJ-_4QkWKT33cutT5cpTXw7nyqs18KFj+=8gvT_S4854nUOg@mail.gmail.com>
	<CAKVJ-_7V3mpE-d+myKfHXzmJ26UFqz24OR6wnFOfrS55=FHJLg@mail.gmail.com>
	<CAKVJ-_6x9ztCVZNn-W+qpxkAWsy1Rdt_QCYwXAd-8-=nsWjSpA@mail.gmail.com>
Message-ID: <CAFQn-391vEPbtq-SUvBz6uv1jjQ1gWRZagTat2+kKB+nOko89g@mail.gmail.com>

On Thu, Aug 29, 2013 at 8:33 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:

> Here's another fairly easy task for another new volunteer?:
>
> (8) Excluding doctests and the Tutorial, use print function
> rather than print statement. e.g. replace this:
>
> print variable1, variable2
>
> with this:
>
> from __future__ import print_function
> ...
> print(variable1, variable2)
>
> Note that I am deliberately not suggesting we switch the
> user visible examples on our documentation yet - that
> deserves some discussion first.


So the task is to remove print statement from the code only, right? I think
I can do this, should I use a separate branch?

Sergei


From p.j.a.cock at googlemail.com  Thu Aug 29 20:44:49 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Thu, 29 Aug 2013 21:44:49 +0100
Subject: [Biopython-dev] Post Biopython 1.62 release,
 clean-up after dropping Python 2.5
In-Reply-To: <CAFQn-391vEPbtq-SUvBz6uv1jjQ1gWRZagTat2+kKB+nOko89g@mail.gmail.com>
References: <CAKVJ-_4QkWKT33cutT5cpTXw7nyqs18KFj+=8gvT_S4854nUOg@mail.gmail.com>
	<CAKVJ-_7V3mpE-d+myKfHXzmJ26UFqz24OR6wnFOfrS55=FHJLg@mail.gmail.com>
	<CAKVJ-_6x9ztCVZNn-W+qpxkAWsy1Rdt_QCYwXAd-8-=nsWjSpA@mail.gmail.com>
	<CAFQn-391vEPbtq-SUvBz6uv1jjQ1gWRZagTat2+kKB+nOko89g@mail.gmail.com>
Message-ID: <CAKVJ-_4P5SfZKHuO9=5CjBP_ya73-jS1OQ+kAJLX8D5XJj3CzA@mail.gmail.com>

On Thu, Aug 29, 2013 at 9:34 PM, Sergei Lebedev <superbobry at gmail.com> wrote:
> On Thu, Aug 29, 2013 at 8:33 PM, Peter Cock <p.j.a.cock at googlemail.com>
> wrote:
>>
>> Here's another fairly easy task for another new volunteer?:
>>
>> (8) Excluding doctests and the Tutorial, use print function
>> rather than print statement. e.g. replace this:
>>
>> print variable1, variable2
>>
>> with this:
>>
>> from __future__ import print_function
>> ...
>> print(variable1, variable2)
>>
>> Note that I am deliberately not suggesting we switch the
>> user visible examples on our documentation yet - that
>> deserves some discussion first.
>
>
> So the task is to remove print statement from the code only, right?

Replacing them with print functions, and testing this
worked OK under both Python 2 and Python 3, yes :)

> I think I can do this, should I use a separate branch?
>
> Sergei

Yes, I would certainly recommend keeping the
default 'master' branch as a copy of the official one,
and creating a new 'print-function' branch (or whatever
name you prefer) for this work.

We probably need to improve this wiki page - so any
comments about what is unclear would be great (on
a new email thread): http://biopython.org/wiki/GitUsage

Thanks,

Peter


From p.j.a.cock at googlemail.com  Fri Aug 30 10:49:23 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 30 Aug 2013 11:49:23 +0100
Subject: [Biopython-dev] Post Biopython 1.62 release,
	clean-up after dropping Python 2.5
In-Reply-To: <CAKVJ-_6x9ztCVZNn-W+qpxkAWsy1Rdt_QCYwXAd-8-=nsWjSpA@mail.gmail.com>
References: <CAKVJ-_4QkWKT33cutT5cpTXw7nyqs18KFj+=8gvT_S4854nUOg@mail.gmail.com>
	<CAKVJ-_7V3mpE-d+myKfHXzmJ26UFqz24OR6wnFOfrS55=FHJLg@mail.gmail.com>
	<CAKVJ-_6x9ztCVZNn-W+qpxkAWsy1Rdt_QCYwXAd-8-=nsWjSpA@mail.gmail.com>
Message-ID: <CAKVJ-_4HPwAp=6L9rfvV69bOuq-Z5vwy2m_CD2_ROiBguJDLHA@mail.gmail.com>

Hello Biopythoneers,

I've outlined another relatively simple improvement for potential
new contributors to try below....

On Thu, Aug 29, 2013 at 5:33 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
>> On Wed, Aug 28, 2013 at 7:53 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
>>> Hello all - especially newcomers,
>>>
>>> There are going to be several boring but useful things to do to
>>> the Biopython code base once we're finished with Python 2.5
>>> (the imminent release of Biopython 1.62 has been clearly
>>> described as the final Biopython release to support it).
>>>
>>> ...
>>>
>>> (4) Scan over the code base looking for any comments
>>> about Python 2.5 (e.g. using the grep command), and
>>> reviewing them one by one to see if there is an old
>>> workaround we can now remove.
>>
>> Lenna had a quick look, there should be some easy one here.
>>
>>> (5) More advanced code review, for example looking
>>> for places we can better take advantage of context
>>> managers (with statements) for file handles.
>>
>> Another new one, related to (5), and fairly easy:
>>
>> (6) Reviewing examples in the docstrings and Tutorial
>> where it would make sense to use a 'with' for file handles.
>>
>> This should also solve many of the ResourceWarning:
>> unclosed file ... warnings visible running the full test
>> suite under Python 3, e.g. see:
>> http://testing.open-bio.org/biopython/builders/Linux%2064%20-%20Python%203.3/builds/298/steps/shell/logs/stdio
>
> On Thu, Aug 29, 2013 at 11:30 AM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
>> ... I have come up with another easy task instead,
>>
>> (7) Update exception style

(7) was done by Chris Mitchell,
https://github.com/biopython/biopython/commit/1d42f4dc07c8203a162d635b9bca5acb90204942

> (8) Excluding doctests and the Tutorial, use print function
> rather than print statement. e.g. replace this:

(8) is being looked at by Sergei Lebedev.

----

Here's another idea, under the general issue (5) of taking
advantage of context managers (with statements), which
I would judge to be fairly easy (but not trivial).

(9) Use context managers (with statements) for temporary
warning filters in the unit tests.

Currently many of our unit tests add simple filters to ignore
a warning, and then restore the old filters using pop(). This
mostly works, but is fragile and the filter list is global so this
can have strange side effects. See:

$ grep "warnings." Tests/*.py

The idea here is to replace this:

warnings.simplefilter('ignore', PDBConstructionWarning)
#some code which may trigger the warning
warnings.filters.pop()

with this:

with warnings.catch_warnings():
    warnings.simplefilter("ignore", PDBConstructionWarning)
    #some code which may trigger the warning

Note the indentation - these changes will not give nice
clean diffs, so this will not be so easy to review.

I would therefore suggest editing just one test file at a
time (i.e. limit each commit to changing a single file), as
that makes it easier to selectively apply your changes.

Please make sure you test this on Python 2.6, which is most
likely to have problems with this "new" style ;)
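
Here is a self-contained sketch of the context manager pattern that
can be run on its own. DemoWarning and noisy_parse are hypothetical
stand-ins for PDBConstructionWarning and the code under test:

```python
import warnings


class DemoWarning(UserWarning):
    """Hypothetical stand-in for PDBConstructionWarning."""


def noisy_parse():
    """Hypothetical code under test that emits a warning."""
    warnings.warn("discontinuous chain", DemoWarning)
    return "structure"


filters_before = list(warnings.filters)

# Silence the warning only inside this block.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", DemoWarning)
    value = noisy_parse()

# The global filter list is back to what it was, with no pop() needed.
filters_restored = warnings.filters == filters_before

# record=True collects warnings instead, so a test can assert on them.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    noisy_parse()

print(value)
print(filters_restored)
print(len(caught))
```

The record=True form may be handy where a test should check that a
warning really was issued, rather than just hiding it.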

(Again, if anyone plans to work on this, please let the list
know to minimise duplicated effort.)

If you're not familiar with our test suite, there is a chapter
introducing this in the main Tutorial & Cookbook,
http://biopython.org/DIST/docs/tutorial/Tutorial.html

Thanks,

Peter


From superbobry at gmail.com  Fri Aug 30 12:58:31 2013
From: superbobry at gmail.com (Sergei Lebedev)
Date: Fri, 30 Aug 2013 16:58:31 +0400
Subject: [Biopython-dev] Post Biopython 1.62 release,
 clean-up after dropping Python 2.5
In-Reply-To: <CAKVJ-_4P5SfZKHuO9=5CjBP_ya73-jS1OQ+kAJLX8D5XJj3CzA@mail.gmail.com>
References: <CAKVJ-_4QkWKT33cutT5cpTXw7nyqs18KFj+=8gvT_S4854nUOg@mail.gmail.com>
	<CAKVJ-_7V3mpE-d+myKfHXzmJ26UFqz24OR6wnFOfrS55=FHJLg@mail.gmail.com>
	<CAKVJ-_6x9ztCVZNn-W+qpxkAWsy1Rdt_QCYwXAd-8-=nsWjSpA@mail.gmail.com>
	<CAFQn-391vEPbtq-SUvBz6uv1jjQ1gWRZagTat2+kKB+nOko89g@mail.gmail.com>
	<CAKVJ-_4P5SfZKHuO9=5CjBP_ya73-jS1OQ+kAJLX8D5XJj3CzA@mail.gmail.com>
Message-ID: <CAFQn-38TZHM6geBZkwmL5Rkg8tWEUxTV-Yvj_Dw_04UxLNy58A@mail.gmail.com>

> (8) Excluding doctests and the Tutorial, use print function
> rather than print statement. e.g. replace this:
Unfortunately we cannot exclude doctests, because 'from __future__' import
is module wide, thus the 'doctest.testmod()' will raise a SyntaxError on
docstrings with print statement.

Sergei


On Fri, Aug 30, 2013 at 12:44 AM, Peter Cock <p.j.a.cock at googlemail.com> wrote:

> On Thu, Aug 29, 2013 at 9:34 PM, Sergei Lebedev <superbobry at gmail.com>
> wrote:
> > On Thu, Aug 29, 2013 at 8:33 PM, Peter Cock <p.j.a.cock at googlemail.com>
> > wrote:
> >>
> >> Here's another fairly easy task for another new volunteer?:
> >>
> >> (8) Excluding doctests and the Tutorial, use print function
> >> rather than print statement. e.g. replace this:
> >>
> >> print variable1, variable2
> >>
> >> with this:
> >>
> >> from __future__ import print_function
> >> ...
> >> print(variable1, variable2)
> >>
> >> Note that I am deliberately not suggesting we switch the
> >> user visible examples on our documentation yet - that
> >> deserves some discussion first.
> >
> >
> > So the task is to remove print statement from the code only, right?
>
> Replacing them with print functions, and testing this
> worked OK under both Python 2 and Python 3, yes :)
>
> > I think I can do this, should I use a separate branch?
> >
> > Sergei
>
> Yes, I would certainly recommend keeping the
> default 'master' branch as a copy of the official one,
> and creating a new 'print-function' branch (or whatever
> name you prefer) for this work.
>
> We probably need to improve this wiki page - so any
> comments about what is unclear would be great (on
> a new email thread): http://biopython.org/wiki/GitUsage
>
> Thanks,
>
> Peter
>


From p.j.a.cock at googlemail.com  Fri Aug 30 13:14:14 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 30 Aug 2013 14:14:14 +0100
Subject: [Biopython-dev] Post Biopython 1.62 release,
 clean-up after dropping Python 2.5
In-Reply-To: <CAFQn-38TZHM6geBZkwmL5Rkg8tWEUxTV-Yvj_Dw_04UxLNy58A@mail.gmail.com>
References: <CAKVJ-_4QkWKT33cutT5cpTXw7nyqs18KFj+=8gvT_S4854nUOg@mail.gmail.com>
	<CAKVJ-_7V3mpE-d+myKfHXzmJ26UFqz24OR6wnFOfrS55=FHJLg@mail.gmail.com>
	<CAKVJ-_6x9ztCVZNn-W+qpxkAWsy1Rdt_QCYwXAd-8-=nsWjSpA@mail.gmail.com>
	<CAFQn-391vEPbtq-SUvBz6uv1jjQ1gWRZagTat2+kKB+nOko89g@mail.gmail.com>
	<CAKVJ-_4P5SfZKHuO9=5CjBP_ya73-jS1OQ+kAJLX8D5XJj3CzA@mail.gmail.com>
	<CAFQn-38TZHM6geBZkwmL5Rkg8tWEUxTV-Yvj_Dw_04UxLNy58A@mail.gmail.com>
Message-ID: <CAKVJ-_7iKPcb2W3NWPaW-9cK2nPC8kciQE_n9FtsQ4emxeXQmw@mail.gmail.com>

On Fri, Aug 30, 2013 at 1:58 PM, Sergei Lebedev <superbobry at gmail.com> wrote:
>> (8) Excluding doctests and the Tutorial, use print function
>> rather than print statement. e.g. replace this:
>
> Unfortunately we cannot exclude doctests, because 'from __future__' import
> is module wide, thus the 'doctest.testmod()' will raise a SyntaxError on
> docstrings with print statement.
>
> Sergei

Could you clarify this? Does this cause a problem via:

[Tests]$ python run_tests.py doctest

If you have a small example, copy & paste the "git diff" output here.

Peter


From superbobry at gmail.com  Fri Aug 30 13:28:50 2013
From: superbobry at gmail.com (Sergei Lebedev)
Date: Fri, 30 Aug 2013 17:28:50 +0400
Subject: [Biopython-dev] Re: Post Biopython 1.62 release,
 clean-up after dropping Python 2.5
In-Reply-To: <CAKVJ-_7iKPcb2W3NWPaW-9cK2nPC8kciQE_n9FtsQ4emxeXQmw@mail.gmail.com>
References: <CAKVJ-_4QkWKT33cutT5cpTXw7nyqs18KFj+=8gvT_S4854nUOg@mail.gmail.com>
	<CAKVJ-_7V3mpE-d+myKfHXzmJ26UFqz24OR6wnFOfrS55=FHJLg@mail.gmail.com>
	<CAKVJ-_6x9ztCVZNn-W+qpxkAWsy1Rdt_QCYwXAd-8-=nsWjSpA@mail.gmail.com>
	<CAFQn-391vEPbtq-SUvBz6uv1jjQ1gWRZagTat2+kKB+nOko89g@mail.gmail.com>
	<CAKVJ-_4P5SfZKHuO9=5CjBP_ya73-jS1OQ+kAJLX8D5XJj3CzA@mail.gmail.com>
	<CAFQn-38TZHM6geBZkwmL5Rkg8tWEUxTV-Yvj_Dw_04UxLNy58A@mail.gmail.com>
	<CAKVJ-_7iKPcb2W3NWPaW-9cK2nPC8kciQE_n9FtsQ4emxeXQmw@mail.gmail.com>
Message-ID: <etPan.52209e12.6b8b4567.5796@yooki.labs.intellij.net>

Sure, a common pattern for a lot of BioPython modules seems to be:

    # +from __future__ import print_function


    def foo():
        """A docstring with print statement.

        >>> print "foo"
        foo
        """
        print "Running foo ..."
        # +print("Running foo ...")


    if __name__ == "__main__":
        import doctest
        doctest.testmod()

where foo is some function which uses a print statement in its body. Since
we want to switch from print statements to the print function, we replace
print "Running foo ..." with a print() call and add from __future__ import ...
to the beginning of the module.

What happens if we try to run the doctests after we've switched to
print_function?

    $ python /tmp/foo.py
    **********************************************************************
    File "/tmp/foo.py", line 7, in __main__.foo
    Failed example:
        print "foo"
    Exception raised:
        Traceback (most recent call last):
          File ".../doctest.py", line 1254, in __run
            compileflags, 1) in test.globs
          File "<doctest __main__.foo[0]>", line 1
            print "foo"
                      ^
        SyntaxError: invalid syntax
    **********************************************************************
    1 items had failures:
       1 of   1 in __main__.foo
    ***Test Failed*** 1 failures.

So, enabling print_function makes doctests using the print statement fail
with a SyntaxError, as shown by the example above. Thus, if we want to get
rid of the print statement in the code, we have no other choice but to do
the same in the doctests.

Sergei


On August 30, 2013 at 5:14:14 PM, Peter Cock (p.j.a.cock at googlemail.com) wrote:

On Fri, Aug 30, 2013 at 1:58 PM, Sergei Lebedev <superbobry at gmail.com> wrote:  
>> (8) Excluding doctests and the Tutorial, use print function  
>> rather than print statement. e.g. replace this:  
>  
> Unfortunately we cannot exclude doctests, because 'from __future__' import  
> is module wide, thus the 'doctest.testmod()' will raise a SyntaxError on  
> docstrings with print statement.  
>  
> Sergei  

Could you clarify this? Does this cause a problem via:  

[Tests]$ python run_tests.py doctest  

If you have a small example, copy & paste the "git diff" output here.  

Peter

From p.j.a.cock at googlemail.com  Fri Aug 30 14:22:26 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 30 Aug 2013 15:22:26 +0100
Subject: [Biopython-dev] Post Biopython 1.62 release,
 clean-up after dropping Python 2.5
In-Reply-To: <etPan.52209e12.6b8b4567.5796@yooki.labs.intellij.net>
References: <CAKVJ-_4QkWKT33cutT5cpTXw7nyqs18KFj+=8gvT_S4854nUOg@mail.gmail.com>
	<CAKVJ-_7V3mpE-d+myKfHXzmJ26UFqz24OR6wnFOfrS55=FHJLg@mail.gmail.com>
	<CAKVJ-_6x9ztCVZNn-W+qpxkAWsy1Rdt_QCYwXAd-8-=nsWjSpA@mail.gmail.com>
	<CAFQn-391vEPbtq-SUvBz6uv1jjQ1gWRZagTat2+kKB+nOko89g@mail.gmail.com>
	<CAKVJ-_4P5SfZKHuO9=5CjBP_ya73-jS1OQ+kAJLX8D5XJj3CzA@mail.gmail.com>
	<CAFQn-38TZHM6geBZkwmL5Rkg8tWEUxTV-Yvj_Dw_04UxLNy58A@mail.gmail.com>
	<CAKVJ-_7iKPcb2W3NWPaW-9cK2nPC8kciQE_n9FtsQ4emxeXQmw@mail.gmail.com>
	<etPan.52209e12.6b8b4567.5796@yooki.labs.intellij.net>
Message-ID: <CAKVJ-_5+0GeN5p5d9tvf+BxzHC+6DUnamwNJ6_N61STWa0TLFQ@mail.gmail.com>

Thanks Sergei - that clarified things.

Unfortunately this doesn't just break our convenience __main__ trick for
running the doctests in any single module, it also breaks doing it via:

$ python run_tests.py doctest

This means we'd have to update the doctests to also use Python 3
style print functions... which may be premature (we'll need to do
this at some point though).
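
For reference, a sketch of what an updated doctest could look like,
reusing Sergei's foo example. Running the docstring through
DocTestParser and DocTestRunner directly is just a way of keeping
this snippet self-contained; it is not how run_tests.py does it.

```python
import doctest


def foo():
    """A docstring using the print function.

    >>> print("foo")
    foo
    """
    print("Running foo ...")


# Run only foo's docstring examples, rather than the whole module.
parser = doctest.DocTestParser()
test = parser.get_doctest(foo.__doc__, globs={}, name="foo",
                          filename=None, lineno=0)
results = doctest.DocTestRunner(verbose=False).run(test)
print(results)
```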

How about the less ambitious plan of replacing lines like this:

print variable

with:

print(variable)

This will be understood as a print function call on Python 3 (and work),
and will also work on Python 2 (without the future import) where it will
be parsed as redundant parentheses.

Note you can't use this trick where more than one variable is printed,
because then on Python 2 the brackets will create a tuple instead.
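
A small sketch of the problem and one possible workaround (the
variable names are only for illustration). Building a single string
first keeps the call to one argument, so it prints the same thing
on both:

```python
variable1, variable2 = "spam", 42

# print(variable1, variable2) would show "spam 42" on Python 3, but
# "('spam', 42)" on Python 2 without the future import.
message = "%s %s" % (variable1, variable2)
print(message)
```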

Peter


On Fri, Aug 30, 2013 at 2:28 PM, Sergei Lebedev <superbobry at gmail.com> wrote:
> Sure, a common pattern for a lot of BioPython modules seems to be:
>
>     # +from __future__ import print_function
>
>
>     def foo():
>         """A docstring with print statement.
>
>         >>> print "foo"
>         foo
>         """
>         print "Running foo ..."
>         # +print("Running foo ...")
>
>
>     if __name__ == "__main__":
>         import doctest
>         doctest.testmod()
>
> where foo is some function, which uses print statement in its body. Since we
> want to switch from print statements to print function we replace print
> "Running foo ..." with a print() call and add from __future__ import ... to
> the beginning of the module.
>
> What happens if we try to run the doctests after we've switched to
> print_function?
>
>     $ python /tmp/foo.py
>     **********************************************************************
>     File "/tmp/foo.py", line 7, in __main__.foo
>     Failed example:
>         print "foo"
>     Exception raised:
>         Traceback (most recent call last):
>           File ".../doctest.py", line 1254, in __run
>             compileflags, 1) in test.globs
>           File "<doctest __main__.foo[0]>", line 1
>             print "foo"
>                       ^
>         SyntaxError: invalid syntax
>     **********************************************************************
>     1 items had failures:
>        1 of   1 in __main__.foo
>     ***Test Failed*** 1 failures.
>
> So, enabling print_function makes doctests using print statement fail with a
> SyntaxError, as shown by the example above. Thus, if we want to get rid of
> print statement in the code we have no other choice but to do the same it in
> the doctests.
>
> Sergei
>
>
>
> On August 30, 2013 at 5:14:14 PM, Peter Cock (p.j.a.cock at googlemail.com)
> wrote:
>
> On Fri, Aug 30, 2013 at 1:58 PM, Sergei Lebedev <superbobry at gmail.com>
> wrote:
>>> (8) Excluding doctests and the Tutorial, use print function
>>> rather than print statement. e.g. replace this:
>>
>> Unfortunately we cannot exclude doctests, because 'from __future__' import
>> is module wide, thus the 'doctest.testmod()' will raise a SyntaxError on
>> docstrings with print statement.
>>
>> Sergei
>
> Could you clarify this? Does this cause a problem via:
>
> [Tests]$ python run_tests.py doctest
>
> If you have a small example, copy & paste the "git diff" output here.
>
> Peter


From p.j.a.cock at googlemail.com  Fri Aug 30 15:46:59 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 30 Aug 2013 16:46:59 +0100
Subject: [Biopython-dev] Fwd: [biopython] Potential error in mass
	calculations for RNA/DNA? (#229)
In-Reply-To: <biopython/biopython/issues/229@github.com>
References: <biopython/biopython/issues/229@github.com>
Message-ID: <CAKVJ-_7EN2VR3qpinNUhDqOs=CD+0EFz40RSzjztS5fjH7gZYw@mail.gmail.com>

Who are our sequence mass experts?
https://github.com/biopython/biopython/issues/229

---------- Forwarded message ----------
From: nruggero <notifications at github.com>
Date: Thu, Aug 29, 2013 at 11:03 PM
Subject: [biopython] Potential error in mass calculations for RNA/DNA?
(#229)
To: biopython/biopython <biopython at noreply.github.com>


In Bio/Data/IUPACData.py the molecular weights of unambiguous DNA are
listed as:

unambiguous_dna_weights = {
    "A": 347.,
    "C": 323.,
    "G": 363.,
    "T": 322.,
    }

As far as I can tell these are the molecular weights for the non-deoxy
bases instead of the deoxy bases. For example, AMP (347.22) instead of dAMP
(331.22) is listed.

I've looked at the original BioPerl code that these numbers were taken
from and I think they were just copied incorrectly. I have also looked at
the code which uses this dict in Bio/SeqUtils/__init__.py called
molecular_weight() and it just takes the sum of these values over the
sequence (no correction made).
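
As a rough illustration of what I mean (a sketch only, not the actual
Bio.SeqUtils code; naive_weight is a made-up name), the calculation
as described amounts to:

```python
# Values as currently listed in Bio/Data/IUPACData.py.
unambiguous_dna_weights = {"A": 347.0, "C": 323.0, "G": 363.0, "T": 322.0}


def naive_weight(seq):
    """Hypothetical helper: plain sum of per-letter weights."""
    return sum(unambiguous_dna_weights[base] for base in seq)


# AMP (347.22) versus dAMP (331.22): the gap is about one oxygen atom.
oxygen_gap = round(347.22 - 331.22, 2)

print(naive_weight("ACGT"))
print(oxygen_gap)
```

If the table really holds the ribonucleotide values, every letter in a
DNA sequence carries an offset of roughly this size into the total.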

So, is this an error or am I missing something basic?
Thanks



From superbobry at gmail.com  Fri Aug 30 22:53:53 2013
From: superbobry at gmail.com (Sergei Lebedev)
Date: Sat, 31 Aug 2013 02:53:53 +0400
Subject: [Biopython-dev] Post Biopython 1.62 release,
 clean-up after dropping Python 2.5
In-Reply-To: <CAKVJ-_4HPwAp=6L9rfvV69bOuq-Z5vwy2m_CD2_ROiBguJDLHA@mail.gmail.com>
References: <CAKVJ-_4QkWKT33cutT5cpTXw7nyqs18KFj+=8gvT_S4854nUOg@mail.gmail.com>
	<CAKVJ-_7V3mpE-d+myKfHXzmJ26UFqz24OR6wnFOfrS55=FHJLg@mail.gmail.com>
	<CAKVJ-_6x9ztCVZNn-W+qpxkAWsy1Rdt_QCYwXAd-8-=nsWjSpA@mail.gmail.com>
	<CAKVJ-_4HPwAp=6L9rfvV69bOuq-Z5vwy2m_CD2_ROiBguJDLHA@mail.gmail.com>
Message-ID: <CAFQn-3_92W5NFht51YapuhrB64hVhQOvqCXfuh6JoR7NO2ragQ@mail.gmail.com>

Peter, I've just submitted a PR [*] for (8), along with a 2to3 fixer which
does the whole job, so I think I can take (9).

Sergei

[*] https://github.com/biopython/biopython/pull/230


On Fri, Aug 30, 2013 at 2:49 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:

> Hello Biopythoneers,
>
> I've outlined another relatively simple improvement for potential
> new contributors to try below....
>
> On Thu, Aug 29, 2013 at 5:33 PM, Peter Cock <p.j.a.cock at googlemail.com>
> wrote:
> >> On Wed, Aug 28, 2013 at 7:53 PM, Peter Cock <p.j.a.cock at googlemail.com>
> wrote:
> >>> Hello all - especially newcomers,
> >>>
> >>> There are going to be several boring but useful things to do to
> >>> the Biopython code base once we're finished with Python 2.5
> >>> (the imminent release of Biopython 1.62 has been clearly
> >>> described as the final Biopython release to support it).
> >>>
> >>> ...
> >>>
> >>> (4) Scan over the code base looking for any comments
> >>> about Python 2.5 (e.g. using the grep command), and
> >>> reviewing them one by one to see if there is an old
> >>> workaround we can now remove.
> >>
> >> Lenna had a quick look, there should be some easy one here.
> >>
> >>> (5) More advanced code review, for example looking
> >>> for places we can better take advantage of context
> >>> managers (with statements) for file handles.
> >>
> >> Another new one, related to (5), and fairly easy:
> >>
> >> (6) Reviewing examples in the docstrings and Tutorial
> >> where it would make sense to use a 'with' for file handles.
> >>
> >> This should also solve many of the ResourceWarning:
> >> unclosed file ... warnings visible running the full test
> >> suite under Python 3, e.g. see:
> >>
> http://testing.open-bio.org/biopython/builders/Linux%2064%20-%20Python%203.3/builds/298/steps/shell/logs/stdio
> >
> > On Thu, Aug 29, 2013 at 11:30 AM, Peter Cock <p.j.a.cock at googlemail.com>
> wrote:
> >> ... I have come up with another easy task instead,
> >>
> >> (7) Update exception style
>
> (7) was done by Chris Mitchell,
>
> https://github.com/biopython/biopython/commit/1d42f4dc07c8203a162d635b9bca5acb90204942
>
> > (8) Excluding doctests and the Tutorial, use print function
> > rather than print statement. e.g. replace this:
>
> (8) is being looked at by Sergei Lebedev.
>
> ----
>
> Here's another idea, under the general issue (5) of taking
> advantage of context managers (with statements), which
> I would judge to be fairly easy (but not trivial).
>
> (9) Use context managers (with statements) for temporary
> warning filters in the unit tests.
>
> Currently many of our unit tests add simple filters to ignore
> a warning, and then restore the old filters using pop(). This
> mostly works, but is fragile and the filter list is global so this
> can have strange side effects. See:
>
> $ grep "warnings." Tests/*.py
>
> The idea here is to replace this:
>
> warnings.simplefilter('ignore', PDBConstructionWarning)
> #some code which may trigger the warning
> warnings.filters.pop()
>
> with this:
>
> with warnings.catch_warnings():
>     warnings.simplefilter("ignore", PDBConstructionWarning)
>     #some code which may trigger the warning
>
> Note the indentation - these changes will not give nice
> clean diffs, so this will not be so easy to review.
>
> I would therefore suggest editing just one test file at a
> time (i.e. limit each commit to changing a single file), as
> that makes it easier to selectively apply your changes
>
> Please make sure you test this Python 2.6 which is most
> likely to have problems with this "new" style ;)
>
> (Again, if anyone plans to work on this, please let the list
> know to minimised duplicated effort.)
>
> If you're not familiar with our test suite, there is a chapter
> introducing this in the main Tutorial & Cookbook,
> http://biopython.org/DIST/docs/tutorial/Tutorial.html
>
> Thanks,
>
> Peter
>


From p.j.a.cock at googlemail.com  Sat Aug 31 09:31:53 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Sat, 31 Aug 2013 10:31:53 +0100
Subject: [Biopython-dev] Post Biopython 1.62 release,
 clean-up after dropping Python 2.5
In-Reply-To: <CAFQn-3_92W5NFht51YapuhrB64hVhQOvqCXfuh6JoR7NO2ragQ@mail.gmail.com>
References: <CAKVJ-_4QkWKT33cutT5cpTXw7nyqs18KFj+=8gvT_S4854nUOg@mail.gmail.com>
	<CAKVJ-_7V3mpE-d+myKfHXzmJ26UFqz24OR6wnFOfrS55=FHJLg@mail.gmail.com>
	<CAKVJ-_6x9ztCVZNn-W+qpxkAWsy1Rdt_QCYwXAd-8-=nsWjSpA@mail.gmail.com>
	<CAKVJ-_4HPwAp=6L9rfvV69bOuq-Z5vwy2m_CD2_ROiBguJDLHA@mail.gmail.com>
	<CAFQn-3_92W5NFht51YapuhrB64hVhQOvqCXfuh6JoR7NO2ragQ@mail.gmail.com>
Message-ID: <CAKVJ-_7nPcfApCLgMGjb8-jToJDcDnkk8ve7LiiNgy5w0prxrQ@mail.gmail.com>

On Fri, Aug 30, 2013 at 11:53 PM, Sergei Lebedev <superbobry at gmail.com> wrote:
> Peter, I've just submitted a PR [*] for #8 along with a 2to3 fixer which
> does all the job, so I think I can take #9.
>
> Sergei
>
> [*] https://github.com/biopython/biopython/pull/230

Print-function-like syntax committed for (8), thank you.
We'll need to come back to this later as there are still
lots of print statements left in the codebase... time for
a more general discussion about what people would
prefer to see in the user-facing documentation.

If you'd like to try some context managers for the
warnings in the unit tests (9), that would be great.

Note some of the tests will require you to install a
command line tool - it should be clear, but if we
need to add more documentation (e.g. URLs) please
let us know.

Thanks,

Peter