From yeyanbo289 at gmail.com Mon Jul 1 05:29:34 2013 From: yeyanbo289 at gmail.com (Yanbo Ye) Date: Mon, 1 Jul 2013 17:29:34 +0800 Subject: [Biopython-dev] gsoc weekly update Message-ID: Hi all, I've posted an update for the project 'Phylogenetics in Biopython: Filling in the gaps'. http://blog.yeyanbo.com/posts/google-summer-of-code-4.html Best, Yanbo -- Yanbo Ye Bioinformatics Group, Wuhan Institute Of Virology, Chinese Academy of Sciences From mtholder at gmail.com Mon Jul 1 11:15:47 2013 From: mtholder at gmail.com (Mark Holder) Date: Mon, 1 Jul 2013 10:15:47 -0500 Subject: [Biopython-dev] gsoc weekly update In-Reply-To: References: Message-ID: Hi Yanbo, It looks like you are making nice progress. 1. A comment on tests: I noticed that the upgma and nj tests (from last week) just verify that the trees produced are of the right class and can be written as newick. It is probably worth strengthening those tests to make them check that the branch lengths are correct. 2. A thought on character weighting: You might think about adding support for tree construction from a "compressed" input character matrix. By compressed, I mean one in which you store unique data patterns (unique columns in an alignment) and a pattern weight for that column rather than storing every character separately. The pattern weight is typically the number of times that the pattern was observed in the original ("raw") character matrix (but it is nice to support floats as weights). Richer implementations of a compressed matrix also store the complete mapping of data patterns to original character indices to enable the recreation of the original matrix, but that feature is rarely used in tree inference. I don't know if biopython has this form of data compression implemented, but it is very widely used in phylogenetic inference. It can be used in any inference technique that treats characters as independent and identically distributed. 
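A minimal sketch of the compression described above, in plain Python. The function names and the list-of-strings input format are hypothetical choices for illustration only; nothing here is an existing Biopython API. It collapses identical alignment columns into unique patterns with float weights, then shows how a weighted pairwise mismatch count could use them.

```python
from collections import OrderedDict


def compress_columns(rows):
    """Collapse identical alignment columns into patterns and weights.

    rows: list of equal-length strings, one per taxon (hypothetical
    input format for this sketch, not a Biopython object).
    Returns (patterns, weights); weights are floats so that
    non-integer weights could be supported later.
    """
    if not rows:
        return [], []
    length = len(rows[0])
    if any(len(row) != length for row in rows):
        raise ValueError("all rows must have the same length")
    counts = OrderedDict()
    for column in zip(*rows):
        pattern = "".join(column)
        counts[pattern] = counts.get(pattern, 0.0) + 1.0
    return list(counts.keys()), list(counts.values())


def weighted_mismatches(patterns, weights, i, j):
    """Sum the weights of patterns in which taxa i and j differ."""
    return sum(w for p, w in zip(patterns, weights) if p[i] != p[j])


# Three taxa, five sites; three of the five columns are identical.
example_rows = ["AACGA", "AACTA", "AATGA"]
example_patterns, example_weights = compress_columns(example_rows)
```

Because each unique column is visited once and scaled by its weight, any method that treats sites as independent and identically distributed can loop over patterns instead of raw sites.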
If biopython does not support this form of compression, then it may be worth writing the TreeConstruction code to work with character weights in the event that someone else implements this feature. Or you could at least add a # TODO comment in the code wherever character weighting would be used (so that it would be easy to fix later). all the best, Mark PS: I've been travelling, but I'm back in Lawrence now. I'm happy to chat with you this week about parsimony algorithms if you have questions. On Mon, Jul 1, 2013 at 4:29 AM, Yanbo Ye wrote: > Hi all, > > I post an update for the project 'Phylogenetics in Biopython: Filling in the > gaps'. > http://blog.yeyanbo.com/posts/google-summer-of-code-4.html > > Best, > > Yanbo > > -- > > Yanbo Ye > > Bioinformatics Group, Wuhan Institute Of Virology, Chinese Academy of > Sciences -- Mark Holder mtholder at gmail.com mtholder at ku.edu http://phylo.bio.ku.edu/mark-holder ============================================== Department of Ecology and Evolutionary Biology University of Kansas 6031 Haworth Hall 1200 Sunnyside Avenue Lawrence, Kansas 66045 lab phone: 785.864.5789 fax (shared): 785.864.5860 ============================================== From redmine at redmine.open-bio.org Tue Jul 2 18:14:21 2013 From: redmine at redmine.open-bio.org (redmine at redmine.open-bio.org) Date: Tue, 2 Jul 2013 22:14:21 +0000 Subject: [Biopython-dev] [Biopython - Bug #3419] Bio.SearchIO.FastaIO References: Message-ID: Issue #3419 has been updated by Wibowo Arindrarto. Hi Jason, Apologies for the very late reply. Apparently the notification of your reply didn't get to my inbox and I forgot to check the page manually :(. Fortunately I met Peter and he pointed this out :). IIRC, the parser does store the program name that created the results (the QueryResult.program attribute). And we can deal with strand/frame accordingly. 
There is, however, not a standard way to store strand information of 'parent' sequence (in this case the DNA that was the template of the protein). I'll poke around to see if this has been dealt with. Anyway, your patch does look OK for the time being. The BLASTX parser handles this information the same way too (storing read frame in the protein Seq object). Would you like to submit it through Github? I'd be happy to commit on your behalf as well :). ---------------------------------------- Bug #3419: Bio.SearchIO.FastaIO https://redmine.open-bio.org/issues/3419 Author: Jason Stajich Status: New Priority: Low Assignee: Biopython Dev Mailing List Category: Main Distribution Target version: URL: The strand of the translated sequence (query or subject depending on the analysis) is lost for tfastxy and fastx/y reports.

    from Bio import SearchIO
    qresults = SearchIO.parse('test.FASTY.out','fasta-m10')
    for qresult in qresults:
        for hit in qresult:
            for hsp in hit.hsps:
                print qresult.id, " ", hit.id, " ", \
                    hsp.query_start, "..",hsp.query_end, " ", hsp.query_strand, " ", \
                    hsp.hit_start, "..",hsp.hit_end, " ", hsp.hit_strand

-- You have received this notification because you have either subscribed to it, or are involved in it. To change your notification preferences, please click here and login: http://redmine.open-bio.org From w.arindrarto at gmail.com Tue Jul 2 18:32:32 2013 From: w.arindrarto at gmail.com (Wibowo Arindrarto) Date: Wed, 3 Jul 2013 00:32:32 +0200 Subject: [Biopython-dev] Storing reading frame / strand of nucleic acid sequences used for creating protein sequences Message-ID: Hi everyone, I'm wondering if we have a standard way of storing the reading frames of DNA/RNA sequences used for creating protein sequences? In some cases, keeping track of the original reading frame may be desirable. (e.g. in TBLASTX alignments, where users want to know the reading frames of both the query and the hit sequences). 
I realize that it is possible to store strand information in SeqFeature objects. However, I am afraid that storing the strand / reading frame information for SeqFeatures of Seq objects with ProteinAlphabets may seem misleading as the strand information belongs to the DNA / RNA sequence that was used as the protein template, not the protein itself. On a related note, I'm also wondering if for the Seq object's translate method, there should be an argument to specify which reading frame we want to use to translate the sequence? This can be trivially solved using Python's convenient indexing and slicing system. However, indexing and/or slicing does not allow us to keep track of the original DNA/RNA reading frame (at least not the way it is implemented now). Let me know what you think :). Best regards, Bow From zruan1991 at gmail.com Wed Jul 3 09:35:08 2013 From: zruan1991 at gmail.com (Zheng Ruan) Date: Wed, 3 Jul 2013 21:35:08 +0800 Subject: [Biopython-dev] Storing reading frame / strand of nucleic acid sequences used for creating protein sequences In-Reply-To: References: Message-ID: Hi Bow, I'm quite interested in setting up a standard way to deal with reading frames, as this is the job I would like to implement in the Codon Alignment project this summer. Getting reading frame information from TBLASTX results seems to be a good idea and I may implement a function to deal with it. As to how to store this information, I favor the CodonSeq class I've written, since frameshifts occur at the DNA/RNA level. I have a CodonAlphabet to check the codon sequence in CodonSeq. Maybe an enhancement of the Alphabet and CodonSeq would better take frameshifts into account. If you are seeking a standard way to represent frameshifts at the protein level, you may want to read the methods section of the pal2nal paper. For example, M2P indicates that there is a 1 nt deletion between methionine and proline. But this nonetheless violates the ProteinAlphabet we've defined in Biopython. 
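The frame bookkeeping discussed in this thread (slicing to pick a frame while still remembering which frame was used) can be sketched in plain Python. This is only an illustration under stated assumptions: `translate_frame` is a hypothetical helper, not part of Biopython's Seq API, it uses only the standard genetic code, and it returns the frame alongside the protein rather than storing it on any sequence object.

```python
# Standard genetic code (NCBI table 1) in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def translate_frame(dna, frame):
    """Translate dna in frame +1..+3 or -1..-3.

    Returns (protein, frame) so the caller keeps the frame that
    produced the protein. Codons with unknown bases become "X".
    """
    if frame not in (1, 2, 3, -1, -2, -3):
        raise ValueError("frame must be one of +/-1, +/-2, +/-3")
    seq = dna.upper()
    if frame < 0:
        # Negative frames read the reverse complement.
        seq = "".join(COMPLEMENT.get(b, "N") for b in reversed(seq))
    offset = abs(frame) - 1
    # Only complete codons are translated; trailing bases are dropped.
    codons = (seq[i:i + 3] for i in range(offset, len(seq) - 2, 3))
    protein = "".join(CODON_TABLE.get(codon, "X") for codon in codons)
    return protein, frame


forward = translate_frame("ATGGCCTAA", 1)
shifted = translate_frame("ATGGCCTAA", 2)
reverse = translate_frame("ATGGCCTAA", -1)
```

Returning the frame next to the protein sidesteps the question of where to store it; a real design would still need to decide whether that information lives on the sequence, an annotation, or the alignment.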
I just arrived in Shanghai and am about to start writing code for this week. I am interested in hearing your suggestions. Thanks! Ruan On Wed, Jul 3, 2013 at 6:32 AM, Wibowo Arindrarto wrote: > Hi everyone, > > I'm wondering if we have a standard way of storing the reading frames > of DNA/RNA sequences used for creating protein sequences? > > In some cases, keeping track of the original reading frame may be > desirable. (e.g. in TBLASTX alignments, where users want to know the > reading frames of both the query and the hit sequences). > > I realize that it is possible to store strand information in > SeqFeature objects. However, I am afraid that storing the strand / > reading frame information for SeqFeatures of Seq objects with > ProteinAlphabets may seem misleading as the strand information belongs > to the DNA / RNA sequence that was used as the protein template, not > the protein itself. > > On a related note, I'm also wondering if for the Seq object's > translate method, there should be an argument to specify which reading > frame we want to use to translate the sequence? > > This can be trivially solved using Python's convenient indexing and > slicing system. However, indexing and/or slicing does not allow us to > keep track of the original DNA/RNA reading frame (at least not the way > it is implemented now). > > Let me know what you think :). 
> > Best regards, > Bow > _______________________________________________ > Biopython-dev mailing list > Biopython-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython-dev > From lgautier at gmail.com Mon Jul 8 10:38:36 2013 From: lgautier at gmail.com (Laurent Gautier) Date: Mon, 08 Jul 2013 16:38:36 +0200 Subject: [Biopython-dev] biopython, pip, and Python 3.3 + virtualenv Message-ID: <51DACEEC.6080305@gmail.com> Hi, I just tried installing biopython with pip (v-1.3.1) and Python 3.3 (under a virtual environment created with pyvenv), but (after a a fairly long time) the process is failing with the last 3 lines printed on the terminal: ``` Python 2to3 processing done. running egg_info error: error in 'egg_base' option: 'pip-egg-info' does not exist or is not a directory ``` Am I missing something ? Best, Laurent From p.j.a.cock at googlemail.com Mon Jul 8 11:00:24 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Mon, 8 Jul 2013 16:00:24 +0100 Subject: [Biopython-dev] biopython, pip, and Python 3.3 + virtualenv In-Reply-To: <51DACEEC.6080305@gmail.com> References: <51DACEEC.6080305@gmail.com> Message-ID: On Mon, Jul 8, 2013 at 3:38 PM, Laurent Gautier wrote: > Hi, > > I just tried installing biopython with pip (v-1.3.1) and Python 3.3 (under a > virtual environment created with pyvenv), > but (after a a fairly long time) the process is failing with the last 3 > lines printed on the terminal: > > ``` > > Python 2to3 processing done. > > running egg_info > > error: error in 'egg_base' option: 'pip-egg-info' does not exist or is not a > directory > > ``` > > Am I missing something ? > > Best, > > > Laurent Hi Laurent, The 2to3 conversion does take a while (sigh), but I have a plan for that: http://lists.open-bio.org/pipermail/biopython-dev/2013-May/010633.html Which version of Biopython, and if from git, which commit? In particular was it before or after this change from Brad which I recently committed? 
https://github.com/biopython/biopython/commit/bc828342716218b84908c1a4435163e564f31445 Thanks, Peter From lgautier at gmail.com Mon Jul 8 12:18:30 2013 From: lgautier at gmail.com (Laurent Gautier) Date: Mon, 08 Jul 2013 18:18:30 +0200 Subject: [Biopython-dev] biopython, pip, and Python 3.3 + virtualenv In-Reply-To: References: <51DACEEC.6080305@gmail.com> Message-ID: <51DAE656.2020609@gmail.com> On 07/08/2013 05:00 PM, Peter Cock wrote: > On Mon, Jul 8, 2013 at 3:38 PM, Laurent Gautier wrote: >> Hi, >> >> I just tried installing biopython with pip (v-1.3.1) and Python 3.3 (under a >> virtual environment created with pyvenv), >> but (after a fairly long time) the process is failing with the last 3 >> lines printed on the terminal: >> >> ``` >> >> Python 2to3 processing done. >> >> running egg_info >> >> error: error in 'egg_base' option: 'pip-egg-info' does not exist or is not a >> directory >> >> ``` >> >> Am I missing something ? >> >> Best, >> >> >> Laurent > Hi Laurent, > > The 2to3 conversion does take a while (sigh), but I have a plan for that: > http://lists.open-bio.org/pipermail/biopython-dev/2013-May/010633.html For what it is worth, I have started following these 2 rules for all Python projects (and it is working fairly well): 1- Develop for Python 3.3 (not worrying about anything earlier in the 3.x series) 2- When considering support for Python 2, consider 2.7. Forget about anything earlier The rationale is: if you want the latest version of a package, get a recent Python as well. A variant of "you can't have your cake and eat it", I suppose. It might seem a bit radical, but that's not that bad; it has worked for bioconductor (where each BioC release depends on the latest R version). If no C-extension is involved, rule #2 only introduces a few conditional definitions that are hand-coded. There are only a few things to consider. > > Which version of Biopython, Latest on Pypi. 
I am doing a vanilla: ``` pip install biopython ``` > and if from git, which commit? Anyone on this list volunteering to run `git bisect` on my behalf ? (For now I more familiar with Mercurial, and plenty to do at the moment) > In particular was it before or after this change from Brad which > I recently committed? > > https://github.com/biopython/biopython/commit/bc828342716218b84908c1a4435163e564f31445 The distutils/distutils2/distribute/setuptools/whatever can be an endless source of headaches. Do you check with one of them in particular ? If so, may be worthwhile to enforce the use of that one in setup.py. Best, Laurent > > Thanks, > > Peter From p.j.a.cock at googlemail.com Mon Jul 8 12:29:32 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Mon, 8 Jul 2013 17:29:32 +0100 Subject: [Biopython-dev] biopython, pip, and Python 3.3 + virtualenv In-Reply-To: <51DAE656.2020609@gmail.com> References: <51DACEEC.6080305@gmail.com> <51DAE656.2020609@gmail.com> Message-ID: On Mon, Jul 8, 2013 at 5:18 PM, Laurent Gautier wrote: > On 07/08/2013 05:00 PM, Peter Cock wrote: >> >> On Mon, Jul 8, 2013 at 3:38 PM, Laurent Gautier >> wrote: >>> >>> Hi, >>> >>> I just tried installing biopython with pip (v-1.3.1) and Python 3.3 >>> (under a virtual environment created with pyvenv), >>> but (after a a fairly long time) the process is failing with the last 3 >>> lines printed on the terminal: >>> >>> ``` >>> >>> Python 2to3 processing done. >>> >>> running egg_info >>> >>> error: error in 'egg_base' option: 'pip-egg-info' does not exist or is >>> not a >>> directory >>> >>> ``` >>> >>> Am I missing something ? 
>>> >>> Best, >>> >>> >>> Laurent >> >> Hi Laurent, >> >> The 2to3 conversion does take a while (sigh), but I have a plan for that: >> http://lists.open-bio.org/pipermail/biopython-dev/2013-May/010633.html > > > For what it is worth, I have started doing the following 2 rules for all Python > projects (and it is working fairly well): > > 1- Develop for Python 3.3 (not worrying about anything earlier in the 3.x > series) > 2- When considering support for Python 2, consider 2.7. Forget about > anything earlier > The rationale is: if you want the latest version of a package, get a recent > Python as well. A variant of "you can't have your cake and eat it", I > suppose. It might seem a bit radical, but that's not that bad; it has worked > for bioconductor (where each BioC release depends on the latest R version). BioConductor's policy of being synced to the R version has pros and cons, certainly its users are used to this version treadmill. > If no C-extension is involved, rule #2 is only introducing few conditional > definitions that are hand-coded. There are only few things to consider. We're currently doing Python 2.5/2.6/2.7/3.3+ (although for the most part things have been OK under Python 3.1 and 3.2 as well but we're not going to officially support those). We're about to drop Python 2.5, and I'm fairly sure that dual coding for Python 2.6/2.7/3.3+ will be viable. If not, we may have to be more aggressive about phasing out Python 2.6 in the near future - we'll see. >> >> Which version of Biopython, > > > Latest on Pypi. https://pypi.python.org/pypi/biopython That means Biopython 1.61 at the moment, so prior to Brad's change landing on the trunk (which should be in 1.62 barring any problems before then). > I am doing a vanilla: > > ``` > pip install biopython > ``` > >> and if from git, which commit? > > Anyone on this list volunteering to run `git bisect` on my behalf ? 
> (For now I more familiar with Mercurial, and plenty to do at the moment) Have you tried using the latest Biopython from source to see if Brad's change fixed this for you? If that also fails the same way, there is no point to a 'git bisect' ;) > The distutils/distutils2/distribute/setuptools/whatever can be an endless > source of headaches. Do you check with one of them in particular ? > If so, may be worthwhile to enforce the use of that one in setup.py. Personally I only ever use the stock "python setup.py install" route, but there has been user demand for things like pip support. I look forward to improved packaging being standardised in a future Python 3.x release on day... Regards, Peter From lgautier at gmail.com Mon Jul 8 15:35:56 2013 From: lgautier at gmail.com (Laurent Gautier) Date: Mon, 08 Jul 2013 21:35:56 +0200 Subject: [Biopython-dev] biopython, pip, and Python 3.3 + virtualenv In-Reply-To: References: <51DACEEC.6080305@gmail.com> <51DAE656.2020609@gmail.com> Message-ID: <51DB149C.2010605@gmail.com> On 07/08/2013 06:29 PM, Peter Cock wrote: > On Mon, Jul 8, 2013 at 5:18 PM, Laurent Gautier wrote: >> On 07/08/2013 05:00 PM, Peter Cock wrote: >>> On Mon, Jul 8, 2013 at 3:38 PM, Laurent Gautier >>> wrote: >>>> Hi, >>>> >>>> I just tried installing biopython with pip (v-1.3.1) and Python 3.3 >>>> (under a virtual environment created with pyvenv), >>>> but (after a a fairly long time) the process is failing with the last 3 >>>> lines printed on the terminal: >>>> >>>> ``` >>>> >>>> Python 2to3 processing done. >>>> >>>> running egg_info >>>> >>>> error: error in 'egg_base' option: 'pip-egg-info' does not exist or is >>>> not a >>>> directory >>>> >>>> ``` >>>> >>>> Am I missing something ? 
>>>> >>>> Best, >>>> >>>> >>>> Laurent >>> Hi Laurent, >>> >>> The 2to3 conversion does take a while (sigh), but I have a plan for that: >>> http://lists.open-bio.org/pipermail/biopython-dev/2013-May/010633.html >> >> For it is worth, I have started doing the following 2 rules for all Python >> projects (and it is working fairly well): >> >> 1- Develop for Python 3.3 (not worrying about anything earlier in the 3.x >> series) >> 2- When considering support for Python 2, consider 2.7. Forget about >> anything earlier >> The rationale is: if you want the latest version of a package, get a recent >> Python as well. A variant of "you can't have your cake and eat it", I >> suppose. It might seem a bit radical, but that's not that bad; it has worked >> for bioconductor (where each BioC release depends on the latest R version). > BioConductor's policy of being synced to the R version has pros and > cons, certainly its users are used to this version treadmill. The package installation system has certainly played a role in help keeping users on board. The life cycle of Python version is much slower, so the pace would be slower. > >> If no C-extension is involved, rule #2 is only introducing few conditional >> definitions that are hand-coded. There are only few things to consider. > We're currently doing Python 2.5/2.6/2.6/3.3+ (although for the most > part things have been OK under Python 3.1 and 3.2 as well but we're > not going to officially support those). > > We're about to drop Python 2.5, and I'm fairly sure that dual coding > for Python 2.6/2.7/3.3+ will be viable. If not, we may have to be more > aggressive about phasing out Python 2.6 in the near future - we'll see. IIRC, Python 2.6 does not receive bugfixes for some time already, only security fixes (the last of which will be in the autumn). > >>> Which version of Biopython, >> >> Latest on Pypi. 
> https://pypi.python.org/pypi/biopython > > That means Biopython 1.61 at the moment, so prior to Brad's > change landing on the trunk (which should be in 1.62 barring > any problems before then). > >> I am doing a vanilla: >> >> ``` >> pip install biopython >> ``` >> >>> and if from git, which commit? >> Anyone on this list volunteering to run `git bisect` on my behalf ? >> (For now I more familiar with Mercurial, and plenty to do at the moment) > Have you tried using the latest Biopython from source to see if Brad's > change fixed this for you? If that also fails the same way, there is no > point to a 'git bisect' ;) > >> The distutils/distutils2/distribute/setuptools/whatever can be an endless >> source of headaches. Do you check with one of them in particular ? >> If so, may be worthwhile to enforce the use of that one in setup.py. > Personally I only ever use the stock "python setup.py install" route, > but there has been user demand for things like pip support. You are missing out on quite a lot, I think. For example, trying to install the master from github can be achieved the following single command: ``` pip install https://github.com/biopython/biopython/archive/master.zip ``` The outcome is the same error as before. > I look > forward to improved packaging being standardised in a future > Python 3.x release on day... Some of the time I am thinking that this will not come before Python 4.x... 
Best, Laurent > > Regards, > > Peter From p.j.a.cock at googlemail.com Tue Jul 9 05:35:03 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Tue, 9 Jul 2013 10:35:03 +0100 Subject: [Biopython-dev] biopython, pip, and Python 3.3 + virtualenv In-Reply-To: <51DB149C.2010605@gmail.com> References: <51DACEEC.6080305@gmail.com> <51DAE656.2020609@gmail.com> <51DB149C.2010605@gmail.com> Message-ID: On Mon, Jul 8, 2013 at 8:35 PM, Laurent Gautier wrote: > On 07/08/2013 06:29 PM, Peter Cock wrote: >> >> On Mon, Jul 8, 2013, Laurent Gautier wrote: >>> >>> On 07/08/2013 05:00 PM, Peter Cock wrote: >>>> >>>> On Mon, Jul 8, 2013, Laurent Gautier >>>> wrote: >>>>> >>>>> Hi, >>>>> >>>>> I just tried installing biopython with pip (v-1.3.1) and Python 3.3 >>>>> (under a virtual environment created with pyvenv), >>>>> but (after a a fairly long time) the process is failing with the last 3 >>>>> lines printed on the terminal: >>>>> >>>>> ``` >>>>> >>>>> Python 2to3 processing done. >>>>> >>>>> running egg_info >>>>> >>>>> error: error in 'egg_base' option: 'pip-egg-info' does not exist or is >>>>> not a directory >>>>> >>>>> ``` >>>>> >>>>> Am I missing something ? >>>>> >>>>> Best, >>>>> >>>>> >>>>> Laurent >>>> >>>> Hi Laurent, >>>> >>>> ... >>>> >>>> Which version of Biopython, >>> >>> >>> Latest on Pypi. >> >> https://pypi.python.org/pypi/biopython >> >> That means Biopython 1.61 at the moment, so prior to Brad's >> change landing on the trunk (which should be in 1.62 barring >> any problems before then). >> >>> I am doing a vanilla: >>> >>> ``` >>> pip install biopython >>> ``` >>> >>>> and if from git, which commit? >>> >>> Anyone on this list volunteering to run `git bisect` on my behalf ? >>> (For now I more familiar with Mercurial, and plenty to do at the moment) >> >> Have you tried using the latest Biopython from source to see if Brad's >> change fixed this for you? If that also fails the same way, there is no >> point to a 'git bisect' ;) >> > > ... 
> > For example, trying to install the master from github can be achieved the > following single command: > > ``` > pip install https://github.com/biopython/biopython/archive/master.zip > ``` > > The outcome is the same error as before. OK, so you get the same error from a 'pip install' from both the Biopython 1.61 release, and the latest code from github as of 8 July 2013, which would have included Brad's pip-related commit which evidently made no difference to this issue: https://github.com/biopython/biopython/commit/bc828342716218b84908c1a4435163e564f31445 Thanks for checking that. Having confirmed the latest code is still affected, I'm out of ideas - how about you Brad, any thoughts? Thanks, Peter From p.j.a.cock at googlemail.com Tue Jul 9 05:43:57 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Tue, 9 Jul 2013 10:43:57 +0100 Subject: [Biopython-dev] biopython, pip, and Python 3.3 + virtualenv In-Reply-To: <86ip0kul37.fsf@fastmail.fm> References: <51DACEEC.6080305@gmail.com> <51DAE656.2020609@gmail.com> <51DB149C.2010605@gmail.com> <86ip0kul37.fsf@fastmail.fm> Message-ID: On Tue, Jul 9, 2013 at 10:41 AM, Brad Chapman wrote: > > Laurent and Peter; > >> OK, so you get the same error from a 'pip install' from both the >> Biopython 1.61 release, and the latest code from github as of >> 8 July 2013, which would have included Brad's pip-related >> commit which evidently made no difference to this issue: >> >> https://github.com/biopython/biopython/commit/bc828342716218b84908c1a4435163e564f31445 >> >> Thanks for checking that. Having confirmed the latest code is >> still affected, I'm out of ideas - how about you Brad, any thoughts? 
> > Laurent, can you try this branch and see if it installs cleanly for you > using pip on Python 3.3: > > https://github.com/chapmanb/biopython > > I pulled the fix from numpy's older setup.py (before they moved to a > combined 2.7 and 3.3 code base) and it worked on my tests, but would > like to confirm before merging: > > https://github.com/biopython/biopython/pull/172/files > > Thanks for reporting the issue, > Brad Ah - those changes are now live on the master guys: https://github.com/biopython/biopython/commit/1d3d2fe43ae776d30777ffdab7b6528cd3164cfd https://github.com/biopython/biopython/commit/0f821c5b69ab5f6b888d36fb29b1ba3b51ce8590 It was perhaps a little premature but since Brad pushed it up to the pull request I thought it was good to go. We can revert it need be. Laurent, you can just re-test the master rather than Brad's branch. Thanks, Peter From chapmanb at 50mail.com Tue Jul 9 05:41:16 2013 From: chapmanb at 50mail.com (Brad Chapman) Date: Tue, 09 Jul 2013 05:41:16 -0400 Subject: [Biopython-dev] biopython, pip, and Python 3.3 + virtualenv In-Reply-To: References: <51DACEEC.6080305@gmail.com> <51DAE656.2020609@gmail.com> <51DB149C.2010605@gmail.com> Message-ID: <86ip0kul37.fsf@fastmail.fm> Laurent and Peter; > OK, so you get the same error from a 'pip install' from both the > Biopython 1.61 release, and the latest code from github as of > 8 July 2013, which would have included Brad's pip-related > commit which evidently made no difference to this issue: > > https://github.com/biopython/biopython/commit/bc828342716218b84908c1a4435163e564f31445 > > Thanks for checking that. Having confirmed the latest code is > still affected, I'm out of ideas - how about you Brad, any thoughts? 
Laurent, can you try this branch and see if it installs cleanly for you using pip on Python 3.3: https://github.com/chapmanb/biopython I pulled the fix from numpy's older setup.py (before they moved to a combined 2.7 and 3.3 code base) and it worked on my tests, but would like to confirm before merging: https://github.com/biopython/biopython/pull/172/files Thanks for reporting the issue, Brad From p.j.a.cock at googlemail.com Tue Jul 9 06:53:05 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Tue, 9 Jul 2013 11:53:05 +0100 Subject: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week? Message-ID: Dear all, Given I'm heading off to Germany next week for BOSC and the CodeFest, it would be good to have Biopython 1.62 out this week - or at least a beta release. Having a beta would make sense in terms of trying to get more testing under Python 3.3, plus the SeqFeature change with sub_features (previously used for joins etc) being deprecated in favour of the new CompoundLocation object. Any thoughts? Also, I've had a go at updating the main README file and the Installation.tex file - that probably needs more work still (e.g. the ReportLab section needs updating). 
Thanks, Peter From lgautier at gmail.com Tue Jul 9 09:43:38 2013 From: lgautier at gmail.com (Laurent Gautier) Date: Tue, 09 Jul 2013 15:43:38 +0200 Subject: [Biopython-dev] biopython, pip, and Python 3.3 + virtualenv In-Reply-To: References: <51DACEEC.6080305@gmail.com> <51DAE656.2020609@gmail.com> <51DB149C.2010605@gmail.com> <86ip0kul37.fsf@fastmail.fm> Message-ID: <51DC138A.2020005@gmail.com> On 07/09/2013 11:43 AM, Peter Cock wrote: > On Tue, Jul 9, 2013 at 10:41 AM, Brad Chapman wrote: >> Laurent and Peter; >> >>> OK, so you get the same error from a 'pip install' from both the >>> Biopython 1.61 release, and the latest code from github as of >>> 8 July 2013, which would have included Brad's pip-related >>> commit which evidently made no difference to this issue: >>> >>> https://github.com/biopython/biopython/commit/bc828342716218b84908c1a4435163e564f31445 >>> >>> Thanks for checking that. Having confirmed the latest code is >>> still affected, I'm out of ideas - how about you Brad, any thoughts? >> Laurent, can you try this branch and see if it installs cleanly for you >> using pip on Python 3.3: >> >> https://github.com/chapmanb/biopython >> >> I pulled the fix from numpy's older setup.py (before they moved to a >> combined 2.7 and 3.3 code base) and it worked on my tests, but would >> like to confirm before merging: >> >> https://github.com/biopython/biopython/pull/172/files >> >> Thanks for reporting the issue, >> Brad > Ah - those changes are now live on the master guys: > > https://github.com/biopython/biopython/commit/1d3d2fe43ae776d30777ffdab7b6528cd3164cfd > https://github.com/biopython/biopython/commit/0f821c5b69ab5f6b888d36fb29b1ba3b51ce8590 > > It was perhaps a little premature but since Brad pushed it up to > the pull request I thought it was good to go. We can revert it > need be. > > Laurent, you can just re-test the master rather than Brad's branch. The install process is now working with the master on Github. 
Thanks, Laurent > > Thanks, > > Peter From p.j.a.cock at googlemail.com Tue Jul 9 10:38:25 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Tue, 9 Jul 2013 15:38:25 +0100 Subject: [Biopython-dev] biopython, pip, and Python 3.3 + virtualenv In-Reply-To: <51DC138A.2020005@gmail.com> References: <51DACEEC.6080305@gmail.com> <51DAE656.2020609@gmail.com> <51DB149C.2010605@gmail.com> <86ip0kul37.fsf@fastmail.fm> <51DC138A.2020005@gmail.com> Message-ID: On Tue, Jul 9, 2013 at 2:43 PM, Laurent Gautier wrote: > On 07/09/2013 11:43 AM, Peter Cock wrote: >> >> ... those changes are now live on the master guys: >> >> https://github.com/biopython/biopython/commit/1d3d2fe43ae776d30777ffdab7b6528cd3164cfd >> https://github.com/biopython/biopython/commit/0f821c5b69ab5f6b888d36fb29b1ba3b51ce8590 >> >> ... >> >> Laurent, you can just re-test the master rather than Brad's branch. > > The install process is now working with the master on Github. > > Thanks, > > Laurent Great - thanks Laurent & Brad :) Peter From mjldehoon at yahoo.com Fri Jul 12 05:06:54 2013 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Fri, 12 Jul 2013 02:06:54 -0700 (PDT) Subject: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week? In-Reply-To: References: Message-ID: <1373620014.53407.YahooMailNeo@web164001.mail.gq1.yahoo.com> Based on the Biopython deprecation schedule policy, we can remove the following pieces of code in release 1.62: Bio.Align.__init__: the get_column and add_sequence methods Bio.Align.Generic: the class Alignment in its entirety Bio.Graphics.GenomeDiagram._AbstractDraw: the .xcentre and .ycentre properties and their setters in the class AbstractDrawer Bio.Graphics.GenomeDiagram._Graph: The .centre property and its setter in the class GraphData Any final objections before we proceed? Best, -Michiel. 
________________________________ From: Peter Cock To: Biopython-Dev Mailing List Sent: Tuesday, July 9, 2013 7:53 PM Subject: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week? Dear all, Given I'm heading off to Germany next week for BOSC and the CodeFest, it would be good to have Biopython 1.62 out this week - or at least a beta release. Having a beta would make sense in terms of trying to get more testing under Python 3.3, plus the SeqFeature change with sub_features (previously used for joins etc) being deprecated in favour of the new CompoundLocation object. Any thoughts? Also, I've had a go at updating the main README file and the Installation.tex file - that probably needs more work still (e.g. the ReportLab section needs updating). Thanks, Peter _______________________________________________ Biopython-dev mailing list Biopython-dev at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/biopython-dev From p.j.a.cock at googlemail.com Fri Jul 12 05:42:50 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Fri, 12 Jul 2013 10:42:50 +0100 Subject: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week? In-Reply-To: <1373620014.53407.YahooMailNeo@web164001.mail.gq1.yahoo.com> References: <1373620014.53407.YahooMailNeo@web164001.mail.gq1.yahoo.com> Message-ID: On Fri, Jul 12, 2013 at 10:06 AM, Michiel de Hoon wrote: > Based on the Biopython deprecation schedule policy, we can remove the > following pieces of code in release 1.62: > > Bio.Align.__init__: the get_column and add_sequence methods > Bio.Align.Generic: the class Alignment in its entirety Eric & Zheng, that won't cause any problems for your GSoC work will it? 
> Bio.Graphics.GenomeDiagram._AbstractDraw: the .xcentre and .ycentre > properties and their setters in the class AbstractDrawer > Bio.Graphics.GenomeDiagram._Graph: The .centre property and its setter in > the class GraphData Sounds fine - I can do those, Thanks Michiel, Peter From p.j.a.cock at googlemail.com Fri Jul 12 06:48:10 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Fri, 12 Jul 2013 11:48:10 +0100 Subject: [Biopython-dev] [Biopython] Bio.motifs raising Exceptions using pypy In-Reply-To: References: <51DE917B.5030807@unifi.it> <51DFCF2B.4080200@unifi.it> Message-ID: On Fri, Jul 12, 2013 at 11:00 AM, Peter Cock wrote: > On Fri, Jul 12, 2013 at 10:40 AM, Marco Galardini > wrote: >> Hi, >> >> i've arranged a sample script and sample data to replicate the issue: >> >> python test.py test.fa test.txt >> 551 20.9172 >> -5389 21.0426 >> >> pypy test.py test.fa test.txt >> 551 20.9172 >> -5389 21.0426 >> >> Traceback (most recent call last): >> File "app_main.py", line 72, in run_toplevel >> File "test.py", line 20, in >> for position, score in pssm.search(s.seq, threshold=score_t): >> File "/usr/local/lib/pypy2.7/dist-packages/Bio/motifs/matrix.py", line >> 354, in search >> score = self.calculate(s) >> File "/usr/local/lib/pypy2.7/dist-packages/Bio/motifs/matrix.py", line >> 331, in calculate >> score += self[letter][position] >> File "/usr/local/lib/pypy2.7/dist-packages/Bio/motifs/matrix.py", line >> 113, in __getitem__ >> return dict.__getitem__(self, letter) >> KeyError: 'N' >> >> Hope this helps, my guess is that it may be something related to the >> implementation of dictionaries in pypy, since the object raising the >> exception inherits dict. >> >> Thanks a lot for the help, >> Marco > > Great - I can reproduce that here using PyPy 1.9 as well... > OK - this also breaks under Jython and even Python if we disable the C extension. Here self[letters] only has ACGT, not N, thus a key error. This is something the C code just ignores. 
There is also an inconsistency with mixed case. New unit test: https://github.com/biopython/biopython/commit/e13c97ae3535b58d8ec3da3fc565e97db1fa75a3 Fix for the mixed case difference: https://github.com/biopython/biopython/commit/0cab00c66a1fd15072d020cfc17edbdfb37484a5 The KeyError from bad characters can be handled like this: $ git diff diff --git a/Bio/motifs/matrix.py b/Bio/motifs/matrix.py index bce1d4f..e6446b5 100644 --- a/Bio/motifs/matrix.py +++ b/Bio/motifs/matrix.py @@ -364,7 +364,11 @@ class PositionSpecificScoringMatrix(GenericPositionMatrix): score = 0.0 for position in xrange(m): letter = sequence[i+position] - score += self[letter][position] + try: + score += self[letter][position] + except KeyError: + #The C code ignores unexpected letters like N + pass scores.append(score) else: # get the log-odds matrix into a proper shape However, that leaves a numerical difference in the output: $ pypy test_motifs.py test_simple (__main__.MotifTestPWM) Test if Bio.motifs PWM scoring works. ... ok test_with_bad_char (__main__.MotifTestPWM) Test if Bio.motifs PWM scoring works with unexpected letters like N. ... FAIL test_with_mixed_case (__main__.MotifTestPWM) Test if Bio.motifs PWM scoring works with mixed case. ... ok ... ====================================================================== FAIL: test_with_bad_char (__main__.MotifTestPWM) Test if Bio.motifs PWM scoring works with unexpected letters like N. ---------------------------------------------------------------------- Traceback (most recent call last): File "test_motifs.py", line 1662, in test_with_bad_char self.assertTrue(_isnan(result[6]), "Expected nan, not %r" % result[6]) AssertionError: Expected nan, not -37.417418833750574 ---------------------------------------------------------------------- Ran 15 tests in 0.077s FAILED (failures=1) The same error occurs on Jython, and on Python if I disable the C extension. This needs a little more investigation... 
I don't immediately follow when the C code sets the value to nan. Peter From p.j.a.cock at googlemail.com Fri Jul 12 08:57:08 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Fri, 12 Jul 2013 13:57:08 +0100 Subject: [Biopython-dev] [Biopython] Bio.motifs raising Exceptions using pypy In-Reply-To: References: <51DE917B.5030807@unifi.it> <51DFCF2B.4080200@unifi.it> Message-ID: On Fri, Jul 12, 2013 at 11:48 AM, Peter Cock wrote: > > OK - this also breaks under Jython and even Python if we > disable the C extension. Here self[letters] only has ACGT, > not N, thus a key error. This is something the C code just > ignores. There is also an inconsistency with mixed case. > > New unit test: > https://github.com/biopython/biopython/commit/e13c97ae3535b58d8ec3da3fc565e97db1fa75a3 > > Fix for the mixed case difference: > https://github.com/biopython/biopython/commit/0cab00c66a1fd15072d020cfc17edbdfb37484a5 > > The KeyError from bad characters can be handled like this: > > $ git diff > diff --git a/Bio/motifs/matrix.py b/Bio/motifs/matrix.py > index bce1d4f..e6446b5 100644 > --- a/Bio/motifs/matrix.py > +++ b/Bio/motifs/matrix.py > @@ -364,7 +364,11 @@ class PositionSpecificScoringMatrix(GenericPositionMatrix): > score = 0.0 > for position in xrange(m): > letter = sequence[i+position] > - score += self[letter][position] > + try: > + score += self[letter][position] > + except KeyError: > + #The C code ignores unexpected letters like N > + pass > scores.append(score) > else: > # get the log-odds matrix into a proper shape > > However, that leaves a numerical difference in the output: > > ... > > The same error occurs on Jython, and on Python if I disable > the C extension. This needs a little more investigation... I > don't immediately follow when the C code sets the value > to nan. Rereading the C code after lunch I realised how the 'ok' sentinel value was being used - bad letters result in NaN as the value. 
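The NaN behaviour described here can be illustrated with a minimal pure-Python sketch. This is not the code from the commit; score_windows and toy_pssm are hypothetical names, and the sketch assumes the matrix is a plain dict mapping each letter to a list of per-position log-odds scores:

```python
import math


def score_windows(pssm, sequence):
    """Score every window of the sequence; a window holding an unknown letter scores NaN."""
    width = len(next(iter(pssm.values())))
    scores = []
    for start in range(len(sequence) - width + 1):
        score = 0.0
        for position in range(width):
            letter = sequence[start + position]
            try:
                score += pssm[letter][position]
            except KeyError:
                # Unexpected letter (e.g. N): mark the whole window as NaN
                # instead of silently skipping the letter.
                score = float("nan")
                break
        scores.append(score)
    return scores


# Hypothetical two-column matrix over ACGT only.
toy_pssm = {
    "A": [1.0, -1.0],
    "C": [-1.0, 1.0],
    "G": [0.0, 0.0],
    "T": [-2.0, -2.0],
}
scores = score_windows(toy_pssm, "ACNA")
print([("nan" if math.isnan(s) else s) for s in scores])
```

With this approach a caller comparing scores against a threshold never reports a hit for a window containing N, since any comparison with NaN is false.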
Fixed, https://github.com/biopython/biopython/commit/00043d28bdf5408519cb4832d6a8e822d10f6653 Peter From marco.galardini at unifi.it Fri Jul 12 09:02:20 2013 From: marco.galardini at unifi.it (Marco Galardini) Date: Fri, 12 Jul 2013 15:02:20 +0200 Subject: [Biopython-dev] [Biopython] Bio.motifs raising Exceptions using pypy In-Reply-To: References: <51DE917B.5030807@unifi.it> <51DFCF2B.4080200@unifi.it> Message-ID: <51DFFE5C.4060709@unifi.it> On 07/12/2013 02:57 PM, Peter Cock wrote: > Rereading the C code after lunch I realised how the 'ok' sentinel > value was being used - bad letters result in NaN as the value. > > Fixed, > https://github.com/biopython/biopython/commit/00043d28bdf5408519cb4832d6a8e822d10f6653 > > Peter Peter, i think you should remove the " raise ImportError" statement in line 359, as it would render impossible to use the extension (if I got that correctly). Marco -- ------------------------------------------------- Marco Galardini, PhD Dipartimento di Biologia Via Madonna del Piano, 6 - 50019 Sesto Fiorentino (FI) e-mail: marco.galardini at unifi.it www: http://www.unifi.it/dblage/CMpro-v-p-51.html phone: +39 055 4574737 mobile: +39 340 2808041 ------------------------------------------------- From p.j.a.cock at googlemail.com Fri Jul 12 09:26:12 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Fri, 12 Jul 2013 14:26:12 +0100 Subject: [Biopython-dev] [Biopython] Bio.motifs raising Exceptions using pypy In-Reply-To: <51DFFE5C.4060709@unifi.it> References: <51DE917B.5030807@unifi.it> <51DFCF2B.4080200@unifi.it> <51DFFE5C.4060709@unifi.it> Message-ID: On Fri, Jul 12, 2013 at 2:02 PM, Marco Galardini wrote: > On 07/12/2013 02:57 PM, Peter Cock wrote: >> >> Rereading the C code after lunch I realised how the 'ok' sentinel >> value was being used - bad letters result in NaN as the value. 
>> >> Fixed, >> >> https://github.com/biopython/biopython/commit/00043d28bdf5408519cb4832d6a8e822d10f6653 >> >> Peter > > Peter, i think you should remove the " raise ImportError" statement in line > 359, as it would render impossible to use the extension (if I got that > correctly). > > Marco Thank you - that was debugging code for testing normal Python: https://github.com/biopython/biopython/commit/66e35d5cdd1cdfbd56b46a2f6098f715adb80f9d Peter From marco.galardini at unifi.it Fri Jul 12 10:50:00 2013 From: marco.galardini at unifi.it (Marco Galardini) Date: Fri, 12 Jul 2013 16:50:00 +0200 Subject: [Biopython-dev] [Biopython] Bio.motifs raising Exceptions using pypy In-Reply-To: References: <51DE917B.5030807@unifi.it> <51DFCF2B.4080200@unifi.it> <51DFFE5C.4060709@unifi.it> Message-ID: <51E01798.1070301@unifi.it> The proposed fix works perfectly for me, thanks a lot! Marco On 07/12/2013 03:26 PM, Peter Cock wrote: > On Fri, Jul 12, 2013 at 2:02 PM, Marco Galardini > wrote: >> On 07/12/2013 02:57 PM, Peter Cock wrote: >>> Rereading the C code after lunch I realised how the 'ok' sentinel >>> value was being used - bad letters result in NaN as the value. >>> >>> Fixed, >>> >>> https://github.com/biopython/biopython/commit/00043d28bdf5408519cb4832d6a8e822d10f6653 >>> >>> Peter >> Peter, i think you should remove the " raise ImportError" statement in line >> 359, as it would render impossible to use the extension (if I got that >> correctly). 
>> >> Marco > Thank you - that was debugging code for testing normal Python: > https://github.com/biopython/biopython/commit/66e35d5cdd1cdfbd56b46a2f6098f715adb80f9d > > Peter > _______________________________________________ > Biopython-dev mailing list > Biopython-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython-dev -- ------------------------------------------------- Marco Galardini, PhD Dipartimento di Biologia Via Madonna del Piano, 6 - 50019 Sesto Fiorentino (FI) e-mail: marco.galardini at unifi.it www: http://www.unifi.it/dblage/CMpro-v-p-51.html phone: +39 055 4574737 mobile: +39 340 2808041 ------------------------------------------------- From mjldehoon at yahoo.com Fri Jul 12 21:52:30 2013 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Fri, 12 Jul 2013 18:52:30 -0700 (PDT) Subject: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week? In-Reply-To: References: <1373620014.53407.YahooMailNeo@web164001.mail.gq1.yahoo.com> Message-ID: <1373680350.55044.YahooMailNeo@web164003.mail.gq1.yahoo.com> The following pieces of code had a PendingDeprecationWarning in Biopython release 1.61, and can be upgraded to a BiopythonDeprecationWarning: Bio.Blast.NCBIStandalone (entire module). This module has had a PendingDeprecationWarning since September 2010. Bio.Motif (entire module). Its functionality is available from Bio.motifs, so Bio.Motif can be deprecated. Bio.ParserSupport (entire module). This module is currently only being used by Bio.Blast.NCBIStandalone, and has had a PendingDeprecationWarning since September 2011. Any final objections? Best, -Michiel From zruan1991 at gmail.com Sat Jul 13 02:37:09 2013 From: zruan1991 at gmail.com (Zheng Ruan) Date: Sat, 13 Jul 2013 14:37:09 +0800 Subject: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week? 
In-Reply-To: References: <1373620014.53407.YahooMailNeo@web164001.mail.gq1.yahoo.com> Message-ID: I don't think this will cause any problems if methods for MultipleSeqAlignment are there. I notice the latest code of MultipleSeqAlignment inherits Bio.Align.Generic.Alignment. Will that code (such as __str__, __getitem__, _str_line) be ported to MultipleSeqAlignment? Thanks! Zheng On Friday, July 12, 2013, Peter Cock wrote: > On Fri, Jul 12, 2013 at 10:06 AM, Michiel de Hoon > > wrote: > > Based on the Biopython deprecation schedule policy, we can remove the > > following pieces of code in release 1.62: > > > > Bio.Align.__init__: the get_column and add_sequence methods > > Bio.Align.Generic: the class Alignment in its entirety > > Eric & Zheng, that won't cause any problems for your GSoC work > will it? > > > Bio.Graphics.GenomeDiagram._AbstractDraw: the .xcentre and .ycentre > > properties and their setters in the class AbstractDrawer > > Bio.Graphics.GenomeDiagram._Graph: The .centre property and its setter in > > the class GraphData > > Sounds fine - I can do those, > > Thanks Michiel, > > Peter > From w.arindrarto at gmail.com Sat Jul 13 02:58:23 2013 From: w.arindrarto at gmail.com (Wibowo Arindrarto) Date: Sat, 13 Jul 2013 08:58:23 +0200 Subject: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week? In-Reply-To: <1373680350.55044.YahooMailNeo@web164003.mail.gq1.yahoo.com> References: <1373620014.53407.YahooMailNeo@web164001.mail.gq1.yahoo.com> <1373680350.55044.YahooMailNeo@web164003.mail.gq1.yahoo.com> Message-ID: Hi Michiel, There are two classes from Bio.Blast.NCBIStandalone still being used by Bio.SearchIO internally (for the BLAST text parser): the BlastParser and the Iterator classes. The BlastParser class itself still relies on Bio.ParserSupport. Would it be ok if we move parts that are used by SearchIO into their own private classes in Bio.SearchIO, while putting the BiopythonDeprecationWarning on the current files?
Best regards, Bow On Sat, Jul 13, 2013 at 3:52 AM, Michiel de Hoon wrote: > The following pieces of code had a PendingDeprecationWarning in Biopython release 1.61, and can be upgraded to a BiopythonDeprecationWarning: > > Bio.Blast.NCBIStandalone (entire module). This module has had a PendingDeprecationWarning since September 2010. > > Bio.Motif (entire module). Its functionality is available from Bio.motifs, so Bio.Motif can be deprecated. > > Bio.ParserSupport (entire module). This module is currently only being used by Bio.Blast.NCBIStandalone, and has had a PendingDeprecationWarning since September 2011. > > Any final objections? > > Best, > -Michiel > _______________________________________________ > Biopython-dev mailing list > Biopython-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython-dev From mjldehoon at yahoo.com Sat Jul 13 06:54:07 2013 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Sat, 13 Jul 2013 03:54:07 -0700 (PDT) Subject: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week? In-Reply-To: References: <1373620014.53407.YahooMailNeo@web164001.mail.gq1.yahoo.com> <1373680350.55044.YahooMailNeo@web164003.mail.gq1.yahoo.com> Message-ID: <1373712847.72527.YahooMailNeo@web164004.mail.gq1.yahoo.com> Hi Bow, > Would it be ok if we move parts that are used by SearchIO into their own private classes in > Bio.SearchIO, while putting the BiopythonDeprecationWarning on the current files? That sounds fine to me. Any other opinions, anybody? Best, -Michiel. ________________________________ From: Wibowo Arindrarto To: Michiel de Hoon Cc: Peter Cock ; Eric Talevich ; Zheng Ruan ; Biopython-Dev Mailing List Sent: Saturday, July 13, 2013 3:58 PM Subject: Re: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week? Hi Michiel, There are two classes from Bio.Blast.NCBIStandalone still being used by Bio.SearchIO internally (for the BLAST text parser): the BlastParser and the Iterator classes. 
The BlastParser class itself still relies on Bio.ParserSupport. Would it be ok if we move parts that are used by SearchIO into their own private classes in Bio.SearchIO, while putting the BiopythonDeprecationWarning on the current files? Best regards, Bow On Sat, Jul 13, 2013 at 3:52 AM, Michiel de Hoon wrote: > The following pieces of code had a PendingDeprecationWarning in Biopython release 1.61, and can be upgraded to a BiopythonDeprecationWarning: > > Bio.Blast.NCBIStandalone (entire module). This module has had a PendingDeprecationWarning since September 2010. > > Bio.Motif (entire module). Its functionality is available from Bio.motifs, so Bio.Motif can be deprecated. > > Bio.ParserSupport (entire module). This module is currently only being used by Bio.Blast.NCBIStandalone, and has had a PendingDeprecationWarning since September 2011. > > Any final objections? > > Best, > -Michiel > _______________________________________________ > Biopython-dev mailing list > Biopython-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython-dev From zruan1991 at gmail.com Mon Jul 15 00:19:50 2013 From: zruan1991 at gmail.com (Zheng Ruan) Date: Mon, 15 Jul 2013 12:19:50 +0800 Subject: [Biopython-dev] Codon Alignment GSoC Project Update Message-ID: Hi all, I have an update of Codon Alignment project. It can be found at http://zruanweb.com/. My plan for the following three weeks is also there. Thanks! Best, Zheng Ruan From redmine at redmine.open-bio.org Mon Jul 15 05:30:20 2013 From: redmine at redmine.open-bio.org (redmine at redmine.open-bio.org) Date: Mon, 15 Jul 2013 09:30:20 +0000 Subject: [Biopython-dev] [Biopython - Bug #3441] (New) DSSP parser fails for some DSSP 2.1.0 output files Message-ID: Issue #3441 has been reported by Ahmet Sinan Yavuz. 
---------------------------------------- Bug #3441: DSSP parser fails for some DSSP 2.1.0 output files https://redmine.open-bio.org/issues/3441 Author: Ahmet Sinan Yavuz Status: New Priority: High Assignee: Category: Target version: URL: Some of the DSSP files created by mkdssp 2.1.0 start with the following header:
==== Secondary Structure Definition by the program DSSP, CMBI version by M.L. Hekkelman/2010-10-21 ==== DATE=2013-07-15        .
REFERENCE W. KABSCH AND C.SANDER, BIOPOLYMERS 22 (1983) 2577-2637                                                              .
                                                                                                                               .
  336  1  0  0  0 TOTAL NUMBER OF RESIDUES, NUMBER OF CHAINS, NUMBER OF SS-BRIDGES(TOTAL,INTRACHAIN,INTERCHAIN)                .
and the following parsing code (make_dssp_dict function, line 121, the @sl[1]@ part) fails on the 3rd line of the example above with "IndexError: list index out of range", as expected, because that line contains only a single "." and so splits into a one-element list.
    try:
        start = 0
        keys = []
        for l in handle.readlines():
            sl = l.split()
            if sl[1] == "RESIDUE":  # IndexError when the line has fewer than two fields
                # Start parsing from here
                start = 1
                continue
...

Potential temporary solution:
    try:
        start = 0
        keys = []
        for l in handle.readlines():
            sl = l.split()
            if len(sl) > 1:
                if sl[1] == "RESIDUE":
                    # Start parsing from here
                    start = 1
                    continue
...
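The same guard can be tried in isolation with a small standalone sketch. This is only an illustration and not the Bio.PDB code: find_residue_header is a hypothetical helper, and the sample lines are shortened versions of the header quoted above plus an assumed "#  RESIDUE" column-header line:

```python
def find_residue_header(lines):
    """Return the index of the '#  RESIDUE ...' column-header line, or -1 if absent."""
    for index, line in enumerate(lines):
        fields = line.split()
        # Guard against short lines such as the lone "." written by mkdssp 2.1.0,
        # which split into fewer than two fields.
        if len(fields) > 1 and fields[1] == "RESIDUE":
            return index
    return -1


sample = [
    "==== Secondary Structure Definition by the program DSSP ==== DATE=2013-07-15        .",
    "REFERENCE W. KABSCH AND C.SANDER, BIOPOLYMERS 22 (1983) 2577-2637      .",
    "                                                                        .",
    "  #  RESIDUE AA STRUCTURE BP1 BP2  ACC",
]
header_index = find_residue_header(sample)
print(header_index)
```

Combining the length check and the comparison in one condition keeps the loop body flat, which may be easier to read than the nested form.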
---------------------------------------- You have received this notification because this email was added to the New Issue Alert plugin -- You have received this notification because you have either subscribed to it, or are involved in it. To change your notification preferences, please click here and login: http://redmine.open-bio.org From p.j.a.cock at googlemail.com Mon Jul 15 09:02:14 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Mon, 15 Jul 2013 14:02:14 +0100 Subject: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week? In-Reply-To: <1373680350.55044.YahooMailNeo@web164003.mail.gq1.yahoo.com> References: <1373620014.53407.YahooMailNeo@web164001.mail.gq1.yahoo.com> <1373680350.55044.YahooMailNeo@web164003.mail.gq1.yahoo.com> Message-ID: On Sat, Jul 13, 2013 at 2:52 AM, Michiel de Hoon wrote: > The following pieces of code had a PendingDeprecationWarning in Biopython > release 1.61, and can be upgraded to a BiopythonDeprecationWarning: > > Bio.Motif (entire module). Its functionality is available from Bio.motifs, > so Bio.Motif can be deprecated. Done, https://github.com/biopython/biopython/commit/74fe3dd40c6f1f43032fa490a918abf052fd5c0e Peter From p.j.a.cock at googlemail.com Mon Jul 15 13:04:30 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Mon, 15 Jul 2013 18:04:30 +0100 Subject: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week? In-Reply-To: References: <1373620014.53407.YahooMailNeo@web164001.mail.gq1.yahoo.com> <1373680350.55044.YahooMailNeo@web164003.mail.gq1.yahoo.com> Message-ID: On Mon, Jul 15, 2013 at 2:02 PM, Peter Cock wrote: > On Sat, Jul 13, 2013 at 2:52 AM, Michiel de Hoon wrote: >> The following pieces of code had a PendingDeprecationWarning in Biopython >> release 1.61, and can be upgraded to a BiopythonDeprecationWarning: >> >> Bio.Motif (entire module). Its functionality is available from Bio.motifs, >> so Bio.Motif can be deprecated. 
> > Done, > https://github.com/biopython/biopython/commit/74fe3dd40c6f1f43032fa490a918abf052fd5c0e > > Peter I've started doing a Biopython 1.62 beta release now (before heading off to Berlin tomorrow for the CodeFest and BOSC), while I have access to the Windows machine to build the installers. Sorting out the BLAST deprecation warnings (and any required relocation of files) etc can happen once the beta is out in preparation for the final release tentatively next week (once I'm back in the office). Peter From p.j.a.cock at googlemail.com Mon Jul 15 13:29:44 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Mon, 15 Jul 2013 18:29:44 +0100 Subject: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week? In-Reply-To: References: <1373620014.53407.YahooMailNeo@web164001.mail.gq1.yahoo.com> <1373680350.55044.YahooMailNeo@web164003.mail.gq1.yahoo.com> Message-ID: On Mon, Jul 15, 2013 at 6:04 PM, Peter Cock wrote: > > I've started doing a Biopython 1.62 beta release now (before heading > off to Berlin tomorrow for the CodeFest and BOSC), while I have access > to the Windows machine to build the installers. > > Sorting out the BLAST deprecation warnings (and any required > relocation of files) etc can happen once the beta is out in preparation > for the final release tentatively next week (once I'm back in the office). > > Peter Beta release ready, this commit is tagged as biopython-162b, https://github.com/biopython/biopython/commit/76dbdba4ed791e69a480afb4382dd5865dd35dac Archives and Windows installers are live on biopython.org in the usual place http://biopython.org/DIST/ for sanity testing prior to an announcement on the main list etc. If some of you could cast your eyes over this in the next few hours that would be great. If someone wants to draft the email (and/or news post), even better. 
Thanks, Peter From yeyanbo289 at gmail.com Mon Jul 15 23:21:18 2013 From: yeyanbo289 at gmail.com (Yanbo Ye) Date: Tue, 16 Jul 2013 11:21:18 +0800 Subject: [Biopython-dev] GSOC 2013 Biopython.Phylo update 5 Message-ID: Hi all, I posted an update here . Thanks! Yanbo -- Yanbo Ye Bioinformatics Group, Wuhan Institute Of Virology, Chinese Academy of Sciences From p.j.a.cock at googlemail.com Tue Jul 16 05:37:04 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Tue, 16 Jul 2013 10:37:04 +0100 Subject: [Biopython-dev] Biopython 1.62 beta release Message-ID: Dear Biopythoneers, A beta release for Biopython 1.54 is now available for download and testing - note that I haven't done a fully detailed release announcement, we'll leave that for the official release: https://github.com/biopython/biopython/blob/master/NEWS Source distributions and Windows installers are available from the downloads page on the Biopython website. http://biopython.org/wiki/Download We are interested in getting feedback on the beta release as a whole, but especially on Python 3.3 support and the change to sub-feature handling in EMBL/GenBank parsing for joins. (At least) 22 people have contributed to this release (so far), which includes 11 new people: Alexander Campbell (first contribution) Andrea Rizzi (first contribution) Anthony Mathelier (first contribution) Ben Morris (first contribution) Brad Chapman Christian Brueffer David Arenillas (first contribution) David Martin (first contribution) Eric Talevich Iddo Friedberg Jian-Long Huang (first contribution) Joao Rodrigues Kai Blin Michiel de Hoon Nate Sutton (first contribution) Peter Cock Petra Kubincová (first contribution) Phillip Garland Saket Choudhary (first contribution) Tiago Antao Wibowo 'Bow' Arindrarto Xabier Bello (first contribution) Our thanks to them, and on behalf of the Biopython team, thank you for any feedback, bug reports, and contributions from trying this beta release. Regards, Peter P.S.
Biopython news is also on twitter: http://twitter.com/biopython From p.j.a.cock at googlemail.com Tue Jul 16 06:02:11 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Tue, 16 Jul 2013 11:02:11 +0100 Subject: [Biopython-dev] Biopython 1.62 beta release In-Reply-To: References: Message-ID: On Tue, Jul 16, 2013 at 10:37 AM, Peter Cock wrote: > Dear Biopythoneers, > > A beta release for Biopython 1.54 is now available for download > and testing Ahem. Biopython 1.62 beta, as per the title! Peter From heathmatlock at gmail.com Wed Jul 17 02:14:03 2013 From: heathmatlock at gmail.com (heathmatlock) Date: Wed, 17 Jul 2013 01:14:03 -0500 Subject: [Biopython-dev] No issues on Github Message-ID: I was looking for some open issues on Github, but I don't see any. Is biopython bug free with no roadmap of features needing assistance? :) -- Heath Matlock +1 256 274 4225 From p.j.a.cock at googlemail.com Wed Jul 17 06:03:54 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 17 Jul 2013 11:03:54 +0100 Subject: [Biopython-dev] No issues on Github In-Reply-To: References: Message-ID: On Wed, Jul 17, 2013 at 7:14 AM, heathmatlock wrote: > I was looking for some open issues on Github, but I don't see any. Is > biopython bug free with no roadmap of features needing assistance? :) Hi Heath, We're still using RedMine as our bug tracker, but moving the issues to GitHub seems quite appealing too: https://redmine.open-bio.org/projects/biopython (An updated SSL certificate is being organised, sorry about the current warning) Peter From p.j.a.cock at googlemail.com Wed Jul 17 12:53:56 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 17 Jul 2013 17:53:56 +0100 Subject: [Biopython-dev] Bugzilla --> RedMine --> GitHub issues? 
In-Reply-To: References: Message-ID: On Fri, Feb 1, 2013 at 5:48 PM, Wibowo Arindrarto wrote: >Peter wrote: >> Biopython used to use Bugzilla, at http://bugzilla.open-bio.org/ >> (it was left as a read only legacy listing, but it broke last year when >> the old server started to die and isn't really worth fixing). >> >> This was moved over to RedMine, along with all the other OBF >> projects. This does have some git integration, but I'm not that >> taken with it - and it is yet another service for the OBF team >> to maintain. >> >> What do people think of moving over to using GitHub issues? >> This would link in very well with pull requests and makes linking >> to commits much simpler too. One potential issue is if and how >> we could have bug reports sent to the biopython-dev mailing list >> (something we touched on recently for pull requests). >> >> A full automated move could be possible (NumPy did this), but I >> think a gradual move would be fine - stop filing new issues on >> RedMine and use GitHub issues in future. There are only about >> 100 issues open at the moment anyway, and a manual migration >> would also be a good way to review some of the older tickets. >> >> Thoughts?, > > Moving to GitHub sounds good to me. I'd prefer if we go over the > issues manually (removing the obsolete ones and keeping the current > ones). > > As per the bug reports sending to the mailing list, could we perhaps > create our own custom hooks? e.g. anytime a pull request is issued, an > email would be sent (see https://github.com/github/github-services and > http://developer.github.com/v3/repos/hooks/#create-a-hook) > > Regards, > Bow I just talked to Brad about this during the pre-BOSC 2013 CodeFest, and we agree that moving from RedMine to GitHub issues is a good move. BioRuby have already done this. If no one objects, I will enable filing issues on GitHub, update the wiki with links. 
It should be possible to disable filing new issues on RedMine, but leave it live for reference. https://redmine.open-bio.org/projects/biopython https://github.com/biopython/biopython/issues/ <-- not live yet We as a group should then manually review the ~100 open issues on RedMine, and file new issues on GitHub as appropriate. I think a manual review is a good idea anyway - there are some stale issues etc which need some fresh eyes. Regards, Peter From arklenna at gmail.com Wed Jul 17 13:06:48 2013 From: arklenna at gmail.com (Lenna Peterson) Date: Wed, 17 Jul 2013 13:06:48 -0400 Subject: [Biopython-dev] Bugzilla --> RedMine --> GitHub issues? In-Reply-To: References: Message-ID: Hi Peter, I'd be happy to take a look at some of the issues over the next few days. Cheers, Lenna On Wed, Jul 17, 2013 at 12:53 PM, Peter Cock wrote: > On Fri, Feb 1, 2013 at 5:48 PM, Wibowo Arindrarto > wrote: > >Peter wrote: > >> Biopython used to use Bugzilla, at http://bugzilla.open-bio.org/ > >> (it was left as a read only legacy listing, but it broke last year when > >> the old server started to die and isn't really worth fixing). > >> > >> This was moved over to RedMine, along with all the other OBF > >> projects. This does have some git integration, but I'm not that > >> taken with it - and it is yet another service for the OBF team > >> to maintain. > >> > >> What do people think of moving over to using GitHub issues? > >> This would link in very well with pull requests and makes linking > >> to commits much simpler too. One potential issue is if and how > >> we could have bug reports sent to the biopython-dev mailing list > >> (something we touched on recently for pull requests). > >> > >> A full automated move could be possible (NumPy did this), but I > >> think a gradual move would be fine - stop filing new issues on > >> RedMine and use GitHub issues in future. 
There are only about > >> 100 issues open at the moment anyway, and a manual migration > >> would also be a good way to review some of the older tickets. > >> > >> Thoughts?, > > > > Moving to GitHub sounds good to me. I'd prefer if we go over the > > issues manually (removing the obsolete ones and keeping the current > > ones). > > > > As per the bug reports sending to the mailing list, could we perhaps > > create our own custom hooks? e.g. anytime a pull request is issued, an > > email would be sent (see https://github.com/github/github-services and > > http://developer.github.com/v3/repos/hooks/#create-a-hook) > > > > Regards, > > Bow > > I just talked to Brad about this during the pre-BOSC 2013 CodeFest, > and we agree that moving from RedMine to GitHub issues is a good > move. BioRuby have already done this. > > If no one objects, I will enable filing issues on GitHub, update the > wiki with links. It should be possible to disable filing new issues > on RedMine, but leave it live for reference. > > https://redmine.open-bio.org/projects/biopython > https://github.com/biopython/biopython/issues/ <-- not live yet > > We as a group should then manually review the ~100 open issues > on RedMine, and file new issues on GitHub as appropriate. I think > a manual review is a good idea anyway - there are some stale > issues etc which need some fresh eyes. > > Regards, > > Peter > _______________________________________________ > Biopython-dev mailing list > Biopython-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython-dev > From p.j.a.cock at googlemail.com Wed Jul 17 13:09:20 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 17 Jul 2013 18:09:20 +0100 Subject: [Biopython-dev] Bugzilla --> RedMine --> GitHub issues? In-Reply-To: References: Message-ID: On Wed, Jul 17, 2013 at 6:06 PM, Lenna Peterson wrote: > Hi Peter, > > I'd be happy to take a look at some of the issues over the next few days. 
> > Cheers, > > Lenna That would be great - and reviewing is worthwhile in itself. Shall we set an aim of starting to use GitHub issues tomorrow? Peter From eric.talevich at gmail.com Wed Jul 17 15:36:30 2013 From: eric.talevich at gmail.com (Eric Talevich) Date: Wed, 17 Jul 2013 12:36:30 -0700 Subject: [Biopython-dev] Codon Alignment GSoC Project Update In-Reply-To: References: Message-ID: On Sun, Jul 14, 2013 at 9:19 PM, Zheng Ruan wrote: > Hi all, > > I have an update of Codon Alignment project. It can be found at > http://zruanweb.com/. My plan for the following three weeks is also > there. Thanks! > > Best, > Zheng Ruan > Hi Zheng, Nice work. Regarding future plans: - "Add Numpy slice for CodonAlignment" -- Peter voiced an interest in optionally using Numpy arrays for multiple sequence alignments in general. I suggest waiting to reach a consensus with Peter before implementing this feature for CodonAlignment specifically. - "Construct codon alignment based on tblastn result" -- tblastn is just a heuristic for fast local alignment; instead, you can use dynamic programming for pairwise alignments (e.g. Bio.pairwise2). You could translate the nucleotide sequence in 3 frames, do local pairwise alignment of the query protein sequence (ungapped) vs. each translated frame, then stitch the alignments together as best you can. It might help to generate lists of the offsets of each translated codon relative to the original nucleotide sequence, e.g. range(0, 3*(N//3)+1, 3); range(1, 3*(N//3)+2, 3); range(2, 3*(N//3)+3, 3). In this case the build() procedure has two distinct phases: Align the protein sequence to the nucleotide sequence optimally, then insert the gaps of the protein MSA into the codon sequences. - In your Week 2 diary, you mentioned having a minimum score as an option in the alignment function, but I don't see it in the code. I can think of a few reasonable versions of this.
Reasonable options might be mismatch_count and untranslated_region_count for the number of codons that don't translate to the amino acid they're aligned to, and the number of skipped regions in the nucleotide sequence (presumably introns or UTRs in the input, although who knows what the user might want to do). If not specified by the user, the build() function should probably throw an error if those instances are encountered, rather than defaulting to some value. Scoring in the style of Exonerate seems unnecessarily open-ended. In your GSoC application, you mentioned a published method for alignment that might be relevant here. Did you determine that it wouldn't work here? Also see the Exonerate (http://www.biomedcentral.com/1471-2105/6/31), as their protein2genome alignment procedure does something similar to what you're attempting. Cheers, Eric From redmine at redmine.open-bio.org Wed Jul 17 18:48:15 2013 From: redmine at redmine.open-bio.org (redmine at redmine.open-bio.org) Date: Wed, 17 Jul 2013 22:48:15 +0000 Subject: [Biopython-dev] [Biopython - Bug #3444] (New) Missing DTD files Message-ID: Issue #3444 has been reported by Emanuil Tolev. ---------------------------------------- Bug #3444: Missing DTD files https://redmine.open-bio.org/issues/3444 Author: Emanuil Tolev Status: New Priority: Low Assignee: Category: Target version: URL: When running handle = Entrez.esearch(db="pubmed", term="Wellcome[GRNT]", retmax=100000) and then .efetch on all the 70k+ results ... I get warnings about the following missing DTD files: http://www.ncbi.nlm.nih.gov/corehtml/query/DTD/pubmed_130501.dtd http://www.ncbi.nlm.nih.gov/corehtml/query/DTD/nlmmedlinecitationset_130501.dtd http://www.ncbi.nlm.nih.gov/corehtml/query/DTD/bookdoc_130101.dtd I downloaded them locally, but the warnings say I should "file a bug" too to let you know. 
---------------------------------------- You have received this notification because this email was added to the New Issue Alert plugin -- You have received this notification because you have either subscribed to it, or are involved in it. To change your notification preferences, please click here and login: http://redmine.open-bio.org From zruan1991 at gmail.com Thu Jul 18 06:07:31 2013 From: zruan1991 at gmail.com (Zheng Ruan) Date: Thu, 18 Jul 2013 18:07:31 +0800 Subject: [Biopython-dev] Codon Alignment GSoC Project Update In-Reply-To: References: Message-ID: Hi Eric, Thanks for the feedback. I finished implementing backward frameshift support today. There are a few things I need to mention here. Regarding the future plan: there is already a Numpy-style slice implemented in MultipleSeqAlignment. I don't know why it doesn't work in CodonAlignment, as the __getitem__ method in CodonAlignment comes directly from MultipleSeqAlignment. But this is a small issue and will be fixed soon. I'm thinking about using tblastx output because it might be helpful for dealing with forward frameshifts (gaps) in translation. But it doesn't help when there are backward frameshifts, because some nucleotides will be used twice in a single translation (none of the alignment methods can detect this as far as I know). Actually, the underlying algorithm in Bio.CodonAlign.build is almost the same as pal2nal and is just the opposite of what you described. Instead of translating the nucleotide sequence in three reading frames, I back-translate the protein sequence into a degenerate codon regular expression and try to find a match between that regular expression and the given nucleotide sequence. Frameshift detection from scratch is difficult because my method is based on regular expressions and the search step is rather simple. pal2nal can deal with user-specified frameshifts but doesn't try to find them from raw sequences. My current code can handle up to 10 forward frameshifts (gaps) but only 1 backward frameshift in the sequence.
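A minimal sketch of the back-translation idea described above, for illustration only: each amino acid is expanded into an alternation of its codons, and the concatenated pattern is searched for in the nucleotide sequence. The names CODONS, protein_to_codon_regex and find_coding_span are hypothetical, the codon table is a small subset of the standard genetic code, and this is not the actual Bio.CodonAlign code; frameshift handling is left out entirely.

```python
import re

# Hypothetical, deliberately tiny codon table (a subset of the standard
# genetic code); a real implementation would cover all amino acids.
CODONS = {
    "M": ["ATG"],
    "W": ["TGG"],
    "K": ["AAA", "AAG"],
    "F": ["TTT", "TTC"],
    "L": ["TTA", "TTG", "CTT", "CTC", "CTA", "CTG"],
}


def protein_to_codon_regex(protein):
    """Back-translate a protein string into a compiled codon regex."""
    parts = []
    for amino_acid in protein:
        # One non-capturing group per residue, listing every codon for it.
        parts.append("(?:" + "|".join(CODONS[amino_acid]) + ")")
    return re.compile("".join(parts))


def find_coding_span(protein, nucleotide):
    """Return (start, end) of the first region encoding protein, or None."""
    match = protein_to_codon_regex(protein).search(nucleotide)
    return None if match is None else match.span()
```

Because the pattern requires every residue to match one of its codons in frame, a single inserted or deleted base makes the search fail, which is why frameshifts need separate handling in this kind of approach.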
I will add support for multiple backward frameshifts in the future. However, I don't expect this function to be able to handle every situation. For example, if two frameshift events happen too close together, it is really hard to figure this out. Are such frameshifts seen often in real biological data? Another question is about the actual usage of shifted alignments. I am unaware of any statistical methods that account for this. Normally, when people know there is a frameshift, they have probably already figured out where it happens. Therefore, it's better to ask the user to tell the program where the frameshift lies. Functions to facilitate this step will be added. A scoring scheme is pending since the score needs to account for all situations (mismatches, frameshifts). I will add it when all the functions have been tested and shown to be correct. It is necessary because the mechanism for mismatch detection is very permissive: in theory it can align a protein sequence to a nucleotide sequence that has no relationship to it. Therefore a maximum tolerance should be set. As for MACSE, it employs a totally different strategy and is not optimal. I will have a look at Exonerate and its protein2genome procedure. Thanks! Best, Zheng Ruan On Thu, Jul 18, 2013 at 3:36 AM, Eric Talevich wrote: > On Sun, Jul 14, 2013 at 9:19 PM, Zheng Ruan wrote: > >> Hi all, >> >> I have an update of Codon Alignment project. It can be found at >> http://zruanweb.com/. My plan for the following three weeks is also >> there. Thanks! >> >> Best, >> Zheng Ruan >> > > Hi Zheng, > > Nice work. Regarding future plans: > > - "Add Numpy slice for CodonAlignment" -- Peter voiced an interested in > optionally using Numpy arrays for multiple sequence alignments in general. > I suggest waiting to reach a consensus with Peter before implementing this > feature for CodonAlignment specifically.
> > - "Construct codon alignment based on tblastn result" -- tblastn is just a > heuristic for fast local alignment; instead, you can use dynamic > programming for pairwise alignments (e.g. Bio.pairwise2). You could > translate the nucleotide sequence in 3 frames, do local pairwise alignment > of the query protein sequence (ungapped) vs. each translated frame, then > stitch the alignments together as best you can. It might help to generate > lists of the offsets of each translated codon relative to the original > nucleotide sequence, e.g. range(0, 3*(N//3)+1, 3); range(1, 3*(N//3)+2, 3); > range(2, 3*(N//3)+3, 3). In this case the build() procedure has two > distinct phases: Align the protein sequence to the nucleotide sequence > optimally, then insert the gaps of the protein MSA into the codon sequences. > - In your Week 2 diary, you mentioned having a minimum score as an option > in the alignment function, but I don't see it in the code. I can think of a > few reasonable versions of this. Reasonable options might be mismatch_count > and untranslated_region_count for the number of codons that don't translate > to the amino acid they're aligned to, and the number of skipped regions in > the nucleotide sequence (presumably introns or UTRs in the input, although > who knows what the user might want to do). If not specified by the user, > the build() function should probably throw an error if those instances are > encountered, rather than defaulting to some value. Scoring in the style of > Exonerate seems unnecessarily open-ended. > > In your GSoC application, you mentioned a published method for alignment > that might be relevant here. Did you determine that it wouldn't work here? > Also see the Exonerate (http://www.biomedcentral.com/1471-2105/6/31), as > their protein2genome alignment procedure does something similar to what > you're attempting. 
> > Cheers, > Eric > From p.j.a.cock at googlemail.com Thu Jul 18 08:45:16 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Thu, 18 Jul 2013 13:45:16 +0100 Subject: [Biopython-dev] Codon Alignment GSoC Project Update In-Reply-To: References: Message-ID: On Thu, Jul 18, 2013 at 11:07 AM, Zheng Ruan wrote: > Hi Eric, > > Thanks for the feedback. I finished implementing backward frameshift today. > There are a few things I need to mention here. That sounds good :) Would you mind posting your update emails (or a summary so far) on your blog too please? http://zr1991.blogspot.de Thanks, Peter From zruan1991 at gmail.com Sun Jul 21 22:55:50 2013 From: zruan1991 at gmail.com (Zheng Ruan) Date: Mon, 22 Jul 2013 10:55:50 +0800 Subject: [Biopython-dev] Weekly Update for Codon Alignment GSoC project Message-ID: Hi, I post an update for the project last week in my blog as well as my plan next week. As the midterm evaluation deadline is approaching, I also include this into it. Thanks for your comments and suggestions. Best, Zheng Ruan From yeyanbo289 at gmail.com Mon Jul 22 01:44:08 2013 From: yeyanbo289 at gmail.com (Yanbo Ye) Date: Mon, 22 Jul 2013 13:44:08 +0800 Subject: [Biopython-dev] GSOC weekly update 6 Message-ID: Hi all, I post an update for the Biopython.Phylo project here: http://blog.yeyanbo.com/posts/google-summer-of-code-6.html Thanks, Yanbo -- *Yanbo Ye* *Guangzhou Institutes of Biomedicine and Health, * *Chinese Academy of Sciences* *190 Kaiyuan Avenue, Science Park, Guangzhou, China** * * * *Email: ye_yanbo at gibh.ac.cn* *Web: http://www.yeyanbo.com* *Phone: (86)-020-32093810* From p.j.a.cock at googlemail.com Mon Jul 22 07:43:58 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Mon, 22 Jul 2013 12:43:58 +0100 Subject: [Biopython-dev] Bugzilla --> RedMine --> GitHub issues? 
In-Reply-To: References: Message-ID: On Wed, Jul 17, 2013 at 6:09 PM, Peter Cock wrote: > On Wed, Jul 17, 2013 at 6:06 PM, Lenna Peterson wrote: >> Hi Peter, >> >> I'd be happy to take a look at some of the issues over the next few days. >> >> Cheers, >> >> Lenna > > That would be great - and reviewing it worthy in itself. > > Shall we set an aim of starting to use GitHub issues tomorrow? > > Peter Well this isn't tomorrow - but I'm back from BOSC 2013 in Germany now. In the absence of any dissenting views, and the fact that RedMine is also offline right now (which I've raised with the OBF admin volunteers), I've enabled GitHub issues & linked to this from the main page: https://github.com/biopython/biopython/issues You'll notice there are already lots of issues there - all pull request related. This is one reason why an automated import of the old Bugzilla/RedMine issues could be complicated. Various other bits of our documentation will need to be updated... Peter From p.j.a.cock at googlemail.com Mon Jul 22 10:36:06 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Mon, 22 Jul 2013 15:36:06 +0100 Subject: [Biopython-dev] Bugzilla --> RedMine --> GitHub issues? In-Reply-To: References: Message-ID: On Mon, Jul 22, 2013 at 12:43 PM, Peter Cock wrote: > > Well this isn't tomorrow - but I'm back from BOSC 2013 in Germany now. > > In the absence of any dissenting views, and the fact that RedMine is > also offline right now (which I've raised with the OBF admin volunteers), Fixed again :) > I've enabled GitHub issues & linked to this from the main page: > > https://github.com/biopython/biopython/issues > > You'll notice there are already lots of issues there - all pull request > related. This is one reason why an automated import of the old > Bugzilla/RedMine issues could be complicated. > > Various other bits of our documentation will need to be updated... Hopefully done now, e.g. 
https://github.com/biopython/biopython/commit/e836f4fadde494a8253b4a4114a36ff3259eb079 Note that there doesn't seem to be a way to turn off new issues in a RedMine project - there are hacks via removing the ability from the roles, but I fear that would affect the other projects still using the RedMine server (e.g. BioPerl). Instead we may just have to do the triage/migration and then drop the links to the old RedMine server from the website etc. Peter From mjldehoon at yahoo.com Wed Jul 24 03:31:08 2013 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Wed, 24 Jul 2013 00:31:08 -0700 (PDT) Subject: [Biopython-dev] setuptools breaking biopython-1.62b installation Message-ID: <1374651068.98742.YahooMailNeo@web164005.mail.gq1.yahoo.com> Dear all, When trying to install Biopython on a new MacBook, I get the following error message when I run "python setup.py build": tkx330:biopython-1.62b mdehoon$ python setup.py build Traceback (most recent call last): File "setup.py", line 109, in <module> from setuptools import setup, Command File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/distribute-0.6.45-py2.7.egg/setuptools/__init__.py", line 2, in <module> from setuptools.extension import Extension, Library ... File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/distribute-0.6.45-py2.7.egg/pkg_resources.py", line 1211, in get_metadata return self._get(self._fn(self.egg_info,name)) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/distribute-0.6.45-py2.7.egg/pkg_resources.py", line 1326, in _get stream = open(path, 'rb') IOError: [Errno 13] Permission denied: '/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/python_dateutil-2.1-py2.7.egg/EGG-INFO/top_level.txt' This looks like a simple problem with file permissions, and I hope it can be solved easily.
Still, it is quite discouraging to first-time users of Biopython. Do we actually need setuptools? Looking at setup.py, it seems that distutils is sufficient for our needs. If so, let's remove the dependency on setuptools. Best, -Michiel. From p.j.a.cock at googlemail.com Wed Jul 24 05:13:06 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 24 Jul 2013 10:13:06 +0100 Subject: [Biopython-dev] Adopting BSD 3-Clause license for Biopython? In-Reply-To: References: Message-ID: Hello all, Something Brad and I chatted about during the BOSC 2013 CodeFest was should we switch the Biopython licence to something which is formally approved as "Open Source" by The Open Source Initiative (OSI): http://opensource.org/licenses The current Biopython License is very short and liberal, and I have long described it as an MIT/BSD type licence. However the actual wording matches neither of these exactly (as far as I could tell): http://biopython.org/DIST/LICENSE https://github.com/biopython/biopython/blob/master/LICENSE In theory we could ask the OSI to approve our current license, but as they explain "yet another license" is not a good thing to encourage: http://opensource.org/proliferation Brad and I thought it would be reasonable to adopt a standard MIT/BSD licence instead. Note that the following lack a "no endorsement" clause which we have currently: http://opensource.org/licenses/MIT http://opensource.org/licenses/BSD-2-Clause Therefore this looks like the closest match: http://opensource.org/licenses/BSD-3-Clause i.e. The BSD 3-Clause ("BSD New" or "BSD Simplified") license. This is also used by the NumPy project and many other Python libraries. Assuming people agree this is a good idea, we can start doing this on a file-by-file basis (checking for approval from the named copyright holders) and to be rigorous check with every named contributor in the CONTRIB or NEWS files. 
Peter From tiagoantao at gmail.com Wed Jul 24 05:23:08 2013 From: tiagoantao at gmail.com (Tiago Antão) Date: Wed, 24 Jul 2013 10:23:08 +0100 Subject: [Biopython-dev] Adopting BSD 3-Clause license for Biopython? In-Reply-To: References: Message-ID: +1 to getting rid of a non-standard license. If BSD 3-clause is the closest, then I would change. This is irrespective of license preferences: a potentially unfruitful discussion would be around "the best Free/Open license". This is just about getting under the umbrella of a standard, OSI-approved license. A great idea. On Wed, Jul 24, 2013 at 10:13 AM, Peter Cock wrote: > Hello all, > > Something Brad and I chatted about during the BOSC 2013 CodeFest > was should we switch the Biopython licence to something which is > formally approved as "Open Source" by The Open Source Initiative > (OSI): http://opensource.org/licenses > > The current Biopython License is very short and liberal, and I have > long described it as an MIT/BSD type licence. However the actual > wording matches neither of these exactly (as far as I could tell): > > http://biopython.org/DIST/LICENSE > https://github.com/biopython/biopython/blob/master/LICENSE > > In theory we could ask the OSI to approve our current license, but as > they explain "yet another license" is not a good thing to encourage: > http://opensource.org/proliferation > > Brad and I thought it would be reasonable to adopt a standard > MIT/BSD licence instead. > > Note that the following lack a "no endorsement" clause which we > have currently: > > http://opensource.org/licenses/MIT > http://opensource.org/licenses/BSD-2-Clause > > Therefore this looks like the closest match: > > http://opensource.org/licenses/BSD-3-Clause > > i.e. The BSD 3-Clause ("BSD New" or "BSD Simplified") license. > This is also used by the NumPy project and many other Python > libraries.
> > Assuming people agree this is a good idea, we can start doing > this on a file-by-file basis (checking for approval from the named > copyright holders) and to be rigorous check with every named > contributor in the CONTRIB or NEWS files. > > Peter > _______________________________________________ > Biopython-dev mailing list > Biopython-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython-dev > -- "Grant me chastity and continence, but not yet" - St Augustine From p.j.a.cock at googlemail.com Wed Jul 24 05:26:12 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 24 Jul 2013 10:26:12 +0100 Subject: [Biopython-dev] setuptools breaking biopython-1.62b installation In-Reply-To: <1374651068.98742.YahooMailNeo@web164005.mail.gq1.yahoo.com> References: <1374651068.98742.YahooMailNeo@web164005.mail.gq1.yahoo.com> Message-ID: On Wed, Jul 24, 2013 at 8:31 AM, Michiel de Hoon wrote: > Dear all, > > When trying to install Biopython on a new MacBook, I get the following error message when I run "python setup.py build": > > tkx330:biopython-1.62b mdehoon$ python setup.py build > Traceback (most recent call last): > File "setup.py", line 109, in > from setuptools import setup, Command > File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/distribute-0.6.45-py2.7.egg/setuptools/__init__.py", line 2, in > from setuptools.extension import Extension, Library > ...
> File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/distribute-0.6.45-py2.7.egg/pkg_resources.py", line 1211, in get_metadata > return self._get(self._fn(self.egg_info,name)) > File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/distribute-0.6.45-py2.7.egg/pkg_resources.py", line 1326, in _get > stream = open(path, 'rb') > IOError: [Errno 13] Permission denied: '/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/python_dateutil-2.1-py2.7.egg/EGG-INFO/top_level.txt' > > This looks like a simple problem with file permissions, and I hope can be solved easily. My guess would be there is something broken with your dateutil install; a little Google searching shows very similar issues with other packages. > Still, it is quite discouraging to first-time users of Biopython. Yes, but I'm not sure it is our fault :( > Do we actually need setuptools? > Looking at setup.py, it seems that distutils is sufficient for our needs. > If so, let's remove the dependency on setuptools. I will have to pass that question to Brad, Peter From p.j.a.cock at googlemail.com Wed Jul 24 05:31:02 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 24 Jul 2013 10:31:02 +0100 Subject: [Biopython-dev] Adopting BSD 3-Clause license for Biopython? In-Reply-To: References: Message-ID: On Wed, Jul 24, 2013 at 10:23 AM, Tiago Antão wrote: > +1 to getting rid of an unstandard license. If BSD 3-clause it is the > closest, then I would change. Having a few more eyes confirm this would be good. Anything very close makes the switch easier to justify. > This irrespective of license preferences: A potentially unfruitful > discussion would be around "the best Free/Open license". I really don't want to go down that route - the Python OSS community by and large use liberal licenses in the MIT/BSD family. The fact that NumPy uses the BSD 3-clause licence is a good standard to follow.
Brad said he prefers the MIT licence (and it is shorter). > This is just getting below the umbrella of a standard, OSI-approved license. > > A great idea. That's the idea - that and the fact that any non-standard license (even a nice open one) is one more barrier to adoption - especially in companies or institutes with lawyers that care about details. This was an issue which came up during the BOSC 2013 conference. Now since our current licence is short and simple, this isn't such an issue - but it is a small barrier all the same. This also makes life simpler for things like the PyPI license tagging and so on. Peter From mjldehoon at yahoo.com Wed Jul 24 05:32:45 2013 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Wed, 24 Jul 2013 02:32:45 -0700 (PDT) Subject: [Biopython-dev] setuptools breaking biopython-1.62b installation In-Reply-To: References: <1374651068.98742.YahooMailNeo@web164005.mail.gq1.yahoo.com> Message-ID: <1374658365.47986.YahooMailNeo@web164004.mail.gq1.yahoo.com> Hi Peter, > > Still, it is quite discouraging to first-time users of Biopython. > > Yes, but I'm not sure it is our fault :( Sure, I know it's not our fault. But still it's avoidable. Best, -Michiel From christian at brueffer.de Wed Jul 24 05:46:24 2013 From: christian at brueffer.de (Christian Brueffer) Date: Wed, 24 Jul 2013 11:46:24 +0200 Subject: [Biopython-dev] Adopting BSD 3-Clause license for Biopython? In-Reply-To: References: Message-ID: <51EFA270.2050903@brueffer.de> On 7/24/13 11:13 , Peter Cock wrote: [...] > > Therefore this looks like the closest match: > > http://opensource.org/licenses/BSD-3-Clause > > i.e. The BSD 3-Clause ("BSD New" or "BSD Simplified") license. > This is also used by the NumPy project and many other Python > libraries.
> > Assuming people agree this is a good idea, we can start doing > this on a file-by-file basis (checking for approval from the named > copyright holders) and to be rigorous check with every named > contributor in the CONTRIB or NEWS files. > I welcome this initiative and I agree that BSD 3-clause seems to be the closest match. Cheers, Chris From p.j.a.cock at googlemail.com Wed Jul 24 06:02:24 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 24 Jul 2013 11:02:24 +0100 Subject: [Biopython-dev] setuptools breaking biopython-1.62b installation In-Reply-To: <86a9lcl1nt.fsf@fastmail.fm> References: <1374651068.98742.YahooMailNeo@web164005.mail.gq1.yahoo.com> <86a9lcl1nt.fsf@fastmail.fm> Message-ID: On Wed, Jul 24, 2013 at 10:58 AM, Brad Chapman wrote: > > Peter and Michiel; > >>> Do we actually need setuptools? >>> Looking at setup.py, it seems that distutils is sufficient for our needs. >>> If so, let's remove the dependency on setuptools. > > We used setuptools/distribute to install dependencies, although > practically this doesn't work well since pip doesn't finish NumPy > installation before installing Biopython. So I'm fine with taking it out > if you want to simplify the setup and avoid the extra dependency. Sounds like a plan - but we should all test this change, especially users of PIP, easy_install, virtual env etc. It is major enough to warrant a second beta? Peter From chapmanb at 50mail.com Wed Jul 24 05:58:46 2013 From: chapmanb at 50mail.com (Brad Chapman) Date: Wed, 24 Jul 2013 05:58:46 -0400 Subject: [Biopython-dev] setuptools breaking biopython-1.62b installation In-Reply-To: References: <1374651068.98742.YahooMailNeo@web164005.mail.gq1.yahoo.com> Message-ID: <86a9lcl1nt.fsf@fastmail.fm> Peter and Michiel; >> Do we actually need setuptools? >> Looking at setup.py, it seems that distutils is sufficient for our needs. >> If so, let's remove the dependency on setuptools. 
We used setuptools/distribute to install dependencies, although practically this doesn't work well since pip doesn't finish NumPy installation before installing Biopython. So I'm fine with taking it out if you want to simplify the setup and avoid the extra dependency. Brad From mjldehoon at yahoo.com Wed Jul 24 06:51:20 2013 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Wed, 24 Jul 2013 03:51:20 -0700 (PDT) Subject: [Biopython-dev] setuptools breaking biopython-1.62b installation In-Reply-To: References: <1374651068.98742.YahooMailNeo@web164005.mail.gq1.yahoo.com> <86a9lcl1nt.fsf@fastmail.fm> Message-ID: <1374663080.82953.YahooMailNeo@web164003.mail.gq1.yahoo.com> Hi Peter, > It is major enough to warrant a second beta? I would assume that we won't need a second beta. If it does turn out that there are installation problems with Biopython 1.62, we can always release a Biopython 1.63. Best, -Michiel. From p.j.a.cock at googlemail.com Thu Jul 25 11:05:19 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Thu, 25 Jul 2013 16:05:19 +0100 Subject: [Biopython-dev] setuptools breaking biopython-1.62b installation In-Reply-To: References: <1374651068.98742.YahooMailNeo@web164005.mail.gq1.yahoo.com> <86a9lcl1nt.fsf@fastmail.fm> Message-ID: On Wed, Jul 24, 2013 at 11:02 AM, Peter Cock wrote: > On Wed, Jul 24, 2013 at 10:58 AM, Brad Chapman wrote: >> >> Peter and Michiel; >> >>>> Do we actually need setuptools? >>>> Looking at setup.py, it seems that distutils is sufficient for our needs. >>>> If so, let's remove the dependency on setuptools. >> >> We used setuptools/distribute to install dependencies, although >> practically this doesn't work well since pip doesn't finish NumPy >> installation before installing Biopython. So I'm fine with taking it out >> if you want to simplify the setup and avoid the extra dependency. > > Sounds like a plan - but we should all test this change, especially > users of PIP, easy_install, virtual env etc. 
> So who's going to do the commit - Brad or Michiel? Peter From mjldehoon at yahoo.com Thu Jul 25 20:09:11 2013 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Thu, 25 Jul 2013 17:09:11 -0700 (PDT) Subject: [Biopython-dev] setuptools breaking biopython-1.62b installation In-Reply-To: References: <1374651068.98742.YahooMailNeo@web164005.mail.gq1.yahoo.com> <86a9lcl1nt.fsf@fastmail.fm> Message-ID: <1374797351.81889.YahooMailNeo@web164002.mail.gq1.yahoo.com> Brad, can you do it? Best, -Michiel. ________________________________ From: Peter Cock To: Brad Chapman ; Michiel de Hoon Cc: "biopython-dev at biopython.org" Sent: Friday, July 26, 2013 12:05 AM Subject: Re: [Biopython-dev] setuptools breaking biopython-1.62b installation On Wed, Jul 24, 2013 at 11:02 AM, Peter Cock wrote: > On Wed, Jul 24, 2013 at 10:58 AM, Brad Chapman wrote: >> >> Peter and Michiel; >> >>>> Do we actually need setuptools? >>>> Looking at setup.py, it seems that distutils is sufficient for our needs. >>>> If so, let's remove the dependency on setuptools. >> >> We used setuptools/distribute to install dependencies, although >> practically this doesn't work well since pip doesn't finish NumPy >> installation before installing Biopython. So I'm fine with taking it out >> if you want to simplify the setup and avoid the extra dependency. > > Sounds like a plan - but we should all test this change, especially > users of PIP, easy_install, virtual env etc. > So who's going to do the commit - Brad or Michiel? 
Peter From yeyanbo289 at gmail.com Sun Jul 28 22:49:09 2013 From: yeyanbo289 at gmail.com (Yanbo Ye) Date: Mon, 29 Jul 2013 10:49:09 +0800 Subject: [Biopython-dev] GSOC weekly update 7 Message-ID: Hi all, I post an update for the Biopython.Phylo project here: http://blog.yeyanbo.com/posts/google-summer-of-code-7.html Cheers, Yanbo -- *Yanbo Ye* *Guangzhou Institutes of Biomedicine and Health, * *Chinese Academy of Sciences* *190 Kaiyuan Avenue, Science Park, Guangzhou, China** * * * *Email: ye_yanbo at gibh.ac.cn* *Web: http://www.yeyanbo.com* *Phone: (86)-020-32093810* From zruan1991 at gmail.com Mon Jul 29 02:50:06 2013 From: zruan1991 at gmail.com (Zheng Ruan) Date: Mon, 29 Jul 2013 02:50:06 -0400 Subject: [Biopython-dev] GSoC Update for Codon Alignment Message-ID: Hi all, An update of Codon Alignment GSoC can be found at http://zruanweb.com/. Thanks! Best, Zheng Ruan From eric.talevich at gmail.com Mon Jul 29 12:52:03 2013 From: eric.talevich at gmail.com (Eric Talevich) Date: Mon, 29 Jul 2013 09:52:03 -0700 Subject: [Biopython-dev] Weekly Update for Codon Alignment GSoC project In-Reply-To: References: Message-ID: On Sun, Jul 21, 2013 at 7:55 PM, Zheng Ruan wrote: > Hi, > > I post an update for the project last week in my blog as > well as my plan next week. As the midterm evaluation deadline is > approaching, I also include this into it. Thanks for your comments and > suggestions. > > Best, > Zheng Ruan > Hey Zheng, Great progress so far. Your implementation of codon alignment up to this point looks more than adequate to me. A couple thoughts: - Can your implementation detect inserts (forward frameshifts) of more than 3 nucleotides, as might be introduced by introns? Or just 1-2 bases? - Same question for the backward frameshift implementation. One biological cause for these backward shifts is ribosomal slippage -- these are usually short, e.g. 
1 base, so it is not urgent for your implementation to handle larger backward shifts if this would be more difficult. (For drastic differences between protein and nucleotide sequences, a bioinformatician would normally use some variant of BLAST, exonerate, or another local alignment tool, rather than expecting this codon alignment algorithm to catch and handle every possibility.) Have a safe trip home! -Eric From zruan1991 at gmail.com Mon Jul 29 22:58:01 2013 From: zruan1991 at gmail.com (Zheng Ruan) Date: Tue, 30 Jul 2013 10:58:01 +0800 Subject: [Biopython-dev] Weekly Update for Codon Alignment GSoC project In-Reply-To: References: Message-ID: Thanks Eric, It will not be difficult to allow more than 3 nucleotides forward frameshift. I will include this into my plan. For the backward frameshift support, more nucleotides support is desirable but difficult. Some function to help construct codon alignment with other software will be considered. Best, Ruan On Tue, Jul 30, 2013 at 12:52 AM, Eric Talevich wrote: > On Sun, Jul 21, 2013 at 7:55 PM, Zheng Ruan wrote: > >> Hi, >> >> I post an update for the project last week in my blog as >> well as my plan next week. As the midterm evaluation deadline is >> approaching, I also include this into it. Thanks for your comments and >> suggestions. >> >> Best, >> Zheng Ruan >> > > Hey Zheng, > > Great progress so far. Your implementation of codon alignment up to this > point looks more than adequate to me. A couple thoughts: > > - Can your implementation detect inserts (forward frameshifts) of more > than 3 nucleotides, as might be introduced by introns? Or just 1-2 bases? > > - Same question for the backward frameshift implementation. One biological > cause for these backward shifts is ribosomal slippage -- these are usually > short, e.g. 1 base, so it is not urgent for your implementation to handle > larger backward shifts if this would be more difficult. 
(For drastic > differences between protein and nucleotide sequences, a bioinformatician > would normally use some variant of BLAST, exonerate, or another local > alignment tool, rather than expecting this codon alignment algorithm to > catch and handle every possibility.) > > Have a safe trip home! > > -Eric > From ben at benfulton.net Tue Jul 30 20:43:31 2013 From: ben at benfulton.net (Ben Fulton) Date: Tue, 30 Jul 2013 20:43:31 -0400 Subject: [Biopython-dev] 1.62b test coverage report Message-ID: I ran Ned Batchelder's coverage tool against the 1.62 beta code to see how much code is covered by tests. The overall total was 74% which is pretty respectable. I ran the tests on a fairly fresh machine, which meant I had to install a lot of software, some of which I either didn't get installed properly, or the tests are out of date, or there were failures for some other reason. I ended up having to skip seven test files: Dialign_Tool EmbossPhylipNew Mafft PopGen_DFDist PopGen_FDist XXMotif phyml There were three tests I managed to get running but still had failures: FastTree NCBI_BLAST Prank_tool You can look at the report on my website at http://benfulton.net/BioPython162_Coverage/ . Please let me know if you have comments or questions, or can tell me what I did wrong on the above tests :) From p.j.a.cock at googlemail.com Wed Jul 31 03:40:24 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 31 Jul 2013 08:40:24 +0100 Subject: [Biopython-dev] 1.62b test coverage report In-Reply-To: References: Message-ID: On Wednesday, July 31, 2013, Ben Fulton wrote: > I ran Ned Batchelder's coverage tool against the 1.62 beta code to see how > much code is covered by tests. The overall total was 74% which is pretty > respectable. > > I ran the tests on a fairly fresh machine, which meant I had to install a > lot of software, some of which I either didn't get installed properly, or > the tests are out of date, or there were failures for some other reason. 
I > ended up having to skip seven test files: > > Dialign_Tool > EmbossPhylipNew > Mafft > PopGen_DFDist > PopGen_FDist > XXMotif > phyml I'm pretty sure I have some or all of those setup on at least one of my test machines, so with a little more work together we can try to resolve those (which may mean updating the docs). > There were three tests I managed to get running but still had failures: > > FastTree > NCBI_BLAST > Prank_tool A few more details here would be very good - what versions of the tools did you have and what error did the tests give? (I just fixed a warning from new options added in BLAST 2.2.28+ committed yesterday) > You can look at the report on my website at > http://benfulton.net/BioPython162_Coverage/ . Please let me know if you > have comments or questions, or can tell me what I did wrong on the above > tests :) > Thanks Ben - some of the modules with zero or low coverage are deprecated so don't worry me - others though probably do need to be looked at. Would anyone like to make a priority list? This would then be something we can point volunteers at who ask for suggestions of something they can contribute? Thanks, Peter From sharma409 at gmail.com Wed Jul 31 14:12:35 2013 From: sharma409 at gmail.com (Rishi Sharma) Date: Wed, 31 Jul 2013 11:12:35 -0700 Subject: [Biopython-dev] Saving a Trie Message-ID: Hello, I was was wondering how i might write a Trie to file. It doesn't seem to have a write() method so pickling won't work. I'm not sure how the biopython save is intended to work, so I guess that is what I'm asking. Thanks for your help, Rishi Sharma From arklenna at gmail.com Wed Jul 31 15:17:59 2013 From: arklenna at gmail.com (Lenna Peterson) Date: Wed, 31 Jul 2013 15:17:59 -0400 Subject: [Biopython-dev] Saving a Trie In-Reply-To: References: Message-ID: I recall a bug report about this on redmine, but I can't find it or get the site to load at all (although downforeveryoneorjustme claims it's up). 
I don't have experience using pickle on non-default objects, but I wasn't aware an object needed a specific method to be pickled. What error does it throw when you pickle.dump() it? I did find a somewhat related SO question that suggests trie pickling could be a non-straightforward proposition: http://stackoverflow.com/questions/2134706/hitting-maximum-recursion-depth-using-pythons-pickle-cpickle Cheers, Lenna On Wed, Jul 31, 2013 at 2:12 PM, Rishi Sharma wrote: > Hello, > > I was was wondering how i might write a Trie to file. It doesn't seem to > have a write() method so pickling won't work. I'm not sure how the > biopython save is intended to work, so I guess that is what I'm asking. > > Thanks for your help, > Rishi Sharma > _______________________________________________ > Biopython-dev mailing list > Biopython-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython-dev > From sharma409 at gmail.com Wed Jul 31 15:20:49 2013 From: sharma409 at gmail.com (Rishi Sharma) Date: Wed, 31 Jul 2013 12:20:49 -0700 Subject: [Biopython-dev] Saving a Trie In-Reply-To: References: Message-ID: TypeError: can't pickle trie objects On Wed, Jul 31, 2013 at 12:17 PM, Lenna Peterson wrote: > I recall a bug report about this on redmine, but I can't find it or get > the site to load at all (although downforeveryoneorjustme claims it's up). > > I don't have experience using pickle on non-default objects, but I wasn't > aware an object needed a specific method to be pickled. What error does it > throw when you pickle.dump() it? > > I did find a somewhat related SO question that suggests trie pickling > could be a non-straightforward proposition: > > > http://stackoverflow.com/questions/2134706/hitting-maximum-recursion-depth-using-pythons-pickle-cpickle > > Cheers, > > Lenna > > > On Wed, Jul 31, 2013 at 2:12 PM, Rishi Sharma wrote: > >> Hello, >> >> I was was wondering how i might write a Trie to file. 
It doesn't seem to >> have a write() method so pickling won't work. I'm not sure how the >> biopython save is intended to work, so I guess that is what I'm asking. >> >> Thanks for your help, >> Rishi Sharma >> _______________________________________________ >> Biopython-dev mailing list >> Biopython-dev at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/biopython-dev >> > > From p.j.a.cock at googlemail.com Wed Jul 31 17:59:21 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 31 Jul 2013 22:59:21 +0100 Subject: [Biopython-dev] Saving a Trie In-Reply-To: References: Message-ID: On Wednesday, July 31, 2013, Rishi Sharma wrote: > Hello, > > I was was wondering how i might write a Trie to file. It doesn't seem to > have a write() method so pickling won't work. I'm not sure how the > biopython save is intended to work, so I guess that is what I'm asking. > > Hi Rishi, You need to do something like this (untested - I'm not at a computer):

from Bio import trie
f = open("my-data.dat", "w")
tr = trie.trie()
# fill in the trie
trie.save(f, tr)  # pass the trie object, not the module
f.close()

And to read it back,

from Bio import trie
f = open('my-data.dat', 'r')
tr = trie.load(f)
f.close()

Peter From sharma409 at gmail.com Wed Jul 31 18:05:40 2013 From: sharma409 at gmail.com (Rishi Sharma) Date: Wed, 31 Jul 2013 15:05:40 -0700 Subject: [Biopython-dev] Saving a Trie In-Reply-To: References: Message-ID: Ah yes this worked. I was doing something stupid by importing trie from Bio.trie and confusing myself between the module and the method. Thank you! On Wed, Jul 31, 2013 at 2:59 PM, Peter Cock wrote: > > On Wednesday, July 31, 2013, Rishi Sharma wrote: > >> Hello, >> >> I was was wondering how i might write a Trie to file. It doesn't seem to >> have a write() method so pickling won't work. I'm not sure how the >> biopython save is intended to work, so I guess that is what I'm asking.
>> >> > Hi Rishi, > > You need to do something like this (untested - I'm not at a computer): > > from Bio import trie > f = open("my-data.dat", "w") > tr = trie.trie() > #fill in the trie > trie.save(f, trie) > f.close() > > And to read it back, > > from Bio import trie > f = open('my-data.dat', 'r') > tr = trie.load(f) > f.close() > > Peter > > From yeyanbo289 at gmail.com Mon Jul 1 09:29:34 2013 From: yeyanbo289 at gmail.com (Yanbo Ye) Date: Mon, 1 Jul 2013 17:29:34 +0800 Subject: [Biopython-dev] gsoc weekly update Message-ID: Hi all, I post an update for the project 'Phylogenetics in Biopython: Filling in the gaps'. http://blog.yeyanbo.com/posts/google-summer-of-code-4.html Best, Yanbo -- ??? ???????????????? Yanbo Ye Bioinformatics Group, Wuhan Institute Of Virology, Chinese Academy of Sciences From mtholder at gmail.com Mon Jul 1 15:15:47 2013 From: mtholder at gmail.com (Mark Holder) Date: Mon, 1 Jul 2013 10:15:47 -0500 Subject: [Biopython-dev] gsoc weekly update In-Reply-To: References: Message-ID: Hi Yanbo, It looks like you are making nice progress. 1. A comment on tests: I noticed that the upgma and nj tests (from last week) just verify that the trees produced are of the right class and can be written as newick. It is probably worth strengthening those tests to make them check that the branch lengths are correct. 2. A thought on character weighting: You might think about adding support for tree construction from a "compressed" input character matrix. By compressed, I mean one in which you store unique data patterns (unique columns in an alignment) and a pattern weight for that column rather than storing every character separately. The pattern weight is typically the number of times that the pattern was observed in the original ("raw") character matrix (but it is nice to support floats as weights). 
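As a rough illustration of the compression described above, here is a minimal sketch in plain Python. It is not Biopython code: `compress_columns` and `count_variable_sites` are hypothetical names, and the alignment is assumed to be a plain list of equal-length strings, one per taxon.

```python
from collections import Counter


def compress_columns(rows):
    """Collapse identical alignment columns into pattern -> weight.

    `rows` is a list of equal-length sequence strings, one per taxon
    (an assumption made for this sketch). Each key of the result is a
    tuple holding one character per taxon; each value is the number of
    times that column pattern occurs in the raw matrix.
    """
    if len({len(row) for row in rows}) > 1:
        raise ValueError("all rows must have the same length")
    # zip(*rows) walks the alignment column by column.
    return dict(Counter(zip(*rows)))


def count_variable_sites(weights):
    """Example of a per-site statistic computed once per pattern.

    A pattern is variable if it contains more than one distinct state;
    its weight says how many raw columns it stands for.
    """
    return sum(w for pattern, w in weights.items() if len(set(pattern)) > 1)


alignment = ["ACGTAC", "ACGTAC", "ATGTAT"]
weights = compress_columns(alignment)
variable = count_variable_sites(weights)
```

Any score that treats characters as independent can be handled the same way: compute it once per unique pattern and multiply by the weight. Float weights would work unchanged if they were supplied directly instead of being counted.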
Richer implementations of a compressed matrix also store the complete mapping of data patterns to original character indices to enable the recreation of the original matrix, but that feature is rarely used in tree inference. I don't know if biopython has this form of data compression implemented, but it is very widely used in phylogenetic inference. It can be used in any inference technique that treats characters as independent and identically distributed. If biopython does not support this form of compression, then it may be worth writing the TreeConstruction code to work with character weights in the event that someone else implements this feature. Or you could at least add a #\TODO comment in the code where ever character weighting would be used (so that it would be easy to fix later). all the best, Mark PS: I've been travelling, but I'm back in Lawrence now. I'm happy to chat with you this week about parsimony algorithms if you have questions. On Mon, Jul 1, 2013 at 4:29 AM, Yanbo Ye wrote: > Hi all, > > I post an update for the project 'Phylogenetics in Biopython: Filling in the > gaps'. > http://blog.yeyanbo.com/posts/google-summer-of-code-4.html > > Best, > > Yanbo > > -- > > ??? > > ???????????????? 
> > Yanbo Ye > > Bioinformatics Group, Wuhan Institute Of Virology, Chinese Academy of > Sciences -- Mark Holder mtholder at gmail.com mtholder at ku.edu http://phylo.bio.ku.edu/mark-holder ============================================== Department of Ecology and Evolutionary Biology University of Kansas 6031 Haworth Hall 1200 Sunnyside Avenue Lawrence, Kansas 66045 lab phone: 785.864.5789 fax (shared): 785.864.5860 ============================================== From redmine at redmine.open-bio.org Tue Jul 2 22:14:21 2013 From: redmine at redmine.open-bio.org (redmine at redmine.open-bio.org) Date: Tue, 2 Jul 2013 22:14:21 +0000 Subject: [Biopython-dev] [Biopython - Bug #3419] Bio.SearchIO.FastaIO References: Message-ID: Issue #3419 has been updated by Wibowo Arindrarto. Hi Jason, Apologies for a very long reply. Apparently the notification of your reply didn't get to my inbox and I have forgotten to check the page manually :(. Fortunately I met Peter and he pointed this out :). IIRC, the parser does store the program name that created the results (the QueryResult.program attribute). And we can deal with strand/frame accordingly. There is, however, not a standard way to store strand information of 'parent' sequence (in this case the DNA that was the template of the protein). I'll poke around to see if this has been dealt with. Anyway, your patch does look OK for the time being. The BLASTX parser handles this information the same way too (storing read frame in the protein Seq object). Would you like to submit it through Github? I'd be happy to commit on your behalf as well :). ---------------------------------------- Bug #3419: Bio.SearchIO.FastaIO https://redmine.open-bio.org/issues/3419 Author: Jason Stajich Status: New Priority: Low Assignee: Biopython Dev Mailing List Category: Main Distribution Target version: URL: The strand of the translated sequence (query or subject depending on the analysis) is lost for tfastxy and fastx/y reports. 
from Bio import SearchIO

qresults = SearchIO.parse('test.FASTY.out', 'fasta-m10')
for qresult in qresults:
    for hit in qresult:
        for hsp in hit.hsps:
            print qresult.id, " ", hit.id, " ", \
                hsp.query_start, "..", hsp.query_end, " ", hsp.query_strand, " ", \
                hsp.hit_start, "..", hsp.hit_end, " ", hsp.hit_strand

-- You have received this notification because you have either subscribed to it, or are involved in it. To change your notification preferences, please click here and login: http://redmine.open-bio.org From w.arindrarto at gmail.com Tue Jul 2 22:32:32 2013 From: w.arindrarto at gmail.com (Wibowo Arindrarto) Date: Wed, 3 Jul 2013 00:32:32 +0200 Subject: [Biopython-dev] Storing reading frame / strand of nucleic acid sequences used for creating protein sequences Message-ID: Hi everyone, I'm wondering if we have a standard way of storing the reading frames of DNA/RNA sequences used for creating protein sequences? In some cases, keeping track of the original reading frame may be desirable. (e.g. in TBLASTX alignments, where users want to know the reading frames of both the query and the hit sequences). I realize that it is possible to store strand information in SeqFeature objects. However, I am afraid that storing the strand / reading frame information for SeqFeatures of Seq objects with ProteinAlphabets may seem misleading as the strand information belongs to the DNA / RNA sequence that was used as the protein template, not the protein itself. On a related note, I'm also wondering if for the Seq object's translate method, there should be an argument to specify which reading frame we want to use to translate the sequence? This can be trivially solved using Python's convenient indexing and slicing system. However, indexing and/or slicing does not allow us to keep track of the original DNA/RNA reading frame (at least not the way it is implemented now). Let me know what you think :).
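To make the frame-tracking question concrete, here is a minimal sketch in plain Python rather than Biopython. It is only an illustration under stated assumptions: `translate_in_frame` and `FramedTranslation` are hypothetical names, only the forward strand and the standard genetic code are handled, and nothing here reflects how Seq.translate is implemented.

```python
from collections import namedtuple

# Standard genetic code, built from the usual TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Hypothetical record that keeps the protein together with the frame
# of the nucleotide sequence it was translated from.
FramedTranslation = namedtuple("FramedTranslation", ["protein", "frame"])


def translate_in_frame(dna, frame=0):
    """Translate the forward strand of `dna` starting at offset `frame`.

    `frame` is 0, 1 or 2. Trailing bases that do not fill a codon are
    ignored, and unknown codons become "X". The frame is returned with
    the protein so that it is not lost, which plain slicing would do.
    """
    if frame not in (0, 1, 2):
        raise ValueError("frame must be 0, 1 or 2")
    dna = dna.upper()
    n_codons = max(0, (len(dna) - frame) // 3)
    codons = (dna[frame + 3 * n: frame + 3 * n + 3] for n in range(n_codons))
    protein = "".join(CODON_TABLE.get(codon, "X") for codon in codons)
    return FramedTranslation(protein, frame)


result = translate_in_frame("AATGGCCTAA", frame=1)
```

Here `result.protein` is the translation of ATG GCC TAA and `result.frame` records that it came from offset 1. Supporting the reverse strand would need a reverse complement step and a signed frame, which this sketch leaves out.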
Best regards, Bow From zruan1991 at gmail.com Wed Jul 3 13:35:08 2013 From: zruan1991 at gmail.com (Zheng Ruan) Date: Wed, 3 Jul 2013 21:35:08 +0800 Subject: [Biopython-dev] Storing reading frame / strand of nucleic acid sequences used for creating protein sequences In-Reply-To: References: Message-ID: Hi Bow, I'm quite interested in setting up a standard way to deal with reading frames, as this is the job I would like to implement in the Codon Alignment project this summer. Getting reading frame information from TBLASTX results seems to be a good idea and I may implement a function to deal with it. As to how to store this information, I favor the CodonSeq class I've written, since frameshifts occur at the DNA/RNA level. I have a CodonAlphabet to check the codon sequence in CodonSeq. Maybe an enhancement of the Alphabet and CodonSeq would better take frameshifts into account. If you are seeking a standard way to represent frameshifts at the protein level, you may want to read the methods section of the pal2nal paper. For example, M2P indicates that there is a 1 nt deletion between methionine and proline. But this nonetheless violates the ProteinAlphabet we've defined in Biopython. I just arrived in Shanghai and am about to start writing code for this week. I am interested in hearing your suggestions. Thanks! Ruan On Wed, Jul 3, 2013 at 6:32 AM, Wibowo Arindrarto wrote: > Hi everyone, > > I'm wondering if we have a standard way of storing the reading frames > of DNA/RNA sequences used for creating protein sequences? > > In some cases, keeping track of the original reading frame may be > desirable. (e.g. in TBLASTX alignments, where users want to know the > reading frames of both the query and the hit sequences). > > I realize that it is possible to store strand information in > SeqFeature objects.
However, I am afraid that storing the strand / > reading frame information for SeqFeatures of Seq objects with > ProteinAlphabets may seem misleading as the strand information belongs > to the DNA / RNA sequence that was used as the protein template, not > the protein itself. > > On a related note, I'm also wondering if for the Seq object's > translate method, there should be an argument to specify which reading > frame we want use to translate the sequence? > > This can be trivially solved using Python's convenient indexing and > slicing system. However, indexing and/or slicing does not allow us to > keep track of the original DNA/RNA reading frame (at least not the way > it is implemented now). > > Let me know what you think :). > > Best regards, > Bow > _______________________________________________ > Biopython-dev mailing list > Biopython-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython-dev > From lgautier at gmail.com Mon Jul 8 14:38:36 2013 From: lgautier at gmail.com (Laurent Gautier) Date: Mon, 08 Jul 2013 16:38:36 +0200 Subject: [Biopython-dev] biopython, pip, and Python 3.3 + virtualenv Message-ID: <51DACEEC.6080305@gmail.com> Hi, I just tried installing biopython with pip (v-1.3.1) and Python 3.3 (under a virtual environment created with pyvenv), but (after a a fairly long time) the process is failing with the last 3 lines printed on the terminal: ``` Python 2to3 processing done. running egg_info error: error in 'egg_base' option: 'pip-egg-info' does not exist or is not a directory ``` Am I missing something ? 
Best, Laurent From p.j.a.cock at googlemail.com Mon Jul 8 15:00:24 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Mon, 8 Jul 2013 16:00:24 +0100 Subject: [Biopython-dev] biopython, pip, and Python 3.3 + virtualenv In-Reply-To: <51DACEEC.6080305@gmail.com> References: <51DACEEC.6080305@gmail.com> Message-ID: On Mon, Jul 8, 2013 at 3:38 PM, Laurent Gautier wrote: > Hi, > > I just tried installing biopython with pip (v-1.3.1) and Python 3.3 (under a > virtual environment created with pyvenv), > but (after a a fairly long time) the process is failing with the last 3 > lines printed on the terminal: > > ``` > > Python 2to3 processing done. > > running egg_info > > error: error in 'egg_base' option: 'pip-egg-info' does not exist or is not a > directory > > ``` > > Am I missing something ? > > Best, > > > Laurent Hi Laurent, The 2to3 conversion does take a while (sigh), but I have a plan for that: http://lists.open-bio.org/pipermail/biopython-dev/2013-May/010633.html Which version of Biopython, and if from git, which commit? In particular was it before or after this change from Brad which I recently committed? https://github.com/biopython/biopython/commit/bc828342716218b84908c1a4435163e564f31445 Thanks, Peter From lgautier at gmail.com Mon Jul 8 16:18:30 2013 From: lgautier at gmail.com (Laurent Gautier) Date: Mon, 08 Jul 2013 18:18:30 +0200 Subject: [Biopython-dev] biopython, pip, and Python 3.3 + virtualenv In-Reply-To: References: <51DACEEC.6080305@gmail.com> Message-ID: <51DAE656.2020609@gmail.com> On 07/08/2013 05:00 PM, Peter Cock wrote: > On Mon, Jul 8, 2013 at 3:38 PM, Laurent Gautier wrote: >> Hi, >> >> I just tried installing biopython with pip (v-1.3.1) and Python 3.3 (under a >> virtual environment created with pyvenv), >> but (after a a fairly long time) the process is failing with the last 3 >> lines printed on the terminal: >> >> ``` >> >> Python 2to3 processing done. 
>> >> running egg_info >> >> error: error in 'egg_base' option: 'pip-egg-info' does not exist or is not a >> directory >> >> ``` >> >> Am I missing something ? >> >> Best, >> >> >> Laurent > Hi Laurent, > > The 2to3 conversion does take a while (sigh), but I have a plan for that: > http://lists.open-bio.org/pipermail/biopython-dev/2013-May/010633.html For it is worth, I have started doing the following 2 rules for all Python projects (and it is working fairly well): 1- Develop for Python 3.3 (not worrying about anything earlier in the 3.x series) 2- When considering support for Python 2, consider 2.7. Forget about anything earlier The rationale is: if you want the latest version of a package, get a recent Python as well. A variant of "you can't have your cake and eat it", I suppose. It might seem a bit radical, but that's not that bad; it has worked for bioconductor (where each BioC release depends on the latest R version). If no C-extension is involved, rule #2 is only introducing few conditional definitions that are hand-coded. There are only few things to consider. > > Which version of Biopython, Latest on Pypi. I am doing a vanilla: ``` pip install biopython ``` > and if from git, which commit? Anyone on this list volunteering to run `git bisect` on my behalf ? (For now I more familiar with Mercurial, and plenty to do at the moment) > In particular was it before or after this change from Brad which > I recently committed? > > https://github.com/biopython/biopython/commit/bc828342716218b84908c1a4435163e564f31445 The distutils/distutils2/distribute/setuptools/whatever can be an endless source of headaches. Do you check with one of them in particular ? If so, may be worthwhile to enforce the use of that one in setup.py. 
Best, Laurent > > Thanks, > > Peter From p.j.a.cock at googlemail.com Mon Jul 8 16:29:32 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Mon, 8 Jul 2013 17:29:32 +0100 Subject: [Biopython-dev] biopython, pip, and Python 3.3 + virtualenv In-Reply-To: <51DAE656.2020609@gmail.com> References: <51DACEEC.6080305@gmail.com> <51DAE656.2020609@gmail.com> Message-ID: On Mon, Jul 8, 2013 at 5:18 PM, Laurent Gautier wrote: > On 07/08/2013 05:00 PM, Peter Cock wrote: >> >> On Mon, Jul 8, 2013 at 3:38 PM, Laurent Gautier >> wrote: >>> >>> Hi, >>> >>> I just tried installing biopython with pip (v-1.3.1) and Python 3.3 >>> (under a virtual environment created with pyvenv), >>> but (after a a fairly long time) the process is failing with the last 3 >>> lines printed on the terminal: >>> >>> ``` >>> >>> Python 2to3 processing done. >>> >>> running egg_info >>> >>> error: error in 'egg_base' option: 'pip-egg-info' does not exist or is >>> not a >>> directory >>> >>> ``` >>> >>> Am I missing something ? >>> >>> Best, >>> >>> >>> Laurent >> >> Hi Laurent, >> >> The 2to3 conversion does take a while (sigh), but I have a plan for that: >> http://lists.open-bio.org/pipermail/biopython-dev/2013-May/010633.html > > > For it is worth, I have started doing the following 2 rules for all Python > projects (and it is working fairly well): > > 1- Develop for Python 3.3 (not worrying about anything earlier in the 3.x > series) > 2- When considering support for Python 2, consider 2.7. Forget about > anything earlier > The rationale is: if you want the latest version of a package, get a recent > Python as well. A variant of "you can't have your cake and eat it", I > suppose. It might seem a bit radical, but that's not that bad; it has worked > for bioconductor (where each BioC release depends on the latest R version). BioConductor's policy of being synced to the R version has pros and cons, certainly its users are used to this version treadmill. 
> If no C-extension is involved, rule #2 is only introducing few conditional > definitions that are hand-coded. There are only few things to consider. We're currently doing Python 2.5/2.6/2.7/3.3+ (although for the most part things have been OK under Python 3.1 and 3.2 as well but we're not going to officially support those). We're about to drop Python 2.5, and I'm fairly sure that dual coding for Python 2.6/2.7/3.3+ will be viable. If not, we may have to be more aggressive about phasing out Python 2.6 in the near future - we'll see. >> >> Which version of Biopython, > > > Latest on Pypi. https://pypi.python.org/pypi/biopython That means Biopython 1.61 at the moment, so prior to Brad's change landing on the trunk (which should be in 1.62 barring any problems before then). > I am doing a vanilla: > > ``` > pip install biopython > ``` > >> and if from git, which commit? > > Anyone on this list volunteering to run `git bisect` on my behalf ? > (For now I more familiar with Mercurial, and plenty to do at the moment) Have you tried using the latest Biopython from source to see if Brad's change fixed this for you? If that also fails the same way, there is no point to a 'git bisect' ;) > The distutils/distutils2/distribute/setuptools/whatever can be an endless > source of headaches. Do you check with one of them in particular ? > If so, may be worthwhile to enforce the use of that one in setup.py. Personally I only ever use the stock "python setup.py install" route, but there has been user demand for things like pip support. I look forward to improved packaging being standardised in a future Python 3.x release one day...
Regards, Peter From lgautier at gmail.com Mon Jul 8 19:35:56 2013 From: lgautier at gmail.com (Laurent Gautier) Date: Mon, 08 Jul 2013 21:35:56 +0200 Subject: [Biopython-dev] biopython, pip, and Python 3.3 + virtualenv In-Reply-To: References: <51DACEEC.6080305@gmail.com> <51DAE656.2020609@gmail.com> Message-ID: <51DB149C.2010605@gmail.com> On 07/08/2013 06:29 PM, Peter Cock wrote: > On Mon, Jul 8, 2013 at 5:18 PM, Laurent Gautier wrote: >> On 07/08/2013 05:00 PM, Peter Cock wrote: >>> On Mon, Jul 8, 2013 at 3:38 PM, Laurent Gautier >>> wrote: >>>> Hi, >>>> >>>> I just tried installing biopython with pip (v-1.3.1) and Python 3.3 >>>> (under a virtual environment created with pyvenv), >>>> but (after a a fairly long time) the process is failing with the last 3 >>>> lines printed on the terminal: >>>> >>>> ``` >>>> >>>> Python 2to3 processing done. >>>> >>>> running egg_info >>>> >>>> error: error in 'egg_base' option: 'pip-egg-info' does not exist or is >>>> not a >>>> directory >>>> >>>> ``` >>>> >>>> Am I missing something ? >>>> >>>> Best, >>>> >>>> >>>> Laurent >>> Hi Laurent, >>> >>> The 2to3 conversion does take a while (sigh), but I have a plan for that: >>> http://lists.open-bio.org/pipermail/biopython-dev/2013-May/010633.html >> >> For it is worth, I have started doing the following 2 rules for all Python >> projects (and it is working fairly well): >> >> 1- Develop for Python 3.3 (not worrying about anything earlier in the 3.x >> series) >> 2- When considering support for Python 2, consider 2.7. Forget about >> anything earlier >> The rationale is: if you want the latest version of a package, get a recent >> Python as well. A variant of "you can't have your cake and eat it", I >> suppose. It might seem a bit radical, but that's not that bad; it has worked >> for bioconductor (where each BioC release depends on the latest R version). 
> BioConductor's policy of being synced to the R version has pros and > cons, certainly its users are used to this version treadmill. The package installation system has certainly played a role in help keeping users on board. The life cycle of Python version is much slower, so the pace would be slower. > >> If no C-extension is involved, rule #2 is only introducing few conditional >> definitions that are hand-coded. There are only few things to consider. > We're currently doing Python 2.5/2.6/2.6/3.3+ (although for the most > part things have been OK under Python 3.1 and 3.2 as well but we're > not going to officially support those). > > We're about to drop Python 2.5, and I'm fairly sure that dual coding > for Python 2.6/2.7/3.3+ will be viable. If not, we may have to be more > aggressive about phasing out Python 2.6 in the near future - we'll see. IIRC, Python 2.6 does not receive bugfixes for some time already, only security fixes (the last of which will be in the autumn). > >>> Which version of Biopython, >> >> Latest on Pypi. > https://pypi.python.org/pypi/biopython > > That means Biopython 1.61 at the moment, so prior to Brad's > change landing on the trunk (which should be in 1.62 barring > any problems before then). > >> I am doing a vanilla: >> >> ``` >> pip install biopython >> ``` >> >>> and if from git, which commit? >> Anyone on this list volunteering to run `git bisect` on my behalf ? >> (For now I more familiar with Mercurial, and plenty to do at the moment) > Have you tried using the latest Biopython from source to see if Brad's > change fixed this for you? If that also fails the same way, there is no > point to a 'git bisect' ;) > >> The distutils/distutils2/distribute/setuptools/whatever can be an endless >> source of headaches. Do you check with one of them in particular ? >> If so, may be worthwhile to enforce the use of that one in setup.py. 
> Personally I only ever use the stock "python setup.py install" route, > but there has been user demand for things like pip support. You are missing out on quite a lot, I think. For example, trying to install the master from github can be achieved the following single command: ``` pip install https://github.com/biopython/biopython/archive/master.zip ``` The outcome is the same error as before. > I look > forward to improved packaging being standardised in a future > Python 3.x release on day... Some of the time I am thinking that this will not come before Python 4.x... Best, Laurent > > Regards, > > Peter From p.j.a.cock at googlemail.com Tue Jul 9 09:35:03 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Tue, 9 Jul 2013 10:35:03 +0100 Subject: [Biopython-dev] biopython, pip, and Python 3.3 + virtualenv In-Reply-To: <51DB149C.2010605@gmail.com> References: <51DACEEC.6080305@gmail.com> <51DAE656.2020609@gmail.com> <51DB149C.2010605@gmail.com> Message-ID: On Mon, Jul 8, 2013 at 8:35 PM, Laurent Gautier wrote: > On 07/08/2013 06:29 PM, Peter Cock wrote: >> >> On Mon, Jul 8, 2013, Laurent Gautier wrote: >>> >>> On 07/08/2013 05:00 PM, Peter Cock wrote: >>>> >>>> On Mon, Jul 8, 2013, Laurent Gautier >>>> wrote: >>>>> >>>>> Hi, >>>>> >>>>> I just tried installing biopython with pip (v-1.3.1) and Python 3.3 >>>>> (under a virtual environment created with pyvenv), >>>>> but (after a a fairly long time) the process is failing with the last 3 >>>>> lines printed on the terminal: >>>>> >>>>> ``` >>>>> >>>>> Python 2to3 processing done. >>>>> >>>>> running egg_info >>>>> >>>>> error: error in 'egg_base' option: 'pip-egg-info' does not exist or is >>>>> not a directory >>>>> >>>>> ``` >>>>> >>>>> Am I missing something ? >>>>> >>>>> Best, >>>>> >>>>> >>>>> Laurent >>>> >>>> Hi Laurent, >>>> >>>> ... >>>> >>>> Which version of Biopython, >>> >>> >>> Latest on Pypi. 
>> >> https://pypi.python.org/pypi/biopython >> >> That means Biopython 1.61 at the moment, so prior to Brad's >> change landing on the trunk (which should be in 1.62 barring >> any problems before then). >> >>> I am doing a vanilla: >>> >>> ``` >>> pip install biopython >>> ``` >>> >>>> and if from git, which commit? >>> >>> Anyone on this list volunteering to run `git bisect` on my behalf ? >>> (For now I more familiar with Mercurial, and plenty to do at the moment) >> >> Have you tried using the latest Biopython from source to see if Brad's >> change fixed this for you? If that also fails the same way, there is no >> point to a 'git bisect' ;) >> > > ... > > For example, trying to install the master from github can be achieved the > following single command: > > ``` > pip install https://github.com/biopython/biopython/archive/master.zip > ``` > > The outcome is the same error as before. OK, so you get the same error from a 'pip install' from both the Biopython 1.61 release, and the latest code from github as of 8 July 2013, which would have included Brad's pip-related commit which evidently made no difference to this issue: https://github.com/biopython/biopython/commit/bc828342716218b84908c1a4435163e564f31445 Thanks for checking that. Having confirmed the latest code is still affected, I'm out of ideas - how about you Brad, any thoughts? 
Thanks, Peter From p.j.a.cock at googlemail.com Tue Jul 9 09:43:57 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Tue, 9 Jul 2013 10:43:57 +0100 Subject: [Biopython-dev] biopython, pip, and Python 3.3 + virtualenv In-Reply-To: <86ip0kul37.fsf@fastmail.fm> References: <51DACEEC.6080305@gmail.com> <51DAE656.2020609@gmail.com> <51DB149C.2010605@gmail.com> <86ip0kul37.fsf@fastmail.fm> Message-ID: On Tue, Jul 9, 2013 at 10:41 AM, Brad Chapman wrote: > > Laurent and Peter; > >> OK, so you get the same error from a 'pip install' from both the >> Biopython 1.61 release, and the latest code from github as of >> 8 July 2013, which would have included Brad's pip-related >> commit which evidently made no difference to this issue: >> >> https://github.com/biopython/biopython/commit/bc828342716218b84908c1a4435163e564f31445 >> >> Thanks for checking that. Having confirmed the latest code is >> still affected, I'm out of ideas - how about you Brad, any thoughts? > > Laurent, can you try this branch and see if it installs cleanly for you > using pip on Python 3.3: > > https://github.com/chapmanb/biopython > > I pulled the fix from numpy's older setup.py (before they moved to a > combined 2.7 and 3.3 code base) and it worked on my tests, but would > like to confirm before merging: > > https://github.com/biopython/biopython/pull/172/files > > Thanks for reporting the issue, > Brad Ah - those changes are now live on the master, guys: https://github.com/biopython/biopython/commit/1d3d2fe43ae776d30777ffdab7b6528cd3164cfd https://github.com/biopython/biopython/commit/0f821c5b69ab5f6b888d36fb29b1ba3b51ce8590 It was perhaps a little premature but since Brad pushed it up to the pull request I thought it was good to go. We can revert it if need be. Laurent, you can just re-test the master rather than Brad's branch.
Thanks, Peter From chapmanb at 50mail.com Tue Jul 9 09:41:16 2013 From: chapmanb at 50mail.com (Brad Chapman) Date: Tue, 09 Jul 2013 05:41:16 -0400 Subject: [Biopython-dev] biopython, pip, and Python 3.3 + virtualenv In-Reply-To: References: <51DACEEC.6080305@gmail.com> <51DAE656.2020609@gmail.com> <51DB149C.2010605@gmail.com> Message-ID: <86ip0kul37.fsf@fastmail.fm> Laurent and Peter; > OK, so you get the same error from a 'pip install' from both the > Biopython 1.61 release, and the latest code from github as of > 8 July 2013, which would have included Brad's pip-related > commit which evidently made no difference to this issue: > > https://github.com/biopython/biopython/commit/bc828342716218b84908c1a4435163e564f31445 > > Thanks for checking that. Having confirmed the latest code is > still affected, I'm out of ideas - how about you Brad, any thoughts? Laurent, can you try this branch and see if it installs cleanly for you using pip on Python 3.3: https://github.com/chapmanb/biopython I pulled the fix from numpy's older setup.py (before they moved to a combined 2.7 and 3.3 code base) and it worked on my tests, but would like to confirm before merging: https://github.com/biopython/biopython/pull/172/files Thanks for reporting the issue, Brad From p.j.a.cock at googlemail.com Tue Jul 9 10:53:05 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Tue, 9 Jul 2013 11:53:05 +0100 Subject: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week? Message-ID: Dear all, Given I'm heading off to Germany next week for BOSC and the CodeFest, it would be good to have Biopython 1.62 out this week - or at least a beta release. Having a beta would make sense in terms of trying to get more testing under Python 3.3, plus the SeqFeature change with sub_features (previously used for joins etc) being deprecated in favour of the new CompoundLocation object. Any thoughts? 
Also, I've had a go at updating the main README file and the Installation.tex file - that probably needs more work still (e.g. the ReportLab section needs updating). Thanks, Peter From lgautier at gmail.com Tue Jul 9 13:43:38 2013 From: lgautier at gmail.com (Laurent Gautier) Date: Tue, 09 Jul 2013 15:43:38 +0200 Subject: [Biopython-dev] biopython, pip, and Python 3.3 + virtualenv In-Reply-To: References: <51DACEEC.6080305@gmail.com> <51DAE656.2020609@gmail.com> <51DB149C.2010605@gmail.com> <86ip0kul37.fsf@fastmail.fm> Message-ID: <51DC138A.2020005@gmail.com> On 07/09/2013 11:43 AM, Peter Cock wrote: > On Tue, Jul 9, 2013 at 10:41 AM, Brad Chapman wrote: >> Laurent and Peter; >> >>> OK, so you get the same error from a 'pip install' from both the >>> Biopython 1.61 release, and the latest code from github as of >>> 8 July 2013, which would have included Brad's pip-related >>> commit which evidently made no difference to this issue: >>> >>> https://github.com/biopython/biopython/commit/bc828342716218b84908c1a4435163e564f31445 >>> >>> Thanks for checking that. Having confirmed the latest code is >>> still affected, I'm out of ideas - how about you Brad, any thoughts? >> Laurent, can you try this branch and see if it installs cleanly for you >> using pip on Python 3.3: >> >> https://github.com/chapmanb/biopython >> >> I pulled the fix from numpy's older setup.py (before they moved to a >> combined 2.7 and 3.3 code base) and it worked on my tests, but would >> like to confirm before merging: >> >> https://github.com/biopython/biopython/pull/172/files >> >> Thanks for reporting the issue, >> Brad > Ah - those changes are now live on the master guys: > > https://github.com/biopython/biopython/commit/1d3d2fe43ae776d30777ffdab7b6528cd3164cfd > https://github.com/biopython/biopython/commit/0f821c5b69ab5f6b888d36fb29b1ba3b51ce8590 > > It was perhaps a little premature but since Brad pushed it up to > the pull request I thought it was good to go. We can revert it > need be. 
> > Laurent, you can just re-test the master rather than Brad's branch. The install process is now working with the master on Github. Thanks, Laurent > > Thanks, > > Peter From p.j.a.cock at googlemail.com Tue Jul 9 14:38:25 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Tue, 9 Jul 2013 15:38:25 +0100 Subject: [Biopython-dev] biopython, pip, and Python 3.3 + virtualenv In-Reply-To: <51DC138A.2020005@gmail.com> References: <51DACEEC.6080305@gmail.com> <51DAE656.2020609@gmail.com> <51DB149C.2010605@gmail.com> <86ip0kul37.fsf@fastmail.fm> <51DC138A.2020005@gmail.com> Message-ID: On Tue, Jul 9, 2013 at 2:43 PM, Laurent Gautier wrote: > On 07/09/2013 11:43 AM, Peter Cock wrote: >> >> ... those changes are now live on the master guys: >> >> https://github.com/biopython/biopython/commit/1d3d2fe43ae776d30777ffdab7b6528cd3164cfd >> https://github.com/biopython/biopython/commit/0f821c5b69ab5f6b888d36fb29b1ba3b51ce8590 >> >> ... >> >> Laurent, you can just re-test the master rather than Brad's branch. > > The install process is now working with the master on Github. > > Thanks, > > Laurent Great - thanks Laurent & Brad :) Peter From mjldehoon at yahoo.com Fri Jul 12 09:06:54 2013 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Fri, 12 Jul 2013 02:06:54 -0700 (PDT) Subject: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week? In-Reply-To: References: Message-ID: <1373620014.53407.YahooMailNeo@web164001.mail.gq1.yahoo.com> Based on the Biopython deprecation schedule policy, we can remove the following pieces of code in release 1.62: Bio.Align.__init__: the get_column and add_sequence methods Bio.Align.Generic: the class Alignment in its entirety Bio.Graphics.GenomeDiagram._AbstractDraw: the .xcentre and .ycentre properties and their setters in the class AbstractDrawer Bio.Graphics.GenomeDiagram._Graph: The .centre property and its setter in the class GraphData Any final objections before we proceed? Best, -Michiel. 
________________________________ From: Peter Cock To: Biopython-Dev Mailing List Sent: Tuesday, July 9, 2013 7:53 PM Subject: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week? Dear all, Given I'm heading off to Germany next week for BOSC and the CodeFest, it would be good to have Biopython 1.62 out this week - or at least a beta release. Having a beta would make sense in terms of trying to get more testing under Python 3.3, plus the SeqFeature change with sub_features (previously used for joins etc) being deprecated in favour of the new CompoundLocation object. Any thoughts? Also, I've had a go at updating the main README file and the Installation.tex file - that probably needs more work still (e.g. the ReportLab section needs updating). Thanks, Peter _______________________________________________ Biopython-dev mailing list Biopython-dev at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/biopython-dev From p.j.a.cock at googlemail.com Fri Jul 12 09:42:50 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Fri, 12 Jul 2013 10:42:50 +0100 Subject: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week? In-Reply-To: <1373620014.53407.YahooMailNeo@web164001.mail.gq1.yahoo.com> References: <1373620014.53407.YahooMailNeo@web164001.mail.gq1.yahoo.com> Message-ID: On Fri, Jul 12, 2013 at 10:06 AM, Michiel de Hoon wrote: > Based on the Biopython deprecation schedule policy, we can remove the > following pieces of code in release 1.62: > > Bio.Align.__init__: the get_column and add_sequence methods > Bio.Align.Generic: the class Alignment in its entirety Eric & Zheng, that won't cause any problems for your GSoC work will it? 
> Bio.Graphics.GenomeDiagram._AbstractDraw: the .xcentre and .ycentre > properties and their setters in the class AbstractDrawer > Bio.Graphics.GenomeDiagram._Graph: The .centre property and its setter in > the class GraphData Sounds fine - I can do those, Thanks Michiel, Peter From p.j.a.cock at googlemail.com Fri Jul 12 10:48:10 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Fri, 12 Jul 2013 11:48:10 +0100 Subject: [Biopython-dev] [Biopython] Bio.motifs raising Exceptions using pypy In-Reply-To: References: <51DE917B.5030807@unifi.it> <51DFCF2B.4080200@unifi.it> Message-ID: On Fri, Jul 12, 2013 at 11:00 AM, Peter Cock wrote: > On Fri, Jul 12, 2013 at 10:40 AM, Marco Galardini > wrote: >> Hi, >> >> i've arranged a sample script and sample data to replicate the issue: >> >> python test.py test.fa test.txt >> 551 20.9172 >> -5389 21.0426 >> >> pypy test.py test.fa test.txt >> 551 20.9172 >> -5389 21.0426 >> >> Traceback (most recent call last): >> File "app_main.py", line 72, in run_toplevel >> File "test.py", line 20, in >> for position, score in pssm.search(s.seq, threshold=score_t): >> File "/usr/local/lib/pypy2.7/dist-packages/Bio/motifs/matrix.py", line >> 354, in search >> score = self.calculate(s) >> File "/usr/local/lib/pypy2.7/dist-packages/Bio/motifs/matrix.py", line >> 331, in calculate >> score += self[letter][position] >> File "/usr/local/lib/pypy2.7/dist-packages/Bio/motifs/matrix.py", line >> 113, in __getitem__ >> return dict.__getitem__(self, letter) >> KeyError: 'N' >> >> Hope this helps, my guess is that it may be something related to the >> implementation of dictionaries in pypy, since the object raising the >> exception inherits dict. >> >> Thanks a lot for the help, >> Marco > > Great - I can reproduce that here using PyPy 1.9 as well... > OK - this also breaks under Jython and even Python if we disable the C extension. Here self[letters] only has ACGT, not N, thus a key error. This is something the C code just ignores. 
There is also an inconsistency with mixed case. New unit test: https://github.com/biopython/biopython/commit/e13c97ae3535b58d8ec3da3fc565e97db1fa75a3 Fix for the mixed case difference: https://github.com/biopython/biopython/commit/0cab00c66a1fd15072d020cfc17edbdfb37484a5 The KeyError from bad characters can be handled like this: $ git diff diff --git a/Bio/motifs/matrix.py b/Bio/motifs/matrix.py index bce1d4f..e6446b5 100644 --- a/Bio/motifs/matrix.py +++ b/Bio/motifs/matrix.py @@ -364,7 +364,11 @@ class PositionSpecificScoringMatrix(GenericPositionMatrix): score = 0.0 for position in xrange(m): letter = sequence[i+position] - score += self[letter][position] + try: + score += self[letter][position] + except KeyError: + #The C code ignores unexpected letters like N + pass scores.append(score) else: # get the log-odds matrix into a proper shape However, that leaves a numerical difference in the output: $ pypy test_motifs.py test_simple (__main__.MotifTestPWM) Test if Bio.motifs PWM scoring works. ... ok test_with_bad_char (__main__.MotifTestPWM) Test if Bio.motifs PWM scoring works with unexpected letters like N. ... FAIL test_with_mixed_case (__main__.MotifTestPWM) Test if Bio.motifs PWM scoring works with mixed case. ... ok ... ====================================================================== FAIL: test_with_bad_char (__main__.MotifTestPWM) Test if Bio.motifs PWM scoring works with unexpected letters like N. ---------------------------------------------------------------------- Traceback (most recent call last): File "test_motifs.py", line 1662, in test_with_bad_char self.assertTrue(_isnan(result[6]), "Expected nan, not %r" % result[6]) AssertionError: Expected nan, not -37.417418833750574 ---------------------------------------------------------------------- Ran 15 tests in 0.077s FAILED (failures=1) The same error occurs on Jython, and on Python if I disable the C extension. This needs a little more investigation... 
I don't immediately follow when the C code sets the value to nan. Peter From p.j.a.cock at googlemail.com Fri Jul 12 12:57:08 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Fri, 12 Jul 2013 13:57:08 +0100 Subject: [Biopython-dev] [Biopython] Bio.motifs raising Exceptions using pypy In-Reply-To: References: <51DE917B.5030807@unifi.it> <51DFCF2B.4080200@unifi.it> Message-ID: On Fri, Jul 12, 2013 at 11:48 AM, Peter Cock wrote: > > OK - this also breaks under Jython and even Python if we > disable the C extension. Here self[letters] only has ACGT, > not N, thus a key error. This is something the C code just > ignores. There is also an inconsistency with mixed case. > > New unit test: > https://github.com/biopython/biopython/commit/e13c97ae3535b58d8ec3da3fc565e97db1fa75a3 > > Fix for the mixed case difference: > https://github.com/biopython/biopython/commit/0cab00c66a1fd15072d020cfc17edbdfb37484a5 > > The KeyError from bad characters can be handled like this: > > $ git diff > diff --git a/Bio/motifs/matrix.py b/Bio/motifs/matrix.py > index bce1d4f..e6446b5 100644 > --- a/Bio/motifs/matrix.py > +++ b/Bio/motifs/matrix.py > @@ -364,7 +364,11 @@ class PositionSpecificScoringMatrix(GenericPositionMatrix): > score = 0.0 > for position in xrange(m): > letter = sequence[i+position] > - score += self[letter][position] > + try: > + score += self[letter][position] > + except KeyError: > + #The C code ignores unexpected letters like N > + pass > scores.append(score) > else: > # get the log-odds matrix into a proper shape > > However, that leaves a numerical difference in the output: > > ... > > The same error occurs on Jython, and on Python if I disable > the C extension. This needs a little more investigation... I > don't immediately follow when the C code sets the value > to nan. Rereading the C code after lunch I realised how the 'ok' sentinel value was being used - bad letters result in NaN as the value. 
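The NaN behaviour described above can be sketched in pure Python. This is only an illustration, not the code in Bio/motifs/matrix.py: the function name `score_windows` and the toy matrix are hypothetical, and `range` is used in place of the `xrange` seen in the diff. It assumes that a window containing a letter missing from the matrix gets NaN as its score, rather than having that letter skipped.

```python
def score_windows(matrix, sequence):
    """Score every window of `sequence` against a log-odds `matrix`.

    `matrix` maps each letter to a list of per-position scores
    (hypothetical stand-in for a position-specific scoring matrix).
    """
    m = len(next(iter(matrix.values())))  # motif length
    scores = []
    for i in range(len(sequence) - m + 1):
        score = 0.0
        for position in range(m):
            letter = sequence[i + position]
            try:
                score += matrix[letter][position]
            except KeyError:
                # Unexpected letter (e.g. N): the whole window scores NaN.
                score = float("nan")
                break
        scores.append(score)
    return scores


# Toy two-column matrix; the values are made up for illustration.
toy = {"A": [1.0, 0.5], "C": [0.0, 0.0], "G": [0.0, 0.0], "T": [-1.0, -0.5]}
result = score_windows(toy, "AANT")
print(result)
```

Here only the first window ("AA") contains no unexpected letter, so it is the only one with a finite score.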
Fixed, https://github.com/biopython/biopython/commit/00043d28bdf5408519cb4832d6a8e822d10f6653 Peter From marco.galardini at unifi.it Fri Jul 12 13:02:20 2013 From: marco.galardini at unifi.it (Marco Galardini) Date: Fri, 12 Jul 2013 15:02:20 +0200 Subject: [Biopython-dev] [Biopython] Bio.motifs raising Exceptions using pypy In-Reply-To: References: <51DE917B.5030807@unifi.it> <51DFCF2B.4080200@unifi.it> Message-ID: <51DFFE5C.4060709@unifi.it> On 07/12/2013 02:57 PM, Peter Cock wrote: > Rereading the C code after lunch I realised how the 'ok' sentinel > value was being used - bad letters result in NaN as the value. > > Fixed, > https://github.com/biopython/biopython/commit/00043d28bdf5408519cb4832d6a8e822d10f6653 > > Peter Peter, i think you should remove the " raise ImportError" statement in line 359, as it would render impossible to use the extension (if I got that correctly). Marco -- ------------------------------------------------- Marco Galardini, PhD Dipartimento di Biologia Via Madonna del Piano, 6 - 50019 Sesto Fiorentino (FI) e-mail: marco.galardini at unifi.it www: http://www.unifi.it/dblage/CMpro-v-p-51.html phone: +39 055 4574737 mobile: +39 340 2808041 ------------------------------------------------- From p.j.a.cock at googlemail.com Fri Jul 12 13:26:12 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Fri, 12 Jul 2013 14:26:12 +0100 Subject: [Biopython-dev] [Biopython] Bio.motifs raising Exceptions using pypy In-Reply-To: <51DFFE5C.4060709@unifi.it> References: <51DE917B.5030807@unifi.it> <51DFCF2B.4080200@unifi.it> <51DFFE5C.4060709@unifi.it> Message-ID: On Fri, Jul 12, 2013 at 2:02 PM, Marco Galardini wrote: > On 07/12/2013 02:57 PM, Peter Cock wrote: >> >> Rereading the C code after lunch I realised how the 'ok' sentinel >> value was being used - bad letters result in NaN as the value. 
>> >> Fixed, >> >> https://github.com/biopython/biopython/commit/00043d28bdf5408519cb4832d6a8e822d10f6653 >> >> Peter > > Peter, i think you should remove the " raise ImportError" statement in line > 359, as it would render impossible to use the extension (if I got that > correctly). > > Marco Thank you - that was debugging code for testing normal Python: https://github.com/biopython/biopython/commit/66e35d5cdd1cdfbd56b46a2f6098f715adb80f9d Peter From marco.galardini at unifi.it Fri Jul 12 14:50:00 2013 From: marco.galardini at unifi.it (Marco Galardini) Date: Fri, 12 Jul 2013 16:50:00 +0200 Subject: [Biopython-dev] [Biopython] Bio.motifs raising Exceptions using pypy In-Reply-To: References: <51DE917B.5030807@unifi.it> <51DFCF2B.4080200@unifi.it> <51DFFE5C.4060709@unifi.it> Message-ID: <51E01798.1070301@unifi.it> The proposed fix works perfectly for me, thanks a lot! Marco On 07/12/2013 03:26 PM, Peter Cock wrote: > On Fri, Jul 12, 2013 at 2:02 PM, Marco Galardini > wrote: >> On 07/12/2013 02:57 PM, Peter Cock wrote: >>> Rereading the C code after lunch I realised how the 'ok' sentinel >>> value was being used - bad letters result in NaN as the value. >>> >>> Fixed, >>> >>> https://github.com/biopython/biopython/commit/00043d28bdf5408519cb4832d6a8e822d10f6653 >>> >>> Peter >> Peter, i think you should remove the " raise ImportError" statement in line >> 359, as it would render impossible to use the extension (if I got that >> correctly). 
>> >> Marco > Thank you - that was debugging code for testing normal Python: > https://github.com/biopython/biopython/commit/66e35d5cdd1cdfbd56b46a2f6098f715adb80f9d > > Peter > _______________________________________________ > Biopython-dev mailing list > Biopython-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython-dev -- ------------------------------------------------- Marco Galardini, PhD Dipartimento di Biologia Via Madonna del Piano, 6 - 50019 Sesto Fiorentino (FI) e-mail: marco.galardini at unifi.it www: http://www.unifi.it/dblage/CMpro-v-p-51.html phone: +39 055 4574737 mobile: +39 340 2808041 ------------------------------------------------- From mjldehoon at yahoo.com Sat Jul 13 01:52:30 2013 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Fri, 12 Jul 2013 18:52:30 -0700 (PDT) Subject: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week? In-Reply-To: References: <1373620014.53407.YahooMailNeo@web164001.mail.gq1.yahoo.com> Message-ID: <1373680350.55044.YahooMailNeo@web164003.mail.gq1.yahoo.com> The following pieces of code had a PendingDeprecationWarning in Biopython release 1.61, and can be upgraded to a BiopythonDeprecationWarning: Bio.Blast.NCBIStandalone (entire module). This module has had a PendingDeprecationWarning since September 2010. Bio.Motif (entire module). Its functionality is available from Bio.motifs, so Bio.Motif can be deprecated. Bio.ParserSupport (entire module). This module is currently only being used by Bio.Blast.NCBIStandalone, and has had a PendingDeprecationWarning since September 2011. Any final objections? Best, -Michiel From zruan1991 at gmail.com Sat Jul 13 06:37:09 2013 From: zruan1991 at gmail.com (Zheng Ruan) Date: Sat, 13 Jul 2013 14:37:09 +0800 Subject: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week? 
In-Reply-To: References: <1373620014.53407.YahooMailNeo@web164001.mail.gq1.yahoo.com> Message-ID: I don't think this will cause any problems if methods for MultipleSeqAlignment are there. I notice the latest code of MultipleSeqAlignment inherits Bio.Align.Generic.Alignment. Will that code (such as __str__, __getitem__, _str_line) be ported to MultipleSeqAlignment? Thanks! Zheng On Friday, 12 July 2013, Peter Cock wrote: > On Fri, Jul 12, 2013 at 10:06 AM, Michiel de Hoon > > wrote: > > Based on the Biopython deprecation schedule policy, we can remove the > > following pieces of code in release 1.62: > > > > Bio.Align.__init__: the get_column and add_sequence methods > > Bio.Align.Generic: the class Alignment in its entirety > > Eric & Zheng, that won't cause any problems for your GSoC work > will it? > > > Bio.Graphics.GenomeDiagram._AbstractDraw: the .xcentre and .ycentre > > properties and their setters in the class AbstractDrawer > > Bio.Graphics.GenomeDiagram._Graph: The .centre property and its setter in > > the class GraphData > > Sounds fine - I can do those, > > Thanks Michiel, > > Peter > From w.arindrarto at gmail.com Sat Jul 13 06:58:23 2013 From: w.arindrarto at gmail.com (Wibowo Arindrarto) Date: Sat, 13 Jul 2013 08:58:23 +0200 Subject: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week? In-Reply-To: <1373680350.55044.YahooMailNeo@web164003.mail.gq1.yahoo.com> References: <1373620014.53407.YahooMailNeo@web164001.mail.gq1.yahoo.com> <1373680350.55044.YahooMailNeo@web164003.mail.gq1.yahoo.com> Message-ID: Hi Michiel, There are two classes from Bio.Blast.NCBIStandalone still being used by Bio.SearchIO internally (for the BLAST text parser): the BlastParser and the Iterator classes. The BlastParser class itself still relies on Bio.ParserSupport. Would it be ok if we move parts that are used by SearchIO into their own private classes in Bio.SearchIO, while putting the BiopythonDeprecationWarning on the current files?
Best regards, Bow On Sat, Jul 13, 2013 at 3:52 AM, Michiel de Hoon wrote: > The following pieces of code had a PendingDeprecationWarning in Biopython release 1.61, and can be upgraded to a BiopythonDeprecationWarning: > > Bio.Blast.NCBIStandalone (entire module). This module has had a PendingDeprecationWarning since September 2010. > > Bio.Motif (entire module). Its functionality is available from Bio.motifs, so Bio.Motif can be deprecated. > > Bio.ParserSupport (entire module). This module is currently only being used by Bio.Blast.NCBIStandalone, and has had a PendingDeprecationWarning since September 2011. > > Any final objections? > > Best, > -Michiel > _______________________________________________ > Biopython-dev mailing list > Biopython-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython-dev From mjldehoon at yahoo.com Sat Jul 13 10:54:07 2013 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Sat, 13 Jul 2013 03:54:07 -0700 (PDT) Subject: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week? In-Reply-To: References: <1373620014.53407.YahooMailNeo@web164001.mail.gq1.yahoo.com> <1373680350.55044.YahooMailNeo@web164003.mail.gq1.yahoo.com> Message-ID: <1373712847.72527.YahooMailNeo@web164004.mail.gq1.yahoo.com> Hi Bow, > Would it be ok if we move parts that are used by SearchIO into their own private classes in > Bio.SearchIO, while putting the BiopythonDeprecationWarning on the current files? That sounds fine to me. Any other opinions, anybody? Best, -Michiel. ________________________________ From: Wibowo Arindrarto To: Michiel de Hoon Cc: Peter Cock ; Eric Talevich ; Zheng Ruan ; Biopython-Dev Mailing List Sent: Saturday, July 13, 2013 3:58 PM Subject: Re: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week? Hi Michiel, There are two classes from Bio.Blast.NCBIStandalone still being used by Bio.SearchIO internally (for the BLAST text parser): the BlastParser and the Iterator classes. 
The BlastParser class itself still relies on Bio.ParserSupport. Would it be ok if we move parts that are used by SearchIO into their own private classes in Bio.SearchIO, while putting the BiopythonDeprecationWarning on the current files? Best regards, Bow On Sat, Jul 13, 2013 at 3:52 AM, Michiel de Hoon wrote: > The following pieces of code had a PendingDeprecationWarning in Biopython release 1.61, and can be upgraded to a BiopythonDeprecationWarning: > > Bio.Blast.NCBIStandalone (entire module). This module has had a PendingDeprecationWarning since September 2010. > > Bio.Motif (entire module). Its functionality is available from Bio.motifs, so Bio.Motif can be deprecated. > > Bio.ParserSupport (entire module). This module is currently only being used by Bio.Blast.NCBIStandalone, and has had a PendingDeprecationWarning since September 2011. > > Any final objections? > > Best, > -Michiel > _______________________________________________ > Biopython-dev mailing list > Biopython-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython-dev From zruan1991 at gmail.com Mon Jul 15 04:19:50 2013 From: zruan1991 at gmail.com (Zheng Ruan) Date: Mon, 15 Jul 2013 12:19:50 +0800 Subject: [Biopython-dev] Codon Alignment GSoC Project Update Message-ID: Hi all, I have an update of Codon Alignment project. It can be found at http://zruanweb.com/. My plan for the following three weeks is also there. Thanks! Best, Zheng Ruan From redmine at redmine.open-bio.org Mon Jul 15 09:30:20 2013 From: redmine at redmine.open-bio.org (redmine at redmine.open-bio.org) Date: Mon, 15 Jul 2013 09:30:20 +0000 Subject: [Biopython-dev] [Biopython - Bug #3441] (New) DSSP parser fails for some DSSP 2.1.0 output files Message-ID: Issue #3441 has been reported by Ahmet Sinan Yavuz. 
---------------------------------------- Bug #3441: DSSP parser fails for some DSSP 2.1.0 output files https://redmine.open-bio.org/issues/3441 Author: Ahmet Sinan Yavuz Status: New Priority: High Assignee: Category: Target version: URL: Some of the DSSP files created by mkdssp 2.1.0 starts with following header:
==== Secondary Structure Definition by the program DSSP, CMBI version by M.L. Hekkelman/2010-10-21 ==== DATE=2013-07-15        .
REFERENCE W. KABSCH AND C.SANDER, BIOPOLYMERS 22 (1983) 2577-2637                                                              .
                                                                                                                               .
  336  1  0  0  0 TOTAL NUMBER OF RESIDUES, NUMBER OF CHAINS, NUMBER OF SS-BRIDGES(TOTAL,INTRACHAIN,INTERCHAIN)                .
The following parsing code (make_dssp_dict function, line 121, the sl[1] part) fails on the 3rd line of the example above with "IndexError: list index out of range", as expected, because that line splits into a single field:
    try:
        start = 0
        keys = []
        for l in handle.readlines():
            sl = l.split()
            if sl[1] == "RESIDUE":
                # Start parsing from here
                start = 1
                continue
...

Potential temporary solution (only inspect sl[1] when the line has more than one field):
    try:
        start = 0
        keys = []
        for l in handle.readlines():
            sl = l.split()
            if len(sl) > 1:
                if sl[1] == "RESIDUE":
                    # Start parsing from here
                    start = 1
                    continue
...
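
The length check above can be tried in isolation with a small self-contained sketch. The helper name `find_residue_header` is hypothetical and is not part of Bio.PDB.DSSP; the last sample line is an assumed, abbreviated form of the DSSP column header, chosen only so that its second field is "RESIDUE" as the parser expects.

```python
def find_residue_header(lines):
    """Return the index of the DSSP column-header line, or -1 if absent."""
    for index, line in enumerate(lines):
        fields = line.split()
        # Filler lines such as "      ." split into a single field, so the
        # length must be checked before fields[1] is read.
        if len(fields) > 1 and fields[1] == "RESIDUE":
            return index
    return -1


sample = [
    "REFERENCE W. KABSCH AND C.SANDER, BIOPOLYMERS 22 (1983) 2577-2637      .",
    "                                                                       .",
    "  336  1  0  0  0 TOTAL NUMBER OF RESIDUES, NUMBER OF CHAINS          .",
    "  #  RESIDUE AA STRUCTURE BP1 BP2  ACC",  # assumed header layout
]
header_index = find_residue_header(sample)
print(header_index)
```

Without the `len(fields) > 1` guard, the second sample line would raise the IndexError reported in this issue.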
---------------------------------------- You have received this notification because this email was added to the New Issue Alert plugin -- You have received this notification because you have either subscribed to it, or are involved in it. To change your notification preferences, please click here and login: http://redmine.open-bio.org From p.j.a.cock at googlemail.com Mon Jul 15 13:02:14 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Mon, 15 Jul 2013 14:02:14 +0100 Subject: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week? In-Reply-To: <1373680350.55044.YahooMailNeo@web164003.mail.gq1.yahoo.com> References: <1373620014.53407.YahooMailNeo@web164001.mail.gq1.yahoo.com> <1373680350.55044.YahooMailNeo@web164003.mail.gq1.yahoo.com> Message-ID: On Sat, Jul 13, 2013 at 2:52 AM, Michiel de Hoon wrote: > The following pieces of code had a PendingDeprecationWarning in Biopython > release 1.61, and can be upgraded to a BiopythonDeprecationWarning: > > Bio.Motif (entire module). Its functionality is available from Bio.motifs, > so Bio.Motif can be deprecated. Done, https://github.com/biopython/biopython/commit/74fe3dd40c6f1f43032fa490a918abf052fd5c0e Peter From p.j.a.cock at googlemail.com Mon Jul 15 17:04:30 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Mon, 15 Jul 2013 18:04:30 +0100 Subject: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week? In-Reply-To: References: <1373620014.53407.YahooMailNeo@web164001.mail.gq1.yahoo.com> <1373680350.55044.YahooMailNeo@web164003.mail.gq1.yahoo.com> Message-ID: On Mon, Jul 15, 2013 at 2:02 PM, Peter Cock wrote: > On Sat, Jul 13, 2013 at 2:52 AM, Michiel de Hoon wrote: >> The following pieces of code had a PendingDeprecationWarning in Biopython >> release 1.61, and can be upgraded to a BiopythonDeprecationWarning: >> >> Bio.Motif (entire module). Its functionality is available from Bio.motifs, >> so Bio.Motif can be deprecated. 
> > Done, > https://github.com/biopython/biopython/commit/74fe3dd40c6f1f43032fa490a918abf052fd5c0e > > Peter I've started doing a Biopython 1.62 beta release now (before heading off to Berlin tomorrow for the CodeFest and BOSC), while I have access to the Windows machine to build the installers. Sorting out the BLAST deprecation warnings (and any required relocation of files) etc can happen once the beta is out in preparation for the final release tentatively next week (once I'm back in the office). Peter From p.j.a.cock at googlemail.com Mon Jul 15 17:29:44 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Mon, 15 Jul 2013 18:29:44 +0100 Subject: [Biopython-dev] Releasing (a beta of) Biopython 1.62 this week? In-Reply-To: References: <1373620014.53407.YahooMailNeo@web164001.mail.gq1.yahoo.com> <1373680350.55044.YahooMailNeo@web164003.mail.gq1.yahoo.com> Message-ID: On Mon, Jul 15, 2013 at 6:04 PM, Peter Cock wrote: > > I've started doing a Biopython 1.62 beta release now (before heading > off to Berlin tomorrow for the CodeFest and BOSC), while I have access > to the Windows machine to build the installers. > > Sorting out the BLAST deprecation warnings (and any required > relocation of files) etc can happen once the beta is out in preparation > for the final release tentatively next week (once I'm back in the office). > > Peter Beta release ready, this commit is tagged as biopython-162b, https://github.com/biopython/biopython/commit/76dbdba4ed791e69a480afb4382dd5865dd35dac Archives and Windows installers are live on biopython.org in the usual place http://biopython.org/DIST/ for sanity testing prior to an announcement on the main list etc. If some of you could cast your eyes over this in the next few hours that would be great. If someone wants to draft the email (and/or news post), even better. 
Thanks, Peter From yeyanbo289 at gmail.com Tue Jul 16 03:21:18 2013 From: yeyanbo289 at gmail.com (Yanbo Ye) Date: Tue, 16 Jul 2013 11:21:18 +0800 Subject: [Biopython-dev] GSOC 2013 Biopython.Phylo update 5 Message-ID: Hi all, I posted an update here . Thanks! Yanbo -- Yanbo Ye Bioinformatics Group, Wuhan Institute Of Virology, Chinese Academy of Sciences From p.j.a.cock at googlemail.com Tue Jul 16 09:37:04 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Tue, 16 Jul 2013 10:37:04 +0100 Subject: [Biopython-dev] Biopython 1.62 beta release Message-ID: Dear Biopythoneers, A beta release for Biopython 1.54 is now available for download and testing - note that I haven't done a fully detailed release announcement, we'll leave that for the official release: https://github.com/biopython/biopython/blob/master/NEWS Source distributions and Windows installers are available from the downloads page on the Biopython website. http://biopython.org/wiki/Download We are interested in getting feedback on the beta release as a whole, but especially on Python 3.3 support and the change to sub-feature handling in EMBL/GenBank parsing for joins. (At least) 22 people have contributed to this release (so far), which includes 11 new people: Alexander Campbell (first contribution) Andrea Rizzi (first contribution) Anthony Mathelier (first contribution) Ben Morris (first contribution) Brad Chapman Christian Brueffer David Arenillas (first contribution) David Martin (first contribution) Eric Talevich Iddo Friedberg Jian-Long Huang (first contribution) Joao Rodrigues Kai Blin Michiel de Hoon Nate Sutton (first contribution) Peter Cock Petra Kubincová (first contribution) Phillip Garland Saket Choudhary (first contribution) Tiago Antao Wibowo 'Bow' Arindrarto Xabier Bello (first contribution) Our thanks to them, and on behalf of the Biopython team, thank you for any feedback, bug reports, and contributions from trying this beta release. Regards, Peter P.S.
Biopython news is also on twitter: http://twitter.com/biopython From p.j.a.cock at googlemail.com Tue Jul 16 10:02:11 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Tue, 16 Jul 2013 11:02:11 +0100 Subject: [Biopython-dev] Biopython 1.62 beta release In-Reply-To: References: Message-ID: On Tue, Jul 16, 2013 at 10:37 AM, Peter Cock wrote: > Dear Biopythoneers, > > A beta release for Biopython 1.54 is now available for download > and testing Ahem. Biopython 1.62 beta, as per the title! Peter From heathmatlock at gmail.com Wed Jul 17 06:14:03 2013 From: heathmatlock at gmail.com (heathmatlock) Date: Wed, 17 Jul 2013 01:14:03 -0500 Subject: [Biopython-dev] No issues on Github Message-ID: I was looking for some open issues on Github, but I don't see any. Is biopython bug free with no roadmap of features needing assistance? :) -- Heath Matlock +1 256 274 4225 From p.j.a.cock at googlemail.com Wed Jul 17 10:03:54 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 17 Jul 2013 11:03:54 +0100 Subject: [Biopython-dev] No issues on Github In-Reply-To: References: Message-ID: On Wed, Jul 17, 2013 at 7:14 AM, heathmatlock wrote: > I was looking for some open issues on Github, but I don't see any. Is > biopython bug free with no roadmap of features needing assistance? :) Hi Heath, We're still using RedMine as our bug tracker, but moving the issues to GitHub seems quite appealing too: https://redmine.open-bio.org/projects/biopython (An updated SSL certificate is being organised, sorry about the current warning) Peter From p.j.a.cock at googlemail.com Wed Jul 17 16:53:56 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 17 Jul 2013 17:53:56 +0100 Subject: [Biopython-dev] Bugzilla --> RedMine --> GitHub issues? 
In-Reply-To: References: Message-ID: On Fri, Feb 1, 2013 at 5:48 PM, Wibowo Arindrarto wrote: >Peter wrote: >> Biopython used to use Bugzilla, at http://bugzilla.open-bio.org/ >> (it was left as a read only legacy listing, but it broke last year when >> the old server started to die and isn't really worth fixing). >> >> This was moved over to RedMine, along with all the other OBF >> projects. This does have some git integration, but I'm not that >> taken with it - and it is yet another service for the OBF team >> to maintain. >> >> What do people think of moving over to using GitHub issues? >> This would link in very well with pull requests and makes linking >> to commits much simpler too. One potential issue is if and how >> we could have bug reports sent to the biopython-dev mailing list >> (something we touched on recently for pull requests). >> >> A full automated move could be possible (NumPy did this), but I >> think a gradual move would be fine - stop filing new issues on >> RedMine and use GitHub issues in future. There are only about >> 100 issues open at the moment anyway, and a manual migration >> would also be a good way to review some of the older tickets. >> >> Thoughts?, > > Moving to GitHub sounds good to me. I'd prefer if we go over the > issues manually (removing the obsolete ones and keeping the current > ones). > > As per the bug reports sending to the mailing list, could we perhaps > create our own custom hooks? e.g. anytime a pull request is issued, an > email would be sent (see https://github.com/github/github-services and > http://developer.github.com/v3/repos/hooks/#create-a-hook) > > Regards, > Bow I just talked to Brad about this during the pre-BOSC 2013 CodeFest, and we agree that moving from RedMine to GitHub issues is a good move. BioRuby have already done this. If no one objects, I will enable filing issues on GitHub, update the wiki with links. 
It should be possible to disable filing new issues on RedMine, but leave it live for reference. https://redmine.open-bio.org/projects/biopython https://github.com/biopython/biopython/issues/ <-- not live yet We as a group should then manually review the ~100 open issues on RedMine, and file new issues on GitHub as appropriate. I think a manual review is a good idea anyway - there are some stale issues etc which need some fresh eyes. Regards, Peter From arklenna at gmail.com Wed Jul 17 17:06:48 2013 From: arklenna at gmail.com (Lenna Peterson) Date: Wed, 17 Jul 2013 13:06:48 -0400 Subject: [Biopython-dev] Bugzilla --> RedMine --> GitHub issues? In-Reply-To: References: Message-ID: Hi Peter, I'd be happy to take a look at some of the issues over the next few days. Cheers, Lenna On Wed, Jul 17, 2013 at 12:53 PM, Peter Cock wrote: > On Fri, Feb 1, 2013 at 5:48 PM, Wibowo Arindrarto > wrote: > >Peter wrote: > >> Biopython used to use Bugzilla, at http://bugzilla.open-bio.org/ > >> (it was left as a read only legacy listing, but it broke last year when > >> the old server started to die and isn't really worth fixing). > >> > >> This was moved over to RedMine, along with all the other OBF > >> projects. This does have some git integration, but I'm not that > >> taken with it - and it is yet another service for the OBF team > >> to maintain. > >> > >> What do people think of moving over to using GitHub issues? > >> This would link in very well with pull requests and makes linking > >> to commits much simpler too. One potential issue is if and how > >> we could have bug reports sent to the biopython-dev mailing list > >> (something we touched on recently for pull requests). > >> > >> A full automated move could be possible (NumPy did this), but I > >> think a gradual move would be fine - stop filing new issues on > >> RedMine and use GitHub issues in future. 
There are only about > >> 100 issues open at the moment anyway, and a manual migration > >> would also be a good way to review some of the older tickets. > >> > >> Thoughts?, > > > > Moving to GitHub sounds good to me. I'd prefer if we go over the > > issues manually (removing the obsolete ones and keeping the current > > ones). > > > > As per the bug reports sending to the mailing list, could we perhaps > > create our own custom hooks? e.g. anytime a pull request is issued, an > > email would be sent (see https://github.com/github/github-services and > > http://developer.github.com/v3/repos/hooks/#create-a-hook) > > > > Regards, > > Bow > > I just talked to Brad about this during the pre-BOSC 2013 CodeFest, > and we agree that moving from RedMine to GitHub issues is a good > move. BioRuby have already done this. > > If no one objects, I will enable filing issues on GitHub, update the > wiki with links. It should be possible to disable filing new issues > on RedMine, but leave it live for reference. > > https://redmine.open-bio.org/projects/biopython > https://github.com/biopython/biopython/issues/ <-- not live yet > > We as a group should then manually review the ~100 open issues > on RedMine, and file new issues on GitHub as appropriate. I think > a manual review is a good idea anyway - there are some stale > issues etc which need some fresh eyes. > > Regards, > > Peter > _______________________________________________ > Biopython-dev mailing list > Biopython-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython-dev > From p.j.a.cock at googlemail.com Wed Jul 17 17:09:20 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 17 Jul 2013 18:09:20 +0100 Subject: [Biopython-dev] Bugzilla --> RedMine --> GitHub issues? In-Reply-To: References: Message-ID: On Wed, Jul 17, 2013 at 6:06 PM, Lenna Peterson wrote: > Hi Peter, > > I'd be happy to take a look at some of the issues over the next few days. 
> > Cheers, > > Lenna That would be great - and reviewing is worthwhile in itself. Shall we set an aim of starting to use GitHub issues tomorrow? Peter From eric.talevich at gmail.com Wed Jul 17 19:36:30 2013 From: eric.talevich at gmail.com (Eric Talevich) Date: Wed, 17 Jul 2013 12:36:30 -0700 Subject: [Biopython-dev] Codon Alignment GSoC Project Update In-Reply-To: References: Message-ID: On Sun, Jul 14, 2013 at 9:19 PM, Zheng Ruan wrote: > Hi all, > > I have an update of Codon Alignment project. It can be found at > http://zruanweb.com/. My plan for the following three weeks is also > there. Thanks! > > Best, > Zheng Ruan > Hi Zheng, Nice work. Regarding future plans: - "Add Numpy slice for CodonAlignment" -- Peter voiced an interest in optionally using Numpy arrays for multiple sequence alignments in general. I suggest waiting to reach a consensus with Peter before implementing this feature for CodonAlignment specifically. - "Construct codon alignment based on tblastn result" -- tblastn is just a heuristic for fast local alignment; instead, you can use dynamic programming for pairwise alignments (e.g. Bio.pairwise2). You could translate the nucleotide sequence in 3 frames, do local pairwise alignment of the query protein sequence (ungapped) vs. each translated frame, then stitch the alignments together as best you can. It might help to generate lists of the offsets of each translated codon relative to the original nucleotide sequence, e.g. range(0, 3*(N//3)+1, 3); range(1, 3*(N//3)+2, 3); range(2, 3*(N//3)+3, 3). In this case the build() procedure has two distinct phases: Align the protein sequence to the nucleotide sequence optimally, then insert the gaps of the protein MSA into the codon sequences. - In your Week 2 diary, you mentioned having a minimum score as an option in the alignment function, but I don't see it in the code. I can think of a few reasonable versions of this.
Reasonable options might be mismatch_count and untranslated_region_count for, respectively, the number of codons that don't translate to the amino acid they're aligned to, and the number of skipped regions in the nucleotide sequence (presumably introns or UTRs in the input, although who knows what the user might want to do). If not specified by the user, the build() function should probably throw an error if those instances are encountered, rather than defaulting to some value. Scoring in the style of Exonerate seems unnecessarily open-ended. In your GSoC application, you mentioned a published method for alignment that might be relevant here. Did you determine that it wouldn't work here? Also see the Exonerate paper (http://www.biomedcentral.com/1471-2105/6/31), as their protein2genome alignment procedure does something similar to what you're attempting. Cheers, Eric From redmine at redmine.open-bio.org Wed Jul 17 22:48:15 2013 From: redmine at redmine.open-bio.org (redmine at redmine.open-bio.org) Date: Wed, 17 Jul 2013 22:48:15 +0000 Subject: [Biopython-dev] [Biopython - Bug #3444] (New) Missing DTD files Message-ID: Issue #3444 has been reported by Emanuil Tolev. ---------------------------------------- Bug #3444: Missing DTD files https://redmine.open-bio.org/issues/3444 Author: Emanuil Tolev Status: New Priority: Low Assignee: Category: Target version: URL: When running handle = Entrez.esearch(db="pubmed", term="Wellcome[GRNT]", retmax=100000) and then .efetch on all the 70k+ results ... I get warnings about the following missing DTD files: http://www.ncbi.nlm.nih.gov/corehtml/query/DTD/pubmed_130501.dtd http://www.ncbi.nlm.nih.gov/corehtml/query/DTD/nlmmedlinecitationset_130501.dtd http://www.ncbi.nlm.nih.gov/corehtml/query/DTD/bookdoc_130101.dtd I downloaded them locally, but the warnings say I should "file a bug" too to let you know.
---------------------------------------- From zruan1991 at gmail.com Thu Jul 18 10:07:31 2013 From: zruan1991 at gmail.com (Zheng Ruan) Date: Thu, 18 Jul 2013 18:07:31 +0800 Subject: [Biopython-dev] Codon Alignment GSoC Project Update In-Reply-To: References: Message-ID: Hi Eric, Thanks for the feedback. I finished implementing backward frameshift handling today. There are a few things I need to mention here. Regarding the future plans: there is already a NumPy-style slice implemented in MultipleSeqAlignment. I don't know why it doesn't work in CodonAlignment, as the __getitem__ method in CodonAlignment comes directly from MultipleSeqAlignment. But this is a small issue and will be fixed soon. I'm thinking about using tblastx output because it might be helpful for dealing with forward frameshifts (gaps) in translation. But it doesn't help when there are backward frameshifts, because some nucleotides will be used twice in a single translation (none of the alignment methods can detect this, as far as I know). Actually, the underlying algorithm in Bio.CodonAlign.build is almost the same as pal2nal's and is just the opposite of what you described. Instead of translating the nucleotide sequence in three reading frames, I back-translate the protein sequences into degenerate codon regular expressions and try to find a match between the back-translated regular expression and the given nucleotide sequence. Frameshift detection from scratch is difficult because my method is based on regular expressions and the search step is rather simple. pal2nal can deal with user-specified frameshifts but doesn't try to find them from raw sequences. My current code can handle up to 10 forward frameshifts (gaps) but only 1 backward frameshift in the sequence.
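[Editorial illustration] The back-translation idea described in this message can be sketched roughly as follows. This is only a minimal sketch and not the Bio.CodonAlign code: the names protein_to_regex and find_coding_region are hypothetical, the standard genetic code (NCBI table 1) is assumed, and there is no handling of mismatches or frameshifts.

```python
import re
from itertools import product

# Standard genetic code (NCBI table 1), written in the usual T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each amino acid (or "*" for stop) to the list of codons encoding it.
CODONS_FOR = {}
for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS):
    CODONS_FOR.setdefault(aa, []).append("".join(codon))


def protein_to_regex(protein):
    """Back-translate a protein into a regex matching every encoding DNA.

    Hypothetical helper: each residue becomes an alternation of its codons.
    Residues outside the standard code (e.g. "X") raise KeyError.
    """
    parts = ["(?:" + "|".join(CODONS_FOR[aa]) + ")" for aa in protein.upper()]
    return re.compile("".join(parts))


def find_coding_region(protein, nucleotides):
    """Return (start, end) of the first exact coding match, or None.

    Hypothetical helper: no mismatches or frameshifts are tolerated here.
    """
    match = protein_to_regex(protein).search(nucleotides.upper())
    return match.span() if match else None


if __name__ == "__main__":
    # ATG AAG GTT encodes M K V, starting at offset 2.
    print(find_coding_region("MKV", "ccATGAAGGTTtaa"))
    # Deleting one base breaks the reading frame, so nothing matches.
    print(find_coding_region("MKV", "CCATGAAGTTTAA"))
```

A single deleted base is enough to make the exact search fail, which is why frameshifts need separate handling on top of a search like this.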
I will add support for multiple backward frameshifts in the future. However, I don't expect this function to be able to handle every situation. For example, if two frameshift events happen too close together, it is really hard to figure this out. Are such frameshifts seen often in real biological data? Another question is about the actual usage of a shifted alignment. I am unaware of any statistical methods that account for this. Normally, when people know there is a frameshift, they have probably already figured out where it happens. Therefore, it's better to ask the user to tell the program where the frameshift lies. Functions to facilitate this step will be added. A scoring scheme is pending since the score needs to account for all situations (mismatches, frameshifts). I will add it once all the functions have been tested and shown to be correct. It is necessary because the mechanism for mismatch detection is very tolerant: in theory, it can align a protein sequence to an unrelated nucleotide sequence. Therefore a maximum tolerance should be set. As for MACSE, it employs a totally different strategy and is not optimal. I will have a look at Exonerate and its protein2genome procedure. Thanks! Best, Zheng Ruan On Thu, Jul 18, 2013 at 3:36 AM, Eric Talevich wrote: > On Sun, Jul 14, 2013 at 9:19 PM, Zheng Ruan wrote: > >> Hi all, >> >> I have an update of Codon Alignment project. It can be found at >> http://zruanweb.com/. My plan for the following three weeks is also >> there. Thanks! >> >> Best, >> Zheng Ruan >> > > Hi Zheng, > > Nice work. Regarding future plans: > > - "Add Numpy slice for CodonAlignment" -- Peter voiced an interested in > optionally using Numpy arrays for multiple sequence alignments in general. > I suggest waiting to reach a consensus with Peter before implementing this > feature for CodonAlignment specifically.
> > - "Construct codon alignment based on tblastn result" -- tblastn is just a > heuristic for fast local alignment; instead, you can use dynamic > programming for pairwise alignments (e.g. Bio.pairwise2). You could > translate the nucleotide sequence in 3 frames, do local pairwise alignment > of the query protein sequence (ungapped) vs. each translated frame, then > stitch the alignments together as best you can. It might help to generate > lists of the offsets of each translated codon relative to the original > nucleotide sequence, e.g. range(0, 3*(N//3)+1, 3); range(1, 3*(N//3)+2, 3); > range(2, 3*(N//3)+3, 3). In this case the build() procedure has two > distinct phases: Align the protein sequence to the nucleotide sequence > optimally, then insert the gaps of the protein MSA into the codon sequences. > - In your Week 2 diary, you mentioned having a minimum score as an option > in the alignment function, but I don't see it in the code. I can think of a > few reasonable versions of this. Reasonable options might be mismatch_count > and untranslated_region_count for the number of codons that don't translate > to the amino acid they're aligned to, and the number of skipped regions in > the nucleotide sequence (presumably introns or UTRs in the input, although > who knows what the user might want to do). If not specified by the user, > the build() function should probably throw an error if those instances are > encountered, rather than defaulting to some value. Scoring in the style of > Exonerate seems unnecessarily open-ended. > > In your GSoC application, you mentioned a published method for alignment > that might be relevant here. Did you determine that it wouldn't work here? > Also see the Exonerate (http://www.biomedcentral.com/1471-2105/6/31), as > their protein2genome alignment procedure does something similar to what > you're attempting. 
> > Cheers, > Eric > From p.j.a.cock at googlemail.com Thu Jul 18 12:45:16 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Thu, 18 Jul 2013 13:45:16 +0100 Subject: [Biopython-dev] Codon Alignment GSoC Project Update In-Reply-To: References: Message-ID: On Thu, Jul 18, 2013 at 11:07 AM, Zheng Ruan wrote: > Hi Eric, > > Thanks for the feedback. I finished implementing backward frameshift today. > There are a few things I need to mention here. That sounds good :) Would you mind posting your update emails (or a summary so far) on your blog too please? http://zr1991.blogspot.de Thanks, Peter From zruan1991 at gmail.com Mon Jul 22 02:55:50 2013 From: zruan1991 at gmail.com (Zheng Ruan) Date: Mon, 22 Jul 2013 10:55:50 +0800 Subject: [Biopython-dev] Weekly Update for Codon Alignment GSoC project Message-ID: Hi, I have posted last week's update for the project on my blog, along with my plan for next week. As the midterm evaluation deadline is approaching, I have also covered that in the post. Thanks for your comments and suggestions. Best, Zheng Ruan From yeyanbo289 at gmail.com Mon Jul 22 05:44:08 2013 From: yeyanbo289 at gmail.com (Yanbo Ye) Date: Mon, 22 Jul 2013 13:44:08 +0800 Subject: [Biopython-dev] GSOC weekly update 6 Message-ID: Hi all, I have posted an update for the Biopython.Phylo project here: http://blog.yeyanbo.com/posts/google-summer-of-code-6.html Thanks, Yanbo -- Yanbo Ye Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences 190 Kaiyuan Avenue, Science Park, Guangzhou, China Email: ye_yanbo at gibh.ac.cn Web: http://www.yeyanbo.com Phone: (86)-020-32093810 From p.j.a.cock at googlemail.com Mon Jul 22 11:43:58 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Mon, 22 Jul 2013 12:43:58 +0100 Subject: [Biopython-dev] Bugzilla --> RedMine --> GitHub issues?
In-Reply-To: References: Message-ID: On Wed, Jul 17, 2013 at 6:09 PM, Peter Cock wrote: > On Wed, Jul 17, 2013 at 6:06 PM, Lenna Peterson wrote: >> Hi Peter, >> >> I'd be happy to take a look at some of the issues over the next few days. >> >> Cheers, >> >> Lenna > > That would be great - and reviewing it worthy in itself. > > Shall we set an aim of starting to use GitHub issues tomorrow? > > Peter Well this isn't tomorrow - but I'm back from BOSC 2013 in Germany now. In the absence of any dissenting views, and the fact that RedMine is also offline right now (which I've raised with the OBF admin volunteers), I've enabled GitHub issues & linked to this from the main page: https://github.com/biopython/biopython/issues You'll notice there are already lots of issues there - all pull request related. This is one reason why an automated import of the old Bugzilla/RedMine issues could be complicated. Various other bits of our documentation will need to be updated... Peter From p.j.a.cock at googlemail.com Mon Jul 22 14:36:06 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Mon, 22 Jul 2013 15:36:06 +0100 Subject: [Biopython-dev] Bugzilla --> RedMine --> GitHub issues? In-Reply-To: References: Message-ID: On Mon, Jul 22, 2013 at 12:43 PM, Peter Cock wrote: > > Well this isn't tomorrow - but I'm back from BOSC 2013 in Germany now. > > In the absence of any dissenting views, and the fact that RedMine is > also offline right now (which I've raised with the OBF admin volunteers), Fixed again :) > I've enabled GitHub issues & linked to this from the main page: > > https://github.com/biopython/biopython/issues > > You'll notice there are already lots of issues there - all pull request > related. This is one reason why an automated import of the old > Bugzilla/RedMine issues could be complicated. > > Various other bits of our documentation will need to be updated... Hopefully done now, e.g. 
https://github.com/biopython/biopython/commit/e836f4fadde494a8253b4a4114a36ff3259eb079 Note that there doesn't seem to be a way to turn off new issues in a RedMine project - there are hacks via removing the ability from the roles, but I fear that would affect the other projects still using the RedMine server (e.g. BioPerl). Instead we may just have to do the triage/migration and then drop the links to the old RedMine server from the website etc. Peter From mjldehoon at yahoo.com Wed Jul 24 07:31:08 2013 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Wed, 24 Jul 2013 00:31:08 -0700 (PDT) Subject: [Biopython-dev] setuptools breaking biopython-1.62b installation Message-ID: <1374651068.98742.YahooMailNeo@web164005.mail.gq1.yahoo.com> Dear all, When trying to install Biopython on a new MacBook, I get the following error message when I run "python setup.py build": tkx330:biopython-1.62b mdehoon$ python setup.py build Traceback (most recent call last): File "setup.py", line 109, in <module> from setuptools import setup, Command File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/distribute-0.6.45-py2.7.egg/setuptools/__init__.py", line 2, in <module> from setuptools.extension import Extension, Library ... File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/distribute-0.6.45-py2.7.egg/pkg_resources.py", line 1211, in get_metadata return self._get(self._fn(self.egg_info,name)) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/distribute-0.6.45-py2.7.egg/pkg_resources.py", line 1326, in _get stream = open(path, 'rb') IOError: [Errno 13] Permission denied: '/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/python_dateutil-2.1-py2.7.egg/EGG-INFO/top_level.txt' This looks like a simple problem with file permissions, and I hope it can be solved easily.
Still, it is quite discouraging to first-time users of Biopython. Do we actually need setuptools? Looking at setup.py, it seems that distutils is sufficient for our needs. If so, let's remove the dependency on setuptools. Best, -Michiel. From p.j.a.cock at googlemail.com Wed Jul 24 09:13:06 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 24 Jul 2013 10:13:06 +0100 Subject: [Biopython-dev] Adopting BSD 3-Clause license for Biopython? In-Reply-To: References: Message-ID: Hello all, Something Brad and I chatted about during the BOSC 2013 CodeFest was should we switch the Biopython licence to something which is formally approved as "Open Source" by The Open Source Initiative (OSI): http://opensource.org/licenses The current Biopython License is very short and liberal, and I have long described it as an MIT/BSD type licence. However the actual wording matches neither of these exactly (as far as I could tell): http://biopython.org/DIST/LICENSE https://github.com/biopython/biopython/blob/master/LICENSE In theory we could ask the OSI to approve our current license, but as they explain "yet another license" is not a good thing to encourage: http://opensource.org/proliferation Brad and I thought it would be reasonable to adopt a standard MIT/BSD licence instead. Note that the following lack a "no endorsement" clause which we have currently: http://opensource.org/licenses/MIT http://opensource.org/licenses/BSD-2-Clause Therefore this looks like the closest match: http://opensource.org/licenses/BSD-3-Clause i.e. The BSD 3-Clause ("BSD New" or "BSD Simplified") license. This is also used by the NumPy project and many other Python libraries. Assuming people agree this is a good idea, we can start doing this on a file-by-file basis (checking for approval from the named copyright holders) and to be rigorous check with every named contributor in the CONTRIB or NEWS files. 
Peter From tiagoantao at gmail.com Wed Jul 24 09:23:08 2013 From: tiagoantao at gmail.com (Tiago Antão) Date: Wed, 24 Jul 2013 10:23:08 +0100 Subject: [Biopython-dev] Adopting BSD 3-Clause license for Biopython? In-Reply-To: References: Message-ID: +1 to getting rid of a nonstandard license. If BSD 3-clause is the closest, then I would change. This is irrespective of license preferences: a potentially unfruitful discussion would be around "the best Free/Open license". This is just getting under the umbrella of a standard, OSI-approved license. A great idea. On Wed, Jul 24, 2013 at 10:13 AM, Peter Cock wrote: > Hello all, > > Something Brad and I chatted about during the BOSC 2013 CodeFest > was should we switch the Biopython licence to something which is > formally approved as "Open Source" by The Open Source Initiative > (OSI): http://opensource.org/licenses > > The current Biopython License is very short and liberal, and I have > long described it as an MIT/BSD type licence. However the actual > wording matches neither of these exactly (as far as I could tell): > > http://biopython.org/DIST/LICENSE > https://github.com/biopython/biopython/blob/master/LICENSE > > In theory we could ask the OSI to approve our current license, but as > they explain "yet another license" is not a good thing to encourage: > http://opensource.org/proliferation > > Brad and I thought it would be reasonable to adopt a standard > MIT/BSD licence instead. > > Note that the following lack a "no endorsement" clause which we > have currently: > > http://opensource.org/licenses/MIT > http://opensource.org/licenses/BSD-2-Clause > > Therefore this looks like the closest match: > > http://opensource.org/licenses/BSD-3-Clause > > i.e. The BSD 3-Clause ("BSD New" or "BSD Simplified") license. > This is also used by the NumPy project and many other Python > libraries.
> > Assuming people agree this is a good idea, we can start doing > this on a file-by-file basis (checking for approval from the named > copyright holders) and to be rigorous check with every named > contributor in the CONTRIB or NEWS files. > > Peter > -- "Grant me chastity and continence, but not yet" - St Augustine From p.j.a.cock at googlemail.com Wed Jul 24 09:26:12 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 24 Jul 2013 10:26:12 +0100 Subject: [Biopython-dev] setuptools breaking biopython-1.62b installation In-Reply-To: <1374651068.98742.YahooMailNeo@web164005.mail.gq1.yahoo.com> References: <1374651068.98742.YahooMailNeo@web164005.mail.gq1.yahoo.com> Message-ID: On Wed, Jul 24, 2013 at 8:31 AM, Michiel de Hoon wrote: > Dear all, > > When trying to install Biopython on a new MacBook, I get the following error message when I run "python setup.py build": > > tkx330:biopython-1.62b mdehoon$ python setup.py build > Traceback (most recent call last): > File "setup.py", line 109, in <module> > from setuptools import setup, Command > File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/distribute-0.6.45-py2.7.egg/setuptools/__init__.py", line 2, in <module> > from setuptools.extension import Extension, Library > ...
> File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/distribute-0.6.45-py2.7.egg/pkg_resources.py", line 1211, in get_metadata > return self._get(self._fn(self.egg_info,name)) > File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/distribute-0.6.45-py2.7.egg/pkg_resources.py", line 1326, in _get > stream = open(path, 'rb') > IOError: [Errno 13] Permission denied: '/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/python_dateutil-2.1-py2.7.egg/EGG-INFO/top_level.txt' > > This looks like a simple problem with file permissions, and I hope can be solved easily. My guess would be there is something broken with your dateutil install; a little Google searching shows very similar issues with other packages. > Still, it is quite discouraging to first-time users of Biopython. Yes, but I'm not sure it is our fault :( > Do we actually need setuptools? > Looking at setup.py, it seems that distutils is sufficient for our needs. > If so, let's remove the dependency on setuptools. I will have to pass that question to Brad, Peter From p.j.a.cock at googlemail.com Wed Jul 24 09:31:02 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 24 Jul 2013 10:31:02 +0100 Subject: [Biopython-dev] Adopting BSD 3-Clause license for Biopython? In-Reply-To: References: Message-ID: On Wed, Jul 24, 2013 at 10:23 AM, Tiago Antão wrote: > +1 to getting rid of an unstandard license. If BSD 3-clause it is the > closest, then I would change. Having a few more eyes confirm this would be good. Anything very close makes the switch easier to justify. > This irrespective of license preferences: A potentially unfruitful > discussion would be around "the best Free/Open license". I really don't want to go down that route - the Python OSS community by and large use liberal licenses in the MIT/BSD family. The fact that NumPy uses the BSD 3-clause licence is a good standard to follow.
Brad said he prefers the MIT licence (and it is shorter). > This is just getting below the umbrella of a standard, OSI-approved license. > > A great idea. That's the idea - that and the fact that any non-standard license (even a nice open one) is one more barrier to adoption - especially in companies or institutes with lawyers that care about details. This was an issue which came up during the BOSC 2013 conference. Now since our current licence is short and simple, this isn't such an issue - but it is a small barrier all the same. This also makes life simpler for things like the PyPI license tagging and so on. Peter From mjldehoon at yahoo.com Wed Jul 24 09:32:45 2013 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Wed, 24 Jul 2013 02:32:45 -0700 (PDT) Subject: [Biopython-dev] setuptools breaking biopython-1.62b installation In-Reply-To: References: <1374651068.98742.YahooMailNeo@web164005.mail.gq1.yahoo.com> Message-ID: <1374658365.47986.YahooMailNeo@web164004.mail.gq1.yahoo.com> Hi Peter, > > Still, it is quite discouraging to first-time users of Biopython. > > Yes, but I'm not sure it is our fault :( Sure, I know it's not our fault. But still it's avoidable. Best, -Michiel From christian at brueffer.de Wed Jul 24 09:46:24 2013 From: christian at brueffer.de (Christian Brueffer) Date: Wed, 24 Jul 2013 11:46:24 +0200 Subject: [Biopython-dev] Adopting BSD 3-Clause license for Biopython? In-Reply-To: References: Message-ID: <51EFA270.2050903@brueffer.de> On 7/24/13 11:13 , Peter Cock wrote: [...] > > Therefore this looks like the closest match: > > http://opensource.org/licenses/BSD-3-Clause > > i.e. The BSD 3-Clause ("BSD New" or "BSD Simplified") license. > This is also used by the NumPy project and many other Python > libraries.
> > Assuming people agree this is a good idea, we can start doing > this on a file-by-file basis (checking for approval from the named > copyright holders) and to be rigorous check with every named > contributor in the CONTRIB or NEWS files. > I welcome this initiative and I agree that BSD 3-clause seems to be the closest match. Cheers, Chris From p.j.a.cock at googlemail.com Wed Jul 24 10:02:24 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 24 Jul 2013 11:02:24 +0100 Subject: [Biopython-dev] setuptools breaking biopython-1.62b installation In-Reply-To: <86a9lcl1nt.fsf@fastmail.fm> References: <1374651068.98742.YahooMailNeo@web164005.mail.gq1.yahoo.com> <86a9lcl1nt.fsf@fastmail.fm> Message-ID: On Wed, Jul 24, 2013 at 10:58 AM, Brad Chapman wrote: > > Peter and Michiel; > >>> Do we actually need setuptools? >>> Looking at setup.py, it seems that distutils is sufficient for our needs. >>> If so, let's remove the dependency on setuptools. > > We used setuptools/distribute to install dependencies, although > practically this doesn't work well since pip doesn't finish NumPy > installation before installing Biopython. So I'm fine with taking it out > if you want to simplify the setup and avoid the extra dependency. Sounds like a plan - but we should all test this change, especially users of pip, easy_install, virtualenv etc. Is it major enough to warrant a second beta? Peter From chapmanb at 50mail.com Wed Jul 24 09:58:46 2013 From: chapmanb at 50mail.com (Brad Chapman) Date: Wed, 24 Jul 2013 05:58:46 -0400 Subject: [Biopython-dev] setuptools breaking biopython-1.62b installation In-Reply-To: References: <1374651068.98742.YahooMailNeo@web164005.mail.gq1.yahoo.com> Message-ID: <86a9lcl1nt.fsf@fastmail.fm> Peter and Michiel; >> Do we actually need setuptools? >> Looking at setup.py, it seems that distutils is sufficient for our needs. >> If so, let's remove the dependency on setuptools.
We used setuptools/distribute to install dependencies, although practically this doesn't work well since pip doesn't finish NumPy installation before installing Biopython. So I'm fine with taking it out if you want to simplify the setup and avoid the extra dependency. Brad From mjldehoon at yahoo.com Wed Jul 24 10:51:20 2013 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Wed, 24 Jul 2013 03:51:20 -0700 (PDT) Subject: [Biopython-dev] setuptools breaking biopython-1.62b installation In-Reply-To: References: <1374651068.98742.YahooMailNeo@web164005.mail.gq1.yahoo.com> <86a9lcl1nt.fsf@fastmail.fm> Message-ID: <1374663080.82953.YahooMailNeo@web164003.mail.gq1.yahoo.com> Hi Peter, > It is major enough to warrant a second beta? I would assume that we won't need a second beta. If it does turn out that there are installation problems with Biopython 1.62, we can always release a Biopython 1.63. Best, -Michiel. From p.j.a.cock at googlemail.com Thu Jul 25 15:05:19 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Thu, 25 Jul 2013 16:05:19 +0100 Subject: [Biopython-dev] setuptools breaking biopython-1.62b installation In-Reply-To: References: <1374651068.98742.YahooMailNeo@web164005.mail.gq1.yahoo.com> <86a9lcl1nt.fsf@fastmail.fm> Message-ID: On Wed, Jul 24, 2013 at 11:02 AM, Peter Cock wrote: > On Wed, Jul 24, 2013 at 10:58 AM, Brad Chapman wrote: >> >> Peter and Michiel; >> >>>> Do we actually need setuptools? >>>> Looking at setup.py, it seems that distutils is sufficient for our needs. >>>> If so, let's remove the dependency on setuptools. >> >> We used setuptools/distribute to install dependencies, although >> practically this doesn't work well since pip doesn't finish NumPy >> installation before installing Biopython. So I'm fine with taking it out >> if you want to simplify the setup and avoid the extra dependency. > > Sounds like a plan - but we should all test this change, especially > users of PIP, easy_install, virtual env etc. 
>

So who's going to do the commit - Brad or Michiel?

Peter

From mjldehoon at yahoo.com Fri Jul 26 00:09:11 2013
From: mjldehoon at yahoo.com (Michiel de Hoon)
Date: Thu, 25 Jul 2013 17:09:11 -0700 (PDT)
Subject: [Biopython-dev] setuptools breaking biopython-1.62b installation
In-Reply-To:
References: <1374651068.98742.YahooMailNeo@web164005.mail.gq1.yahoo.com> <86a9lcl1nt.fsf@fastmail.fm>
Message-ID: <1374797351.81889.YahooMailNeo@web164002.mail.gq1.yahoo.com>

Brad, can you do it?

Best,
-Michiel.

________________________________
From: Peter Cock
To: Brad Chapman; Michiel de Hoon
Cc: "biopython-dev at biopython.org"
Sent: Friday, July 26, 2013 12:05 AM
Subject: Re: [Biopython-dev] setuptools breaking biopython-1.62b installation

On Wed, Jul 24, 2013 at 11:02 AM, Peter Cock wrote:
> On Wed, Jul 24, 2013 at 10:58 AM, Brad Chapman wrote:
>>
>> Peter and Michiel;
>>
>>>> Do we actually need setuptools?
>>>> Looking at setup.py, it seems that distutils is sufficient for our needs.
>>>> If so, let's remove the dependency on setuptools.
>>
>> We used setuptools/distribute to install dependencies, although
>> practically this doesn't work well since pip doesn't finish NumPy
>> installation before installing Biopython. So I'm fine with taking it out
>> if you want to simplify the setup and avoid the extra dependency.
>
> Sounds like a plan - but we should all test this change, especially
> users of pip, easy_install, virtualenv etc.
>

So who's going to do the commit - Brad or Michiel?
Peter

From yeyanbo289 at gmail.com Mon Jul 29 02:49:09 2013
From: yeyanbo289 at gmail.com (Yanbo Ye)
Date: Mon, 29 Jul 2013 10:49:09 +0800
Subject: [Biopython-dev] GSOC weekly update 7
Message-ID:

Hi all,

I have posted an update for the Biopython.Phylo project here:
http://blog.yeyanbo.com/posts/google-summer-of-code-7.html

Cheers,
Yanbo

--
Yanbo Ye
Guangzhou Institutes of Biomedicine and Health,
Chinese Academy of Sciences
190 Kaiyuan Avenue, Science Park, Guangzhou, China
Email: ye_yanbo at gibh.ac.cn
Web: http://www.yeyanbo.com
Phone: (86)-020-32093810

From zruan1991 at gmail.com Mon Jul 29 06:50:06 2013
From: zruan1991 at gmail.com (Zheng Ruan)
Date: Mon, 29 Jul 2013 02:50:06 -0400
Subject: [Biopython-dev] GSoC Update for Codon Alignment
Message-ID:

Hi all,

An update on the Codon Alignment GSoC project can be found at http://zruanweb.com/. Thanks!

Best,
Zheng Ruan

From eric.talevich at gmail.com Mon Jul 29 16:52:03 2013
From: eric.talevich at gmail.com (Eric Talevich)
Date: Mon, 29 Jul 2013 09:52:03 -0700
Subject: [Biopython-dev] Weekly Update for Codon Alignment GSoC project
In-Reply-To:
References:
Message-ID:

On Sun, Jul 21, 2013 at 7:55 PM, Zheng Ruan wrote:

> Hi,
>
> I post an update for the project last week in my blog as
> well as my plan next week. As the midterm evaluation deadline is
> approaching, I also include this into it. Thanks for your comments and
> suggestions.
>
> Best,
> Zheng Ruan
>

Hey Zheng,

Great progress so far. Your implementation of codon alignment up to this point looks more than adequate to me. A couple thoughts:

- Can your implementation detect inserts (forward frameshifts) of more than 3 nucleotides, as might be introduced by introns? Or just 1-2 bases?

- Same question for the backward frameshift implementation. One biological cause for these backward shifts is ribosomal slippage -- these are usually short, e.g.
1 base, so it is not urgent for your implementation to handle larger backward shifts if this would be more difficult.

(For drastic differences between protein and nucleotide sequences, a bioinformatician would normally use some variant of BLAST, exonerate, or another local alignment tool, rather than expecting this codon alignment algorithm to catch and handle every possibility.)

Have a safe trip home!

-Eric

From zruan1991 at gmail.com Tue Jul 30 02:58:01 2013
From: zruan1991 at gmail.com (Zheng Ruan)
Date: Tue, 30 Jul 2013 10:58:01 +0800
Subject: [Biopython-dev] Weekly Update for Codon Alignment GSoC project
In-Reply-To:
References:
Message-ID:

Thanks Eric,

It will not be difficult to support forward frameshifts of more than 3 nucleotides, so I will add this to my plan. Supporting backward frameshifts of more nucleotides is desirable but difficult. I will also consider adding some functions to help construct codon alignments with the help of other software.

Best,
Ruan

On Tue, Jul 30, 2013 at 12:52 AM, Eric Talevich wrote:

> On Sun, Jul 21, 2013 at 7:55 PM, Zheng Ruan wrote:
>
>> Hi,
>>
>> I post an update for the project last week in my blog as
>> well as my plan next week. As the midterm evaluation deadline is
>> approaching, I also include this into it. Thanks for your comments and
>> suggestions.
>>
>> Best,
>> Zheng Ruan
>>
>
> Hey Zheng,
>
> Great progress so far. Your implementation of codon alignment up to this
> point looks more than adequate to me. A couple thoughts:
>
> - Can your implementation detect inserts (forward frameshifts) of more
> than 3 nucleotides, as might be introduced by introns? Or just 1-2 bases?
>
> - Same question for the backward frameshift implementation. One biological
> cause for these backward shifts is ribosomal slippage -- these are usually
> short, e.g. 1 base, so it is not urgent for your implementation to handle
> larger backward shifts if this would be more difficult.
(For drastic
> differences between protein and nucleotide sequences, a bioinformatician
> would normally use some variant of BLAST, exonerate, or another local
> alignment tool, rather than expecting this codon alignment algorithm to
> catch and handle every possibility.)
>
> Have a safe trip home!
>
> -Eric
>

From ben at benfulton.net Wed Jul 31 00:43:31 2013
From: ben at benfulton.net (Ben Fulton)
Date: Tue, 30 Jul 2013 20:43:31 -0400
Subject: [Biopython-dev] 1.62b test coverage report
Message-ID:

I ran Ned Batchelder's coverage tool against the 1.62 beta code to see how much code is covered by tests. The overall total was 74% which is pretty respectable.

I ran the tests on a fairly fresh machine, which meant I had to install a lot of software, some of which I either didn't get installed properly, or the tests are out of date, or there were failures for some other reason. I ended up having to skip seven test files:

Dialign_Tool
EmbossPhylipNew
Mafft
PopGen_DFDist
PopGen_FDist
XXMotif
phyml

There were three tests I managed to get running but still had failures:

FastTree
NCBI_BLAST
Prank_tool

You can look at the report on my website at http://benfulton.net/BioPython162_Coverage/ . Please let me know if you have comments or questions, or can tell me what I did wrong on the above tests :)

From p.j.a.cock at googlemail.com Wed Jul 31 07:40:24 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 31 Jul 2013 08:40:24 +0100
Subject: [Biopython-dev] 1.62b test coverage report
In-Reply-To:
References:
Message-ID:

On Wednesday, July 31, 2013, Ben Fulton wrote:

> I ran Ned Batchelder's coverage tool against the 1.62 beta code to see how
> much code is covered by tests. The overall total was 74% which is pretty
> respectable.
>
> I ran the tests on a fairly fresh machine, which meant I had to install a
> lot of software, some of which I either didn't get installed properly, or
> the tests are out of date, or there were failures for some other reason.
I
> ended up having to skip seven test files:
>
> Dialign_Tool
> EmbossPhylipNew
> Mafft
> PopGen_DFDist
> PopGen_FDist
> XXMotif
> phyml

I'm pretty sure I have some or all of those set up on at least one of my test machines, so with a little more work together we can try to resolve those (which may mean updating the docs).

> There were three tests I managed to get running but still had failures:
>
> FastTree
> NCBI_BLAST
> Prank_tool

A few more details here would be very good - what versions of the tools did you have and what error did the tests give? (I just fixed a warning caused by new options added in BLAST 2.2.28+; the fix was committed yesterday.)

> You can look at the report on my website at
> http://benfulton.net/BioPython162_Coverage/ . Please let me know if you
> have comments or questions, or can tell me what I did wrong on the above
> tests :)
>

Thanks Ben - some of the modules with zero or low coverage are deprecated so don't worry me - others though probably do need to be looked at.

Would anyone like to make a priority list? This would then be something we can point volunteers to when they ask for suggestions on what to contribute.

Thanks,

Peter

From sharma409 at gmail.com Wed Jul 31 18:12:35 2013
From: sharma409 at gmail.com (Rishi Sharma)
Date: Wed, 31 Jul 2013 11:12:35 -0700
Subject: [Biopython-dev] Saving a Trie
Message-ID:

Hello,

I was wondering how I might write a Trie to file. It doesn't seem to have a write() method so pickling won't work. I'm not sure how the biopython save is intended to work, so I guess that is what I'm asking.

Thanks for your help,
Rishi Sharma

From arklenna at gmail.com Wed Jul 31 19:17:59 2013
From: arklenna at gmail.com (Lenna Peterson)
Date: Wed, 31 Jul 2013 15:17:59 -0400
Subject: [Biopython-dev] Saving a Trie
In-Reply-To:
References:
Message-ID:

I recall a bug report about this on redmine, but I can't find it or get the site to load at all (although downforeveryoneorjustme claims it's up).
I don't have experience using pickle on non-default objects, but I wasn't aware an object needed a specific method to be pickled. What error does it throw when you pickle.dump() it?

I did find a somewhat related SO question that suggests trie pickling could be a non-straightforward proposition:

http://stackoverflow.com/questions/2134706/hitting-maximum-recursion-depth-using-pythons-pickle-cpickle

Cheers,

Lenna

On Wed, Jul 31, 2013 at 2:12 PM, Rishi Sharma wrote:

> Hello,
>
> I was wondering how I might write a Trie to file. It doesn't seem to
> have a write() method so pickling won't work. I'm not sure how the
> biopython save is intended to work, so I guess that is what I'm asking.
>
> Thanks for your help,
> Rishi Sharma
> _______________________________________________
> Biopython-dev mailing list
> Biopython-dev at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython-dev
>

From sharma409 at gmail.com Wed Jul 31 19:20:49 2013
From: sharma409 at gmail.com (Rishi Sharma)
Date: Wed, 31 Jul 2013 12:20:49 -0700
Subject: [Biopython-dev] Saving a Trie
In-Reply-To:
References:
Message-ID:

TypeError: can't pickle trie objects

On Wed, Jul 31, 2013 at 12:17 PM, Lenna Peterson wrote:

> I recall a bug report about this on redmine, but I can't find it or get
> the site to load at all (although downforeveryoneorjustme claims it's up).
>
> I don't have experience using pickle on non-default objects, but I wasn't
> aware an object needed a specific method to be pickled. What error does it
> throw when you pickle.dump() it?
>
> I did find a somewhat related SO question that suggests trie pickling
> could be a non-straightforward proposition:
>
>
> http://stackoverflow.com/questions/2134706/hitting-maximum-recursion-depth-using-pythons-pickle-cpickle
>
> Cheers,
>
> Lenna
>
>
> On Wed, Jul 31, 2013 at 2:12 PM, Rishi Sharma wrote:
>
>> Hello,
>>
>> I was wondering how I might write a Trie to file.
It doesn't seem to
>> have a write() method so pickling won't work. I'm not sure how the
>> biopython save is intended to work, so I guess that is what I'm asking.
>>
>> Thanks for your help,
>> Rishi Sharma
>>
>
>

From p.j.a.cock at googlemail.com Wed Jul 31 21:59:21 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 31 Jul 2013 22:59:21 +0100
Subject: [Biopython-dev] Saving a Trie
In-Reply-To:
References:
Message-ID:

On Wednesday, July 31, 2013, Rishi Sharma wrote:

> Hello,
>
> I was wondering how I might write a Trie to file. It doesn't seem to
> have a write() method so pickling won't work. I'm not sure how the
> biopython save is intended to work, so I guess that is what I'm asking.
>
>

Hi Rishi,

You need to do something like this (untested - I'm not at a computer):

from Bio import trie
f = open("my-data.dat", "w")
tr = trie.trie()
# fill in the trie
# pass the trie object (tr), not the trie module
trie.save(f, tr)
f.close()

And to read it back,

from Bio import trie
f = open('my-data.dat', 'r')
tr = trie.load(f)
f.close()

Peter

From sharma409 at gmail.com Wed Jul 31 22:05:40 2013
From: sharma409 at gmail.com (Rishi Sharma)
Date: Wed, 31 Jul 2013 15:05:40 -0700
Subject: [Biopython-dev] Saving a Trie
In-Reply-To:
References:
Message-ID:

Ah yes this worked. I was doing something stupid by importing trie from Bio.trie and confusing myself between the module and the method. Thank you!

On Wed, Jul 31, 2013 at 2:59 PM, Peter Cock wrote:

>
> On Wednesday, July 31, 2013, Rishi Sharma wrote:
>
>> Hello,
>>
>> I was wondering how I might write a Trie to file. It doesn't seem to
>> have a write() method so pickling won't work. I'm not sure how the
>> biopython save is intended to work, so I guess that is what I'm asking.
>>
>>
> Hi Rishi,
>
> You need to do something like this (untested - I'm not at a computer):
>
> from Bio import trie
> f = open("my-data.dat", "w")
> tr = trie.trie()
> #fill in the trie
> trie.save(f, tr)
> f.close()
>
> And to read it back,
>
> from Bio import trie
> f = open('my-data.dat', 'r')
> tr = trie.load(f)
> f.close()
>
> Peter
>
>
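
For readers who do not have the compiled Bio.trie module to hand, the same save/load round trip discussed in this thread can be sketched in plain Python. This is only an illustration under stated assumptions: DictTrie, save and load below are hypothetical names, not Biopython's API, and the trie is a nested dict serialized with the standard pickle module rather than in Bio.trie's own file format.

```python
import io
import pickle


class DictTrie:
    """Hypothetical pure-Python trie; a stand-in, not the Bio.trie type."""

    # Multi-character marker, so it can never collide with a one-character edge.
    _VALUE = "<value>"

    def __init__(self):
        self.root = {}

    def __setitem__(self, key, value):
        node = self.root
        for char in key:
            node = node.setdefault(char, {})
        node[self._VALUE] = value

    def __getitem__(self, key):
        node = self.root
        for char in key:
            node = node[char]  # KeyError if the key was never stored
        return node[self._VALUE]


def save(handle, trie_obj):
    """Write the trie's nested dict to a binary file handle."""
    pickle.dump(trie_obj.root, handle)


def load(handle):
    """Rebuild a DictTrie from a handle previously written by save()."""
    trie_obj = DictTrie()
    trie_obj.root = pickle.load(handle)
    return trie_obj


# Round trip through an in-memory buffer; a file opened with "wb" / "rb"
# can be used in exactly the same way.
tr = DictTrie()
tr["ACGT"] = 1
tr["ACGA"] = 2

buffer = io.BytesIO()
save(buffer, tr)  # pass the instance (tr), not the class or a module
buffer.seek(0)
restored = load(buffer)
print(restored["ACGT"], restored["ACGA"])
```

Only the plain nested dict is pickled here, so loading does not depend on where the class is defined. Very long keys produce deeply nested dicts, which may run into the pickle recursion limit described in the Stack Overflow question linked earlier in the thread.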