From bugzilla-daemon at portal.open-bio.org Sat Oct 2 22:51:14 2010
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Sat, 2 Oct 2010 22:51:14 -0400
Subject: [Biopython-dev] [Bug 2608] Gcc "differ in signedness" warnings with
trie.c
In-Reply-To:
Message-ID: <201010030251.o932pEUM020278@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2608
mdehoon at ims.u-tokyo.ac.jp changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |FIXED
------- Comment #2 from mdehoon at ims.u-tokyo.ac.jp 2010-10-02 22:51 EST -------
The problem here was that strdup is not an ANSI C function, and its
implementation differs between platforms. Replacing strdup removes the
need for unsigned chars.
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From bugzilla-daemon at portal.open-bio.org Sun Oct 3 09:51:50 2010
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Sun, 3 Oct 2010 09:51:50 -0400
Subject: [Biopython-dev] [Bug 2938] Bio.Entrez.read() returns empty string
for HTML (not an error)
In-Reply-To:
Message-ID: <201010031351.o93DpoGZ023133@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2938
mdehoon at ims.u-tokyo.ac.jp changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |FIXED
------- Comment #8 from mdehoon at ims.u-tokyo.ac.jp 2010-10-03 09:51 EST -------
(In reply to comment #7)
> Does the current funny XML file have anything useful in it?
Yes, but I doubt that many people (if any) are using the Journals database. If
they do, we could make a straightforward parser for plain-text output from the
Journals database, which is supported by NCBI. See this discussion on the
mailing list:
http://lists.open-bio.org/pipermail/biopython-dev/2010-September/008239.html
To resolve this bug, I have modified the parser such that an error is raised
whenever the XML data do not start with the XML declaration (
References: <486264729.08793@eyou.net>
Message-ID:
On Tue, Oct 5, 2010 at 8:45 AM, Yong wrote:
> Hello everyone,
>
> I am testing a database and its web interface
> (http://pbl.neau.edu.cn:8080/) established with Plone4Bio, BioPython and
> BioSQL. When I query the database from the web page, it always returns the
> default date for the sequence: "01-JAN-1980".
>
> I found that the error happens here in file Bio::SeqIO::InsdcIO.py (lines
> 366-371) of BioPython:
>
>     def _get_date(self, record) :
>         default = "01-JAN-1980"
>         try :
>             date = record.annotations["date"]
>         except KeyError :
>             return default
>
> It looks like the record does not have a "date" key. Is this a bug in
> BioPython or Plone4Bio? Does anybody know how to solve it?
Hi
As I recall, reading/writing a GenBank file with Bio.SeqIO (note single
dot in Python, two colons is Perl - grin), the date is preserved. I think
the problem is in Biopython loading/retrieving a GenBank file in BioSQL,
and I thought there was a bug open on this...
I can probably suggest a hack in the Plone4Bio code, but it would
be better to tweak Biopython.
Peter
From bugzilla-daemon at portal.open-bio.org Tue Oct 5 05:16:05 2010
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 5 Oct 2010 05:16:05 -0400
Subject: [Biopython-dev] [Bug 2681] BioSQL: record annotations enhancements
In-Reply-To:
Message-ID: <201010050916.o959G53F031667@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2681
------- Comment #9 from biopython-bugzilla at maubp.freeserve.co.uk 2010-10-05 05:16 EST -------
(In reply to comment #4)
> (In reply to comment #2)
> > (In reply to comment #0)
> > > 1) Fixed date/dates typo.
> >
> > Why is it a typo? Change not checked in.
>
> The function _load_bioentry_date in Loader.py inserts the annotation 'date',
> if present, or the current date if not, into the bioentry_qualifier_value
> table. This is pulled by BioSeq.py _retrieve_qualifier_value and stored as
> the attribute 'dates'. Hence I considered line 307 in BioSeq.py to be a typo,
> which should be 'date' and not 'dates'. Also, because Loader.py handles dates
> separately, they should not be handled by the function load_annotations.
I'd forgotten about this issue - I was just reminded by a query on the
Plone4Bio mailing list. Yes, I think you are right:
http://github.com/biopython/biopython/commit/6aca2c0dbc17a172e76483d925248184080bb654
Thanks!
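The effect of the date/dates mismatch described above can be shown with a minimal, self-contained sketch. It uses a plain dictionary in place of record.annotations and a simplified stand-in for the _get_date lookup quoted on the list; the name get_date and the sample date are illustrative assumptions, not the Biopython code itself.

```python
DEFAULT_DATE = "01-JAN-1980"


def get_date(annotations):
    """Return the 'date' annotation, or the default if the key is missing.

    Simplified stand-in for the _get_date lookup quoted in this thread.
    """
    try:
        return annotations["date"]
    except KeyError:
        return DEFAULT_DATE


# Annotation stored under the wrong key ('dates'), as described in the bug.
stored_with_typo = {"dates": "05-OCT-2010"}
# Annotation stored under the key that the writer actually looks up.
stored_correctly = {"date": "05-OCT-2010"}

print(get_date(stored_with_typo))   # falls back to the default date
print(get_date(stored_correctly))   # returns the stored date
```

Because the lookup swallows the KeyError, a value stored under 'dates' is silently ignored, which matches the symptom of always seeing the default date.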
From biopython at maubp.freeserve.co.uk Tue Oct 5 05:18:16 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 5 Oct 2010 10:18:16 +0100
Subject: [Biopython-dev] [P4b] Always return default date: "01-JAN-1980"
In-Reply-To:
References: <486264729.08793@eyou.net>
Message-ID:
On Tue, Oct 5, 2010 at 10:00 AM, Peter wrote:
> On Tue, Oct 5, 2010 at 8:45 AM, Yong wrote:
>> Hello everyone,
>>
>> I am testing a database and its web interface
>> (http://pbl.neau.edu.cn:8080/) established with Plone4Bio, BioPython and
>> BioSQL. When I query the database from the web page, it always returns the
>> default date for the sequence: "01-JAN-1980".
>>
>> I found that the error happens here in file Bio::SeqIO::InsdcIO.py (lines
>> 366-371) of BioPython:
>>
>>     def _get_date(self, record) :
>>         default = "01-JAN-1980"
>>         try :
>>             date = record.annotations["date"]
>>         except KeyError :
>>             return default
>>
>> It looks like the record does not have a "date" key. Is this a bug in
>> BioPython or Plone4Bio? Does anybody know how to solve it?
>
> Hi
>
> As I recall, reading/writing a GenBank file with Bio.SeqIO (note single
> dot in Python, two colons is Perl - grin), the date is preserved. I think
> the problem is in Biopython loading/retrieving a GenBank file in BioSQL,
> and I thought there was a bug open on this...
>
> I can probably suggest a hack in the Plone4Bio code, but it would
> be better to tweak Biopython.
>
> Peter
Hi Yong,
I found the open Biopython bug report I was thinking of, and committed a fix:
http://bugzilla.open-bio.org/show_bug.cgi?id=2681#c9
Are you able to update your copy of Biopython to the latest source code
to test this fix?
Thanks,
Peter
From biopython at maubp.freeserve.co.uk Mon Oct 11 04:53:45 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Mon, 11 Oct 2010 09:53:45 +0100
Subject: [Biopython-dev] Continuous integration
In-Reply-To:
References:
Message-ID:
2010/9/28 Tiago Antão :
> Hi,
>
> I've been playing with buildbot a bit (for continuous integration
> stuff). I am creating a page on the wiki with some info on that front.
>
> This is just concept/exploratory stuff: if people don't like it, it is
> just a matter of deleting the page. Hopefully this will at least
> let us see whether continuous integration is worth the effort and
> whether buildbot is a good platform for Biopython.
>
> Any comments are most welcome. I expect to have a working
> prototype very soon. If people don't like it, I'll just trash it (no
> problems with that).
>
> Tiago
I see from your notes on the wiki you have been making
good progress. I have a couple of queries/ideas:
(1) Several of our tests go online to the NCBI or UniProt etc.
These tests can and do fail sometimes due to network issues.
Also, having some/many buildbot slaves running on a regular
basis (once a week? once a day?) would add up and this
load may be unwelcome. Perhaps we need to add an -offline
flag to run_tests.py which can skip any online tests?
(2) You mention buildbot doesn't have built in support for
spotting changes in a git repository - but can it do this for
SVN? Since github.com also allow access to the git repo
via svn that might be a more elegant workaround.
(3) Does the buildbot master require the buildbot slaves
be online most/all of the time? Would a desktop machine
which is typically only on during office hours on week
days still be useful? I could probably answer this myself
with a bit more background reading ;)
Thanks
Peter
From tiagoantao at gmail.com Mon Oct 11 08:21:01 2010
From: tiagoantao at gmail.com (=?ISO-8859-1?Q?Tiago_Ant=E3o?=)
Date: Mon, 11 Oct 2010 13:21:01 +0100
Subject: [Biopython-dev] Continuous integration
In-Reply-To:
References:
Message-ID:
Hi Peter,
2010/10/11 Peter :
> (1) Several of our tests go online to the NCBI or UniProt etc.
> These tests can and do fail sometimes due to network issues.
> Also, having some/many buildbot slaves running on a regular
> basis (once a week? once a day?) would add up and this
> load may be unwelcome. Perhaps we need to add an -offline
> flag to run_tests.py which can skip any online tests?
That might be a good idea (to have an --offline flag, I mean). A very
good idea, indeed.
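As a rough illustration of the --offline idea, here is a minimal sketch using only the standard library. It does not reflect how run_tests.py is actually organised; the helper names (build_parser, make_test_case, run) and the two toy tests are assumptions made for the example.

```python
import argparse
import unittest


def build_parser():
    """Command-line parser with a hypothetical --offline switch."""
    parser = argparse.ArgumentParser(description="Run the test suite.")
    parser.add_argument("--offline", action="store_true",
                        help="skip tests that need internet access")
    return parser


def make_test_case(offline):
    """Build a toy TestCase whose online test is skipped when offline is True."""

    class ToyTests(unittest.TestCase):
        @unittest.skipIf(offline, "internet access disabled with --offline")
        def test_remote_lookup(self):
            # A real test would contact a remote service here.
            self.assertTrue(True)

        def test_local_parsing(self):
            self.assertEqual("ACGT".lower(), "acgt")

    return ToyTests


def run(argv):
    """Parse argv, run the toy tests and return the unittest result object."""
    args = build_parser().parse_args(argv)
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(make_test_case(args.offline))
    result = unittest.TestResult()
    suite.run(result)
    return result


if __name__ == "__main__":
    outcome = run(["--offline"])
    print("skipped:", len(outcome.skipped))
```

Skipping (rather than silently dropping) the online tests keeps them visible in the report, so a builder that never runs them is easy to spot.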
I would like to put the infrastructure in place (if people are
interested in going ahead with this...), but after that we need to
stabilize a test policy and that will mean answering questions like
that.
As far as I see we will have many builders (tests under different
conditions). Say 5 different Python versions (Jython included), at
least 3 OSes. This is already 15 builders. This can easily creep up.
Though the numbers are high, it is quite easy to maintain all this
stuff: 3 volunteer machines (one for each OS) are enough. The cool
thing about buildbot is that it is designed for volunteer machines to
be added, so you can start your buildbot slave on your laptop when you
are idle. It does not need an array of servers on demand to produce
the tests.
NCBI and Uniprot might not like to see 30 daily connections for tests
:( . So we might need to have, say, one weekly test for each OS doing
the network stuff (just a single Python version per OS, maybe) and
dailies not doing network loads.
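The builder arithmetic above (5 Python versions times 3 OSes, plus one weekly networked run per OS) can be sketched as follows. The version and OS names are placeholders, since the thread does not name them, and the naming scheme for builders is an assumption.

```python
from itertools import product

# Placeholder names: the thread only says "5 Python versions, 3 OSes".
PYTHONS = ["python-a", "python-b", "python-c", "python-d", "jython"]
SYSTEMS = ["linux", "mac", "windows"]


def daily_builders():
    """One offline builder per (OS, Python) pair."""
    return ["%s-%s-offline" % (os_name, py)
            for os_name, py in product(SYSTEMS, PYTHONS)]


def weekly_builders(python="python-a"):
    """One online builder per OS, using a single Python version."""
    return ["%s-%s-online" % (os_name, python) for os_name in SYSTEMS]


print(len(daily_builders()), "daily builders without network access")
print(len(weekly_builders()), "weekly builders with network access")
```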
> (2) You mention buildbot doesn't have built in support for
> spotting changes in a git repository - but can it do this for
> SVN? Since github.com also allow access to the git repo
> via svn that might be a more elegant workaround.
There are 2 different things to consider:
1. Spotting changes in the git repository. There is no built-in support, but
this is TRIVIAL nonetheless with the general adaptor of buildbot. It works
like this:
a. a developer does a push
b. github has a hook system which allows for reporting a change
to the repository to a certain URL/CGI. Fully automated, transparent
to the developer.
c. We supply a CGI that receives the event and informs buildbot.
There are CGIs for github. We just have to stuff one in a webserver.
2. The slaves/builders have to download github code. In this case,
buildbot HAS NATIVE SUPPORT.
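Step (c) above, the piece that receives the hook event, might look roughly like the sketch below. It only covers decoding the request; how the result is handed to buildbot is left out. The layout assumed here (a form-encoded body with one 'payload' field holding JSON, with 'ref' and 'commits' entries) is an assumption for illustration and should be checked against the hook documentation. The function name extract_push is hypothetical.

```python
import json
from urllib.parse import parse_qs, urlencode


def extract_push(body):
    """Return (ref, [commit ids]) from a form-encoded hook request body.

    Assumes the body carries one 'payload' field holding a JSON document
    with 'ref' and 'commits' entries; both names are assumptions here.
    """
    fields = parse_qs(body)
    payload = json.loads(fields["payload"][0])
    commit_ids = [commit["id"] for commit in payload.get("commits", [])]
    return payload.get("ref"), commit_ids


# Made-up request body, built the same way the sketch expects to read it.
example_body = urlencode({
    "payload": json.dumps({
        "ref": "refs/heads/master",
        "commits": [{"id": "abc123"}, {"id": "def456"}],
    })
})

ref, commits = extract_push(example_body)
print(ref, commits)
```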
> (3) Does the buildbot master require the buildbot slaves
> be online most/all of the time? Would a desktop machine
> which is typically only on during office hours on week
> days still be useful? I could probably answer this myself
> with a bit more background reading ;)
That is one of the wonders of buildbot. Just the server needs to be online.
You can indeed have a desktop machine: Whenever it suits you better
you start your buildbot slave, it connects to the server to see if
there is work to do and the server supplies work to be done. The
server can be instructed to only allow the slave to do a single task
at a time (to avoid overloading the slave).
I am now at a stage where I really need a server to test (with a public
address). I would volunteer to do the installation myself, but I would
need shell access to a machine where I could run a server process. No
root access is needed, but a web server is not enough as buildbot is
twisted based. Maybe we can convince the OBF to help. Again, I can
volunteer to do the installation. No root access is needed, just the
ability to run a server process and a couple of open ports.
Tiago
From biopython at maubp.freeserve.co.uk Mon Oct 11 09:05:40 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Mon, 11 Oct 2010 14:05:40 +0100
Subject: [Biopython-dev] Continuous integration
In-Reply-To:
References:
Message-ID:
2010/10/11 Tiago Antão :
> Hi Peter,
>
> 2010/10/11 Peter :
>> (1) Several of our tests go online to the NCBI or UniProt etc.
>> These tests can and do fail sometimes due to network issues.
>> Also, having some/many buildbot slaves running on a regular
>> basis (once a week? once a day?) would add up and this
>> load may be unwelcome. Perhaps we need to add an -offline
>> flag to run_tests.py which can skip any online tests?
>
> That might be a good idea (to have an --offline flag, I mean). A very
> good idea, indeed.
>
> I would like to put the infrastructure in place (if people are
> interested in going ahead with this...), but after that we need to
> stabilize a test policy and that will mean answering questions like
> that.
>
> As far as I see we will have many builders (tests under different
> conditions). Say 5 different Python versions (Jython included), at
> least 3 OSes. This is already 15 builders. This can easily creep up.
> Though the numbers are high, it is quite easy to maintain all this
> stuff: 3 volunteer machines (one for each OS) are enough. The cool
> thing about buildbot is that it is designed for volunteer machines to
> be added, so you can start your buildbot slave on your laptop when you
> are idle. It does not need an array of servers on demand to produce
> the tests.
>
> NCBI and Uniprot might not like to see 30 daily connections for tests
> :( . So we might need to have, say, one weekly test for each OS doing
> the network stuff (just a single Python version per OS, maybe) and
> dailies not doing network loads.
Exactly.
>> (2) You mention buildbot doesn't have built in support for
>> spotting changes in a git repository - but can it do this for
>> SVN? Since github.com also allow access to the git repo
>> via svn that might be a more elegant workaround.
>
> There are 2 different things to consider:
> 1. Spotting the git repository. There is no builtin support, but this
> is TRIVIAL nonetheless with the general adaptor of buildbot. It works
> like this:
>     a. a developer does a push
>     b. github has a hook system which allows for reporting a change
> to the repository to a certain URL/CGI. Fully automated, transparent
> to the developer.
>     c. We supply a CGI that receives the event and informs buildbot.
> There are CGIs for github. We just have to stuff one in a webserver.
Do you even need a post-commit hook? Unless you want
to automatically run the tests after every commit (which might
be useful) wouldn't it be enough to do a daily checkout?
> 2. The slaves/builders have to download github code. In this case,
> buildbot HAS NATIVE SUPPORT.
Understood.
>> (3) Does the buildbot master require the buildbot slaves
>> be online most/all of the time? Would a desktop machine
>> which is typically only on during office hours on week
>> days still be useful? I could probably answer this myself
>> with a bit more background reading ;)
>
> That is one of the wonders of buildbot. Just the server needs to be online.
> You can indeed have a desktop machine: Whenever it suits you better
> you start your buildbot slave, it connects to the server to see if
> there is work to do and the server supplies work to be done. The
> server can be instructed to only allow the slave to do a single task
> at a time (to avoid overloading the slave).
Excellent. I guess the specifics of starting the buildbot slave
will be OS specific, thus it would be up to the machine owner
if this should happen automatically at login or not.
> I am now at a stage where I really need a server to test (with a public
> address). I would volunteer to do the installation myself, but I would
> need shell access to a machine where I could run a server process. No
> root access is needed, but a web server is not enough as buildbot is
> twisted based. Maybe we can convince the OBF to help. Again, I can
> volunteer to do the installation. No root access is needed, just the
> ability to run a server process and a couple of open ports.
I think we should have a word with the OBF as running this on
one of their servers does seem the best plan. I'll email you.
Peter
From tiagoantao at gmail.com Mon Oct 11 09:23:15 2010
From: tiagoantao at gmail.com (=?ISO-8859-1?Q?Tiago_Ant=E3o?=)
Date: Mon, 11 Oct 2010 14:23:15 +0100
Subject: [Biopython-dev] Continuous integration
In-Reply-To:
References:
Message-ID:
2010/10/11 Peter :
> Do you even need a post-commit hook? Unless you want
> to automatically run the tests after every commit (which might
> be useful) wouldn't it be enough to do a daily checkout?
I have actually been doing this for my tests: a daily checkout. So we
do not need the hook.
I would go with the simpler solution for now: ignore the post-commit
hook, get something useful working (maybe a nightly build) and in the
future we might revisit this when things are better understood and
tested.
From bugzilla-daemon at portal.open-bio.org Mon Oct 18 07:16:49 2010
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Mon, 18 Oct 2010 07:16:49 -0400
Subject: [Biopython-dev] [Bug 3146] New: DSSP ungraceful failure
Message-ID:
http://bugzilla.open-bio.org/show_bug.cgi?id=3146
Summary: DSSP ungraceful failure
Product: Biopython
Version: 1.53
Platform: PC
OS/Version: All
Status: NEW
Severity: minor
Priority: P4
Component: Main Distribution
AssignedTo: biopython-dev at biopython.org
ReportedBy: patrick.winters at gmail.com
The DSSP annotator should probably fail gracefully when the PDBParser and DSSP
disagree about the existence of a residue at a certain position. Here, DSSP
reports values for residue 115 of chain A, while the PDBParser throws a key
error.
from Bio.PDB import PDBParser
parser = PDBParser()
from Bio.PDB.DSSP import DSSP
structure=parser.get_structure("2p0i","pdb2p0i.ent")
model=structure[0]
dssp=DSSP(model, "pdb2p0i.ent")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/pymodules/python2.6/Bio/PDB/DSSP.py", line 175, in __init__
res=chain[res_id]
File "/usr/lib/pymodules/python2.6/Bio/PDB/Chain.py", line 71, in __getitem__
return Entity.__getitem__(self, id)
File "/usr/lib/pymodules/python2.6/Bio/PDB/Entity.py", line 38, in
__getitem__
return self.child_dict[id]
KeyError: (' ', 115, ' ')
>>> model['A'][114]
>>> model['A'][116]
>>> model['A'][115]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/pymodules/python2.6/Bio/PDB/Chain.py", line 71, in __getitem__
return Entity.__getitem__(self, id)
File "/usr/lib/pymodules/python2.6/Bio/PDB/Entity.py", line 38, in
__getitem__
return self.child_dict[id]
KeyError: (' ', 115, ' ')
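One way to fail gracefully in this situation is to catch the KeyError per residue and report the mismatches afterwards. The sketch below uses a plain dictionary as a stand-in for a parsed chain, so it runs without Bio.PDB; the function name collect_matching and the toy data are assumptions, not the DSSP module's code.

```python
def collect_matching(chain, dssp_keys):
    """Split reported residue ids into those present in the chain and those missing.

    chain is any mapping from residue id to residue; dssp_keys is an
    iterable of residue ids reported by the external program.
    """
    found, missing = {}, []
    for res_id in dssp_keys:
        try:
            found[res_id] = chain[res_id]
        except KeyError:
            # The parser has no residue at this position: record it and move on.
            missing.append(res_id)
    return found, missing


# Toy data standing in for a parsed chain with no residue 115.
chain = {(" ", 114, " "): "res114", (" ", 116, " "): "res116"}
reported = [(" ", 114, " "), (" ", 115, " "), (" ", 116, " ")]

found, missing = collect_matching(chain, reported)
print("missing residues:", missing)
```

Whether the caller should then warn, raise a more descriptive error, or skip the residue is a policy decision this sketch leaves open.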
From barwil at gmail.com Tue Oct 19 08:34:45 2010
From: barwil at gmail.com (Bartek Wilczynski)
Date: Tue, 19 Oct 2010 14:34:45 +0200
Subject: [Biopython-dev] Moving Bio.Motif documentation into Tutorial.tex
In-Reply-To:
References:
Message-ID:
Hi,
I've started to look into merging Bio.Motif docs with the Tutorial. I have a
few questions:
- First, I need to find a good place in the tutorial to put it.
  One possibility is to make a separate chapter for it; another option is
  to put it as a subchapter in chapter 15 (cookbook).
  I think it would be better to make it a separate chapter, similar to
  the ones discussing Bio.PopGen or Bio.Phylo, so I thought it would make
  sense to create it as a new chapter 13, entitled "Sequence motif analysis
  with Bio.Motif".
- Second, I have links and references to papers in there. The question is
  whether I should remove those to keep to the style of the tutorial.
Any thoughts are welcome.
Bartek
On Sat, Sep 18, 2010 at 3:04 PM, Bartek Wilczynski wrote:
> Hi,
>
> On Sat, Sep 18, 2010 at 2:25 PM, Peter wrote:
>
>> Hi Bartek,
>>
>> I think it would be good to try and move your Bio.Motif
>> documentation from file Docs/cookbook/motif/motif.tex
>> into the main Docs/Tutorial.tex as a new chapter.
>> Currently it isn't obvious that Biopython supports
>> things like a Position Weight Matrix (PWM).
>>
>> What do you think?
>>
>> The text will need a slight update since we have now
>> deprecated and removed Bio.AlignAce and Bio.MEME,
>> but that should be easy.
>>
>
> In general, I'm all for it. It's just that right now is not necessarily the
> best time for me to put much work into it. I'm trying to meet a RECOMB
> deadline of Oct. 8th with a paper, so if it would not be a problem, I could
> update it to the current state of the API after that. On the other hand, if
> there's anybody who wants to do it before then, I can review the changes
> even earlier.
>
> thanks for remembering about it.
>
> Bartek
>
--
Bartek Wilczynski
==================
Postdoctoral fellow
EMBL, Furlong group
Meyerhoffstrasse 1,
69012 Heidelberg,
Germany
tel: +49 6221 387 8433
From biopython at maubp.freeserve.co.uk Tue Oct 19 08:45:47 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 19 Oct 2010 13:45:47 +0100
Subject: [Biopython-dev] Moving Bio.Motif documentation into Tutorial.tex
In-Reply-To:
References:
Message-ID:
On Tue, Oct 19, 2010 at 1:34 PM, Bartek Wilczynski wrote:
> Hi,
>
> I've started to look into merging Bio.Motif docs with the Tutorial. I have a
> few questions:
> - First, I need to find a good place in the tutorial to put it.
>    One possibility is to make a separate chapter for it, another option is
> to put it as a subchapter in chapter 15 (cookbook).
>    I think it would be better to make it a separate chapter, similar to
> the ones discussing Bio.PopGen or Bio.Phylo, so I thought it would make
> sense to create it as a new chapter 13, entitled Sequence motif analysis
> with Bio.Motif
I agree, create a new chapter (and add yourself to the authors list).
I'd definitely put it before the "Cookbook Chapter", and between the
Phylogenetics and "Supervised learning methods" chapters seems
reasonable.
> -second, I have links and references to papers in there. The question would
> be should I remove those to keep to the style of the tutorial
Keep them - links to external webpages are fine - they work well in both PDF
and HTML. For references we currently don't have a formal bibliography - but
we do have some existing cases of links to papers already, e.g.
http://biopython.org/DIST/docs/tutorial/Tutorial.html#sec:SeqIO-fastq-conversion
Peter
From bugzilla-daemon at portal.open-bio.org Tue Oct 19 10:17:22 2010
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 19 Oct 2010 10:17:22 -0400
Subject: [Biopython-dev] [Bug 3026] Bio.SeqIO.InsdcIO._split_multi_line():
Your description cannot be broken into nice lines!
In-Reply-To:
Message-ID: <201010191417.o9JEHM0x029641@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=3026
biopython-bugzilla at maubp.freeserve.co.uk changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |FIXED
------- Comment #5 from biopython-bugzilla at maubp.freeserve.co.uk 2010-10-19 10:17 EST -------
(In reply to comment #4)
> I do not know what I would like to happen here in addition to the improved
> error message. Probably not get an error at all and have biopython able to
> cope with these cases as well. I have just asked asimpson at ludwig.org.br
> whether fix of the data in dbEST would be feasible.
The plain text GenBank file from the NCBI is fine (see comment 1), but the
HTML version is not. I don't think this is really a problem with the raw
data...
Anyway, I've just committed a fix which means Biopython will write an over
long line and issue a warning:
http://github.com/biopython/biopython/commit/f25ccef1e07129a377954021e08e980b82b6e795
Marking as fixed.
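The behaviour described above (write the over-long line anyway and warn) can be sketched as below. This is not the code from the commit; split_lines is a hypothetical helper that wraps greedily on spaces, and the URL in the usage example is a placeholder.

```python
import warnings


def split_lines(text, max_len):
    """Greedily wrap text on spaces; keep over-long words whole and warn."""
    lines, current = [], ""
    for word in text.split():
        if len(word) > max_len:
            # Cannot be broken nicely: emit it on its own line and warn.
            warnings.warn("Writing over-long line: %r" % word)
        candidate = word if not current else current + " " + word
        if current and len(candidate) > max_len:
            lines.append(current)
            current = word
        else:
            current = candidate
    if current:
        lines.append(current)
    return lines


for line in split_lines("see http://example.org/a/very/long/path now", 10):
    print(line)
```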
From biopython at maubp.freeserve.co.uk Tue Oct 19 11:54:43 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 19 Oct 2010 16:54:43 +0100
Subject: [Biopython-dev] Merging Uniprot XML parser?
Message-ID:
Hi all,
I've fixed a few issues I felt were holding up merging Andrea's UniProt
XML parser.
I've now tested that uniprot_sprot.txt and uniprot_sprot.xml are parsed
into more or less equivalent objects, and that these can be written out
as GenBank (well, GenPept) files or as EMBL/IMGT files (given recent
work to support protein EMBL files - which do exist but are rarely used).
This required "fixing" Bug 3026 to cope with long annotations that cannot
be line wrapped nicely (lots of long URL strings in UniProt XML comments).
http://bugzilla.open-bio.org/show_bug.cgi?id=3026
I'm tempted to remove the warning because it is so common... or make
it use the same text each time so you get warned once.
There are also some additions to the Bio.SeqFeature position classes,
since SwissProt/UniProt files can have uncertain positions.
Could someone take a look at the code here (a rebased branch), as I'd
like some independent testing (and better yet, code review):
http://github.com/peterjc/biopython/tree/uniprot
Thanks,
Peter
From eric.talevich at gmail.com Tue Oct 19 22:01:20 2010
From: eric.talevich at gmail.com (Eric Talevich)
Date: Tue, 19 Oct 2010 22:01:20 -0400
Subject: [Biopython-dev] Bio.PDB on Python 3
In-Reply-To:
References:
Message-ID:
On Mon, Aug 16, 2010 at 9:47 AM, Peter wrote:
> Hi all,
>
> A while back I installed NumPy from their svn under Python 3, so that I
> could test more of Biopython. I hadn't really looked at Bio.PDB until
> recently because test_PDB.py depended on Bio.KDTree which needs
> some C code to be compiled (which we haven't tried yet).
>
[...]
>
> This has revealed there are at least two issues with Bio.PDB to be
> addressed (see below).
>
[...]
>
> ======================================================================
> ERROR: test_ExposureCN (__main__.Exposure)
> HSExposureCN.
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "test_PDB.py", line 612, in setUp
>     structure=PDBParser(PERMISSIVE=True).get_structure('X', pdb_filename)
>   File "/home/xxx/lib/python3.1/site-packages/Bio/PDB/PDBParser.py",
> line 64, in get_structure
>     self._parse(file.readlines())
>   File "/home/xxx/lib/python3.1/site-packages/Bio/PDB/PDBParser.py",
> line 84, in _parse
>     self.trailer=self._parse_coordinates(coords_trailer)
>   File "/home/xxx/lib/python3.1/site-packages/Bio/PDB/PDBParser.py",
> line 200, in _parse_coordinates
>     fullname, serial_number, element)
>   File "/home/xxx/lib/python3.1/site-packages/Bio/PDB/StructureBuilder.py",
> line 185, in init_atom
>     duplicate_atom=residue[name]
> TypeError: 'DisorderedResidue' object is not subscriptable
>
These errors occur when parsing Tests/PDB/a_structure.pdb under
permissive mode. In this structure, residue 3 is disordered, and that
triggers some exciting things.
The bug seems to be related to this method of DisorderedEntityWrapper
in Bio/PDB/Entity.py:
    def __getattr__(self, method):
        "Forward the method call to the selected child."
        if not hasattr(self, 'selected_child'):
            # Avoid problems with pickling
            # Unpickling goes into infinite loop!
            raise AttributeError
        return getattr(self.selected_child, method)
When the test script reaches lines 185-186 in
StructureBuilder.py:
    if residue.has_id(name):
        duplicate_atom=residue[name]
it gets magical. The method 'has_id' is not defined on the
DisorderedResidue class. Instead, if residue is an instance of
DisorderedResidue (subclass of DisorderedEntityWrapper), instead of
Residue (subclass of Entity), then accessing residue.has_id on that
object calls __getattr__, which in turn calls
residue.selected_child.has_id(id).
The next line raises a TypeError in Python 3, but not in Python 2 --
residue[name] seems to find the appropriate __getitem__ implementation
in Python 2 only.
My hypothesis is that Python 2 treats this magic-method call to
residue.__getitem__ as an attribute access, allowing
DisorderedEntityWrapper.__getattr__ to forward this access to the
appropriate child, some Residue instance, which does implement
__getitem__. In Python 3, __getitem__-related syntax could be
implemented slightly differently, so it's not seen as a __getattr__
access and everything falls apart. (I could be wrong about all of
this.)
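The hypothesis is consistent with how Python 3 looks up special methods: for implicit operations such as residue[name], the lookup goes to the type and bypasses __getattr__, whereas old-style classes in Python 2 did consult __getattr__. A small self-contained sketch (run under Python 3) shows the effect; the class names are made up for the example and only mimic the forwarding wrapper discussed here.

```python
class ForwardingWrapper:
    """Forwards unknown attributes to selected_child, like the wrapper above."""

    def __init__(self, child):
        self.selected_child = child

    def __getattr__(self, name):
        if name == "selected_child":
            # Not set yet (for example during unpickling): avoid recursion.
            raise AttributeError(name)
        return getattr(self.selected_child, name)


class SubscriptableWrapper(ForwardingWrapper):
    """Same wrapper with an explicit __getitem__ defined on the class."""

    def __getitem__(self, key):
        return self.selected_child[key]


child = {"CA": "alpha carbon"}

plain = ForwardingWrapper(child)
print(plain.get("CA"))           # ordinary attribute access is forwarded
try:
    plain["CA"]                  # implicit __getitem__ lookup skips __getattr__
except TypeError as err:
    print("TypeError:", err)

fixed = SubscriptableWrapper(child)
print(fixed["CA"])               # works once __getitem__ lives on the class
```

The same reasoning applies to other implicit operations such as len(), iteration and arithmetic operators, each of which needs its special method defined on the wrapper class.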
So here's what I'm doing:
- In DisorderedEntityWrapper, implement __getitem__(self, id) such
that self.selected_child[id] is returned instead. This fixes most of
the errors but produces/uncovers three new ones. These new errors also
seem to indicate that magic methods on DisorderedEntityWrapper aren't
being handled through __getattr__ in Python 3.
- Fix the new errors.
I'll post the patch here before pushing it upstream once I get it working.
Best,
Eric
From eric.talevich at gmail.com Tue Oct 19 22:52:27 2010
From: eric.talevich at gmail.com (Eric Talevich)
Date: Tue, 19 Oct 2010 22:52:27 -0400
Subject: [Biopython-dev] Bio.PDB on Python 3
In-Reply-To:
References:
Message-ID:
On Tue, Oct 19, 2010 at 10:01 PM, Eric Talevich wrote:
> On Mon, Aug 16, 2010 at 9:47 AM, Peter wrote:
>> Hi all,
>>
>> A while back I installed NumPy from their svn under Python 3, so that I
>> could test more of Biopython. I hadn't really looked at Bio.PDB until
>> recently because test_PDB.py depended on Bio.KDTree which needs
>> some C code to be compiled (which we haven't tried yet).
>>
> [...]
>>
>> This has revealed there are at least two issues with Bio.PDB to be
>> addressed (see below).
>>
> [...]
>>
>> ======================================================================
>> ERROR: test_ExposureCN (__main__.Exposure)
>> HSExposureCN.
>> ----------------------------------------------------------------------
>> Traceback (most recent call last):
>>   File "test_PDB.py", line 612, in setUp
>>     structure=PDBParser(PERMISSIVE=True).get_structure('X', pdb_filename)
>>   File "/home/xxx/lib/python3.1/site-packages/Bio/PDB/PDBParser.py",
>> line 64, in get_structure
>>     self._parse(file.readlines())
>>   File "/home/xxx/lib/python3.1/site-packages/Bio/PDB/PDBParser.py",
>> line 84, in _parse
>>     self.trailer=self._parse_coordinates(coords_trailer)
>>   File "/home/xxx/lib/python3.1/site-packages/Bio/PDB/PDBParser.py",
>> line 200, in _parse_coordinates
>>     fullname, serial_number, element)
>>   File "/home/xxx/lib/python3.1/site-packages/Bio/PDB/StructureBuilder.py",
>> line 185, in init_atom
>>     duplicate_atom=residue[name]
>> TypeError: 'DisorderedResidue' object is not subscriptable
>>
>
[...]
>
> So here's what I'm doing:
>  - In DisorderedEntityWrapper, implement __getitem__(self, id) such
> that self.selected_child[id] is returned instead. This fixes most of
> the errors but produces/uncovers three new ones. These new errors also
> seem to indicate that magic methods on DisorderedEntityWrapper aren't
> being handled through __getattr__ in Python 3.
>  - Fix the new errors.
>
>
> I'll post the patch here before pushing it upstream once I get it working.
As if we didn't have a better mechanism for this... here's a patch
that seems to work on both Pythons.
-Eric
diff --git a/Bio/PDB/Entity.py b/Bio/PDB/Entity.py
index ed17308..af2fcc7 100644
--- a/Bio/PDB/Entity.py
+++ b/Bio/PDB/Entity.py
@@ -165,10 +165,27 @@ class DisorderedEntityWrapper:
             raise AttributeError
         return getattr(self.selected_child, method)
+    def __getitem__(self, id):
+        "Return the child with the given id."
+        return self.selected_child[id]
+
     def __setitem__(self, id, child):
         "Add a child, associated with a certain id."
         self.child_dict[id]=child
+    def __iter__(self):
+        "Iterate over the children."
+        return iter(self.selected_child)
+
+    def __len__(self):
+        "Return the number of children."
+        return len(self.selected_child)
+
+    def __sub__(self, other):
+        """Subtraction with another object."""
+        return self.selected_child - other
+
+
     # Public methods
     def get_id(self):
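The behaviour discussed in this thread is consistent with how new-style classes (the only kind in Python 3) resolve special methods: implicit operations such as obj[key] look the method up on the type, so an instance-level __getattr__ is never consulted. The following is a minimal sketch of that effect and of the fix; the class names Child, ForwardOnly and ForwardWithGetitem are made up for illustration and are not part of Bio.PDB.

```python
class Child(object):
    """Stands in for the selected child of a wrapper (hypothetical)."""

    def __init__(self):
        self.items = {"CA": "atom CA"}

    def __getitem__(self, key):
        return self.items[key]


class ForwardOnly(object):
    """Forwards unknown attributes, but defines no special methods."""

    def __init__(self, child):
        self.selected_child = child

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        return getattr(self.selected_child, name)


class ForwardWithGetitem(ForwardOnly):
    """Same wrapper, with __getitem__ defined on the class itself."""

    def __getitem__(self, key):
        return self.selected_child[key]


wrapper = ForwardOnly(Child())

# Explicit attribute access still goes through __getattr__ ...
via_attribute = wrapper.__getitem__("CA")

# ... but the subscript syntax looks on the type and fails.
try:
    wrapper["CA"]
    subscript_failed = False
except TypeError:
    subscript_failed = True

# Defining the special method on the class makes subscripting work.
fixed = ForwardWithGetitem(Child())
via_subscript = fixed["CA"]
```

The same reasoning would apply to __iter__, __len__ and __sub__, which is presumably why the patch defines each of them explicitly.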
From bugzilla-daemon at portal.open-bio.org Wed Oct 20 02:22:56 2010
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 20 Oct 2010 02:22:56 -0400
Subject: [Biopython-dev] [Bug 3147] New: AlignIO.parse doesn't raise
StopIteration on empty files
Message-ID:
http://bugzilla.open-bio.org/show_bug.cgi?id=3147
Summary: AlignIO.parse doesn't raise StopIteration on empty files
Product: Biopython
Version: 1.55
Platform: PC
OS/Version: All
Status: NEW
Severity: normal
Priority: P2
Component: Main Distribution
AssignedTo: biopython-dev at biopython.org
ReportedBy: mdehoon at ims.u-tokyo.ac.jp
For example:
$ rm -rf test.aln
$ touch test.aln
$ python
Python 2.7 (r27:82500, Jul 6 2010, 13:27:45)
[GCC 4.3.4 20090804 (release) 1] on cygwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from Bio import AlignIO
>>> records = AlignIO.parse(open("test.aln"), 'clustal')
>>> records.next()
>>>
From bugzilla-daemon at portal.open-bio.org Wed Oct 20 05:12:06 2010
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 20 Oct 2010 05:12:06 -0400
Subject: [Biopython-dev] [Bug 3147] AlignIO.parse doesn't raise
StopIteration on empty files
In-Reply-To:
Message-ID: <201010200912.o9K9C6og005150@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=3147
------- Comment #1 from biopython-bugzilla at maubp.freeserve.co.uk 2010-10-20 05:12 EST -------
In this case you are getting back None - which may have been allowed back on
Python 2.2, see also: http://docs.python.org/release/2.4/lib/typeiter.html
I'm used to iterators either returning None or raising StopIteration at the
end of the elements - but quite often I've had to write code like this:
    while True:
        try:
            record = i.next()
        except StopIteration:
            record = None
        if record is None:
            break
        ...
The above documentation implies it would be correct to expect a StopIteration
exception here.
This also applies to some of the Bio.SeqIO parsers too I'm sure, and
potentially other parsers in Biopython.
To identify most issues we can just change test_SeqIO.py and test_AlignIO.py
to check for the exception...
Peter
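As a minimal sketch of the behaviour the iterator protocol expects, here is a toy generator-based parser; parse_blocks and its one-record-per-line format are invented for illustration and are not the AlignIO code. A generator that simply falls off the end raises StopIteration by itself, including on empty input.

```python
import io


def parse_blocks(handle):
    """Yield one record per non-blank line (hypothetical toy format)."""
    for line in handle:
        line = line.strip()
        if line:
            yield line


# Empty input: the first call to next() should raise StopIteration
# rather than return None.
records = parse_blocks(io.StringIO(u""))
try:
    next(records)
    raised = False
except StopIteration:
    raised = True

# for loops and list() consume StopIteration themselves, so callers
# usually do not need a try/except or a "record is None" check.
all_records = list(parse_blocks(io.StringIO(u"a\n\nb\n")))
```

The built-in next(iterator) is used here instead of iterator.next() because it works on Python 2.6+ and Python 3.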
From bugzilla-daemon at portal.open-bio.org Wed Oct 20 06:03:27 2010
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 20 Oct 2010 06:03:27 -0400
Subject: [Biopython-dev] [Bug 3147] AlignIO.parse doesn't raise
StopIteration on empty files
In-Reply-To:
Message-ID: <201010201003.o9KA3RY5006926@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=3147
biopython-bugzilla at maubp.freeserve.co.uk changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |FIXED
------- Comment #2 from biopython-bugzilla at maubp.freeserve.co.uk 2010-10-20 06:03 EST -------
Added test and fixed for AlignIO,
http://github.com/biopython/biopython/commit/208d926d8e2e706a8bd5d0eee215a26c0457946c
Added test for SeqIO (passes already),
http://github.com/biopython/biopython/commit/246bd426094ecba9943aba2f58da8f3b7cc4a5f5
From biopython at maubp.freeserve.co.uk Wed Oct 20 06:45:33 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Wed, 20 Oct 2010 11:45:33 +0100
Subject: [Biopython-dev] Bio.PDB on Python 3
In-Reply-To:
References:
Message-ID:
On Wed, Oct 20, 2010 at 3:52 AM, Eric Talevich wrote:
>
> As if we didn't have a better mechanism for this... here's a patch
> that seems to work on both Pythons.
> -Eric
Ha ha.
Thanks for digging into this - I thought it was going to be complicated,
and it sounds like it was.
The patch works fine for me - please check it into the master.
Cheers,
Peter
From eric.talevich at gmail.com Wed Oct 20 22:41:19 2010
From: eric.talevich at gmail.com (Eric Talevich)
Date: Wed, 20 Oct 2010 22:41:19 -0400
Subject: [Biopython-dev] DendroPy is now BSD-licensed (was: [Nexml-discuss]
NexML schema question)
Message-ID:
Folks,
About two months ago, Jeet Sukumaran mentioned on the NeXML-discuss mailing
list that he would be willing to relicense DendroPy, an excellent
phylogenetics library for Python, from GPL to the more permissive BSD
license.
And shortly thereafter, he did:
http://github.com/jeetsukumaran/DendroPy/commit/d3a91621fb62b37c311a462cae150772dd735771
For those just tuning in, DendroPy supports tree I/O in the NeXML format,
but not phyloXML; Biopython supports phyloXML but not NeXML. Since the
licenses are now compatible, we could probably make good use of Jeet's NeXML
parsing code at the very least. Unfortunately, as evidenced by my two-month
delay, I don't really have the leeway to do the integration myself this
semester. But here's a heads-up anyway.
Relatedly, Jaime Huerta Cepas (author of ETE, a Python Environment for Tree
Exploration) indicated interest in generating phyloXML and NeXML parsers
from XSD schemas -- another one to keep an eye out for.
Regards,
Eric
---------- Forwarded message ----------
From: Jeet Sukumaran
Date: Wed, Sep 22, 2010 at 12:51 PM
Subject: Re: [Nexml-discuss] NexML schema question
To: Eric Talevich
Cc: Jaime Huerta Cepas , "NeXML-discuss (list)" <
nexml-discuss at lists.sourceforge.net>
Hi Eric,
Neither Mark nor I have any objections to releasing the DendroPy code to the
Biopython library under the BSD license. Not sure what legalities are
involved beyond saying "go for it", but if that's all it takes then "go for
it!".
-- jeet
On 9/22/10 10:49 AM, Eric Talevich wrote:
> On Sep 22, 2010, at 9:13 AM, Jaime Huerta Cepas wrote:
>
>>
>> all I know is that Eric Talevich (the person who wrote the phyloXML parser
>> in biopython) seems to be working on this, as claimed in the biopython wiki.
> >> But I don't think it is ready yet.
>>
>
> I took a crack at NeXML parsing a while ago, but it's nowhere near
> ready, and I don't expect to be able to work on it again for several
> more months.
>
> If you're looking for a currently usable library for working with
> NeXML (I didn't catch the rest of this discussion), DendroPy is nice.
> Its internal representation of tree objects isn't the same as
> Biopython's Bio.Phylo, and it's GPL, so we can't just plug it directly
> into Biopython (which uses a more permissive BSD-style license). But
> serializing a tree to Newick from Biopython and then parsing the
> Newick string in DendroPy, or the reverse, would give you some basic
> interoperability.
>
>
>> What I think is that XSD schemas could be automatically parsed to
>> generate parsers :) This would allow us to have a comprehensive and up to
>> date parser for the NexML schema that everyone can use.
>>
>
> Sure! PhyloXML is defined by an XSD schema, too. With this approach,
> would it be possible for parsed phyloXML and NeXML tree objects to
> share a base class, so the same methods are available on each?
>
> -Eric
>
>
From mjldehoon at yahoo.com Sat Oct 23 05:19:26 2010
From: mjldehoon at yahoo.com (Michiel de Hoon)
Date: Sat, 23 Oct 2010 02:19:26 -0700 (PDT)
Subject: [Biopython-dev] Tracking DTD files in Bio.Entrez
Message-ID: <893615.35060.qm@web62401.mail.re1.yahoo.com>
Hi everybody,
As you may know, the parser for XML data generated by NCBI in Bio.Entrez makes use of DTD files (from NCBI) to correctly interpret the XML data. Most (if not all) DTD files are included in the Biopython distribution in Bio/Entrez/DTDs, but particularly when NCBI updates their DTD files it may happen that a required DTD file is missing. I have now modified the parser so that it tracks the URL of DTD files, so that it can access DTDs over the internet if they are not available locally.
Still, parsing local DTD files is much faster than retrieving a remote DTD file, so when a DTD file is missing the parser will show a warning with the missing DTD, the URL where it can be found, and which directory it should be saved in (which typically is something like /usr/local/lib/python2.7/site-packages/Bio/Entrez/DTDs).
For users who do not have write permission to this directory, it may be good to also allow storing these files in the user's home directory, for example in ~/.biopython/Bio/Entrez/DTDs. If we start using such a directory, we could also consider automatically retrieving DTD files and saving them in that directory, without asking the user to do that manually.
I guess it's a trade-off between convenience for the user (if we download and save DTDs automatically), and transparency (we would be saving files in the user's home directory without him/her being aware of it).
Any opinions? Is this a good idea?
-Michiel.
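One way the lookup order described above could work is sketched below. This is only an illustration under stated assumptions: locate_dtd, resolve_dtd, the example.org URLs and the temporary directories are all hypothetical and do not reflect the actual Bio.Entrez parser code.

```python
import os
import tempfile


def locate_dtd(filename, search_dirs):
    """Return the first existing path for filename, or None."""
    for directory in search_dirs:
        path = os.path.join(directory, filename)
        if os.path.isfile(path):
            return path
    return None


def resolve_dtd(filename, url, search_dirs):
    """Prefer a local copy; otherwise report the remote URL to use."""
    path = locate_dtd(filename, search_dirs)
    if path is not None:
        return ("local", path)
    # A real parser would warn here and name the directory to save into.
    return ("remote", url)


# Stand-ins for the package DTD directory and a per-user directory.
package_dir = tempfile.mkdtemp()
user_dir = tempfile.mkdtemp()
with open(os.path.join(user_dir, "example.dtd"), "w") as handle:
    handle.write("<!-- placeholder DTD -->\n")

search_dirs = [package_dir, user_dir]
found = resolve_dtd("example.dtd", "http://example.org/example.dtd", search_dirs)
missing = resolve_dtd("other.dtd", "http://example.org/other.dtd", search_dirs)
```

Searching the package directory first and the user directory second keeps the shipped DTDs authoritative while still letting users without write permission add missing files.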
From biopython at maubp.freeserve.co.uk Mon Oct 25 17:28:24 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Mon, 25 Oct 2010 22:28:24 +0100
Subject: [Biopython-dev] [Biopython] Getting involved
In-Reply-To:
References:
Message-ID:
On Mon, Oct 25, 2010 at 9:32 PM, Dragoslav Zaric wrote:
> Dear Peter,
>
> I think that this:
>
> "Can you program in C and are you familiar with the C/Python
> API? We will need to look at porting our C code from Python 2
> to Python 3, and this is quite complicated."
>
> is best idea for start. I can code in C, and have experience
> both with python 2.7 and 3. Will read tomorrow about C/Python
> API.
>
> Kind regards
Hi Dragoslav,
I'm glad you sound enthusiastic, and I hope you can make
some progress...
Our plan (following what the NumPy project are doing) is
to have a single code base targeting Python 2.x.
All the Python code is automatically converted using the
2to3 script into Python 3. There are a few special cases,
but that work is mostly done now.
All the C code will need to use #ifdef statements to make
the same C file work on both Python 2 and Python 3. The
bad news is that the basic API for writing C extension
modules for Python has changed.
What I suggest you do first, is make sure you can get
the latest Biopython source code from git, compile it
under Python 2, and run the unit tests. Then try 2to3
and running the tests under Python 3 (see the README
file).
Next I would try updating one of the smaller C
modules in Biopython to work on Python 3. You'll
need to edit our setup.py to compile what you are
working on (currently we compile none of the C
code on Python 3). I don't yet have a feel for how
much work this will be.
Please sign up to the biopython-dev mailing list where
we can discuss things in more detail. The main list is
more for user support and general discussion.
Thanks, and good luck!
Peter
From zaricdragoslav at gmail.com Mon Oct 25 19:34:29 2010
From: zaricdragoslav at gmail.com (Dragoslav Zaric)
Date: Tue, 26 Oct 2010 03:34:29 +0400
Subject: [Biopython-dev] [Biopython] Getting involved
In-Reply-To:
References:
Message-ID:
Dear Peter,
I have subscribed to biopython-dev mailing list and I have downloaded
source code with git.
kind regards
On Tue, Oct 26, 2010 at 1:28 AM, Peter wrote:
> On Mon, Oct 25, 2010 at 9:32 PM, Dragoslav Zaric wrote:
>> Dear Peter,
>>
>> I think that this:
>>
>> "Can you program in C and are you familiar with the C/Python
>> API? We will need to look at porting our C code from Python 2
>> to Python 3, and this is quite complicated."
>>
>> is best idea for start. I can code in C, and have experience
>> both with python 2.7 and 3. Will read tomorrow about C/Python
>> API.
>>
>> Kind regards
>
> Hi Dragoslav,
>
> I'm glad you sound enthusiastic, and I hope you can make
> some progress...
>
> Our plan (following what the NumPy project are doing) is
> to have a single code base targeting Python 2.x.
>
> All the Python code is automatically converted using the
> 2to3 script into Python 3. There are a few special cases,
> but that work is mostly done now.
>
> All the C code will need to use #ifdef statements to make
> the same C file work on both Python 2 and Python 3. The
> bad news is that the basic API for writing C extension
> modules for Python has changed.
>
> What I suggest you do first, is make sure you can get
> the latest Biopython source code from git, compile it
> under Python 2, and run the unit tests. Then try 2to3
> and running the tests under Python 3 (see the README
> file).
>
> Next I would try updating one of the smaller C
> modules in Biopython to work on Python 3. You'll
> need to edit our setup.py to compile what you are
> working on (currently we compile none of the C
> code on Python 3). I don't yet have a feel for how
> much work this will be.
>
> Please sign up to the biopython-dev mailing list where
> we can discuss things in more detail. The main list is
> more for user support and general discussion.
>
> Thanks, and good luck!
>
> Peter
>
--
Dragoslav Zaric
Professional Programmer
MSc Astrophysics
From zaricdragoslav at gmail.com Tue Oct 26 03:24:50 2010
From: zaricdragoslav at gmail.com (Dragoslav Zaric)
Date: Tue, 26 Oct 2010 11:24:50 +0400
Subject: [Biopython-dev] Plan for upgrade
Message-ID:
Dear Peter,
This is what I found on python 3 web pages:
----------------------------------------------------------------------------------------------------------------
Porting To Python 3.0
For porting existing Python 2.5 or 2.6 source code to Python 3.0, the
best strategy is the following:
0. (Prerequisite:) Start with excellent test coverage.
1. Port to Python 2.6. This should be no more work than the average port
   from Python 2.x to Python 2.(x+1). Make sure all your tests pass.
2. (Still using 2.6:) Turn on the -3 command line switch. This enables
   warnings about features that will be removed (or change) in 3.0. Run
   your test suite again, and fix code that you get warnings about until
   there are no warnings left, and all your tests still pass.
3. Run the 2to3 source-to-source translator over your source code tree.
   (See 2to3 - Automated Python 2 to 3 code translation for more on this
   tool.) Run the result of the translation under Python 3.0. Manually
   fix up any remaining issues, fixing problems until all tests pass
   again.
It is not recommended to try to write source code that runs unchanged
under both Python 2.6 and 3.0; you'd have to use a very contorted
coding style, e.g. avoiding print statements, metaclasses, and much
more. If you are maintaining a library that needs to support both
Python 2.6 and Python 3.0, the best approach is to modify step 3 above
by editing the 2.6 version of the source code and running the 2to3
translator again, rather than editing the 3.0 version of the source
code.
----------------------------------------------------------------------------------------------------------------
And this is page for 2to3 translator:
http://docs.python.org/release/3.0.1/library/2to3.html#to3-reference
So can we start to agree on approach and tactics.
Kind regards
--
Dragoslav Zaric
Professional Programmer
MSc Astrophysics
From zaricdragoslav at gmail.com Tue Oct 26 03:34:31 2010
From: zaricdragoslav at gmail.com (Dragoslav Zaric)
Date: Tue, 26 Oct 2010 11:34:31 +0400
Subject: [Biopython-dev] Changed in C API
Message-ID:
Dear Peter,
The list of changes in Python 3 is not complete. This is current list:
------------------------------------------------------------------------------------------------
Due to time constraints, here is a very incomplete list of changes to the C API.
- Support for several platforms was dropped, including but not limited
  to Mac OS 9, BeOS, RISCOS, Irix, and Tru64.
- PEP 3118: New Buffer API.
- PEP 3121: Extension Module Initialization & Finalization.
- PEP 3123: Making PyObject_HEAD conform to standard C.
- No more C API support for restricted execution.
- PyNumber_Coerce, PyNumber_CoerceEx, PyMember_Get, and PyMember_Set C
  APIs are removed.
- New C API PyImport_ImportModuleNoBlock, works like
  PyImport_ImportModule but won't block on the import lock (returning an
  error instead).
- Renamed the boolean conversion C-level slot and method: nb_nonzero is
  now nb_bool.
- Removed METH_OLDARGS and WITH_CYCLE_GC from the C API.
------------------------------------------------------------------------------------------------
Can you tell me exactly which versions we are converting between,
from 2.7 to 3.0.1?
This is also what I have read on python 3 web site:
------------------------------------------------------------------------------------------------
The net result of the 3.0 generalizations is that Python 3.0 runs the
pystone benchmark around 10% slower than Python 2.5. Most likely the
biggest cause is the removal of special-casing for small integers.
There's room for improvement, but it will happen after 3.0 is
released!
------------------------------------------------------------------------------------------------
This means that Python 3 is still not optimized or, like all software,
it starts to get worse with new versions :)
Kind regards
--
Dragoslav Zaric
Professional Programmer
MSc Astrophysics
From biopython at maubp.freeserve.co.uk Tue Oct 26 04:43:48 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 26 Oct 2010 09:43:48 +0100
Subject: [Biopython-dev] Plan for upgrade
In-Reply-To:
References:
Message-ID:
On Tue, Oct 26, 2010 at 8:24 AM, Dragoslav Zaric
wrote:
> Dear Peter,
>
> This is what I found on python 3 web pages:
> ----------------------------------------------------------------------------------------------------------------
> Porting To Python 3.0
> For porting existing Python 2.5 or 2.6 source code to Python 3.0, the
> best strategy is the following:
>
> (Prerequisite:) Start with excellent test coverage.
> Port to Python 2.6. This should be no more work than the average port
> from Python 2.x to Python 2.(x+1). Make sure all your tests pass.
> (Still using 2.6:) Turn on the -3 command line switch. This enables
> warnings about features that will be removed (or change) in 3.0. Run
> your test suite again, and fix code that you get warnings about until
> there are no warnings left, and all your tests still pass.
> Run the 2to3 source-to-source translator over your source code tree.
> (See 2to3 - Automated Python 2 to 3 code translation for more on this
> tool.) Run the result of the translation under Python 3.0. Manually
> fix up any remaining issues, fixing problems until all tests pass
> again.
> It is not recommended to try to write source code that runs unchanged
> under both Python 2.6 and 3.0; you'd have to use a very contorted
> coding style, e.g. avoiding print statements, metaclasses, and much
> more. If you are maintaining a library that needs to support both
> Python 2.6 and Python 3.0, the best approach is to modify step 3 above
> by editing the 2.6 version of the source code and running the 2to3
> translator again, rather than editing the 3.0 version of the source
> code.
> ----------------------------------------------------------------------------------------------------------------
>
> And this is page for 2to3 translator:
>
> http://docs.python.org/release/3.0.1/library/2to3.html#to3-reference
>
> So can we start to agree on approach and tactics.
>
> Kind regards
Hi Dragoslav,
Yes, that is basically what we are doing for the pure Python code.
We still write our code for Python 2.x (currently Python 2.4 to 2.7),
and then use 2to3 to convert it to work on Python 3.x (currently
testing on 3.1; at the end of the year we'll be trying the planned
Python 3.2 beta as well). That is the easy part - it's the C code
we need to handle now for our extension modules (and the 2to3
script does not do this). Perhaps I was too concise earlier.
http://lists.open-bio.org/pipermail/biopython-dev/2010-October/008311.html
Peter
From biopython at maubp.freeserve.co.uk Tue Oct 26 04:47:58 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 26 Oct 2010 09:47:58 +0100
Subject: [Biopython-dev] Changed in C API
In-Reply-To:
References:
Message-ID:
On Tue, Oct 26, 2010 at 8:34 AM, Dragoslav Zaric
wrote:
> Dear Peter,
>
> The list of changes in Python 3 is not complete. This is current list:
> ...
>
> Can you tell me what are exactly versions that we are converting, from
> 2.7 to 3.0.1 ??
We currently support Python 2.4 to 2.7 (but plan to drop support
for Python 2.4 soon). We've been testing on Python 3.1 and expect
to support later versions as they are released.
I personally don't really care about Python 3.0 (it would be nice if
that works too, but it is not essential).
> This means that python 3 is still no optimized or like all software
> start to be worse with new versions :)
Python 3.1 is already out and is faster than Python 3.0. Some things
are still slower than Python 2 though, in particular we've noticed this
for parsing since by default Python 3 uses unicode instead of byte
strings.
Peter
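To illustrate the bytes versus unicode distinction mentioned above, here is a small sketch using in-memory handles; the FASTA-like sample data is made up. It only shows the type difference between binary and text handles and makes no claim about how large the speed difference is.

```python
import io

raw = b">seq1\nACGT\n"

# Binary handle: lines come back as byte strings, with no decoding step.
byte_lines = io.BytesIO(raw).readlines()

# Text handle: every line is decoded into a unicode string, which is
# what a file opened in default text mode gives you on Python 3.
text_handle = io.TextIOWrapper(io.BytesIO(raw), encoding="ascii")
text_lines = text_handle.readlines()
```

A parser written against byte strings on Python 2 therefore sees a different type, and pays for decoding, once its input is opened in text mode on Python 3.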
From zaricdragoslav at gmail.com Tue Oct 26 05:11:32 2010
From: zaricdragoslav at gmail.com (Dragoslav Zaric)
Date: Tue, 26 Oct 2010 13:11:32 +0400
Subject: [Biopython-dev] Changed in C API
In-Reply-To:
References:
Message-ID:
Ok Peter,
First, it is my mistake that I was talking about the Python code upgrade.
I understand that you want me to do the C code upgrade, but in the end
everything should work together, so this is why I looked at the overall
upgrade process.
I have downloaded the latest Biopython source code and searched for .c
and .h files, and this is what I have found:
Bio\Cluster\cluster.c
Bio\Cluster\clustermodule.c
Bio\cMarkovModelmodule.c
Bio\cpairwise2module.c
Bio\csupport.c
Bio\KDTree\KDTree.c
Bio\KDTree\KDTreemodule.c
Bio\Motif\_pwm.c
Bio\Nexus\cnexus.c
Bio\PDB\mmCIF\lex.yy.c
Bio\PDB\mmCIF\mmcif_test.c
Bio\PDB\mmCIF\MMCIFlexmodule.c
Bio\trie.c
Bio\triemodule.c
Bio\Cluster\cluster.h
Bio\csupport.h
Bio\KDTree\KDTree.h
Bio\KDTree\Neighbor.h
Bio\trie.h
Are these all the files you want me to upgrade to Python 3.1?
Kind regards
On Tue, Oct 26, 2010 at 12:47 PM, Peter wrote:
> On Tue, Oct 26, 2010 at 8:34 AM, Dragoslav Zaric
> wrote:
>> Dear Peter,
>>
>> The list of changes in Python 3 is not complete. This is current list:
>> ...
>>
>> Can you tell me what are exactly versions that we are converting, from
>> 2.7 to 3.0.1 ??
>
> We currently support Python 2.4 to 2.7 (but plan to drop support
> for Python 2.4 soon). We've been testing on Python 3.1 and expect
> to support later versions as they are released.
>
> I personally don't really care about Python 3.0 (it would be nice if
> that works too, but it is not essential).
>
>> This means that python 3 is still no optimized or like all software
>> start to be worse with new versions :)
>
> Python 3.1 is already out and is faster than Python 3.0. Some things
> are still slower than Python 2 though, in particular we've noticed this
> for parsing since by default Python 3 uses unicode instead of byte
> strings.
>
> Peter
>
--
Dragoslav Zaric
Professional Programmer
MSc Astrophysics
From biopython at maubp.freeserve.co.uk Tue Oct 26 05:47:21 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 26 Oct 2010 10:47:21 +0100
Subject: [Biopython-dev] Changed in C API
In-Reply-To:
References:
Message-ID:
On Tue, Oct 26, 2010 at 10:11 AM, Dragoslav Zaric
wrote:
> Ok Peter,
>
> First it is my mistake that I was talking about python code upgrade.
> I understand that you want me to do C code upgrade, but at the
> end all should work together so this is why I looked at overall upgrade
> process.
>
> I have downloaded latest python source code and I have searched
> for .c files and .h files and this is what I have found:
>
> Bio\Cluster\cluster.c
> Bio\Cluster\clustermodule.c
> Bio\cMarkovModelmodule.c
> Bio\cpairwise2module.c
> Bio\csupport.c
> Bio\KDTree\KDTree.c
> Bio\KDTree\KDTreemodule.c
> Bio\Motif\_pwm.c
> Bio\Nexus\cnexus.c
> Bio\PDB\mmCIF\lex.yy.c
> Bio\PDB\mmCIF\mmcif_test.c
> Bio\PDB\mmCIF\MMCIFlexmodule.c
> Bio\trie.c
> Bio\triemodule.c
>
> Bio\Cluster\cluster.h
> Bio\csupport.h
> Bio\KDTree\KDTree.h
> Bio\KDTree\Neighbor.h
> Bio\trie.h
What OS are you using? From the slashes I'd guess
Windows (which may complicate things - getting the
compilers all setup is more work).
>
> Are these all files you want me to upgrade to python 3.1 ?
>
Yes - but not all of them are equally important, and some
will be more complicated to port.
For example, the Nexus, MarkovModelmodule and
cMarkovModelmodule C code have a Python fallback
(i.e. the C code is not essential, just faster).
Some of those (e.g. Bio.Cluster and Bio.KDTree) depend
on NumPy, which may make things more complicated.
You will need to install NumPy (for both Python 2 and 3).
Some may have string encoding issues (bytes vs unicode),
e.g. Nexus, Motif
The mmCIF module is not urgent. This is a file parser for
the Bio.PDB code, and we have discussed replacing this
in C. One reason for this is it currently depends on the
3rd party library flex.
I think Bio/Motif/_pwm.c would be a good module to start
with. It is a short simple module exposing a single
function to Python.
You should read this:
http://wiki.python.org/moin/PortingExtensionModulesToPy3k
Peter
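The "Python fallback" arrangement mentioned above usually looks something like the sketch below. The module name _example_cgc and the count_gc function are hypothetical, chosen only for illustration; they are not real Biopython names.

```python
def _py_count_gc(sequence):
    """Pure Python fallback: count the G and C letters in a sequence."""
    return sum(1 for letter in sequence.upper() if letter in "GC")


try:
    # Hypothetical compiled extension; when it has not been built (for
    # example on Python 3 before the C code is ported) the import fails.
    from _example_cgc import count_gc
    USING_C = True
except ImportError:
    count_gc = _py_count_gc
    USING_C = False

result = count_gc("acgtGG")
```

With this layout a module stays usable on Python 3 before its C code is ported, only slower, which is why such modules are less urgent than those with no fallback.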
From zaricdragoslav at gmail.com Tue Oct 26 06:01:46 2010
From: zaricdragoslav at gmail.com (Dragoslav Zaric)
Date: Tue, 26 Oct 2010 14:01:46 +0400
Subject: [Biopython-dev] Changed in C API
In-Reply-To:
References:
Message-ID:
Do not worry Peter,
I am writing from work, that is why I am using Windows. At home I have
two laptops and both run Linux :) I do not have Windows on any partition :)
I use and develop only on Linux outside of my main job.
Ok, when I get home I will read
http://wiki.python.org/moin/PortingExtensionModulesToPy3k
and start to work on
Bio/Motif/_pwm.c
Kind regards
On Tue, Oct 26, 2010 at 1:47 PM, Peter wrote:
> On Tue, Oct 26, 2010 at 10:11 AM, Dragoslav Zaric
> wrote:
>> Ok Peter,
>>
>> First it is my mistake that I was talking about python code upgrade.
>> I understand that you want me to do C code upgrade, but at the
>> end all should work together so this is why I looked at overall upgrade
>> process.
>>
>> I have downloaded latest python source code and I have searched
>> for .c files and .h files and this is what I have found:
>>
>> Bio\Cluster\cluster.c
>> Bio\Cluster\clustermodule.c
>> Bio\cMarkovModelmodule.c
>> Bio\cpairwise2module.c
>> Bio\csupport.c
>> Bio\KDTree\KDTree.c
>> Bio\KDTree\KDTreemodule.c
>> Bio\Motif\_pwm.c
>> Bio\Nexus\cnexus.c
>> Bio\PDB\mmCIF\lex.yy.c
>> Bio\PDB\mmCIF\mmcif_test.c
>> Bio\PDB\mmCIF\MMCIFlexmodule.c
>> Bio\trie.c
>> Bio\triemodule.c
>>
>> Bio\Cluster\cluster.h
>> Bio\csupport.h
>> Bio\KDTree\KDTree.h
>> Bio\KDTree\Neighbor.h
>> Bio\trie.h
>
> What OS are you using? From the slashes I'd guess
> Windows (which may complicate things - getting the
> compilers all setup is more work).
>
>>
>> Are these all files you want me to upgrade to python 3.1 ?
>>
>
> Yes - but not all of them are equally important, and some
> will be more complicated to port.
>
> For example, the Nexus, MarkovModelmodule and
> cMarkovModelmodule C code have a Python fallback
> (i.e. the C code is not essential, just faster).
>
> Some of those (e.g. Bio.Cluster and Bio.KDTree) depend
> on NumPy, which may make things more complicated.
> You will need to install NumPy (for both Python 2 and 3).
>
> Some may have string encoding issues (bytes vs unicode),
> e.g. Nexus, Motif
>
> The mmCIF module is not urgent. This is a file parser for
> the Bio.PDB code, and we have discussed replacing this
> in C. One reason for this is it currently depends on the
> 3rd party library flex.
>
> I think Bio/Motif/_pwm.c would be a good module to start
> with. It is a short simple module exposing a single
> function to Python.
>
> You should read this:
> http://wiki.python.org/moin/PortingExtensionModulesToPy3k
>
> Peter
>
--
Dragoslav Zaric
Professional Programmer
MSc Astrophysics
From zaricdragoslav at gmail.com Tue Oct 26 06:03:30 2010
From: zaricdragoslav at gmail.com (Dragoslav Zaric)
Date: Tue, 26 Oct 2010 14:03:30 +0400
Subject: [Biopython-dev] Changed in C API
In-Reply-To:
References:
Message-ID:
Dear Peter,
You write this:
"What I suggest you do first, is make sure you can get
the latest Biopython source code from git, compile it
under Python 2, and run the unit tests. Then try 2to3
and running the tests under Python 3 (see the README
file)."
Can you tell me how to run the unit tests under a given Python version?
Are there unit tests for the C modules, or do these tests cover everything?
Kind regards
On Tue, Oct 26, 2010 at 2:01 PM, Dragoslav Zaric
wrote:
> Do not worry Peter,
>
> I am writing from work, that is why I am using Windows. At home I have
> two laptops and both run Linux :) I do not have Windows on any partition :)
>
> I use and develop only on Linux outside of my main job.
>
> Ok, when I get home I will read
>
> http://wiki.python.org/moin/PortingExtensionModulesToPy3k
>
> and start to work on
>
> Bio/Motif/_pwm.c
>
> Kind regards
>
>
> On Tue, Oct 26, 2010 at 1:47 PM, Peter wrote:
>> On Tue, Oct 26, 2010 at 10:11 AM, Dragoslav Zaric
>> wrote:
>>> Ok Peter,
>>>
>>> First it is my mistake that I was talking about python code upgrade.
>>> I understand that you want me to do C code upgrade, but at the
>>> end all should work together so this is why I looked at overall upgrade
>>> process.
>>>
>>> I have downloaded latest python source code and I have searched
>>> for .c files and .h files and this is what I have found:
>>>
>>> Bio\Cluster\cluster.c
>>> Bio\Cluster\clustermodule.c
>>> Bio\cMarkovModelmodule.c
>>> Bio\cpairwise2module.c
>>> Bio\csupport.c
>>> Bio\KDTree\KDTree.c
>>> Bio\KDTree\KDTreemodule.c
>>> Bio\Motif\_pwm.c
>>> Bio\Nexus\cnexus.c
>>> Bio\PDB\mmCIF\lex.yy.c
>>> Bio\PDB\mmCIF\mmcif_test.c
>>> Bio\PDB\mmCIF\MMCIFlexmodule.c
>>> Bio\trie.c
>>> Bio\triemodule.c
>>>
>>> Bio\Cluster\cluster.h
>>> Bio\csupport.h
>>> Bio\KDTree\KDTree.h
>>> Bio\KDTree\Neighbor.h
>>> Bio\trie.h
>>
>> What OS are you using? From the slashes I'd guess
>> Windows (which may complicate things - getting the
>> compilers all setup is more work).
>>
>>>
>>> Are these all files you want me to upgrade to python 3.1 ?
>>>
>>
>> Yes - but not all of them are equally important, and some
>> will be more complicated to port.
>>
>> For example, the Nexus, MarkovModelmodule and
>> cMarkovModelmodule C code have a Python fallback
>> (i.e. the C code is not essential, just faster).
>>
>> Some of those (e.g. Bio.Cluster and Bio.KDTree) depend
>> on NumPy, which may make things more complicated.
>> You will need to install NumPy (for both Python 2 and 3).
>>
>> Some may have string encoding issues (bytes vs unicode),
>> e.g. Nexus, Motif
>>
>> The mmCIF module is not urgent. This is a file parser for
>> the Bio.PDB code, and we have discussed replacing this
>> in C. One reason for this is it currently depends on the
>> 3rd party library flex.
>>
>> I think Bio/Motif/_pwm.c would be a good module to start
>> with. It is a short simple module exposing a single
>> function to Python.
>>
>> You should read this:
>> http://wiki.python.org/moin/PortingExtensionModulesToPy3k
>>
>> Peter
>>
>
>
>
> --
> Dragoslav Zaric
>
> Professional Programmer
> MSc Astrophysics
>
--
Dragoslav Zaric
Professional Programmer
MSc Astrophysics
From biopython at maubp.freeserve.co.uk Tue Oct 26 06:12:02 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 26 Oct 2010 11:12:02 +0100
Subject: [Biopython-dev] Changed in C API
In-Reply-To:
References:
Message-ID:
On Tue, Oct 26, 2010 at 11:03 AM, Dragoslav Zaric
wrote:
> Dear Peter,
>
> You write this:
>
> "What I suggest you do first, is make sure you can get
> the latest Biopython source code from git, compile it
> under Python 2, and run the unit tests. Then try 2to3
> and running the tests under Python 3 (see the README
> file)."
>
> Can you tell me how do I run unit tests in any python version ??
Have a look at the "The Biopython testing framework"
chapter in the tutorial (although this does not talk about
Python 3).
For python 2.x, from the Tests directory do:
    python run_tests.py
For a particular version of Python, do:
    python2.6 run_tests.py
For Python 3.x first convert the code with 2to3 as described
in the README file, then:
    python3 run_tests.py
For a particular version of Python 3, do:
    python3.1 run_tests.py
You can run selected tests rather than all of them, e.g.
    python run_tests.py test_Motif.py
>
> Are there unit tests for C modules or these tests cover everything ??
>
The tests are all written in Python, and will test the C modules via
their Python interface.
Peter
From zaricdragoslav at gmail.com Tue Oct 26 06:59:05 2010
From: zaricdragoslav at gmail.com (Dragoslav Zaric)
Date: Tue, 26 Oct 2010 14:59:05 +0400
Subject: [Biopython-dev] Changed in C API
In-Reply-To:
References:
Message-ID:
Ok Peter,
I will work on this tonight and let you know how it is going.
Kind regards
On Tue, Oct 26, 2010 at 2:12 PM, Peter wrote:
> On Tue, Oct 26, 2010 at 11:03 AM, Dragoslav Zaric
> wrote:
>> Dear Peter,
>>
>> You write this:
>>
>> "What I suggest you do first, is make sure you can get
>> the latest Biopython source code from git, compile it
>> under Python 2, and run the unit tests. Then try 2to3
>> and running the tests under Python 3 (see the README
>> file)."
>>
>> Can you tell me how do I run unit tests in any python version ??
>
> Have a look at the "The Biopython testing framework"
> chapter in the tutorial (although this does not talk about
> Python 3).
>
> For python 2.x, from the Tests directory do:
>
> python run_tests.py
>
> For a particular version of Python, do:
>
> python2.6 run_tests.py
>
> For Python 3.x first convert the code with 2to3 as described
> in the README file, then:
>
> python3 run_tests.py
>
> For a particular version of Python 3, do:
>
> python3.1 run_tests.py
>
> You can run selected tests rather than all of them, e.g.
>
> python run_tests.py test_Motif.py
>
>>
>> Are there unit tests for C modules or these tests cover everything ??
>>
>
> The tests are all written in Python, and will test the C modules via
> their Python interface.
>
> Peter
>
--
Dragoslav Zaric
Professional Programmer
MSc Astrophysics
From zaricdragoslav at gmail.com Tue Oct 26 12:28:00 2010
From: zaricdragoslav at gmail.com (Dragoslav Zaric)
Date: Tue, 26 Oct 2010 20:28:00 +0400
Subject: [Biopython-dev] Test python 2.6
Message-ID:
Dear Peter,
I ran
python run_tests.py in the Tests folder with Python 2.6.2 and got one error
in the test file
test_SeqIO_online.py
I opened the file and went to the line that caused the error. It does not
look like a functional error, just a data error, because there is no data
in the database for the supplied parameters:
("genome", ["fasta", "gb"], "X52960", 248, "Ktxz0HgMlhQmrKTuZpOxPZJ6zGU")
So I commented out this line, left the other two, and all tests passed after this.
Anyway, I am now installing Python 3.1.2 and will run the tests when the
installation finishes.
Kind regards
--
Dragoslav Zaric
Professional Programmer
MSc Astrophysics
From biopython at maubp.freeserve.co.uk Tue Oct 26 12:41:57 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 26 Oct 2010 17:41:57 +0100
Subject: [Biopython-dev] Test python 2.6
In-Reply-To:
References:
Message-ID:
On Tue, Oct 26, 2010 at 5:28 PM, Dragoslav Zaric
wrote:
> Dear Peter,
>
> I run
>
> python run_tests.py in Tests folder with python 2.6.2 I got one error
> in test file test_SeqIO_online.py
> I open the file and went to line that caused error and it looks like
> it is not functional error, it is just data error, because there is no data
> for in database for supplied parameters:
>
> ("genome", ["fasta", "gb"], "X52960", 248, "Ktxz0HgMlhQmrKTuZpOxPZJ6zGU")
>
> So I commented this line and leave other two and all tests passed after this.
I'd noticed that failing a little while back, and had assumed it was just a
temporary network problem. In fact it looks like the NCBI have changed
how searching against the genome database works. This update fixes
the test on Python 2.6:
http://github.com/biopython/biopython/commit/ad1dd31828c1488c72bffba3bc769c012439ea90
> Anyway, now I am installing python 3.1.2 and will run tests when
> finish installation.
Note there are some known failures on Python 3, this includes
test_SeqIO_online.py (bytes vs unicode).
Peter
From zaricdragoslav at gmail.com Tue Oct 26 12:45:37 2010
From: zaricdragoslav at gmail.com (Dragoslav Zaric)
Date: Tue, 26 Oct 2010 20:45:37 +0400
Subject: [Biopython-dev] Test python 2.6
In-Reply-To:
References:
Message-ID:
I am now running 2to3 on the biopython folder, and when it finishes I will run run_tests.py.
Will this also test the C modules?
Kind regards
On Tue, Oct 26, 2010 at 8:41 PM, Peter wrote:
> On Tue, Oct 26, 2010 at 5:28 PM, Dragoslav Zaric
> wrote:
>> Dear Peter,
>>
>> I run
>>
>> python run_tests.py in Tests folder with python 2.6.2 I got one error
>> in test file test_SeqIO_online.py
>> I open the file and went to line that caused error and it looks like
>> it is not functional error, it is just data error, because there is no data
>> for in database for supplied parameters:
>>
>> ("genome", ["fasta", "gb"], "X52960", 248, "Ktxz0HgMlhQmrKTuZpOxPZJ6zGU")
>>
>> So I commented this line and leave other two and all tests passed after this.
>
> I'd noticed that failing a little while back, and had assumed it was just a
> temporary network problem. In fact looks like the NCBI have changed
> how searching against the genome database works. This update fixes
> the test on Python 2.6:
>
> http://github.com/biopython/biopython/commit/ad1dd31828c1488c72bffba3bc769c012439ea90
>
>> Anyway, now I am installing python 3.1.2 and will run tests when
>> finish installation.
>
> Note there are some known failures on Python 3, this includes
> test_SeqIO_online.py (bytes vs unicode).
>
> Peter
>
--
Dragoslav Zaric
Professional Programmer
MSc Astrophysics
From biopython at maubp.freeserve.co.uk Tue Oct 26 12:59:44 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 26 Oct 2010 17:59:44 +0100
Subject: [Biopython-dev] Test python 2.6
In-Reply-To:
References:
Message-ID:
On Tue, Oct 26, 2010 at 5:45 PM, Dragoslav Zaric
wrote:
> I run now 2to3 on biopython folder and when it finish I will run run_tests.py
> This will also test C modules ?
Using run_tests.py would cover everything unless it has been disabled
on Python 3, or depends on some C code which hasn't been compiled
(in which case the test should be skipped).
Note we've edited setup.py not to try and compile any C code on Python 3
(because currently none of it works). You'll need to edit setup.py to
compile any C code you work on for Python 3.
For C modules which don't use NumPy, change this bit:
...
elif sys.version_info[0] == 3:
    # TODO - Must update our C extensions for Python 3
    EXTENSIONS = []
...
For extensions using NumPy, see class build_ext_biopython
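As a rough illustration of the setup.py change for the non-NumPy modules, here is a minimal sketch. It assumes a hypothetical allow-list called PORTED_TO_PY3 and uses plain (name, sources) tuples rather than the real distutils Extension objects, so it runs without a compiler. The module names are guesses based on the file names listed earlier in this thread, not copied from setup.py.

```python
import sys

# Hypothetical (module name, C sources) pairs. The real setup.py builds
# Extension objects; the names here are assumptions for illustration only.
ALL_EXTENSIONS = [
    ("Bio.cpairwise2", ["Bio/cpairwise2module.c"]),
    ("Bio.trie", ["Bio/trie.c", "Bio/triemodule.c"]),
    ("Bio.Nexus.cnexus", ["Bio/Nexus/cnexus.c"]),
]

# Hypothetical allow-list: add a module name once its C code has been ported.
PORTED_TO_PY3 = {"Bio.cpairwise2"}


def select_extensions(major_version, extensions=ALL_EXTENSIONS, ported=PORTED_TO_PY3):
    """Return the extensions to compile for the given major Python version."""
    if major_version == 3:
        # On Python 3 only compile what has been explicitly marked as ported.
        return [ext for ext in extensions if ext[0] in ported]
    return list(extensions)


EXTENSIONS = select_extensions(sys.version_info[0])
```

With an empty allow-list this reproduces the current behaviour (EXTENSIONS = [] on Python 3), and each module can be switched on individually as it is ported.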
Peter
From barwil at gmail.com Tue Oct 26 16:46:50 2010
From: barwil at gmail.com (Bartek Wilczynski)
Date: Tue, 26 Oct 2010 22:46:50 +0200
Subject: [Biopython-dev] Moving Bio.Motif documentation into Tutorial.tex
In-Reply-To:
References:
Message-ID:
Hi all,
I've added the Bio.Motif section to the tutorial and pushed this to github.
I was able to build the tutorial in pdf, but I'm not sure about the html
version and whether it works for other people.
Any other comments are welcome as well
cheers
Bartek
On Tue, Oct 19, 2010 at 2:45 PM, Peter wrote:
> On Tue, Oct 19, 2010 at 1:34 PM, Bartek Wilczynski
> wrote:
> > Hi,
> >
> > I've started to look into merging Bio.Motif docs with the Tutorial. I have
> > a few questions:
> > - First, I need to find a good place in the tutorial to put it.
> > One possibility is to make a separate chapter for it, another option is
> > to put it as a subchapter in chapter 15 (cookbook).
> > I think it would be better to make it a separate chapter, similar to the
> > ones discussing Bio.PopGen or Bio.Phylo, so I thought it would make
> > sense to create it as a new chapter 13, entitled Sequence motif analysis
> > with Bio.Motif
>
> I agree, create a new chapter (and add yourself to the authors list).
> I'd definitely put it before the "Cookbook Chapter", and between the
> Phylogenetics and "Supervised learning methods" chapters seems
> reasonable.
>
> > - Second, I have links and references to papers in there. The question
> > would be should I remove those to keep to the style of the tutorial
>
> Keep them - links to external webpages are fine - they work well in both
> PDF and HTML. For references we currently don't have a formal bibliography -
> but we do have some existing cases of links to papers already, e.g.
>
> http://biopython.org/DIST/docs/tutorial/Tutorial.html#sec:SeqIO-fastq-conversion
>
> Peter
>
--
Bartek Wilczynski
==================
Postdoctoral fellow
EMBL, Furlong group
Meyerhoffstrasse 1,
69012 Heidelberg,
Germany
tel: +49 6221 387 8433
From biopython at maubp.freeserve.co.uk Tue Oct 26 17:53:47 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 26 Oct 2010 22:53:47 +0100
Subject: [Biopython-dev] Tests in python 3.1.2
In-Reply-To:
References:
Message-ID:
On Tue, Oct 26, 2010 at 9:41 PM, Dragoslav Zaric
wrote:
> Dear Peter,
>
> I installed python 3.1.2, than run
>
> 2to3 -w biopython
Don't do that - see our README file:
2to3 --nofix=long --no-diffs -n -w Bio BioSQL Tests Scripts Doc/examples
2to3 --nofix=long --no-diffs -n -w -d Bio BioSQL Tests Scripts Doc/examples
You have to run 2to3 twice (strange design choice in the tool, this
is once for the code, and again with -d for the doctests which are
code examples within the docstring comments). You also need to
turn off the "long" fixer (otherwise it causes problems in Bio.Phylo).
> and after that
>
> python3.1 run_tests.py
>
> I capture screen output in log.txt file that I am sending you in attachment.
>
> Based on this log, can you advise me which way to go. Fix error one by one,
> or maybe I made mistake in installation/upgrade.
The attachment is too big for the mailing list, so your message was rejected.
I hope that helps.
Peter
From biopython at maubp.freeserve.co.uk Tue Oct 26 18:01:08 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 26 Oct 2010 23:01:08 +0100
Subject: [Biopython-dev] Moving Bio.Motif documentation into Tutorial.tex
In-Reply-To:
References:
Message-ID:
On Tue, Oct 26, 2010 at 9:46 PM, Bartek Wilczynski wrote:
> Hi all,
>
> I've added the Bio.Motif section to the tutorial and pushed this to github.
> I was able to build the tutorial in pdf, but I'm not sure about the html
> version and whether it works for other people.
>
> Any other comments are welcome as well
>
> cheers
> Bartek
Thanks Bartek - the HTML looks fine (but I haven't read it all yet):
http://biopython.org/DIST/docs/tutorial/Tutorial-dev.html
That should be updated automatically by a cron task running under
my username - let me know if it looks out of date.
Peter
From biopython at maubp.freeserve.co.uk Wed Oct 27 06:22:59 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Wed, 27 Oct 2010 11:22:59 +0100
Subject: [Biopython-dev] Bio.Motif and FASTA output
Message-ID:
Hi Bartek,
I noticed a concern with one of your examples in the tutorial, going
from a Motif object to FASTA format,
>>> print m.format("fasta")
> instance 0
TATAA
> instance 1
TATTA
> instance 2
TATAA
> instance 3
TATAA
Our FASTA parser will treat that as having no identifiers (because
it goes greater than sign, space, text). How about this:
>>> print m.format("fasta")
>instance0
TATAA
>instance1
TATTA
>instance2
TATAA
>instance3
TATAA
With the above output, each sequence gets a unique identifier.
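To make the difference concrete, here is a small sketch. Both functions are hypothetical helpers written for this illustration. The second one is a simplified stand-in for a FASTA parser (not Biopython's actual code) which takes the identifier to be whatever sits between the ">" and the first space.

```python
def format_fasta(instances, prefix="instance"):
    """Write one FASTA record per sequence, with no space after '>'."""
    lines = []
    for index, seq in enumerate(instances):
        lines.append(">%s%d" % (prefix, index))
        lines.append(seq)
    return "\n".join(lines) + "\n"


def first_word_ids(fasta_text):
    """Return the text between '>' and the first space of each title line."""
    ids = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            ids.append(line[1:].split(" ", 1)[0])
    return ids


old_style = "> instance 0\nTATAA\n> instance 1\nTATTA\n"
print(first_word_ids(old_style))
# ['', '']
print(first_word_ids(format_fasta(["TATAA", "TATTA", "TATAA", "TATAA"])))
# ['instance0', 'instance1', 'instance2', 'instance3']
```

Under that assumption the old output yields an empty identifier for every record, while the proposed output yields four distinct ones.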
Peter
From barwil at gmail.com Wed Oct 27 06:34:52 2010
From: barwil at gmail.com (Bartek Wilczynski)
Date: Wed, 27 Oct 2010 12:34:52 +0200
Subject: [Biopython-dev] Bio.Motif and FASTA output
In-Reply-To:
References:
Message-ID:
Thanks for spotting the problem. Fixed now.
cheers
Bartek
On Wed, Oct 27, 2010 at 12:22 PM, Peter wrote:
> Hi Bartek,
>
> I noticed a concern with one of your examples in the tutorial, going
> from a Motif object to FASTA format,
>
> >>> print m.format("fasta")
> > instance 0
> TATAA
> > instance 1
> TATTA
> > instance 2
> TATAA
> > instance 3
> TATAA
>
> Our FASTA parser will treat that has having no identifiers (because
> it goes greater than sign, space, text). How about this:
>
> >>> print m.format("fasta")
> >instance0
> TATAA
> >instance1
> TATTA
> >instance2
> TATAA
> >instance3
> TATAA
>
> With the above output, each sequence gets a unique identifier.
>
> Peter
>
--
Bartek Wilczynski
==================
Postdoctoral fellow
EMBL, Furlong group
Meyerhoffstrasse 1,
69012 Heidelberg,
Germany
tel: +49 6221 387 8433
From biopython at maubp.freeserve.co.uk Wed Oct 27 06:34:39 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Wed, 27 Oct 2010 11:34:39 +0100
Subject: [Biopython-dev] Bio.Motif length
Message-ID:
Hi Bartek,
(Another query after scanning over your new text in the tutorial)
Why do you have motif.length when len(motif) seems to do
basically the same thing? Can we deprecate the length
property (Zen of Python: There should be one -- and
preferably only one -- obvious way to do it)?
Thanks,
Peter
From biopython at maubp.freeserve.co.uk Wed Oct 27 06:46:21 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Wed, 27 Oct 2010 11:46:21 +0100
Subject: [Biopython-dev] Bio.Motif and FASTA output
In-Reply-To:
References:
Message-ID:
On Wed, Oct 27, 2010 at 11:34 AM, Bartek Wilczynski wrote:
> Thanks for spotting the problem. Fixed now.
>
> cheers
> Bartek
Thanks.
BTW - Do you have two git usernames? Your recent commits show up
as authored by barwil but committed by bartekw - curious.
Peter
From barwil at gmail.com Wed Oct 27 06:53:58 2010
From: barwil at gmail.com (Bartek Wilczynski)
Date: Wed, 27 Oct 2010 12:53:58 +0200
Subject: [Biopython-dev] Bio.Motif length
In-Reply-To:
References:
Message-ID:
Hi Peter,
On Wed, Oct 27, 2010 at 12:34 PM, Peter wrote:
>
> Why do you have motif.length when len(motif) seems to do
> basically the same thing? Can we deprecate the length
> property (Zen of Python: There should be one -- and
> preferably only one -- obvious way to do it)?
>
>
I guess this is there just out of habit. I know the .length property
and I tend to use it, but I agree that in the tutorial we should use len(m)
instead of m.length.
Speaking more globally, the length property has been there from the beginning, so I
don't think we should remove it. If we really want to make the API clean, we
could rename it to m._length to indicate that it should not be used directly
(especially setting it to some other value could have unwanted
consequences).
I can make the change in the tutorial (I need to change the expected output
of m.format("fasta") anyway), but making the change from .length to ._length
in the code would require a bit more time to make sure I'm not using it
anywhere in the code. What is your suggestion here?
cheers
B
From biopython at maubp.freeserve.co.uk Wed Oct 27 07:03:15 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Wed, 27 Oct 2010 12:03:15 +0100
Subject: [Biopython-dev] Bio.Motif length
In-Reply-To:
References:
Message-ID:
On Wed, Oct 27, 2010 at 11:53 AM, Bartek Wilczynski wrote:
> Hi Peter,
>
> On Wed, Oct 27, 2010 at 12:34 PM, Peter wrote:
>
>>
>> Why do you have motif.length when len(motif) seems to do
>> basically the same thing? Can we deprecate the length
>> property (Zen of Python: There should be one -- and
>> preferably only one -- obvious way to do it)?
>>
>
> I guess this is there just out of habit. I know that the .length property
> and I tend to use it, but I agree that in the tutorial we should use len(m)
> instead of m.length.
>
> Speaking more globally, the length property is there from the beginning, I
> don't think we should remove it. If we really want to make the API clean, we
> could rename it to m._length to indicate that it should not be used directly
> (especially setting it to some other value could have unwanted
> consequences).
>
> I can make the change in the tutorial (I need to change the expected output
> of m.format("fasta") anyway), but making the change from .length to ._length
> in the code would require a bit more time to make sure I'm not using it
> anywhere in the code. What is your suggestion here?
What I would suggest is right now:
(1) Use len(...) in the tutorial and any docstrings. Also in the
__len__ docstring you could mention that using the .length
property is discouraged.
Then later as your time permits,
(2) Rename self.length to self._length throughout the code, check
tests pass
(3) Add a property length which acts as a proxy for self._length
and say in the docstring that you encourage len(...) instead.
This is to ensure existing code using .length still works.
Then later,
(4) Add a deprecation warning to the new length property.
One year and two releases later:
(5) Remove the length property (leaving the private _length
property only).
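Steps (2) to (4) could look roughly like the sketch below. The Motif class here is a hypothetical stand-in reduced to the length logic, not the real Bio.Motif code, and the add_instance method and warning text are assumptions made for the example.

```python
import warnings


class Motif(object):
    """Hypothetical stand-in for a motif class, reduced to the length logic."""

    def __init__(self):
        self.instances = []
        self._length = None  # step (2): private storage

    def add_instance(self, instance):
        # All instances of a motif are assumed to share one length.
        if self._length is None:
            self._length = len(instance)
        elif len(instance) != self._length:
            raise ValueError("All instances must have the same length")
        self.instances.append(instance)

    def __len__(self):
        """Return the motif length (preferred over the .length property)."""
        return 0 if self._length is None else self._length

    @property
    def length(self):
        """Deprecated proxy for the motif length; please use len(motif)."""
        # Step (4): warn, but keep old code working (step (3)).
        warnings.warn(
            "Motif.length is deprecated; use len(motif) instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return self._length
```

The property in this sketch is read-only, which also guards against the accidental assignment Bartek mentioned. Code that currently assigns to .length would need a setter added to keep working.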
Peter
From barwil at gmail.com Wed Oct 27 09:10:46 2010
From: barwil at gmail.com (Bartek Wilczynski)
Date: Wed, 27 Oct 2010 15:10:46 +0200
Subject: [Biopython-dev] Bio.Motif length
In-Reply-To:
References:
Message-ID:
Hi,
On Wed, Oct 27, 2010 at 1:03 PM, Peter wrote:
>
> What I would suggest is right now:
>
> (1) Use len(...) in the tutorial and any docstrings. Also in the
> __len__ docstring you could mention that using the .length
> property is discouraged.
>
This is now done and committed to the trunk.
> Then later as your time permits,
>
> (2) Rename self.length to self._length throughout the code, check
> tests pass
> (3) Add a property length which acts as a proxy for self._length
> and say in the docstring that you encourage len(...) instead.
> This is to ensure existing code using .length still works.
>
I'll put these things on my todo list, and I'll do them on a branch so as not
to mess things up.
> Then later,
>
> (4) Add a deprecation warning to the new length property.
>
> One year and two releases later:
>
> (5) Remove the length property (leaving the private _length
> property only).
>
Is there a scheduled time for the next release? I'm just asking to see
whether I can still squeeze it into the nearest release or whether it will
need to wait for the next one.
Thanks for your input
Bartek
From biopython at maubp.freeserve.co.uk Wed Oct 27 09:25:02 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Wed, 27 Oct 2010 14:25:02 +0100
Subject: [Biopython-dev] Bio.Motif length
In-Reply-To:
References:
Message-ID:
On Wed, Oct 27, 2010 at 2:10 PM, Bartek Wilczynski wrote:
>
> Is there a scheduled time for the next release? I'm just asking to see
> whether I can try to still squeeze it into the nearest release or it will
> need to wait for the next one.
>
I was thinking some point next month (November 2010), certainly
we want to do this well before the end of the year (when the NCBI
will be changing the DTD files for Entrez).
Peter
From zaricdragoslav at gmail.com Wed Oct 27 11:19:13 2010
From: zaricdragoslav at gmail.com (Dragoslav Zaric)
Date: Wed, 27 Oct 2010 19:19:13 +0400
Subject: [Biopython-dev] upgrade to python 3.1.2
Message-ID:
Hi Peter,
I did everything from scratch: I got Biopython with git, then ran those
two 2to3 commands from the README file, and at the end I ran run_tests.py
from the Tests folder.
I am sending you the log file just to check that I am on the right track.
I will continue to investigate the errors.
One error is related to the numpy module, so I will try to install numpy
for Python 3.1.2.
Anyway, two tests FAIL: test_SeqIO_online and test_Wise.
--
Dragoslav Zaric
Professional Programmer
MSc Astrophysics
-------------- next part --------------
test_Ace ... ok
test_AlignIO ... ok
test_AlignIO_convert ... ok
test_BioSQL ... /home/maiev/work/biopython/BioSQL/Loader.py:799: UserWarning: order location operators are not fully supported
% feature.location_operator)
ok
test_BioSQL_SeqIO ... /home/maiev/work/biopython/BioSQL/Loader.py:799: UserWarning: bond location operators are not fully supported
% feature.location_operator)
ok
test_CAPS ... ok
test_Clustalw ... /home/maiev/work/biopython/Bio/Clustalw/__init__.py:83: PendingDeprecationWarning: This function is obsolete, and any new code should call Bio.AlignIO instead.
warnings.warn("This function is obsolete, and any new code should call Bio.AlignIO instead.", PendingDeprecationWarning)
ok
test_Clustalw_tool ... skipping. Install clustalw or clustalw2 if you want to use Bio.Clustalw.
test_Cluster ... skipping. If you want to use Bio.Cluster, install NumPy first and then reinstall Biopython
test_CodonTable ... ok
test_CodonUsage ... ok
test_Compass ... ok
test_Crystal ... ok
test_Dialign_tool ... skipping. Install DIALIGN2-2 if you want to use the Bio.Align.Applications wrapper.
test_DocSQL ... skipping. Install MySQLdb if you want to use Bio.DocSQL.
test_Emboss ... skipping. Install EMBOSS if you want to use Bio.Emboss.
test_EmbossPhylipNew ... skipping. Install the Emboss package 'PhylipNew' if you want to use the Bio.Emboss.Applications wrappers for phylogenetic tools.
test_EmbossPrimer ... ok
test_Entrez ... ok
test_Enzyme ... ok
test_FSSP ... ok
test_File ... ok
test_GACrossover ... ok
test_GAMutation ... ok
test_GAOrganism ... ok
test_GAQueens ... ok
test_GARepair ... ok
test_GASelection ... ok
test_GFF ... skipping. Environment is not configured for this test (not important if you do not plan to use Bio.GFF).
test_GFF2 ... skipping. Install MySQLdb if you want to use Bio.GFF.
test_GenBank ... ok
test_GenomeDiagram ... skipping. Install reportlab if you want to use Bio.Graphics.
test_GraphicsBitmaps ... skipping. Install ReportLab if you want to use Bio.Graphics.
test_GraphicsChromosome ... skipping. Install reportlab if you want to use Bio.Graphics.
test_GraphicsDistribution ... skipping. Install reportlab if you want to use Bio.Graphics.
test_GraphicsGeneral ... skipping. Install reportlab if you want to use Bio.Graphics.
test_HMMCasino ... ok
test_HMMGeneral ... ok
test_HotRand ... ok
test_IsoelectricPoint ... ok
test_KDTree ... skipping. Install NumPy if you want to use Bio.KDTree.
test_KEGG ... ok
test_KeyWList ... ok
test_Location ... ok
test_LocationParser ... skipping. This deprecated module doesn't work on Python 3.
test_LogisticRegression ... skipping. Install NumPy if you want to use Bio.LogisticRegression.
test_Mafft_tool ... skipping. Install MAFFT if you want to use the Bio.Align.Applications wrapper.
test_MarkovModel ... skipping. Install NumPy if you want to use Bio.MarkovModel.
test_Medline ... ok
test_Motif ... ok
test_Muscle_tool ... skipping. Install MUSCLE if you want to use the Bio.Align.Applications wrapper.
test_NCBIStandalone ... /home/maiev/work/biopython/Bio/Blast/NCBIStandalone.py:53: PendingDeprecationWarning: The plain text parser in this module still works at the time of writing, but is considered obsolete and updating it to cope with the latest versions of BLAST is not a priority for us.
warnings.warn("The plain text parser in this module still works at the time of writing, but is considered obsolete and updating it to cope with the latest versions of BLAST is not a priority for us.", PendingDeprecationWarning)
/home/maiev/work/biopython/Bio/Blast/NCBIStandalone.py:1850: PendingDeprecationWarning: This function is obsolete, you are encouraged to the command line wrapper Bio.Blast.Applications.BlastpgpCommandline instead.
warnings.warn("This function is obsolete, you are encouraged to the command line wrapper Bio.Blast.Applications.BlastpgpCommandline instead.", PendingDeprecationWarning)
/home/maiev/work/biopython/Bio/Blast/NCBIStandalone.py:1970: PendingDeprecationWarning: This function is obsolete, you are encouraged to the command line wrapper Bio.Blast.Applications.BlastrpsCommandline instead.
warnings.warn("This function is obsolete, you are encouraged to the command line wrapper Bio.Blast.Applications.BlastrpsCommandline instead.", PendingDeprecationWarning)
ok
test_NCBITextParser ... ok
test_NCBIXML ... ok
test_NCBI_BLAST_tools ... skipping. Install the NCBI BLAST+ command line tools if you want to use the Bio.Blast.Applications wrapper.
test_NCBI_qblast ... ok
test_NNExclusiveOr ... ok
test_NNGene ... ok
test_NNGeneral ... ok
test_Nexus ... ok
test_PDB ... skipping. Install NumPy if you want to use Bio.PDB.
test_PDB_KDTree ... skipping. Install NumPy if you want to use Bio.PDB.
test_ParserSupport ... ok
test_Pathway ... ok
test_Phd ... ok
test_Phylo ... ok
test_PhyloXML ... ok
test_Phylo_depend ... skipping. Install NetworkX if you want to use Bio.Phylo._utils.
test_PopGen_FDist ... skipping. Install FDist if you want to use Bio.PopGen.FDist.
test_PopGen_FDist_nodepend ... ok
test_PopGen_GenePop ... skipping. Install GenePop if you want to use Bio.PopGen.GenePop.
test_PopGen_GenePop_EasyController ... skipping. Install GenePop if you want to use Bio.PopGen.GenePop.
test_PopGen_GenePop_nodepend ... ok
test_PopGen_SimCoal ... skipping. Install SIMCOAL2 if you want to use Bio.PopGen.SimCoal.
test_PopGen_SimCoal_nodepend ... ok
test_Prank_tool ... skipping. Install PRANK if you want to use the Bio.Align.Applications wrapper.
test_Probcons_tool ... skipping. Install PROBCONS if you want to use the Bio.Align.Applications wrapper.
test_ProtParam ... ok
test_Restriction ... ok
test_SCOP_Astral ... ok
test_SCOP_Cla ... ok
test_SCOP_Des ... ok
test_SCOP_Dom ... ok
test_SCOP_Hie ... ok
test_SCOP_Raf ... ok
test_SCOP_Residues ... ok
test_SCOP_Scop ... ok
test_SVDSuperimposer ... skipping. Install NumPy if you want to use Bio.SVDSuperimposer.
test_SeqIO ... ok
test_SeqIO_FastaIO ... ok
test_SeqIO_QualityIO ... ok
test_SeqIO_convert ... ok
test_SeqIO_features ... ok
test_SeqIO_index ... skipping. Skipping since currently this is very slow on Python 3.
test_SeqIO_online ... FAIL
test_SeqRecord ... ok
test_SeqUtils ... ok
test_Seq_objs ... ok
test_SubsMat ... ok
test_SwissProt ... ok
test_TCoffee_tool ... skipping. Install TCOFFEE if you want to use the Bio.Align.Applications wrapper.
test_UniGene ... ok
test_UniGene_obsolete ... ok
test_Wise ... FAIL
test_align ... ok
test_geo ... ok
test_kNN ... ERROR
test_lowess ... skipping. Install NumPy if you want to use Bio.Statistics.lowess.
test_pairwise2 ... ok
test_prodoc ... ok
test_property_manager ... skipping. This deprecated module doesn't work on Python 3.
test_prosite1 ... ok
test_prosite2 ... ok
test_prosite_patterns ... skipping. The (deprecated) Bio.Prosite module uses the Python library sgmllib which is not supported on Python 3
test_psw ... ok
test_seq ... ok
test_translate ... ok
test_trie ... skipping. Could not import Bio.trie, check C code was compiled.
Bio.Alphabet docstring test ... ok
Bio.Application docstring test ... ok
Bio.SeqFeature docstring test ... ok
Bio.SeqRecord docstring test ... ok
Bio.SeqIO docstring test ... ok
Bio.SeqIO.AceIO docstring test ... ok
Bio.SeqIO.PhdIO docstring test ... ok
Bio.SeqIO.QualityIO docstring test ... ok
Bio.SeqIO.SffIO docstring test ... ok
Bio.SeqUtils docstring test ... ok
Bio.Align docstring test ... ok
Bio.Align.Generic docstring test ... ok
Bio.AlignIO docstring test ... ok
Bio.AlignIO.StockholmIO docstring test ... ok
Bio.Blast.Applications docstring test ... ok
Bio.Clustalw docstring test ... ok
Bio.Emboss.Applications docstring test ... ok
Bio.KEGG.Compound docstring test ... ok
Bio.KEGG.Enzyme docstring test ... ok
Bio.Wise docstring test ... ok
Bio.Wise.psw docstring test ... ok
Bio.Motif docstring test ... ok
======================================================================
ERROR: test_nuccore_X52960 (test_SeqIO_online.EntrezTests)
Bio.Entrez.efetch(nuccore, X52960, ...)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/maiev/work/biopython/Tests/test_SeqIO_online.py", line 85, in
method = lambda x : x.simple(d, f, e, l, c)
File "/home/maiev/work/biopython/Tests/test_SeqIO_online.py", line 63, in simple
record = SeqIO.read(handle, f)
File "/home/maiev/work/biopython/Bio/SeqIO/__init__.py", line 585, in read
first = next(iterator)
File "/home/maiev/work/biopython/Bio/SeqIO/FastaIO.py", line 39, in FastaIterator
if line[0] == ">":
IndexError: index out of range
======================================================================
ERROR: test_nucleotide_6273291 (test_SeqIO_online.EntrezTests)
Bio.Entrez.efetch(nucleotide, 6273291, ...)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/maiev/work/biopython/Tests/test_SeqIO_online.py", line 85, in
method = lambda x : x.simple(d, f, e, l, c)
File "/home/maiev/work/biopython/Tests/test_SeqIO_online.py", line 63, in simple
record = SeqIO.read(handle, f)
File "/home/maiev/work/biopython/Bio/SeqIO/__init__.py", line 585, in read
first = next(iterator)
File "/home/maiev/work/biopython/Bio/SeqIO/FastaIO.py", line 39, in FastaIterator
if line[0] == ">":
IndexError: index out of range
======================================================================
ERROR: test_protein_16130152 (test_SeqIO_online.EntrezTests)
Bio.Entrez.efetch(protein, 16130152, ...)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/maiev/work/biopython/Tests/test_SeqIO_online.py", line 85, in
method = lambda x : x.simple(d, f, e, l, c)
File "/home/maiev/work/biopython/Tests/test_SeqIO_online.py", line 63, in simple
record = SeqIO.read(handle, f)
File "/home/maiev/work/biopython/Bio/SeqIO/__init__.py", line 585, in read
first = next(iterator)
File "/home/maiev/work/biopython/Bio/SeqIO/FastaIO.py", line 39, in FastaIterator
if line[0] == ">":
IndexError: index out of range
======================================================================
FAIL: test_dnal (test_Wise.TestWiseDryRun)
Call dnal, and do a trivial check on its output.
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/maiev/work/biopython/Tests/test_Wise.py", line 26, in test_dnal
self.assertTrue(sys.stdout.getvalue().startswith("dnal -kbyte 100000 seq1.fna seq2.fna"))
AssertionError: False is not True
======================================================================
FAIL: test_psw (test_Wise.TestWiseDryRun)
Call psw, and do a trivial check on its output.
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/maiev/work/biopython/Tests/test_Wise.py", line 31, in test_psw
self.assertTrue(sys.stdout.getvalue().startswith("psw -kbyte 4 seq1.faa seq2.faa"))
AssertionError: False is not True
======================================================================
ERROR: test_kNN
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/maiev/work/biopython/Tests/test_kNN.py", line 12, in
import numpy
ImportError: No module named numpy
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "run_tests.py", line 289, in runTest
suite = unittest.TestLoader().loadTestsFromName(name)
File "/usr/local/lib/python3.1/unittest.py", line 1266, in loadTestsFromName
module = __import__('.'.join(parts_copy))
File "/home/maiev/work/biopython/Tests/test_kNN.py", line 15, in
raise MissingPythonDependencyError(
NameError: name 'MissingPythonDependencyError' is not defined
----------------------------------------------------------------------
Ran 140 tests in 478.298 seconds
FAILED (failures = 3)
From biopython at maubp.freeserve.co.uk Wed Oct 27 11:33:05 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Wed, 27 Oct 2010 16:33:05 +0100
Subject: [Biopython-dev] upgrade to python 3.1.2
In-Reply-To:
References:
Message-ID:
On Wed, Oct 27, 2010 at 4:19 PM, Dragoslav Zaric
wrote:
> Hi Peter,
>
> I did everything from scratch: got Biopython with git, then ran those
> two 2to3 commands from the README file, and at the end I ran
> run_tests.py from the Tests folder.
>
> I am sending you the log file just to check whether I am on the right
> track. I will continue to investigate the errors.
>
> One error is related to numpy module, so I will try to install numpy
> for python 3.1.2
>
> Anyway, two tests FAIL: test_SeqIO_online and test_Wise
>
> Kind regards
The problem with test_kNN.py was my mistake - it is meant to
be skipped when numpy is not installed. Fixed here:
http://github.com/biopython/biopython/commit/2ae15f94e7e90b237e982145f9697157ed1f801e
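The skip pattern that went wrong there can be sketched as follows. This is
only an illustration: MissingPythonDependencyError below is a local stand-in
class and require_module is a hypothetical helper, not the Biopython code.
The NameError in the log suggests the exception name was used without being
defined or imported first.

```python
import importlib


class MissingPythonDependencyError(ImportError):
    """Stand-in for the exception a test raises to ask to be skipped."""


def require_module(name):
    """Import and return a module, or raise the skip exception (hypothetical helper)."""
    try:
        return importlib.import_module(name)
    except ImportError:
        # The exception class must be defined or imported before this point,
        # otherwise Python reports a NameError instead of a clean skip.
        raise MissingPythonDependencyError("Install %s to run this test." % name)


if __name__ == "__main__":
    try:
        require_module("numpy")
        print("numpy found, test can run")
    except MissingPythonDependencyError as err:
        print("skipping:", err)
```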
The "IndexError: index out of range" problem on Python 3 with
test_SeqIO_online.py is the known failure I mentioned before.
This is to do with bytes versus unicode handles.
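To illustrate the bytes versus unicode point: on Python 3, indexing a bytes
line gives an integer, so a check like line[0] == ">" never matches, and
indexing an empty line raises IndexError. A minimal sketch, assuming the
network handle yields bytes; the sample data and the use of io.TextIOWrapper
are illustrative only, not necessarily the fix Biopython will adopt:

```python
import io

raw = io.BytesIO(b">seq1\nACGT\n")   # stands in for a handle that returns bytes

line = raw.readline()
print(line[0] == ">")      # False: line[0] is the integer 62, not ">"
print(line[:1] == b">")    # True: compare bytes with bytes

# Indexing an empty bytes line (as returned at end of file) raises IndexError.
try:
    b""[0]
except IndexError as err:
    print("IndexError:", err)

# Wrapping the handle decodes bytes to text, so str comparisons work again.
raw.seek(0)
text = io.TextIOWrapper(raw, encoding="ascii")
first = text.readline()
print(first[0] == ">")     # True
```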
The output from test_Wise.py is unexpected though (I don't
have Wise installed on my Mac - I should do that):
======================================================================
FAIL: test_psw (test_Wise.TestWiseDryRun)
Call psw, and do a trivial check on its output.
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/maiev/work/biopython/Tests/test_Wise.py", line 31, in test_psw
self.assertTrue(sys.stdout.getvalue().startswith("psw -kbyte 4 seq1.faa seq2.faa"))
AssertionError: False is not True
Hopefully with the following change we'll get a more useful message:
http://github.com/biopython/biopython/commit/811f5ced0305fa41539b8867c594a119135ef682
Could you update your Biopython and re-test? You'll have to
repeat the 2to3 conversion, e.g.
git reset --hard
2to3 ...
etc
Peter
From biopython at maubp.freeserve.co.uk Wed Oct 27 11:46:13 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Wed, 27 Oct 2010 16:46:13 +0100
Subject: [Biopython-dev] upgrade to python 3.1.2
In-Reply-To:
References:
Message-ID:
On Wed, Oct 27, 2010 at 4:40 PM, Dragoslav Zaric
wrote:
>
> ok, will do that and send you log file again,
>
You can just cut and paste the error messages - that
should be all we need.
Thanks,
Peter
From biopython at maubp.freeserve.co.uk Wed Oct 27 12:35:11 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Wed, 27 Oct 2010 17:35:11 +0100
Subject: [Biopython-dev] upgrade to python 3.1.2
In-Reply-To:
References:
Message-ID:
On Wed, Oct 27, 2010 at 5:14 PM, Dragoslav Zaric
wrote:
>
> Ok, errors:
>
> test_SeqIO_online ... FAIL
> test_Wise ... FAIL
>
> ======================================================================
> ERROR: test_nuccore_X52960 (test_SeqIO_online.EntrezTests)
> Bio.Entrez.efetch(nuccore, X52960, ...)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
> ...
> IndexError: index out of range
>
> ======================================================================
> ERROR: test_nucleotide_6273291 (test_SeqIO_online.EntrezTests)
> Bio.Entrez.efetch(nucleotide, 6273291, ...)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
> ...
> IndexError: index out of range
>
> ======================================================================
> ERROR: test_protein_16130152 (test_SeqIO_online.EntrezTests)
> Bio.Entrez.efetch(protein, 16130152, ...)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
> ...
> IndexError: index out of range
We're ignoring the above problem with test_SeqIO_online.py
on Python 3 for now.
> ======================================================================
> FAIL: test_dnal (test_Wise.TestWiseDryRun)
> Call dnal, and do a trivial check on its output.
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/home/maiev/work/biopython/Tests/test_Wise.py", line 27, in test_dnal
>     self.assertTrue(output.startswith("dnal -kbyte 100000 seq1.fna seq2.fna"), output[:200])
> AssertionError: dnal -kbyte 100000 -quiet seq1.fna seq2.fna > /tmp/tmpEVkZM8
>
>
> ======================================================================
> FAIL: test_psw (test_Wise.TestWiseDryRun)
> Call psw, and do a trivial check on its output.
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/home/maiev/work/biopython/Tests/test_Wise.py", line 33, in test_psw
>     self.assertTrue(output.startswith("psw -kbyte 4 seq1.faa seq2.faa"), output[:200])
> AssertionError: psw -kbyte 4 -quiet seq1.faa seq2.faa > /tmp/tmpOJ3QL3
I remember this issue now:
http://lists.open-bio.org/pipermail/biopython-dev/2010-June/007904.html
(very end)
...
http://lists.open-bio.org/pipermail/biopython-dev/2010-June/007908.html
This was due to the psw/dnal wrappers sometimes automatically including
the -quiet command line switch. It happens if you redirect the unit
test output to a file. This change should solve it:
http://github.com/biopython/biopython/commit/4f430adad7a5b8bc021dec8b188963ca76612393
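A rough sketch of why a startswith check is fragile here. The helper names
below (build_cmdline, same_command) are made up for illustration and are not
the Bio.Wise code; the sketch simply assumes the wrapper adds -quiet when
output does not go to a terminal:

```python
import sys


def build_cmdline(tool, kbyte, files, quiet=None):
    """Build a command line; add -quiet when stdout is not a terminal (assumed rule)."""
    if quiet is None:
        quiet = not sys.stdout.isatty()
    args = [tool, "-kbyte", str(kbyte)]
    if quiet:
        args.append("-quiet")
    return " ".join(args + list(files))


def same_command(actual, expected):
    """Compare two command lines while ignoring an optional -quiet switch."""
    tokens = [t for t in actual.split() if t != "-quiet"]
    return tokens[:len(expected.split())] == expected.split()


cmd = build_cmdline("psw", 4, ["seq1.faa", "seq2.faa"], quiet=True)
print(cmd)                                                  # psw -kbyte 4 -quiet seq1.faa seq2.faa
print(cmd.startswith("psw -kbyte 4 seq1.faa seq2.faa"))     # False
print(same_command(cmd, "psw -kbyte 4 seq1.faa seq2.faa"))  # True
```

Comparing tokens rather than a fixed prefix is just one option; pinning the
quiet setting explicitly in the test would work as well.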
Thanks!
Peter
From tiagoantao at gmail.com Wed Oct 27 14:36:46 2010
From: tiagoantao at gmail.com (=?ISO-8859-1?Q?Tiago_Ant=E3o?=)
Date: Wed, 27 Oct 2010 19:36:46 +0100
Subject: [Biopython-dev] README and python3
Message-ID:
Hi,
Just a minor issue with the README and python3.
The option --nofix does not exist in 2to3 for the 2.x version. So that
line will not work if the 2to3 happens to be from Python 2.X (can
happen if you have several versions installed).
--
"If you want to get laid, go to college.  If you want an education, go
to the library." - Frank Zappa
From biopython at maubp.freeserve.co.uk Thu Oct 28 05:17:09 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Thu, 28 Oct 2010 10:17:09 +0100
Subject: [Biopython-dev] README and python3
In-Reply-To:
References:
Message-ID:
2010/10/27 Tiago Antão :
> Hi,
>
> Just a minor issue with the README and python3.
> The option --nofix does not exist in 2to3 for the 2.x version. So that
> line will not work if the 2to3 happens to be from Python 2.X (can
> happen if you have several versions installed).
>
Hi Tiago,
Can you work out which version of 2to3 lacks the --nofix (or -x)
option, and which version of Python it came from?
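One possible way to check this on a given machine, sketched with the
standard library only. The function names are made up for this example, it
needs a recent Python 3 (for shutil.which and subprocess.run), and it
assumes a 2to3 script is on the PATH and that its --help output lists the
option:

```python
import shutil
import subprocess


def help_lists_option(help_text, option="--nofix"):
    """Return True if the option appears in the given --help output."""
    return option in help_text


def check_2to3(script="2to3"):
    """Report where 2to3 lives and whether it advertises --nofix (None if not found)."""
    path = shutil.which(script)
    if path is None:
        return None
    result = subprocess.run([path, "--help"], stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, universal_newlines=True)
    return path, help_lists_option(result.stdout)


if __name__ == "__main__":
    print(check_2to3())
```

The returned path shows which installation the script on the PATH belongs
to, which is the piece of information that matters when several Python
versions are installed side by side.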
The (Apple provided) Python 2.6.1 on my Mac seems to have
a 2to3 with the --nofix option, and I don't have Python 3 installed
on this machine. In addition to running 2to3 as a command line
script, you can call the library from within Python:
$ python2.6
Python 2.6.1 (r261:67515, Feb 11 2010, 00:51:29)
[GCC 4.2.1 (Apple Inc. build 5646)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from lib2to3.main import main
>>> main("lib2to3.fixes", ["--help"])
Usage: refactor.py [options] file|dir ...
Options:
-h, --help show this help message and exit
-d, --doctests_only Fix up doctests only
-f FIX, --fix=FIX Each FIX specifies a transformation; default: all
-x NOFIX, --nofix=NOFIX
Prevent a fixer from being run.
-l, --list-fixes List available transformations (fixes/fix_*.py)
-p, --print-function Modify the grammar so that print() is a function
-v, --verbose More verbose logging
-w, --write Write back modified files
-n, --nobackups Don't write backups for modified files.
Likewise on our Linux server the 2to3 from Python 2.6.6, 2.7 and
3.1.2 all seem to have it:
$ python2.6
Python 2.6.6 (r266:84292, Aug 31 2010, 16:21:14)
[GCC 4.1.2 20080704 (Red Hat 4.1.2-48)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from lib2to3.main import main
>>> main("lib2to3.fixes", ["--help"])
Usage: 2to3 [options] file|dir ...
Options:
-h, --help show this help message and exit
-d, --doctests_only Fix up doctests only
-f FIX, --fix=FIX Each FIX specifies a transformation; default: all
-j PROCESSES, --processes=PROCESSES
Run 2to3 concurrently
-x NOFIX, --nofix=NOFIX
Prevent a fixer from being run.
-l, --list-fixes List available transformations
-p, --print-function Modify the grammar so that print() is a function
-v, --verbose More verbose logging
--no-diffs Don't show diffs of the refactoring
-w, --write Write back modified files
-n, --nobackups Don't write backups for modified files.
$ python2.7
Python 2.7 (r27:82500, Jul 13 2010, 14:02:41)
[GCC 4.1.2 20080704 (Red Hat 4.1.2-48)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from lib2to3.main import main
>>> main("lib2to3.fixes", ["--help"])
Usage: 2to3 [options] file|dir ...
Options:
-h, --help show this help message and exit
-d, --doctests_only Fix up doctests only
-f FIX, --fix=FIX Each FIX specifies a transformation; default: all
-j PROCESSES, --processes=PROCESSES
Run 2to3 concurrently
-x NOFIX, --nofix=NOFIX
Prevent a fixer from being run.
-l, --list-fixes List available transformations
-p, --print-function Modify the grammar so that print() is a function
-v, --verbose More verbose logging
--no-diffs Don't show diffs of the refactoring
-w, --write Write back modified files
-n, --nobackups Don't write backups for modified files.
$ python3.1
Python 3.1.2 (r312:79147, Jul 15 2010, 12:43:37)
[GCC 4.1.2 20080704 (Red Hat 4.1.2-48)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from lib2to3.main import main
>>> main("lib2to3.fixes", ["--help"])
Usage: 2to3 [options] file|dir ...
Options:
-h, --help show this help message and exit
-d, --doctests_only Fix up doctests only
-f FIX, --fix=FIX Each FIX specifies a transformation; default: all
-j PROCESSES, --processes=PROCESSES
Run 2to3 concurrently
-x NOFIX, --nofix=NOFIX
Prevent a fixer from being run.
-l, --list-fixes List available transformations (fixes/fix_*.py)
-p, --print-function Modify the grammar so that print() is a function
-v, --verbose More verbose logging
--no-diffs Don't show diffs of the refactoring
-w, --write Write back modified files
-n, --nobackups Don't write backups for modified files.
Note that we *need* the --nofix option for the conversion of
Bio.Phylo to work (it uses long as an argument name,
short for longitude).
Peter
From zaricdragoslav at gmail.com Thu Oct 28 10:16:34 2010
From: zaricdragoslav at gmail.com (Dragoslav Zaric)
Date: Thu, 28 Oct 2010 18:16:34 +0400
Subject: [Biopython-dev] _pwm.c
Message-ID:
Dear Peter,
I wrote you this yesterday:
I put this in setup.py:
class build_ext_biopython(build_ext):
    def run(self):
        if not check_dependencies_once():
            return
        # add software that requires NumPy to install
        # TODO - Convert these for Python 3
        if is_Numpy_installed():
            import numpy
            numpy_include_dir = numpy.get_include()
            #self.extensions.append(
            #    Extension('Bio.Cluster.cluster',
            #              ['Bio/Cluster/clustermodule.c',
            #               'Bio/Cluster/cluster.c'],
            #              include_dirs=[numpy_include_dir],
            #              ))
            #self.extensions.append(
            #    Extension('Bio.KDTree._CKDTree',
            #              ["Bio/KDTree/KDTree.c",
            #               "Bio/KDTree/KDTreemodule.c"],
            #              include_dirs=[numpy_include_dir],
            #              ))
            self.extensions.append(
                Extension('Bio.Motif._pwm',
                          ["Bio/Motif/_pwm.c"],
                          include_dirs=[numpy_include_dir],
                          ))
        build_ext.run(self)
and then I ran:
python3.1 setup.py build_ext
This is output:
Biopython does not yet officially support Python 3, but you
can try it by first using the 2to3 script on our source code.
For details on how to use 2to3 with Biopython see README.
If you still haven't applied 2to3 to Biopython please abort now.
Do you want to continue this installation? (y/N):
y
running build_ext
building 'Bio.Motif._pwm' extension
creating build/temp.linux-i686-3.1
creating build/temp.linux-i686-3.1/Bio
creating build/temp.linux-i686-3.1/Bio/Motif
gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O3 -Wall
-Wstrict-prototypes -fPIC
-I/usr/local/lib/python3.1/site-packages/numpy/core/include
-I/usr/local/include/python3.1 -c Bio/Motif/_pwm.c -o
build/temp.linux-i686-3.1/Bio/Motif/_pwm.o
Bio/Motif/_pwm.c: In function ‘init_pwm’:
Bio/Motif/_pwm.c:123: warning: ‘return’ with a value, in function returning void
Bio/Motif/_pwm.c:125: warning: implicit declaration of function ‘Py_InitModule4’
Bio/Motif/_pwm.c:129: warning: assignment makes pointer from integer
without a cast
gcc -pthread -shared build/temp.linux-i686-3.1/Bio/Motif/_pwm.o -o
build/lib/Bio/Motif/_pwm.so
So as you can see this is compiling, but there are some warnings. So what is
the plan: to compile totally without warnings?
regards
--
Dragoslav Zaric
Professional Programmer
MSc Astrophysics
From biopython at maubp.freeserve.co.uk Thu Oct 28 10:27:04 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Thu, 28 Oct 2010 15:27:04 +0100
Subject: [Biopython-dev] _pwm.c
In-Reply-To:
References:
Message-ID:
On Thu, Oct 28, 2010 at 3:16 PM, Dragoslav Zaric
wrote:
> Dear Peter,
>
> I wrote you this yesterday:
>
> I put this in setup.py:
>
> class build_ext_biopython(build_ext):
>     def run(self):
>         if not check_dependencies_once():
>             return
>         # add software that requires NumPy to install
>         # TODO - Convert these for Python 3
>         if is_Numpy_installed():
>             import numpy
>             numpy_include_dir = numpy.get_include()
>             #self.extensions.append(
>             #    Extension('Bio.Cluster.cluster',
>             #              ['Bio/Cluster/clustermodule.c',
>             #               'Bio/Cluster/cluster.c'],
>             #              include_dirs=[numpy_include_dir],
>             #              ))
>             #self.extensions.append(
>             #    Extension('Bio.KDTree._CKDTree',
>             #              ["Bio/KDTree/KDTree.c",
>             #               "Bio/KDTree/KDTreemodule.c"],
>             #              include_dirs=[numpy_include_dir],
>             #              ))
>             self.extensions.append(
>                 Extension('Bio.Motif._pwm',
>                           ["Bio/Motif/_pwm.c"],
>                           include_dirs=[numpy_include_dir],
>                           ))
>         build_ext.run(self)
>
> and then I ran:
>
> python3.1 setup.py build_ext
>
> This is output:
>
> Biopython does not yet officially support Python 3, but you
> can try it by first using the 2to3 script on our source code.
> For details on how to use 2to3 with Biopython see README.
> If you still haven't applied 2to3 to Biopython please abort now.
> Do you want to continue this installation? (y/N):
> y
> running build_ext
> building 'Bio.Motif._pwm' extension
> creating build/temp.linux-i686-3.1
> creating build/temp.linux-i686-3.1/Bio
> creating build/temp.linux-i686-3.1/Bio/Motif
> gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O3 -Wall
> -Wstrict-prototypes -fPIC
> -I/usr/local/lib/python3.1/site-packages/numpy/core/include
> -I/usr/local/include/python3.1 -c Bio/Motif/_pwm.c -o
> build/temp.linux-i686-3.1/Bio/Motif/_pwm.o
> Bio/Motif/_pwm.c: In function ‘init_pwm’:
> Bio/Motif/_pwm.c:123: warning: ‘return’ with a value, in function returning void
> Bio/Motif/_pwm.c:125: warning: implicit declaration of function ‘Py_InitModule4’
> Bio/Motif/_pwm.c:129: warning: assignment makes pointer from integer
> without a cast
> gcc -pthread -shared build/temp.linux-i686-3.1/Bio/Motif/_pwm.o -o
> build/lib/Bio/Motif/_pwm.so
>
>
> So as you can see this is compiling, but there are some warnings. So what is
> the plan: to compile totally without warnings?
Well ideally no warnings - but of those three warnings only the one about
Py_InitModule4 strikes me as important. This was part of the Python 2.x
C API used to tell Python about the functions your code provides, and has
been changed in Python 3.x (I think you must use PyModule_Create instead).
What happens if you try to use the compiled module in Python 3? e.g.
from Bio import Motif
from Bio.Motif import _pwm
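A small sketch for trying those imports and reporting what happens, rather
than stopping at the first traceback. try_import is a hypothetical helper
written for this example; the module names are the ones from this thread:

```python
import importlib


def try_import(name):
    """Try to import a module by name; return (ok, message) (hypothetical helper)."""
    try:
        module = importlib.import_module(name)
    except Exception as err:  # a broken C extension may fail in several ways
        return False, "%s: %s" % (type(err).__name__, err)
    return True, getattr(module, "__file__", "(built-in)")


for name in ("Bio.Motif", "Bio.Motif._pwm"):
    ok, message = try_import(name)
    print(name, "OK" if ok else "FAILED", message)
```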
Bartek - could you give us a short (Python 2) example of Bio.Motif
which uses the C module _pwm?
Peter
From barwil at gmail.com Thu Oct 28 10:37:18 2010
From: barwil at gmail.com (Bartek Wilczynski)
Date: Thu, 28 Oct 2010 16:37:18 +0200
Subject: [Biopython-dev] _pwm.c
In-Reply-To:
References:
Message-ID:
On Thu, Oct 28, 2010 at 4:27 PM, Peter wrote:
> On Thu, Oct 28, 2010 at 3:16 PM, Dragoslav Zaric
> wrote:
> > running build_ext
> > building 'Bio.Motif._pwm' extension
> > creating build/temp.linux-i686-3.1
> > creating build/temp.linux-i686-3.1/Bio
> > creating build/temp.linux-i686-3.1/Bio/Motif
> > gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O3 -Wall
> > -Wstrict-prototypes -fPIC
> > -I/usr/local/lib/python3.1/site-packages/numpy/core/include
> > -I/usr/local/include/python3.1 -c Bio/Motif/_pwm.c -o
> > build/temp.linux-i686-3.1/Bio/Motif/_pwm.o
> > Bio/Motif/_pwm.c: In function ‘init_pwm’:
> > Bio/Motif/_pwm.c:123: warning: ‘return’ with a value, in function
> > returning void
> > Bio/Motif/_pwm.c:125: warning: implicit declaration of function
> > ‘Py_InitModule4’
> > Bio/Motif/_pwm.c:129: warning: assignment makes pointer from integer
> > without a cast
> > gcc -pthread -shared build/temp.linux-i686-3.1/Bio/Motif/_pwm.o -o
> > build/lib/Bio/Motif/_pwm.so
> >
> >
> > So as you can see this is compiling, but there are some warnings. So what
> > is the plan: to compile totally without warnings?
>
> Well ideally no warnings - but of those three warnings only the one about
> Py_InitModule4 strikes me as important. This was part of the Python 2.x
> C API used to tell Python about the functions your code provides, and has
> been changed in Python 3.x (I think you must use PyModule_Create instead).
>
> What happens if you try to use the compiled module in Python 3? e.g.
>
> from Bio import Motif
> from Bio.Motif import _pwm
>
> Bartek - could you give us a short (Python 2) example of Bio.Motif
> which uses the C module _pwm?
>
Hi,
This is the fast implementation of DNA motif searching written by Michiel
some time ago. It is exposed in the Bio.Motif API in the form of the .scanPWM
method:
Definition: m.scanPWM(self, seq)
Docstring:
Matrix of log-odds scores for a nucleotide sequence.
scans (using a fast C extension) a nucleotide sequence and returns
the matrix of log-odds scores for all positions
- the result is a one-dimensional numpy array
- the sequence can only be a DNA sequence
- the search is performed only on one strand
It's a very simple module so it should be relatively easy to convert it to
Python 3. Unfortunately, I have no experience with C extensions so I cannot
help much.
If you need a snippet for testing, you can use this:
from Bio import Seq
from Bio import Motif
m=Motif.read(open("Doc/cookbook/motif/SRF.pfm"),"jaspar-pfm")
m.scanPWM(Seq.Seq("ACGTGTGCGTAGTGCGT",m.alphabet))
result should be:
array([-29.18363571, -38.3365097 , -29.17756271, -38.04542542, -20.3014183 ,
-25.18009186], dtype=float32)
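For readers unfamiliar with what such a scan computes, here is a pure Python
sketch of the idea: slide a window of the motif length along one strand and
sum the per-position log-odds values. This is not Michiel's C implementation
or the Bio.Motif API; the function name and the toy matrix are invented for
illustration. It also shows why a 17-base sequence gives 6 scores above,
which would be consistent with a motif of length 12 (17 - 12 + 1 = 6).

```python
def scan_pwm(logodds, sequence):
    """Score every window of a sequence against a log-odds matrix.

    logodds is a list with one dict per motif position, mapping each
    base to its log-odds value (an assumed layout for this sketch).
    """
    width = len(logodds)
    scores = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        scores.append(sum(column[base] for column, base in zip(logodds, window)))
    return scores


# Toy two-column matrix that favours "AC"; the values are made up.
toy = [
    {"A": 1.0, "C": -1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": 1.0, "G": -1.0, "T": -1.0},
]
print(scan_pwm(toy, "ACGAC"))   # [2.0, -2.0, -2.0, 2.0]
```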
hope this helps
--
Bartek Wilczynski
==================
Postdoctoral fellow
EMBL, Furlong group
Meyerhoffstrasse 1,
69012 Heidelberg,
Germany
tel: +49 6221 387 8433
From biopython at maubp.freeserve.co.uk Thu Oct 28 11:54:07 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Thu, 28 Oct 2010 16:54:07 +0100
Subject: [Biopython-dev] _pwm.c
In-Reply-To:
References: