From jonathan.taylor at utoronto.ca Thu Jan 13 02:36:59 2005
From: jonathan.taylor at utoronto.ca (Jonathan Taylor)
Date: Sat Mar 5 14:43:56 2005
Subject: [Biopython-dev] Application instances code
Message-ID: <1105601819.18194.6.camel@localhost.localdomain>
Hi,
My last message got caught by the spam guard for some reason so excuse
the odd subject line.
I am doing quite a bit of infrastructure work using biopython. As I run
into problems I will try to make the code available to the biopython
project as appropriate.
Here is a small patch:
- allows you to easily figure out the return code of an application run
through the Bio.Application framework
- adds Fastacmd functionality to the Bio.Application framework
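
For readers without the patch at hand, the first item can be illustrated
with the standard library alone. This is a minimal sketch, not the code in
the patch: run_and_report is a hypothetical helper, and it relies on the
subprocess module of a modern Python rather than on anything in
Bio.Application.

```python
import subprocess
import sys


def run_and_report(argv):
    """Run a command and return (return_code, stdout, stderr).

    Hypothetical helper for illustration; not part of Bio.Application.
    """
    # capture_output collects both streams; text=True decodes them to str.
    completed = subprocess.run(argv, capture_output=True, text=True)
    return completed.returncode, completed.stdout, completed.stderr


if __name__ == "__main__":
    # The running interpreter stands in for an external tool such as fastacmd.
    code, out, err = run_and_report(
        [sys.executable, "-c", "import sys; sys.exit(3)"]
    )
    print("return code:", code)
```

A caller can then treat any non-zero return code as a failed run instead of
parsing the tool's output to guess whether it succeeded.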
Any problems, just respond to the list as I monitor it regularly.
Regards,
Jonathan Taylor.
Botany Department, University of Toronto.
From jonathan.taylor at utoronto.ca Thu Jan 13 03:17:58 2005
From: jonathan.taylor at utoronto.ca (Jonathan Taylor)
Date: Sat Mar 5 14:43:56 2005
Subject: [Biopython-dev] Application instances code
In-Reply-To: <1105601819.18194.6.camel@localhost.localdomain>
References: <1105601819.18194.6.camel@localhost.localdomain>
Message-ID: <1105604278.18194.9.camel@localhost.localdomain>
My mail client did not seem to attach the patch when I resent.
Here,
Jonathan Taylor.
Botany Department, University of Toronto.
On Thu, 2005-01-13 at 02:36 -0500, Jonathan Taylor wrote:
> Hi,
>
> My last message got caught by the spam guard for some reason so excuse
> the odd subject line.
>
> I am doing quite a bit of infrastructure work using biopython. As I run
> into problems I will try to make the code available to the biopython
> project as appropriate.
>
> Here is a small patch:
> - allows you to easily figure out the return code of an application run
> through the Bio.Application framework
> - adds Fastacmd functionality to the Bio.Application framework
>
> Any problems, just respond to the list as I monitor it regularly.
>
> Regards,
> Jonathan Taylor.
> Botany Department, University of Toronto.
>
> _______________________________________________
> Biopython-dev mailing list
> Biopython-dev@biopython.org
> http://biopython.org/mailman/listinfo/biopython-dev
From jonathan.taylor at utoronto.ca Thu Jan 13 07:50:20 2005
From: jonathan.taylor at utoronto.ca (Jonathan Taylor)
Date: Sat Mar 5 14:43:56 2005
Subject: [Biopython-dev] Application instances code
In-Reply-To: <1105601819.18194.6.camel@localhost.localdomain>
References: <1105601819.18194.6.camel@localhost.localdomain>
Message-ID: <1105620620.18194.27.camel@localhost.localdomain>
http://bbc.botany.utoronto.ca/~jtaylor/biopython-bio_application-return_code_and_fastacmd_support.diff.patch
jon.
On Thu, 2005-01-13 at 02:36 -0500, Jonathan Taylor wrote:
> Hi,
>
> My last message got caught by the spam guard for some reason so excuse
> the odd subject line.
>
> I am doing quite a bit of infrastructure work using biopython. As I run
> into problems I will try to make the code available to the biopython
> project as appropriate.
>
> Here is a small patch:
> - allows you to easily figure out the return code of an application run
> through the Bio.Application framework
> - adds Fastacmd functionality to the Bio.Application framework
>
> Any problems, just respond to the list as I monitor it regularly.
>
> Regards,
> Jonathan Taylor.
> Botany Department, University of Toronto.
From jonathan.taylor at utoronto.ca Thu Jan 13 07:52:24 2005
From: jonathan.taylor at utoronto.ca (Jonathan Taylor)
Date: Sat Mar 5 14:43:56 2005
Subject: [Biopython-dev] Can we turn off the spam filter?
Message-ID: <1105620744.18194.30.camel@localhost.localdomain>
I had a terrible time trying to post a patch. I think it would not
accept anything with an attachment.
In any case if the list is members only then do we really need a spam
filter?
Regards,
Jonathan Taylor.
From jonathan.taylor at utoronto.ca Thu Jan 13 07:38:19 2005
From: jonathan.taylor at utoronto.ca (Jonathan Taylor)
Date: Sat Mar 5 14:43:56 2005
Subject: [Biopython-dev] Application instances code
In-Reply-To: <1105601819.18194.6.camel@localhost.localdomain>
References: <1105601819.18194.6.camel@localhost.localdomain>
Message-ID: <1105619899.18194.18.camel@localhost.localdomain>
my apologies.
On Thu, 2005-01-13 at 02:36 -0500, Jonathan Taylor wrote:
> Hi,
>
> My last message got caught by the spam guard for some reason so excuse
> the odd subject line.
>
> I am doing quite a bit of infrastructure work using biopython. As I run
> into problems I will try to make the code available to the biopython
> project as appropriate.
>
> Here is a small patch:
> - allows you to easily figure out the return code of an application run
> through the Bio.Application framework
> - adds Fastacmd functionality to the Bio.Application framework
>
> Any problems, just respond to the list as I monitor it regularly.
>
> Regards,
> Jonathan Taylor.
> Botany Department, University of Toronto.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: biopython-bio_application-return_code_and_fastacmd_support.diff.patch
Type: text/x-patch
Size: 3149 bytes
Desc: not available
Url : http://portal.open-bio.org/pipermail/biopython-dev/attachments/20050113/187dbd2c/biopython-bio_application-return_code_and_fastacmd_support.diff.bin
From jonathan.taylor at utoronto.ca Thu Jan 13 07:45:15 2005
From: jonathan.taylor at utoronto.ca (Jonathan Taylor)
Date: Sat Mar 5 14:43:56 2005
Subject: [Biopython-dev] Application instances code for biopython
In-Reply-To: <1105604278.18194.9.camel@localhost.localdomain>
References: <1105601819.18194.6.camel@localhost.localdomain>
<1105604278.18194.9.camel@localhost.localdomain>
Message-ID: <1105620315.18194.21.camel@localhost.localdomain>
my apologies.
Jon.
P.S. Can someone please turn down the spam filter.
On Thu, 2005-01-13 at 03:17 -0500, Jonathan Taylor wrote:
> My mail client did not seem to attach the patch when I resent.
>
> Here,
> Jonathan Taylor.
> Botany Department, University of Toronto.
>
>
> On Thu, 2005-01-13 at 02:36 -0500, Jonathan Taylor wrote:
> > Hi,
> >
> > My last message got caught by the spam guard for some reason so excuse
> > the odd subject line.
> >
> > I am doing quite a bit of infrastructure work using biopython. As I run
> > into problems I will try to make the code available to the biopython
> > project as appropriate.
> >
> > Here is a small patch:
> > - allows you to easily figure out the return code of an application run
> > through the Bio.Application framework
> > - adds Fastacmd functionality to the Bio.Application framework
> >
> > Any problems, just respond to the list as I monitor it regularly.
> >
> > Regards,
> > Jonathan Taylor.
> > Botany Department, University of Toronto.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: biopython-bio_application-return_code_and_fastacmd_support.diff.patch
Type: text/x-patch
Size: 3149 bytes
Desc: not available
Url : http://portal.open-bio.org/pipermail/biopython-dev/attachments/20050113/d9b7b440/biopython-bio_application-return_code_and_fastacmd_support.diff.bin
From jonathan.taylor at utoronto.ca Thu Jan 13 07:48:29 2005
From: jonathan.taylor at utoronto.ca (Jonathan Taylor)
Date: Sat Mar 5 14:43:56 2005
Subject: [Biopython-dev] Finished writing the update
In-Reply-To: <1105601819.18194.6.camel@localhost.localdomain>
References: <1105601819.18194.6.camel@localhost.localdomain>
Message-ID: <1105620509.18194.24.camel@localhost.localdomain>
Let's try this header.
Jon.
P.S. The spam filter is crazy for this list. If it's invite-only, can we
not turn it off?
On Thu, 2005-01-13 at 02:36 -0500, Jonathan Taylor wrote:
> Hi,
>
> My last message got caught by the spam guard for some reason so excuse
> the odd subject line.
>
> I am doing quite a bit of infrastructure work using biopython. As I run
> into problems I will try to make the code available to the biopython
> project as appropriate.
>
> Here is a small patch:
> - allows you to easily figure out the return code of an application run
> through the Bio.Application framework
> - adds Fastacmd functionality to the Bio.Application framework
>
> Any problems, just respond to the list as I monitor it regularly.
>
> Regards,
> Jonathan Taylor.
> Botany Department, University of Toronto.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: biopython-bio_application-return_code_and_fastacmd_support.diff.patch
Type: text/x-patch
Size: 3149 bytes
Desc: not available
Url : http://portal.open-bio.org/pipermail/biopython-dev/attachments/20050113/415e83b6/biopython-bio_application-return_code_and_fastacmd_support.diff.bin
From idoerg at burnham.org Thu Jan 13 12:15:11 2005
From: idoerg at burnham.org (Iddo Friedberg)
Date: Sat Mar 5 14:43:56 2005
Subject: [Biopython-dev] Can we turn off the spam filter?
In-Reply-To: <1105620744.18194.30.camel@localhost.localdomain>
References: <1105620744.18194.30.camel@localhost.localdomain>
Message-ID: <41E6AC9F.4060809@burnham.org>
Jonathan Taylor wrote:
>I had a terrible time trying to post a patch. I think it would not
>accept anything with an attachment.
>
>In any case if the list is members only then do we really need a spam
>filter?
>
>Regards,
>Jonathan Taylor.
Jonathan,
Thanks for the patch, and sorry for the frustration this caused you.
I'll patch and commit this to CVS.
Iddo
--
Iddo Friedberg, Ph.D.
The Burnham Institute
10901 N. Torrey Pines Rd.
La Jolla, CA 92037 USA
Tel: +1 (858) 646 3100 x3516
Fax: +1 (858) 713 9930
http://ffas.ljcrf.edu/~iddo
From jonathan.taylor at utoronto.ca Thu Jan 13 12:31:36 2005
From: jonathan.taylor at utoronto.ca (Jonathan Taylor)
Date: Sat Mar 5 14:43:56 2005
Subject: [Biopython-dev] Can we turn off the spam filter?
In-Reply-To: <41E6AC9F.4060809@burnham.org>
References: <1105620744.18194.30.camel@localhost.localdomain>
<41E6AC9F.4060809@burnham.org>
Message-ID: <1105637496.18194.33.camel@localhost.localdomain>
No problem.
Sorry for all the posts. I thought those ones were not going to make it
to the list since I got emails saying mailman thought they were spam.
I'll prob have a few more patches coming in the next month or so.
Cheers.
Jon.
On Thu, 2005-01-13 at 09:15 -0800, Iddo Friedberg wrote:
> Jonathan Taylor wrote:
>
> >I had a terrible time trying to post a patch. I think it would not
> >accept anything with an attachment.
> >
> >In any case if the list is members only then do we really need a spam
> >filter?
> >
> >Regards,
> >Jonathan Taylor.
>
> Jonathan,
>
> Thanks for the patch, and sorry for the frustration this caused you.
> I'll patch and commit this to CVS.
>
> Iddo
>
From bugzilla-daemon at portal.open-bio.org Mon Jan 17 09:52:02 2005
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon@portal.open-bio.org)
Date: Sat Mar 5 14:43:56 2005
Subject: [Biopython-dev] [Bug 1733] compiler recognition in setup.py
Message-ID: <200501171452.j0HEq2kc010627@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=1733
------- Additional Comments From MBatLE@gmx.de 2005-01-17 09:52 -------
The consequence is that the 'Bio.KDTree._CKDTree' extension can't be built.
The relevant output of setup.py is:
building 'Bio.KDTree._CKDTree' extension
creating build/temp.linux-i686-2.3/Bio/KDTree
-I/usr/include/python2.3 -c Bio/KDTree/KDTree.swig.cpp -o
build/temp.linux-i686-2.3/Bio/KDTree/KDTree.swig.o
unable to execute -I/usr/include/python2.3: No such file or directory
error: command '-I/usr/include/python2.3' failed with exit status 1
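
One plausible reading of this output, offered as a sketch rather than a
confirmed diagnosis: if the compiler part of the command is an empty list,
the first flag lands where the program name should be. The build_command
helper below is hypothetical and only mimics how such a command line might
be assembled.

```python
def build_command(compiler, include_dirs, source, output):
    """Assemble a compile command as a list of arguments (illustrative only)."""
    flags = ["-I" + d for d in include_dirs]
    return list(compiler) + flags + ["-c", source, "-o", output]


args = (["/usr/include/python2.3"], "Bio/KDTree/KDTree.swig.cpp", "KDTree.swig.o")

good = build_command(["g++"], *args)
bad = build_command([], *args)  # empty compiler list, as in the report

print(good[0])  # the program that would be executed
print(bad[0])   # the first flag, mistaken for the program
```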
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From bugzilla-daemon at portal.open-bio.org Mon Jan 17 09:47:39 2005
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon@portal.open-bio.org)
Date: Sat Mar 5 14:43:56 2005
Subject: [Biopython-dev] [Bug 1733] New: compiler recognition in setup.py
Message-ID: <200501171447.j0HElcDe010590@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=1733
Summary: compiler recognition in setup.py
Product: Biopython
Version: Not Applicable
Platform: PC
OS/Version: Linux
Status: NEW
Severity: major
Priority: P2
Component: Main Distribution
AssignedTo: biopython-dev@biopython.org
ReportedBy: MBatLE@gmx.de
In the function build_extension (setup.py line 178), the C++ compiler can't be
determined on my installation.
My os.environ looks like the following:
{'SSH_ASKPASS': '/usr/bin/gtk2-ssh-askpass', 'LESS': '-R', 'LESSOPEN':
'|lesspipe.sh %s', 'CVS_RSH': 'ssh', 'LOGNAME': 'root', 'USER': 'root',
'INPUTRC': '/etc/inputrc', 'QTDIR': '/usr/qt/3', 'PATH':
'/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/opt/bin:/usr/i686-pc-linux-gnu/gcc-bin/3.3.5:/opt/ati/bin:/opt/Acrobat5:/usr/X11R6/bin:/opt/blackdown-jdk-1.4.2.01/bin:/opt/blackdown-jdk-1.4.2.01/jre/bin:/usr/qt/3/bin:/usr/kde/3.3/sbin:/usr/kde/3.3/bin',
'PS1': '\\[\\033[01;31m\\]\\h \\[\\033[01;34m\\]\\W \\$ \\[\\033[00m\\]',
'DISPLAY': ':0.0', 'KDEDIR': '/usr/kde/3.3', 'TERM': 'Eterm', 'SHELL':
'/bin/bash', 'JDK_HOME': '/opt/blackdown-jdk-1.4.2.01', 'SHLVL': '1',
'CONFIG_PROTECT_MASK': '/etc/gconf /etc/terminfo', 'G_BROKEN_FILENAMES': '1',
'QMAKESPEC': 'linux-g++', 'EDITOR': '/bin/nano', 'MANPATH':
'/usr/share/man:/usr/local/share/man:/usr/share/gcc-data/i686-pc-linux-gnu/3.3.5/man:/usr/X11R6/man::/opt/blackdown-jdk-1.4.2.01/man:/usr/qt/3/doc/man',
'JAVA_HOME': '/opt/blackdown-jdk-1.4.2.01', 'HOME': '/root', 'KDE_MALLOC': '1',
'INFODIR': '/usr/share/info', 'INFOPATH':
'/usr/share/info:/usr/share/gcc-data/i686-pc-linux-gnu/3.3.5/info', 'GCCBITS':
'', 'CLASSPATH': '.', 'PLAT': 'linux-i686', 'XINITRC': '/etc/X11/xinit/xinitrc',
'MOZILLA_FIVE_HOME': '/usr/lib/mozilla', 'XAUTHORITY': '/root/.xauth78ixrX',
'KDEDIRS': '/usr', '_': '/usr/bin/python', 'JAVAC':
'/opt/blackdown-jdk-1.4.2.01/bin/javac', 'ANT_HOME': '/usr/share/ant-core',
'GDK_USE_XFT': '1', 'OLDPWD': '/root', 'HOSTNAME': 'gkws3', 'CONFIG_PROTECT':
'/usr/lib/mozilla/defaults/pref /usr/X11R6/lib/X11/xkb /usr/kde/3.3/share/config
/usr/kde/3.3/env /usr/kde/3.3/shutdown /usr/share/texmf/tex/generic/config/
/usr/share/texmf/tex/platex/config/ /usr/share/texmf/dvips/config/
/usr/share/texmf/dvipdfm/config/ /usr/share/texmf/xdvi/ /usr/share/config',
'PWD': '/tmp/biopython', 'SGML_CATALOG_FILES':
'/etc/sgml/sgml-ent.cat:/etc/sgml/sgml-docbook.cat:/etc/sgml/openjade-1.3.2.cat:/etc/sgml/sgml-docbook-4.1.cat:/etc/sgml/sgml-docbook-4.0.cat:/etc/sgml/dsssl-docbook-stylesheets.cat:/etc/sgml/sgml-docbook-3.0.cat:/etc/sgml/sgml-docbook-3.1.cat:/etc/sgml/sgml-lite.cat',
'MAIL': '/root/', 'PAGER': '/usr/bin/less', 'PYTHONDOCS':
'/usr/share/doc/python-docs-2.3.4/html'}
and the elif fallback won't get g++ because:
self.compiler.compiler_cxx = []
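
A defensive fallback could look roughly like the following. This is only a
sketch: pick_cxx is a hypothetical helper, and consulting the CXX
environment variable and then g++ or c++ on the PATH is an assumption about
a reasonable default, not what setup.py currently does.

```python
import os
import shutil


def pick_cxx(compiler_cxx, environ=None, candidates=("g++", "c++")):
    """Return a non-empty C++ compiler command list, or raise RuntimeError."""
    environ = os.environ if environ is None else environ
    if compiler_cxx:
        # The build machinery already knows a C++ compiler: keep it.
        return list(compiler_cxx)
    if environ.get("CXX"):
        # Honour an explicit user choice such as CXX="g++ -pthread".
        return environ["CXX"].split()
    for name in candidates:
        # Fall back to the first candidate found on the PATH.
        if shutil.which(name, path=environ.get("PATH")):
            return [name]
    raise RuntimeError("no C++ compiler found; set CXX to choose one")
```

Raising an error here gives a clearer message than letting an empty
compiler list reach the compile step.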
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From mdehoon at ims.u-tokyo.ac.jp Tue Jan 18 03:27:07 2005
From: mdehoon at ims.u-tokyo.ac.jp (Michiel Jan Laurens de Hoon)
Date: Sat Mar 5 14:43:56 2005
Subject: [Biopython-dev] [Bug 1733] compiler recognition in setup.py
In-Reply-To: <200501171452.j0HEq2kc010627@portal.open-bio.org>
References: <200501171452.j0HEq2kc010627@portal.open-bio.org>
Message-ID: <41ECC85B.2010702@ims.u-tokyo.ac.jp>
Compiling the two C++ modules Bio.KDTree and Bio.Affy has been a recurring
problem for many Biopython users. From the KDTree source code, it seems that
KDTree is implemented in C++ for the benefit of speed of the algorithms in
KDTree.cpp. Can these routines be implemented in C or (better yet) using
Numerical Python? Currently, Bio.KDTree is not included with the Windows
installer either, because of compilation problems, and it would be nice to make
this extension available to as many Biopython users as possible. On the other
hand, compilation errors such as below may scare off new users, even if they are
not planning to use Bio.KDTree and Bio.Affy, so it may be better to skip
compilation of these extensions by default in setup.py. Thomas, any suggestions?
--Michiel.
bugzilla-daemon@portal.open-bio.org wrote:
> http://bugzilla.open-bio.org/show_bug.cgi?id=1733
>
>
>
>
>
> ------- Additional Comments From MBatLE@gmx.de 2005-01-17 09:52 -------
> The consequence is that the 'Bio.KDTree._CKDTree' extension can't be built.
> The relevant output of setup.py is:
>
> building 'Bio.KDTree._CKDTree' extension
> creating build/temp.linux-i686-2.3/Bio/KDTree
> -I/usr/include/python2.3 -c Bio/KDTree/KDTree.swig.cpp -o
> build/temp.linux-i686-2.3/Bio/KDTree/KDTree.swig.o
> unable to execute -I/usr/include/python2.3: No such file or directory
> error: command '-I/usr/include/python2.3' failed with exit status 1
>
>
>
>
> ------- You are receiving this mail because: -------
> You are the assignee for the bug, or are watching the assignee.
--
Michiel de Hoon, Assistant Professor
University of Tokyo, Institute of Medical Science
Human Genome Center
4-6-1 Shirokane-dai, Minato-ku
Tokyo 108-8639
Japan
http://bonsai.ims.u-tokyo.ac.jp/~mdehoon
From thamelry at binf.ku.dk Tue Jan 18 02:38:16 2005
From: thamelry at binf.ku.dk (thamelry@binf.ku.dk)
Date: Sat Mar 5 14:43:56 2005
Subject: [Biopython-dev] [Bug 1733] compiler recognition in setup.py
In-Reply-To: <41ECC85B.2010702@ims.u-tokyo.ac.jp>
References: <200501171452.j0HEq2kc010627@portal.open-bio.org>
<41ECC85B.2010702@ims.u-tokyo.ac.jp>
Message-ID: <33213.80.164.86.229.1106033896.squirrel@www.binf.ku.dk>
> From the KDTree source code, it seems
> that KDTree is implemented in C++ for the benefit of speed
Correct.
> Can these routines be implemented in C or (better yet) using
> Numerical Python?
An implementation in C is of course possible. A Numpy implementation is
too slow (the KDTree prototype made use of Python/Numpy). But the problem
really lies with Distutils: it does not deal well with C++ code.
I don't think KDTree will be missed by many Biopython people, so it can be
left out as far as I am concerned. It probably makes sense to make it
available as an independent package (I know some astronomers are using it
to study star maps for example :-).
Best regards,
-Thomas
From mdehoon at ims.u-tokyo.ac.jp Wed Jan 19 08:54:58 2005
From: mdehoon at ims.u-tokyo.ac.jp (Michiel Jan Laurens de Hoon)
Date: Sat Mar 5 14:43:56 2005
Subject: [Biopython-dev] [Bug 1733] compiler recognition in setup.py
In-Reply-To: <33213.80.164.86.229.1106033896.squirrel@www.binf.ku.dk>
References: <200501171452.j0HEq2kc010627@portal.open-bio.org>
<41ECC85B.2010702@ims.u-tokyo.ac.jp>
<33213.80.164.86.229.1106033896.squirrel@www.binf.ku.dk>
Message-ID: <41EE66B2.3060209@ims.u-tokyo.ac.jp>
Thanks, Thomas. Then unless somebody objects, I will switch off compilation of
KDTree by default in setup.py, but still have it included with the Biopython
source distribution.
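
A rough sketch of what "off by default, but still shipped" could mean in
practice is given below. The names OPTIONAL_EXTENSIONS and
select_extensions are hypothetical, as is the placeholder extension
Bio.Example._core; only Bio.KDTree._CKDTree comes from the bug report.

```python
# Extensions that are skipped unless explicitly requested (an assumed policy).
OPTIONAL_EXTENSIONS = {"Bio.KDTree._CKDTree"}


def select_extensions(names, requested=()):
    """Return the extension names to build, dropping optional ones by default."""
    requested = set(requested)
    return [
        name for name in names
        if name not in OPTIONAL_EXTENSIONS or name in requested
    ]


if __name__ == "__main__":
    all_names = ["Bio.Example._core", "Bio.KDTree._CKDTree"]  # first name is made up
    print(select_extensions(all_names))
    print(select_extensions(all_names, requested=["Bio.KDTree._CKDTree"]))
```

The sources stay in the distribution either way; only the default build
list changes.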
--Michiel.
thamelry@binf.ku.dk wrote:
>>From the KDTree source code, it seems
>>that KDTree is implemented in C++ for the benefit of speed
>
>
> Correct.
>
>
>>Can these routines be implemented in C or (better yet) using
>>Numerical Python?
>
>
> An implementation in C is of course possible. A Numpy implementation is
> too slow (the KDTree prototype made use of Python/Numpy). But the problem
> really lies with Distutils: it does not deal well with C++ code.
>
> I don't think KDTree will be missed by many Biopython people, so it can be
> left out as far as I am concerned. It probably makes sense to make it
> available as an independent package (I know some astronomers are using it
> to study star maps for example :-).
>
> Best regards,
>
> -Thomas
>
>
>
>
>
>
>
>
--
Michiel de Hoon, Assistant Professor
University of Tokyo, Institute of Medical Science
Human Genome Center
4-6-1 Shirokane-dai, Minato-ku
Tokyo 108-8639
Japan
http://bonsai.ims.u-tokyo.ac.jp/~mdehoon
From thamelry at binf.ku.dk Wed Jan 19 09:18:27 2005
From: thamelry at binf.ku.dk (Thomas Hamelryck)
Date: Sat Mar 5 14:43:56 2005
Subject: [Biopython-dev] [Bug 1733] compiler recognition in setup.py
In-Reply-To: <41EE66B2.3060209@ims.u-tokyo.ac.jp>
References: <200501171452.j0HEq2kc010627@portal.open-bio.org>
<33213.80.164.86.229.1106033896.squirrel@www.binf.ku.dk>
<41EE66B2.3060209@ims.u-tokyo.ac.jp>
Message-ID: <200501191518.27328.thamelry@binf.ku.dk>
On Wednesday 19 January 2005 14:54, Michiel Jan Laurens de Hoon wrote:
> Thanks, Thomas. Then unless somebody objects, I will switch off compilation
> of KDTree by default in setup.py, but still have it included with the
> Biopython source distribution.
That's fine by me.
-Thomas
From dlondon at ebi.ac.uk Thu Jan 20 12:58:59 2005
From: dlondon at ebi.ac.uk (Darin London)
Date: Sat Mar 5 14:43:57 2005
Subject: [Biopython-dev] BOSC 2005
Message-ID: <20050120175859.GA7254@parrot.ebi.ac.uk>
{Please pass the word!}
MEETING ANNOUNCEMENT & CALL FOR SPEAKERS
The 6th annual Bioinformatics Open Source Conference (BOSC'2005) is organized by the
not-for-profit Open Bioinformatics Foundation. The meeting will take place
June 23-24, 2005 in Detroit, Michigan, USA, and is one of several Special Interest
Group (SIG) meetings occurring in conjunction with the 13th International Conference
on Intelligent Systems for Molecular Biology.
See http://www.iscb.org/ismb2005 for more information.
Because of the power of many Open Source bioinformatics packages in
use by the Research Community today, it is not too presumptuous to say
that the work of the Open Source Bioinformatics Community represents
the cutting edge of Bioinformatics in general. This has been repeatedly
demonstrated by the quality of presentations at previous BOSC conferences.
This year, at BOSC 2005, we want to continue this tradition of excellence,
while presenting this message to a wider part of the Research Community.
Please, pass this message on to anyone you know that is interested in
Bioinformatics software.
BOSC PROGRAM & CONTACT INFO
* Web: http://www.open-bio.org/bosc2005/
* Email: bosc@open-bio.org
FEES
TO BE ANNOUNCED. Watch the BOSC website for more information.
SPEAKERS & ABSTRACTS WANTED
The program committee is currently seeking abstracts for talks at BOSC
2005. BOSC is a great opportunity for you to tell the community about
your use, development, or philosophy of open source software development
in bioinformatics. The committee will select several submitted abstracts
for 25-minute talks and others for shorter "lightning" talks. Accepted
abstracts will be published on the BOSC web site.
If you are interested in speaking at BOSC 2005,
please send us before April 26, 2005:
* an abstract (no more than a few paragraphs)
* a URL for the project page, if applicable
* information about the open source license used for your software or
your release plans.
Abstracts will be accepted for submission until April 26, 2005.
Abstracts chosen for presentation will be announced May 12, 2005
(before the ISMB Early Registration Deadline).
LIGHTNING-TALK SPEAKERS WANTED!
The program committee is currently seeking speakers for the lightning
talks at BOSC 2005. Lightning talks are quick - only five minutes
long - and a great opportunity for you to give people a quick
summary of your open source project, code, idea, or vision of the future.
If you are interested in giving a lightning talk at BOSC 2005,
please send us:
* a brief title and summary (one or two lines)
* a URL for the project page, if applicable
* information about the open source license used for your software or
your release plans.
We will accept entries on-line until BOSC starts, but
space for demos and lightning talks is limited.
SOFTWARE DEMONSTRATIONS WANTED!
If you are involved in the development of Open Source Bioinformatics Software,
you are invited to provide a short demonstration to attendees of BOSC 2005.
If you are interested in giving a software demonstration at BOSC 2005,
please send us:
* a brief title and summary (one or two lines)
* a URL for the project page, if applicable
* Internet connectivity requirements (e.g. a web application served on the
World Wide Web, or a web-based client application).
We will accept entries on-line until BOSC starts, but
space for demos and lightning talks is limited.
** Because the mission of the OBF is to promote Open Source software, we will favor submissions for
projects that apply a recognized Open Source License, or adhere to the general Open Source Philosophy.
See the following websites for further details:
http://www.opensource.org/licenses/
http://www.opensource.org/docs/definition.php
SESSION CHAIRS WANTED
If you would like to be involved in BOSC 2005, we invite you to chair a session. This will
not require much of your time. You will be given a schedule of presenters during your session.
You simply introduce each speaker, and manage the time of their presentation (25 minutes for full
presentations, 5-10 minutes for lightning talks/demos, depending on the number of entries).
If you are interested in chairing a session, please send us your name and affiliation (if applicable).
--
cheers,
Darin London dlondon@ebi.ac.uk European Bioinformatics Institute,
+44 (0)1223 49 2566 Wellcome Trust Genome Campus, Hinxton
+44 (0)1223 49 4468 (fax) Cambridgeshire CB10 1SD, UK
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
Url : http://portal.open-bio.org/pipermail/biopython-dev/attachments/20050120/9bb1da79/attachment.bin
From biopython-dev at maubp.freeserve.co.uk Thu Jan 20 13:35:47 2005
From: biopython-dev at maubp.freeserve.co.uk (Peter)
Date: Sat Mar 5 14:43:57 2005
Subject: [Biopython-dev] GenBank feature iterator
Message-ID: <41EFFA03.7050202@maubp.freeserve.co.uk>
Hello
I'm trying to use BioPython to parse bacterial genomes from the NCBI:-
ftp://ftp.ncbi.nlm.nih.gov/genomes/Bacteria/
My initial impression is that all of the Bio.GenBank methods scale very
badly with the size of the input file.
For example, Nanoarchaeum equitans, file NC_005213.gbk is about 1.2 MB,
and can be loaded in about one minute using either the FeatureParser or
the RecordParser.
ftp://ftp.ncbi.nlm.nih.gov/genomes/Bacteria/Nanoarchaeum_equitans/NC_005213.gbk
However, for larger files the parser seems to run out of system
resources, or maybe requires more time than I have been prepared to give
it. e.g. E. coli K12, file NC_000913.gbk (about 10MB):-
ftp://ftp.ncbi.nlm.nih.gov/genomes/Bacteria/Escherichia_coli_K12/
See also related posts in November 2004, e.g.
http://biopython.org/pipermail/biopython/2004-November/002470.html
To avoid the memory issues, I would like to make a single pass through
the file, iterating over the features (in particular, the CDS features)
one by one into SeqFeature objects (not holding them all in memory at once).
I have tried using the GenBank.Iterator, but as far as I can tell this
reads in a file and each "step" is an entire plasmid/chromosome (the
code looks for the LOCUS line).
It would seem that I would need:
A new "FeatureIterator", ideally using the existing Martel and
mxTextTools 'regular expressions on steroids' framework (which does seem
rather overwhelming!).
and:
A modified version of the FeatureParser to return (just) SeqFeature objects.
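
To make the single-pass idea concrete, here is a minimal sketch that uses
plain Python rather than Martel or Bio.GenBank. The function iter_features
is hypothetical; it assumes the usual GenBank flat-file layout (feature keys
starting in column 6, locations and qualifiers in column 22) and yields raw
text lines, not SeqFeature objects.

```python
import io


def iter_features(handle):
    """Yield (key, lines) for each feature in a GenBank flat file, one at a time."""
    in_table = False
    key, lines = None, []
    for line in handle:
        if line.startswith("FEATURES"):
            in_table = True
            continue
        if not in_table:
            continue
        if line[:1].strip():
            # Text in column 1 (ORIGIN, CONTIG, //) ends the feature table.
            if key is not None:
                yield key, lines
            key, lines, in_table = None, [], False
        elif line[5:6].strip():
            # A feature key starts in column 6; its location starts in column 22.
            if key is not None:
                yield key, lines
            key, lines = line[5:21].strip(), [line[21:].strip()]
        elif key is not None and line.strip():
            # Continuation line: a wrapped location or a /qualifier.
            lines.append(line[21:].strip())
    if key is not None:
        yield key, lines


if __name__ == "__main__":
    sample = "\n".join([
        "LOCUS       DEMO",
        "FEATURES             Location/Qualifiers",
        " " * 5 + "source".ljust(16) + "1..100",
        " " * 21 + '/organism="demo"',
        " " * 5 + "CDS".ljust(16) + "10..50",
        " " * 21 + '/gene="abc"',
        "ORIGIN",
        "        1 acgt",
        "//",
    ])
    for key, lines in iter_features(io.StringIO(sample)):
        if key == "CDS":
            print(key, lines)
```

Because it is a generator, only one feature is held in memory at a time; the
sequence after ORIGIN is skipped rather than stored.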
Any thoughts?
Thanks
Peter
--
PhD Student
MOAC Doctoral Training Centre
University of Warwick, UK
From mdehoon at ims.u-tokyo.ac.jp Fri Jan 21 04:21:05 2005
From: mdehoon at ims.u-tokyo.ac.jp (Michiel Jan Laurens de Hoon)
Date: Sat Mar 5 14:43:57 2005
Subject: [Biopython-dev] Re: [BioPython] build errors. "recompile with
-fPIC" Biopython1.30
and CVS, Python2.3, SuSE 9.1, AMD64 . Solution Found
In-Reply-To:
References:
Message-ID: <41F0C981.4050700@ims.u-tokyo.ac.jp>
Thanks. I have updated setup.py in CVS.
Metzidis Anthony wrote:
> Hi,
> Thanks for following up on the problem.
>
> Taking your advice, I started from the original setup.py from CVS, and
> removed lines 201-202 as you instructed. The build completed
> successfully.
>
> Best,
> Tony
>
> -----Original Message-----
> From: Michiel Jan Laurens de Hoon [mailto:mdehoon@ims.u-tokyo.ac.jp]
> Sent: Friday, January 21, 2005 6:24 AM
> To: Metzidis Anthony; Biopython mailing list
> Subject: Re: [BioPython] build errors. "recompile with -fPIC"
> Biopython1.30 and CVS, Python2.3, SuSE 9.1, AMD64 . Solution Found
>
> Hi Metzidis,
>
> Thanks for the patch. It seems though that removing the hacks completely would
> break compilation under Python 2.2, which is still supported by Biopython. The
> problem can be solved more easily by removing lines 201-202 in setup.py:
>
> elif build: # fix for 2.3, only if we are making C++
> modules
> self.compiler.compiler_so = self.compiler.compiler_cxx
>
> Could you try and see if that fixes the compilation problem on your
> machine?
>
> Compilation of KDTree is likely to be switched off by default in future
> versions
> of biopython because of recurring compilation problems on various
> problems, but
> it would be nice to fix setup.py as much as possible anyway for people
> who want
> to use it.
>
> --Michiel.
>
>
> Metzidis Anthony wrote:
>
>>Hello everyone,
>>
>>My Biopython (both version 1.30 and CVS) build failed with the error
>>"recompile with -fPIC" on the file Bio/KDTree/KDTree.o.
>>
>>I'm using BioPython CVS and 1.30, Python 2.3 on SuSE 9.1, AMD64.
>>
>>I discovered that the problem was caused by some old hacks in setup.py
>>that altered the compilation of some c++ extensions. Those hacks were
>>relevant to Python 2.2 (according to the documentation), but conflicted
>>with python 2.3
>>
>>The solution was to remove those 'hacks'. After that, the build
>>proceeded successfully.
>>
>>I've attached a patch to the current CVS version of setup.py. It seems
>>to apply to biopython 1.30 as well.
>>
>>Perhaps one of the developers could integrate the ideas in the patch
>>into the code.
>>
>>Hope this helps someone!
>>
>>Have a great day!
>>
>>Best,
>>
>>Tony
>>
>
> ------------------------------------------------------------------------
>
>>_______________________________________________
>>BioPython mailing list - BioPython@biopython.org
>>http://biopython.org/mailman/listinfo/biopython
>
>
--
Michiel de Hoon, Assistant Professor
University of Tokyo, Institute of Medical Science
Human Genome Center
4-6-1 Shirokane-dai, Minato-ku
Tokyo 108-8639
Japan
http://bonsai.ims.u-tokyo.ac.jp/~mdehoon
From bugzilla-daemon at portal.open-bio.org Fri Jan 21 03:29:39 2005
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon@portal.open-bio.org)
Date: Sat Mar 5 14:43:57 2005
Subject: [Biopython-dev] [Bug 1735] Bio.Blast.NCBIStandalone.BlastParser
crashs with unusual alignment fragments
Message-ID: <200501210829.j0L8TdUG017823@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=1735
------- Additional Comments From gebauer-jung@ice.mpg.de 2005-01-21 03:29 -------
Created an attachment (id=191)
--> (http://bugzilla.open-bio.org/attachment.cgi?id=191&action=view)
blast output with unusual fragment
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From bugzilla-daemon at portal.open-bio.org Fri Jan 21 03:26:54 2005
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon@portal.open-bio.org)
Date: Sat Mar 5 14:43:57 2005
Subject: [Biopython-dev] [Bug 1735] New:
Bio.Blast.NCBIStandalone.BlastParser crashs with unusual
alignment fragments
Message-ID: <200501210826.j0L8Qski017798@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=1735
Summary: Bio.Blast.NCBIStandalone.BlastParser crashs with unusual
alignment fragments
Product: Biopython
Version: Not Applicable
Platform: PC
OS/Version: Linux
Status: NEW
Severity: normal
Priority: P2
Component: Main Distribution
AssignedTo: biopython-dev@biopython.org
ReportedBy: gebauer-jung@ice.mpg.de
This is possibly a problem with the blastall program rather
than a bug in the parser, but maybe the parser should
be able to handle it.
I will attach the blast output which crashes the parser
due to the crazy fragment within the last alignment.
The following parameters were used for blastall:
-p blastn
-F F (filter off)
-g (gapped blast)
-G 1 (gap opening penalty)
-E 1 (gap extension penalty)
The parser crashes with the message:
SyntaxError: I could not find the query in line
Query: 0 --
Thanks
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From bugzilla-daemon at portal.open-bio.org Wed Jan 26 17:39:04 2005
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon@portal.open-bio.org)
Date: Sat Mar 5 14:43:57 2005
Subject: [Biopython-dev] [Bug 1740] New: change in GenBank.NCBIDictionary()
Message-ID: <200501262239.j0QMd4XR001042@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=1740
Summary: change in GenBank.NCBIDictionary()
Product: Biopython
Version: Not Applicable
Platform: All
URL: http://www.biopython.org/docs/tutorial/Tutorial004.html#toc13
OS/Version: All
Status: NEW
Severity: normal
Priority: P2
Component: Documentation
AssignedTo: biopython-dev@biopython.org
ReportedBy: andrea@salilab.org
In the tutorial and cookbook the python statement
>>> ncbi_dict = GenBank.NCBIDictionary()
should be changed to something like
>>> ncbi_dict = GenBank.NCBIDictionary(database='nucleotide', format = 'genbank')
---
The current statement produces the following error message:
Traceback (most recent call last):
File "./parse_pfam.py", line 234, in ?
main()
File "./parse_pfam.py", line 228, in main
example_genbank()
File "./parse_pfam.py", line 150, in example_genbank
ncbi_dict = GenBank.NCBIDictionary()
TypeError: __init__() takes at least 3 arguments (1 given)
---
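The error above is what Python reports when a constructor with required
arguments is called without them. The following is only a minimal sketch of
that situation: StandInDictionary is a hypothetical stand-in class, not the
real GenBank.NCBIDictionary, and it borrows only the two keyword names
(database, format) from the suggested fix.

```python
class StandInDictionary(object):
    """Hypothetical stand-in; NOT the real GenBank.NCBIDictionary."""

    def __init__(self, database, format):
        # Both arguments are required, so calling the class with no
        # arguments raises TypeError before this body is ever reached.
        self.database = database
        self.format = format


try:
    StandInDictionary()
except TypeError as error:
    print("TypeError: %s" % error)

# Passing both arguments explicitly, as the report suggests, succeeds.
ncbi_like = StandInDictionary(database="nucleotide", format="genbank")
print("%s %s" % (ncbi_like.database, ncbi_like.format))
```

The exact wording of the TypeError message differs between Python versions;
the point is only that the documented call needs the two arguments.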
This is the first bug I have ever submitted. ;-) Hope the format is fine! ;-)
Andrea
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From bugzilla-daemon at portal.open-bio.org Thu Jan 27 17:14:53 2005
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon@portal.open-bio.org)
Date: Sat Mar 5 14:43:57 2005
Subject: [Biopython-dev] [Bug 1741] New: Bug in fasta consumer in
Doc/tutorial.tex and Doc/examples/
Message-ID: <200501272214.j0RMErOG017867@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=1741
Summary: Bug in fasta consumer in Doc/tutorial.tex and
Doc/examples/
Product: Biopython
Version: Not Applicable
Platform: All
OS/Version: All
Status: NEW
Severity: normal
Priority: P2
Component: Documentation
AssignedTo: biopython-dev@biopython.org
ReportedBy: andrea@salilab.org
Overview:
---------
Script fasta_consumer.py in Doc/examples doesn't work
Steps to Reproduce:
-------------------
python fasta_consumer.py
Actual Results:
-------------
Traceback (most recent call last):
File "fasta_consumer.py", line 35, in ?
all_species = extract_organisms("ls_orchid.fasta", 94)
File "fasta_consumer.py", line 21, in extract_organisms
scanner = Fasta._Scanner()
AttributeError: 'module' object has no attribute '_Scanner'
Expected Results:
-----------------
number of species: 92
species names: ['C.irapeanum', 'C.californicum', 'C.fasciculatum',
'C.margaritaceum', 'C.lichiangense', 'C.yatabeanum', 'C.guttatum',
...
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From mdehoon at ims.u-tokyo.ac.jp Fri Jan 28 04:40:32 2005
From: mdehoon at ims.u-tokyo.ac.jp (Michiel Jan Laurens de Hoon)
Date: Sat Mar 5 14:43:57 2005
Subject: [Biopython-dev] Re: [BioPython] documentation issues
In-Reply-To: <200501272056.j0RKuV5c026299@guitar.compbio.ucsf.edu>
References: <200501272056.j0RKuV5c026299@guitar.compbio.ucsf.edu>
Message-ID: <41FA0890.3090803@ims.u-tokyo.ac.jp>
andrea rossi wrote:
> from Bio import GenBank
>
> gi_list = GenBank.search_for("Opuntia AND rpl16")
> print " gi_list[0] ", gi_list[0]
>
> I get the message:
>
> gi_list[0] Error: Sequence Viewer does not have any Presentations
> for code='uilist_text'
>
> instead of:
>
> ['6273291', '6273290', '6273289', '6273287', '6273286', '6273285',
> '6273284']
It seems that NCBI no longer accepts the "uilist" identifier. E.g. try the url
http://www.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?retmode=text&tool=EUtils_Python_client&db=nucleotide&email=biopython-dev%40biopython.org&rettype=uilist&id=6273291
Does anybody know what this should be?
--
Michiel de Hoon, Assistant Professor
University of Tokyo, Institute of Medical Science
Human Genome Center
4-6-1 Shirokane-dai, Minato-ku
Tokyo 108-8639
Japan
http://bonsai.ims.u-tokyo.ac.jp/~mdehoon
From hoffman at ebi.ac.uk Fri Jan 28 09:37:30 2005
From: hoffman at ebi.ac.uk (Michael Hoffman)
Date: Sat Mar 5 14:43:57 2005
Subject: [Biopython-dev] Re: documentation issues
In-Reply-To: <41FA0890.3090803@ims.u-tokyo.ac.jp>
References: <200501272056.j0RKuV5c026299@guitar.compbio.ucsf.edu>
<41FA0890.3090803@ims.u-tokyo.ac.jp>
Message-ID:
On Fri, 28 Jan 2005, Michiel Jan Laurens de Hoon wrote:
> It seems that NCBI no longer accepts the "uilist" identifier. E.g. try the url
>
> http://www.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?retmode=text&tool=EUtils_Python_client&db=nucleotide&email=biopython-dev%40biopython.org&rettype=uilist&id=6273291
It works for PubMed, just not for nucleotide.
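To compare the two databases by hand, the test URL can be rebuilt from its
query parameters. This is a rough sketch that only constructs the URLs (it
makes no network request); the parameter names and values are taken from the
URL quoted above, while the helper name efetch_url and the reuse of the same
ID for the PubMed variant are illustrative assumptions.

```python
from urllib.parse import urlencode

EFETCH = "http://www.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(db, record_id, rettype="uilist"):
    """Build an efetch URL with the same parameters as the quoted one."""
    params = [
        ("retmode", "text"),
        ("tool", "EUtils_Python_client"),
        ("db", db),
        ("email", "biopython-dev@biopython.org"),  # '@' is encoded as %40
        ("rettype", rettype),
        ("id", record_id),
    ]
    return EFETCH + "?" + urlencode(params)


# The failing request from the message above.
nucleotide_url = efetch_url("nucleotide", "6273291")
# Same request against PubMed; the ID is only a placeholder here.
pubmed_url = efetch_url("pubmed", "6273291")

print(nucleotide_url)
print(pubmed_url)
```

Pasting each printed URL into a browser shows whether the rettype is accepted
for that database.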
--
Michael Hoffman
European Bioinformatics Institute
From biopython-dev at maubp.freeserve.co.uk Sun Jan 30 06:13:58 2005
From: biopython-dev at maubp.freeserve.co.uk (Peter)
Date: Sat Mar 5 14:43:57 2005
Subject: [Biopython-dev] GenBank feature iterator
In-Reply-To: <41EFFA03.7050202@maubp.freeserve.co.uk>
References: <41EFFA03.7050202@maubp.freeserve.co.uk>
Message-ID: <41FCC176.9070203@maubp.freeserve.co.uk>
I wrote:
> I'm trying to use BioPython to parse bacterial genomes from the NCBI:-
>
> ftp://ftp.ncbi.nlm.nih.gov/genomes/Bacteria/
>
> My initial impression is that all of the Bio.GenBank methods scale very
> badly with the size of the input file.
More detailed testing would appear to confirm this. On the bright
side, I haven't actually run into any errors parsing unknown formats.
> For example, Nanoarchaeum equitans, file NC_005213.gbk is about 1.2 MB,
> and can be loaded in about one minute using either the FeatureParser or
> the RecordParser.
>
> ftp://ftp.ncbi.nlm.nih.gov/genomes/Bacteria/Nanoarchaeum_equitans/NC_005213.gbk
>
> However, for larger files the parser seems to run out of system
> resources, or maybe requires more time than I have been prepared to give
> it. e.g. E. coli K12, file NC_000913.gbk (about 10MB):-
>
> ftp://ftp.ncbi.nlm.nih.gov/genomes/Bacteria/Escherichia_coli_K12/
While my laptop did not have sufficient RAM to deal with this file,
my desktop could in about three minutes - see below.
> See also related posts in November 2004, e.g.
>
> http://biopython.org/pipermail/biopython/2004-November/002470.html
I have been doing some testing on this, and more memory makes a
marked difference in the size of GenBank file that can be loaded
(comparing my laptop and home desktop).
The following times and memory usage figures are on Windows 2000,
Python 2.3 running the attached script from idle.
NC_003065.gbk 480 kb, 4 seconds, 28 MB RAM
NC_003064.gbk 1,217 kb, 11 seconds, 56 MB RAM
NC_000854.gbk 3,391 kb, 45 seconds, 165 MB RAM
NC_003063.gbk 4,725 kb, 55 seconds, 195 MB RAM
NC_003062.gbk 6,574 kb, 88 seconds, 268 MB RAM
NC_005966.gbk 8,858 kb, 139 seconds, 372 MB RAM
NC_000913.gbk 10,267 kb, 171 seconds, 409 MB RAM
NC_000962.gbk 11,010 kb, 200 seconds, 486 MB RAM
NC_003997.gbk 12,026 kb, 228 seconds, 496 MB RAM
NC_002678.gbk 15,120 kb, 306 seconds, 586 MB RAM
The computer was a 2.26 GHz Intel Pentium 4, with 735 MB RAM.
For larger files (e.g. NC_005027.gbk at 18,211 kb) the system would
run out of memory and begin paging to disk. For this particular
example, the test eventually completed in half an hour.
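As a quick check on how the figures in the table above scale, the following
sketch divides RAM and time by file size. It assumes 1 MB = 1024 kb, which
the table does not state. Under that assumption the RAM used stays at
roughly 40 to 60 times the file size, while the time per megabyte climbs
from under 10 seconds to about 20, so memory looks close to linear in file
size and time somewhat worse than linear.

```python
# (file, size in kb, parse time in seconds, RAM in MB), copied from the table.
RESULTS = [
    ("NC_003065.gbk", 480, 4, 28),
    ("NC_003064.gbk", 1217, 11, 56),
    ("NC_000854.gbk", 3391, 45, 165),
    ("NC_003063.gbk", 4725, 55, 195),
    ("NC_003062.gbk", 6574, 88, 268),
    ("NC_005966.gbk", 8858, 139, 372),
    ("NC_000913.gbk", 10267, 171, 409),
    ("NC_000962.gbk", 11010, 200, 486),
    ("NC_003997.gbk", 12026, 228, 496),
    ("NC_002678.gbk", 15120, 306, 586),
]


def ram_ratio(size_kb, ram_mb):
    """RAM used divided by file size (assumes 1 MB = 1024 kb)."""
    return ram_mb * 1024.0 / size_kb


def seconds_per_mb(size_kb, seconds):
    """Parse time per megabyte of input (same unit assumption)."""
    return seconds * 1024.0 / size_kb


for name, size_kb, seconds, ram_mb in RESULTS:
    print("%-14s RAM/file = %4.1f   s/MB = %4.1f"
          % (name, ram_ratio(size_kb, ram_mb), seconds_per_mb(size_kb, seconds)))
```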
I have not performed this test under Linux, but a couple of examples
suggest the behaviour is similar.
In consuming these vast amounts of memory, is the GenBank parser
really running "as designed"?
It is conceivable that, as smaller genomes have tended to be
sequenced first, the parser was not originally expected to have
to deal with such large genomes.
Or is there a bug here?
> To avoid the memory issues, I would like to make a single pass through
> the file, iterating over the features (in particular, the CDS features)
> one by one into SeqFeature objects (not holding them all in memory at
> once).
>
> I have tried using the GenBank.Iterator, but as far as I can tell this
> reads in a file and each step is an entire plasmid/chromosome (the
> code looks for the LOCUS line).
>
> It would seem that I would need:
>
> A new FeatureIterator, ideally using the existing Martel and
> mxTextTools 'regular expressions on steroids' framework (which does seem
> rather overwhelming!).
>
> and:
>
> A modified version of the FeatureParser to return (just) SeqFeature
> objects.
I have tried (and so far failed) to understand how the Martel and
mxTextTools parser works, and thus to modify it in the way I had hoped.
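For comparison, here is a rough sketch of the single-pass idea in plain
Python. It does not use Martel or mxTextTools and does not build SeqFeature
objects; it only relies on the fixed columns of the GenBank feature table
(feature key starting in column 6, location and qualifiers in column 22).
The name iter_features and the tiny sample record are illustrative
assumptions, and wrapped qualifier values are joined naively with a space.

```python
import io


def iter_features(handle):
    """Yield (key, location, qualifier_lines) one feature at a time."""
    in_table = False
    key, location, qualifiers = None, None, []
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith("FEATURES"):
            in_table = True
            continue
        if not in_table or not line.strip():
            continue
        if not line.startswith(" "):
            # A new top-level keyword (e.g. ORIGIN) ends the feature table.
            if key is not None:
                yield key, location, qualifiers
            key, location, qualifiers = None, None, []
            in_table = False
            continue
        if line[5:6].strip():
            # A new feature key: hand back the previous feature first.
            if key is not None:
                yield key, location, qualifiers
            key, location, qualifiers = line[5:21].strip(), line[21:].strip(), []
        elif key is not None:
            text = line[21:].strip()
            if text.startswith("/"):
                qualifiers.append(text)
            elif qualifiers:
                qualifiers[-1] += " " + text  # wrapped qualifier value
            else:
                location += text  # wrapped location
    if key is not None:
        yield key, location, qualifiers


# Small made-up record, built so that the columns are exact.
SAMPLE = "\n".join([
    "LOCUS       EXAMPLE",
    "FEATURES             Location/Qualifiers",
    "     %-16s%s" % ("source", "1..100"),
    " " * 21 + '/organism="Example"',
    "     %-16s%s" % ("CDS", "join(1..10,"),
    " " * 21 + "20..30)",
    " " * 21 + '/gene="abc"',
    "ORIGIN",
    "        1 acgtacgt",
    "//",
]) + "\n"

for key, location, qualifiers in iter_features(io.StringIO(SAMPLE)):
    if key == "CDS":
        print("%s %s %s" % (key, location, qualifiers))
```

Because this is a generator, only the feature currently being assembled is
held in memory, whatever the size of the file.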
Peter
--
Here is the script, inline rather than as an attachment, which the
mailing list didn't like:
#Following is based on example code from
#http://www.biopython.org/docs/tutorial/Tutorial.html
#3.4.2 Parsing GenBank records
#The example files are all from the NCBI's ftp site,
#ftp://ftp.ncbi.nlm.nih.gov/genomes/Bacteria/
import time
from Bio import GenBank
#The following times and memory usage figures are on Windows 2000,
#Python 2.3 running this script from idle.
#The computer was a 2.26 GHz Intel Pentium 4, with 735 MB RAM.
#The memory usage is at the end of the script, rather than the peak
#value which is slightly higher, as recorded
#by the windows task manager's process watch.
#Each gb_file assignment overrides the previous one, so comment out
#all but the file to be tested.
gb_file = "C:\\genomes\\Bacteria\\Agrobacterium_tumefaciens_C58_Cereon\\NC_003065.gbk"
# 480 kb, 4 seconds, 28 MB RAM
gb_file = "C:\\genomes\\Bacteria\\Agrobacterium_tumefaciens_C58_Cereon\\NC_003064.gbk"
# 1,217 kb, 11 seconds, 56 MB RAM
gb_file = "C:\\genomes\\Bacteria\\Aeropyrum_pernix\\NC_000854.gbk"
# 3,391 kb, 45 seconds, 165 MB RAM
gb_file = "C:\\genomes\\Bacteria\\Agrobacterium_tumefaciens_C58_Cereon\\NC_003063.gbk"
# 4,725 kb, 55 seconds, 195 MB RAM
gb_file = "C:\\genomes\\Bacteria\\Agrobacterium_tumefaciens_C58_Cereon\\NC_003062.gbk"
# 6,574 kb, 88 seconds, 268 MB RAM
gb_file = "C:\\genomes\\Bacteria\\Acinetobacter_sp_ADP1\\NC_005966.gbk"
# 8,858 kb, 139 seconds, 372 MB RAM
gb_file = "C:\\genomes\\Bacteria\\Escherichia_coli_K12\\NC_000913.gbk"
# 10,267 kb, 171 seconds, 409 MB RAM
gb_file = "C:\\genomes\\Bacteria\\Mycobacterium_tuberculosis_H37Rv\\NC_000962.gbk"
# 11,010 kb, 200 seconds, 486 MB RAM
gb_file = "C:\\genomes\\Bacteria\\Bacillus_anthracis_Ames\\NC_003997.gbk"
# 12,026 kb, 228 seconds, 496 MB RAM
gb_file = "C:\\genomes\\Bacteria\\Mesorhizobium_loti\\NC_002678.gbk"
# 15,120 kb, 306 seconds, 586 MB RAM
#During the following file, Windows ran out of RAM and was paging to
#the hard disk almost continuously. After six minutes, peak memory
#usage had been about 700MB and I killed the process. Running this
#script from the command prompt rather than idle took 30 minutes,
#however I do not have a memory usage figure on this:
#gb_file = "C:\\genomes\\Bacteria\\Pirellula_sp\\NC_005027.gbk"
# 18,211 kb
#The following are even larger test examples, that I have not attempted:
#gb_file = "C:\\genomes\\Bacteria\\Bradyrhizobium_japonicum\\NC_004463.gbk"
# 19,500 kb
#gb_file = "C:\\genomes\\Bacteria\\Streptomyces_coelicolor\\NC_003888.gbk"
# 24,390 kb
gb_handle = open(gb_file, 'r')
feature_parser = GenBank.FeatureParser()
start_time = time.time()
gb_iterator = GenBank.Iterator(gb_handle, feature_parser)
count = 0
while 1:
    print "Starting...",
    cur_record = gb_iterator.next()
    print "Done"
    if cur_record is None:
        break
    count = count + 1
    # now do something with the record
    print count, cur_record.name, len(cur_record.features), len(cur_record.seq)
job_time = time.time() - start_time
print "Time elapsed %0.2f seconds" % job_time
From biopython-dev at maubp.freeserve.co.uk Sun Jan 30 06:51:02 2005
From: biopython-dev at maubp.freeserve.co.uk (Peter)
Date: Sat Mar 5 14:43:57 2005
Subject: [Biopython-dev] Can we turn off the spam filter?
In-Reply-To: <1105620744.18194.30.camel@localhost.localdomain>
References: <1105620744.18194.30.camel@localhost.localdomain>
Message-ID: <41FCCA26.6010507@maubp.freeserve.co.uk>
Jonathan Taylor wrote:
> I had a terrible time trying to post a patch. I think it would not
> accept anything with an attachment.
>
> In any case if the list is members only then do we really need a spam
> filter?
It's not just you it doesn't like; my message (sent 28 Jan 2005) with
a Python script (plain text with a .py extension) was also held for
moderator approval because:
"Message has a suspicious header"
(I cancelled it and sent the script in the body rather than as an
attachment)
Perhaps as a compromise, we should allow a few "harmless" attachment
types like .txt, .py and .diff (and continue to block things like
.exe and other windows executables).
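The compromise could be expressed as a simple allow-list on the file
extension. The sketch below is plain Python for illustration only, not
actual mailing list configuration; the function name attachment_allowed is
hypothetical, and the three extensions are the ones suggested above.

```python
import os.path

# Extensions proposed above as "harmless"; everything else is rejected.
ALLOWED_EXTENSIONS = {".txt", ".py", ".diff"}


def attachment_allowed(filename):
    """Return True if the attachment's final extension is on the allow-list."""
    extension = os.path.splitext(filename)[1].lower()
    return extension in ALLOWED_EXTENSIONS


for name in ["patch.diff", "test_genbank.py", "notes.TXT", "setup.exe", "script.py.exe"]:
    print("%-16s %s" % (name, "accept" if attachment_allowed(name) else "hold"))
```

Checking only the final extension means a name such as script.py.exe is
still held, which is the behaviour wanted for Windows executables.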
Peter
From thamelry at binf.ku.dk Fri Jan 28 05:05:58 2005
From: thamelry at binf.ku.dk (Thomas Hamelryck)
Date: Sat Mar 5 14:43:57 2005
Subject: [Biopython-dev] New release?
In-Reply-To: <41FA0890.3090803@ims.u-tokyo.ac.jp>
References: <200501272056.j0RKuV5c026299@guitar.compbio.ucsf.edu>
<41FA0890.3090803@ims.u-tokyo.ac.jp>
Message-ID: <200501281105.58133.thamelry@binf.ku.dk>
Hi everybody,
Shouldn't we have a new release of Biopython soon?
The last release dates from May 2004 (!).
In the case of Bio.PDB for example, the CVS version contains a
whole load of bugfixes and new features. I guess the same applies
to other modules.
Cheers,
--
Thomas Hamelryck, Postdoctoral researcher
Bioinformatics center
University of Copenhagen
Universitetsparken 15 Bygning 10
2100 Copenhagen, Denmark
---
http://www.binf.ku.dk/users/thamelry/