From Paul.Czodrowski at merck.de  Tue May  3 06:56:10 2011
From: Paul.Czodrowski at merck.de (Paul.Czodrowski at merck.de)
Date: Tue, 3 May 2011 12:56:10 +0200
Subject: [Biopython] installation as non-administrator
Message-ID: <OFC31C2F90.568F01B6-ONC1257885.003BAD0B-C1257885.003C134D@merck.de>


Dear folks,

I'm struggling with the Biopython installation.
For installing as a non-administrator, the manual states the following:
http://biopython.org/DIST/docs/install/Installation.html#htoc30

However, the setup.py (version 1.57) does not contain any entry "
include_dirs=["Bio/Cluster", "your_dir/include/python"]
", but rather only "Bio" entries.

(See attached file: setup.py)

Or am I overlooking something?


Regards,
Paul


This message and any attachment are confidential and may be privileged or
otherwise protected from disclosure. If you are not the intended recipient,
you must not copy this message or attachment or disclose the contents to
any other person. If you have received this transmission in error, please
notify the sender immediately and delete the message and any attachment
from your system. Merck KGaA, Darmstadt, Germany and any of its
subsidiaries do not accept liability for any omissions or errors in this
message which may arise as a result of E-Mail-transmission or for damages
resulting from any unauthorized changes of the content of this message and
any attachment thereto. Merck KGaA, Darmstadt, Germany and any of its
subsidiaries do not guarantee that this message is free of viruses and does
not accept liability for any damages caused by any virus transmitted
therewith.

Click http://disclaimer.merck.de to access the German, French, Spanish and
Portuguese versions of this disclaimer.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: setup.py
Type: application/octet-stream
Size: 11597 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/biopython/attachments/20110503/46a0fb21/attachment.obj>

From p.j.a.cock at googlemail.com  Tue May  3 07:31:31 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Tue, 3 May 2011 12:31:31 +0100
Subject: [Biopython] installation as non-administrator
In-Reply-To: <OFC31C2F90.568F01B6-ONC1257885.003BAD0B-C1257885.003C134D@merck.de>
References: <OFC31C2F90.568F01B6-ONC1257885.003BAD0B-C1257885.003C134D@merck.de>
Message-ID: <BANLkTikon6xTasDNcXW9gXZ22FhVdpGswQ@mail.gmail.com>

On Tue, May 3, 2011 at 11:56 AM,  <Paul.Czodrowski at merck.de> wrote:
>
> Dear folks,
>
> I'm struggling with the Biopython installation.
> For installing as a non-administrator, the manual states the following:
> http://biopython.org/DIST/docs/install/Installation.html#htoc30
>
> However, the setup.py (version 1.57) does not contain any entry "
> include_dirs=["Bio/Cluster", "your_dir/include/python"]
> ", but rather only "Bio" entries.
>
> (See attached file: setup.py)

You didn't really need to attach a whole file, you could have
linked to our repository or quoted the bit of interest.

> Or am I overlooking something?

What OS are you using? Some flavour of Linux?

What version of NumPy do you have, and how was it installed?

What command did you use to attempt the install, and what
error message did you get.

Have you tried the --prefix argument?

e.g.

python setup.py build
python setup.py test
python setup.py install --prefix=$HOME
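
As a rough sketch of where an install with --prefix would be expected to
end up (and hence what to put on PYTHONPATH afterwards): this assumes a
current Python 3 and the standard POSIX layout, and the helper name
site_packages_for_prefix is made up for illustration. A distribution-patched
Python may use a different layout.

```python
import os
import sysconfig


def site_packages_for_prefix(prefix):
    """Return the site-packages directory expected under a custom prefix.

    Assumption: the standard "posix_prefix" scheme applies. "platlib" is
    used because Biopython ships compiled extensions (e.g. Bio.Cluster).
    """
    prefix = os.path.expanduser(prefix)
    return sysconfig.get_path(
        "platlib",
        scheme="posix_prefix",
        vars={"base": prefix, "platbase": prefix},
    )


if __name__ == "__main__":
    # Directory to add to PYTHONPATH after:
    #   python setup.py install --prefix=$HOME
    print(site_packages_for_prefix("~"))
```

If "import Bio" still fails after the install, comparing the printed
directory with the entries in sys.path is a quick first check.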

Peter

From anaryin at gmail.com  Tue May  3 07:32:05 2011
From: anaryin at gmail.com (João Rodrigues)
Date: Tue, 3 May 2011 13:32:05 +0200
Subject: [Biopython] installation as non-administrator
In-Reply-To: <OFC31C2F90.568F01B6-ONC1257885.003BAD0B-C1257885.003C134D@merck.de>
References: <OFC31C2F90.568F01B6-ONC1257885.003BAD0B-C1257885.003C134D@merck.de>
Message-ID: <BANLkTikc60KMVGgbmST0-Xh60LXhPLgi3g@mail.gmail.com>

Hey Paul,

I usually keep a copy of Biopython in my home directory, either by supplying
the option --home=/my/home/directory or just by running "python setup.py
build" and then adding the temp/libxxx/ directory to my PYTHONPATH.

Hope it helps,

João [...] Rodrigues
http://nmr.chem.uu.nl/~joao




From anaryin at gmail.com  Tue May  3 07:32:47 2011
From: anaryin at gmail.com (João Rodrigues)
Date: Tue, 3 May 2011 13:32:47 +0200
Subject: [Biopython] installation as non-administrator
In-Reply-To: <BANLkTikc60KMVGgbmST0-Xh60LXhPLgi3g@mail.gmail.com>
References: <OFC31C2F90.568F01B6-ONC1257885.003BAD0B-C1257885.003C134D@merck.de>
	<BANLkTikc60KMVGgbmST0-Xh60LXhPLgi3g@mail.gmail.com>
Message-ID: <BANLkTim2eeP9VN36+Mx-5G0Qa7yvPunJAA@mail.gmail.com>

Sorry, --prefix, not --home.

From mmokrejs at fold.natur.cuni.cz  Tue May  3 08:22:38 2011
From: mmokrejs at fold.natur.cuni.cz (Martin Mokrejs)
Date: Tue, 03 May 2011 14:22:38 +0200
Subject: [Biopython] How to optimize ACE file alignment (from newbler)
Message-ID: <4DBFF38E.7050406@fold.natur.cuni.cz>

Hi,
  I would like to ask how I can optimize the alignment in ACE files
produced by newbler. I see that only the high-quality region is aligned while
the rest is not. I typically ask newbler to place untrimmed reads into the ACE
files so that the low-quality sequence is present; you can see that it could
have been included in the alignment and contributed to the consensus quite well.
  I found a new feature of consed-20 that can re-align the reads,
but it was too slow for me and I had to kill the re-processing of one
contig.
  Is there a way to tell some program to re-align just the columns
from a given position onwards? It should first align to the consensus already
defined and afterwards continue with de novo alignment for as long as possible.
  Alternatively, how do you edit ACE alignments (I mean manually adjust gaps,
move columns back and forth, re-order rows), and do you re-calculate the
consensus?
  This is some sort of a follow-up to "Newbler ACE file to SAM?"
posted to biopython-developers list at http://web.archiveorange.com/archive/v/5dAwXxUKZDTmQdM80MqQ
;)
Martin

From p.j.a.cock at googlemail.com  Tue May  3 09:46:25 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Tue, 3 May 2011 14:46:25 +0100
Subject: [Biopython] How to optimize ACE file alignment (from newbler)
In-Reply-To: <4DBFF38E.7050406@fold.natur.cuni.cz>
References: <4DBFF38E.7050406@fold.natur.cuni.cz>
Message-ID: <BANLkTikDn-jezoM=0g68RyyEFDbVBTEpoQ@mail.gmail.com>

On Tue, May 3, 2011 at 1:22 PM, Martin Mokrejs
<mmokrejs at fold.natur.cuni.cz> wrote:
> Hi,
>  I would like to ask how I can optimize the alignment in ACE files
> produced by newbler. I see that only the high-quality region is aligned while
> the rest is not. I typically ask newbler to place untrimmed reads into the ACE
> files so that the low-quality sequence is present; you can see that it could
> have been included in the alignment and contributed to the consensus quite well.
>  I found a new feature of consed-20 that can re-align the reads,
> but it was too slow for me and I had to kill the re-processing of one
> contig.
>  Is there a way to tell some program to re-align just the columns
> from a given position onwards? It should first align to the consensus already
> defined and afterwards continue with de novo alignment for as long as possible.
>  Alternatively, how do you edit ACE alignments (I mean manually adjust gaps,
> move columns back and forth, re-order rows), and do you re-calculate the
> consensus?
>  This is some sort of a follow-up to "Newbler ACE file to SAM?"
> posted to biopython-developers list at http://web.archiveorange.com/archive/v/5dAwXxUKZDTmQdM80MqQ
> ;)
> Martin

Hi Martin,

Biopython only has an ACE parser, with no support for writing ACE files.
So, even if you did manipulate the parsed ACE file in Biopython, you'd
have to write your own output code (or use a simpler file format).
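
As a minimal sketch of the "simpler file format" route, assuming FASTA is
acceptable as output: the function names format_fasta and
export_ace_to_fasta are made up for illustration, and the records are not
re-aligned in any way; the padded consensus and reads are written out
exactly as the parser returns them.

```python
def format_fasta(name, sequence, description="", width=60):
    """Format one FASTA record, wrapping the sequence at `width` columns."""
    header = ">" + name + (" " + description if description else "")
    chunks = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return "\n".join([header] + chunks) + "\n"


def export_ace_to_fasta(ace_path, fasta_path):
    """Write each contig consensus and its reads from an ACE file as FASTA.

    Needs Biopython; only the read-only Bio.Sequencing.Ace parser is used.
    """
    # Imported here so that format_fasta() also works without Biopython.
    from Bio.Sequencing import Ace

    with open(ace_path) as ace_handle, open(fasta_path, "w") as out:
        for contig in Ace.parse(ace_handle):
            out.write(format_fasta(contig.name, contig.sequence, "consensus"))
            for read in contig.reads:
                out.write(format_fasta(read.rd.name, read.rd.sequence,
                                       "read in " + contig.name))
```

Any edited alignment would still need to be written back by your own code,
since nothing here produces ACE output.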

Regarding assembly editors, have you looked at Gap4 or Gap5?

This might be a good question to ask on the http://seqanswers.com
forum.

Peter


From Paul.Czodrowski at merck.de  Tue May  3 10:38:25 2011
From: Paul.Czodrowski at merck.de (Paul.Czodrowski at merck.de)
Date: Tue, 3 May 2011 16:38:25 +0200
Subject: [Biopython] Antwort: Re:  installation as non-administrator
In-Reply-To: <BANLkTikon6xTasDNcXW9gXZ22FhVdpGswQ@mail.gmail.com>
Message-ID: <OF4DF1F9B4.C7E3A645-ONC1257885.004F4A78-C1257885.00506C42@merck.de>

Dear Peter,


> >
> > Dear folks,
> >
> > I'm struggling with the Biopython installation.
> > For installing as a non-administrator, the manual states the following:
> > http://biopython.org/DIST/docs/install/Installation.html#htoc30
> >
> > However, the setup.py (version 1.57) does not contain any entry "
> > include_dirs=["Bio/Cluster", "your_dir/include/python"]
> > ", but rather only "Bio" entries.
> >
> > (See attached file: setup.py)
>
> You didn't really need to attach a whole file, you could have
> linked to our repository or quoted the bit of interest.
I'm sorry for this!


>
> > Or am I overlooking something?
>
> What OS are you using? Some flavour of Linux?
OpenSuse 11.3

>
> What version of NumPy do you have, and how was it installed?
NumPy version 1.3.0, installed locally by the built-in python routines.

>
> What command did you use to attempt the install, and what
> error message did you get.
python setup.py build
==> ERROR MESSAGE
"
running build
running build_py
running build_ext
building 'Bio.Cluster.cluster' extension
gcc -pthread -fno-strict-aliasing -DNDEBUG -fomit-frame-pointer
-fmessage-length=0 -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector
-funwind-tables -fasynchronous-unwind-tables -g -fPIC
-I/usr/lib/python2.6/site-packages/numpy/core/include
-I/usr/include/python2.6 -c Bio/Cluster/clustermodule.c -o
build/temp.linux-i686-2.6/Bio/Cluster/clustermodule.o
Bio/Cluster/clustermodule.c:2:31: fatal error: numpy/arrayobject.h: No such
file or directory
compilation terminated.
error: command 'gcc' failed with exit status 1
"

>
> Have you tried the --prefix argument?
>
> e.g.
>
> python setup.py build
> python setup.py test
> python setup.py install --prefix=$HOME
>
> Peter

python setup.py test
==> ERROR MESSAGE
"
python setup.py test
running test
Python version: 2.6.5 (r265:79063, Oct 28 2010, 20:56:56)
[GCC 4.5.0 20100604 [gcc-4_5-branch revision 160292]]
Operating system: posix linux2
test_Ace ... ok
test_AlignIO ... ok
test_AlignIO_convert ... ok
test_BioSQL ... /xyz: UserWarning: order location operators are not fully
supported
  % feature.location_operator)
ok
test_BioSQL_SeqIO ... ERROR
test_CAPS ... ok
test_Clustalw ... ok
test_Clustalw_tool ... skipping. Install clustalw or clustalw2 if you want
to use Bio.Clustalw.
test_Cluster ... skipping. If you want to use Bio.Cluster, install NumPy
first and then reinstall Biopython
test_CodonTable ... ok
test_CodonUsage ... ok
test_Compass ... ok
test_Crystal ... ok
test_Dialign_tool ... skipping. Install DIALIGN2-2 if you want to use the
Bio.Align.Applications wrapper.
test_DocSQL ... ok
test_Emboss ... skipping. Install EMBOSS if you want to use Bio.Emboss.
test_EmbossPhylipNew ... skipping. Install the Emboss package 'PhylipNew'
if you want to use the Bio.Emboss.Applications wrappers for phylogenetic
tools.
test_EmbossPrimer ... ok
test_Entrez ... Segmentation fault (core dumped)
"

python setup.py install --prefix=$HOME
==> the same ERROR MESSAGE as from "python setup.py build"


Cheers & thanks in advance,

Paul



From Paul.Czodrowski at merck.de  Tue May  3 10:47:00 2011
From: Paul.Czodrowski at merck.de (Paul.Czodrowski at merck.de)
Date: Tue, 3 May 2011 16:47:00 +0200
Subject: [Biopython] Antwort: Re:  installation as non-administrator
In-Reply-To: <BANLkTikon6xTasDNcXW9gXZ22FhVdpGswQ@mail.gmail.com>
Message-ID: <OF499B25B3.DDB632C4-ONC1257885.0050C9A2-C1257885.00513583@merck.de>

Dear Peter,

maybe as an additional question/issue:
NumPy is not located in
"/usr/lib/python2.6/site-packages/numpy/core/include"
but in another, rather global, Python lib directory.

As stated in my previous email,
python setup.py build gives
"gcc -pthread -fno-strict-aliasing -DNDEBUG -fomit-frame-pointer
-fmessage-length=0 -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector
-funwind-tables -fasynchronous-unwind-tables -g -fPIC
-I/usr/lib/python2.6/site-packages/numpy/core/include
-I/usr/include/python2.6 -c Bio/Cluster/clustermodule.c -o
build/temp.linux-i686-2.6/Bio/Cluster/clustermodule.o
Bio/Cluster/clustermodule.c:2:31: fatal error: numpy/arrayobject.h: No such
file or directory"

and I would like to change the
"-I/usr/lib/python2.6/site-packages/numpy/core/include" option to point to the
directory where NumPy is actually located.

Cheers & thanks,
Paul




From p.j.a.cock at googlemail.com  Tue May  3 11:10:30 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Tue, 3 May 2011 16:10:30 +0100
Subject: [Biopython] Antwort: Re: installation as non-administrator
In-Reply-To: <OF4DF1F9B4.C7E3A645-ONC1257885.004F4A78-C1257885.00506C42@merck.de>
References: <BANLkTikon6xTasDNcXW9gXZ22FhVdpGswQ@mail.gmail.com>
	<OF4DF1F9B4.C7E3A645-ONC1257885.004F4A78-C1257885.00506C42@merck.de>
Message-ID: <BANLkTikAp_+w+r5Kc0OJ-gXFTziPdNOV+w@mail.gmail.com>

On Tue, May 3, 2011 at 3:38 PM,  <Paul.Czodrowski at merck.de> wrote:
> Dear Peter,
>
>
>> >
>> > Dear folks,
>> >
>> > I'm struggling with the Biopython installation.
>> > For installing as a non-administrator, the manual states the following:
>> > http://biopython.org/DIST/docs/install/Installation.html#htoc30
>> >
>> > However, the setup.py (version 1.57) does not contain any entry "
>> > include_dirs=["Bio/Cluster", "your_dir/include/python"]
>> > ", but rather only "Bio" entries.
>> >
>> > (See attached file: setup.py)
>>
>> You didn't really need to attach a whole file, you could have
>> linked to our repository or quoted the bit of interest.
>
> I'm sorry for this!

Don't worry too much, it's a fairly small file, otherwise I wouldn't
have let it through the moderation queue.

>> > Or am I overlooking something?
>>
>> What OS are you using? Some flavour of Linux?
>
> OpenSuse 11.3

Should be fine.

>>
>> What version of NumPy do you have, and how was it installed?
>
> NumPy version 1.3.0, installed locally by the built-in python routines.
>

Any reason for installing such an old version? I'm just curious.

Does NumPy work properly? At the very least, if you run python
does "import numpy" work or give an error? What happens if you
try and do this:

$ python
>>> import numpy
>>> numpy.get_include()
'/usr/local/lib/python2.6/site-packages/numpy/core/include'

(That's the output on one of our Linux machines)

If that doesn't work, perhaps your PYTHONPATH needs setting.
How/where did you install NumPy? e.g. python setup.py install --prefix=$HOME
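
A minimal sketch of the same check as a script, assuming NumPy is
importable at all (the helper name missing_numpy_headers is made up for
illustration). It reports where NumPy was imported from and whether
numpy/arrayobject.h, the header the Bio.Cluster build asks for, really
exists under the include directory:

```python
import os


def missing_numpy_headers(include_dir, headers=("arrayobject.h",)):
    """Return the expected header names that are absent under include_dir/numpy."""
    return [
        name for name in headers
        if not os.path.isfile(os.path.join(include_dir, "numpy", name))
    ]


if __name__ == "__main__":
    try:
        import numpy
    except ImportError:
        print("NumPy is not importable; check PYTHONPATH")
    else:
        include_dir = numpy.get_include()
        print("NumPy imported from:", numpy.__file__)
        print("Include directory:  ", include_dir)
        print("Missing headers:    ", missing_numpy_headers(include_dir) or "none")
```

An include directory that exists but lacks the header would point to an
incomplete NumPy installation rather than a Biopython problem.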


>> What command did you use to attempt the install, and what
>> error message did you get.
> python setup.py build
> ==> ERROR MESSAGE
> "
> running build
> running build_py
> running build_ext
> building 'Bio.Cluster.cluster' extension
> gcc -pthread -fno-strict-aliasing -DNDEBUG -fomit-frame-pointer
> -fmessage-length=0 -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector
> -funwind-tables -fasynchronous-unwind-tables -g -fPIC
> -I/usr/lib/python2.6/site-packages/numpy/core/include
> -I/usr/include/python2.6 -c Bio/Cluster/clustermodule.c -o
> build/temp.linux-i686-2.6/Bio/Cluster/clustermodule.o
> Bio/Cluster/clustermodule.c:2:31: fatal error: numpy/arrayobject.h: No such
> file or directory
> compilation terminated.
> error: command 'gcc' failed with exit status 1
> "

OK, it isn't finding the numpy header files. I'd guess from your next email the
file is /usr/lib/python2.6/site-packages/numpy/core/include/numpy/arrayobject.h

The hack suggested in the installation document is to edit our setup.py
file to point to the path explicitly. There is probably a more elegant way,
right now my guess is that NumPy is not on the python path (see above).

---

From the test results,

> python setup.py test
> running test
> Python version: 2.6.5 (r265:79063, Oct 28 2010, 20:56:56)
> [GCC 4.5.0 20100604 [gcc-4_5-branch revision 160292]]
> Operating system: posix linux2
> test_Ace ... ok
> ...
> test_Entrez ... Segmentation fault (core dumped)

Oh, nasty! That should *not* happen, and is probably a separate
issue to the NumPy header install issue.

Peter

From mmokrejs at fold.natur.cuni.cz  Tue May  3 19:20:13 2011
From: mmokrejs at fold.natur.cuni.cz (Martin Mokrejs)
Date: Wed, 04 May 2011 01:20:13 +0200
Subject: [Biopython] How to optimize ACE file alignment (from newbler)
In-Reply-To: <BANLkTikDn-jezoM=0g68RyyEFDbVBTEpoQ@mail.gmail.com>
References: <4DBFF38E.7050406@fold.natur.cuni.cz>
	<BANLkTikDn-jezoM=0g68RyyEFDbVBTEpoQ@mail.gmail.com>
Message-ID: <4DC08DAD.9000100@fold.natur.cuni.cz>

Hi Peter,
  no I haven't played with gap5 yet, so far only with consed and tablet.
Thanks for noting biopython has no write support for ACE.
Martin


From Paul.Czodrowski at merck.de  Wed May  4 04:47:14 2011
From: Paul.Czodrowski at merck.de (Paul.Czodrowski at merck.de)
Date: Wed, 4 May 2011 10:47:14 +0200
Subject: [Biopython] Antwort: Re: Antwort: Re: installation as
	non-administrator
In-Reply-To: <BANLkTikAp_+w+r5Kc0OJ-gXFTziPdNOV+w@mail.gmail.com>
Message-ID: <OF02A48867.EFE17836-ONC1257886.002402DE-C1257886.0030459B@merck.de>

Dear Peter,


> > Dear Peter,
> >
> >
> >> >
> >> > Dear folks,
> >> >
> >> > I'm struggling with the Biopython installation.
> >> > For installing as a non-administrator, the manual states the following:
> >> > http://biopython.org/DIST/docs/install/Installation.html#htoc30
> >> >
> >> > However, the setup.py (version 1.57) does not contain any entry "
> >> > include_dirs=["Bio/Cluster", "your_dir/include/python"]
> >> > ", but rather only "Bio" entries.
> >> >
> >> > (See attached file: setup.py)
> >>
> >> You didn't really need to attach a whole file, you could have
> >> linked to our repository or quoted the bit of interest.
> >
> > I'm sorry for this!
>
> Don't worry too much, it's a fairly small file, otherwise I wouldn't
> have let it through the moderation queue.
>
> >> > Or am I overlooking something?
> >>
> >> What OS are you using? Some flavour of Linux?
> >
> > OpenSuse 11.3
>
> Should be fine.
>
> >>
> >> What version of NumPy do you have, and how was it installed?
> >
> > NumPy version 1.3.0, installed locally by the built-in python routines.
> >
>
> Any reason for installing such an old version? I'm just curious.

No logical reason... :)


>
> Does NumPy work properly? At the very least, if you run python
> does "import numpy" work or give an error? What happens if you
> try and do this:
>
> $ python
> >>> import numpy
> >>> numpy.get_include()
> '/usr/local/lib/python2.6/site-packages/numpy/core/include'
>
> (That's the output on one of our Linux machines)

We have the same output:
>>>
>>>
>>> numpy.get_include()
'/usr/lib/python2.6/site-packages/numpy/core/include'


>
> If that doesn't work, perhaps your PYTHONPATH needs setting.
> How/where did you install NumPy? e.g. python setup.py install --prefix=$HOME

The /usr/lib python is installed via the yast OpenSuse.
But it seems to me that this installation did not work properly, since
there are only 2 files in the directory
" /usr/lib/python2.6/site-packages/numpy/core/include/numpy/":
- ufunc_api.txt
- multiarray_api.txt

However, we have another installation of NumPy which is located here:
"/SW/python/lib/python2.6/site-packages/lib/python2.6/site-packages/numpy"

And yes, there is a mix-up of the directories... :)


> >> What command did you use to attempt the install, and what
> >> error message did you get.
> > python setup.py build
> > ==> ERROR MESSAGE
> > "
> > running build
> > running build_py
> > running build_ext
> > building 'Bio.Cluster.cluster' extension
> > gcc -pthread -fno-strict-aliasing -DNDEBUG -fomit-frame-pointer
> > -fmessage-length=0 -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector
> > -funwind-tables -fasynchronous-unwind-tables -g -fPIC
> > -I/usr/lib/python2.6/site-packages/numpy/core/include
> > -I/usr/include/python2.6 -c Bio/Cluster/clustermodule.c -o
> > build/temp.linux-i686-2.6/Bio/Cluster/clustermodule.o
> > Bio/Cluster/clustermodule.c:2:31: fatal error: numpy/arrayobject.h: No such
> > file or directory
> > compilation terminated.
> > error: command 'gcc' failed with exit status 1
> > "
>
> OK, it isn't finding the numpy header files. I'd guess from your next email the
> file is /usr/lib/python2.6/site-packages/numpy/core/include/numpy/arrayobject.h

You are wrong about this.
The header file is located here:

"/SW/python/lib/python2.6/site-packages/lib/python2.6/site-packages/numpy/core/include/numpy/"


By setting the PYTHONPATH appropriately, it works properly.


>
> The hack suggested in the installation document is to edit our setup.py
> file to point to the path explicitly. There is probably a more elegant way,
> right now my guess is that NumPy is not on the python path (see above).
>
> ---
>
> From the test results,
>
> > python setup.py test
> > running test
> > Python version: 2.6.5 (r265:79063, Oct 28 2010, 20:56:56)
> > [GCC 4.5.0 20100604 [gcc-4_5-branch revision 160292]]
> > Operating system: posix linux2
> > test_Ace ... ok
> > ...
> > test_Entrez ... Segmentation fault (core dumped)
>
> Oh, nasty! That should *not* happen, and is probably a separate
> issue to the NumPy header install issue.


python setup.py install --prefix=$HOME works fine now.

Should the segmentation fault still be considered?

Cheers & thanks,
Paul

>
> Peter




From p.j.a.cock at googlemail.com  Wed May  4 05:06:11 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 4 May 2011 10:06:11 +0100
Subject: [Biopython] Antwort: Re: Antwort: Re: installation as
	non-administrator
In-Reply-To: <OF02A48867.EFE17836-ONC1257886.002402DE-C1257886.0030459B@merck.de>
References: <BANLkTikAp_+w+r5Kc0OJ-gXFTziPdNOV+w@mail.gmail.com>
	<OF02A48867.EFE17836-ONC1257886.002402DE-C1257886.0030459B@merck.de>
Message-ID: <BANLkTimRtfygjwrUEXS=5x5874HZC6NcwQ@mail.gmail.com>

On Wed, May 4, 2011 at 9:47 AM,  <Paul.Czodrowski at merck.de> wrote:
> Dear Peter,
>
>>
>> Does NumPy work properly? At the very least, if you run python
>> does "import numpy" work or give an error? What happens if you
>> try and do this:
>>
>> $ python
>> >>> import numpy
>> >>> numpy.get_include()
>> '/usr/local/lib/python2.6/site-packages/numpy/core/include'
>>
>> (That's the output on one of our Linux machines)
>
> We have the same output:
>>>>
>>>>
>>>> numpy.get_include()
> '/usr/lib/python2.6/site-packages/numpy/core/include'
>
>
>>
>> If that doesn't work, perhaps your PYTHONPATH needs setting.
>> How/where did you install NumPy? e.g. python setup.py install --prefix=$HOME
>
> The /usr/lib python is installed via the yast OpenSuse.
> But it seems to me that this installation did not work properly,
> since, there are only 2 files in the directory
> " /usr/lib/python2.6/site-packages/numpy/core/include/numpy/":
> - ufunc_api.txt
> - multiarray_api.txt
>
> However, we have another installation of NumPy which is located here:
> "/SW/python/lib/python2.6/site-packages/lib/python2.6/site-packages/numpy"
>
> And yes, there is a mix-up of the directories... :)

I think that explains why the Biopython install didn't work originally,
it found the broken NumPy under /usr/lib rather than your good one
installed under /SW/

You might want to try and remove the broken NumPy, as it may
cause you problems installing other python libraries.
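
To illustrate the shadowing, here is a minimal sketch using only the
standard library. The package name demo_pkg and the helper first_match are
made up for illustration; the two temporary directories merely stand in
for the /usr/lib and /SW/ installations. Directories are searched in
order, so whichever copy comes first wins:

```python
import importlib
import os
import tempfile
from importlib.machinery import PathFinder


def first_match(package, search_path):
    """Return the file that would be imported for `package`, or None.

    Directories are tried in order, as they are on sys.path, so an
    earlier (possibly broken) copy shadows a later (working) one.
    """
    importlib.invalidate_caches()
    spec = PathFinder.find_spec(package, list(search_path))
    return spec.origin if spec else None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as broken, \
            tempfile.TemporaryDirectory() as good:
        # Put an (empty) package called demo_pkg into both directories.
        for root in (broken, good):
            os.makedirs(os.path.join(root, "demo_pkg"))
            open(os.path.join(root, "demo_pkg", "__init__.py"), "w").close()
        print(first_match("demo_pkg", [broken, good]))  # copy under `broken`
        print(first_match("demo_pkg", [good, broken]))  # copy under `good`
```

Entries from PYTHONPATH are placed ahead of the default site-packages
directories, which is consistent with setting it having fixed the build.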

>
> By setting the PYTHONPATH appropriately, it works properly.
>

OK, good.

>> From the test results,
>>
>> > python setup.py test
>> > running test
>> > Python version: 2.6.5 (r265:79063, Oct 28 2010, 20:56:56)
>> > [GCC 4.5.0 20100604 [gcc-4_5-branch revision 160292]]
>> > Operating system: posix linux2
>> > test_Ace ... ok
>> > ...
>> > test_Entrez ... Segmentation fault (core dumped)
>>
>> Oh, nasty! That should *not* happen, and is probably a separate
>> issue to the NumPy header install issue.
>
> python setup.py install --prefix=$HOME works fine now.
>
> Should the segmentation fault still be considered?

Yes please. I assume it still breaks? Can you try changing to the
Tests subdirectory from the Biopython source, and doing:

python test_Entrez.py

That should run just the Entrez tests, and hopefully give a bit
more information about what/when the segmentation fault
occurs. I suspect a problem in one of the Python C libraries
that Biopython is using (since as far as I can recall, all the
Bio.Entrez code is pure python).
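
On a newer Python than the 2.6 used here, the standard faulthandler module
(Python 3.3 and later) can print the Python traceback at the moment of the
crash. A minimal sketch, assuming such a Python is available; the helper
names describe_exit and run_with_faulthandler are made up for illustration:

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn a subprocess return code into a short human-readable summary."""
    if returncode == 0:
        return "exited normally"
    if returncode < 0:
        # On POSIX a negative code means the child was killed by that signal.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = "signal %d" % -returncode
        return "killed by " + name
    return "exited with status %d" % returncode


def run_with_faulthandler(script, *args):
    """Run a script in a child interpreter with faulthandler enabled.

    If the child crashes, faulthandler writes the Python traceback to
    stderr before the process dies.
    """
    command = [sys.executable, "-X", "faulthandler", script] + list(args)
    result = subprocess.run(command)
    return describe_exit(result.returncode)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python this_sketch.py test_Entrez.py
    print(run_with_faulthandler(*sys.argv[1:]))
```

The traceback shows which Python call was active, which helps to narrow
down the C library involved.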

Peter

From mictadlo at gmail.com  Wed May  4 05:59:13 2011
From: mictadlo at gmail.com (Michal)
Date: Wed, 04 May 2011 19:59:13 +1000
Subject: [Biopython] [BioRuby] Interesting BLAST 2.2.25+ XML behaviour
In-Reply-To: <398303E2-1195-4CC2-8B73-09C6C1117892@illinois.edu>
References: <BANLkTi=6_2bFpGhOwxtdjy-DzxUotVWxEg@mail.gmail.com>	<BANLkTinH0y4KQ7_AXt7Ly3TgN9fXxErUzA@mail.gmail.com>
	<398303E2-1195-4CC2-8B73-09C6C1117892@illinois.edu>
Message-ID: <4DC12371.3040204@gmail.com>

Hi Peter,
Do you have the script which read

https://bitbucket.org/galaxy/galaxy-central/src/8eaf07a46623/test-data/blastp_four_human_vs_rhodopsin.xml


and what would be the correct output?

Thank you in advance.

Cheers,
Michal

On 05/03/2011 11:31 PM, Chris Fields wrote:
> Haven't tried this using the latest BLAST+ myself, but it doesn't surprise me too much.  Also agree re: some kind of bug tracking with NCBI; I believe they have an internal one, but it would be nice to have a public interface to it.
>
> chris
>
> On May 3, 2011, at 4:24 AM, Peter Cock wrote:
>
>> Hello all,
>>
>> I've CC'd the BioPerl, BioRuby, BioJava and Biopython development mailing
>> lists to make sure you're aware of this, but can we continue any discussion
>> on the cross-project open-bio-l mailing list please?
>>
>> I noticed that recent versions of BLAST are not using a single <Iteration>
>> block for each query, which was the historical behaviour and assumed
>> by the Biopython BLAST XML parser. This may be a bug in BLAST.
>> See link below for an example.
>>
>> Has anyone else noticed this, and has it been reported to the NCBI yet?
>>
>> Thanks,
>>
>> Peter
>>
>> (Not for the first time, I wish there was a public bug tracker for BLAST,
>> or at least a private bug tracker so we could talk about issues with an
>> NCBI assigned reference number.)
>>
>> ---------- Forwarded message ----------
>> From: Peter Cock <p.j.a.cock at googlemail.com>
>> Date: Wed, Apr 20, 2011 at 6:08 PM
>> Subject: Interesting BLAST 2.2.25+ XML behaviour
>> To: Biopython-Dev Mailing List <biopython-dev at biopython.org>
>>
>>
>> Hi all,
>>
>> Have a look at this XML file from a FASTA vs FASTA search
>> using blastp from  BLAST 2.2.25+ (current release), which
>> is a test file I created for the BLAST+ wrappers in Galaxy:
>>
>> https://bitbucket.org/galaxy/galaxy-central/src/8eaf07a46623/test-data/blastp_four_human_vs_rhodopsin.xml
>>
>> I just put it through the Biopython BLAST XML parser, and
>> was surprised not to get four records back (since as you
>> might guess from the filename, there were four queries).
>>
>> It appears this version of BLAST+ is incrementing the
>> iteration counter for each match... or something like that.
>>
>> Has anyone else noticed this? I wonder if it is accidental...
>>
>> Peter
>>
>> _______________________________________________
>> BioRuby Project - http://www.bioruby.org/
>> BioRuby mailing list
>> BioRuby at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioruby
>


From p.j.a.cock at googlemail.com  Wed May  4 06:36:57 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 4 May 2011 11:36:57 +0100
Subject: [Biopython] [BioRuby] Interesting BLAST 2.2.25+ XML behaviour
In-Reply-To: <4DC12371.3040204@gmail.com>
References: <BANLkTi=6_2bFpGhOwxtdjy-DzxUotVWxEg@mail.gmail.com>
	<BANLkTinH0y4KQ7_AXt7Ly3TgN9fXxErUzA@mail.gmail.com>
	<398303E2-1195-4CC2-8B73-09C6C1117892@illinois.edu>
	<4DC12371.3040204@gmail.com>
Message-ID: <BANLkTinV4uha74Y9jC_f=XLK5LufAn1xHw@mail.gmail.com>

On Wed, May 4, 2011 at 10:59 AM, Michal <mictadlo at gmail.com> wrote:
> Hi Peter,
> Do you have the script which read
>
> https://bitbucket.org/galaxy/galaxy-central/src/8eaf07a46623/test-data/blastp_four_human_vs_rhodopsin.xml
>
>
> and what would be the correct output?
>
> Thank you in advance.
>
> Cheers,
> Michal

Hi Michal,

I'm not quite sure what you're asking, but I'll try. First, the three
data files:

$ wget https://bitbucket.org/galaxy/galaxy-central/src/8eaf07a46623/test-data/blastp_four_human_vs_rhodopsin.xml
$ wget https://bitbucket.org/galaxy/galaxy-central/src/8eaf07a46623/test-data/four_human_proteins.fasta
$ wget https://bitbucket.org/galaxy/galaxy-central/src/8eaf07a46623/test-data/rhodopsin_proteins.fasta

The query file has four sequences,

$ grep -c "^>" four_human_proteins.fasta
4

$ grep "^>" four_human_proteins.fasta
>sp|Q9BS26|ERP44_HUMAN Endoplasmic reticulum resident protein 44 OS=Homo sapiens GN=ERP44 PE=1 SV=1
>sp|Q9NSY1|BMP2K_HUMAN BMP-2-inducible protein kinase OS=Homo sapiens GN=BMP2K PE=1 SV=2
>sp|P06213|INSR_HUMAN Insulin receptor OS=Homo sapiens GN=INSR PE=1 SV=4
>sp|P08100|OPSD_HUMAN Rhodopsin OS=Homo sapiens GN=RHO PE=1 SV=1

Based on past experience, I would expect 4 iteration blocks in the
XML, but in this case I have 24:

$ grep "<Iteration>" -c blastp_four_human_vs_rhodopsin.xml
24
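
The same count, broken down per query, can be sketched in Python with the
standard library alone. This is not the Biopython parser, just an
illustration; the function name iterations_per_query is made up, and the
only thing assumed about the file is that each <Iteration> element contains
an <Iteration_query-ID> child, as the grep output in this message suggests.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def iterations_per_query(xml_source):
    """Count <Iteration> blocks for each <Iteration_query-ID> value.

    xml_source may be a filename or a binary file object.
    """
    counts = Counter()
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        if elem.tag == "Iteration":
            counts[elem.findtext("Iteration_query-ID")] += 1
            elem.clear()  # drop the parsed subtree to keep memory use low
    return counts
```

Calling iterations_per_query("blastp_four_human_vs_rhodopsin.xml") and
printing the items should give one line per query ID with its number of
iteration blocks.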

Notice we get 6 iterations for each query (4 times 6 is 24):

$ grep "<Iteration_query-ID>" blastp_four_human_vs_rhodopsin.xml
      <Iteration_query-ID>sp|Q9BS26|ERP44_HUMAN</Iteration_query-ID>
      <Iteration_query-ID>sp|Q9BS26|ERP44_HUMAN</Iteration_query-ID>
      <Iteration_query-ID>sp|Q9BS26|ERP44_HUMAN</Iteration_query-ID>
      <Iteration_query-ID>sp|Q9BS26|ERP44_HUMAN</Iteration_query-ID>
      <Iteration_query-ID>sp|Q9BS26|ERP44_HUMAN</Iteration_query-ID>
      <Iteration_query-ID>sp|Q9BS26|ERP44_HUMAN</Iteration_query-ID>
      <Iteration_query-ID>sp|Q9NSY1|BMP2K_HUMAN</Iteration_query-ID>
      <Iteration_query-ID>sp|Q9NSY1|BMP2K_HUMAN</Iteration_query-ID>
      <Iteration_query-ID>sp|Q9NSY1|BMP2K_HUMAN</Iteration_query-ID>
      <Iteration_query-ID>sp|Q9NSY1|BMP2K_HUMAN</Iteration_query-ID>
      <Iteration_query-ID>sp|Q9NSY1|BMP2K_HUMAN</Iteration_query-ID>
      <Iteration_query-ID>sp|Q9NSY1|BMP2K_HUMAN</Iteration_query-ID>
      <Iteration_query-ID>sp|P06213|INSR_HUMAN</Iteration_query-ID>
      <Iteration_query-ID>sp|P06213|INSR_HUMAN</Iteration_query-ID>
      <Iteration_query-ID>sp|P06213|INSR_HUMAN</Iteration_query-ID>
      <Iteration_query-ID>sp|P06213|INSR_HUMAN</Iteration_query-ID>
      <Iteration_query-ID>sp|P06213|INSR_HUMAN</Iteration_query-ID>
      <Iteration_query-ID>sp|P06213|INSR_HUMAN</Iteration_query-ID>
      <Iteration_query-ID>sp|P08100|OPSD_HUMAN</Iteration_query-ID>
      <Iteration_query-ID>sp|P08100|OPSD_HUMAN</Iteration_query-ID>
      <Iteration_query-ID>sp|P08100|OPSD_HUMAN</Iteration_query-ID>
      <Iteration_query-ID>sp|P08100|OPSD_HUMAN</Iteration_query-ID>
      <Iteration_query-ID>sp|P08100|OPSD_HUMAN</Iteration_query-ID>
      <Iteration_query-ID>sp|P08100|OPSD_HUMAN</Iteration_query-ID>

Now, using the two FASTA files directly and re-running blastp, what do I get?

$ ~/Downloads/ncbi-blast-2.2.25+/bin/blastp -query four_human_proteins.fasta \
    -subject rhodopsin_proteins.fasta -outfmt 5 | grep "<Iteration>" -c
24

Or again with -parse_deflines, which changes how the hit ID/def is presented:

$ ~/Downloads/ncbi-blast-2.2.25+/bin/blastp -query four_human_proteins.fasta \
    -subject rhodopsin_proteins.fasta -outfmt 5 -parse_deflines \
    | grep "<Iteration>" -c
24

How about older versions?

$ ~/Downloads/ncbi-blast-2.2.24+/bin/blastp -query four_human_proteins.fasta \
    -subject rhodopsin_proteins.fasta -outfmt 5
BLAST engine error: XML formatting is only supported for a database search

I'll have to make a blast database first...

$ ~/Downloads/ncbi-blast-2.2.24+/bin/makeblastdb -in rhodopsin_proteins.fasta \
    -dbtype prot

Building a new DB, current time: 05/04/2011 11:22:57
New DB name:   rhodopsin_proteins.fasta
New DB title:  rhodopsin_proteins.fasta
Sequence type: Protein
Keep Linkouts: T
Keep MBits: T
Maximum file size: 1073741824B
Adding sequences from FASTA; added 6 sequences in 0.105655 seconds.

$ ~/Downloads/ncbi-blast-2.2.25+/bin/blastp -query four_human_proteins.fasta \
    -db rhodopsin_proteins.fasta -outfmt 5 | grep "<Iteration>" -c
4

Look, just four iteration blocks, as I expected! This also works if the
database is built with the -parse_seqids switch.

The same happens with older versions of BLAST+, one <Iteration>
block per query, so four iteration blocks for this example. I tried all
of 2.2.21+, 2.2.22+, 2.2.23+ and 2.2.24+ (running makeblastdb to
give a fresh database, then blastp).

That seems to demonstrate that the bug is specific to the XML output
from FASTA vs FASTA searches (not FASTA vs DB); XML output for such
searches is a new feature in NCBI BLAST 2.2.25+.

I will raise this with the NCBI, and report back.

However, even if the NCBI fix it in the next release, we (Bio*) may
want to update our parsers to cope with this quirk, or at least put a
warning in our BLAST XML parser documentation, as there will be
lots of installations of NCBI BLAST 2.2.25+ in the wild.
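
As a rough idea of what "coping with this quirk" could look like, here is a
minimal sketch using only the Python standard library. It is not how the
Biopython parser works; the names query_id and hits_by_query are made up,
and it assumes the extra iteration blocks for one query are adjacent in the
file and share the same <Iteration_query-ID>, as in the grep output above.

```python
import xml.etree.ElementTree as ET
from itertools import groupby


def query_id(iteration):
    """Return the text of an iteration's <Iteration_query-ID> element."""
    return iteration.findtext("Iteration_query-ID")


def hits_by_query(xml_source):
    """Yield (query_id, hits), merging adjacent iterations for one query.

    xml_source is a filename or a file object holding BLAST XML.
    """
    root = ET.parse(xml_source).getroot()
    for qid, group in groupby(root.iter("Iteration"), key=query_id):
        hits = [hit for iteration in group for hit in iteration.iter("Hit")]
        yield qid, hits
```

With this grouping a file with 24 iteration blocks for four queries would
yield four items, each carrying the hits from all of its blocks. A real
parser would need to decide what to do with per-iteration statistics, which
this sketch simply ignores.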

Peter

From Paul.Czodrowski at merck.de  Wed May  4 07:30:16 2011
From: Paul.Czodrowski at merck.de (Paul.Czodrowski at merck.de)
Date: Wed, 4 May 2011 13:30:16 +0200
Subject: [Biopython] Antwort: Re: Antwort: Re: Antwort: Re: installation
	as	non-administrator
In-Reply-To: <BANLkTimRtfygjwrUEXS=5x5874HZC6NcwQ@mail.gmail.com>
Message-ID: <OF5C3E5063.6478DA17-ONC1257886.003E29CD-C1257886.003F32C7@merck.de>

Dear Peter,


> >> > From the test results,
> >>
> >> > python setup.py test
> >> > running test
> >> > Python version: 2.6.5 (r265:79063, Oct 28 2010, 20:56:56)
> >> > [GCC 4.5.0 20100604 [gcc-4_5-branch revision 160292]]
> >> > Operating system: posix linux2
> >> > test_Ace ... ok
> >> > ...
> >> > test_Entrez ... Segmentation fault (core dumped)
> >>
> >> Oh, nasty! That should *not* happen, and is probably a separate
> >> issue to the NumPy header install issue.
> >
> > python setup.py install --prefix=$HOME works fine now.
> >
> > Should the segmentation fault still be considered?
>
> Yes please. I assume it still breaks? Can you try changing to the
> Tests subdirectory from the Biopython source, and doing:
>
> python test_Entrez.py

I cannot find the src directory.
Here is my Bio/ directory:
"
Affy
Align
AlignIO
Alphabet
Application
Blast
CAPS
Clustalw
Cluster
Compass
cpairwise2.so
Crystal
Data
DocSQL.py
DocSQL.pyc
Emboss
Entrez
ExPASy
File.py
File.pyc
FSSP
GA
GenBank
Geo
Graphics
HMM
HotRand.py
HotRand.pyc
Index.py
Index.pyc
__init__.py
__init__.pyc
InterPro
KDTree
KEGG
kNN.py
kNN.pyc
LogisticRegression.py
LogisticRegression.pyc
MarkovModel.py
MarkovModel.pyc
MaxEntropy.py
MaxEntropy.pyc
Medline
Motif
NaiveBayes.py
NaiveBayes.pyc
NeuralNetwork
Nexus
NMR
pairwise2.py
pairwise2.pyc
Parsers
ParserSupport.py
ParserSupport.pyc
Pathway
PDB
Phylo
PopGen
_py3k.py
_py3k.pyc
Restriction
SCOP
Search.py
Search.pyc
SeqFeature.py
SeqFeature.pyc
SeqIO
Seq.py
Seq.pyc
SeqRecord.py
SeqRecord.pyc
Sequencing
SeqUtils
Statistics
SubsMat
SVDSuperimposer
SwissProt
triefind.py
triefind.pyc
trie.so
UniGene
Wise
"

BTW, python setup.py install --prefix=$HOME did not break.

Thanks & Cheers,
Paul

>
> That should run just the Entrez tests, and hopefully give a bit
> more information about what/when the segmentation fault
> occurs. I suspect a problem in one of the Python C libraries
> that Biopython is using (since as far as I can recall, all the
> Bio.Entrez code is pure python).
>
> Peter



From anaryin at gmail.com  Wed May  4 07:41:07 2011
From: anaryin at gmail.com (João Rodrigues)
Date: Wed, 4 May 2011 13:41:07 +0200
Subject: [Biopython] Antwort: Re: Antwort: Re: Antwort: Re: installation
 as non-administrator
In-Reply-To: <OF5C3E5063.6478DA17-ONC1257886.003E29CD-C1257886.003F32C7@merck.de>
References: <BANLkTimRtfygjwrUEXS=5x5874HZC6NcwQ@mail.gmail.com>
	<OF5C3E5063.6478DA17-ONC1257886.003E29CD-C1257886.003F32C7@merck.de>
Message-ID: <BANLkTikwgMftN9O7RRtqkbQ2tj1-+CGDAA@mail.gmail.com>

At the same level as Bio/ you have another directory called Tests/.

If I list my biopython directory:

joaor at home: ls biopython-git/
*Bio*         BioSQL      CONTRIB     DEPRECATED  Doc         LICENSE
MANIFEST.in NEWS        README      Scripts     *Tests*       build
do2to3.py   setup.py

The file Peter was talking about should be there.

Cheers,

João [...] Rodrigues
http://nmr.chem.uu.nl/~joao


On Wed, May 4, 2011 at 1:30 PM, <Paul.Czodrowski at merck.de> wrote:

> Dear Peter,
>
>
>
> > >> > From the test results,
> > >>
> > >> > python setup.py test
> > >> > running test
> > >> > Python version: 2.6.5 (r265:79063, Oct 28 2010, 20:56:56)
> > >> > [GCC 4.5.0 20100604 [gcc-4_5-branch revision 160292]]
> > >> > Operating system: posix linux2
> > >> > test_Ace ... ok
> > >> > ...
> > >> > test_Entrez ... Segmentation fault (core dumped)
> > >>
> > >> Oh, nasty! That should *not* happen, and is probably a separate
> > >> issue to the NumPy header install issue.
> > >
> > > python setup.py install --prefix=$HOME works fine now.
> > >
> > > Should the segmentation fault still be considered?
> >
> > Yes please. I assume it still breaks? Can you try changing to the
> > Tests subdirectory from the Biopython source, and doing:
> >
> > python test_Entrez.py
>
> I cannot find the src directory.
> Here is my Bio/ directory:
> "
> Affy
> Align
> AlignIO
> Alphabet
> Application
> Blast
> CAPS
> Clustalw
> Cluster
> Compass
> cpairwise2.so
> Crystal
> Data
> DocSQL.py
> DocSQL.pyc
> Emboss
> Entrez
> ExPASy
> File.py
> File.pyc
> FSSP
> GA
> GenBank
> Geo
> Graphics
> HMM
> HotRand.py
> HotRand.pyc
> Index.py
> Index.pyc
> __init__.py
> __init__.pyc
> InterPro
> KDTree
> KEGG
> kNN.py
> kNN.pyc
> LogisticRegression.py
> LogisticRegression.pyc
> MarkovModel.py
> MarkovModel.pyc
> MaxEntropy.py
> MaxEntropy.pyc
> Medline
> Motif
> NaiveBayes.py
> NaiveBayes.pyc
> NeuralNetwork
> Nexus
> NMR
> pairwise2.py
> pairwise2.pyc
> Parsers
> ParserSupport.py
> ParserSupport.pyc
> Pathway
> PDB
> Phylo
> PopGen
> _py3k.py
> _py3k.pyc
> Restriction
> SCOP
> Search.py
> Search.pyc
> SeqFeature.py
> SeqFeature.pyc
> SeqIO
> Seq.py
> Seq.pyc
> SeqRecord.py
> SeqRecord.pyc
> Sequencing
> SeqUtils
> Statistics
> SubsMat
> SVDSuperimposer
> SwissProt
> triefind.py
> triefind.pyc
> trie.so
> UniGene
> Wise
> "
>
> BTW, python setup.py install --prefix=$HOME did not break.
>
> Thanks & Cheers,
> Paul
>
> >
> > That should run just the Entrez tests, and hopefully give a bit
> > more information about what/when the segmentation fault
> > occurs. I suspect a problem in one of the Python C libraries
> > that Biopython is using (since as far as I can recall, all the
> > Bio.Entrez code is pure python).
> >
> > Peter
>


From Paul.Czodrowski at merck.de  Wed May  4 08:40:06 2011
From: Paul.Czodrowski at merck.de (Paul.Czodrowski at merck.de)
Date: Wed, 4 May 2011 14:40:06 +0200
Subject: [Biopython] Antwort: Re: Antwort: Re: Antwort: Re: Antwort: Re:
 installation as non-administrator
In-Reply-To: <BANLkTikwgMftN9O7RRtqkbQ2tj1-+CGDAA@mail.gmail.com>
Message-ID: <OF27B1521F.A060468A-ONC1257886.00457F1B-C1257886.004597A6@merck.de>


Dear Joao & Peter,

this is what I got:

"
Test error handling when presented with Fasta non-XML data ... ok
Test error handling when presented with GenBank non-XML data ... ok
Test parsing XML returned by EFetch, Nucleotide database (first test) ...
ERROR
Test parsing XML returned by EFetch, Protein database ... ERROR
Test parsing XML returned by EFetch, OMIM database ... ERROR
Test parsing XML returned by EFetch, PubMed database (first test) ...
Segmentation fault (core dumped)
"


Cheers,
Paul

> On the same level of Bio/ you have another directory called Tests/.
>
> If I list my biopython directory:
>
> joaor at home: ls biopython-git/
> *Bio*         BioSQL      CONTRIB     DEPRECATED  Doc         LICENSE
> MANIFEST.in NEWS        README      Scripts     *Tests*       build
> do2to3.py   setup.py
>
> The file Peter was talking about should be there.
>
> Cheers,
>
> João [...] Rodrigues
> http://nmr.chem.uu.nl/~joao
>


From p.j.a.cock at googlemail.com  Wed May  4 09:17:21 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 4 May 2011 14:17:21 +0100
Subject: [Biopython] Antwort: Re: Antwort: Re: Antwort: Re: Antwort: Re:
 installation as non-administrator
In-Reply-To: <OF27B1521F.A060468A-ONC1257886.00457F1B-C1257886.004597A6@merck.de>
References: <BANLkTikwgMftN9O7RRtqkbQ2tj1-+CGDAA@mail.gmail.com>
	<OF27B1521F.A060468A-ONC1257886.00457F1B-C1257886.004597A6@merck.de>
Message-ID: <BANLkTim_YYfgFpgtzKdPT-2SJXjXFZY83Q@mail.gmail.com>

On Wed, May 4, 2011 at 1:40 PM,  <Paul.Czodrowski at merck.de> wrote:
>
> Dear Joao & Peter,
>
> this is what I got:
>
> "
> Test error handling when presented with Fasta non-XML data ... ok
> Test error handling when presented with GenBank non-XML data ... ok
> Test parsing XML returned by EFetch, Nucleotide database (first test) ...
> ERROR
> Test parsing XML returned by EFetch, Protein database ... ERROR
> Test parsing XML returned by EFetch, OMIM database ... ERROR
> Test parsing XML returned by EFetch, PubMed database (first test) ...
> Segmentation fault (core dumped)
> "
>
>
> Cheers,
> Paul

Hmm, something amiss with the XML parsing I think, we're
using the Python standard library xml.parsers.expat here.

You said you were using OpenSuse 11.3, and the start of our test
suite reported the following:

Python version: 2.6.5 (r265:79063, Oct 28 2010, 20:56:56)
[GCC 4.5.0 20100604 [gcc-4_5-branch revision 160292]]
Operating system: posix linux2

What version of expat do you have? Try:

$ python
Python 2.6.5 (r265:79063, Apr 16 2010, 13:09:56)
[GCC 4.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from xml.parsers import expat
>>> print expat.__version__
$Revision: 17640 $
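
Note that the $Revision$ string above appears to be the revision of the
Python wrapper module rather than of the Expat library itself. The version
of the underlying Expat library can be read from two documented attributes
of the same module; the sketch below uses print() so that it also runs on
newer Python versions than the 2.6 shown in this thread.

```python
from xml.parsers import expat

# EXPAT_VERSION is a string naming the Expat library in use;
# version_info is a (major, minor, micro) tuple for the same library.
print("Expat library:", expat.EXPAT_VERSION)
print("Version tuple:", expat.version_info)
```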

Do you fancy trying gdb to get a stack trace for us?
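
Where gdb is not to hand, a lighter-weight option on newer interpreters is
the faulthandler module. This is an assumption beyond what the thread
covers: faulthandler is in the standard library only from Python 3.3, so it
is not available in the Python 2.6.5 shown here. It dumps the Python-level
traceback when the process receives a fatal signal such as SIGSEGV; it does
not show the C stack, so gdb is still the tool for that. The log file name
prefix below is made up for illustration.

```python
import faulthandler
import tempfile

# The file must stay open for as long as faulthandler is enabled,
# so keep a module-level reference to it.
crash_log = tempfile.NamedTemporaryFile(
    mode="w", prefix="entrez_crash_", suffix=".log", delete=False
)
faulthandler.enable(file=crash_log)
print("Python tracebacks for fatal signals go to:", crash_log.name)

# After this point, run the code that crashes, for example by importing
# and running the failing test module from the Tests directory.
```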

I've had a quick Google, and the following issue *might* be
related: http://bugs.python.org/issue4877

Peter

From Paul.Czodrowski at merck.de  Wed May  4 09:36:42 2011
From: Paul.Czodrowski at merck.de (Paul.Czodrowski at merck.de)
Date: Wed, 4 May 2011 15:36:42 +0200
Subject: [Biopython] Antwort: Re: Antwort: Re: Antwort: Re: Antwort: Re:
 Antwort: Re: installation as non-administrator
In-Reply-To: <BANLkTim_YYfgFpgtzKdPT-2SJXjXFZY83Q@mail.gmail.com>
Message-ID: <OF5BE89F6E.17E09B10-ONC1257886.004A5D76-C1257886.004AC62C@merck.de>

[Contact details redacted.]

Peter Cock wrote on 04.05.2011 15:17:21:

> On Wed, May 4, 2011 at 1:40 PM,  <Paul.Czodrowski at merck.de> wrote:
> >
> > Dear Joao & Peter,
> >
> > this is what I got:
> >
> > "
> > Test error handling when presented with Fasta non-XML data ... ok
> > Test error handling when presented with GenBank non-XML data ... ok
> > Test parsing XML returned by EFetch, Nucleotide database (first test) ...
> > ERROR
> > Test parsing XML returned by EFetch, Protein database ... ERROR
> > Test parsing XML returned by EFetch, OMIM database ... ERROR
> > Test parsing XML returned by EFetch, PubMed database (first test) ...
> > Segmentation fault (core dumped)
> > "
> >
> >
> > Cheers,
> > Paul
>
> Hmm, something amiss with the XML parsing I think, we're
> using the Python standard library xml.parsers.expat here.
>
> You said you were using OpenSuse 11.3, and the start of our test
> suite reported the following:
>
> Python version: 2.6.5 (r265:79063, Oct 28 2010, 20:56:56)
> [GCC 4.5.0 20100604 [gcc-4_5-branch revision 160292]]
> Operating system: posix linux2
>
> What version of expat do you have? Try:
>
> $ python
> Python 2.6.5 (r265:79063, Apr 16 2010, 13:09:56)
> [GCC 4.4.3] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> from xml.parsers import expat
> >>> print expat.__version__
> $Revision: 17640 $

$Revision: 1.1 $

>
> Do you fancy trying gdb to get a stack trace for us?

How shall I understand your question? Shall I use the gnu debugger in order
to get some debuggable output?

What is the worst case scenario related to biopython, i.e. could it
ultimately lead to any errors/instabilities?


Cheers,
Paul

>
> I've had a quick Google, and the following issue *might* be
> related: http://bugs.python.org/issue4877
>
> Peter



From p.j.a.cock at googlemail.com  Wed May  4 10:13:47 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 4 May 2011 15:13:47 +0100
Subject: [Biopython] Antwort: Re: Antwort: Re: Antwort: Re: Antwort: Re:
 Antwort: Re: installation as non-administrator
In-Reply-To: <OF5BE89F6E.17E09B10-ONC1257886.004A5D76-C1257886.004AC62C@merck.de>
References: <BANLkTim_YYfgFpgtzKdPT-2SJXjXFZY83Q@mail.gmail.com>
	<OF5BE89F6E.17E09B10-ONC1257886.004A5D76-C1257886.004AC62C@merck.de>
Message-ID: <BANLkTi=pMSuYnAGBhPoP85Q53pQ25Bfhiw@mail.gmail.com>

On Wed, May 4, 2011 at 2:36 PM,  <Paul.Czodrowski at merck.de> wrote:
>>
>> Do you fancy trying gdb to get a stack trace for us?
>
> How shall I understand your question? Shall I use the gnu debugger
> in order to get some debuggable output?

Yes please.

With hindsight, "Could you try using the gnu debugger (gdb) to get
a stack trace?" would have been clearer. Are you familiar with gdb?

Was it the "Do you fancy *activity*?" phrasing that was unclear?
Basically meaning "Would you like to do *activity*?".

> What is the worst case scenario related to biopython, i.e. could it
> ultimately lead to any errors/instabilities?

It looks like if you tried to use Biopython's Bio.Entrez module to
parse XML files from the NCBI it would crash. If you are not going
to use that module, you should be fine.

Peter

From Paul.Czodrowski at merck.de  Wed May  4 10:25:23 2011
From: Paul.Czodrowski at merck.de (Paul.Czodrowski at merck.de)
Date: Wed, 4 May 2011 16:25:23 +0200
Subject: [Biopython] Antwort: Re: Antwort: Re: Antwort: Re: Antwort: Re:
 Antwort: Re: Antwort: Re: installation as non-administrator
In-Reply-To: <BANLkTi=pMSuYnAGBhPoP85Q53pQ25Bfhiw@mail.gmail.com>
Message-ID: <OF94F7B8BE.1990457F-ONC1257886.004F0E16-C1257886.004F3B27@merck.de>

[Contact details redacted.]

Peter Cock <p.j.a.cock at googlemail.com> wrote on 04.05.2011 16:13:47:

> On Wed, May 4, 2011 at 2:36 PM,  <Paul.Czodrowski at merck.de> wrote:
> >>
> >> Do you fancy trying gdb to get a stack trace for us?
> >
> > How shall I understand your question? Shall I use the gnu debugger
> > in order to get some debuggable output?
>
> Yes please.
>
> With hindsight, "Could you try using the gnu debugger (gdb) to get
> a stack trace?" would have been clearer. Are you familiar with gdb?
>
> Was it the "Do you fancy *activity*?" phrasing that was unclear?
> Basically meaning "Would you like to do *activity*?".

Yes, it was just the expression you used. I have to admit that English is
not my mother tongue.

>
> > What is the worst case scenario related to biopython, i.e. could it
> > ultimately lead to any errors/instabilities?
>
> It looks like if you tried to use Biopython's Bio.Entrez module to
> parse XML files from the NCBI it would crash. If you are not going
> to use that module, you should be fine.

Good news, thanks :)

And thanks for all the other help, also to JOAO!!


Cheers,
Paul

>
> Peter



From Paul.Czodrowski at merck.de  Tue May 10 03:50:23 2011
From: Paul.Czodrowski at merck.de (Paul.Czodrowski at merck.de)
Date: Tue, 10 May 2011 09:50:23 +0200
Subject: [Biopython] PDB parsing
Message-ID: <OF13FB6CF9.43C8C805-ONC125788C.002A7B07-C125788C.002B113B@merck.de>


Dear folks,

how do I add a B-factor as well as an occupancy column to a PDB file?

I guess Bio.PDB is the appropriate module.
But I already fail with regards to a simple PDB load...


Cheers,
Paul


From anaryin at gmail.com  Tue May 10 04:30:04 2011
From: anaryin at gmail.com (João Rodrigues)
Date: Tue, 10 May 2011 10:30:04 +0200
Subject: [Biopython] PDB parsing
In-Reply-To: <OF13FB6CF9.43C8C805-ONC125788C.002A7B07-C125788C.002B113B@merck.de>
References: <OF13FB6CF9.43C8C805-ONC125788C.002A7B07-C125788C.002B113B@merck.de>
Message-ID: <BANLkTikXPx1FkAgkoez2UzY0O2O+H1AfyQ@mail.gmail.com>

Hey Paul,

When you parse a PDB file with PDBParser it automatically retrieves both
B-factor and occupancy. If it fails to do so for any reason, it defaults
those values to 0.

After parsing, you can set those values explicitly by modifying the
corresponding attribute of the Atom object. So, for example, to change the
B-factor of all your atoms to 10.0, you just have to do:

for atom in structure.get_atoms():
    atom.bfactor = 10.0

Hope this answered your question.
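
If the input file has no occupancy or B-factor columns at all, they can
also be written directly, because the PDB format uses fixed columns:
occupancy in columns 55-60 and the temperature factor (B-factor) in columns
61-66 of each ATOM/HETATM record. The following is a minimal sketch that
does not use Bio.PDB; the function name set_occupancy_and_bfactor and the
default values are made up for illustration.

```python
def set_occupancy_and_bfactor(line, occupancy=1.0, bfactor=10.0):
    """Return an ATOM/HETATM line with columns 55-60 and 61-66 overwritten.

    Other record types are returned unchanged.
    """
    if not line.startswith(("ATOM  ", "HETATM")):
        return line
    body = line.rstrip("\r\n")
    ending = line[len(body):]          # preserve the original line ending
    coords = body[:54].ljust(54)       # columns 1-54: record name through z
    tail = body[66:]                   # columns 67 onwards, if present
    return "%s%6.2f%6.2f%s%s" % (coords, occupancy, bfactor, tail, ending)
```

Applied to every line of a file, this fills in both columns while leaving
the coordinates and any later columns as they were. Values that need more
than six characters would break the column alignment, so they should be
checked before writing.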

Cheers,

João [...] Rodrigues
http://nmr.chem.uu.nl/~joao


On Tue, May 10, 2011 at 9:50 AM, <Paul.Czodrowski at merck.de> wrote:

>
> Dear folks,
>
> how do I add a B-factor as well as an occupancy column to a PDB file?
>
> I guess Bio.PDB is the appropriate module.
> But I already fail with regards to a simple PDB load...
>
>
> Cheers,
> Paul


From Paul.Czodrowski at merck.de  Tue May 10 05:19:54 2011
From: Paul.Czodrowski at merck.de (Paul.Czodrowski at merck.de)
Date: Tue, 10 May 2011 11:19:54 +0200
Subject: [Biopython] Antwort: Re:  PDB parsing
In-Reply-To: <BANLkTikXPx1FkAgkoez2UzY0O2O+H1AfyQ@mail.gmail.com>
Message-ID: <OF4766FF65.603639F6-ONC125788C.00314D66-C125788C.0033431F@merck.de>

Dear Joao,

this one does not work:
"

structure_id = "1234"
PDBFILE = open(filename,'r').read()
p = PDBParser(PERMISSIVE=1)
p._parse(PDBFILE)
pp = p.get_structure(structure_id, PDBFILE)


for atom in pp.get_atoms():
 atom.bfactor = 10.0
 print atom.bfactor
"


"p.get_structure(structure_id, PDBFILE)" seems to get the structural data,
but setting the bfactor does not give any output.


Cheers & Thanks,
Paul




From anaryin at gmail.com  Tue May 10 05:27:37 2011
From: anaryin at gmail.com (=?UTF-8?Q?Jo=C3=A3o_Rodrigues?=)
Date: Tue, 10 May 2011 11:27:37 +0200
Subject: [Biopython] Antwort: Re: PDB parsing
In-Reply-To: <OF4766FF65.603639F6-ONC125788C.00314D66-C125788C.0033431F@merck.de>
References: <BANLkTikXPx1FkAgkoez2UzY0O2O+H1AfyQ@mail.gmail.com>
	<OF4766FF65.603639F6-ONC125788C.00314D66-C125788C.0033431F@merck.de>
Message-ID: <BANLkTinNk59YFkaGv8p6kfiF+YkYNrFfhw@mail.gmail.com>

Hey Paul,

First of all, you should not call _parse on your own. It is already called
when you call get_structure(). Generally, if a method has an underscore
in front of its name, it is meant for internal use and shouldn't be called
unless you really know what you want to do with it. Also, pass
get_structure() the open file (or the file name) rather than the text
returned by read().

What version of Biopython are you using?

I'd do this:

from Bio.PDB import PDBParser

structure_id = "1234"
PDBFILE = open(filename,'r')
p = PDBParser(PERMISSIVE=1)
pp = p.get_structure(structure_id, PDBFILE)

for atom in pp.get_atoms():
    atom.bfactor = 10.0
    print(atom.bfactor)

It works pretty well here, with version 1.57.

Cheers,

João [...] Rodrigues
http://nmr.chem.uu.nl/~joao




From Paul.Czodrowski at merck.de  Tue May 10 05:32:33 2011
From: Paul.Czodrowski at merck.de (Paul.Czodrowski at merck.de)
Date: Tue, 10 May 2011 11:32:33 +0200
Subject: [Biopython] Antwort: Re:  Antwort: Re: PDB parsing
In-Reply-To: <BANLkTinNk59YFkaGv8p6kfiF+YkYNrFfhw@mail.gmail.com>
Message-ID: <OFCA9D82C3.3D436E85-ONC125788C.00344BAC-C125788C.00346BAC@merck.de>

Dear João,


cool, thank you very much so far!

How do I output the newly generated PDB file?

Cheers & thanks,
Paul




From anaryin at gmail.com  Tue May 10 05:38:23 2011
From: anaryin at gmail.com (=?UTF-8?Q?Jo=C3=A3o_Rodrigues?=)
Date: Tue, 10 May 2011 11:38:23 +0200
Subject: [Biopython] Antwort: Re: Antwort: Re: PDB parsing
In-Reply-To: <OFCA9D82C3.3D436E85-ONC125788C.00344BAC-C125788C.00346BAC@merck.de>
References: <BANLkTinNk59YFkaGv8p6kfiF+YkYNrFfhw@mail.gmail.com>
	<OFCA9D82C3.3D436E85-ONC125788C.00344BAC-C125788C.00346BAC@merck.de>
Message-ID: <BANLkTimPyKubjCJCEzXY_ssdYMzpnSoy8A@mail.gmail.com>

Use PDBIO.

from Bio.PDB import PDBIO
IO = PDBIO()
IO.set_structure(your_structure)
IO.save(output_filename)

You can also control which parts of the structure to output with Select.

Check the documentation<http://www.biopython.org/DIST/docs/cookbook/biopdb_faq.pdf>,
it will help you progress much faster :)
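As a rough sketch of the Select route, again assuming a current Biopython on Python 3: the class name CAOnly, the two-atom PDB_TEXT fragment and the in-memory output handle are illustrative choices, not something taken from this thread.

```python
from io import StringIO

from Bio.PDB import PDBIO, PDBParser, Select

# Two made-up ATOM records standing in for a real input file.
PDB_TEXT = (
    "ATOM      1  N   ALA A   1      11.104   6.134  -6.504  1.00  0.00           N\n"
    "ATOM      2  CA  ALA A   1      11.639   6.071  -5.147  1.00  0.00           C\n"
)


class CAOnly(Select):
    """Hypothetical filter: keep only alpha-carbon atoms when saving."""

    def accept_atom(self, atom):
        return atom.get_name() == "CA"


structure = PDBParser(QUIET=True).get_structure("demo", StringIO(PDB_TEXT))
for atom in structure.get_atoms():
    atom.bfactor = 10.0

writer = PDBIO()
writer.set_structure(structure)
out = StringIO()  # a file name such as "out.pdb" can be passed instead
writer.save(out, select=CAOnly())
saved = out.getvalue()
print(saved)
```

Only the atoms accepted by the filter are written, and the written records carry the modified B-factor.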

Cheers,

João [...] Rodrigues
http://nmr.chem.uu.nl/~joao




From Paul.Czodrowski at merck.de  Tue May 10 07:05:50 2011
From: Paul.Czodrowski at merck.de (Paul.Czodrowski at merck.de)
Date: Tue, 10 May 2011 13:05:50 +0200
Subject: [Biopython] Antwort: Re:  Antwort: Re: Antwort: Re: PDB parsing
In-Reply-To: <BANLkTimPyKubjCJCEzXY_ssdYMzpnSoy8A@mail.gmail.com>
Message-ID: <OF1FDFD069.9C28606B-ONC125788C.003C1775-C125788C.003CF5D1@merck.de>

Dear Joao,

thanks for your help and the documentation link!
So far, I was only aware of this documentation:
http://biopython.org/DIST/docs/tutorial/Tutorial.html

in which PDB parsing is only briefly covered.

And, yes, progress is faster now!


Cheers,
Paul




From sainitin7 at gmail.com  Thu May 12 04:39:28 2011
From: sainitin7 at gmail.com (sai nitin)
Date: Thu, 12 May 2011 10:39:28 +0200
Subject: [Biopython] Problem in accessing pcassay database
Message-ID: <BANLkTim4Yh_DuBjuBOA9aUszsZEgXHdNuA@mail.gmail.com>

Hi all,

I am new to Biopython and want to access the pcassay database
programmatically. The exact issue is described below.

--- I have a list of Bioassay AIDs and want to retrieve all of their names.
I tried esummary for this, but it gives an error.
I also tried efetch, but didn't succeed.

Can anybody suggest a possible solution?

Thanks

-- 

Sainitin D

From p.j.a.cock at googlemail.com  Thu May 12 05:15:37 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Thu, 12 May 2011 10:15:37 +0100
Subject: [Biopython] Problem in accessing pcassay database
In-Reply-To: <BANLkTim4Yh_DuBjuBOA9aUszsZEgXHdNuA@mail.gmail.com>
References: <BANLkTim4Yh_DuBjuBOA9aUszsZEgXHdNuA@mail.gmail.com>
Message-ID: <BANLkTinv=45jL_e--=Cj3a4CjpMUPo8v-Q@mail.gmail.com>

On Thu, May 12, 2011 at 9:39 AM, sai nitin <sainitin7 at gmail.com> wrote:
> Hi all,
>
> I am new to Biopython and want to access the pcassay database
> programmatically. The exact issue is described below.
>
> --- I have a list of Bioassay AIDs and want to retrieve all of their names.
> I tried esummary for this, but it gives an error.
> I also tried efetch, but didn't succeed.
>
> Can anybody suggest a possible solution?
>
> Thanks

Hi,

Can you do this by hand? Which website would you use? If NCBI Entrez,
then it should be possible using Biopython's Bio.Entrez module.

Could you give an example, say two Bioassay AIDs, and the expected
results (e.g. URLs to NCBI webpage).

Peter

From p.j.a.cock at googlemail.com  Thu May 12 15:04:45 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Thu, 12 May 2011 20:04:45 +0100
Subject: [Biopython] Problem in accessing pcassay database
In-Reply-To: <BANLkTim1p32E=RnLGqoVRikv+-0E7tdAjw@mail.gmail.com>
References: <BANLkTim4Yh_DuBjuBOA9aUszsZEgXHdNuA@mail.gmail.com>
	<BANLkTinv=45jL_e--=Cj3a4CjpMUPo8v-Q@mail.gmail.com>
	<BANLkTim1p32E=RnLGqoVRikv+-0E7tdAjw@mail.gmail.com>
Message-ID: <BANLkTi=aXruQWV-gcM=CtQJjQycL6g_h_A@mail.gmail.com>

Please CC the mailing list on any reply.

On Thu, May 12, 2011 at 6:59 PM, sai nitin <sainitin7 at gmail.com> wrote:
> Hi Peter,
> Thanks for the reply. Yes, I tried the Bio.Entrez module (Biopython). OK, let
> me explain the issue more clearly. Say I have an AID as follows:
> 1. AID: 504582. I want to retrieve the Description section details from this URL
> (http://pubchem.ncbi.nlm.nih.gov/assay/assay.cgi?aid=504582&loc=ea_ras)
> Like this, I have 20-30 AIDs, and I want to do this for all of them.
> Any suggestions would be a great help.
> Thanks,
> Sainitin

If you look on the page you linked to, notice AID 504582 is itself a
link to Entrez,
http://www.ncbi.nlm.nih.gov/sites/entrez?cmd=search&db=pcassay&term=504582

So, I would expect an Entrez search for 504582 in the pcassay database
to work. Trying this by hand on the NCBI Entrez website works fine, and
from Biopython you could do the same search with:

Entrez.esearch(db="pcassay", term="504582")
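For a list of 20-30 AIDs, the same search could be wrapped in a loop. This is only a sketch: the helper names normalize_aid and search_pcassay are made up here, the e-mail address is whatever contact address you give NCBI, and the code returns just the UIDs that esearch reports. Retrieving the description text would need a further step that this thread does not settle.

```python
from Bio import Entrez


def normalize_aid(raw):
    """Reduce strings such as 'AID: 504582' to the bare number '504582'."""
    return raw.upper().replace("AID", "").replace(":", "").strip()


def search_pcassay(aids, email):
    """Run one Entrez search per AID and map each AID to the UIDs returned."""
    Entrez.email = email  # NCBI asks for a contact address with each request
    hits = {}
    for aid in map(normalize_aid, aids):
        handle = Entrez.esearch(db="pcassay", term=aid)
        record = Entrez.read(handle)
        handle.close()
        hits[aid] = list(record["IdList"])
    return hits
```

Calling search_pcassay(["AID: 504582"], "you@example.org") needs network access, so nothing is executed when the definitions are loaded.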

Peter


From mictadlo at gmail.com  Sun May 15 01:35:07 2011
From: mictadlo at gmail.com (Michal)
Date: Sun, 15 May 2011 15:35:07 +1000
Subject: [Biopython] multiprocessing problem with pysam
In-Reply-To: <20110412013119.GF2053@kunkel>
References: <4DA1137E.1090803@gmail.com> <20110410111510.GA2634@kunkel>
	<4DA2EC9D.7040004@gmail.com> <20110412013119.GF2053@kunkel>
Message-ID: <4DCF660B.30309@gmail.com>

Hello,
Thank you Brad. I have written the following new code:

import re
import os
import pysam
from pprint import pprint
from multiprocessing import Pool


class Test():

     def __init__(self, bam_filename, cultivars):
         self.__bam_fh = pysam.Samfile(bam_filename, "rb")
         self.__cultivars = cultivars

     def run(self, ref_name):
         print os.getpid(), ref_name, self.__cultivars
         return (os.getpid(), ref_name)


if __name__ == '__main__':
     cultivars = 'Ja,Ea,As'.replace(' ', '').split(',')
     bam_filename = "/media/usb/tests/test.bam"

     bamfile = pysam.Samfile(bam_filename, "rb")

     ref_names = bamfile.references
     ref_lengths = bamfile.lengths
     bamfile.close()

#    for ref_name in ref_names:
#        Test(bam_filename, cultivars).run(ref_name)


     pool = Pool()
     results = dict(pool.imap_unordered(
         Test(bam_filename, cultivars).run, ref_names))
     pool.close()
     pool.join()
     pprint(results)


and got the following error:

Exception in thread Thread-2:
Traceback (most recent call last):
  File "/home/mictadlo/apps/python/lib/python2.7/threading.py", line 530, in __bootstrap_inner
    self.run()
  File "/home/mictadlo/apps/python/lib/python2.7/threading.py", line 483, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/home/mictadlo/apps/python/lib/python2.7/multiprocessing/pool.py", line 285, in _handle_tasks
    put(task)
PicklingError: Can't pickle <type 'instancemethod'>: attribute lookup __builtin__.instancemethod failed
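The error comes from multiprocessing having to pickle whatever it hands to the worker processes. A small standard-library sketch of that check, written for Python 3 rather than the Python 2.7 in the traceback: HoldsHandle and can_pickle are made-up names, and an ordinary file opened on os.devnull stands in for the pysam handle. Python 3 can pickle bound methods in general, but only if the instance behind them is picklable, which an object holding an open file is not.

```python
import os
import pickle


class HoldsHandle:
    """Stand-in for a class that keeps an open file as instance state."""

    def __init__(self, path):
        self.fh = open(path, "rb")

    def run(self, ref_name):
        return ref_name


def can_pickle(obj):
    """Return True if obj survives pickle.dumps, as multiprocessing requires."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, TypeError, AttributeError):
        return False


worker = HoldsHandle(os.devnull)
# The bound method drags the whole instance, open handle included, along.
method_ok = can_pickle(worker.run)
# A plain tuple of strings and lists pickles without trouble.
plain_ok = can_pickle(("test.bam", ["Ja", "Ea", "As"], "chr1"))
worker.fh.close()

print(method_ok, plain_ok)
```

The bound method fails the check and the tuple passes, which is why handing plain data to a module-level function is the usual way out.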


I searched and found two possible solutions for this problem:
* http://www.doughellmann.com/PyMOTW/multiprocessing/communication.html
* http://www.rueckstiess.net/research/snippets/show/ca1d7d90

However, is there a better way to solve it, or are the above solutions
not good?

Thank you in advance.

Michal


From chapmanb at 50mail.com  Sun May 15 11:53:46 2011
From: chapmanb at 50mail.com (Brad Chapman)
Date: Sun, 15 May 2011 11:53:46 -0400
Subject: [Biopython] multiprocessing problem with pysam
In-Reply-To: <4DCF660B.30309@gmail.com>
References: <4DA1137E.1090803@gmail.com> <20110410111510.GA2634@kunkel>
	<4DA2EC9D.7040004@gmail.com> <20110412013119.GF2053@kunkel>
	<4DCF660B.30309@gmail.com>
Message-ID: <20110515155346.GD2530@kunkel>

Michal;

[multiprocessing]
> class Test():
>     def __init__(self, bam_filename, cultivars):
>         self.__bam_fh = pysam.Samfile(bam_filename, "rb")
>         self.__cultivars = cultivars
> 
>     def run(self, ref_name):
>         print os.getpid(), ref_name, self.__cultivars
>         return (os.getpid(), ref_name)
[...]
>     pool = Pool()
>     results = dict(pool.imap_unordered(
>         Test(bam_filename, cultivars).run, ref_names))
[...]
> and got the following error:
> 
> Exception in thread Thread-2:
[...]
> PicklingError: Can't pickle <type 'instancemethod'>: attribute
> lookup __builtin__.instancemethod failed

multiprocessing pickles whatever it sends to the worker processes, which
makes it sensitive to passing or calling complex class objects such as
bound methods. My suggestion is to use functions without associated state
attributes and pass in your information as standard python objects
(strings, lists, dicts). I use a little decorator to make writing
the functions passed easier:

import functools
def map_wrap(f):
    @functools.wraps(f)
    def wrapper(*args, **kwargs):
        return apply(f, *args, **kwargs)
    return wrapper

Then I would write your function as:

@map_wrap
def run_test(bam_filename, cultivars, ref_name):
    bam_fh = pysam.Samfile(bam_filename, "rb")
    print os.getpid(), ref_name, cultivars
    return (os.getpid(), ref_name)

and call it with:

cultivars = 'Ja,Ea,As'.replace(' ', '').split(',')
bam_filename = "/media/usb/tests/test.bam"
bamfile = pysam.Samfile(bam_filename, "rb")
ref_names = bamfile.references
bamfile.close()

pool = Pool()
results = dict(pool.imap(run_test, ((bam_filename, cultivars, ref)
                                    for ref in ref_names)))
pool.close()
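
A minimal sketch of the same pattern for Python 3, where `apply` and the
print statement no longer exist. The worker is a plain module-level function
that takes one tuple of standard Python objects and unpacks it itself, so no
decorator is needed. `build_jobs` and `run_all` are hypothetical helper names,
and the pysam call is left as a comment so that the sketch has no dependencies.

```python
import os
from multiprocessing import Pool


def run_test(job):
    """Worker: takes one picklable tuple and unpacks it itself."""
    bam_filename, cultivars, ref_name = job
    # A real worker would open the BAM file here (for example with pysam),
    # so that no open file handle ever has to be pickled.
    return (ref_name, os.getpid())


def build_jobs(bam_filename, cultivars, ref_names):
    """One tuple of plain Python objects per reference name (hypothetical helper)."""
    return [(bam_filename, cultivars, ref) for ref in ref_names]


def run_all(bam_filename, cultivars, ref_names, processes=2):
    """Map run_test over all jobs; returns {ref_name: worker pid} (hypothetical helper)."""
    jobs = build_jobs(bam_filename, cultivars, ref_names)
    with Pool(processes) as pool:
        return dict(pool.imap(run_test, jobs))
```

The sketch returns (ref_name, pid) rather than (pid, ref_name), because one
worker process usually handles several reference names and a dict keyed on
the pid would silently overwrite entries. Call `run_all` from under an
`if __name__ == "__main__":` guard so that it also works on platforms that
start workers by re-importing the main module.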

Hope this helps,
Brad

From aradwen at gmail.com  Wed May 18 11:28:25 2011
From: aradwen at gmail.com (Radhouane Aniba)
Date: Wed, 18 May 2011 11:28:25 -0400
Subject: [Biopython] Snippets Sharing
Message-ID: <BANLkTikEPiHafURBdsEo2rtxmsFxtKRs5A@mail.gmail.com>

Hi guys,

I apologize if this mail sounds like an ad; please consider it just an
announcement.

I just wanted you to be aware of the change that occurred at biocoders.net.

We restructured it to be an online collaboration tool for bioinformatics:
you can create groups for your projects, interact with other users, upload
snippets and software packages that you find useful, discuss the latest topics
in bioinformatics, find the newest jobs (we partner with the SimplyHired job
board) and much more.

I am keeping this mail short so that you don't feel spammed; that is
not my goal. Just come and explore the new biocoders.net.

Cheers,

Radhouane

From p.j.a.cock at googlemail.com  Wed May 18 16:42:02 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 18 May 2011 21:42:02 +0100
Subject: [Biopython] gff3 problem
In-Reply-To: <20110408121041.GM20963@sobchak>
References: <4D9B0A6D.3040608@gmail.com> <20110405132247.GA20523@sobchak>
	<4D9DB3F4.30107@gmail.com>
	<BANLkTinEjy97gKYUPY_1it1zhLOj6sR+nw@mail.gmail.com>
	<BANLkTikDd_K6LTEYWZHmBSKsGA5aiX2msA@mail.gmail.com>
	<EA39C938-FB7B-4808-8B01-AA2D71504080@hutton.ac.uk>
	<BANLkTim2rv4xjQ8dBkq+Zjjom2ys575c4Q@mail.gmail.com>
	<20110408121041.GM20963@sobchak>
Message-ID: <BANLkTinTuOzNd6JkmxQte1jA=m=S4jD8GA@mail.gmail.com>

On Fri, Apr 8, 2011 at 1:10 PM, Brad Chapman <chapmanb at 50mail.com> wrote:
> Leighton and Peter;
>
>> > Just to further complicate matters, the symbol convention for GFF3 differs
>> > from Biopython in terms of the categories it defines:
>> > + is positive strand
>> > - is negative strand
>> > . is not stranded (i.e. strand not relevant)
>> > ? is strand relevant, but not known
>> > http://www.sequenceontology.org/gff3.shtml
>
> Yes, although this strikes me a bit like fuzzy features in terms of
> usefulness.
>
>> > The latter two are distinct, but not distinguished by convention in
>> > Biopython:
>> > The obvious (to me) mapping of the four allowed Biopython symbols to the
>> > GFF3 convention is:
>> > +1 -> +
>> > -1 -> -
>> > None -> .
>> > 0 -> ?
>> > because 'None' is semantically close to 'has no strand information of
>> > consequence', and 0 is the mean of +1 and -1 ;)
>
> That's fine by me. Right now both '?' and '.' are converted to None
> so I lose the subtle distinction GFF is introducing:
>
> strand_map = {'+' : 1, '-' : -1, '?' : None, None: None}
>
> If everyone agrees on that coding it's no problem to swap it over.
> Brad

So was the consensus that we should reword the Bio.SeqFeature
docstring to say the four valid values for strand are (with GFF3
equivalents in brackets):

+1 = Forward (+ in GFF3)
-1 = Reverse (- in GFF3)
0 = Not stranded (. in GFF3)
None = Unknown (? in GFF3)

And should features on a protein sequence then have strand 0?

Peter

From hxcan at stupidbeauty.com  Thu May 19 01:00:37 2011
From: hxcan at stupidbeauty.com (=?GB2312?B?ssy78Mqk?=)
Date: Thu, 19 May 2011 13:00:37 +0800
Subject: [Biopython] missing dtd file
Message-ID: <4DD4A3F5.8020406@stupidbeauty.com>

An HTML attachment was scrubbed...
URL: <http://lists.open-bio.org/pipermail/biopython/attachments/20110519/d6b242dc/attachment.html>

From p.j.a.cock at googlemail.com  Thu May 19 03:57:17 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Thu, 19 May 2011 08:57:17 +0100
Subject: [Biopython] missing dtd file
In-Reply-To: <4DD4A3F5.8020406@stupidbeauty.com>
References: <4DD4A3F5.8020406@stupidbeauty.com>
Message-ID: <BANLkTi=Fv7iLowfmPwRikkSjxi2M7on-kA@mail.gmail.com>

2011/5/19 蔡火胜 <hxcan at stupidbeauty.com>:
> Hello
>
>
> Entrez module gives this warning:
>
> /usr/lib/python2.6/site-packages/Bio/Entrez/Parser.py:495: UserWarning:
> Unable to load DTD file eLink_101123.dtd.
>
> Bio.Entrez uses NCBI's DTD files to parse XML files ...
>
> For this purpose, please download eLink_101123.dtd from
>
> http://www.ncbi.nlm.nih.gov/entrez/query/DTD/eLink_101123.dtd
>
> ...

Thank you for alerting us, that file will be included in our next release.

Could you update your copy of Biopython successfully?

Peter


From esa.aalto at oulu.fi  Thu May 19 09:02:17 2011
From: esa.aalto at oulu.fi (Esa Aalto)
Date: Thu, 19 May 2011 16:02:17 +0300
Subject: [Biopython] An error with Concatenate nexus
Message-ID: <3C36433088B0FF4B834B351A67C98111E6F721@KEKO.univ.yo.oulu.fi>

Dear group,

I'm trying to concatenate 20 nexus files with the instructions given
here:

http://www.biopython.org/wiki/Concatenate_nexus

but it doesn't work:

Traceback (most recent call last):
  File "C:\Python27\concate_nexus.py", line 36, in <module>
    nexi =  [(handle.name, Nexus.Nexus(handle)) for handle in handles]
  File "C:\Python27\lib\site-packages\Bio\Nexus\Nexus.py", line 555, in
__init__
    self.read(input)
  File "C:\Python27\lib\site-packages\Bio\Nexus\Nexus.py", line 618, in
read
    self._parse_nexus_block(title, contents)
  File "C:\Python27\lib\site-packages\Bio\Nexus\Nexus.py", line 659, in
_parse_nexus_block
    getattr(self,'_'+line.command)(line.options)
  File "C:\Python27\lib\site-packages\Bio\Nexus\Nexus.py", line 1021, in
_codonposset
    raise NexusError('Formatting Error in codonposset: %s ' % options)
NexusError: Formatting Error in codonposset: * UNTITLED = 1: 1-577\3, 2:
2-578\3, 3: 3-579\3

The end of the first of my nex files looks like this:

BEGIN SETS;
   TaxSet A_thaliana = 1;
   TaxSet A_lyrata = 2;
   TaxSet Boh = 3-32;
   TaxSet Ice = 33-60;
   TaxSet Ith = 61-92;
   TaxSet Kar = 93-124;
   TaxSet Lom = 125-156;
   TaxSet NC = 157-196;
   TaxSet Pl = 197-236;
   TaxSet Sp = 237-274;
   TaxSet Stu = 275-294;
   TaxSet South = 3-32 197-236;
   TaxSet North = 125-156 237-274;
   TaxSet lyrata = 2-294;
END;

BEGIN CODONS; 
   CODONPOSSET * UNTITLED = 
      1: 1-577\3,
      2: 2-578\3,
      3: 3-579\3;
   CODESET * UNTITLED = Universal: all;
END;

BEGIN CODONUSAGE;
END;

BEGIN DnaSP;
   Genome= Diploid;
   ChromosomalLocation= Autosome;
   VariationType= DNA_Seq_Pol;
   Species= ---;
   ChromosomeName= ---;
   GenomicPosition= 1;
   GenomicAssembly= ---;
   DnaSPversion= Ver. 5.10.00;
END;

Could someone tell me what's wrong here? Is it my nexus files or something
in the code?

Thanks for your help!

Esa Aalto


From cy at cymon.org  Thu May 19 10:30:36 2011
From: cy at cymon.org (Cymon Cox)
Date: Thu, 19 May 2011 15:30:36 +0100
Subject: [Biopython] An error with Concatenate nexus
In-Reply-To: <3C36433088B0FF4B834B351A67C98111E6F721@KEKO.univ.yo.oulu.fi>
References: <3C36433088B0FF4B834B351A67C98111E6F721@KEKO.univ.yo.oulu.fi>
Message-ID: <BANLkTikpgu_znao_YnAa40Hzr73SHB2ybg@mail.gmail.com>

Hi Esa,

At first glance this looks like a bug.

But given that Nexus.combine() is going to discard your codonposset
character partition anyway, you could try deleting it from the Nexus file
before combining.
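
One way to try that workaround without editing every file by hand is to strip
the CODONS block from the text before it reaches the parser. This is only a
sketch using the standard library: `strip_codons_block` is a hypothetical
helper, it assumes each block starts with BEGIN CODONS; and ends with END; on
a line of its own (as in the file shown in this thread), and whether the
parser then accepts the remaining blocks has not been checked.

```python
import re

# One "BEGIN CODONS; ... END;" block, matched case-insensitively.
# The lazy ".*?" stops at the first line that starts with "END;".
CODONS_BLOCK = re.compile(
    r"^[ \t]*BEGIN\s+CODONS\s*;.*?^[ \t]*END\s*;[ \t]*\n?",
    re.IGNORECASE | re.DOTALL | re.MULTILINE,
)


def strip_codons_block(nexus_text):
    """Return the Nexus text with every CODONS block removed."""
    return CODONS_BLOCK.sub("", nexus_text)


# Abbreviated sample based on the blocks quoted in this thread.
sample = (
    "BEGIN SETS;\n"
    "   TaxSet A_thaliana = 1;\n"
    "END;\n"
    "\n"
    "BEGIN CODONS; \n"
    "   CODONPOSSET * UNTITLED = \n"
    "      1: 1-577\\3,\n"
    "      2: 2-578\\3,\n"
    "      3: 3-579\\3;\n"
    "   CODESET * UNTITLED = Universal: all;\n"
    "END;\n"
    "\n"
    "BEGIN CODONUSAGE;\n"
    "END;\n"
)
cleaned = strip_codons_block(sample)
```

The cleaned text could then be written to a new file, or wrapped in a
file-like object, before being handed to the Nexus parser.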

Regards, Cymon


From fkelesh at gmail.com  Fri May 20 05:33:03 2011
From: fkelesh at gmail.com (Fatih Keles)
Date: Fri, 20 May 2011 12:33:03 +0300
Subject: [Biopython] installing biopython on mac os x 10.6
Message-ID: <BANLkTikOBK8ip=9ijR-N7fkMzh4H2w=25A@mail.gmail.com>

Hi,

I was trying to install Biopython on Mac OS X 10.6 using X11. However,
it gives this error:
"""

running install
running build
running build_py
running build_ext
building 'Bio.cpairwise2' extension
gcc-4.0 -fno-strict-aliasing -fno-common -dynamic -arch ppc -arch i386
-g -O2 -DNDEBUG -g -O3 -IBio
-I/Library/Frameworks/Python.framework/Versions/2.7/include/python2.7
-c Bio/cpairwise2module.c -o
build/temp.macosx-10.3-fat-2.7/Bio/cpairwise2module.o
unable to execute gcc-4.0: No such file or directory
error: command 'gcc-4.0' failed with exit status 1
"""

I couldn't find the problem. I would be grateful for any help.

Thanks,

keles

From p.j.a.cock at googlemail.com  Fri May 20 05:40:16 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 20 May 2011 10:40:16 +0100
Subject: [Biopython] installing biopython on mac os x 10.6
In-Reply-To: <BANLkTikOBK8ip=9ijR-N7fkMzh4H2w=25A@mail.gmail.com>
References: <BANLkTikOBK8ip=9ijR-N7fkMzh4H2w=25A@mail.gmail.com>
Message-ID: <BANLkTimNZC3F0ehZ34wz8seHLyybRYYRAQ@mail.gmail.com>

On Fri, May 20, 2011 at 10:33 AM, Fatih Keles <fkelesh at gmail.com> wrote:
> Hi,
>
> I was trying to install Biopython on mac os x 10.6 using X11. However,
> It gives this error :
> """
>
> running install
> running build
> running build_py
> running build_ext
> building 'Bio.cpairwise2' extension
> gcc-4.0 -fno-strict-aliasing -fno-common -dynamic -arch ppc -arch i386
> -g -O2 -DNDEBUG -g -O3 -IBio
> -I/Library/Frameworks/Python.framework/Versions/2.7/include/python2.7
> -c Bio/cpairwise2module.c -o
> build/temp.macosx-10.3-fat-2.7/Bio/cpairwise2module.o
> unable to execute gcc-4.0: No such file or directory
> error: command 'gcc-4.0' failed with exit status 1
> """
>
> I couldn't find the problem. I would be happy if you help me.
>
> Thanks,
>
> keles

Have you installed Apple's Xcode, the development suite that
comes with Apple's version of gcc (C compiler)? What we say
on the download page of the wiki is:

>> For Mac OS X, we recommend installing from source (see below).
>> You will need to have installed Apple's XCode tools including the
>> optional 10.4 SDK (check the option for 10.4 support when
>> installing Xcode tools).
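
A quick way to check this from Python itself: on Unix-like systems the build
uses the compiler recorded in the configuration Python was built with, which
is why the log asks for gcc-4.0 specifically. The sketch below is written for
Python 3 (`shutil.which` is not available in Python 2) and
`configured_compiler` is a hypothetical helper name.

```python
import shutil
import sysconfig


def configured_compiler():
    """Return (command, path) for the C compiler Python was built with.

    path is None when the command cannot be found on PATH, which is the
    situation behind "unable to execute gcc-4.0".
    """
    cc = sysconfig.get_config_var("CC")  # may be None, e.g. on Windows
    if not cc:
        return (None, None)
    command = cc.split()[0]  # drop any flags that follow the command name
    return (command, shutil.which(command))


command, path = configured_compiler()
print(command, path)
```

If the second value is None, the compiler that the build wants is not
installed or not on the PATH.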

Peter

From chapmanb at 50mail.com  Fri May 20 07:15:35 2011
From: chapmanb at 50mail.com (Brad Chapman)
Date: Fri, 20 May 2011 07:15:35 -0400
Subject: [Biopython] gff3 problem
In-Reply-To: <BANLkTinTuOzNd6JkmxQte1jA=m=S4jD8GA@mail.gmail.com>
References: <4D9B0A6D.3040608@gmail.com> <20110405132247.GA20523@sobchak>
	<4D9DB3F4.30107@gmail.com>
	<BANLkTinEjy97gKYUPY_1it1zhLOj6sR+nw@mail.gmail.com>
	<BANLkTikDd_K6LTEYWZHmBSKsGA5aiX2msA@mail.gmail.com>
	<EA39C938-FB7B-4808-8B01-AA2D71504080@hutton.ac.uk>
	<BANLkTim2rv4xjQ8dBkq+Zjjom2ys575c4Q@mail.gmail.com>
	<20110408121041.GM20963@sobchak>
	<BANLkTinTuOzNd6JkmxQte1jA=m=S4jD8GA@mail.gmail.com>
Message-ID: <20110520111535.GC21651@sobchak>

Peter;

[SeqFeature support for not-stranded elements]
> So was the consensus that we should reword the Bio.SeqFeature
> docstring so say the four valid values for strand are (with GFF3
> equivalents in brackets):
> 
> +1 = Forward (+ in GFF3)
> -1 = Reverse (- in GFF3)
> 0 = Not stranded (. in GFF3)
> None = Unknown (? in GFF3)
> 
> And should features on a protein sequence should then have strand 0?

That sounds great. I can make the corresponding change to the GFF
library. Let me know if there are any other roadblocks to
integrating that. Thanks much,
Brad

From p.j.a.cock at googlemail.com  Fri May 20 07:27:04 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 20 May 2011 12:27:04 +0100
Subject: [Biopython] gff3 problem
In-Reply-To: <20110520111535.GC21651@sobchak>
References: <4D9B0A6D.3040608@gmail.com> <20110405132247.GA20523@sobchak>
	<4D9DB3F4.30107@gmail.com>
	<BANLkTinEjy97gKYUPY_1it1zhLOj6sR+nw@mail.gmail.com>
	<BANLkTikDd_K6LTEYWZHmBSKsGA5aiX2msA@mail.gmail.com>
	<EA39C938-FB7B-4808-8B01-AA2D71504080@hutton.ac.uk>
	<BANLkTim2rv4xjQ8dBkq+Zjjom2ys575c4Q@mail.gmail.com>
	<20110408121041.GM20963@sobchak>
	<BANLkTinTuOzNd6JkmxQte1jA=m=S4jD8GA@mail.gmail.com>
	<20110520111535.GC21651@sobchak>
Message-ID: <BANLkTikS6uFUv+XnEitCpV+5ymhCygkBUw@mail.gmail.com>

On Fri, May 20, 2011 at 12:15 PM, Brad Chapman <chapmanb at 50mail.com> wrote:
> Peter;
>
> [SeqFeature support for not-stranded elements]
>> So was the consensus that we should reword the Bio.SeqFeature
>> docstring so say the four valid values for strand are (with GFF3
>> equivalents in brackets):
>>
>> +1 = Forward (+ in GFF3)
>> -1 = Reverse (- in GFF3)
>> 0 = Not stranded (. in GFF3)
>> None = Unknown (? in GFF3)
>>
>> And should features on a protein sequence then have strand 0?
>
> That sounds great. I can make the corresponding change to the GFF
> library. Let me know if there are any other roadblocks to
> integrating that. Thanks much,
> Brad

I've remembered a corner case: mixed strand features. One example is the
Arabidopsis thaliana chloroplast complete genome, AP000423
in EMBL, NC_000932 in GenBank (one of our unit test files),
which has a gene with join(complement(69611..69724),139856..140650).

Clearly the child features have well defined strands (+1 and -1).
The parent feature (the join) is mixed strand. Currently our
GenBank parser uses None for this. So maybe:

+1 = Forward (+ in GFF3)
-1 = Reverse (- in GFF3)
0 = Not stranded (. in GFF3)
None = Mixed or unknown (? in GFF3)

Peter

From cjfields at illinois.edu  Fri May 20 09:24:30 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 20 May 2011 08:24:30 -0500
Subject: [Biopython] gff3 problem
In-Reply-To: <BANLkTikS6uFUv+XnEitCpV+5ymhCygkBUw@mail.gmail.com>
References: <4D9B0A6D.3040608@gmail.com> <20110405132247.GA20523@sobchak>
	<4D9DB3F4.30107@gmail.com>
	<BANLkTinEjy97gKYUPY_1it1zhLOj6sR+nw@mail.gmail.com>
	<BANLkTikDd_K6LTEYWZHmBSKsGA5aiX2msA@mail.gmail.com>
	<EA39C938-FB7B-4808-8B01-AA2D71504080@hutton.ac.uk>
	<BANLkTim2rv4xjQ8dBkq+Zjjom2ys575c4Q@mail.gmail.com>
	<20110408121041.GM20963@sobchak>
	<BANLkTinTuOzNd6JkmxQte1jA=m=S4jD8GA@mail.gmail.com>
	<20110520111535.GC21651@sobchak>
	<BANLkTikS6uFUv+XnEitCpV+5ymhCygkBUw@mail.gmail.com>
Message-ID: <E092D5A9-E200-414B-AA92-B9C6578B090E@illinois.edu>

On May 20, 2011, at 6:27 AM, Peter Cock wrote:

> On Fri, May 20, 2011 at 12:15 PM, Brad Chapman <chapmanb at 50mail.com> wrote:
>> Peter;
>> 
>> [SeqFeature support for not-stranded elements]
>>> So was the consensus that we should reword the Bio.SeqFeature
>>> docstring so say the four valid values for strand are (with GFF3
>>> equivalents in brackets):
>>> 
>>> +1 = Forward (+ in GFF3)
>>> -1 = Reverse (- in GFF3)
>>> 0 = Not stranded (. in GFF3)
>>> None = Unknown (? in GFF3)
>>> 
>>> And should features on a protein sequence then have strand 0?
>> 
>> That sounds great. I can make the corresponding change to the GFF
>> library. Let me know if there are any other roadblocks to
>> integrating that. Thanks much,
>> Brad
> 
> I've remembered a corner case, mixed strand features. e.g the
> Arabidopsis thaliana chloroplast complete genome, AP000423
> in EMBL, NC_000932 in GenBank (one of our unit test files).
> e.g. gene with join(complement(69611..69724),139856..140650)
> 
> Clearly the child features have well defined strands (+1 and -1).
> The parent feature (the join) is mixed strand. Currently our
> GenBank parser uses None for this. So maybe:
> 
> +1 = Forward (+ in GFF3)
> -1 = Reverse (- in GFF3)
> 0 = Not stranded (. in GFF3)
> None = Mixed or unknown (? in GFF3)
> 
> Peter

That's essentially what bioperl does for 'split' locations (actually, I think it is just undef, which would translate to '?' for GFF3).

chris


From laserson at mit.edu  Fri May 20 17:14:32 2011
From: laserson at mit.edu (Uri Laserson)
Date: Fri, 20 May 2011 17:14:32 -0400
Subject: [Biopython] Serialize SeqRecord to JSON?
Message-ID: <BANLkTinGvFuT8NCmO8-VkvMjwEWd7qzC-g@mail.gmail.com>

Does anyone know of a solution for this?

Thanks!
Uri

...................................................................................
Uri Laserson
Graduate Student, Biomedical Engineering
Harvard-MIT Division of Health Sciences and Technology
M +1 917 742 8019
laserson at mit.edu

From mjldehoon at yahoo.com  Fri May 20 23:59:24 2011
From: mjldehoon at yahoo.com (Michiel de Hoon)
Date: Fri, 20 May 2011 20:59:24 -0700 (PDT)
Subject: [Biopython] installing biopython on mac os x 10.6
In-Reply-To: <BANLkTikOBK8ip=9ijR-N7fkMzh4H2w=25A@mail.gmail.com>
Message-ID: <782468.28393.qm@web161211.mail.bf1.yahoo.com>

Probably you don't have a C compiler installed on your computer. The easiest way to get one is to install Apple's Xcode package.

--Michiel.



From sainitin7 at gmail.com  Mon May 23 04:32:07 2011
From: sainitin7 at gmail.com (sai nitin)
Date: Mon, 23 May 2011 10:32:07 +0200
Subject: [Biopython] Problem to retreive compound names using CID from
	PubChem
Message-ID: <BANLkTimz=mv=VwmBFcnomCXjyqoqS_bg+g@mail.gmail.com>

Hi all,

My name is Sainitin. I have a list of CIDs from the PubChem database and I
want to retrieve the corresponding compound names. To automate this I am
using the Biopython Entrez module (Entrez.esummary), but when I give it one
CID and try to retrieve the name of the compound, an error occurs.

Code
from Bio import Entrez
h = Entrez.esummary(db="pccompound", id="449489")
r = Entrez.read(h)
r[0]["SourceName"]

Error
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'SourceName'

Can anybody help me to solve this?

Thanks
-- 

Sainitin D

From fkauff at biologie.uni-kl.de  Mon May 23 06:19:30 2011
From: fkauff at biologie.uni-kl.de (Frank Kauff)
Date: Mon, 23 May 2011 12:19:30 +0200
Subject: [Biopython] An error with Concatenate nexus
In-Reply-To: <BANLkTikpgu_znao_YnAa40Hzr73SHB2ybg@mail.gmail.com>
References: <3C36433088B0FF4B834B351A67C98111E6F721@KEKO.univ.yo.oulu.fi>
	<BANLkTikpgu_znao_YnAa40Hzr73SHB2ybg@mail.gmail.com>
Message-ID: <4DDA34B2.9010907@biologie.uni-kl.de>

Hi Esa,

Are you using an up-to-date Nexus parser? The codonposset below can be
read without problems when I copy-paste it into one of my nexus files.
Or, if you like, send me a copy of your complete nexus file for a check.

Cheers,
Frank


On 05/19/2011 04:30 PM, Cymon Cox wrote:
> Hi Esa,
>
> At first glance this looks like a bug.
>
> But given that Nexus.combine() is going to discard your codonposset
> character partition anyway, you could try deleting it from the Nexus file
> before combining.
>
> Regards, Cymon
>
> On 19 May 2011 14:02, Esa Aalto<esa.aalto at oulu.fi>  wrote:
>
>> Dear group,
>>
>> I'm trying to concatenate 20 nexus files with the instructions given
>> here:
>>
>> http://www.biopython.org/wiki/Concatenate_nexus
>>
>> but it doesn't work:
>>
>> Traceback (most recent call last):
>>   File "C:\Python27\concate_nexus.py", line 36, in<module>
>>     nexi =  [(handle.name, Nexus.Nexus(handle)) for handle in handles]
>>   File "C:\Python27\lib\site-packages\Bio\Nexus\Nexus.py", line 555, in
>> __init__
>>     self.read(input)
>>   File "C:\Python27\lib\site-packages\Bio\Nexus\Nexus.py", line 618, in
>> read
>>     self._parse_nexus_block(title, contents)
>>   File "C:\Python27\lib\site-packages\Bio\Nexus\Nexus.py", line 659, in
>> _parse_nexus_block
>>     getattr(self,'_'+line.command)(line.options)
>>   File "C:\Python27\lib\site-packages\Bio\Nexus\Nexus.py", line 1021, in
>> _codonposset
>>     raise NexusError('Formatting Error in codonposset: %s ' % options)
>> NexusError: Formatting Error in codonposset: * UNTITLED = 1: 1-577\3, 2:
>> 2-578\3, 3: 3-579\3
>>
>> The end of the first of my nex files looks like this:
>>
>> BEGIN SETS;
>>    TaxSet A_thaliana = 1;
>>    TaxSet A_lyrata = 2;
>>    TaxSet Boh = 3-32;
>>    TaxSet Ice = 33-60;
>>    TaxSet Ith = 61-92;
>>    TaxSet Kar = 93-124;
>>    TaxSet Lom = 125-156;
>>    TaxSet NC = 157-196;
>>    TaxSet Pl = 197-236;
>>    TaxSet Sp = 237-274;
>>    TaxSet Stu = 275-294;
>>    TaxSet South = 3-32 197-236;
>>    TaxSet North = 125-156 237-274;
>>    TaxSet lyrata = 2-294;
>> END;
>>
>> BEGIN CODONS;
>>    CODONPOSSET * UNTITLED =
>>       1: 1-577\3,
>>       2: 2-578\3,
>>       3: 3-579\3;
>>    CODESET * UNTITLED = Universal: all;
>> END;
>>
>> BEGIN CODONUSAGE;
>> END;
>>
>> BEGIN DnaSP;
>>    Genome= Diploid;
>>    ChromosomalLocation= Autosome;
>>    VariationType= DNA_Seq_Pol;
>>    Species= ---;
>>    ChromosomeName= ---;
>>    GenomicPosition= 1;
>>    GenomicAssembly= ---;
>>    DnaSPversion= Ver. 5.10.00;
>> END;
>>
>> Could someone tell what's wrong here? Is it my nexus files or something
>> in the code?
>>
>> Thanks for your help!
>>
>> Esa Aalto
>>


From chapmanb at 50mail.com  Mon May 23 06:42:56 2011
From: chapmanb at 50mail.com (Brad Chapman)
Date: Mon, 23 May 2011 06:42:56 -0400
Subject: [Biopython] Problem to retreive compound names using CID from
 PubChem
In-Reply-To: <BANLkTimz=mv=VwmBFcnomCXjyqoqS_bg+g@mail.gmail.com>
References: <BANLkTimz=mv=VwmBFcnomCXjyqoqS_bg+g@mail.gmail.com>
Message-ID: <20110523104256.GA2365@kunkel>

Sainitin;

> Code
> h = Entrez.esummary(db = "pccompound",id = "449489")
> r = Entrez.read(h)
> r[0]["SourceName"]
> 
> Error
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> KeyError: 'SourceName'
> 
> Can anybody help me to solve this

The 'r' object you've parsed from Entrez contains a list of
dictionaries. The information that is in each dictionary will be
dependent on the database you are retrieving from. In this case
there is no SourceName information, so Python raises a KeyError to
indicate this.

You can examine the items in the dictionary with:

for key, val in r[0].iteritems():
    print key, val

[...]
InChI InChI=1S/C9H12IN2O8P/c10-4-2-12(9(15)11-8(4)14)7-1-5(13)6(20-7)3-19-21(16,17)18/h2,5-7,13H,1,3H2,(H,11,14,15)(H2,16,17,18)/t5-,6+,7+/m0/s1
TautomerCount 3
SourceIDList []
BondChiralCount 0
MeSHTermList ["5-iodo-2'-deoxyuridine 5'-monophosphate", '5-iodo-dUMP', 'IdUMP', 'iododeoxyuridylate', 'iododeoxyuridylate, 125I-labeled']
[...]
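
A small sketch of defensive lookup on such a dictionary, written for Python 3.
The `summary` dict is a stand-in for r[0] holding a few of the entries
printed above, so that the example runs without contacting NCBI. `first_name`
is a hypothetical helper, and treating the first MeSHTermList entry as the
compound name is an assumption, not something Entrez promises.

```python
# Stand-in for r[0]; keys and values copied from the output shown above.
summary = {
    "TautomerCount": 3,
    "SourceIDList": [],
    "BondChiralCount": 0,
    "MeSHTermList": [
        "5-iodo-2'-deoxyuridine 5'-monophosphate",
        "5-iodo-dUMP",
        "IdUMP",
    ],
}


def first_name(record, key="MeSHTermList"):
    """Return the first entry stored under key, or None if it is missing or empty."""
    names = record.get(key) or []
    return names[0] if names else None


# Check for a key before indexing instead of triggering a KeyError.
has_source_name = "SourceName" in summary
name = first_name(summary)
print(has_source_name, name)
```

Using `.get()` or an `in` test keeps a batch job over many CIDs running when
one record lacks the expected field.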

There are also a number of good online resources for learning Python
which will help give experience in debugging these kinds of errors:

http://learnpythonthehardway.org/index
http://diveintopython.org/

Hope this helps,
Brad

From p.j.a.cock at googlemail.com  Mon May 23 07:01:51 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Mon, 23 May 2011 12:01:51 +0100
Subject: [Biopython] Serialize SeqRecord to JSON?
In-Reply-To: <BANLkTinGvFuT8NCmO8-VkvMjwEWd7qzC-g@mail.gmail.com>
References: <BANLkTinGvFuT8NCmO8-VkvMjwEWd7qzC-g@mail.gmail.com>
Message-ID: <BANLkTiktqaj=7V_y-ajvfA+etqpKi1_SOA@mail.gmail.com>

On Fri, May 20, 2011 at 10:14 PM, Uri Laserson <laserson at mit.edu> wrote:
> Does anyone know of a solution for this?
>
> Thanks!
> Uri

I thought JSON was more suited to holding simple data structures,
rather than serialising arbitrary complex objects.

Which bits of data do you need? The basics like the
id/name/description and sequence could be presented like a tuple and
encoded in JSON. Annotations begin to get complicated - but a
dictionary of basic types should be fine. I suspect the biggest hurdle
would be trying to encode any features.
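
A rough sketch of that reduced approach, written for Python 3. It relies only
on the record having id, name, description, seq and annotations attributes,
which a SeqRecord does, and uses a SimpleNamespace stand-in so it runs without
Biopython. `record_to_dict` and `record_to_json` are hypothetical names.
Features are skipped entirely, and any annotation whose value is not a basic
type is silently dropped.

```python
import json
from types import SimpleNamespace

# Value types that json.dumps can encode without extra help.
_BASIC = (str, int, float, bool, type(None))


def record_to_dict(record):
    """Keep only the simple parts of a record: no features, no complex annotations."""
    annotations = getattr(record, "annotations", None) or {}
    return {
        "id": record.id,
        "name": record.name,
        "description": record.description,
        "seq": str(record.seq),
        "annotations": {
            key: value
            for key, value in annotations.items()
            if isinstance(key, str) and isinstance(value, _BASIC)
        },
    }


def record_to_json(record):
    return json.dumps(record_to_dict(record), sort_keys=True)


# Made-up stand-in record; a real SeqRecord could be passed instead.
example = SimpleNamespace(
    id="demo1",
    name="demo",
    description="made-up record",
    seq="ACGT",
    annotations={"source": "example", "references": [object()]},
)
encoded = record_to_json(example)
```

Keeping a separate `record_to_dict` step makes it easy to see exactly which
fields survive before anything is written out.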

Peter

From sdavis2 at mail.nih.gov  Mon May 23 14:08:47 2011
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Mon, 23 May 2011 14:08:47 -0400
Subject: [Biopython] [OT] Bioconductor-2011 conference.
Message-ID: <BANLkTimpfpNywiJTLhm4P5xtYeXmGFdPrA@mail.gmail.com>

All,

Sorry for the slightly off-topic post, but I know there are some
overlaps between Bioconductor and Biopython user groups.

The Bioconductor-2011 conference will be held July 28-29, 2011
(optional: July 27 - Developer Day) at the Fred Hutchinson Cancer
Research Center in Seattle, WA.  This conference highlights current
developments within and beyond Bioconductor, an international open
source and open development software project for the analysis and
comprehension of high-throughput genomic data.  The conference
provides a forum in which to discuss the use and design of software
for analyzing data arising in biology with a focus on Bioconductor and
genomic data.

If interested, see the website:

https://secure.bioconductor.org/BioC2011/

Thanks,
Sean


From laserson at mit.edu  Mon May 23 15:42:35 2011
From: laserson at mit.edu (Uri Laserson)
Date: Mon, 23 May 2011 15:42:35 -0400
Subject: [Biopython] reading Alphabet from file
Message-ID: <BANLkTinD93es+AiPhO-dvdSKdoEEB5WxVQ@mail.gmail.com>

Hi all,

I am trying to implement a method that will convert a SeqRecord to a JSON
serializable object.  One piece of data that must be stored for a Seq object
is the alphabet type.  When I read this back from file, what is the best
practice for restoring the same alphabet type?

Thanks!
Uri

...................................................................................
Uri Laserson
Graduate Student, Biomedical Engineering
Harvard-MIT Division of Health Sciences and Technology
M +1 917 742 8019
laserson at mit.edu

From p.j.a.cock at googlemail.com  Mon May 23 18:09:02 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Mon, 23 May 2011 23:09:02 +0100
Subject: [Biopython]  reading Alphabet from file
In-Reply-To: <BANLkTinD93es+AiPhO-dvdSKdoEEB5WxVQ@mail.gmail.com>
References: <BANLkTinD93es+AiPhO-dvdSKdoEEB5WxVQ@mail.gmail.com>
Message-ID: <BANLkTinsTbtzFjkgNfJ1+yuJcweX7es_xQ@mail.gmail.com>

On Monday, May 23, 2011, Uri Laserson <laserson at mit.edu> wrote:
> Hi all,
>
> I am trying to implement a method that will convert a SeqRecord to a JSON
> serializable object.  One piece of data that must be stored for a Seq object
> is the alphabet type.  When I read this from file, what is the best practice
> to reload the same alphabet type?
>
> Thanks!
> Uri

Hmm, that's tricky because the Biopython alphabet hierarchy is so
complicated. Or richly detailed depending on your point of view ;-)

In your position I would apply the KISS principle and reduce it to
Protein, DNA, RNA or unknown - and use the generic_protein etc classes
on reconstruction. Unless you need more detail than that?
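
A sketch of that reduction, written so that it runs without Biopython. It
assumes the alphabet classes are named ProteinAlphabet, DNAAlphabet and
RNAAlphabet and walks the class hierarchy by name. `alphabet_label` and
`restore_alphabet` are hypothetical helpers, and the table of generic
alphabet objects (generic_protein and friends) is supplied by the caller.
Wrapper alphabets such as gapped ones would come out as "unknown" here.

```python
# Class names assumed to mark the three basic sequence types.
_LABEL_BY_CLASS_NAME = {
    "ProteinAlphabet": "protein",
    "DNAAlphabet": "DNA",
    "RNAAlphabet": "RNA",
}


def alphabet_label(alphabet):
    """Reduce any alphabet object to 'protein', 'DNA', 'RNA' or 'unknown'."""
    for cls in type(alphabet).__mro__:
        label = _LABEL_BY_CLASS_NAME.get(cls.__name__)
        if label:
            return label
    return "unknown"


def restore_alphabet(label, generic_alphabets):
    """Look the label up in a caller-supplied table that has an 'unknown' entry."""
    return generic_alphabets.get(label, generic_alphabets["unknown"])
```

Only the short label is then stored in the JSON, and on loading it is mapped
back to whichever generic alphabet object the table provides.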

Peter


From p.j.a.cock at googlemail.com  Tue May 24 07:26:25 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Tue, 24 May 2011 12:26:25 +0100
Subject: [Biopython] gff3 problem
In-Reply-To: <BANLkTikS6uFUv+XnEitCpV+5ymhCygkBUw@mail.gmail.com>
References: <4D9B0A6D.3040608@gmail.com> <20110405132247.GA20523@sobchak>
	<4D9DB3F4.30107@gmail.com>
	<BANLkTinEjy97gKYUPY_1it1zhLOj6sR+nw@mail.gmail.com>
	<BANLkTikDd_K6LTEYWZHmBSKsGA5aiX2msA@mail.gmail.com>
	<EA39C938-FB7B-4808-8B01-AA2D71504080@hutton.ac.uk>
	<BANLkTim2rv4xjQ8dBkq+Zjjom2ys575c4Q@mail.gmail.com>
	<20110408121041.GM20963@sobchak>
	<BANLkTinTuOzNd6JkmxQte1jA=m=S4jD8GA@mail.gmail.com>
	<20110520111535.GC21651@sobchak>
	<BANLkTikS6uFUv+XnEitCpV+5ymhCygkBUw@mail.gmail.com>
Message-ID: <BANLkTimHqZTMzMVmZY0O=hYF6xQxcjY6gQ@mail.gmail.com>

On Fri, May 20, 2011 at 12:27 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> On Fri, May 20, 2011 at 12:15 PM, Brad Chapman <chapmanb at 50mail.com> wrote:
>> Peter;
>>
>> [SeqFeature support for not-stranded elements]
>>> So was the consensus that we should reword the Bio.SeqFeature
>>> docstring so say the four valid values for strand are (with GFF3
>>> equivalents in brackets):
>>>
>>> +1 = Forward (+ in GFF3)
>>> -1 = Reverse (- in GFF3)
>>> 0 = Not stranded (. in GFF3)
>>> None = Unknown (? in GFF3)
>>>
>>> And should features on a protein sequence then have strand 0?
>>
>> That sounds great. I can make the corresponding change to the
>> GFF library. Let me know if there are any other roadblocks to
>> integrating that. Thanks much,
>> Brad

Going over this afresh now: in my email of 20 May, I had mixed up
Leighton's original suggestion. The two special cases (0 and None)
are a bit of a pain:

http://lists.open-bio.org/pipermail/biopython/2011-April/007194.html

Back in April, Leighton wrote:
> The obvious (to me) mapping of the four allowed Biopython symbols to the
> GFF3 convention is:
> +1 -> +
> -1 -> -
> None -> .
> 0 -> ?
> because 'None' is semantically close to 'has no strand information of
> consequence', and 0 is the mean of +1 and -1 ;)
> Cheers,
> L.

i.e.

+1 = Forward (+ in GFF3)
-1 = Reverse (- in GFF3)
0 = Stranded but unknown (? in GFF3)
None = Not stranded (. in GFF3)
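
Written out as lookup tables, the mapping above could look like this. The
names GFF3_TO_STRAND and STRAND_TO_GFF3 are made up for illustration and are
not taken from Biopython or the GFF library.

```python
# GFF3 strand symbol -> Biopython strand value, following the list above.
GFF3_TO_STRAND = {
    "+": 1,     # forward
    "-": -1,    # reverse
    "?": 0,     # stranded, but strand unknown
    ".": None,  # not stranded
}

# Reverse table for writing GFF3; None is a perfectly valid dict key.
STRAND_TO_GFF3 = {strand: symbol for symbol, strand in GFF3_TO_STRAND.items()}
```

Because the four values are distinct, the mapping round-trips in both
directions, which the earlier coding with both '?' and '.' converted to None
could not do.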

SeqFeature docstring updated:
https://github.com/biopython/biopython/commit/ea64c74758dccfc7e6c0940e31a214293ecc59d3

This way protein features should have strand None (which is what the
current GenBank/EMBL parser does anyway).

Note that the SeqFeature default is strand=None which is still OK.

Mixed strand isn't needed in the GFF3 model, but we already use
None for this. Perhaps it should be 0 rather than None under this model?

Peter

From hxcan at stupidbeauty.com  Sun May 29 03:18:22 2011
From: hxcan at stupidbeauty.com (=?GB2312?B?ssy78Mqk?=)
Date: Sun, 29 May 2011 15:18:22 +0800
Subject: [Biopython] Another warning of "missing dtd file"
Message-ID: <4DE1F33E.8020700@stupidbeauty.com>

 /usr/lib/python2.6/site-packages/Bio/Entrez/Parser.py:495: UserWarning:
Unable to load DTD file bookdoc_110101.dtd.

Bio.Entrez uses NCBI's DTD files to parse XML files returned by NCBI Entrez.
Though most of NCBI's DTD files are included in the Biopython distribution,
sometimes you may find that a particular DTD file is missing. While we can
access the DTD file through the internet, the parser is much faster if the
required DTD files are available locally.

For this purpose, please download bookdoc_110101.dtd from

http://www.ncbi.nlm.nih.gov/entrez/query/DTD/bookdoc_110101.dtd

and save it either in directory

/usr/lib/python2.6/site-packages/Bio/Entrez/DTDs

or in directory

/Data/.biopython/Bio/Entrez/DTDs

in order for Bio.Entrez to find it.

Alternatively, you can save bookdoc_110101.dtd in the directory
Bio/Entrez/DTDs in the Biopython distribution, and reinstall Biopython.

Please also inform the Biopython developers about this missing DTD, by
reporting a bug on http://bugzilla.open-bio.org/ or sign up to our mailing
list and emailing us, so that we can include it with the next release of
Biopython.

Proceeding to access the DTD file through the internet...

warnings.warn(message)


From p.j.a.cock at googlemail.com  Sun May 29 06:00:58 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Sun, 29 May 2011 11:00:58 +0100
Subject: [Biopython] Another warning of "missing dtd file"
In-Reply-To: <4DE1F33E.8020700@stupidbeauty.com>
References: <4DE1F33E.8020700@stupidbeauty.com>
Message-ID: <BANLkTi=xsp6djyW2rRFt2TQn69xijX5BiQ@mail.gmail.com>

2011/5/29 蔡火胜 <hxcan at stupidbeauty.com>:
>  /usr/lib/python2.6/site-packages/Bio/Entrez/Parser.py:495: UserWarning:
> Unable to load DTD file bookdoc_110101.dtd.
> ...
> For this purpose, please download bookdoc_110101.dtd from
>
> http://www.ncbi.nlm.nih.gov/entrez/query/DTD/bookdoc_110101.dtd
>
> ...
> Please also inform the Biopython developers about this missing DTD, by
> reporting a bug on http://bugzilla.open-bio.org/ or sign up to our mailing
> list and emailing us, so that we can include it with the next release of
> Biopython.

Thank you, that's been added. I don't see anything else missing from
this list, but I know it is a partial listing:

http://www.ncbi.nlm.nih.gov/corehtml/query/DTD/index.shtml

Peter
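
Note: for anyone who hits this warning before a release includes the
missing DTD, the manual step it describes can be scripted. This is a
rough sketch, assuming the per-user directory follows the
~/.biopython/Bio/Entrez/DTDs pattern shown in the warning (other
Biopython versions may use a different location, so check the path
printed in your own warning). The helper names user_dtd_path and
fetch_dtd are made up for illustration.

```python
import os
import urllib.request


def user_dtd_path(filename, home=None):
    """Return the per-user path where the warning says a DTD may be saved."""
    if home is None:
        home = os.path.expanduser("~")
    return os.path.join(home, ".biopython", "Bio", "Entrez", "DTDs", filename)


def fetch_dtd(url, home=None):
    """Download the DTD at url into the per-user directory; return its path."""
    filename = url.rsplit("/", 1)[-1]
    dest = user_dtd_path(filename, home)
    # Create the DTDs directory if it does not exist yet.
    os.makedirs(os.path.dirname(dest), exist_ok=True)
    urllib.request.urlretrieve(url, dest)
    return dest
```

Calling fetch_dtd() with the URL printed in the warning would need
network access; user_dtd_path() only builds the destination path.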


From sainitin7 at gmail.com  Tue May 31 07:34:54 2011
From: sainitin7 at gmail.com (sai nitin)
Date: Tue, 31 May 2011 13:34:54 +0200
Subject: [Biopython] Query regarding Bioassay database
Message-ID: <BANLkTikJB5Y4KvEdJ7D0tPLKucTzZ_Dtxw@mail.gmail.com>

Hello,

My name is Sainitin, and I have a question about using the E-utilities with
the PubChem and BioAssay databases.

Question: I have a list of PubChem IDs and need to get the corresponding
BioAssay IDs that are "unspecified".
For example, the output should look like this:

PubchemID:Bioassay IDs (unspecified)

Could anyone suggest how to retrieve the unspecified BioAssay IDs for
given PubChem IDs using Biopython?

Thanks in advance

-- 

Sainitin D

From p.j.a.cock at googlemail.com  Tue May 31 08:30:15 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Tue, 31 May 2011 13:30:15 +0100
Subject: [Biopython] Query regarding Bioassay database
In-Reply-To: <BANLkTikJB5Y4KvEdJ7D0tPLKucTzZ_Dtxw@mail.gmail.com>
References: <BANLkTikJB5Y4KvEdJ7D0tPLKucTzZ_Dtxw@mail.gmail.com>
Message-ID: <BANLkTinPVLO5Lhax=XRr-VrYuixaEWaBJQ@mail.gmail.com>

On Tue, May 31, 2011 at 12:34 PM, sai nitin <sainitin7 at gmail.com> wrote:
> Hello,
>
> My name is Sainitin, and I have a question about using the E-utilities with
> the PubChem and BioAssay databases.
>
> Question: I have a list of PubChem IDs and need to get the corresponding
> BioAssay IDs that are "unspecified".
> For example, the output should look like this:
>
> PubchemID:Bioassay IDs (unspecified)
>
> Could anyone suggest how to retrieve the unspecified BioAssay IDs for
> given PubChem IDs using Biopython?
>
> Thanks in advance

Try Entrez Link (ELink), possibly with the pcassay_pccompound link. See
the links in the Biopython documentation for Bio.Entrez.ELink, especially:
http://eutils.ncbi.nlm.nih.gov/corehtml/query/static/entrezlinks.html

If you could give a more complete example it would help. In particular,
an example of a positive match between pubchem and bioassay.

Peter
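
Note: a rough sketch of the ELink suggestion, assuming Biopython is
installed and NCBI is reachable. Bio.Entrez.elink and Bio.Entrez.read
are real Biopython functions; the helper names collect_links and
fetch_assay_links, and the choice of dbfrom="pccompound" with
db="pcassay", are assumptions for illustration. No link name is given
here, so all link types between the two databases come back grouped by
name; which of them (if any) corresponds to "unspecified" needs checking
against the NCBI links table.

```python
def collect_links(records):
    """Group linked IDs by link name from a parsed ELink result."""
    links = {}
    for linkset in records:
        for linksetdb in linkset.get("LinkSetDb", []):
            name = str(linksetdb["LinkName"])
            ids = [str(link["Id"]) for link in linksetdb["Link"]]
            links.setdefault(name, []).extend(ids)
    return links


def fetch_assay_links(compound_id, email):
    """Query ELink for BioAssay records linked to one PubChem compound ID."""
    from Bio import Entrez  # imported here; needs Biopython and network access

    Entrez.email = email  # NCBI asks for a contact address
    handle = Entrez.elink(dbfrom="pccompound", db="pcassay", id=str(compound_id))
    try:
        records = Entrez.read(handle)
    finally:
        handle.close()
    return collect_links(records)
```

Each key of the returned dictionary is a link name and each value a list
of BioAssay IDs, which can then be printed as "PubchemID:Bioassay IDs".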


From Paul.Czodrowski at merck.de  Tue May  3 10:56:10 2011
From: Paul.Czodrowski at merck.de (Paul.Czodrowski at merck.de)
Date: Tue, 3 May 2011 12:56:10 +0200
Subject: [Biopython] installation as non-administrator
Message-ID: <OFC31C2F90.568F01B6-ONC1257885.003BAD0B-C1257885.003C134D@merck.de>


Dear folks,

I'm struggling with the Biopython installation.
As non-administrator, the manual states the following:
http://biopython.org/DIST/docs/install/Installation.html#htoc30

However, the setup.py (version 1.57) does not contain any entry "
include_dirs=["Bio/Cluster", "your_dir/include/python"]
", but rather only "Bio" entries.

(See attached file: setup.py)

Or am I overlooking something?


Regards,
Paul


-------------- next part --------------
A non-text attachment was scrubbed...
Name: setup.py
Type: application/octet-stream
Size: 11597 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/biopython/attachments/20110503/46a0fb21/attachment-0002.obj>

From p.j.a.cock at googlemail.com  Tue May  3 11:31:31 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Tue, 3 May 2011 12:31:31 +0100
Subject: [Biopython] installation as non-administrator
In-Reply-To: <OFC31C2F90.568F01B6-ONC1257885.003BAD0B-C1257885.003C134D@merck.de>
References: <OFC31C2F90.568F01B6-ONC1257885.003BAD0B-C1257885.003C134D@merck.de>
Message-ID: <BANLkTikon6xTasDNcXW9gXZ22FhVdpGswQ@mail.gmail.com>

On Tue, May 3, 2011 at 11:56 AM,  <Paul.Czodrowski at merck.de> wrote:
>
> Dear folks,
>
> I'm struggling with the Biopython installation.
> As non-administrator, the manual states the following:
> http://biopython.org/DIST/docs/install/Installation.html#htoc30
>
> However, the setup.py (version 1.57) does not contain any entry "
> include_dirs=["Bio/Cluster", "your_dir/include/python"]
> ", but rather only "Bio" entries.
>
> (See attached file: setup.py)

You didn't really need to attach a whole file, you could have
linked to our repository or quoted the bit of interest.

> Or am I overlooking something?

What OS are you using? Some flavour of Linux?

What version of NumPy do you have, and how was it installed?

What command did you use to attempt the install, and what
error message did you get?

Have you tried the --prefix argument?

e.g.

python setup.py build
python setup.py test
python setup.py install --prefix=$HOME

Peter
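
Note: after installing with --prefix, Python only finds the package if
the matching site-packages directory is on PYTHONPATH. The sketch below
computes the usual location on Unix. It assumes the common
prefix/lib/pythonX.Y/site-packages layout (some 64-bit distributions
use lib64 instead), and the helper name site_packages_under is made up
for illustration.

```python
import os
import sys


def site_packages_under(prefix, version=None):
    """Return the usual site-packages directory for a --prefix install."""
    if version is None:
        # Default to the running interpreter's major.minor version.
        version = sys.version_info[:2]
    major, minor = version
    return os.path.join(prefix, "lib", "python%d.%d" % (major, minor),
                        "site-packages")


if __name__ == "__main__":
    # Print the directory to add to PYTHONPATH for an install under $HOME.
    print(site_packages_under(os.path.expanduser("~")))
```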


From anaryin at gmail.com  Tue May  3 11:32:05 2011
From: anaryin at gmail.com (=?UTF-8?Q?Jo=C3=A3o_Rodrigues?=)
Date: Tue, 3 May 2011 13:32:05 +0200
Subject: [Biopython] installation as non-administrator
In-Reply-To: <OFC31C2F90.568F01B6-ONC1257885.003BAD0B-C1257885.003C134D@merck.de>
References: <OFC31C2F90.568F01B6-ONC1257885.003BAD0B-C1257885.003C134D@merck.de>
Message-ID: <BANLkTikc60KMVGgbmST0-Xh60LXhPLgi3g@mail.gmail.com>

Hey Paul,

I usually keep a copy of Biopython in my home directory, either by supplying
the option --home=/my/home/directory or just by running "python setup.py
build" and then adding the temp/libxxx/ directory to my PYTHONPATH.

Hope it helps,

João [...] Rodrigues
http://nmr.chem.uu.nl/~joao




From anaryin at gmail.com  Tue May  3 11:32:47 2011
From: anaryin at gmail.com (=?UTF-8?Q?Jo=C3=A3o_Rodrigues?=)
Date: Tue, 3 May 2011 13:32:47 +0200
Subject: [Biopython] installation as non-administrator
In-Reply-To: <BANLkTikc60KMVGgbmST0-Xh60LXhPLgi3g@mail.gmail.com>
References: <OFC31C2F90.568F01B6-ONC1257885.003BAD0B-C1257885.003C134D@merck.de>
	<BANLkTikc60KMVGgbmST0-Xh60LXhPLgi3g@mail.gmail.com>
Message-ID: <BANLkTim2eeP9VN36+Mx-5G0Qa7yvPunJAA@mail.gmail.com>

Sorry, --prefix, not --home.


From mmokrejs at fold.natur.cuni.cz  Tue May  3 12:22:38 2011
From: mmokrejs at fold.natur.cuni.cz (Martin Mokrejs)
Date: Tue, 03 May 2011 14:22:38 +0200
Subject: [Biopython] How to optimize ACE file alignment (from newbler)
Message-ID: <4DBFF38E.7050406@fold.natur.cuni.cz>

Hi,
  I would like to ask how I can optimize the ACE alignment in files
produced by newbler. I see that only the high-quality region is aligned while
the rest is not. I typically ask newbler to place untrimmed reads into the
ACE files so that the low-quality sequence is present; you can see that it
could have been included in the alignment and contributed to the consensus
quite well.
  I found a new feature in consed-20 that can re-align the reads,
but it was too slow for me and I had to kill the re-processing of one
contig.
  Is there a way to tell some program to re-align just some columns,
starting from a given position? It should first align to the consensus already
defined and afterwards continue with de novo alignment for as long as possible.
  Alternatively, how do you edit ACE alignments (I mean manually adjust gaps,
move columns back and forth, re-order rows), and do you re-calculate the
consensus?
  This is a sort of follow-up to "Newbler ACE file to SAM?"
posted to the biopython-developers list at http://web.archiveorange.com/archive/v/5dAwXxUKZDTmQdM80MqQ
;)
Martin


From p.j.a.cock at googlemail.com  Tue May  3 13:46:25 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Tue, 3 May 2011 14:46:25 +0100
Subject: [Biopython] How to optimize ACE file alignment (from newbler)
In-Reply-To: <4DBFF38E.7050406@fold.natur.cuni.cz>
References: <4DBFF38E.7050406@fold.natur.cuni.cz>
Message-ID: <BANLkTikDn-jezoM=0g68RyyEFDbVBTEpoQ@mail.gmail.com>

On Tue, May 3, 2011 at 1:22 PM, Martin Mokrejs
<mmokrejs at fold.natur.cuni.cz> wrote:
> Hi,
>  I would like to ask how I can optimize the ACE alignment in files
> produced by newbler. I see that only the high-quality region is aligned while
> the rest is not. I typically ask newbler to place untrimmed reads into the
> ACE files so that the low-quality sequence is present; you can see that it
> could have been included in the alignment and contributed to the consensus
> quite well.
>  I found a new feature in consed-20 that can re-align the reads,
> but it was too slow for me and I had to kill the re-processing of one
> contig.
>  Is there a way to tell some program to re-align just some columns,
> starting from a given position? It should first align to the consensus already
> defined and afterwards continue with de novo alignment for as long as possible.
>  Alternatively, how do you edit ACE alignments (I mean manually adjust gaps,
> move columns back and forth, re-order rows), and do you re-calculate the
> consensus?
>  This is a sort of follow-up to "Newbler ACE file to SAM?"
> posted to the biopython-developers list at http://web.archiveorange.com/archive/v/5dAwXxUKZDTmQdM80MqQ
> ;)
> Martin

Hi Martin,

Biopython only has an ACE parser, with no support for writing ACE files.
So, even if you did manipulate the parsed ACE file in Biopython, you'd
have to write your own output code (or use a simpler file format).

Regarding assembly editors, have you looked at Gap4 or Gap5?

This might be a good question to ask on the http://seqanswers.com
forum.

Peter
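
Note: a small sketch of the "parse ACE, write a simpler format" route
mentioned above, assuming Biopython is installed. Bio.Sequencing.Ace.parse
and Bio.SeqIO.convert are existing Biopython functions; the helper names
and the file names passed to them are placeholders. This only reads ACE
files and exports the contig consensus sequences as FASTA; it does not
re-align reads or write ACE.

```python
def summarize_contigs(ace_path):
    """Return (name, number of bases, number of reads) for each contig."""
    from Bio.Sequencing import Ace  # needs Biopython

    with open(ace_path) as handle:
        return [(contig.name, contig.nbases, contig.nreads)
                for contig in Ace.parse(handle)]


def ace_consensus_to_fasta(ace_path, fasta_path):
    """Write the contig consensus sequences to FASTA; return the record count."""
    from Bio import SeqIO  # needs Biopython

    return SeqIO.convert(ace_path, "ace", fasta_path, "fasta")
```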


From Paul.Czodrowski at merck.de  Tue May  3 14:38:25 2011
From: Paul.Czodrowski at merck.de (Paul.Czodrowski at merck.de)
Date: Tue, 3 May 2011 16:38:25 +0200
Subject: [Biopython] Antwort: Re:  installation as non-administrator
In-Reply-To: <BANLkTikon6xTasDNcXW9gXZ22FhVdpGswQ@mail.gmail.com>
Message-ID: <OF4DF1F9B4.C7E3A645-ONC1257885.004F4A78-C1257885.00506C42@merck.de>

Dear Peter,


> >
> > Dear folks,
> >
> > I'm struggling with the Biopython installation.
> > As non-administrator, the manual states the following:
> > http://biopython.org/DIST/docs/install/Installation.html#htoc30
> >
> > However, the setup.py (version 1.57) does not contain any entry "
> > include_dirs=["Bio/Cluster", "your_dir/include/python"]
> > ", but rather only "Bio" entries.
> >
> > (See attached file: setup.py)
>
> You didn't really need to attach a whole file, you could have
> linked to our repository or quoted the bit of interest.
I'm sorry for this!


>
> > Or am I overlooking something?
>
> What OS are you using? Some flavour of Linux?
OpenSuse 11.3

>
> What version of NumPy do you have, and how was it installed?
NumPy version 1.3.0, installed locally by the built-in python routines.

>
> What command did you use to attempt the install, and what
> error message did you get.
python setup.py --build
==> ERROR MESSAGE
"
running build
running build_py
running build_ext
building 'Bio.Cluster.cluster' extension
gcc -pthread -fno-strict-aliasing -DNDEBUG -fomit-frame-pointer
-fmessage-length=0 -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector
-funwind-tables -fasynchronous-unwind-tables -g -fPIC
-I/usr/lib/python2.6/site-packages/numpy/core/include
-I/usr/include/python2.6 -c Bio/Cluster/clustermodule.c -o
build/temp.linux-i686-2.6/Bio/Cluster/clustermodule.o
Bio/Cluster/clustermodule.c:2:31: fatal error: numpy/arrayobject.h: No such
file or directory
compilation terminated.
error: command 'gcc' failed with exit status 1
"

>
> Have you tried the --prefix argument?
>
> e.g.
>
> python setup.py build
> python setup.py test
> python setup.py install --prefix=$HOME
>
> Peter

python setup.py --test
==> ERROR MESSAGE
"
python setup.py test
running test
Python version: 2.6.5 (r265:79063, Oct 28 2010, 20:56:56)
[GCC 4.5.0 20100604 [gcc-4_5-branch revision 160292]]
Operating system: posix linux2
test_Ace ... ok
test_AlignIO ... ok
test_AlignIO_convert ... ok
test_BioSQL ... /xyz: UserWarning: order location operators are not fully
supported
  % feature.location_operator)
ok
test_BioSQL_SeqIO ... ERROR
test_CAPS ... ok
test_Clustalw ... ok
test_Clustalw_tool ... skipping. Install clustalw or clustalw2 if you want
to use Bio.Clustalw.
test_Cluster ... skipping. If you want to use Bio.Cluster, install NumPy
first and then reinstall Biopython
test_CodonTable ... ok
test_CodonUsage ... ok
test_Compass ... ok
test_Crystal ... ok
test_Dialign_tool ... skipping. Install DIALIGN2-2 if you want to use the
Bio.Align.Applications wrapper.
test_DocSQL ... ok
test_Emboss ... skipping. Install EMBOSS if you want to use Bio.Emboss.
test_EmbossPhylipNew ... skipping. Install the Emboss package 'PhylipNew'
if you want to use the Bio.Emboss.Applications wrappers for phylogenetic
tools.
test_EmbossPrimer ... ok
test_Entrez ... Segmentation fault (core dumped)
"

python setup.py install --prefix=$HOME
==> the same ERROR MESSAGE as from "python setup.py build"


Cheers & thanks in advance,

Paul



From Paul.Czodrowski at merck.de  Tue May  3 14:47:00 2011
From: Paul.Czodrowski at merck.de (Paul.Czodrowski at merck.de)
Date: Tue, 3 May 2011 16:47:00 +0200
Subject: [Biopython] Antwort: Re:  installation as non-administrator
In-Reply-To: <BANLkTikon6xTasDNcXW9gXZ22FhVdpGswQ@mail.gmail.com>
Message-ID: <OF499B25B3.DDB632C4-ONC1257885.0050C9A2-C1257885.00513583@merck.de>

Dear Peter,

One additional question/issue:
NumPy is not located in
"/usr/lib/python2.6/site-packages/numpy/core/include"
but in another, rather global, Python lib directory.

As stated in my previous email,
python setup.py build gives
"gcc -pthread -fno-strict-aliasing -DNDEBUG -fomit-frame-pointer
-fmessage-length=0 -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector
-funwind-tables -fasynchronous-unwind-tables -g -fPIC
-I/usr/lib/python2.6/site-packages/numpy/core/include
-I/usr/include/python2.6 -c Bio/Cluster/clustermodule.c -o
build/temp.linux-i686-2.6/Bio/Cluster/clustermodule.o
Bio/Cluster/clustermodule.c:2:31: fatal error: numpy/arrayobject.h: No such
file or directory"

and I would like to adapt the
"-I/usr/lib/python2.6/site-packages/numpy/core/includ" accordingly to the
directory where it is actually located.

Cheers & thanks,
Paul




From p.j.a.cock at googlemail.com  Tue May  3 15:10:30 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Tue, 3 May 2011 16:10:30 +0100
Subject: [Biopython] Antwort: Re: installation as non-administrator
In-Reply-To: <OF4DF1F9B4.C7E3A645-ONC1257885.004F4A78-C1257885.00506C42@merck.de>
References: <BANLkTikon6xTasDNcXW9gXZ22FhVdpGswQ@mail.gmail.com>
	<OF4DF1F9B4.C7E3A645-ONC1257885.004F4A78-C1257885.00506C42@merck.de>
Message-ID: <BANLkTikAp_+w+r5Kc0OJ-gXFTziPdNOV+w@mail.gmail.com>

On Tue, May 3, 2011 at 3:38 PM,  <Paul.Czodrowski at merck.de> wrote:
> Dear Peter,
>
>
>> >
>> > Dear folks,
>> >
>> > I'm struggling with the Biopython installation.
>> > As non-administrator, the manual states the following:
>> > http://biopython.org/DIST/docs/install/Installation.html#htoc30
>> >
>> > However, the setup.py (version 1.57) does not contain any entry "
>> > include_dirs=["Bio/Cluster", "your_dir/include/python"]
>> > ", but rather only "Bio" entries.
>> >
>> > (See attached file: setup.py)
>>
>> You didn't really need to attach a whole file, you could have
>> linked to our repository or quoted the bit of interest.
>
> I'm sorry for this!

Don't worry too much, it's a fairly small file; otherwise I wouldn't
have let it through the moderation queue.

>> > Or am I overlooking something?
>>
>> What OS are you using? Some flavour of Linux?
>
> OpenSuse 11.3

Should be fine.

>>
>> What version of NumPy do you have, and how was it installed?
>
> NumPy version 1.3.0, installed locally by the built-in python routines.
>

Any reason for installing such an old version? I'm just curious.

Does NumPy work properly? At the very least, if you run python
does "import numpy" work or give an error? What happens if you
try and do this:

$ python
>>> import numpy
>>> numpy.get_include()
'/usr/local/lib/python2.6/site-packages/numpy/core/include'

(That's the output on one of our Linux machines)

If that doesn't work, perhaps your PYTHONPATH needs setting.
How/where did you install NumPy? e.g. python setup.py install --prefix=$HOME


>> What command did you use to attempt the install, and what
>> error message did you get.
> python setup.py --build
> ==> ERROR MESSAGE
> "
> running build
> running build_py
> running build_ext
> building 'Bio.Cluster.cluster' extension
> gcc -pthread -fno-strict-aliasing -DNDEBUG -fomit-frame-pointer
> -fmessage-length=0 -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector
> -funwind-tables -fasynchronous-unwind-tables -g -fPIC
> -I/usr/lib/python2.6/site-packages/numpy/core/include
> -I/usr/include/python2.6 -c Bio/Cluster/clustermodule.c -o
> build/temp.linux-i686-2.6/Bio/Cluster/clustermodule.o
> Bio/Cluster/clustermodule.c:2:31: fatal error: numpy/arrayobject.h: No such
> file or directory
> compilation terminated.
> error: command 'gcc' failed with exit status 1
> "

OK, it isn't finding the numpy header files. I'd guess from your next email the
file is /usr/lib/python2.6/site-packages/numpy/core/include/numpy/arrayobject.h

The hack suggested in the installation document is to edit our setup.py
file to point to the path explicitly. There is probably a more elegant way;
right now my guess is that NumPy is not on the python path (see above).
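
Note: the check described above (does the directory reported by NumPy
actually contain the C headers?) can be scripted. This is a minimal
sketch; numpy.get_include() is a real NumPy function, while the helper
names numpy_header and check_installed_numpy are made up for
illustration.

```python
import os


def numpy_header(include_dir):
    """Return the path to numpy/arrayobject.h under include_dir, or None."""
    header = os.path.join(include_dir, "numpy", "arrayobject.h")
    return header if os.path.isfile(header) else None


def check_installed_numpy():
    """Return (include directory, header path or None) for the NumPy in use."""
    import numpy  # imported here so numpy_header() works without NumPy

    include_dir = numpy.get_include()
    return include_dir, numpy_header(include_dir)
```

If the second value is None, the NumPy that Python imports has no
headers in its include directory, and the gcc error above is expected.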

---

From the test results,

> python setup.py test
> running test
> Python version: 2.6.5 (r265:79063, Oct 28 2010, 20:56:56)
> [GCC 4.5.0 20100604 [gcc-4_5-branch revision 160292]]
> Operating system: posix linux2
> test_Ace ... ok
> ...
> test_Entrez ... Segmentation fault (core dumped)

Oh, nasty! That should *not* happen, and is probably a separate
issue to the NumPy header install issue.

Peter


From mmokrejs at fold.natur.cuni.cz  Tue May  3 23:20:13 2011
From: mmokrejs at fold.natur.cuni.cz (Martin Mokrejs)
Date: Wed, 04 May 2011 01:20:13 +0200
Subject: [Biopython] How to optimize ACE file alignment (from newbler)
In-Reply-To: <BANLkTikDn-jezoM=0g68RyyEFDbVBTEpoQ@mail.gmail.com>
References: <4DBFF38E.7050406@fold.natur.cuni.cz>
	<BANLkTikDn-jezoM=0g68RyyEFDbVBTEpoQ@mail.gmail.com>
Message-ID: <4DC08DAD.9000100@fold.natur.cuni.cz>

Hi Peter,
  no I haven't played with gap5 yet, so far only with consed and tablet.
Thanks for noting biopython has no write support for ACE.
Martin



From Paul.Czodrowski at merck.de  Wed May  4 08:47:14 2011
From: Paul.Czodrowski at merck.de (Paul.Czodrowski at merck.de)
Date: Wed, 4 May 2011 10:47:14 +0200
Subject: [Biopython] Antwort: Re: Antwort: Re: installation as
	non-administrator
In-Reply-To: <BANLkTikAp_+w+r5Kc0OJ-gXFTziPdNOV+w@mail.gmail.com>
Message-ID: <OF02A48867.EFE17836-ONC1257886.002402DE-C1257886.0030459B@merck.de>

Dear Peter,


> > Dear Peter,
> >
> >
> >> >
> >> > Dear folks,
> >> >
> >> > I'm struggling with the Biopython installation.
> >> > As non-administrator, the manual states the following:
> >> > http://biopython.org/DIST/docs/install/Installation.html#htoc30
> >> >
> >> > However, the setup.py (version 1.57) does not contain any entry "
> >> > include_dirs=["Bio/Cluster", "your_dir/include/python"]
> >> > ", but rather only "Bio" entries.
> >> >
> >> > (See attached file: setup.py)
> >>
> >> You didn't really need to attach a whole file, you could have
> >> linked to our repository or quoted the bit of interest.
> >
> > I'm sorry for this!
>
> Don't worry too much, it's a fairly small file; otherwise I wouldn't
> have let it through the moderation queue.
>
> >> > Or am I overlooking something?
> >>
> >> What OS are you using? Some flavour of Linux?
> >
> > OpenSuse 11.3
>
> Should be fine.
>
> >>
> >> What version of NumPy do you have, and how was it installed?
> >
> > NumPy version 1.3.0, installed locally by the built-in python routines.
> >
>
> Any reason for installing such an old version? I'm just curious.

No logical reason... :)


>
> Does NumPy work properly? At the very least, if you run python
> does "import numpy" work or give an error? What happens if you
> try and do this:
>
> $ python
> >>> import numpy
> >>> numpy.get_include()
> '/usr/local/lib/python2.6/site-packages/numpy/core/include'
>
> (That's the output on one of our Linux machines)

We have the same output:
>>>
>>>
>>> numpy.get_include()
'/usr/lib/python2.6/site-packages/numpy/core/include'


>
> If that doesn't work, perhaps your PYTHONPATH needs setting.
> How/where did you install NumPy? e.g. python setup.py install --prefix=$HOME

The Python under /usr/lib was installed via YaST on openSUSE.
But it seems to me that this installation did not work properly, since
there are only 2 files in the directory
"/usr/lib/python2.6/site-packages/numpy/core/include/numpy/":
- ufunc_api.txt
- multiarray_api.txt

However, we have another installation of NumPy which is located here:
"/SW/python/lib/python2.6/site-packages/lib/python2.6/site-packages/numpy"

And yes, there is a mix-up of the directories... :)


> >> What command did you use to attempt the install, and what
> >> error message did you get.
> > python setup.py --build
> > ==> ERROR MESSAGE
> > "
> > running build
> > running build_py
> > running build_ext
> > building 'Bio.Cluster.cluster' extension
> > gcc -pthread -fno-strict-aliasing -DNDEBUG -fomit-frame-pointer
> > -fmessage-length=0 -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector
> > -funwind-tables -fasynchronous-unwind-tables -g -fPIC
> > -I/usr/lib/python2.6/site-packages/numpy/core/include
> > -I/usr/include/python2.6 -c Bio/Cluster/clustermodule.c -o
> > build/temp.linux-i686-2.6/Bio/Cluster/clustermodule.o
> > Bio/Cluster/clustermodule.c:2:31: fatal error: numpy/arrayobject.h: No such
> > file or directory
> > compilation terminated.
> > error: command 'gcc' failed with exit status 1
> > "
>
> OK, it isn't finding the numpy header files. I'd guess from your next email the
> file is /usr/lib/python2.6/site-packages/numpy/core/include/numpy/arrayobject.h

You are wrong about this.
The header file is located here:

"/SW/python/lib/python2.6/site-packages/lib/python2.6/site-packages/numpy/core/include/numpy/"


By appropriately setting the PYTHONPATH, it works properly.


>
> The hack suggested in the installation document is to edit our setup.py
> file to point to the path explicitly. There is probably a more elegant way,
> right now my guess is that NumPy is not on the python path (see above).
>
> ---
>
> From the test results,
>
> > python setup.py test
> > running test
> > Python version: 2.6.5 (r265:79063, Oct 28 2010, 20:56:56)
> > [GCC 4.5.0 20100604 [gcc-4_5-branch revision 160292]]
> > Operating system: posix linux2
> > test_Ace ... ok
> > ...
> > test_Entrez ... Segmentation fault (core dumped)
>
> Oh, nasty! That should *not* happen, and is probably a separate
> issue to the NumPy header install issue.


python setup.py install --prefix=$HOME works fine now.
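For reference, a minimal sketch of where a --prefix install usually lands, assuming the conventional distutils layout on Linux (<prefix>/lib/pythonX.Y/site-packages). The helper name prefix_site_packages is hypothetical, and the layout can differ on other platforms or distributions. That directory is what would normally need to be on PYTHONPATH:

```python
import os
import sys


def prefix_site_packages(prefix, version=None):
    """Return the conventional site-packages directory under an install prefix.

    Assumes the classic POSIX layout <prefix>/lib/pythonX.Y/site-packages;
    this is an illustration, not something guaranteed on every system.
    """
    if version is None:
        # Default to the version of the interpreter running this script.
        version = "%d.%d" % sys.version_info[:2]
    return os.path.join(os.path.expanduser(prefix), "lib",
                        "python" + version, "site-packages")


if __name__ == "__main__":
    # Directory to add to PYTHONPATH after "setup.py install --prefix=$HOME".
    print(prefix_site_packages("~"))
```

For the Python 2.6 setup in this thread, prefix_site_packages("/some/prefix", "2.6") would give /some/prefix/lib/python2.6/site-packages.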

Should the segmentation fault still be considered?

Cheers & thanks,
Paul

>
> Peter




From p.j.a.cock at googlemail.com  Wed May  4 09:06:11 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 4 May 2011 10:06:11 +0100
Subject: [Biopython] Antwort: Re: Antwort: Re: installation as
	non-administrator
In-Reply-To: <OF02A48867.EFE17836-ONC1257886.002402DE-C1257886.0030459B@merck.de>
References: <BANLkTikAp_+w+r5Kc0OJ-gXFTziPdNOV+w@mail.gmail.com>
	<OF02A48867.EFE17836-ONC1257886.002402DE-C1257886.0030459B@merck.de>
Message-ID: <BANLkTimRtfygjwrUEXS=5x5874HZC6NcwQ@mail.gmail.com>

On Wed, May 4, 2011 at 9:47 AM,  <Paul.Czodrowski at merck.de> wrote:
> Dear Peter,
>
>>
>> Does NumPy work properly? At the very least, if you run python
>> does "import numpy" work or give an error? What happens if you
>> try and do this:
>>
>> $ python
>> >>> import numpy
>> >>> numpy.get_include()
>> '/usr/local/lib/python2.6/site-packages/numpy/core/include'
>>
>> (That's the output on one of our Linux machines)
>
> We have the same output:
>>>>
>>>>
>>>> numpy.get_include()
> '/usr/lib/python2.6/site-packages/numpy/core/include'
>
>
>>
>> If that doesn't work, perhaps your PYTHONPATH needs setting.
>> How/where did you install NumPy? e.g. python setup.py --prefix=$HOME
>
> The /usr/lib python is installed via the yast OpenSuse.
> But it seems to me that this installation did not work properly,
> since, there are only 2 files in the directory
> " /usr/lib/python2.6/site-packages/numpy/core/include/numpy/":
> - ufunc_api.txt
> - multiarray_api.txt
>
> However, we have another installation of NumPy which is located here:
> "/SW/python/lib/python2.6/site-packages/lib/python2.6/site-packages/numpy"
>
> And yes, there is a mix-up of the directories... :)

I think that explains why the Biopython install didn't work originally,
it found the broken NumPy under /usr/lib rather than your good one
installed under /SW/

You might want to try and remove the broken NumPy, as it may
cause you problems installing other python libraries.
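A quick way to tell a broken NumPy from a usable one is to look for the header the compiler complained about. The following is only a sketch: the function name find_numpy_headers is hypothetical, and the numpy/core/include/numpy/ layout is the one seen in this thread (other NumPy versions may place headers elsewhere, in which case numpy.get_include() is the authority).

```python
import os


def find_numpy_headers(site_packages_dirs):
    """Return the candidate directories whose NumPy install ships arrayobject.h."""
    good = []
    for base in site_packages_dirs:
        # Path layout taken from the error message and paths in this thread.
        header = os.path.join(base, "numpy", "core", "include",
                              "numpy", "arrayobject.h")
        if os.path.isfile(header):
            good.append(base)
    return good


if __name__ == "__main__":
    import sys
    # sys.path reflects PYTHONPATH, so this lists the usable entries in order.
    for path in find_numpy_headers(sys.path):
        print(path)
```

If the first directory printed is not the one Python actually imports numpy from, the build is likely to pick up the wrong include directory.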

>
> By appropriately setting the PYTHONPATH, it works properly.
>

OK, good.

>> >From the test results,
>>
>> > python setup.py test
>> > running test
>> > Python version: 2.6.5 (r265:79063, Oct 28 2010, 20:56:56)
>> > [GCC 4.5.0 20100604 [gcc-4_5-branch revision 160292]]
>> > Operating system: posix linux2
>> > test_Ace ... ok
>> > ...
>> > test_Entrez ... Segmentation fault (core dumped)
>>
>> Oh, nasty! That should *not* happen, and is probably a separate
>> issue to the NumPy header install issue.
>
> python setup.py install --prefix=$HOME works fine now.
>
> Should the segmentation fault still be considered?

Yes please. I assume it still breaks? Can you try changing to the
Tests subdirectory from the Biopython source, and doing:

python test_Entrez.py

That should run just the Entrez tests, and hopefully give a bit
more information about what/when the segmentation fault
occurs. I suspect a problem in one of the Python C libraries
that Biopython is using (since as far as I can recall, all the
Bio.Entrez code is pure python).

Peter


From mictadlo at gmail.com  Wed May  4 09:59:13 2011
From: mictadlo at gmail.com (Michal)
Date: Wed, 04 May 2011 19:59:13 +1000
Subject: [Biopython] [BioRuby] Interesting BLAST 2.2.25+ XML behaviour
In-Reply-To: <398303E2-1195-4CC2-8B73-09C6C1117892@illinois.edu>
References: <BANLkTi=6_2bFpGhOwxtdjy-DzxUotVWxEg@mail.gmail.com>	<BANLkTinH0y4KQ7_AXt7Ly3TgN9fXxErUzA@mail.gmail.com>
	<398303E2-1195-4CC2-8B73-09C6C1117892@illinois.edu>
Message-ID: <4DC12371.3040204@gmail.com>

Hi Peter,
Do you have the script which reads

https://bitbucket.org/galaxy/galaxy-central/src/8eaf07a46623/test-data/blastp_four_human_vs_rhodopsin.xml


and what would be the correct output?

Thank you in advance.

Cheers,
Michal

On 05/03/2011 11:31 PM, Chris Fields wrote:
> Haven't tried this using the latest BLAST+ myself, but it doesn't surprise me too much.  Also agree re: some kind of bug tracking with NCBI; I believe they have an internal one, but it would be nice to have a public interface to it.
>
> chris
>
> On May 3, 2011, at 4:24 AM, Peter Cock wrote:
>
>> Hello all,
>>
>> I've CC'd the BioPerl, BioRuby, BioJava and Biopython development mailing
>> lists to make sure you're aware of this, but can we continue any discussion
>> on the cross-project open-bio-l mailing list please?
>>
>> I noticed that recent versions of BLAST are not using a single <iteration>
>> block for each query, which was the historical behaviour and assumed
>> by the Biopython BLAST XML parser. This may be a bug in BLAST.
>> See link below for an example.
>>
>> Has anyone else noticed this, and has it been reported to the NCBI yet?
>>
>> Thanks,
>>
>> Peter
>>
>> (Not for the first time, I wish there was a public bug tracker for BLAST,
>> or at least a private bug tracker so we could talk about issues with an
>> NCBI assigned reference number.)
>>
>> ---------- Forwarded message ----------
>> From: Peter Cock <p.j.a.cock at googlemail.com>
>> Date: Wed, Apr 20, 2011 at 6:08 PM
>> Subject: Interesting BLAST 2.2.25+ XML behaviour
>> To: Biopython-Dev Mailing List <biopython-dev at biopython.org>
>>
>>
>> Hi all,
>>
>> Have a look at this XML file from a FASTA vs FASTA search
>> using blastp from  BLAST 2.2.25+ (current release), which
>> is a test file I created for the BLAST+ wrappers in Galaxy:
>>
>> https://bitbucket.org/galaxy/galaxy-central/src/8eaf07a46623/test-data/blastp_four_human_vs_rhodopsin.xml
>>
>> I just put it through the Biopython BLAST XML parser, and
>> was surprised not to get four records back (since as you
>> might guess from the filename, there were four queries).
>>
>> It appears this version of BLAST+ is incrementing the
>> iteration counter for each match... or something like that.
>>
>> Has anyone else noticed this? I wonder if it is accidental...
>>
>> Peter
>>
>> _______________________________________________
>> BioRuby Project - http://www.bioruby.org/
>> BioRuby mailing list
>> BioRuby at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioruby
>
> _______________________________________________
> BioRuby Project - http://www.bioruby.org/
> BioRuby mailing list
> BioRuby at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioruby
>


From p.j.a.cock at googlemail.com  Wed May  4 10:36:57 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 4 May 2011 11:36:57 +0100
Subject: [Biopython] [BioRuby] Interesting BLAST 2.2.25+ XML behaviour
In-Reply-To: <4DC12371.3040204@gmail.com>
References: <BANLkTi=6_2bFpGhOwxtdjy-DzxUotVWxEg@mail.gmail.com>
	<BANLkTinH0y4KQ7_AXt7Ly3TgN9fXxErUzA@mail.gmail.com>
	<398303E2-1195-4CC2-8B73-09C6C1117892@illinois.edu>
	<4DC12371.3040204@gmail.com>
Message-ID: <BANLkTinV4uha74Y9jC_f=XLK5LufAn1xHw@mail.gmail.com>

On Wed, May 4, 2011 at 10:59 AM, Michal <mictadlo at gmail.com> wrote:
> Hi Peter,
> Do you have the script which read
>
> https://bitbucket.org/galaxy/galaxy-central/src/8eaf07a46623/test-data/blastp_four_human_vs_rhodopsin.xml
>
>
> and what would be the correct output?
>
> Thank you in advance.
>
> Cheers,
> Michal

Hi Michal,

I'm not quite sure what you're asking, but I'll try. First, the three
data files:

$ wget https://bitbucket.org/galaxy/galaxy-central/src/8eaf07a46623/test-data/blastp_four_human_vs_rhodopsin.xml
$ wget https://bitbucket.org/galaxy/galaxy-central/src/8eaf07a46623/test-data/four_human_proteins.fasta
$ wget https://bitbucket.org/galaxy/galaxy-central/src/8eaf07a46623/rhodopsin_proteins.fasta

The query file has four sequences,

$ grep -c "^>" four_human_proteins.fasta
4

$ grep "^>" four_human_proteins.fasta
>sp|Q9BS26|ERP44_HUMAN Endoplasmic reticulum resident protein 44 OS=Homo sapiens GN=ERP44 PE=1 SV=1
>sp|Q9NSY1|BMP2K_HUMAN BMP-2-inducible protein kinase OS=Homo sapiens GN=BMP2K PE=1 SV=2
>sp|P06213|INSR_HUMAN Insulin receptor OS=Homo sapiens GN=INSR PE=1 SV=4
>sp|P08100|OPSD_HUMAN Rhodopsin OS=Homo sapiens GN=RHO PE=1 SV=1

Based on past experience, I would expect 4 iteration blocks in the
XML, but in this case I have 24:

$ grep "<Iteration>" -c blastp_four_human_vs_rhodopsin.xml
24

Notice we get 6 iterations for each query (4 times 6 is 24):

$ grep "<Iteration_query-ID>" blastp_four_human_vs_rhodopsin.xml
      <Iteration_query-ID>sp|Q9BS26|ERP44_HUMAN</Iteration_query-ID>
      <Iteration_query-ID>sp|Q9BS26|ERP44_HUMAN</Iteration_query-ID>
      <Iteration_query-ID>sp|Q9BS26|ERP44_HUMAN</Iteration_query-ID>
      <Iteration_query-ID>sp|Q9BS26|ERP44_HUMAN</Iteration_query-ID>
      <Iteration_query-ID>sp|Q9BS26|ERP44_HUMAN</Iteration_query-ID>
      <Iteration_query-ID>sp|Q9BS26|ERP44_HUMAN</Iteration_query-ID>
      <Iteration_query-ID>sp|Q9NSY1|BMP2K_HUMAN</Iteration_query-ID>
      <Iteration_query-ID>sp|Q9NSY1|BMP2K_HUMAN</Iteration_query-ID>
      <Iteration_query-ID>sp|Q9NSY1|BMP2K_HUMAN</Iteration_query-ID>
      <Iteration_query-ID>sp|Q9NSY1|BMP2K_HUMAN</Iteration_query-ID>
      <Iteration_query-ID>sp|Q9NSY1|BMP2K_HUMAN</Iteration_query-ID>
      <Iteration_query-ID>sp|Q9NSY1|BMP2K_HUMAN</Iteration_query-ID>
      <Iteration_query-ID>sp|P06213|INSR_HUMAN</Iteration_query-ID>
      <Iteration_query-ID>sp|P06213|INSR_HUMAN</Iteration_query-ID>
      <Iteration_query-ID>sp|P06213|INSR_HUMAN</Iteration_query-ID>
      <Iteration_query-ID>sp|P06213|INSR_HUMAN</Iteration_query-ID>
      <Iteration_query-ID>sp|P06213|INSR_HUMAN</Iteration_query-ID>
      <Iteration_query-ID>sp|P06213|INSR_HUMAN</Iteration_query-ID>
      <Iteration_query-ID>sp|P08100|OPSD_HUMAN</Iteration_query-ID>
      <Iteration_query-ID>sp|P08100|OPSD_HUMAN</Iteration_query-ID>
      <Iteration_query-ID>sp|P08100|OPSD_HUMAN</Iteration_query-ID>
      <Iteration_query-ID>sp|P08100|OPSD_HUMAN</Iteration_query-ID>
      <Iteration_query-ID>sp|P08100|OPSD_HUMAN</Iteration_query-ID>
      <Iteration_query-ID>sp|P08100|OPSD_HUMAN</Iteration_query-ID>
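The same check can be done without grep. This is a minimal sketch using only the Python standard library (not the Biopython parser); the function name iterations_per_query is hypothetical, and the element names are the ones shown in the grep output above.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def iterations_per_query(xml_source):
    """Count <Iteration> blocks per <Iteration_query-ID> in BLAST XML.

    xml_source may be a filename or a file-like object.
    """
    counts = Counter()
    # iterparse yields each element once its closing tag has been read.
    for _event, elem in ET.iterparse(xml_source):
        if elem.tag == "Iteration":
            counts[elem.findtext("Iteration_query-ID")] += 1
            elem.clear()  # drop the children to keep memory use low
    return counts


if __name__ == "__main__":
    import sys
    for query_id, n in iterations_per_query(sys.argv[1]).items():
        print(query_id, n)
```

A count above one for a query in a plain (non-iterative) blastp run would point to the behaviour described here.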

Now, using the two FASTA files directly and re-running blastp, what do I get?

$ ~/Downloads/ncbi-blast-2.2.25+/bin/blastp -query
four_human_proteins.fasta -subject rhodopsin_proteins.fasta -outfmt 5
| grep "<Iteration>" -c
24

Or again with -parse_deflines, which changes how the hit ID/def is presented:

$ ~/Downloads/ncbi-blast-2.2.25+/bin/blastp -query
four_human_proteins.fasta -subject rhodopsin_proteins.fasta -outfmt 5
-parse_deflines | grep "<Iteration>" -c
24

How about older versions?

$ ~/Downloads/ncbi-blast-2.2.24+/bin/blastp -query
four_human_proteins.fasta -subject rhodopsin_proteins.fasta -outfmt 5
BLAST engine error: XML formatting is only supported for a database search

I'll have to make a blast database first...

$ ~/Downloads/ncbi-blast-2.2.24+/bin/makeblastdb -in
rhodopsin_proteins.fasta -dbtype prot

Building a new DB, current time: 05/04/2011 11:22:57
New DB name:   rhodopsin_proteins.fasta
New DB title:  rhodopsin_proteins.fasta
Sequence type: Protein
Keep Linkouts: T
Keep MBits: T
Maximum file size: 1073741824B
Adding sequences from FASTA; added 6 sequences in 0.105655 seconds.

$ ~/Downloads/ncbi-blast-2.2.25+/bin/blastp -query
four_human_proteins.fasta -db rhodopsin_proteins.fasta -outfmt 5 |
grep "<Iteration>" -c
4

Look - just four iteration blocks, as I expected! This also works if the
database is built with the -parse_seqids switch.

The same happens with older versions of BLAST+, one <Iteration>
block per query, so four iteration blocks for this example. I tried all
of 2.2.21+, 2.2.22+, 2.2.23+ and 2.2.24+ (running makeblastdb to
give a fresh database, then blastp).

That seems to demonstrate that the bug is specific to the XML output
from FASTA vs FASTA searches (not FASTA vs DB); as the 2.2.24+ error
above shows, that output is a new feature in NCBI BLAST 2.2.25+

I will raise this with the NCBI, and report back.

However, even if the NCBI fix it in the next release, we (Bio*) may
want to update our parsers to cope with this quirk, or at least put a
warning in our BLAST XML parser documentation, as there will be
lots of installations of NCBI BLAST 2.2.25+ in the wild.
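One possible way to cope with the quirk, sketched here with the standard library only: merge consecutive <Iteration> blocks that share a query ID. This is an illustration, not what any Bio* parser actually does; the function name hits_by_query is hypothetical, and it assumes the usual <Hit>/<Hit_id> elements inside each iteration. It would be wrong for PSI-BLAST output, where several iterations per query are legitimate.

```python
import xml.etree.ElementTree as ET
from itertools import groupby


def hits_by_query(xml_source):
    """Group hit IDs from consecutive <Iteration> blocks sharing a query ID.

    Returns a list of (query_id, [hit_id, ...]) tuples in file order.
    """
    root = ET.parse(xml_source).getroot()
    merged = []
    # groupby only merges neighbouring blocks, which matches the output
    # shown above where each query's iterations are contiguous.
    for query_id, group in groupby(
            root.iter("Iteration"),
            key=lambda it: it.findtext("Iteration_query-ID")):
        hit_ids = [hit.findtext("Hit_id")
                   for iteration in group
                   for hit in iteration.iter("Hit")]
        merged.append((query_id, hit_ids))
    return merged
```

With the example file this should yield one entry per query, each carrying the hits that were spread over its iteration blocks.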

Peter


From Paul.Czodrowski at merck.de  Wed May  4 11:30:16 2011
From: Paul.Czodrowski at merck.de (Paul.Czodrowski at merck.de)
Date: Wed, 4 May 2011 13:30:16 +0200
Subject: [Biopython] Antwort: Re: Antwort: Re: Antwort: Re: installation
	as	non-administrator
In-Reply-To: <BANLkTimRtfygjwrUEXS=5x5874HZC6NcwQ@mail.gmail.com>
Message-ID: <OF5C3E5063.6478DA17-ONC1257886.003E29CD-C1257886.003F32C7@merck.de>

Dear Peter,


> >> >From the test results,
> >>
> >> > python setup.py test
> >> > running test
> >> > Python version: 2.6.5 (r265:79063, Oct 28 2010, 20:56:56)
> >> > [GCC 4.5.0 20100604 [gcc-4_5-branch revision 160292]]
> >> > Operating system: posix linux2
> >> > test_Ace ... ok
> >> > ...
> >> > test_Entrez ... Segmentation fault (core dumped)
> >>
> >> Oh, nasty! That should *not* happen, and is probably a separate
> >> issue to the NumPy header install issue.
> >
> > python setup.py install --prefix=$HOME works fine now.
> >
> > Should the segmentation fault still be considered?
>
> Yes please. I assume it still breaks? Can you try changing to the
> Tests subdirectory from the Biopython source, and doing:
>
> python test_Entrez.py

I cannot find the src directory.
Here is my Bio/ directory:
"
Affy
Align
AlignIO
Alphabet
Application
Blast
CAPS
Clustalw
Cluster
Compass
cpairwise2.so
Crystal
Data
DocSQL.py
DocSQL.pyc
Emboss
Entrez
ExPASy
File.py
File.pyc
FSSP
GA
GenBank
Geo
Graphics
HMM
HotRand.py
HotRand.pyc
Index.py
Index.pyc
__init__.py
__init__.pyc
InterPro
KDTree
KEGG
kNN.py
kNN.pyc
LogisticRegression.py
LogisticRegression.pyc
MarkovModel.py
MarkovModel.pyc
MaxEntropy.py
MaxEntropy.pyc
Medline
Motif
NaiveBayes.py
NaiveBayes.pyc
NeuralNetwork
Nexus
NMR
pairwise2.py
pairwise2.pyc
Parsers
ParserSupport.py
ParserSupport.pyc
Pathway
PDB
Phylo
PopGen
_py3k.py
_py3k.pyc
Restriction
SCOP
Search.py
Search.pyc
SeqFeature.py
SeqFeature.pyc
SeqIO
Seq.py
Seq.pyc
SeqRecord.py
SeqRecord.pyc
Sequencing
SeqUtils
Statistics
SubsMat
SVDSuperimposer
SwissProt
triefind.py
triefind.pyc
trie.so
UniGene
Wise
"

BTW, python setup.py install --prefix=$HOME did not break.

Thanks & Cheers,
Paul

>
> That should run just the Entrez tests, and hopefully give a bit
> more information about what/when the segmentation fault
> occurs. I suspect a problem in one of the Python C libraries
> that Biopython is using (since as far as I can recall, all the
> Bio.Entrez code is pure python).
>
> Peter




From anaryin at gmail.com  Wed May  4 11:41:07 2011
From: anaryin at gmail.com (João Rodrigues)
Date: Wed, 4 May 2011 13:41:07 +0200
Subject: [Biopython] Antwort: Re: Antwort: Re: Antwort: Re: installation
 as non-administrator
In-Reply-To: <OF5C3E5063.6478DA17-ONC1257886.003E29CD-C1257886.003F32C7@merck.de>
References: <BANLkTimRtfygjwrUEXS=5x5874HZC6NcwQ@mail.gmail.com>
	<OF5C3E5063.6478DA17-ONC1257886.003E29CD-C1257886.003F32C7@merck.de>
Message-ID: <BANLkTikwgMftN9O7RRtqkbQ2tj1-+CGDAA@mail.gmail.com>

At the same level as Bio/ in the source tree there is another directory called Tests/.

If I list my biopython directory:

joaor at home: ls biopython-git/
*Bio*         BioSQL      CONTRIB     DEPRECATED  Doc         LICENSE
MANIFEST.in NEWS        README      Scripts     *Tests*       build
do2to3.py   setup.py

The file Peter was talking about should be there.
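To make the distinction concrete: the installed Bio/ package does not contain the tests, they live in the unpacked source tree next to setup.py. A small sketch, with the hypothetical helper name find_entrez_test, that checks a directory for the layout listed above:

```python
import os


def find_entrez_test(source_root):
    """Return the path to Tests/test_Entrez.py inside an unpacked source tree.

    Returns None if source_root does not look like a Biopython source tree
    (no setup.py, or no Tests/test_Entrez.py next to it).
    """
    candidate = os.path.join(source_root, "Tests", "test_Entrez.py")
    has_setup = os.path.isfile(os.path.join(source_root, "setup.py"))
    if has_setup and os.path.isfile(candidate):
        return candidate
    return None


if __name__ == "__main__":
    import sys
    print(find_entrez_test(sys.argv[1]))
```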

Cheers,

João [...] Rodrigues
http://nmr.chem.uu.nl/~joao


On Wed, May 4, 2011 at 1:30 PM, <Paul.Czodrowski at merck.de> wrote:

> Dear Peter,
>
>
>
> > >> >From the test results,
> > >>
> > >> > python setup.py test
> > >> > running test
> > >> > Python version: 2.6.5 (r265:79063, Oct 28 2010, 20:56:56)
> > >> > [GCC 4.5.0 20100604 [gcc-4_5-branch revision 160292]]
> > >> > Operating system: posix linux2
> > >> > test_Ace ... ok
> > >> > ...
> > >> > test_Entrez ... Segmentation fault (core dumped)
> > >>
> > >> Oh, nasty! That should *not* happen, and is probably a separate
> > >> issue to the NumPy header install issue.
> > >
> > > python setup.py install --prefix=$HOME works fine now.
> > >
> > > Should the segmentation fault still be considered?
> >
> > Yes please. I assume it still breaks? Can you try changing to the
> > Tests subdirectory from the Biopython source, and doing:
> >
> > python test_Entrez.py
>
> I cannot find the src directory.
> Here is my Bio/ directory:
> "
> Affy
> Align
> AlignIO
> Alphabet
> Application
> Blast
> CAPS
> Clustalw
> Cluster
> Compass
> cpairwise2.so
> Crystal
> Data
> DocSQL.py
> DocSQL.pyc
> Emboss
> Entrez
> ExPASy
> File.py
> File.pyc
> FSSP
> GA
> GenBank
> Geo
> Graphics
> HMM
> HotRand.py
> HotRand.pyc
> Index.py
> Index.pyc
> __init__.py
> __init__.pyc
> InterPro
> KDTree
> KEGG
> kNN.py
> kNN.pyc
> LogisticRegression.py
> LogisticRegression.pyc
> MarkovModel.py
> MarkovModel.pyc
> MaxEntropy.py
> MaxEntropy.pyc
> Medline
> Motif
> NaiveBayes.py
> NaiveBayes.pyc
> NeuralNetwork
> Nexus
> NMR
> pairwise2.py
> pairwise2.pyc
> Parsers
> ParserSupport.py
> ParserSupport.pyc
> Pathway
> PDB
> Phylo
> PopGen
> _py3k.py
> _py3k.pyc
> Restriction
> SCOP
> Search.py
> Search.pyc
> SeqFeature.py
> SeqFeature.pyc
> SeqIO
> Seq.py
> Seq.pyc
> SeqRecord.py
> SeqRecord.pyc
> Sequencing
> SeqUtils
> Statistics
> SubsMat
> SVDSuperimposer
> SwissProt
> triefind.py
> triefind.pyc
> trie.so
> UniGene
> Wise
> "
>
> BTW, python setup.py install --prefix=$HOME did not break.
>
> Thanks & Cheers,
> Paul
>
> >
> > That should run just the Entrez tests, and hopefully give a bit
> > more information about what/when the segmentation fault
> > occurs. I suspect a problem in one of the Python C libraries
> > that Biopython is using (since as far as I can recall, all the
> > Bio.Entrez code is pure python).
> >
> > Peter
>
>
>
> _______________________________________________
> Biopython mailing list  -  Biopython at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython
>


From Paul.Czodrowski at merck.de  Wed May  4 12:40:06 2011
From: Paul.Czodrowski at merck.de (Paul.Czodrowski at merck.de)
Date: Wed, 4 May 2011 14:40:06 +0200
Subject: [Biopython] Antwort: Re: Antwort: Re: Antwort: Re: Antwort: Re:
 installation as non-administrator
In-Reply-To: <BANLkTikwgMftN9O7RRtqkbQ2tj1-+CGDAA@mail.gmail.com>
Message-ID: <OF27B1521F.A060468A-ONC1257886.00457F1B-C1257886.004597A6@merck.de>


Dear Joao & Peter,

this is what I got:

"
Test error handling when presented with Fasta non-XML data ... ok
Test error handling when presented with GenBank non-XML data ... ok
Test parsing XML returned by EFetch, Nucleotide database (first test) ...
ERROR
Test parsing XML returned by EFetch, Protein database ... ERROR
Test parsing XML returned by EFetch, OMIM database ... ERROR
Test parsing XML returned by EFetch, PubMed database (first test) ...
Segmentation fault (core dumped)
"


Cheers,
Paul

> On the same level of Bio/ you have another directory called Tests/.
>
> If I list my biopython directory:
>
> joaor at home: ls biopython-git/
> *Bio*         BioSQL      CONTRIB     DEPRECATED  Doc         LICENSE
> MANIFEST.in NEWS        README      Scripts     *Tests*       build
> do2to3.py   setup.py
>
> The file Peter was talking about should be there.
>
> Cheers,
>
> João [...] Rodrigues
> http://nmr.chem.uu.nl/~joao
>



From p.j.a.cock at googlemail.com  Wed May  4 13:17:21 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 4 May 2011 14:17:21 +0100
Subject: [Biopython] Antwort: Re: Antwort: Re: Antwort: Re: Antwort: Re:
 installation as non-administrator
In-Reply-To: <OF27B1521F.A060468A-ONC1257886.00457F1B-C1257886.004597A6@merck.de>
References: <BANLkTikwgMftN9O7RRtqkbQ2tj1-+CGDAA@mail.gmail.com>
	<OF27B1521F.A060468A-ONC1257886.00457F1B-C1257886.004597A6@merck.de>
Message-ID: <BANLkTim_YYfgFpgtzKdPT-2SJXjXFZY83Q@mail.gmail.com>

On Wed, May 4, 2011 at 1:40 PM,  <Paul.Czodrowski at merck.de> wrote:
>
> Dear Joao & Peter,
>
> this is what I got:
>
> "
> Test error handling when presented with Fasta non-XML data ... ok
> Test error handling when presented with GenBank non-XML data ... ok
> Test parsing XML returned by EFetch, Nucleotide database (first test) ...
> ERROR
> Test parsing XML returned by EFetch, Protein database ... ERROR
> Test parsing XML returned by EFetch, OMIM database ... ERROR
> Test parsing XML returned by EFetch, PubMed database (first test) ...
> Segmentation fault (core dumped)
> "
>
>
> Cheers,
> Paul

Hmm, something amiss with the XML parsing I think, we're
using the Python standard library xml.parsers.expat here.

You said you were using OpenSuse 11.3, and the start of our test
suite reported the following:

Python version: 2.6.5 (r265:79063, Oct 28 2010, 20:56:56)
[GCC 4.5.0 20100604 [gcc-4_5-branch revision 160292]]
Operating system: posix linux2

What version of expat do you have? Try:

$ python
Python 2.6.5 (r265:79063, Apr 16 2010, 13:09:56)
[GCC 4.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from xml.parsers import expat
>>> print expat.__version__
$Revision: 17640 $
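On interpreters where expat.__version__ is not available, a comparable check can be sketched as below. It reports expat.EXPAT_VERSION (the version of the underlying Expat library) and runs a tiny parse as a smoke test; the function name expat_report is hypothetical.

```python
from xml.parsers import expat


def expat_report():
    """Return the Expat version string and the element names of a tiny parse."""
    seen = []
    parser = expat.ParserCreate()
    # Record every opening tag the parser reports.
    parser.StartElementHandler = lambda name, attrs: seen.append(name)
    parser.Parse("<a><b/></a>", True)  # True marks this as the final chunk
    return expat.EXPAT_VERSION, seen


if __name__ == "__main__":
    version, elements = expat_report()
    print(version, elements)
```

If even this small parse crashes, the problem is most likely in the Python/Expat build rather than in Biopython.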

Do you fancy trying gdb to get a stack trace for us?

I've had a quick Google, and the following issue *might* be
related: http://bugs.python.org/issue4877

Peter


From Paul.Czodrowski at merck.de  Wed May  4 13:36:42 2011
From: Paul.Czodrowski at merck.de (Paul.Czodrowski at merck.de)
Date: Wed, 4 May 2011 15:36:42 +0200
Subject: [Biopython] Antwort: Re: Antwort: Re: Antwort: Re: Antwort: Re:
 Antwort: Re: installation as non-administrator
In-Reply-To: <BANLkTim_YYfgFpgtzKdPT-2SJXjXFZY83Q@mail.gmail.com>
Message-ID: <OF5BE89F6E.17E09B10-ONC1257886.004A5D76-C1257886.004AC62C@merck.de>

                                                                           
Dr. Paul Czodrowski
Merck KGaA
NCE Technologies, Room A22/231
Computational Chemistry
Frankfurter Strasse 250
64293 Darmstadt, Germany
Phone: +49-6151-72 3218
Fax: +49-6151-72 91 3218
Email: paul.czodrowski at merck.de

Mandatory information can be found at http://mandatories.merck.de

biopython-bounces at lists.open-bio.org wrote on 04.05.2011 15:17:21:

> On Wed, May 4, 2011 at 1:40 PM,  <Paul.Czodrowski at merck.de> wrote:
> >
> > Dear Joao & Peter,
> >
> > this is what I got:
> >
> > "
> > Test error handling when presented with Fasta non-XML data ... ok
> > Test error handling when presented with GenBank non-XML data ... ok
> > Test parsing XML returned by EFetch, Nucleotide database (first test) ...
> > ERROR
> > Test parsing XML returned by EFetch, Protein database ... ERROR
> > Test parsing XML returned by EFetch, OMIM database ... ERROR
> > Test parsing XML returned by EFetch, PubMed database (first test) ...
> > Segmentation fault (core dumped)
> > "
> >
> >
> > Cheers,
> > Paul
>
> Hmm, something amiss with the XML parsing I think, we're
> using the Python standard library xml.parsers.expat here.
>
> You said you were using OpenSuse 11.3, and the start of our test
> suite reported the following:
>
> Python version: 2.6.5 (r265:79063, Oct 28 2010, 20:56:56)
> [GCC 4.5.0 20100604 [gcc-4_5-branch revision 160292]]
> Operating system: posix linux2
>
> What version of expat do you have? Try:
>
> $ python
> Python 2.6.5 (r265:79063, Apr 16 2010, 13:09:56)
> [GCC 4.4.3] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> from xml.parsers import expat
> >>> print expat.__version__
> $Revision: 17640 $

$Revision: 1.1 $

>
> Do you fancy trying gdb to get a stack trace for us?

How shall I understand your question? Shall I use the gnu debugger in order
to get some debuggable output?

What is the worst case scenario related to biopython, i.e. could it
ultimately lead to any errors/instabilities?


Cheers,
Paul

>
> I've had a quick Google, and the following issue *might* be
> related: http://bugs.python.org/issue4877
>
> Peter
> _______________________________________________
> Biopython mailing list  -  Biopython at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython




From p.j.a.cock at googlemail.com  Wed May  4 14:13:47 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 4 May 2011 15:13:47 +0100
Subject: [Biopython] Antwort: Re: Antwort: Re: Antwort: Re: Antwort: Re:
 Antwort: Re: installation as non-administrator
In-Reply-To: <OF5BE89F6E.17E09B10-ONC1257886.004A5D76-C1257886.004AC62C@merck.de>
References: <BANLkTim_YYfgFpgtzKdPT-2SJXjXFZY83Q@mail.gmail.com>
	<OF5BE89F6E.17E09B10-ONC1257886.004A5D76-C1257886.004AC62C@merck.de>
Message-ID: <BANLkTi=pMSuYnAGBhPoP85Q53pQ25Bfhiw@mail.gmail.com>

On Wed, May 4, 2011 at 2:36 PM,  <Paul.Czodrowski at merck.de> wrote:
>>
>> Do you fancy trying gdb to get a stack trace for us?
>
> How shall I understand your question? Shall I use the gnu debugger
> in order to get some debuggable output?

Yes please.

With hindsight, "Could you try using the gnu debugger (gdb) to get
a stack trace?" would have been clearer. Are you familiar with gdb?
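As a complement to gdb, newer Python versions (3.3 and later, so not the Python 2.6 used in this thread) ship a standard-library faulthandler module that dumps the Python-level traceback when the interpreter receives a signal such as SIGSEGV. It does not show the C stack, so gdb is still needed for that. A minimal sketch, with the hypothetical helper name enable_crash_traceback:

```python
import faulthandler


def enable_crash_traceback(log_path):
    """Ask Python to dump a traceback to log_path on SIGSEGV and similar signals.

    The returned file object must stay open for as long as the handler
    is enabled, because faulthandler writes to it at crash time.
    """
    log_file = open(log_path, "w")
    faulthandler.enable(file=log_file)
    return log_file


if __name__ == "__main__":
    log = enable_crash_traceback("crash_traceback.log")
    # ... run the code that crashes here, e.g. the failing test ...
```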

Was it the "Do you fancy *activity*?" phrasing that was unclear?
Basically meaning "Would you like to do *activity*?".

> What is the worst case scenario related to biopython, i.e. could it
> ultimately lead to any errors/instabilities?

It looks like if you tried to use Biopython's Bio.Entrez module to
parse XML files from the NCBI it would crash. If you are not going
to use that module, you should be fine.
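If the affected code has to be run anyway, one defensive option is to run it in a child process so that a crash does not take the calling program down with it. This is only a sketch of that idea using the standard subprocess module; the function name run_isolated is hypothetical.

```python
import subprocess
import sys


def run_isolated(code):
    """Run a Python snippet in a child process.

    Returns (returncode, stdout). On POSIX a negative return code means the
    child was killed by a signal, for example -11 for a segmentation fault.
    """
    proc = subprocess.Popen([sys.executable, "-c", code],
                            stdout=subprocess.PIPE,
                            universal_newlines=True)
    stdout, _ = proc.communicate()
    return proc.returncode, stdout


if __name__ == "__main__":
    returncode, output = run_isolated("print('parsed fine')")
    if returncode < 0:
        print("child crashed with signal %d" % -returncode)
    else:
        print(output.strip())
```

The snippet passed in would contain the parsing call; the parent only inspects the return code and whatever the child printed.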

Peter


From Paul.Czodrowski at merck.de  Wed May  4 14:25:23 2011
From: Paul.Czodrowski at merck.de (Paul.Czodrowski at merck.de)
Date: Wed, 4 May 2011 16:25:23 +0200
Subject: [Biopython] Antwort: Re: Antwort: Re: Antwort: Re: Antwort: Re:
 Antwort: Re: Antwort: Re: installation as non-administrator
In-Reply-To: <BANLkTi=pMSuYnAGBhPoP85Q53pQ25Bfhiw@mail.gmail.com>
Message-ID: <OF94F7B8BE.1990457F-ONC1257886.004F0E16-C1257886.004F3B27@merck.de>

                                                                           

Peter Cock <p.j.a.cock at googlemail.com> wrote on 04.05.2011 16:13:47:

> On Wed, May 4, 2011 at 2:36 PM,  <Paul.Czodrowski at merck.de> wrote:
> >>
> >> Do you fancy trying gdb to get a stack trace for us?
> >
> > How shall I understand your question? Shall I use the gnu debugger
> > in order to get some debuggable output?
>
> Yes please.
>
> With hindsight, "Could you try using the gnu debugger (gdb) to get
> a stack trace?" would have been clearer. Are you familiar with gdb?
>
> Was it the "Do you fancy *activity*?" phrasing that was unclear?
> Basically meaning "Would you like to do *activity*?".

Yes, it was just the expression you used. I have to admit that English is
not my mother tongue.

>
> > What is the worst case scenario related to biopython, i.e. could it
> > ultimately lead to any errors/instabilities?
>
> It looks like if you tried to use Biopython's Bio.Entrez module to
> parse XML files from the NCBI it would crash. If you are not going
> to use that module, you should be fine.

Good news, thanks :)

And thanks for all the other help, also to JOAO!!


Cheers,
Paul

>
> Peter


This message and any attachment are confidential and may be privileged or
otherwise protected from disclosure. If you are not the intended recipient,
you must not copy this message or attachment or disclose the contents to
any other person. If you have received this transmission in error, please
notify the sender immediately and delete the message and any attachment
from your system. Merck KGaA, Darmstadt, Germany and any of its
subsidiaries do not accept liability for any omissions or errors in this
message which may arise as a result of E-Mail-transmission or for damages
resulting from any unauthorized changes of the content of this message and
any attachment thereto. Merck KGaA, Darmstadt, Germany and any of its
subsidiaries do not guarantee that this message is free of viruses and does
not accept liability for any damages caused by any virus transmitted
therewith.

Click http://disclaimer.merck.de to access the German, French, Spanish and
Portuguese versions of this disclaimer.


From Paul.Czodrowski at merck.de  Tue May 10 07:50:23 2011
From: Paul.Czodrowski at merck.de (Paul.Czodrowski at merck.de)
Date: Tue, 10 May 2011 09:50:23 +0200
Subject: [Biopython] PDB parsing
Message-ID: <OF13FB6CF9.43C8C805-ONC125788C.002A7B07-C125788C.002B113B@merck.de>


Dear folks,

How do I add a B-factor as well as an occupancy column to a PDB file?

I guess Bio.PDB is the appropriate module,
but I am already failing at simply loading a PDB file...


Cheers,
Paul



From anaryin at gmail.com  Tue May 10 08:30:04 2011
From: anaryin at gmail.com (=?UTF-8?Q?Jo=C3=A3o_Rodrigues?=)
Date: Tue, 10 May 2011 10:30:04 +0200
Subject: [Biopython] PDB parsing
In-Reply-To: <OF13FB6CF9.43C8C805-ONC125788C.002A7B07-C125788C.002B113B@merck.de>
References: <OF13FB6CF9.43C8C805-ONC125788C.002A7B07-C125788C.002B113B@merck.de>
Message-ID: <BANLkTikXPx1FkAgkoez2UzY0O2O+H1AfyQ@mail.gmail.com>

Hey Paul,

When you parse a PDB file with PDBParser it automatically retrieves both
B-factor and occupancy. If it fails to do so for any reason, it defaults
those values to 0.
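
For orientation, here is a small plain-Python sketch of where these two values live in a PDB file. In the fixed-column PDB format, the occupancy sits in columns 55-60 and the B-factor (temperature factor) in columns 61-66 of each ATOM/HETATM record. The helper name `read_occupancy_bfactor` and the sample record are made up for illustration, and the fallback to 0.0 only mirrors the default described above; it is not PDBParser's actual code.

```python
def read_occupancy_bfactor(line, default=0.0):
    """Return (occupancy, bfactor) from one ATOM/HETATM record.

    Columns are 1-based in the PDB format description, so columns 55-60
    and 61-66 become the Python slices [54:60] and [60:66].
    """
    def to_float(field):
        try:
            return float(field)
        except ValueError:
            # Blank or malformed field: fall back to the default.
            return default

    return to_float(line[54:60]), to_float(line[60:66])


# Hypothetical sample record (backbone nitrogen of an alanine).
sample = "ATOM      1  N   ALA A   1      11.104   6.134  -6.504  1.00 25.30           N"

occupancy, bfactor = read_occupancy_bfactor(sample)
print(occupancy, bfactor)

# A record that stops after the coordinates has neither column.
truncated = read_occupancy_bfactor(sample[:54])
print(truncated)
```

So "adding" the two columns really means making sure those two fixed-width fields are filled in when the file is written.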

After parsing, you can set those values explicitly by modifying the
corresponding attribute of the Atom object. So, for example, to change the
B-factor of all your atoms to 10.0, you just have to do:

for atom in structure.get_atoms():
    atom.bfactor = 10.0
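
A slightly fuller sketch of the same idea, assuming Biopython is installed. It parses an invented two-atom toy structure from an in-memory handle and then sets both the B-factor and the occupancy on every atom. `set_bfactor()` and `set_occupancy()` are the Atom setter methods; assigning to `atom.bfactor` and `atom.occupancy` directly has the same effect. The structure id "toy" and the coordinates are illustrative assumptions.

```python
from io import StringIO

from Bio.PDB import PDBParser

# Toy input: two atoms of one alanine residue, coordinates invented.
PDB_TEXT = (
    "ATOM      1  N   ALA A   1      11.104   6.134  -6.504  1.00  0.00           N\n"
    "ATOM      2  CA  ALA A   1      11.639   6.071  -5.147  1.00  0.00           C\n"
    "END\n"
)

parser = PDBParser(PERMISSIVE=1)
structure = parser.get_structure("toy", StringIO(PDB_TEXT))

for atom in structure.get_atoms():
    atom.set_bfactor(10.0)    # same as: atom.bfactor = 10.0
    atom.set_occupancy(0.5)   # same as: atom.occupancy = 0.5

# Collect (name, B-factor, occupancy) for every atom to check the result.
values = [(a.get_name(), a.get_bfactor(), a.get_occupancy())
          for a in structure.get_atoms()]
print(values)
```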

Hope this answered your question.

Cheers,

João [...] Rodrigues
http://nmr.chem.uu.nl/~joao




From Paul.Czodrowski at merck.de  Tue May 10 09:19:54 2011
From: Paul.Czodrowski at merck.de (Paul.Czodrowski at merck.de)
Date: Tue, 10 May 2011 11:19:54 +0200
Subject: [Biopython] Antwort: Re:  PDB parsing
In-Reply-To: <BANLkTikXPx1FkAgkoez2UzY0O2O+H1AfyQ@mail.gmail.com>
Message-ID: <OF4766FF65.603639F6-ONC125788C.00314D66-C125788C.0033431F@merck.de>

Dear Joao,

this one does not work:
"

structure_id = "1234"
PDBFILE = open(filename,'r').read()
p = PDBParser(PERMISSIVE=1)
p._parse(PDBFILE)
pp = p.get_structure(structure_id, PDBFILE)


for atom in pp.get_atoms():
 atom.bfactor = 10.0
 print atom.bfactor
"


"p.get_structure(structure_id, PDBFILE)" seems to get the structural data,
but setting the bfactor does not give any output.


Cheers & Thanks,
Paul





From anaryin at gmail.com  Tue May 10 09:27:37 2011
From: anaryin at gmail.com (=?UTF-8?Q?Jo=C3=A3o_Rodrigues?=)
Date: Tue, 10 May 2011 11:27:37 +0200
Subject: [Biopython] Antwort: Re: PDB parsing
In-Reply-To: <OF4766FF65.603639F6-ONC125788C.00314D66-C125788C.0033431F@merck.de>
References: <BANLkTikXPx1FkAgkoez2UzY0O2O+H1AfyQ@mail.gmail.com>
	<OF4766FF65.603639F6-ONC125788C.00314D66-C125788C.0033431F@merck.de>
Message-ID: <BANLkTinNk59YFkaGv8p6kfiF+YkYNrFfhw@mail.gmail.com>

Hey Paul,

First of all, you should not call _parse yourself; get_structure() already
calls it for you. In general, a method whose name starts with an underscore
is meant for internal use, so it shouldn't be called unless you really know
what you want to do with it.

What version of Biopython are you using?

I'd do this:

from Bio.PDB import PDBParser

structure_id = "1234"
PDBFILE = open(filename, 'r')  # pass the open handle, not the result of .read()
p = PDBParser(PERMISSIVE=1)
pp = p.get_structure(structure_id, PDBFILE)

for atom in pp.get_atoms():
    atom.bfactor = 10.0
    print(atom.bfactor)

It works pretty well here, with version 1.57.
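
To illustrate the difference between the two versions: `get_structure()` takes either a file name or an open file handle as its second argument, not the text of the file. The sketch below writes an invented two-atom file to a temporary location and loads it both ways; handing over the file contents instead gets interpreted as a (nonexistent) file name. The temporary file and its contents are assumptions made for the demo.

```python
import os
import tempfile

from Bio.PDB import PDBParser

# Toy input: two atoms of one alanine residue, coordinates invented.
PDB_TEXT = (
    "ATOM      1  N   ALA A   1      11.104   6.134  -6.504  1.00  0.00           N\n"
    "ATOM      2  CA  ALA A   1      11.639   6.071  -5.147  1.00  0.00           C\n"
    "END\n"
)

# Write the toy file somewhere disposable.
with tempfile.NamedTemporaryFile("w", suffix=".pdb", delete=False) as tmp:
    tmp.write(PDB_TEXT)
    path = tmp.name

parser = PDBParser(PERMISSIVE=1)

# 1) Pass the file name.
from_path = parser.get_structure("1234", path)

# 2) Pass an open handle.
with open(path) as handle:
    from_handle = parser.get_structure("1234", handle)

# 3) Passing the contents is taken as a file name, so opening it fails.
try:
    parser.get_structure("1234", PDB_TEXT)
    contents_failed = False
except (IOError, OSError):
    contents_failed = True

os.remove(path)

n_path = len(list(from_path.get_atoms()))
n_handle = len(list(from_handle.get_atoms()))
print(n_path, n_handle, contents_failed)
```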

Cheers,

João [...] Rodrigues
http://nmr.chem.uu.nl/~joao




From Paul.Czodrowski at merck.de  Tue May 10 09:32:33 2011
From: Paul.Czodrowski at merck.de (Paul.Czodrowski at merck.de)
Date: Tue, 10 May 2011 11:32:33 +0200
Subject: [Biopython] Antwort: Re:  Antwort: Re: PDB parsing
In-Reply-To: <BANLkTinNk59YFkaGv8p6kfiF+YkYNrFfhw@mail.gmail.com>
Message-ID: <OFCA9D82C3.3D436E85-ONC125788C.00344BAC-C125788C.00346BAC@merck.de>

Dear João,


cool, thank you very much so far!

How do I output the newly generated PDB file?

Cheers & thanks,
Paul





From anaryin at gmail.com  Tue May 10 09:38:23 2011
From: anaryin at gmail.com (=?UTF-8?Q?Jo=C3=A3o_Rodrigues?=)
Date: Tue, 10 May 2011 11:38:23 +0200
Subject: [Biopython] Antwort: Re: Antwort: Re: PDB parsing
In-Reply-To: <OFCA9D82C3.3D436E85-ONC125788C.00344BAC-C125788C.00346BAC@merck.de>
References: <BANLkTinNk59YFkaGv8p6kfiF+YkYNrFfhw@mail.gmail.com>
	<OFCA9D82C3.3D436E85-ONC125788C.00344BAC-C125788C.00346BAC@merck.de>
Message-ID: <BANLkTimPyKubjCJCEzXY_ssdYMzpnSoy8A@mail.gmail.com>

Use PDBIO.

from Bio.PDB import PDBIO
IO = PDBIO()
IO.set_structure(your_structure)
IO.save(output_filename)

You can also control which parts of the structure to output with Select.
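
As a rough sketch of the Select mechanism: subclass `Select`, override one of its `accept_*` methods, and pass an instance as the second argument to `save()`. The subclass name `CAOnly`, the invented two-atom input and the in-memory output handle are illustrative assumptions.

```python
from io import StringIO

from Bio.PDB import PDBIO, PDBParser, Select

# Toy input: two atoms of one alanine residue, coordinates invented.
PDB_TEXT = (
    "ATOM      1  N   ALA A   1      11.104   6.134  -6.504  1.00  0.00           N\n"
    "ATOM      2  CA  ALA A   1      11.639   6.071  -5.147  1.00  0.00           C\n"
    "END\n"
)


class CAOnly(Select):
    """Keep only alpha-carbon atoms when writing."""

    def accept_atom(self, atom):
        return atom.get_name() == "CA"


structure = PDBParser(PERMISSIVE=1).get_structure("toy", StringIO(PDB_TEXT))

writer = PDBIO()
writer.set_structure(structure)

out = StringIO()              # a file name works here as well
writer.save(out, CAOnly())    # only atoms accepted by CAOnly are written
pdb_out = out.getvalue()
print(pdb_out)
```

Overriding `accept_chain()` or `accept_residue()` in the same way filters at a coarser level.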

Check the documentation
(http://www.biopython.org/DIST/docs/cookbook/biopdb_faq.pdf),
it will help you progress much faster :)

Cheers,

João [...] Rodrigues
http://nmr.chem.uu.nl/~joao




From Paul.Czodrowski at merck.de  Tue May 10 11:05:50 2011
From: Paul.Czodrowski at merck.de (Paul.Czodrowski at merck.de)
Date: Tue, 10 May 2011 13:05:50 +0200
Subject: [Biopython] Antwort: Re:  Antwort: Re: Antwort: Re: PDB parsing
In-Reply-To: <BANLkTimPyKubjCJCEzXY_ssdYMzpnSoy8A@mail.gmail.com>
Message-ID: <OF1FDFD069.9C28606B-ONC125788C.003C1775-C125788C.003CF5D1@merck.de>

Dear Joao,

thanks for your help and the documentation link!
So far, I was only aware of the tutorial at
http://biopython.org/DIST/docs/tutorial/Tutorial.html

in which PDB parsing is only briefly covered.

And, yes, progress is faster now!


Cheers,
Paul






From sainitin7 at gmail.com  Thu May 12 08:39:28 2011
From: sainitin7 at gmail.com (sai nitin)
Date: Thu, 12 May 2011 10:39:28 +0200
Subject: [Biopython] Problem in accessing pcassay database
Message-ID: <BANLkTim4Yh_DuBjuBOA9aUszsZEgXHdNuA@mail.gmail.com>

Hi all,

I am new to Biopython and want to access the pcassay database
programmatically. The exact issue is described below.

--- I have a list of Bioassay AIDs and want to retrieve all their names.
I tried esummary to do this, but it gives an error.
I also tried efetch, but didn't succeed.

Can anybody tell me a possible solution?

Thanks

-- 

Sainitin D


From p.j.a.cock at googlemail.com  Thu May 12 09:15:37 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Thu, 12 May 2011 10:15:37 +0100
Subject: [Biopython] Problem in accessing pcassay database
In-Reply-To: <BANLkTim4Yh_DuBjuBOA9aUszsZEgXHdNuA@mail.gmail.com>
References: <BANLkTim4Yh_DuBjuBOA9aUszsZEgXHdNuA@mail.gmail.com>
Message-ID: <BANLkTinv=45jL_e--=Cj3a4CjpMUPo8v-Q@mail.gmail.com>

On Thu, May 12, 2011 at 9:39 AM, sai nitin <sainitin7 at gmail.com> wrote:
> Hi all,
>
> I am new to Biopython and want to access the pcassay database
> programmatically. The exact issue is described below.
>
> I have a list of BioAssay AIDs and want to retrieve all of their names. I
> tried esummary for this, but it gives an error. I also tried efetch, without
> success.
>
> Can anybody suggest a possible solution?
>
> Thanks

Hi,

Can you do this by hand? Which website would you use? If NCBI Entrez,
then it should be possible using Biopython's Bio.Entrez module.

Could you give an example, say two Bioassay AIDs, and the expected
results (e.g. URLs to NCBI webpage).

Peter


From p.j.a.cock at googlemail.com  Thu May 12 19:04:45 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Thu, 12 May 2011 20:04:45 +0100
Subject: [Biopython] Problem in accessing pcassay database
In-Reply-To: <BANLkTim1p32E=RnLGqoVRikv+-0E7tdAjw@mail.gmail.com>
References: <BANLkTim4Yh_DuBjuBOA9aUszsZEgXHdNuA@mail.gmail.com>
	<BANLkTinv=45jL_e--=Cj3a4CjpMUPo8v-Q@mail.gmail.com>
	<BANLkTim1p32E=RnLGqoVRikv+-0E7tdAjw@mail.gmail.com>
Message-ID: <BANLkTi=aXruQWV-gcM=CtQJjQycL6g_h_A@mail.gmail.com>

Please CC the mailing list on any reply.

On Thu, May 12, 2011 at 6:59 PM, sai nitin <sainitin7 at gmail.com> wrote:
> Hi Peter,
> Thanks for the reply. Yes, I tried the Bio.Entrez module (Biopython). Let me
> explain the issue more clearly. Say I have an AID as follows:
> 1. AID: 504582. I want to retrieve the Description section details from this URL
> (http://pubchem.ncbi.nlm.nih.gov/assay/assay.cgi?aid=504582&loc=ea_ras)
> I have 20-30 AIDs like this and want to do this for all of them.
> Any suggestions would be a great help.
> Thanks,
> Sainitin

If you look on the page you linked to, notice AID 504582 is itself a
link to Entrez,
http://www.ncbi.nlm.nih.gov/sites/entrez?cmd=search&db=pcassay&term=504582

So, I would expect an Entrez search for 504582 in the pcassay database
to work. Trying this by hand on the NCBI Entrez website works fine, so
from Biopython you could do the same search with
Entrez.esearch(db="pcassay", term="504582")
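
A minimal sketch of the full round trip for a list of AIDs, assuming a
current Biopython on Python 3 and network access. The helper names
(join_ids, find_uids, summarize) and the e-mail address are placeholders,
not anything from this thread. The field names in a pcassay summary are
not given here, so the sketch only lists whatever keys come back:

```python
def join_ids(ids):
    """Build the comma-separated ID string that the E-utilities accept."""
    return ",".join(str(i).strip() for i in ids)


def find_uids(term, email):
    """Run the esearch suggested above and return the matching UIDs."""
    # Imported here so join_ids() stays usable without Biopython installed.
    from Bio import Entrez

    Entrez.email = email  # NCBI asks for a contact address
    handle = Entrez.esearch(db="pcassay", term=term)
    try:
        return list(Entrez.read(handle)["IdList"])
    finally:
        handle.close()


def summarize(uids, email):
    """Fetch esummary records for the given pcassay UIDs."""
    from Bio import Entrez

    Entrez.email = email
    handle = Entrez.esummary(db="pcassay", id=join_ids(uids))
    try:
        return Entrez.read(handle)
    finally:
        handle.close()


def main():
    email = "you@example.org"  # placeholder address
    uids = find_uids("504582", email)  # the AID from this thread
    for record in summarize(uids, email):
        # Field names differ between databases, so look before indexing.
        print(sorted(record.keys()))
```

Calling main() needs network access. A bare search term may match more
than one record, so it is worth checking the returned UIDs against the
AIDs you asked for.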

Peter


From mictadlo at gmail.com  Sun May 15 05:35:07 2011
From: mictadlo at gmail.com (Michal)
Date: Sun, 15 May 2011 15:35:07 +1000
Subject: [Biopython] multiprocessing problem with pysam
In-Reply-To: <20110412013119.GF2053@kunkel>
References: <4DA1137E.1090803@gmail.com> <20110410111510.GA2634@kunkel>
	<4DA2EC9D.7040004@gmail.com> <20110412013119.GF2053@kunkel>
Message-ID: <4DCF660B.30309@gmail.com>

Hello,
Thank you Brad. I have written the following new code:

import re
import os
import pysam
from pprint import pprint
from multiprocessing import Pool


class Test():

     def __init__(self, bam_filename, cultivars):
         self.__bam_fh = pysam.Samfile(bam_filename, "rb")
         self.__cultivars = cultivars

     def run(self, ref_name):
         print os.getpid(), ref_name, self.__cultivars
         return (os.getpid(), ref_name)


if __name__ == '__main__':
     cultivars = 'Ja,Ea,As'.replace(' ', '').split(',')
     bam_filename = "/media/usb/tests/test.bam"

     bamfile = pysam.Samfile(bam_filename, "rb")

     ref_names = bamfile.references
     ref_lengths = bamfile.lengths
     bamfile.close()

#    for ref_name in ref_names:
#        Test(bam_filename, cultivars).run(ref_names)


     pool = Pool()
     results = dict(pool.imap_unordered(
         Test(bam_filename, cultivars).run, ref_names))
     pool.close()
     pool.join()
     pprint(results)


and got the following error:

Exception in thread Thread-2:
Traceback (most recent call last):
   File "/home/mictadlo/apps/python/lib/python2.7/threading.py", line 530, in __bootstrap_inner
     self.run()
   File "/home/mictadlo/apps/python/lib/python2.7/threading.py", line 483, in run
     self.__target(*self.__args, **self.__kwargs)
   File "/home/mictadlo/apps/python/lib/python2.7/multiprocessing/pool.py", line 285, in _handle_tasks
     put(task)
PicklingError: Can't pickle <type 'instancemethod'>: attribute lookup __builtin__.instancemethod failed


I have searched and found two possible solutions for this problem:
* http://www.doughellmann.com/PyMOTW/multiprocessing/communication.html
* http://www.rueckstiess.net/research/snippets/show/ca1d7d90

However, is there a better way to solve it, or are the above solutions
good enough?

Thank you in advance.

Michal


From chapmanb at 50mail.com  Sun May 15 15:53:46 2011
From: chapmanb at 50mail.com (Brad Chapman)
Date: Sun, 15 May 2011 11:53:46 -0400
Subject: [Biopython] multiprocessing problem with pysam
In-Reply-To: <4DCF660B.30309@gmail.com>
References: <4DA1137E.1090803@gmail.com> <20110410111510.GA2634@kunkel>
	<4DA2EC9D.7040004@gmail.com> <20110412013119.GF2053@kunkel>
	<4DCF660B.30309@gmail.com>
Message-ID: <20110515155346.GD2530@kunkel>

Michal;

[multiprocessing]
> class Test():
>     def __init__(self, bam_filename, cultivars):
>         self.__bam_fh = pysam.Samfile(bam_filename, "rb")
>         self.__cultivars = cultivars
> 
>     def run(self, ref_name):
>         print os.getpid(), ref_name, self.__cultivars
>         return (os.getpid(), ref_name)
[...]
>     pool = Pool()
>     results = dict(pool.imap_unordered(
>         Test(bam_filename, cultivars).run, ref_names))
[...]
> and got the following error:
> 
> Exception in thread Thread-2:
[...]
> PicklingError: Can't pickle <type 'instancemethod'>: attribute
> lookup __builtin__.instancemethod failed

multiprocessing has to pickle whatever it sends to the worker
processes, and in Python 2 a bound method such as Test(...).run cannot
be pickled, hence the PicklingError. My suggestion is to use plain
functions without associated state attributes and pass in your
information as standard python objects (strings, lists, dicts). I use
a little decorator to make writing the functions passed easier:

import functools
def map_wrap(f):
    @functools.wraps(f)
    def wrapper(*args, **kwargs):
        return apply(f, *args, **kwargs)
    return wrapper

Then you would write your function as:

@map_wrap
def run_test(bam_filename, cultivars, ref_name):
    bam_fh = pysam.Samfile(bam_filename, "rb")
    print os.getpid(), ref_name, cultivars
    return (os.getpid(), ref_name)

and call it with:

cultivars = 'Ja,Ea,As'.replace(' ', '').split(',')
bam_filename = "/media/usb/tests/test.bam"
bamfile = pysam.Samfile(bam_filename, "rb")
ref_names = bamfile.references
bamfile.close()

pool = Pool()
results = dict(pool.imap(run_test, ((bam_filename, cultivars, ref)
                                    for ref in ref_names)))
pool.close()
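
For readers on Python 3, where apply() no longer exists, the same idea
can be sketched without a decorator. This assumes Python 3.3 or later,
where Pool.starmap is available. The BAM handling is left as a comment so
the snippet runs without pysam, the reference names are made up, and the
results are keyed by reference name rather than by PID (a change from the
code above, since one worker can process several references):

```python
import os
from multiprocessing import Pool


def run_test(bam_filename, cultivars, ref_name):
    """Process one reference; this runs inside a worker process."""
    # Open the BAM file here, inside the worker, instead of sharing a handle:
    #     bam_fh = pysam.Samfile(bam_filename, "rb")
    return (ref_name, os.getpid())


def build_tasks(bam_filename, cultivars, ref_names):
    """One argument tuple per reference, made only of picklable objects."""
    return [(bam_filename, cultivars, ref) for ref in ref_names]


def main():
    cultivars = "Ja,Ea,As".split(",")
    # "ref1" and "ref2" are hypothetical reference names.
    tasks = build_tasks("/media/usb/tests/test.bam", cultivars, ["ref1", "ref2"])
    with Pool() as pool:
        # starmap unpacks each tuple into run_test's positional arguments.
        results = dict(pool.starmap(run_test, tasks))
    print(results)
```

Call main() from under an `if __name__ == '__main__':` guard, as in the
original script, so that worker processes can import the module safely.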

Hope this helps,
Brad


From aradwen at gmail.com  Wed May 18 15:28:25 2011
From: aradwen at gmail.com (Radhouane Aniba)
Date: Wed, 18 May 2011 11:28:25 -0400
Subject: [Biopython] Snippets Sharing
Message-ID: <BANLkTikEPiHafURBdsEo2rtxmsFxtKRs5A@mail.gmail.com>

Hi guys,

I apologize if this mail sounds like an ad; please consider it just an
announcement.

I just wanted you to be aware of the changes that have occurred at
biocoders.net.

We restructured it to be an online collaboration tool for bioinformatics:
you can create groups for your projects, interact with other users, upload
snippets and software packages that you find useful, discuss the latest
topics in bioinformatics, find the newest jobs (we partner with the
simplyhired job board) and much more.

I am not writing an extended mail so that you don't feel spammed; that is
not my goal. Just come and explore the new biocoders.net.

Cheers,

Radhouane


From p.j.a.cock at googlemail.com  Wed May 18 20:42:02 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 18 May 2011 21:42:02 +0100
Subject: [Biopython] gff3 problem
In-Reply-To: <20110408121041.GM20963@sobchak>
References: <4D9B0A6D.3040608@gmail.com> <20110405132247.GA20523@sobchak>
	<4D9DB3F4.30107@gmail.com>
	<BANLkTinEjy97gKYUPY_1it1zhLOj6sR+nw@mail.gmail.com>
	<BANLkTikDd_K6LTEYWZHmBSKsGA5aiX2msA@mail.gmail.com>
	<EA39C938-FB7B-4808-8B01-AA2D71504080@hutton.ac.uk>
	<BANLkTim2rv4xjQ8dBkq+Zjjom2ys575c4Q@mail.gmail.com>
	<20110408121041.GM20963@sobchak>
Message-ID: <BANLkTinTuOzNd6JkmxQte1jA=m=S4jD8GA@mail.gmail.com>

On Fri, Apr 8, 2011 at 1:10 PM, Brad Chapman <chapmanb at 50mail.com> wrote:
> Leighton and Peter;
>
>> > Just to further complicate matters, the symbol convention for GFF3 differs
>> > from Biopython in terms of the categories it defines:
>> > + is positive strand
>> > - is negative strand
>> > . is not stranded (i.e. strand not relevant)
>> > ? is strand relevant, but not known
>> > http://www.sequenceontology.org/gff3.shtml
>
> Yes, although this strikes me a bit like fuzzy features in terms of
> usefulness.
>
>> > The latter two are distinct, but not distinguished by convention in
>> > Biopython:
>> > The obvious (to me) mapping of the four allowed Biopython symbols to the
>> > GFF3 convention is:
>> > +1 -> +
>> > -1 -> -
>> > None -> .
>> > 0 -> ?
>> > because 'None' is semantically close to 'has no strand information of
>> > consequence', and 0 is the mean of +1 and -1 ;)
>
> That's fine by me. Right now both '?' and '.' are converted to None
> so I lose the subtle distinction GFF is introducing:
>
> strand_map = {'+' : 1, '-' : -1, '?' : None, None: None}
>
> If everyone agrees on that coding it's no problem to swap it over.
> Brad

So was the consensus that we should reword the Bio.SeqFeature
docstring to say the four valid values for strand are (with GFF3
equivalents in brackets):

+1 = Forward (+ in GFF3)
-1 = Reverse (- in GFF3)
0 = Not stranded (. in GFF3)
None = Unknown (? in GFF3)

And should features on a protein sequence then have strand 0?

Peter


From hxcan at stupidbeauty.com  Thu May 19 05:00:37 2011
From: hxcan at stupidbeauty.com (=?GB2312?B?ssy78Mqk?=)
Date: Thu, 19 May 2011 13:00:37 +0800
Subject: [Biopython] missing dtd file
Message-ID: <4DD4A3F5.8020406@stupidbeauty.com>

An HTML attachment was scrubbed...
URL: <http://lists.open-bio.org/pipermail/biopython/attachments/20110519/d6b242dc/attachment-0002.html>

From p.j.a.cock at googlemail.com  Thu May 19 07:57:17 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Thu, 19 May 2011 08:57:17 +0100
Subject: [Biopython] missing dtd file
In-Reply-To: <4DD4A3F5.8020406@stupidbeauty.com>
References: <4DD4A3F5.8020406@stupidbeauty.com>
Message-ID: <BANLkTi=Fv7iLowfmPwRikkSjxi2M7on-kA@mail.gmail.com>

2011/5/19 蔡火胜 <hxcan at stupidbeauty.com>:
> Hello
>
>
> Entrez module gives this warning:
>
> /usr/lib/python2.6/site-packages/Bio/Entrez/Parser.py:495: UserWarning:
> Unable to load DTD file eLink_101123.dtd.
>
> Bio.Entrez uses NCBI's DTD files to parse XML files ...
>
> For this purpose, please download eLink_101123.dtd from
>
> http://www.ncbi.nlm.nih.gov/entrez/query/DTD/eLink_101123.dtd
>
> ...

Thank you for alerting us, that file will be included in our next release.

Could you update your copy of Biopython successfully?

Peter


From esa.aalto at oulu.fi  Thu May 19 13:02:17 2011
From: esa.aalto at oulu.fi (Esa Aalto)
Date: Thu, 19 May 2011 16:02:17 +0300
Subject: [Biopython] An error with Concatenate nexus
Message-ID: <3C36433088B0FF4B834B351A67C98111E6F721@KEKO.univ.yo.oulu.fi>

Dear group,

I'm trying to concatenate 20 nexus files with the instructions given
here:

http://www.biopython.org/wiki/Concatenate_nexus

but it doesn't work:

Traceback (most recent call last):
  File "C:\Python27\concate_nexus.py", line 36, in <module>
    nexi =  [(handle.name, Nexus.Nexus(handle)) for handle in handles]
  File "C:\Python27\lib\site-packages\Bio\Nexus\Nexus.py", line 555, in __init__
    self.read(input)
  File "C:\Python27\lib\site-packages\Bio\Nexus\Nexus.py", line 618, in read
    self._parse_nexus_block(title, contents)
  File "C:\Python27\lib\site-packages\Bio\Nexus\Nexus.py", line 659, in _parse_nexus_block
    getattr(self,'_'+line.command)(line.options)
  File "C:\Python27\lib\site-packages\Bio\Nexus\Nexus.py", line 1021, in _codonposset
    raise NexusError('Formatting Error in codonposset: %s ' % options)
NexusError: Formatting Error in codonposset: * UNTITLED = 1: 1-577\3, 2: 2-578\3, 3: 3-579\3

The end of the first of my nex files looks like this:

BEGIN SETS;
   TaxSet A_thaliana = 1;
   TaxSet A_lyrata = 2;
   TaxSet Boh = 3-32;
   TaxSet Ice = 33-60;
   TaxSet Ith = 61-92;
   TaxSet Kar = 93-124;
   TaxSet Lom = 125-156;
   TaxSet NC = 157-196;
   TaxSet Pl = 197-236;
   TaxSet Sp = 237-274;
   TaxSet Stu = 275-294;
   TaxSet South = 3-32 197-236;
   TaxSet North = 125-156 237-274;
   TaxSet lyrata = 2-294;
END;

BEGIN CODONS; 
   CODONPOSSET * UNTITLED = 
      1: 1-577\3,
      2: 2-578\3,
      3: 3-579\3;
   CODESET * UNTITLED = Universal: all;
END;

BEGIN CODONUSAGE;
END;

BEGIN DnaSP;
   Genome= Diploid;
   ChromosomalLocation= Autosome;
   VariationType= DNA_Seq_Pol;
   Species= ---;
   ChromosomeName= ---;
   GenomicPosition= 1;
   GenomicAssembly= ---;
   DnaSPversion= Ver. 5.10.00;
END;

Could someone tell me what's wrong here? Is it my nexus files or something
in the code?

Thanks for your help!

Esa Aalto


From cy at cymon.org  Thu May 19 14:30:36 2011
From: cy at cymon.org (Cymon Cox)
Date: Thu, 19 May 2011 15:30:36 +0100
Subject: [Biopython] An error with Concatenate nexus
In-Reply-To: <3C36433088B0FF4B834B351A67C98111E6F721@KEKO.univ.yo.oulu.fi>
References: <3C36433088B0FF4B834B351A67C98111E6F721@KEKO.univ.yo.oulu.fi>
Message-ID: <BANLkTikpgu_znao_YnAa40Hzr73SHB2ybg@mail.gmail.com>

Hi Esa,

At first glance this looks like a bug.

But given that Nexus.combine() is going to discard your codonposset
character partition anyway, you could try deleting it from the Nexus file
before combining.
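
One way to do that deletion without editing 20 files by hand is sketched
below. It is a plain-text workaround, not part of Bio.Nexus: the function
name and the regular expression are illustrative, and it assumes the
CODONS block starts and ends on lines of their own, as in the file shown
in this thread.

```python
import re

# Matches from a line starting "BEGIN CODONS;" up to the next line that
# starts with "END;". "BEGIN CODONUSAGE;" is deliberately not matched.
CODONS_BLOCK = re.compile(
    r"^[ \t]*BEGIN\s+CODONS\s*;.*?^[ \t]*END\s*;[ \t]*\n?",
    re.IGNORECASE | re.DOTALL | re.MULTILINE,
)


def strip_codons_block(nexus_text):
    """Return the Nexus text with any CODONS block removed."""
    return CODONS_BLOCK.sub("", nexus_text)
```

Read each file as text, pass it through strip_codons_block(), and write
the result to a new file; then run the concatenation script on the
cleaned copies.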

Regards, Cymon



--


From fkelesh at gmail.com  Fri May 20 09:33:03 2011
From: fkelesh at gmail.com (Fatih Keles)
Date: Fri, 20 May 2011 12:33:03 +0300
Subject: [Biopython] installing biopython on mac os x 10.6
Message-ID: <BANLkTikOBK8ip=9ijR-N7fkMzh4H2w=25A@mail.gmail.com>

Hi,

I was trying to install Biopython on Mac OS X 10.6 using X11. However,
it gives this error:
"""

running install
running build
running build_py
running build_ext
building 'Bio.cpairwise2' extension
gcc-4.0 -fno-strict-aliasing -fno-common -dynamic -arch ppc -arch i386
-g -O2 -DNDEBUG -g -O3 -IBio
-I/Library/Frameworks/Python.framework/Versions/2.7/include/python2.7
-c Bio/cpairwise2module.c -o
build/temp.macosx-10.3-fat-2.7/Bio/cpairwise2module.o
unable to execute gcc-4.0: No such file or directory
error: command 'gcc-4.0' failed with exit status 1
"""

I couldn't find the problem. I would be happy if you help me.

Thanks,

keles


From p.j.a.cock at googlemail.com  Fri May 20 09:40:16 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 20 May 2011 10:40:16 +0100
Subject: [Biopython] installing biopython on mac os x 10.6
In-Reply-To: <BANLkTikOBK8ip=9ijR-N7fkMzh4H2w=25A@mail.gmail.com>
References: <BANLkTikOBK8ip=9ijR-N7fkMzh4H2w=25A@mail.gmail.com>
Message-ID: <BANLkTimNZC3F0ehZ34wz8seHLyybRYYRAQ@mail.gmail.com>

On Fri, May 20, 2011 at 10:33 AM, Fatih Keles <fkelesh at gmail.com> wrote:
> Hi,
>
> I was trying to install Biopython on mac os x 10.6 using X11. However,
> It gives this error :
> """
>
> running install
> running build
> running build_py
> running build_ext
> building 'Bio.cpairwise2' extension
> gcc-4.0 -fno-strict-aliasing -fno-common -dynamic -arch ppc -arch i386
> -g -O2 -DNDEBUG -g -O3 -IBio
> -I/Library/Frameworks/Python.framework/Versions/2.7/include/python2.7
> -c Bio/cpairwise2module.c -o
> build/temp.macosx-10.3-fat-2.7/Bio/cpairwise2module.o
> unable to execute gcc-4.0: No such file or directory
> error: command 'gcc-4.0' failed with exit status 1
> """
>
> I couldn't find the problem. I would be happy if you help me.
>
> Thanks,
>
> keles

Have you installed Apple's Xcode, the development suite that
comes with Apple's version of gcc (C compiler)? What we say
on the download page of the wiki is:

>> For Mac OS X, we recommend installing from source (see below).
>> You will need to have installed Apple's XCode tools including the
>> optional 10.4 SDK (check the option for 10.4 support when
>> installing Xcode tools).
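
A quick way to confirm this diagnosis is to ask Python which compiler it
expects and whether that program is on the PATH. The sketch below is
written for Python 3 (shutil.which does not exist in 2.7), and the
function name is made up for illustration. The build step reuses the
compiler recorded when Python itself was built, which is why the log
above asks for gcc-4.0 specifically.

```python
import shutil
import sysconfig


def build_compiler():
    """Return (name, path) for the C compiler recorded when Python was built.

    path is None when that compiler cannot be found on the PATH, which is
    the situation the "unable to execute gcc-4.0" message describes.
    """
    cc = sysconfig.get_config_var("CC")  # may be None, e.g. on Windows
    if not cc:
        return (None, None)
    name = cc.split()[0]  # drop any flags that follow the program name
    return (name, shutil.which(name))
```

If the second element is None, installing Xcode (or otherwise providing
that compiler) is the missing step.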

Peter


From chapmanb at 50mail.com  Fri May 20 11:15:35 2011
From: chapmanb at 50mail.com (Brad Chapman)
Date: Fri, 20 May 2011 07:15:35 -0400
Subject: [Biopython] gff3 problem
In-Reply-To: <BANLkTinTuOzNd6JkmxQte1jA=m=S4jD8GA@mail.gmail.com>
References: <4D9B0A6D.3040608@gmail.com> <20110405132247.GA20523@sobchak>
	<4D9DB3F4.30107@gmail.com>
	<BANLkTinEjy97gKYUPY_1it1zhLOj6sR+nw@mail.gmail.com>
	<BANLkTikDd_K6LTEYWZHmBSKsGA5aiX2msA@mail.gmail.com>
	<EA39C938-FB7B-4808-8B01-AA2D71504080@hutton.ac.uk>
	<BANLkTim2rv4xjQ8dBkq+Zjjom2ys575c4Q@mail.gmail.com>
	<20110408121041.GM20963@sobchak>
	<BANLkTinTuOzNd6JkmxQte1jA=m=S4jD8GA@mail.gmail.com>
Message-ID: <20110520111535.GC21651@sobchak>

Peter;

[SeqFeature support for not-stranded elements]
> So was the consensus that we should reword the Bio.SeqFeature
> docstring to say the four valid values for strand are (with GFF3
> equivalents in brackets):
> 
> +1 = Forward (+ in GFF3)
> -1 = Reverse (- in GFF3)
> 0 = Not stranded (. in GFF3)
> None = Unknown (? in GFF3)
> 
> And should features on a protein sequence then have strand 0?

That sounds great. I can make the corresponding change to the GFF
library. Let me know if there are any other roadblocks to
integrating that. Thanks much,
Brad


From p.j.a.cock at googlemail.com  Fri May 20 11:27:04 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 20 May 2011 12:27:04 +0100
Subject: [Biopython] gff3 problem
In-Reply-To: <20110520111535.GC21651@sobchak>
References: <4D9B0A6D.3040608@gmail.com> <20110405132247.GA20523@sobchak>
	<4D9DB3F4.30107@gmail.com>
	<BANLkTinEjy97gKYUPY_1it1zhLOj6sR+nw@mail.gmail.com>
	<BANLkTikDd_K6LTEYWZHmBSKsGA5aiX2msA@mail.gmail.com>
	<EA39C938-FB7B-4808-8B01-AA2D71504080@hutton.ac.uk>
	<BANLkTim2rv4xjQ8dBkq+Zjjom2ys575c4Q@mail.gmail.com>
	<20110408121041.GM20963@sobchak>
	<BANLkTinTuOzNd6JkmxQte1jA=m=S4jD8GA@mail.gmail.com>
	<20110520111535.GC21651@sobchak>
Message-ID: <BANLkTikS6uFUv+XnEitCpV+5ymhCygkBUw@mail.gmail.com>

On Fri, May 20, 2011 at 12:15 PM, Brad Chapman <chapmanb at 50mail.com> wrote:
> Peter;
>
> [SeqFeature support for not-stranded elements]
>> So was the consensus that we should reword the Bio.SeqFeature
>> docstring to say the four valid values for strand are (with GFF3
>> equivalents in brackets):
>>
>> +1 = Forward (+ in GFF3)
>> -1 = Reverse (- in GFF3)
>> 0 = Not stranded (. in GFF3)
>> None = Unknown (? in GFF3)
>>
>> And should features on a protein sequence then have strand 0?
>
> That sounds great. I can make the corresponding change to the GFF
> library. Let me know if there are any other roadblocks to
> integrating that. Thanks much,
> Brad

I've remembered a corner case: mixed strand features. For example, the
Arabidopsis thaliana chloroplast complete genome, AP000423
in EMBL, NC_000932 in GenBank (one of our unit test files), has a
gene with join(complement(69611..69724),139856..140650)

Clearly the child features have well defined strands (+1 and -1).
The parent feature (the join) is mixed strand. Currently our
GenBank parser uses None for this. So maybe:

+1 = Forward (+ in GFF3)
-1 = Reverse (- in GFF3)
0 = Not stranded (. in GFF3)
None = Mixed or unknown (? in GFF3)
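
Written out as a lookup table, the proposed convention looks like the
sketch below. The names are illustrative only; nothing here is part of
Bio.SeqFeature or of the GFF library discussed in this thread.

```python
# Biopython strand value -> symbol in the GFF3 strand column
STRAND_TO_GFF3 = {1: "+", -1: "-", 0: ".", None: "?"}

# Reverse lookup, built from the table above so the two cannot drift apart
GFF3_TO_STRAND = {symbol: strand for strand, symbol in STRAND_TO_GFF3.items()}


def strand_to_gff3(strand):
    """Map +1, -1, 0 or None to the GFF3 strand symbol."""
    try:
        return STRAND_TO_GFF3[strand]
    except KeyError:
        raise ValueError("strand must be +1, -1, 0 or None, got %r" % (strand,))


def gff3_to_strand(symbol):
    """Map '+', '-', '.' or '?' to the Biopython strand value."""
    try:
        return GFF3_TO_STRAND[symbol]
    except KeyError:
        raise ValueError("unknown GFF3 strand symbol %r" % (symbol,))
```

Under this scheme a mixed-strand join and a genuinely unknown strand
both come out as '?', so that distinction is lost on a round trip.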

Peter


From cjfields at illinois.edu  Fri May 20 13:24:30 2011
From: cjfields at illinois.edu (Chris Fields)
Date: Fri, 20 May 2011 08:24:30 -0500
Subject: [Biopython] gff3 problem
In-Reply-To: <BANLkTikS6uFUv+XnEitCpV+5ymhCygkBUw@mail.gmail.com>
References: <4D9B0A6D.3040608@gmail.com> <20110405132247.GA20523@sobchak>
	<4D9DB3F4.30107@gmail.com>
	<BANLkTinEjy97gKYUPY_1it1zhLOj6sR+nw@mail.gmail.com>
	<BANLkTikDd_K6LTEYWZHmBSKsGA5aiX2msA@mail.gmail.com>
	<EA39C938-FB7B-4808-8B01-AA2D71504080@hutton.ac.uk>
	<BANLkTim2rv4xjQ8dBkq+Zjjom2ys575c4Q@mail.gmail.com>
	<20110408121041.GM20963@sobchak>
	<BANLkTinTuOzNd6JkmxQte1jA=m=S4jD8GA@mail.gmail.com>
	<20110520111535.GC21651@sobchak>
	<BANLkTikS6uFUv+XnEitCpV+5ymhCygkBUw@mail.gmail.com>
Message-ID: <E092D5A9-E200-414B-AA92-B9C6578B090E@illinois.edu>

On May 20, 2011, at 6:27 AM, Peter Cock wrote:

> On Fri, May 20, 2011 at 12:15 PM, Brad Chapman <chapmanb at 50mail.com> wrote:
>> Peter;
>> 
>> [SeqFeature support for not-stranded elements]
>>> So was the consensus that we should reword the Bio.SeqFeature
>>> docstring to say the four valid values for strand are (with GFF3
>>> equivalents in brackets):
>>> 
>>> +1 = Forward (+ in GFF3)
>>> -1 = Reverse (- in GFF3)
>>> 0 = Not stranded (. in GFF3)
>>> None = Unknown (? in GFF3)
>>> 
>>> And should features on a protein sequence then have strand 0?
>> 
>> That sounds great. I can make the corresponding change to the GFF
>> library. Let me know if there are any other roadblocks to
>> integrating that. Thanks much,
>> Brad
> 
> I've remembered a corner case, mixed strand features. e.g the
> Arabidopsis thaliana chloroplast complete genome, AP000423
> in EMBL, NC_000932 in GenBank (one of our unit test files).
> e.g. gene with join(complement(69611..69724),139856..140650)
> 
> Clearly the child features have well defined strands (+1 and -1).
> The parent feature (the join) is mixed strand. Currently our
> GenBank parser uses None for this. So maybe:
> 
> +1 = Forward (+ in GFF3)
> -1 = Reverse (- in GFF3)
> 0 = Not stranded (. in GFF3)
> None = Mixed or unknown (? in GFF3)
> 
> Peter

That's essentially what bioperl does for 'split' locations (actually, I think it is just undef, which would translate to '?' for GFF3).

chris


From laserson at mit.edu  Fri May 20 21:14:32 2011
From: laserson at mit.edu (Uri Laserson)
Date: Fri, 20 May 2011 17:14:32 -0400
Subject: [Biopython] Serialize SeqRecord to JSON?
Message-ID: <BANLkTinGvFuT8NCmO8-VkvMjwEWd7qzC-g@mail.gmail.com>

Does anyone know of a solution for this?

Thanks!
Uri

...................................................................................
Uri Laserson
Graduate Student, Biomedical Engineering
Harvard-MIT Division of Health Sciences and Technology
M +1 917 742 8019
laserson at mit.edu


From mjldehoon at yahoo.com  Sat May 21 03:59:24 2011
From: mjldehoon at yahoo.com (Michiel de Hoon)
Date: Fri, 20 May 2011 20:59:24 -0700 (PDT)
Subject: [Biopython] installing biopython on mac os x 10.6
In-Reply-To: <BANLkTikOBK8ip=9ijR-N7fkMzh4H2w=25A@mail.gmail.com>
Message-ID: <782468.28393.qm@web161211.mail.bf1.yahoo.com>

Probably you don't have a C compiler installed on your computer. The easiest way to get one is to install Apple's Xcode package.

--Michiel.



From sainitin7 at gmail.com  Mon May 23 08:32:07 2011
From: sainitin7 at gmail.com (sai nitin)
Date: Mon, 23 May 2011 10:32:07 +0200
Subject: [Biopython] Problem to retreive compound names using CID from
	PubChem
Message-ID: <BANLkTimz=mv=VwmBFcnomCXjyqoqS_bg+g@mail.gmail.com>

Hi all,

My name is Sainitin. I have a list of CIDs from the PubChem database and
want to retrieve the corresponding compound names. To automate this I am
using the Biopython Entrez module (Entrez.esummary), but when I give one
CID and try to retrieve the name of the compound, an error occurs.

Code
h = Entrez.esummary(db = "pccompound",id = "449489")
r = Entrez.read(h)
r[0]["SourceName"]

Error
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'SourceName'

Can anybody help me to solve this?

Thanks
-- 

Sainitin D


From fkauff at biologie.uni-kl.de  Mon May 23 10:19:30 2011
From: fkauff at biologie.uni-kl.de (Frank Kauff)
Date: Mon, 23 May 2011 12:19:30 +0200
Subject: [Biopython] An error with Concatenate nexus
In-Reply-To: <BANLkTikpgu_znao_YnAa40Hzr73SHB2ybg@mail.gmail.com>
References: <3C36433088B0FF4B834B351A67C98111E6F721@KEKO.univ.yo.oulu.fi>
	<BANLkTikpgu_znao_YnAa40Hzr73SHB2ybg@mail.gmail.com>
Message-ID: <4DDA34B2.9010907@biologie.uni-kl.de>

Hi Esa,

Are you using an up-to-date Nexus parser? The codonposset below can be
read without problems when I copy-paste it into one of my nexus files.
Or, if you like, send me a copy of your complete nexus file for a check.

Cheers,
Frank




From chapmanb at 50mail.com  Mon May 23 10:42:56 2011
From: chapmanb at 50mail.com (Brad Chapman)
Date: Mon, 23 May 2011 06:42:56 -0400
Subject: [Biopython] Problem to retreive compound names using CID from
 PubChem
In-Reply-To: <BANLkTimz=mv=VwmBFcnomCXjyqoqS_bg+g@mail.gmail.com>
References: <BANLkTimz=mv=VwmBFcnomCXjyqoqS_bg+g@mail.gmail.com>
Message-ID: <20110523104256.GA2365@kunkel>

Sainitin;

> Code
> h = Entrez.esummary(db = "pccompound",id = "449489")
> r = Entrez.read(h)
> r[0]["SourceName"]
> 
> Error
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> KeyError: 'SourceName'
> 
> Can anybody help me to solve this

The 'r' object you've parsed from Entrez is a list of dictionaries.
The keys in each dictionary depend on the database you are
retrieving from. In this case there is no SourceName key, so Python
raises a KeyError to indicate this.

You can examine the items in the dictionary with:

for key, val in r[0].iteritems():
    print key, val

[...]
InChI InChI=1S/C9H12IN2O8P/c10-4-2-12(9(15)11-8(4)14)7-1-5(13)6(20-7)3-19-21(16,17)18/h2,5-7,13H,1,3H2,(H,11,14,15)(H2,16,17,18)/t5-,6+,7+/m0/s1
TautomerCount 3
SourceIDList []
BondChiralCount 0
MeSHTermList ["5-iodo-2'-deoxyuridine 5'-monophosphate", '5-iodo-dUMP', 'IdUMP', 'iododeoxyuridylate', 'iododeoxyuridylate, 125I-labeled']
[...]
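
On Python 3, iteritems() is gone and print is a function, so the same
inspection can be sketched as below. The dictionary is a small stand-in
for r[0] that reuses two fields from the output above. Treating the
first MeSH term as the compound name is only an assumption made for
illustration, and the function names are made up.

```python
# Stand-in for r[0]; the two fields are copied from the output above.
record = {
    "TautomerCount": 3,
    "MeSHTermList": [
        "5-iodo-2'-deoxyuridine 5'-monophosphate",
        "5-iodo-dUMP",
        "IdUMP",
    ],
}


def describe(rec):
    """Return one 'key: value' line per field, sorted by key."""
    return ["%s: %s" % (key, value) for key, value in sorted(rec.items())]


def first_mesh_term(rec):
    """Return the first MeSH term, or None if the record has none."""
    terms = rec.get("MeSHTermList") or []
    return terms[0] if terms else None


for line in describe(record):
    print(line)

# .get() returns None for a missing key instead of raising KeyError.
print(record.get("SourceName"))
print(first_mesh_term(record))
```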

There are also a number of good online resources for learning Python
which will help give experience in debugging this kind of error:

http://learnpythonthehardway.org/index
http://diveintopython.org/

Hope this helps,
Brad


From p.j.a.cock at googlemail.com  Mon May 23 11:01:51 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Mon, 23 May 2011 12:01:51 +0100
Subject: [Biopython] Serialize SeqRecord to JSON?
In-Reply-To: <BANLkTinGvFuT8NCmO8-VkvMjwEWd7qzC-g@mail.gmail.com>
References: <BANLkTinGvFuT8NCmO8-VkvMjwEWd7qzC-g@mail.gmail.com>
Message-ID: <BANLkTiktqaj=7V_y-ajvfA+etqpKi1_SOA@mail.gmail.com>

On Fri, May 20, 2011 at 10:14 PM, Uri Laserson <laserson at mit.edu> wrote:
> Does anyone know of a solution for this?
>
> Thanks!
> Uri

I thought JSON was more suited to holding simple data structures,
rather than serialising arbitrary complex objects.

Which bits of data do you need? The basics like the
id/name/description and sequence could be presented like a tuple and
encoded in JSON. Annotations begin to get complicated - but a
dictionary of basic types should be fine. I suspect the biggest hurdle
would be trying to encode any features.
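
As a rough sketch of the simple case described above, the basic fields
can be copied into a plain dictionary and handed to the standard json
module. The function name record_to_json and the SimpleNamespace
stand-in are hypothetical, used only so the example runs without
Biopython; features are deliberately left out, and only annotations
with scalar values are kept.

```python
import json
from types import SimpleNamespace


def record_to_json(record):
    """Encode only the simple fields of a SeqRecord-like object."""
    scalar_types = (str, int, float, bool, type(None))
    payload = {
        "id": record.id,
        "name": record.name,
        "description": record.description,
        "seq": str(record.seq),
        # Keep only annotations whose values are plain scalars;
        # lists, dicts and other objects are silently dropped here.
        "annotations": {
            key: value
            for key, value in record.annotations.items()
            if isinstance(value, scalar_types)
        },
    }
    return json.dumps(payload, sort_keys=True)


# Hypothetical stand-in with the same attribute names as a SeqRecord.
example = SimpleNamespace(
    id="X1",
    name="demo",
    description="toy record",
    seq="ACGT",
    annotations={"organism": "synthetic", "obj": object()},
)
encoded = record_to_json(example)
decoded = json.loads(encoded)
print(decoded["id"], decoded["seq"])
```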

Peter


From sdavis2 at mail.nih.gov  Mon May 23 18:08:47 2011
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Mon, 23 May 2011 14:08:47 -0400
Subject: [Biopython] [OT] Bioconductor-2011 conference.
Message-ID: <BANLkTimpfpNywiJTLhm4P5xtYeXmGFdPrA@mail.gmail.com>

All,

Sorry for the slightly off-topic post, but I know there are some
overlaps between Bioconductor and Biopython user groups.

The Bioconductor-2011 conference will be held July 28-29, 2011
(optional: July 27 - Developer Day) at the Fred Hutchinson Cancer
Research Center in Seattle, WA.  This conference highlights current
developments within and beyond Bioconductor, an international open
source and open development software project for the analysis and
comprehension of high-throughput genomic data.  The conference
provides a forum in which to discuss the use and design of software
for analyzing data arising in biology with a focus on Bioconductor and
genomic data.

If interested, see the website:

https://secure.bioconductor.org/BioC2011/

Thanks,
Sean


From laserson at mit.edu  Mon May 23 19:42:35 2011
From: laserson at mit.edu (Uri Laserson)
Date: Mon, 23 May 2011 15:42:35 -0400
Subject: [Biopython] reading Alphabet from file
Message-ID: <BANLkTinD93es+AiPhO-dvdSKdoEEB5WxVQ@mail.gmail.com>

Hi all,

I am trying to implement a method that will convert a SeqRecord to a JSON
serializable object.  One piece of data that must be stored for a Seq object
is the alphabet type.  When I read this from file, what is the best practice
to reload the same alphabet type?

Thanks!
Uri

...................................................................................
Uri Laserson
Graduate Student, Biomedical Engineering
Harvard-MIT Division of Health Sciences and Technology
M +1 917 742 8019
laserson at mit.edu


From p.j.a.cock at googlemail.com  Mon May 23 22:09:02 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Mon, 23 May 2011 23:09:02 +0100
Subject: [Biopython]  reading Alphabet from file
In-Reply-To: <BANLkTinD93es+AiPhO-dvdSKdoEEB5WxVQ@mail.gmail.com>
References: <BANLkTinD93es+AiPhO-dvdSKdoEEB5WxVQ@mail.gmail.com>
Message-ID: <BANLkTinsTbtzFjkgNfJ1+yuJcweX7es_xQ@mail.gmail.com>

On Monday, May 23, 2011, Uri Laserson <laserson at mit.edu> wrote:
> Hi all,
>
> I am trying to implement a method that will convert a SeqRecord to a JSON
> serializable object.  One piece of data that must be stored for a Seq object
> is the alphabet type.  When I read this from file, what is the best practice
> to reload the same alphabet type?
>
> Thanks!
> Uri

Hmm, that's tricky because the Biopython alphabet hierarchy is so
complicated. Or richly detailed depending on your point of view ;-)

In your position I would apply the KISS principle and reduce it to
Protein, DNA, RNA or unknown - and use the generic_protein etc classes
on reconstruction. Unless you need more detail than that?
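
One way to sketch that reduction is to store a coarse label in the JSON
and look it up again on loading. Everything named below is hypothetical:
the class-name matching is a crude assumption (a wrapped alphabet such
as a gapped one would fall through to "unknown"), the two classes are
stand-ins so the example runs without Biopython, and the lookup values
are only the names of the generic objects that Bio.Alphabet provided at
the time.

```python
# Labels to store in JSON; anything unrecognised collapses to "unknown".
KNOWN_LABELS = ("protein", "dna", "rna")

# Names of the generic alphabet objects to use on reconstruction.
# In real code the values would be the objects themselves.
RESTORE = {
    "protein": "generic_protein",
    "dna": "generic_dna",
    "rna": "generic_rna",
    "unknown": "generic_alphabet",
}


def alphabet_to_label(alphabet):
    """Reduce an alphabet object to a coarse label via its class name."""
    class_name = type(alphabet).__name__.lower()
    for label in KNOWN_LABELS:
        if label in class_name:
            return label
    return "unknown"


def restore_name(label):
    """Pick the generic alphabet name for a stored label."""
    return RESTORE.get(label, RESTORE["unknown"])


# Hypothetical stand-ins for alphabet classes.
class DNAAlphabet:
    pass


class ProteinAlphabet:
    pass


print(alphabet_to_label(DNAAlphabet()), restore_name("dna"))
```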

Peter


From p.j.a.cock at googlemail.com  Tue May 24 11:26:25 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Tue, 24 May 2011 12:26:25 +0100
Subject: [Biopython] gff3 problem
In-Reply-To: <BANLkTikS6uFUv+XnEitCpV+5ymhCygkBUw@mail.gmail.com>
References: <4D9B0A6D.3040608@gmail.com> <20110405132247.GA20523@sobchak>
	<4D9DB3F4.30107@gmail.com>
	<BANLkTinEjy97gKYUPY_1it1zhLOj6sR+nw@mail.gmail.com>
	<BANLkTikDd_K6LTEYWZHmBSKsGA5aiX2msA@mail.gmail.com>
	<EA39C938-FB7B-4808-8B01-AA2D71504080@hutton.ac.uk>
	<BANLkTim2rv4xjQ8dBkq+Zjjom2ys575c4Q@mail.gmail.com>
	<20110408121041.GM20963@sobchak>
	<BANLkTinTuOzNd6JkmxQte1jA=m=S4jD8GA@mail.gmail.com>
	<20110520111535.GC21651@sobchak>
	<BANLkTikS6uFUv+XnEitCpV+5ymhCygkBUw@mail.gmail.com>
Message-ID: <BANLkTimHqZTMzMVmZY0O=hYF6xQxcjY6gQ@mail.gmail.com>

On Fri, May 20, 2011 at 12:27 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> On Fri, May 20, 2011 at 12:15 PM, Brad Chapman <chapmanb at 50mail.com> wrote:
>> Peter;
>>
>> [SeqFeature support for not-stranded elements]
>>> So was the consensus that we should reword the Bio.SeqFeature
>>> docstring to say the four valid values for strand are (with GFF3
>>> equivalents in brackets):
>>>
>>> +1 = Forward (+ in GFF3)
>>> -1 = Reverse (- in GFF3)
>>> 0 = Not stranded (. in GFF3)
>>> None = Unknown (? in GFF3)
>>>
>>> And should features on a protein sequence then have strand 0?
>>
>> That sounds great. I can make the corresponding change to the
>> GFF library. Let me know if there are any other roadblocks to
>> integrating that. Thanks much,
>> Brad

Going over this afresh now, I see that in my email of 20 May I had
mixed up Leighton's original suggestion. The two special cases (0 and
None) are a bit of a pain:

http://lists.open-bio.org/pipermail/biopython/2011-April/007194.html

Back in April, Leighton wrote:
> The obvious (to me) mapping of the four allowed Biopython symbols to the
> GFF3 convention is:
> +1 -> +
> -1 -> -
> None -> .
> 0 -> ?
> because 'None' is semantically close to 'has no strand information of
> consequence', and 0 is the mean of +1 and -1 ;)
> Cheers,
> L.

i.e.

+1 = Forward (+ in GFF3)
-1 = Reverse (- in GFF3)
0 = Stranded but unknown (? in GFF3)
None = Not stranded (. in GFF3)
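
The mapping listed above can be written as a small lookup table. This is
only an illustrative sketch; the names STRAND_TO_GFF3, GFF3_TO_STRAND
and strand_to_gff3 are made up here and are not part of Biopython or
the GFF library.

```python
# Biopython strand value -> GFF3 strand column, as listed above.
STRAND_TO_GFF3 = {1: "+", -1: "-", 0: "?", None: "."}

# Reverse table for reading GFF3 back in.
GFF3_TO_STRAND = {symbol: strand for strand, symbol in STRAND_TO_GFF3.items()}


def strand_to_gff3(strand):
    """Return the GFF3 symbol for a strand value, rejecting anything else."""
    try:
        return STRAND_TO_GFF3[strand]
    except KeyError:
        raise ValueError("Unexpected strand value: %r" % (strand,))


for value in (1, -1, 0, None):
    print(repr(value), strand_to_gff3(value))
```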

SeqFeature docstring updated:
https://github.com/biopython/biopython/commit/ea64c74758dccfc7e6c0940e31a214293ecc59d3

This way protein features should have strand None (which is what the
current GenBank/EMBL parser does anyway).

Note that the SeqFeature default is strand=None which is still OK.

Mixed strand isn't needed in the GFF3 model, but we already use
None for this. Perhaps it should be 0 rather than None under this model?

Peter


From hxcan at stupidbeauty.com  Sun May 29 07:18:22 2011
From: hxcan at stupidbeauty.com (=?GB2312?B?ssy78Mqk?=)
Date: Sun, 29 May 2011 15:18:22 +0800
Subject: [Biopython] Another warning of "missing dtd file"
Message-ID: <4DE1F33E.8020700@stupidbeauty.com>

 /usr/lib/python2.6/site-packages/Bio/Entrez/Parser.py:495: UserWarning:
Unable to load DTD file bookdoc_110101.dtd.

Bio.Entrez uses NCBI's DTD files to parse XML files returned by NCBI Entrez.
Though most of NCBI's DTD files are included in the Biopython distribution,
sometimes you may find that a particular DTD file is missing. While we can
access the DTD file through the internet, the parser is much faster if the
required DTD files are available locally.

For this purpose, please download bookdoc_110101.dtd from

http://www.ncbi.nlm.nih.gov/entrez/query/DTD/bookdoc_110101.dtd

and save it either in directory

/usr/lib/python2.6/site-packages/Bio/Entrez/DTDs

or in directory

/Data/.biopython/Bio/Entrez/DTDs

in order for Bio.Entrez to find it.

Alternatively, you can save bookdoc_110101.dtd in the directory
Bio/Entrez/DTDs in the Biopython distribution, and reinstall Biopython.

Please also inform the Biopython developers about this missing DTD, by
reporting a bug on http://bugzilla.open-bio.org/ or sign up to our mailing
list and emailing us, so that we can include it with the next release of
Biopython.

Proceeding to access the DTD file through the internet...

warnings.warn(message)


From p.j.a.cock at googlemail.com  Sun May 29 10:00:58 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Sun, 29 May 2011 11:00:58 +0100
Subject: [Biopython] Another warning of "missing dtd file"
In-Reply-To: <4DE1F33E.8020700@stupidbeauty.com>
References: <4DE1F33E.8020700@stupidbeauty.com>
Message-ID: <BANLkTi=xsp6djyW2rRFt2TQn69xijX5BiQ@mail.gmail.com>

2011/5/29 ??? <hxcan at stupidbeauty.com>:
>  /usr/lib/python2.6/site-packages/Bio/Entrez/Parser.py:495: UserWarning:
> Unable to load DTD file bookdoc_110101.dtd.
> ...
> For this purpose, please download bookdoc_110101.dtd from
>
> http://www.ncbi.nlm.nih.gov/entrez/query/DTD/bookdoc_110101.dtd
>
> ...
> Please also inform the Biopython developers about this missing DTD, by
> reporting a bug on http://bugzilla.open-bio.org/ or sign up to our mailing
> list and emailing us, so that we can include it with the next release of
> Biopython.

Thank you, that's been added. I don't see anything else missing from
this list, but I know it is a partial listing:

http://www.ncbi.nlm.nih.gov/corehtml/query/DTD/index.shtml

Peter


From sainitin7 at gmail.com  Tue May 31 11:34:54 2011
From: sainitin7 at gmail.com (sai nitin)
Date: Tue, 31 May 2011 13:34:54 +0200
Subject: [Biopython] Query regarding Bioassay database
Message-ID: <BANLkTikJB5Y4KvEdJ7D0tPLKucTzZ_Dtxw@mail.gmail.com>

Hello,

My name is Sainitin. I have a question about using the E-utilities
with the PubChem and BioAssay databases.

Question: I have a list of PubChem IDs and need to get the corresponding
BioAssay IDs that are unspecified. For example, the output should look
like this:

PubchemID:Bioassay IDs (unspecified)

Can anyone suggest how to retrieve the unspecified BioAssay IDs for
given PubChem IDs using Biopython?

Thanks in advance

-- 

Sainitin D


From p.j.a.cock at googlemail.com  Tue May 31 12:30:15 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Tue, 31 May 2011 13:30:15 +0100
Subject: [Biopython] Query regarding Bioassay database
In-Reply-To: <BANLkTikJB5Y4KvEdJ7D0tPLKucTzZ_Dtxw@mail.gmail.com>
References: <BANLkTikJB5Y4KvEdJ7D0tPLKucTzZ_Dtxw@mail.gmail.com>
Message-ID: <BANLkTinPVLO5Lhax=XRr-VrYuixaEWaBJQ@mail.gmail.com>

On Tue, May 31, 2011 at 12:34 PM, sai nitin <sainitin7 at gmail.com> wrote:
> Hello,
>
> My name is Sainitin. I have a question about using the E-utilities
> with the PubChem and BioAssay databases.
>
> Question: I have a list of PubChem IDs and need to get the corresponding
> BioAssay IDs that are unspecified. For example, the output should look
> like this:
>
> PubchemID:Bioassay IDs (unspecified)
>
> Can anyone suggest how to retrieve the unspecified BioAssay IDs for
> given PubChem IDs using Biopython?
>
> Thanks in advance

Try Entrez Link (ELink), possibly with the pcassay_pccompound link. See
the links in the Biopython documentation for Bio.Entrez.elink, especially:
http://eutils.ncbi.nlm.nih.gov/corehtml/query/static/entrezlinks.html
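
As a rough sketch of what handling the ELink output might look like, the
helper below walks a parsed result and collects the linked IDs in the
PubchemID:Bioassay IDs layout asked for. It runs on a made-up stand-in
rather than a live query: the assay IDs are invented, the link name
"pccompound_pcassay" is an assumption, and the helper name linked_ids is
hypothetical. The nesting (LinkSetDb, Link, Id) follows what Entrez.read
usually returns for an elink call, but check it against a real result.
Filtering for "unspecified" assays is not covered.

```python
def linked_ids(elink_result, link_name=None):
    """Map each query ID to the IDs linked from it.

    Assumes one LinkSet per query ID; if several IDs share a LinkSet,
    they all receive the same merged list of targets.
    """
    mapping = {}
    for linkset in elink_result:
        targets = []
        for linkset_db in linkset.get("LinkSetDb", []):
            if link_name is None or linkset_db["LinkName"] == link_name:
                targets.extend(link["Id"] for link in linkset_db["Link"])
        for query_id in linkset["IdList"]:
            mapping[query_id] = targets
    return mapping


# Stand-in shaped like the output of
#   Entrez.read(Entrez.elink(dbfrom="pccompound", db="pcassay", id="449489"))
# The assay IDs and the link name below are invented for illustration.
fake_result = [
    {
        "DbFrom": "pccompound",
        "IdList": ["449489"],
        "LinkSetDb": [
            {
                "DbTo": "pcassay",
                "LinkName": "pccompound_pcassay",
                "Link": [{"Id": "1001"}, {"Id": "1002"}],
            }
        ],
    }
]

for cid, aids in sorted(linked_ids(fake_result).items()):
    print("%s:%s" % (cid, ",".join(aids)))
```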

If you could give a more complete example it would help. In particular,
an example of a positive match between pubchem and bioassay.

Peter