[Biopython-dev] _pwm.c

Bartek Wilczynski barwil at gmail.com
Thu Oct 28 14:37:18 UTC 2010


On Thu, Oct 28, 2010 at 4:27 PM, Peter <biopython at maubp.freeserve.co.uk> wrote:

> On Thu, Oct 28, 2010 at 3:16 PM, Dragoslav Zaric
> <zaricdragoslav at gmail.com> wrote:
> > running build_ext
> > building 'Bio.Motif._pwm' extension
> > creating build/temp.linux-i686-3.1
> > creating build/temp.linux-i686-3.1/Bio
> > creating build/temp.linux-i686-3.1/Bio/Motif
> > gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O3 -Wall
> > -Wstrict-prototypes -fPIC
> > -I/usr/local/lib/python3.1/site-packages/numpy/core/include
> > -I/usr/local/include/python3.1 -c Bio/Motif/_pwm.c -o
> > build/temp.linux-i686-3.1/Bio/Motif/_pwm.o
> > Bio/Motif/_pwm.c: In function ‘init_pwm’:
> > Bio/Motif/_pwm.c:123: warning: ‘return’ with a value, in function returning void
> > Bio/Motif/_pwm.c:125: warning: implicit declaration of function ‘Py_InitModule4’
> > Bio/Motif/_pwm.c:129: warning: assignment makes pointer from integer without a cast
> > gcc -pthread -shared build/temp.linux-i686-3.1/Bio/Motif/_pwm.o -o
> > build/lib/Bio/Motif/_pwm.so
> >
> >
> > So as you can see, this compiles, but there are some warnings. So what is
> > the plan: to compile entirely without warnings?
>
> Well ideally no warnings - but of those three warnings only the one about
> Py_InitModule4 strikes me as important. This was part of the Python 2.x
> C API used to tell Python about the functions your code provides, and has
> been changed in Python 3.x (I think you must use PyModule_Create instead).
>
> What happens if you try to use the compiled module in Python 3? e.g.
>
> from Bio import Motif
> from Bio.Motif import _pwm
>
> Bartek - could you give us a short (Python 2) example of Bio.Motif
> which uses the C module _pwm?
>

Hi,

This is the fast implementation of DNA motif searching that Michiel wrote
some time ago. It is exposed in the Bio.Motif API as the .scanPWM method:

Definition:    m.scanPWM(self, seq)
Docstring:
    Matrix of log-odds scores for a nucleotide sequence.

    scans (using a fast C extension) a nucleotide sequence and returns
    the matrix of log-odds scores for all positions

    - the result is a one-dimensional numpy array
    - the sequence can only be a DNA sequence
    - the search is performed only on one strand
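
To make the docstring concrete, here is a rough pure-Python sketch of a
one-strand log-odds scan. It is not the code in _pwm.c: the function name
scan_log_odds, the A/C/G/T column order, and the choice to score windows
containing other letters as NaN are all assumptions made for illustration, and
it returns a plain list rather than a float32 numpy array.

```python
# Assumed column order for this sketch; the real module may differ.
LETTER_INDEX = {"A": 0, "C": 1, "G": 2, "T": 3}


def scan_log_odds(seq, log_odds):
    """Score every window of seq against a log-odds matrix.

    log_odds is a list of rows, one per motif position, each holding the
    log-odds values for A, C, G and T (hypothetical layout).
    Returns one score per start position, forward strand only.
    """
    motif_length = len(log_odds)
    n_windows = len(seq) - motif_length + 1
    scores = []
    for start in range(n_windows):
        total = 0.0
        for offset in range(motif_length):
            column = LETTER_INDEX.get(seq[start + offset].upper())
            if column is None:
                # Assumption: a non-ACGT letter makes the window unscorable.
                total = float("nan")
                break
            total += log_odds[offset][column]
        scores.append(total)
    return scores
```

A sequence of length n scanned with a motif of length m gives n - m + 1
scores, one per window.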

It's a very simple module, so it should be relatively easy to convert to
Python 3. Unfortunately, I have no experience with C extensions, so I cannot
help much.

If you need a snippet for testing, you can use this:

from Bio import Seq
from Bio import Motif
m = Motif.read(open("Doc/cookbook/motif/SRF.pfm"), "jaspar-pfm")
m.scanPWM(Seq.Seq("ACGTGTGCGTAGTGCGT", m.alphabet))


The result should be:
array([-29.18363571, -38.3365097 , -29.17756271, -38.04542542, -20.3014183 ,
-25.18009186], dtype=float32)
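
Since the scores are float32, a ported module is better checked with a
tolerance than with exact equality. A minimal sketch follows; the helper name
scores_match and the tolerance of 1e-5 are my own choices, not part of
Biopython.

```python
import math

# Values quoted above for the SRF.pfm example.
EXPECTED = [-29.18363571, -38.3365097, -29.17756271,
            -38.04542542, -20.3014183, -25.18009186]


def scores_match(result, expected=EXPECTED, rel_tol=1e-5):
    """Return True if result has the expected length and close values."""
    result = list(result)  # also accepts a one-dimensional numpy array
    if len(result) != len(expected):
        return False
    return all(math.isclose(float(r), e, rel_tol=rel_tol)
               for r, e in zip(result, expected))
```

With the snippet above, the check would be scores_match(m.scanPWM(...)) on
both Python 2 and Python 3 builds.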

Hope this helps,
-- 
Bartek Wilczynski
==================
Postdoctoral fellow
EMBL, Furlong group
Meyerhoffstrasse 1,
69012 Heidelberg,
Germany
tel: +49 6221 387 8433



