From Cariaso at yahoo.com  Thu Apr  1 19:00:31 2004
From: Cariaso at yahoo.com (Michael Cariaso)
Date: Thu Apr  1 19:05:37 2004
Subject: [BioPython] Problems building biopython-based script with py2exe
In-Reply-To: <20040204234841.GI907@evostick.agtec.uga.edu>
References: <6.0.1.1.0.20040203095808.01e675e8@exchange1.scri.sari.ac.uk>	<BC46E178.1C3C%anunberg@oriongenomics.com>
	<20040204234841.GI907@evostick.agtec.uga.edu>
Message-ID: <406CAD1F.6000901@yahoo.com>

perhaps one more minor issue was overlooked.

Bio/__init__.py : 93
        zipfiles = __import__("Bio.config", {}, {}, ["Bio"]).__loader__.files
doesn't work for me. But (with the '_' before files) it does
        zipfiles = __import__("Bio.config", {}, {}, ["Bio"]).__loader__._files


Brad Chapman wrote:

>Hi Leighton, Andy;
>
>Me:
>>>>Could you please verify that I did everything right and didn't mess
>>>>anything up on typing it up? Ugh, all those list comprehensions and
>>>>import hacks and my head is a bit tired right now; so hopefully I
>>>>did a decent job.
>
>Leighton:
>>>Just off by a couple of typos:
>
>Thanks -- I just got these all fixed. You also get yourself added to
>the contributors file with happy lines of code. Thanks again for
>your work on this.
>
>Andy:
>>Please commit the changes ASAP,
>>I just updated my biopython and tried to install and got an error
>>at line 54 in the Bio.__init__.py
>
>Yeah, I'm a mess. Sorry about that. The fixes were checked in
>(actually about 15 minutes before I received your mail -- how's that
>for service :-). It just might take a bit to propagate over to
>anonymous CVS.
>
>Brad
>_______________________________________________
>BioPython mailing list  -  BioPython@biopython.org
>http://biopython.org/mailman/listinfo/biopython

From lpritc at scri.sari.ac.uk  Fri Apr  2 03:57:23 2004
From: lpritc at scri.sari.ac.uk (Leighton Pritchard)
Date: Fri Apr  2 04:02:32 2004
Subject: [BioPython] Problems building biopython-based script with py2exe
In-Reply-To: <406CAD1F.6000901@yahoo.com>
References: <6.0.1.1.0.20040203095808.01e675e8@exchange1.scri.sari.ac.uk>	<BC46E178.1C3C%anunberg@oriongenomics.com>	<20040204234841.GI907@evostick.agtec.uga.edu>
	<406CAD1F.6000901@yahoo.com>
Message-ID: <406D2AF3.9060606@scri.sari.ac.uk>

Well spotted.  I see from the CVS that Brad has already fixed it, too -
now /that/ is service :)

(My Windows install is behind the CVS and still has my hack, which may
be the reason besides my poor proofreading that I hadn't noticed it -
:oops: )

Michael Cariaso wrote:
 > perhaps one more minor issue was overlooked.
 >
 > Bio/__init__.py : 93
 >        zipfiles = __import__("Bio.config", {}, {},
 > ["Bio"]).__loader__.files
 > doesn't work for me. But (with the '_' before files) it does
 >        zipfiles = __import__("Bio.config", {}, {},
 > ["Bio"]).__loader__._files


-- 
Dr Leighton Pritchard AMRSC
D104, PPI, Scottish Crop Research Institute
Invergowrie, Dundee, DD2 5DA, Scotland, UK
E: lpritc@scri.sari.ac.uk	W: http://bioinf.scri.sari.ac.uk/index.shtml
T: +44 (0)1382 568579		F: +44 (0)1382 568578
PGP key FEFC205C: GPG key E58BA41B: http://www.keyserver.net

From chapmanb at uga.edu  Fri Apr  2 11:20:52 2004
From: chapmanb at uga.edu (Brad Chapman)
Date: Fri Apr  2 11:33:54 2004
Subject: [BioPython] error with Fasta.Record?
In-Reply-To: <20040331164354.GA9655@uracil.uio.no>
References: <20040331164354.GA9655@uracil.uio.no>
Message-ID: <20040402162052.GB45713@evostick.agtec.uga.edu>

Hi Karin;

> I use the following code to read in a fasta file:
[...]
> I do this with a test file:
> 
> adenine:18:38> cat /med/adenine/u2/projects/locator/gard/testfile
> >1_dapB_to_carA_29196_29650
> gtctataagtgccaaaaattacatgttttgtcttctgtttttgttgttttaatgtaaatt
> ttgaccatttggtccacttttttctgctcgtttttatttcatgcaatc
[...]
> And the files I get look like this:
> 
> adenine:18:37> cat /med/adenine/u2/projects/locator/gard/singles/10001
> >1_dapB_to_carA_29196_29650
> GTCTATAAGTGCCAAAAATTACATGTTTTGTCTTCTGTTTTTGTTGTTTTAATGTAAATT

Unfortunately, I'm not able to reproduce this error. I've attached a
test script which uses the quick_FASTA_reader and works with the
f002 file from Tests/Fasta (so you can check it yourself on the same
file and make sure everything works on your platform). If you run
this script on your test file, do you see the same problem?

Without knowing more, I have a couple of guesses about the problem:

1. There is some kind of newline problem. The quick_FASTA_reader is
a pretty simple implementation which probably won't work properly if
fed a file with lots of different newlines (or newlines different
from the platform they are being run on). The best solution here is
to use the full Fasta.RecordParser() for parsing.

2. Your code is somewhere modifying the sequences. It seems like you
have at least a bit of other code in there which is doing things
with the entries. Perhaps they are modified somehow there.
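
To illustrate the first guess, here is a minimal sketch of a FASTA
reader that does not care which newline convention the file uses. It is
not the quick_FASTA_reader code; the function name
read_fasta_any_newline is made up for this example, it is written for
current Python 3, and it assumes the whole file fits in memory.

```python
def read_fasta_any_newline(text):
    """Return a list of (title, sequence) tuples from FASTA text.

    Hypothetical helper, not part of Biopython. str.splitlines() treats
    LF, CRLF and bare CR as line breaks, so mixed newlines are harmless.
    """
    entries = []
    title = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new title line closes the previous record, if any.
            if title is not None:
                entries.append((title, "".join(chunks)))
            title = line[1:]
            chunks = []
        elif title is not None:
            chunks.append(line)
    if title is not None:
        entries.append((title, "".join(chunks)))
    return entries


# The same data with Unix, Windows and old Mac line endings mixed together.
sample = ">seq1\ngtctataagt\r\ngccaaaaatt\r>seq2\nacgt\n"
parsed = read_fasta_any_newline(sample)
```

If a reader like this returns the full sequences for the test file while
the original code truncates them, newlines are the likely culprit.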

Just guesses though. I'd like to fix the problem but need to distill
this down to a test case so that I can reproduce it. Hopefully my
attached test code helps do this.

Thanks for the report and checking into this.
Brad
-------------- next part --------------
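# Test script: read FASTA entries with quick_FASTA_reader and write them
# back out as Fasta.Record objects, to check the sequences survive intact.
from Bio.SeqUtils import quick_FASTA_reader
from Bio import Fasta

fasta_file = "f002"  # test file from Tests/Fasta

outfile = "test-writing.fasta"
outhandle = open(outfile, "w")

# Each entry is unpacked below as a (name, sequence) pair.
entries = quick_FASTA_reader(fasta_file)

for name, seq in entries:
    rec = Fasta.Record()
    rec.title = name
    rec.sequence = seq
    print rec
    outhandle.write(str(rec) + "\n")

outhandle.close()
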
From chapmanb at uga.edu  Fri Apr  2 11:28:24 2004
From: chapmanb at uga.edu (Brad Chapman)
Date: Fri Apr  2 11:39:10 2004
Subject: [BioPython] Deprecated Bio.sequtils
Message-ID: <20040402162824.GC45713@evostick.agtec.uga.edu>

Hello everyone;
In checking into a bug report, I suddenly realized for the first
time that Bio/sequtils.py and Bio/SeqUtils/__init__.py duplicate the
same code. It looks like sequtils.py was expanded to an entire
directory at some point but the duplication was never removed. Even
worse, there have been different fixes and additions to both parts.

To fix this problem, I've merged all the changes into
Bio/SeqUtils/__init__.py and officially deprecated Bio/sequtils.py.
The SeqUtils directory contains other useful manipulation code so
this seems the right way to do things without being too confusing.

For now Bio/sequtils.py raises a DeprecationWarning when used, but
still works (by importing the code from Bio/SeqUtils/__init__.py into
its namespace). This will give people time to update their scripts
without breaking anything. To fix scripts that reference
Bio/sequtils.py you need to change:

from Bio.sequtils import whatever

to:

from Bio.SeqUtils import whatever

Sorry about any difficulties this may cause. Bio/sequtils.py will
remain around for a while to maintain back compatibility with the
warning. Please let me know if I've done anything which does not
leave it back compatible.
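
For anyone curious how a "warn but keep working" shim can look, here is
a minimal sketch using the standard warnings module. It is not the code
in Bio/sequtils.py: the names _new_gc_count and deprecated_alias are
invented for this example, and it wraps a single function so that it
runs as one file, whereas a real shim may simply warn once at import
time. It is written for current Python 3.

```python
import warnings


def _new_gc_count(seq):
    """Stand-in for a function living in the new module (hypothetical)."""
    return sum(1 for base in seq.upper() if base in "GC")


def deprecated_alias(func, old_name, new_name):
    """Wrap func so calls still work but emit a DeprecationWarning."""
    def wrapper(*args, **kwargs):
        warnings.warn(
            "%s is deprecated; use %s instead" % (old_name, new_name),
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller, not the shim
        )
        return func(*args, **kwargs)
    return wrapper


# What the old module would expose: same behaviour, plus a warning.
gc_count = deprecated_alias(_new_gc_count, "Bio.sequtils", "Bio.SeqUtils")

# Capture the warning so the example can show it was actually raised.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = gc_count("gtctataagtgcc")
```

Old scripts keep getting the same return value, and the warning tells
them which import to switch to.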

Thanks!
Brad
From fkauff at duke.edu  Fri Apr  2 14:42:11 2004
From: fkauff at duke.edu (Frank Kauff)
Date: Fri Apr  2 14:47:19 2004
Subject: [BioPython] Deprecated Bio.sequtils
In-Reply-To: <20040402162824.GC45713@evostick.agtec.uga.edu>
References: <20040402162824.GC45713@evostick.agtec.uga.edu>
Message-ID: <1080934930.2059.5.camel@osiris.biology.duke.edu>

Hi Brad,

while you're at it... :-)

there seems to be a little bug in
SeqUtils.quicker_apply_on_multi_fasta()

Instead of 

if result:
     results.append('>%s\n%s' % (record.title, result))

it should be

if result:
     results.append('>%s\n%s' % (name, result))
 
Probably a copy-paste error from apply_on_multi_fasta()

Cheers,
Frank


-- 
Frank Kauff
Dept. of Biology
Duke University
Box 90338
Durham, NC 27708
USA

Phone 919-660-7382
Fax 919-660-7293

From chapmanb at uga.edu  Fri Apr  2 15:41:24 2004
From: chapmanb at uga.edu (Brad Chapman)
Date: Fri Apr  2 15:52:11 2004
Subject: [BioPython] Deprecated Bio.sequtils
In-Reply-To: <1080934930.2059.5.camel@osiris.biology.duke.edu>
References: <20040402162824.GC45713@evostick.agtec.uga.edu>
	<1080934930.2059.5.camel@osiris.biology.duke.edu>
Message-ID: <20040402204124.GD45713@evostick.agtec.uga.edu>

Hi Frank;

> there seems to be a little bug in
> SeqUtils.quicker_apply_on_multi_fasta()

Thanks! All checked into CVS -- just drop a line if you spot
anything else.

Brad
From mdehoon at ims.u-tokyo.ac.jp  Sun Apr  4 23:37:54 2004
From: mdehoon at ims.u-tokyo.ac.jp (Michiel Jan Laurens de Hoon)
Date: Sun Apr  4 23:43:23 2004
Subject: [BioPython] kMeans.py -> Bio.Cluster
Message-ID: <4070D492.20107@ims.u-tokyo.ac.jp>

Brad:
 > 1. Starting to raise a Deprecation Warning for the kMeans module. 2. Trying
 > to write some kind of short document on how to switch from using kMeans to
 > using Bio.Cluster.kcluster. BioPerl has a document called DEPRECATED with
 > this kind of info -- that seems like a reasonable step to follow. Jeff and
 > Michiel, would it be possible to write something up quick. 3. Thomas needs to
 > decide if he wants to rewrite xkMeans or deprecate it as well.

1. I have added a Deprecation Warning to the kMeans module and the xkMeans
module. Btw, it seems that the import statement in xkMeans.py is no longer
valid (from Bio.Clustering import kMeans).
2. Below is my attempt at the DEPRECATED doc. Jeff, could you have a look at 
this to see if there are any mistakes? Thanks!

--Michiel.

Moving from kMeans.py to Bio.Cluster
====================================

The k-Means algorithm is an algorithm for unsupervised clustering of data.
Biopython includes an implementation of the k-means clustering algorithm
in kMeans.py. Recently, a larger set of clustering algorithms entered
Biopython as Bio.Cluster. As the kcluster routine in Bio.Cluster also implements
the k-means clustering algorithm, the kMeans.py module has been deprecated. This
document describes how to switch from kMeans.py to Bio.Cluster's kcluster.

The function kcluster in Bio.Cluster performs k-means or k-medians clustering.
The corresponding function in kMeans.py is called cluster. This function takes
the following arguments:

o data
o k
o distance_fn
o init_centroids_fn
o calc_centroid_fn
o max_iterations
o update_fn

The function kcluster in Bio.Cluster takes the following arguments:

o data
o nclusters
o mask
o weight
o transpose
o npass
o method
o dist
o initialid


Arguments for kMeans.py's cluster, and their equivalents in Bio.Cluster
=======================================================================


data
----

In kMeans.py, data is a list of vectors, each containing the same number of
data points. Within the context of clustering genes based on their gene
expression values, each vector would correspond to the gene expression data of
one particular gene, and the values in the vector would correspond to the
gene expression values measured by the different microarrays. The cluster
routine in kMeans.py always performs a row-wise clustering by grouping vectors.

The argument data to Bio.Cluster's kcluster has the same structure as in
kMeans.py. However, Bio.Cluster allows row-wise and column-wise clustering by
the transpose argument. If transpose==0 (the default value), kcluster performs
row-wise clustering, consistent with kMeans.py. If transpose==1, kcluster
performs column-wise clustering. The same behavior can be obtained, of course,
by transposing the data array before calling kcluster.


k
-

The desired number of clusters is specified by the input argument k in
kMeans.py. The corresponding argument in Bio.Cluster's kcluster is nclusters.

distance_fn
-----------

In kMeans.py, the argument distance_fn represents the distance function to
calculate the distances between items and cluster centroids. This argument
corresponds to a true Python function. The default value is the Euclidean
distance, implemented as distance.euclidean in distance.py. User-defined
distance functions can also be used.

The k-means routine in Bio.Cluster does not allow user-specified distance
functions. Instead, it provides the following nine built-in distance functions,
depending on the argument dist:

dist=='e': Euclidean distance
dist=='h': Harmonically summed Euclidean distance
dist=='b': City-block distance
dist=='c': Pearson correlation
dist=='a': absolute value of the Pearson correlation
dist=='u': uncentered correlation
dist=='x': absolute uncentered correlation
dist=='s': Spearman's rank correlation
dist=='k': Kendall's tau

User-defined distance functions are possible only by modifying the C code in
cluster.c (which may not be as hard as it sounds). The default distance function
is the Euclidean distance (dist=='e'). Note that in Bio.Cluster the
Euclidean distance is defined as the sum of squared differences, whereas in
kMeans.py the square root of this quantity is taken. This does not affect the
clustering result.
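
The reason the square root does not matter is that it is a monotonic
function: whichever centroid is nearest under the squared distance is
also nearest under its square root. The following is a small sketch of
that argument in current Python 3; the helper names are made up for this
example and none of it is taken from kMeans.py or cluster.c.

```python
import math


def squared_euclidean(a, b):
    """Sum of squared differences (the Bio.Cluster convention above)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def euclidean(a, b):
    """Square root of the sum of squared differences (kMeans.py convention)."""
    return math.sqrt(squared_euclidean(a, b))


def nearest(item, centroids, distance):
    """Index of the centroid closest to item under the given distance."""
    return min(range(len(centroids)),
               key=lambda i: distance(item, centroids[i]))


centroids = [[0.0, 0.0], [5.0, 5.0], [10.0, 0.0]]
items = [[1.0, 1.0], [6.0, 4.0], [9.0, 1.0], [3.0, 1.0]]

# Assign every item to its nearest centroid under both definitions.
by_squared = [nearest(item, centroids, squared_euclidean) for item in items]
by_root = [nearest(item, centroids, euclidean) for item in items]
```

Both lists come out identical, so the assignment of items to clusters is
the same. Only the reported distance values differ between the two
definitions.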


init_centroids_fn
-----------------

This function specifies the initial choice for the cluster centroids. By
default, cluster in kMeans.py uses a random initial choice of cluster centroids
by randomly choosing k data vectors from the input vectors in the data input
argument. Alternatively, the user can specify a user-defined function to choose
the initial cluster centroids.

In Bio.Cluster, the k-means algorithm in kcluster starts from an initial cluster
assignment instead of an initial choice of cluster centroids. As far as I know,
these two initialization methods are equivalent in practice. Similar to the
cluster routine in kMeans.py, Bio.Cluster's kcluster performs a random initial
assignment of items to clusters. Alternatively, users can specify a
(deterministic) initial clustering via the initialid argument. This argument is
None by default. If not None, it should be a 1D array (or list) containing the
number (between 0 and nclusters-1) of the cluster to which each item is
assigned initially.
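
If you have an init_centroids_fn that you want to keep using, one way to
carry it over is to turn its centroids into an initial assignment. The
sketch below (current Python 3) does this by assigning each item to its
nearest starting centroid. The function centroids_to_initialid is a
hypothetical helper, not part of Bio.Cluster, and it uses the squared
Euclidean distance as an assumption.

```python
def centroids_to_initialid(data, centroids):
    """Assign each row of data to its nearest centroid.

    Returns a list with one cluster number (0 .. len(centroids)-1) per
    row, the shape described above for initialid. Hypothetical helper.
    """
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [
        min(range(len(centroids)), key=lambda c: sqdist(row, centroids[c]))
        for row in data
    ]


data = [[1.0, 2.0], [1.5, 1.8], [8.0, 8.0], [9.0, 9.5], [1.0, 0.5]]
# kMeans.py-style choice: take two of the data vectors as starting centroids.
initial_centroids = [data[0], data[2]]
initialid = centroids_to_initialid(data, initial_centroids)
```

The resulting list could then be passed as the initialid argument. Note
that a poorly chosen set of centroids can leave a cluster without any
items, which this sketch does not guard against.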

Note that the k-means routine in Bio.Cluster performs automatic repeats of the
algorithm, each time starting from a different random initial clustering. See
the comment for the npass argument below.

calc_centroid_fn
----------------

This argument specifies how to calculate the cluster centroids, given the data
vectors of the items that belong to each cluster. By default, the mean over the
vectors is calculated. A user-defined function can also be used.

Bio.Cluster's kcluster does not allow user-defined functions. Instead, the
method to calculate the cluster centroid is determined by the argument method,
which can be either 'a' (arithmetic mean) or 'm' (median). The default is to
calculate the mean ('a').
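
As a rough illustration of what the two method values mean, here is a
sketch in current Python 3 that computes a cluster centroid column by
column, either as the arithmetic mean or as the median. The function
cluster_centroid is invented for this example and is not how cluster.c
is written.

```python
from statistics import mean, median


def cluster_centroid(vectors, method="a"):
    """Column-wise centroid: 'a' for arithmetic mean, 'm' for median."""
    if method not in ("a", "m"):
        raise ValueError("method should be 'a' or 'm'")
    combine = mean if method == "a" else median
    # zip(*vectors) walks over the columns of the member vectors.
    return [combine(col) for col in zip(*vectors)]


members = [[1.0, 10.0], [2.0, 20.0], [9.0, 30.0]]
mean_centroid = cluster_centroid(members, "a")
median_centroid = cluster_centroid(members, "m")
```

The first column shows the difference: the outlying value 9.0 pulls the
mean up, while the median stays at the middle value.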

max_iterations
--------------

The cluster routine in kMeans.py has an argument max_iterations, which is used
to stop the iteration if the routine does not converge after the given number of
iterations.

The kcluster routine in Bio.Cluster does not have such an argument. The failure
of a k-means algorithm to converge is due to the occurrence of periodic
clustering solutions during the course of the k-means algorithm. The kcluster
routine in Bio.Cluster automatically checks for the occurrence of such a
periodicity in the solutions. If a periodic behavior is detected, the algorithm
is interrupted and the last clustering solution is returned. Accordingly, the
kcluster routine is guaranteed to return a clustering solution. Also see the
discussion of the npass argument below.

update_fn
---------

The argument update_fn to cluster in kMeans.py is a hook function that is
called at the beginning of every iteration and passed the iteration number,
cluster centroids, and current cluster assignments. It is used by xkMeans.py,
which provides a visualization of k-means clustering. Currently there is no
equivalent in Bio.Cluster.


Other arguments for Bio.Cluster's kcluster
==========================================

Three arguments in Bio.Cluster's kcluster do not have a direct equivalent in
kMeans.py's cluster.

mask
----

Microarray experiments tend to suffer from a large amount of missing data. The
argument mask to Bio.Cluster's kcluster lets the user specify which data are
missing. This argument is an array with the same shape as data, and contains
a 1 for each data point that is present, and a 0 for a missing data point:

   mask[i,j]==1: data[i,j] is valid
   mask[i,j]==0: data[i,j] is a missing data point

Missing data points are ignored by the clustering algorithm. By default, mask
is an array containing 1's everywhere.
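
To make "ignored" concrete, here is a sketch in current Python 3 of a
distance that skips any position where either vector has a missing
value. It is only an illustration of the idea; the function
masked_sqdist is hypothetical, and dividing by the number of usable
positions is an assumption of this sketch rather than a statement about
what cluster.c does.

```python
def masked_sqdist(a, b, mask_a, mask_b):
    """Mean squared difference over positions present in both vectors.

    A mask value of 1 means the data point is present and 0 means it is
    missing, as described above. Averaging over the usable positions is
    an assumption made here so that vectors with many gaps are not
    favoured just because they contribute fewer terms.
    """
    total = 0.0
    used = 0
    for x, y, mx, my in zip(a, b, mask_a, mask_b):
        if mx and my:
            total += (x - y) ** 2
            used += 1
    if used == 0:
        return 0.0  # no shared data points; this sketch treats that as zero
    return total / used


row_a = [1.0, 2.0, 0.0, 4.0]   # third value is a placeholder for missing data
mask_a = [1, 1, 0, 1]
row_b = [1.0, 4.0, 9.0, 4.0]
mask_b = [1, 1, 1, 1]
d = masked_sqdist(row_a, row_b, mask_a, mask_b)
```

The placeholder 0.0 in row_a never enters the calculation, so its value
is irrelevant.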

weight
------

The weight argument is used to put different weights on different data points.
For example, when clustering genes based on their gene expression profile, we
may want to attach a bigger weight to some microarrays compared to others. By
default, the weight argument contains equal weights of 1.0 for all data points.
Note that for row-wise clustering, the weight argument is a 1D vector whose
length is equal to the number of columns. For column-wise clustering, the length
of this argument is equal to the number of rows.

npass
-----

Typical implementations of the k-means clustering algorithm rely on a random
initialization. Unlike Self-Organizing Maps, however, the k-means algorithm has
a clearly defined goal, which is to minimize the within-cluster sum of
distances. Different k-means clustering solutions (based on different initial
clusterings) can therefore be compared to each other directly. In order to
increase the chance of finding the optimal k-means clustering solution, the
k-means routine in Bio.Cluster automatically repeats the algorithm npass times,
each time starting from a different initial random clustering. The best
clustering solution, as well as in how many of the npass attempts it was found,
is returned to the user. For more information, see the output variable nfound
below. The default value of npass is 1.


Return values
=============

The cluster routine in kMeans.py returns two values, centroids and clusters.
The kcluster routine in Bio.Cluster returns four values: clusterid, centroids,
error, and nfound.


centroids
---------

The centroids return value contains the centroids of the k clusters that were
found, and corresponds to the centroids return value from Bio.Cluster's
kcluster routine.

clusters
--------

The clusters return value contains the number of the cluster to which each
vector was assigned. The corresponding return value in Bio.Cluster's kcluster
is clusterid.

error
-----

The error return value from Bio.Cluster's kcluster is the within-cluster sum of
distances for the optimal clustering solution that was found. This value can be
used to compare different clustering solutions to each other.

nfound
------

The nfound return value from Bio.Cluster's kcluster shows in how many of the
npass runs the optimal clustering solution was found. Accordingly, nfound is at
least 1 and at most equal to npass. A large value for nfound is an indication
that the clustering solution that was found is optimal. On the other hand, if
nfound is equal to 1, it is very well possible that a better clustering solution
exists than the one found by kcluster.
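
The interplay between npass, error and nfound can be sketched in plain
Python 3 as follows. This is a toy re-implementation for illustration
only, not the Bio.Cluster code: all function names are made up, it stops
on a fixed max_iter instead of detecting periodic solutions, and it
returns three values rather than the four listed above.

```python
import random


def sqdist(a, b):
    """Sum of squared differences between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroids_of(data, clusterid, nclusters, rng):
    """Mean vector of each cluster; an emptied cluster is reseeded at random."""
    centroids = []
    for c in range(nclusters):
        members = [row for row, cid in zip(data, clusterid) if cid == c]
        if not members:
            members = [rng.choice(data)]
        centroids.append([sum(col) / len(members) for col in zip(*members)])
    return centroids


def kmeans_once(data, nclusters, rng, max_iter=100):
    """One k-means run starting from a random assignment of items."""
    # Deal the cluster numbers out evenly, then shuffle, so that no cluster
    # starts empty (as long as there are at least nclusters items).
    clusterid = [i % nclusters for i in range(len(data))]
    rng.shuffle(clusterid)
    for _ in range(max_iter):
        centroids = centroids_of(data, clusterid, nclusters, rng)
        new_ids = [
            min(range(nclusters), key=lambda c: sqdist(row, centroids[c]))
            for row in data
        ]
        if new_ids == clusterid:
            break
        clusterid = new_ids
    centroids = centroids_of(data, clusterid, nclusters, rng)
    error = sum(sqdist(row, centroids[cid])
                for row, cid in zip(data, clusterid))
    return clusterid, error


def canonical(clusterid):
    """Relabel clusters by first appearance so equal partitions compare equal."""
    mapping = {}
    return tuple(mapping.setdefault(cid, len(mapping)) for cid in clusterid)


def kcluster_sketch(data, nclusters, npass=1, seed=0):
    """Repeat k-means npass times; return best clusterid, error and nfound."""
    rng = random.Random(seed)
    best_ids, best_error, nfound = None, None, 0
    for _ in range(npass):
        ids, error = kmeans_once(data, nclusters, rng)
        if best_ids is None or error < best_error - 1e-12:
            best_ids, best_error, nfound = ids, error, 1  # new best solution
        elif canonical(ids) == canonical(best_ids):
            nfound += 1  # the same best solution was found again
    return best_ids, best_error, nfound


data = [[1.0, 1.0], [1.2, 0.8], [0.8, 1.1],
        [8.0, 8.0], [8.2, 7.9], [7.9, 8.1]]
clusterid, error, nfound = kcluster_sketch(data, 2, npass=10)
```

With two well-separated groups like these, most or all passes should
end in the same partition, so nfound comes out close to npass. On
harder data a low nfound is the hint to increase npass.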


-- 
Michiel de Hoon, Assistant Professor
University of Tokyo, Institute of Medical Science
Human Genome Center
4-6-1 Shirokane-dai, Minato-ku
Tokyo 108-8639
Japan
http://bonsai.ims.u-tokyo.ac.jp/~mdehoon


From jeffrey_chang at stanfordalumni.org  Mon Apr  5 10:13:10 2004
From: jeffrey_chang at stanfordalumni.org (Jeffrey Chang)
Date: Mon Apr  5 10:18:11 2004
Subject: [BioPython] kMeans.py -> Bio.Cluster
In-Reply-To: <4070D492.20107@ims.u-tokyo.ac.jp>
References: <4070D492.20107@ims.u-tokyo.ac.jp>
Message-ID: <5E359854-870B-11D8-8D4C-000A956845CE@stanfordalumni.org>

On Apr 4, 2004, at 11:37 PM, Michiel Jan Laurens de Hoon wrote:

> 1. I have added a Deprecation Warning to the kMeans module and the 
> xkMeans module. Btw, it seems that the import statement in xkMeans.py 
> is no longer valid  (from Bio.Clustering import kMeans).
> 2. Below is my attempt at the DEPRECATED doc. Jeff, could you have a 
> look at this to see if there are any mistakes? Thanks!

This looks fantastic, Michiel!  It seems accurate and complete to me.

Jeff

From pieter at kotnet.org  Mon Apr  5 16:58:31 2004
From: pieter at kotnet.org (pieter@kotnet.org)
Date: Mon Apr  5 17:16:10 2004
Subject: [BioPython] Non blocking blast.
Message-ID: <87y8pagn0o.fsf@hades.kotnet.org>

Hello,

Is there a way to use biopython to submit, let's say, 100 jobs to blast
without waiting for them, storing the request ids, and then afterwards
reading the ids and checking which results are available?

Thanks in advance,

Pieter

From jeffrey_chang at stanfordalumni.org  Mon Apr  5 18:26:49 2004
From: jeffrey_chang at stanfordalumni.org (Jeffrey Chang)
Date: Mon Apr  5 18:32:21 2004
Subject: [BioPython] Non blocking blast.
In-Reply-To: <87y8pagn0o.fsf@hades.kotnet.org>
References: <87y8pagn0o.fsf@hades.kotnet.org>
Message-ID: <54AB7104-8750-11D8-B659-000A956845CE@stanfordalumni.org>

You may be able to use Bio.MultiProc.copen to do something like that.  
But you do realize that NCBI BLAST is a shared resource, right?

Jeff


On Apr 5, 2004, at 4:58 PM, pieter@kotnet.org wrote:

> Hello,
>
> Is there a way to use biopython to submit, let's say, 100 jobs to blast
> without waiting for them, storing the request ids, and then afterwards
> reading the ids and checking which results are available?
>
> Thanks in advance,
>
> Pieter

From dalke at dalkescientific.com  Tue Apr  6 01:00:49 2004
From: dalke at dalkescientific.com (Andrew Dalke)
Date: Tue Apr  6 01:07:18 2004
Subject: [BioPython] Non blocking blast.
In-Reply-To: <87y8pagn0o.fsf@hades.kotnet.org>
References: <87y8pagn0o.fsf@hades.kotnet.org>
Message-ID: <5F054E10-8787-11D8-B94E-000393C92466@dalkescientific.com>

pieter@kotnet.org:
> Is there a way to use biopython to submit, let's say, 100 jobs to blast
> without waiting for them, storing the request ids, and then afterwards
> reading the ids and checking which results are available?

NCBI won't like it if you do 100 BLASTs at once, but let's suppose
it's a hypothetical.

Biopython's BLAST looks like a function call.  That is, it hides
that it's doing network I/O.  The standard way to parallelize it
is to use threads, and for this the standard idiom is boss/worker.
One thread creates two Queue.Queue instances, one for job
requests and the other for job results.  It then starts up N
other threads, each of which knows about the Queues.  The boss
thread submits the jobs (as a simple data structure) to the
queue.  Each worker thread does a get on the queue to get the
next job and does the Biopython BLAST request.  When done, the
worker thread returns the information in the results Queue.
While waiting the boss thread can do whatever else is needed.
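
Roughly, the idiom looks like the sketch below. It is written for
current Python 3, where the module is called queue rather than Queue.
The run_blast function is a placeholder for the real, blocking Biopython
BLAST call, and the names worker and blast_all are made up for this
example. The number of workers is kept small on purpose.

```python
import queue
import threading


def run_blast(sequence):
    """Placeholder for the real, blocking BLAST request (hypothetical)."""
    return "result for %s" % sequence


def worker(jobs, results):
    """Take jobs off the queue until the None sentinel arrives."""
    while True:
        job = jobs.get()
        if job is None:
            break
        job_id, sequence = job
        try:
            results.put((job_id, run_blast(sequence)))
        except Exception as exc:  # report failures instead of losing the job
            results.put((job_id, exc))


def blast_all(sequences, nworkers=3):
    """Boss: hand out one job per sequence and collect results in order."""
    jobs, results = queue.Queue(), queue.Queue()
    threads = [threading.Thread(target=worker, args=(jobs, results))
               for _ in range(nworkers)]
    for t in threads:
        t.start()
    for job in enumerate(sequences):
        jobs.put(job)           # each job is a simple (id, sequence) tuple
    for _ in threads:
        jobs.put(None)          # one sentinel per worker to shut it down
    # The boss is free to do other work here; this sketch just waits.
    collected = dict(results.get() for _ in sequences)
    for t in threads:
        t.join()
    return [collected[i] for i in range(len(sequences))]


answers = blast_all(["acgt", "ttga", "ggcc", "atat"])
```

Results arrive in whatever order the workers finish, which is why each
job carries an id and the boss reorders them at the end.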

Aahz wrote some documentation about this idiom ... probably
   http://starship.python.net/crew/aahz/OSCON2001/

					Andrew
					dalke@dalkescientific.com

From fkauff at duke.edu  Tue Apr  6 10:32:56 2004
From: fkauff at duke.edu (Frank Kauff)
Date: Tue Apr  6 10:37:53 2004
Subject: [BioPython] Non blocking blast.
In-Reply-To: <5F054E10-8787-11D8-B94E-000393C92466@dalkescientific.com>
References: <87y8pagn0o.fsf@hades.kotnet.org>
	<5F054E10-8787-11D8-B94E-000393C92466@dalkescientific.com>
Message-ID: <1081261976.2059.13.camel@osiris.biology.duke.edu>

Folks,

On Tue, 2004-04-06 at 01:00, Andrew Dalke wrote:
> pieter@kotnet.org:
> > Is there a way to use biopython to submit, let's say, 100 jobs to blast
> > without waiting for them, storing the request ids, and then afterwards
> > reading the ids and checking which results are available?
> 
> NCBI won't like it if you do 100 BLASTs at once, but let's suppose
> it's a hypothetical.
> 
> Biopython's BLAST looks like a function call.  That is, it hides
> that it's doing network I/O.  The standard way to parallelize it
> is to use threads, and for this the standard idiom is boss/worker.
> One thread creates two Queue.Queue instances, one for job
> requests and the other for job results.  It then starts up N
> other threads, each of which knows about the Queues.  The boss
> thread submits the jobs (as a simple data structure) to the
> queue.  Each worker thread does a get on the queue to get the
> next job and does the Biopython BLAST request.  When done, the
> worker thread returns the information in the results Queue.
> While waiting the boss thread can do whatever else is needed.
> 

I have a little (crude) script ready that does that, blasting a fasta file
of sequences using threads. It can be useful for blasting a 96-well plate
of sequences overnight.
But be careful - as Jeff mentioned, blast is a shared resource:
- for each additional request in the blast queue, you'll get a 60 (or
so) seconds penalty from NCBI: 60s for the second, 120s for the third,
etc. That makes too many threads quite unattractive...
- If you start too many blasts in a short time, after hitting some limit
the only response will be a nice page saying 'Access denied due to
possible misuse', and your IP will be blocked from further access to
NCBI blast... You'll then have to write them a nice email and beg for
grace. It happened to me while testing some automated blast feature :-)
But the limit seems to be several hundred requests in about 24h, which is
quite a lot.
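
To put numbers on the first point, here is a small sketch of the
penalty arithmetic.  It assumes the delay grows by exactly 60 seconds
per queued request, which is only the rough figure quoted above and
not a documented NCBI rule; queue_penalty and total_penalty are names
invented for this example.

```python
def queue_penalty(position, step=60):
    # Penalty in seconds for the request at 1-based `position`:
    # none for the first, `step` for the second, 2 * step for the third.
    return step * (position - 1)


def total_penalty(n_requests, step=60):
    # Sum of the penalties over all queued requests.
    return sum(queue_penalty(i, step) for i in range(1, n_requests + 1))


if __name__ == "__main__":
    for n in (1, 3, 10):
        print(n, "requests:", total_penalty(n), "seconds of penalty")
```

Under that assumption the total grows quadratically with the number
of simultaneous requests, which is why a handful of threads is about
as far as it pays to go.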

If you're interested in the script, send me an email.

Frank

> Aahz wrote some documentation about this idiom ... probably
>    http://starship.python.net/crew/aahz/OSCON2001/
> 
> 					Andrew
> 					dalke@dalkescientific.com
> 
> _______________________________________________
> BioPython mailing list  -  BioPython@biopython.org
> http://biopython.org/mailman/listinfo/biopython
-- 
Frank Kauff
Dept. of Biology
Duke University
Box 90338
Durham, NC 27708
USA

Phone 919-660-7382
Fax 919-660-7293

From cook_jim at yahoo.com  Tue Apr  6 16:07:47 2004
From: cook_jim at yahoo.com (J Cook)
Date: Tue Apr  6 16:12:44 2004
Subject: [BioPython] numeric
Message-ID: <20040406200747.71663.qmail@web10507.mail.yahoo.com>

Hi,

I'm pulling together the dependencies for biopython
and need to know whether I should download "numeric"
or "numarray".

Thanks,
Jim Cook

=====
Email from:   Jim Cook
Reply to   :   cook_jim@yahoo.com   <or>   cookjim@ieee.org

From cook_jim at yahoo.com  Tue Apr  6 17:07:36 2004
From: cook_jim at yahoo.com (J Cook)
Date: Tue Apr  6 17:12:32 2004
Subject: [BioPython] Installation help
Message-ID: <20040406210736.56677.qmail@web10509.mail.yahoo.com>

Hi,

I've completed the installation of the biopython
dependencies and would like to run the rigorous tests
mentioned in the documentation.  However, I don't see
a "Tests" directory in the installation.  What do I
need to do to run the tests?

Thanks,
Jim Cook


=====
Email from:   Jim Cook
Reply to   :   cook_jim@yahoo.com   <or>   cookjim@ieee.org

From chapmanb at uga.edu  Tue Apr  6 17:44:26 2004
From: chapmanb at uga.edu (Brad Chapman)
Date: Tue Apr  6 17:54:59 2004
Subject: [BioPython] kMeans.py -> Bio.Cluster
In-Reply-To: <4070D492.20107@ims.u-tokyo.ac.jp>
References: <4070D492.20107@ims.u-tokyo.ac.jp>
Message-ID: <20040406214426.GA25784@evostick.agtec.uga.edu>

Hi Michiel;

> 1. I have added a Deprecation Warning to the kMeans module and the xkMeans 
> module. Btw, it seems that the import statement in xkMeans.py is no longer 
> valid (from Bio.Clustering import kMeans).
> 2. Below is my attempt at the DEPRECATED doc. Jeff, could you have a look 
> at this to see if there are any mistakes? Thanks!

Thanks for doing this! The deprecated documentation is excellent and
I definitely support the deprecation of xkMeans.py as well. From the
invalid import statement it is likely not in use much and if it
doesn't fit with the new system we might as well let it fall to the
side.

Glad to have things coming together nicely. I think I am going to
push for another release soonish (probably the week after next), so
if there is more work you want to do before then on the clustering
stuff please feel free to go ahead.

Thanks again.
Brad
From chapmanb at uga.edu  Tue Apr  6 18:39:02 2004
From: chapmanb at uga.edu (Brad Chapman)
Date: Tue Apr  6 18:49:36 2004
Subject: [BioPython] Installation help
In-Reply-To: <20040406210736.56677.qmail@web10509.mail.yahoo.com>
References: <20040406210736.56677.qmail@web10509.mail.yahoo.com>
Message-ID: <20040406223902.GE25784@evostick.agtec.uga.edu>

Hi Jim;
[Condensing both answers into a single mail]

> I'm pulling together the dependencies for biopython
> and need to know whether I should download "numeric"
> or "numarray".

Numeric. Although Numarray is supposed to be API compatible to some
extent with Numeric, I don't believe that anyone has fully tested
that out. Numeric should work for now -- at some point in the future
someone will likely go through and ensure everything works and we'll
move on to Numarray.

> I've completed the installation of the biopython
> dependencies and would like to run the rigorous tests
> mentioned in the documentation.  However, I don't see
> a "Tests" directory in the installation.  What do I
> need to do to run the tests?

If you've installed it from source, then inside the biopython-1.24
directory there should be a Tests directory. You can either change
to the Tests directory and do python run_tests.py (add --no-gui if
you don't want a Tk GUI) or just do python setup.py test from the
main installation directory.

If you are on Windows and installed using a Windows installer, you
need to download the source and unpack it to find the Tests.

Hope this helps.
Brad
From sbassi at asalup.org  Tue Apr  6 18:52:06 2004
From: sbassi at asalup.org (Sebastian Bassi)
Date: Tue Apr  6 18:58:12 2004
Subject: [BioPython] kMeans.py -> Bio.Cluster
In-Reply-To: <20040406214426.GA25784@evostick.agtec.uga.edu>
References: <4070D492.20107@ims.u-tokyo.ac.jp>
	<20040406214426.GA25784@evostick.agtec.uga.edu>
Message-ID: <40733496.2060609@asalup.org>

Brad Chapman wrote:
> Glad to have things coming together nicely. I think I am going to
> push for another release soonish (probably the week after next), so
> if there is more work you want to do before then on the clustering
> stuff please feel free to go ahead.

Please wait for another update of the Tm calculation. I received a bug 
report (seems to be a bad value in the deltaS table and some other minor 
stuff).

-- 
Best regards,

//=\ Sebastian Bassi - Diplomado en Ciencia y Tecnologia, UNQ   //=\
\=// IT Manager Advanta Seeds - Balcarce Research Center -      \=//
//=\ Pro secretario ASALUP - www.asalup.org - PGP key available //=\
\=// E-mail: sbassi@genesdigitales.com - ICQ UIN: 3356556 -     \=//

                 http://Bioinformatica.info
From dalke at dalkescientific.com  Tue Apr  6 23:14:13 2004
From: dalke at dalkescientific.com (Andrew Dalke)
Date: Tue Apr  6 23:20:31 2004
Subject: [BioPython] BOSC
Message-ID: <A5791618-8841-11D8-B94E-000393C92466@dalkescientific.com>

Hi all,

   Just a reminder -- only a bit over a week to submit a talk
proposal for BOSC.  In addition to the talks there will also
be lightning talks (5-7 minutes max) and a demo session for
showing off your software to others in a small group rather
than to everyone at once.

Hope to see many of you there!

					Andrew
					dalke@dalkescientific.com

From mdehoon at ims.u-tokyo.ac.jp  Wed Apr  7 08:19:27 2004
From: mdehoon at ims.u-tokyo.ac.jp (Michiel Jan Laurens de Hoon)
Date: Wed Apr  7 08:24:26 2004
Subject: [BioPython] Questions & suggestions
In-Reply-To: <EE17478A-7B7E-11D8-A886-000A956845CE@stanfordalumni.org>
References: <401AC30A.E639BD2E@ebc.uu.se>	<200403191023.01597.thamelry@binf.ku.dk>	<20040319173703.GC95219@evostick.agtec.uga.edu>	<200403192151.05004.thamelry@binf.ku.dk>	<20040321174605.GA18818@evostick.agtec.uga.edu>
	<EE17478A-7B7E-11D8-A886-000A956845CE@stanfordalumni.org>
Message-ID: <4073F1CF.8050809@ims.u-tokyo.ac.jp>

Jeffrey Chang wrote:
> SVM is superseded by libsvm.  It should be deprecated.

Is libsvm in Biopython? I couldn't find it there. I am now using Bio.SVM, which 
seems to have a problem with large data sets, so I'd like to try libsvm.

--Michiel.


-- 
Michiel de Hoon, Assistant Professor
University of Tokyo, Institute of Medical Science
Human Genome Center
4-6-1 Shirokane-dai, Minato-ku
Tokyo 108-8639
Japan
http://bonsai.ims.u-tokyo.ac.jp/~mdehoon

From jeffrey_chang at stanfordalumni.org  Wed Apr  7 09:25:51 2004
From: jeffrey_chang at stanfordalumni.org (Jeffrey Chang)
Date: Wed Apr  7 09:30:49 2004
Subject: [BioPython] Questions & suggestions
In-Reply-To: <4073F1CF.8050809@ims.u-tokyo.ac.jp>
References: <401AC30A.E639BD2E@ebc.uu.se>	<200403191023.01597.thamelry@binf.ku.dk>	<20040319173703.GC95219@evostick.agtec.uga.edu>	<200403192151.05004.thamelry@binf.ku.dk>	<20040321174605.GA18818@evostick.agtec.uga.edu>
	<EE17478A-7B7E-11D8-A886-000A956845CE@stanfordalumni.org>
	<4073F1CF.8050809@ims.u-tokyo.ac.jp>
Message-ID: <17306090-8897-11D8-91D5-000A956845CE@stanfordalumni.org>

libsvm is a general SVM library, not related to Biopython.
http://www.csie.ntu.edu.tw/~cjlin/libsvm/

It has a Python interface.

Jeff


On Apr 7, 2004, at 8:19 AM, Michiel Jan Laurens de Hoon wrote:

> Jeffrey Chang wrote:
>> SVM is superseded by libsvm.  It should be deprecated.
>
> Is libsvm in Biopython? I couldn't find it there. I am now using 
> Bio.SVM, which seems to have a problem with large data sets, so I'd 
> like to try libsvm.
>
> --Michiel.
>
>
> -- 
> Michiel de Hoon, Assistant Professor
> University of Tokyo, Institute of Medical Science
> Human Genome Center
> 4-6-1 Shirokane-dai, Minato-ku
> Tokyo 108-8639
> Japan
> http://bonsai.ims.u-tokyo.ac.jp/~mdehoon
>
> _______________________________________________
> BioPython mailing list  -  BioPython@biopython.org
> http://biopython.org/mailman/listinfo/biopython

From cook_jim at yahoo.com  Wed Apr  7 16:35:11 2004
From: cook_jim at yahoo.com (J Cook)
Date: Wed Apr  7 16:40:04 2004
Subject: [BioPython] Testing of installation
Message-ID: <20040407203511.64895.qmail@web10503.mail.yahoo.com>

Hi,

I can't find the file "run_tests.py" referred to in
the biopython documentation.  Is there another file
that will test the entire installation?

Thanks,
Jim Cook

=====
Email from:   Jim Cook
Reply to   :   cook_jim@yahoo.com   <or>   cookjim@ieee.org

From cook_jim at yahoo.com  Wed Apr  7 16:43:52 2004
From: cook_jim at yahoo.com (J Cook)
Date: Wed Apr  7 16:48:47 2004
Subject: [BioPython] Installation help
Message-ID: <20040407204352.29555.qmail@web10505.mail.yahoo.com>

Hi Brad,

That answers my questions.  I am on a Windows machine
and I used a Windows installer, so I'll download the
source to get the tests.

Thanks,
Jim Cook

=====
Email from:   Jim Cook
Reply to   :   cook_jim@yahoo.com   <or>   cookjim@ieee.org

From pieter at laeremans.org  Thu Apr  8 08:37:17 2004
From: pieter at laeremans.org (Pieter Laeremans)
Date: Thu Apr  8 08:42:36 2004
Subject: [BioPython] Non blocking blast.
References: <87y8pagn0o.fsf@hades.kotnet.org>
	<5F054E10-8787-11D8-B94E-000393C92466@dalkescientific.com>
	<1081261976.2059.13.camel@osiris.biology.duke.edu>
Message-ID: <87vfkaabnm.fsf@hades.kotnet.org>

Frank Kauff <fkauff@duke.edu> writes:

>
> I've a little (crude) script ready that does that, blasting a fasta file
> of sequences using threads. It can be useful for blasting a 96 plate of
> sequences overnight.
> But be careful - as Jeff mentioned, blast is a shared resource:
> - for each additional request in the blast queue, you'll get a 60 (or
> so) seconds penalty from NCBI: 60s for the second, 120s for the third,
> etc. Makes too many threads quite unattractive...
> - If you start too many blasts in a short time, after hitting some limit
> the only response will be a nice page saying 'Access denied due to
> possible misuse', and your IP will be blocked from further access to
> ncbi blast... You'll then have to write them a nice email and beg for
> grace. Happened to me while testing some automated blast feature :-) But
> the limit seems to be several 100 requests in like 24h, which is quite a
> lot.
>
> If you're interested in the script, send me an email.
>
> Frank


Thank you all very much for the input.  But I think I have no other
option than submitting one job at a time, since it is of utmost
importance that I do not get blocked.

kind regards,

Pieter

From biosql at hotmail.com  Thu Apr  8 16:31:14 2004
From: biosql at hotmail.com (Jonathan Boulais)
Date: Thu Apr  8 17:03:47 2004
Subject: [BioPython] Need help to get Fasta sequence of Gis !
Message-ID: <BAY13-F7478vryxZV3Z000526d2@hotmail.com>

An HTML attachment was scrubbed...
URL: http://portal.open-bio.org/pipermail/biopython/attachments/20040408/bd741961/attachment.htm
From idoerg at burnham.org  Thu Apr  8 17:07:08 2004
From: idoerg at burnham.org (Iddo Friedberg)
Date: Thu Apr  8 17:12:31 2004
Subject: [BioPython] Need help to get Fasta sequence of Gis !
In-Reply-To: <BAY13-F7478vryxZV3Z000526d2@hotmail.com>
References: <BAY13-F7478vryxZV3Z000526d2@hotmail.com>
Message-ID: <4075BEFC.5000303@burnham.org>

Welcome!

Since the list is huge, I guess you should do it standalone, rather than 
via the net.

How about downloading nr or np as the case may be from NCBI. The gi 
numbers should be in the fasta headers. Then use the fasta parser (see 
tutorial) to parse through the file, and retrieve those sequences which 
you need.
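
A rough sketch of that filtering step, in plain Python rather than
with the Biopython fasta parser from the tutorial.  It assumes the
headers look like ">gi|914034|gb|AAB32951.1| ..." (the gi number in
the second "|"-separated field); iter_fasta, gi_of and select_by_gi
are names made up for this example.

```python
def iter_fasta(lines):
    # Yield (header, sequence) pairs from an iterable of FASTA lines.
    header, chunks = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None and line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def gi_of(header):
    # Assumed header layout: gi|<number>|<db>|<accession>| description
    fields = header.split("|")
    if len(fields) >= 2 and fields[0] == "gi":
        return fields[1]
    return None


def select_by_gi(lines, wanted):
    # Keep only the records whose gi number is in `wanted`.
    wanted = set(wanted)
    for header, sequence in iter_fasta(lines):
        if gi_of(header) in wanted:
            yield header, sequence
```

Because it reads one line at a time, this can be pointed at an open
file handle on the downloaded database without loading the whole
thing into memory.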

email me if you need more info,

Iddo

Jonathan Boulais wrote:
> Hi everyone !
>  
> I'm a newbie to Biopython and I would like to get the fasta sequences of 
> a huge list of Gis. Any suggestions ?
>  
> Thanks !
> 
> _______________________________________________
> BioPython mailing list  -  BioPython@biopython.org
> http://biopython.org/mailman/listinfo/biopython

-- 
Iddo Friedberg, Ph.D.
The Burnham Institute
10901 N. Torrey Pines Rd.
La Jolla, CA 92037
USA
Tel: +1 (858) 646 3100 x3516
Fax: +1 (858) 713 9930
http://ffas.ljcrf.edu/~iddo
From dalke at dalkescientific.com  Thu Apr  8 17:15:37 2004
From: dalke at dalkescientific.com (Andrew Dalke)
Date: Thu Apr  8 17:21:53 2004
Subject: [BioPython] Need help to get Fasta sequence of Gis !
In-Reply-To: <BAY13-F7478vryxZV3Z000526d2@hotmail.com>
References: <BAY13-F7478vryxZV3Z000526d2@hotmail.com>
Message-ID: <E1DE8081-89A1-11D8-B418-000393C92466@dalkescientific.com>

Jonathan Boulais:
> Hi everyone !
> I'm a newbie to Biopython

Welcome!

> and I would like to get the fasta sequences of a huge list of Gis. Any 
> suggestions ?

How huge?  At some point it's better to just download GenBank and get
the data straight from there.

If it's small enough (10,000 or fewer records?), then look at the
Bio.EUtils client.

 >>> from Bio import EUtils
 >>> from Bio.EUtils import ThinClient
 >>> client = ThinClient.ThinClient()
 >>> dbids = EUtils.DBIds("protein", ["914034", "5263173", "1769808",
 ...                                  "1060883"])
 >>> f = client.efetch_using_dbids(dbids, retmode = "text",
 ...                               rettype = "fasta")
 >>> print f.read()
 >gi|914034|gb|AAB32951.1| cruxrhodopsin-2 [Haloarcula]
MLQSGMSTYVPGGESIFLWVGTAGMFLGMLYFIARGWSVSDQRRQKFYIATIMIAAIAFVNYLSMALGFG
VTTIELGGEERAIYWARYTDWLFTTPLLLYDLALLAGADRNTIYSLVGLDVLMIGTGALATLSAGSGVLP
AGAERLVWWGISTGFLLVLLYFLFSNLTDRASELSGDLQSKFSTLRNLVLVLWLVYPVLWLVGTEGLGLV
GLPIETAAFMVLDLTAKIGFGIILLQSHAVLDEGQTASEGAAVAD

 >gi|5263173|dbj|BAA81816.1| cruxrhodopsin [Haloarcula japonica]
MPEPGSEAIWLWLGTAGMFLGMLYFIGRGWGETDSRRQKFYIATILITAIAFVNYLAMALGFGLTIVEFA
GEEHPIYWARYSDWLFTTPLLLYDLGLLAGADRNTIASLVSLDVLMIGTGLVATLSAGSGVLSAGAERLV
WWGISTAFLLVLLYFLFSSLSGRVADLPSDTRSTFKTLRNLVTVVWLVYPVWWLIGTEGLGLVGIGIETA
GFMVIDLTAKVGFGIILLRSHGVLDGAAETTGAGATATAD

 >gi|1769808|dbj|BAA06680.1| cruxrhodopsin-3 [Haloarcula vallismortis]
MPAPEGEAIWLWLGTAGMFLGMLYFIARGWGETDSRRQKFYIATILITAIAFVNYLAMALGFGLTIVEIA
GEQRPIYWARYSDWLFTTPLLLYDLGLLAGADRNTISSLVSLDVLMIGTGLVATLSAGSGVLSAGAERLV
WWGISTAFLLVLLYFLFSSLSGRVADLPSDTRSTFKTLRNLVTVVWLVYPVWWLVGTEGIGLVGIGIETA
GFMVIDLVAKVGFGIILLRSHGVLDGAAETTGAGATATAD

 >gi|1060883|dbj|BAA06678.1| cruxrhodopsin-1 [Haloarcula argentinensis]
MPEPGSEAIWLWLGTAGMFLGMLYFIARGWGETDSRRQKFYIATILITAIAFVNYLAMALGFGLTIVEFA
GEEHPIYWARYSDWLFTTPLLLYDLGLLAGADRNTITSLVSLDVLMIGTGLVATLSPGSGVLSAGAERLV
WWGISTAFLLVLLYFLFSSLSGRVADLPSDTRSTFKTLRNLVTVVWLVYPVWWLIGTEGIGLVGIGIETA
GFMVIDLTAKVGFGIILLRSHGVLDGAAETTGTGATPADD


I'm working on a cleanup of EUtils to make some of the machinery
disappear.  I expect the result will let you do

import EUtils
f = EUtils.efetch("protein", ["914034", "5263173", "1769808", "1060883"],
                  format = "fasta")
print f.read()

Is anyone here using EUtils?  I would like to see some code which
uses it, to make sure I don't break things and to see if I can
improve the API.

					Andrew
					dalke@dalkescientific.com

From chapmanb at uga.edu  Thu Apr  8 17:15:42 2004
From: chapmanb at uga.edu (Brad Chapman)
Date: Thu Apr  8 17:26:08 2004
Subject: [BioPython] Need help to get Fasta sequence of Gis !
In-Reply-To: <4075BEFC.5000303@burnham.org>
References: <BAY13-F7478vryxZV3Z000526d2@hotmail.com>
	<4075BEFC.5000303@burnham.org>
Message-ID: <20040408211542.GD63800@evostick.agtec.uga.edu>

Hey Jonathan, Iddo;

Jonathan:
> >I'm a newbie to Biopython and I would like to get the fasta sequences of 
> >a huge list of Gis. Any suggestions ?

Iddo:
> Since the list is huge, I guess you should do it standalone, rather than 
> via the net.

That's the best idea. But if you want to do it by the web and it is
feasible (depends a lot on your definition of huge), you can use the
Biopython EUtils interface. If your list of gis is in a variable
called my_gis, you could do this like:

from Bio.EUtils import DBIds
from Bio.EUtils import DBIdsClient

# assuming they are GIs for DNA sequence
db_ids = DBIds("nucleotide", my_gis)
# use the DBIdsClient module imported above
eutils_client = DBIdsClient.from_dbids(db_ids)
fasta_handle = eutils_client.efetch(retmode = "text", 
                                    rettype = "fasta")
# save the returned FASTA text to a local file
output_handle = open("my_output.fasta", "w")
output_handle.write(fasta_handle.read())
output_handle.close()

EUtils is pretty nice at giving you back a lot of sequences, so that
might work for you.

Best of luck.
Brad
From chapmanb at uga.edu  Thu Apr  8 17:27:48 2004
From: chapmanb at uga.edu (Brad Chapman)
Date: Thu Apr  8 17:38:15 2004
Subject: [BioPython] BOSC
In-Reply-To: <A5791618-8841-11D8-B94E-000393C92466@dalkescientific.com>
References: <A5791618-8841-11D8-B94E-000393C92466@dalkescientific.com>
Message-ID: <20040408212748.GE63800@evostick.agtec.uga.edu>

Hey all;

Andrew:
>   Just a reminder -- only a bit over a week to submit a talk
> proposal for BOSC.  In addition to the talks there will also
> be lightning talks (5-7 minutes max) and a demo session for
> showing off your software to others in a small group rather
> than to everyone at once.

Thanks for the reminder. I'd definitely like to encourage people to
give Python/Biopython related talks. If you wrote (or are nearly
done writing :-) a module for Biopython the lightning talks are a
great chance to show it off.

We'd also like to try and get an official Biopython talk (a longer
talk with extensive examples of Biopython and its usage) as BOSC is
a great chance to let people know about the kind of things Biopython
does well. 

I'd like to offer up the talk to any of the regular contributors to
Biopython. Although I'm now the "official" coordinator, I've also
given the talk the past two years and would like to have a fresh
perspective on it from someone. Additionally, BOSC falls right after
graduation deadlines for me this year so I will be quite busy.

Is anyone interested? Basically, it would involve writing up an
abstract by the deadline (May 5th, from the BOSC website:
http://open-bio.org/bosc2004/) and then getting together the talk.
There are several sample talks on the Biopython documentation page
to give an idea of what they are like:

http://www.biopython.org/documentation/

Well, that's my plea for good Biopython representation at the BOSC
conference this year. Pretty persuasive, eh?

Brad
From Nicolas.Chauvat at logilab.fr  Sat Apr 10 16:59:17 2004
From: Nicolas.Chauvat at logilab.fr (Nicolas Chauvat)
Date: Sat Apr 10 17:04:16 2004
Subject: [BioPython] Europython: Registration open - Talk submission
	deadline is Apr 15th
Message-ID: <20040410205917.GA10699@logilab.fr>

Dear Scientific Pythonistas,

I'm forwarding this update about the EuroPython conference. Please
note that the talk submission deadline is April 15th but will probably
be extended a bit. I suppose most of you will be interested in the
Science Track at least.

For previous years' program, please refer to :

	http://www.europython.org/2002/sessions/talks
	http://www.europython.org/2003/sessions/talks

---------------------------------------------------------------------
Subject: [EuroPython] Europython Update: Registration open

Europython Update
=================

- Registration is now open. We apologise for the delay, but we have
had some technical problems.

- Due to this, we have decided to keep the submission of abstracts
for the refereed track open for one more day. Last submission time is
now on Sunday 11 April at 23.59 CET.

- We have a limited number of beds available in very affordable
accommodation near the conference venue. Book early before it runs out.

- We are still receiving submissions for regular talks and tutorials.
Closing date is 15 April.

- There is now a wiki at the Europython website for sprint organising. 
Start planning!

About the conference
====================

EuroPython 2004 will be held 7-9 June in Göteborg, Sweden.

The EuroPython conference will have tracks for Science, Business,
Education, Applications, Frameworks, Zope and the Python language
itself. Lightning talks, Open Space and BOF sessions are also planned.
There will be tutorials as well, both for newcomers to Python and
Python users interested in special subjects. In the days before and
after the conference, programming sprints will be arranged.


Important dates
===============

Refereed paper proposals: until 11 April.
Submission of talks: 1 March - 15 April.
Early Bird registration: 9 April - 1 May.
Accommodation booking: 9 April - 1 May (or until space runs out)

More information at http://www.europython.org.

---------------------------------------------------------------------

Hope to see you there.


-- 
Nicolas Chauvat

logilab.fr - services en informatique avancée et gestion de connaissances
From dalke at dalkescientific.com  Sun Apr 11 03:36:35 2004
From: dalke at dalkescientific.com (Andrew Dalke)
Date: Sun Apr 11 03:42:39 2004
Subject: [BioPython] Europython: Registration open - Talk submission
	deadline is Apr 15th
In-Reply-To: <20040410205917.GA10699@logilab.fr>
References: <20040410205917.GA10699@logilab.fr>
Message-ID: <F5F11646-8B8A-11D8-B418-000393C92466@dalkescientific.com>

EuroPython is in Göteborg, Sweden this year, which just happens to be
where one of my clients is located.  I'm planning to be at EuroPython
and am planning to submit a talk along the lines of
   "Python Libraries for Chemistry and Biology"
which will cover Biopython, PyDaylight, OEChem, and probably a few
others (MMTK, PyQuante).  If I make it "Python Software for ..."
then I'll mention PyMol and a few more applications.

As such, I think it will be a 45 minute talk.  However, if someone
else here is planning to be at EuroPython and wants to talk about
one or more of these then let me know so I can adjust my proposal
accordingly.

					Andrew
					dalke@dalkescientific.com


From yong27 at bioinfo.sarang.net  Mon Apr 12 03:34:43 2004
From: yong27 at bioinfo.sarang.net (Hyung-Yong Kim)
Date: Mon Apr 12 03:39:35 2004
Subject: [BioPython] parsing error in Bio.Sequencing.Ace
Message-ID: <KOEALHFMKEJANKOPIOOIGECFCAAA.yong27@bioinfo.sarang.net>

Hi,

I'm parsing the ACE file made by phrap. But there is some parsing error.

$ cat aceToAbs.py 
#!/home/yong27/bin/python
import sys
from Bio.Sequencing import Ace
for contig in Ace.Iterator(sys.stdin, Ace.RecordParser()):
    sys.stdout.write(contig.contig_name+'-----------\n')
    for af in contig.af:
        sys.stdout.write(af.name+' '+str(af.padded_start)+'\n')
    
$ ./aceToAbs.py < 040407.fasta.screen.ace.1 > out
Traceback (most recent call last):
  File "./aceToAbs.py", line 6, in ?
    for contig in Ace.Iterator(sys.stdin,Ace.RecordParser()):
  File "/home/yong27/python/lib/python2.3/site-packages/Bio/Sequencing/Ace.py", line 206, in next
    return self._parser.parse(File.StringHandle(data))
  File "/home/yong27/python/lib/python2.3/site-packages/Bio/Sequencing/Ace.py", line 225, in parse
    self._scanner.feed(uhandle, self._consumer)
  File "/home/yong27/python/lib/python2.3/site-packages/Bio/Sequencing/Ace.py", line 315, in feed
    self._scan_record(handle, consumer)
  File "/home/yong27/python/lib/python2.3/site-packages/Bio/Sequencing/Ace.py", line 363, in _scan_record
    read_and_call(uhandle,consumer.ct_start,start='CT')
  File "/home/yong27/python/lib/python2.3/site-packages/Bio/ParserSupport.py", line 301, in read_and_call
    method(line)
  File "/home/yong27/python/lib/python2.3/site-packages/Bio/Sequencing/Ace.py", line 502, in ct_start
    raise SyntaxError, 'CT tag does not start with CT{'
SyntaxError: CT tag does not start with CT{

The problem seems to be that the parser does not distinguish between a 'CT' tag record and a sequence line that begins with the bases 'CT' ('^CT').
A temporary solution is to modify line 362 of Ace.py:

if line.startswith('CT'):    --> if line.startswith('CT{'):

It needs to be modified.
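
A small illustration of why the stricter test helps.  This is not
the code in Ace.py, only a sketch of the two checks side by side;
naive_is_ct_tag and is_ct_tag are hypothetical names.

```python
def naive_is_ct_tag(line):
    # Also matches sequence lines that happen to begin with C, T.
    return line.startswith("CT")


def is_ct_tag(line):
    # Only matches the opening of a consensus tag record.
    return line.startswith("CT{")


if __name__ == "__main__":
    for line in ("CT{", "CTGGATCCAAGT", "CO Contig1 100 2 1 U"):
        print(repr(line), naive_is_ct_tag(line), is_ct_tag(line))
```

A sequence line such as CTGGATCCAAGT passes the first test but not
the second, which matches the error in the traceback.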

Hyung-Yong Kim
---------------------------------
National Livestock Research Institute (Korea)
Division of Animal Genomics & Bioinformatics
http://bioinfo.sarang.net


From fkauff at duke.edu  Mon Apr 12 09:01:26 2004
From: fkauff at duke.edu (Frank Kauff)
Date: Mon Apr 12 09:06:15 2004
Subject: [BioPython] parsing error in Bio.Sequencing.Ace
In-Reply-To: <KOEALHFMKEJANKOPIOOIGECFCAAA.yong27@bioinfo.sarang.net>
References: <KOEALHFMKEJANKOPIOOIGECFCAAA.yong27@bioinfo.sarang.net>
Message-ID: <1081774886.2059.4.camel@osiris.biology.duke.edu>

Hi Hyung-Yong Kim,


On Mon, 2004-04-12 at 03:34, Hyung-Yong Kim wrote:

...

> The problem seems to be that the parser does not distinguish between a 'CT' tag record and a sequence line that begins with the bases 'CT' ('^CT').
> A temporary solution is to modify line 362 of Ace.py:
> 
> if line.startswith('CT'):    --> if line.startswith('CT{'):
> 
> It needs to be modified.
> 

It does! In theory, the parser should not run into this problem - are
you using the newest Ace.py from cvs? The ace parser has been updated
since the recent 'official' release of biopython. Anyway, please send me
your input file so I can take a closer look at why this happens and
fix it.

Frank


> Hyung-Yong Kim
> ---------------------------------
> National Livestock Research Institute (Korea)
> Division of Animal Genomics & Bioinformatics
> http://bioinfo.sarang.net
> 
> 
> _______________________________________________
> BioPython mailing list  -  BioPython@biopython.org
> http://biopython.org/mailman/listinfo/biopython
-- 
Frank Kauff
Dept. of Biology
Duke University
Box 90338
Durham, NC 27708
USA

Phone 919-660-7382
Fax 919-660-7293

From dalke at dalkescientific.com  Mon Apr 12 19:13:42 2004
From: dalke at dalkescientific.com (Andrew Dalke)
Date: Mon Apr 12 19:19:44 2004
Subject: [BioPython] BOSC
In-Reply-To: <20040408212748.GE63800@evostick.agtec.uga.edu>
References: <A5791618-8841-11D8-B94E-000393C92466@dalkescientific.com>
	<20040408212748.GE63800@evostick.agtec.uga.edu>
Message-ID: <0A6B4443-8CD7-11D8-B418-000393C92466@dalkescientific.com>

Me:
>   Just a reminder -- only a bit over a week to submit a talk
> proposal for BOSC.

Oops!  As a couple people pointed out to me, the deadline for
abstract submission is May 5, not this week.

(I was looking at last year's announcement for the date.  *chagrin*)

So another few weeks to go.

					Andrew
					dalke@dalkescientific.com

From yong27 at bioinfo.sarang.net  Mon Apr 12 21:44:41 2004
From: yong27 at bioinfo.sarang.net (Hyung-Yong Kim)
Date: Mon Apr 12 21:49:30 2004
Subject: [BioPython] parsing error in Bio.Sequencing.Ace
In-Reply-To: <1081774886.2059.4.camel@osiris.biology.duke.edu>
Message-ID: <KOEALHFMKEJANKOPIOOICECHCAAA.yong27@bioinfo.sarang.net>

Hi Frank Kauff

Thanks for your concern.

> It does! In theory, the parser should not run into this problem - are
> you using the newest ace.py from cvs? The ace parser has been updated
> after the recent 'official' release of biopython. Anyway, please send me
> your input file so I can have a closer look why this happens and I can
> fix it.

There are no problems parsing them when I use the newest Ace.py from CVS.
I had been using revision 1.3 (in CVS); the newest is revision 1.5.
I could see their diffs in
http://cvs.biopython.org/cgi-bin/viewcvs/viewcvs.cgi/biopython/Bio/Sequencing/Ace.py.diff?r1=1.3&r2=1.5&cvsroot=biopython

But I also modified Ace.py, because the Iterator class does not support __iter__.
I would like to know why __iter__ is deprecated, since it is very convenient.

Hyung-Yong Kim


> Frank Kauff
> Dept. of Biology
> Duke University
> Box 90338
> Durham, NC 27708
> USA

> Phone 919-660-7382
> Fax 919-660-7293


Hyung-Yong Kim
---------------------------------
National Livestock Research Institute (Korea)
Division of Animal Genomics & Bioinformatics
http://bioinfo.sarang.net

From pieter at laeremans.org  Tue Apr 13 09:59:59 2004
From: pieter at laeremans.org (Pieter Laeremans)
Date: Tue Apr 13 10:04:58 2004
Subject: [BioPython] Parsing error when parsing a blast report
Message-ID: <874qrokmg0.fsf@hades.kotnet.org>

Hello,

I've tried to parse some output I got through NCBIWWW.Blast.  But I
receive the following error:


>>> ## working on region in file /tmp/python-8118EHE...
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/tmp/python-8118EHE", line 14, in ?
    record = parser.parse(b_results)
  File "/usr/lib/python2.3/site-packages/Bio/Blast/NCBIWWW.py", line 48, in parse
    self._scanner.feed(handle, self._consumer)
  File "/usr/lib/python2.3/site-packages/Bio/Blast/NCBIWWW.py", line 97, in feed
    has_re=re.compile(r'<b>.?BLAST'))
  File "/usr/lib/python2.3/site-packages/Bio/ParserSupport.py", line 335, in read_and_call_until
    line = safe_readline(uhandle)
  File "/usr/lib/python2.3/site-packages/Bio/ParserSupport.py", line 411, in safe_readline
    raise SyntaxError, "Unexpected end of stream."
SyntaxError: Unexpected end of stream.
>>> 

---------------------------------------

This is the script I used to get this result.  I have no idea what's
wrong. Does anyone have a clue? Thanks in advance,

Pieter


---------------------------------------

# Import the module by name; the star imports did not reliably bind NCBIWWW.
from Bio.Blast import NCBIWWW

# here used to be a longer sequence
sequence = "MKIPNIGNVMNKFEILGVVGEGAYGVVLKCRHKETHEIV"

# Submit the query to NCBI, then parse the returned report.
b_results = NCBIWWW.blast(program='tblastn', database='nr',
                          query=sequence, expect=0.001,
                          entrez_query='Canis familiaris[ORGN]')
parser = NCBIWWW.BlastParser()
record = parser.parse(b_results)

From bertrand.frottier at free.fr  Tue Apr 13 14:46:24 2004
From: bertrand.frottier at free.fr (Bertrand FROTTIER)
Date: Tue Apr 13 14:51:08 2004
Subject: [BioPython] Parsing error when parsing a blast report
In-Reply-To: <874qrokmg0.fsf@hades.kotnet.org>
References: <874qrokmg0.fsf@hades.kotnet.org>
Message-ID: <407C3580.6020703@free.fr>

I had the same problem last week.
I suppose it's because NCBI is using BLAST 2.2.8 now, and the parser 
is not up to date yet. So I started working on a parser for the XML 
output. It's not completely finished yet (it is missing multiple_alignment, 
support for PSI-BLAST, and a lot of testing) but I can send you the file 
if you want.

Pieter Laeremans wrote:
> Hello,
> 
> I've tried to parse some output I got through NCBIWWW.Blast.  But I
> receive the following error:
> 
> 
> 
>>>>## working on region in file /tmp/python-8118EHE...
> 
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
>   File "/tmp/python-8118EHE", line 14, in ?
>     record = parser.parse(b_results)
>   File "/usr/lib/python2.3/site-packages/Bio/Blast/NCBIWWW.py", line 48, in parse
>     self._scanner.feed(handle, self._consumer)
>   File "/usr/lib/python2.3/site-packages/Bio/Blast/NCBIWWW.py", line 97, in feed
>     has_re=re.compile(r'<b>.?BLAST'))
>   File "/usr/lib/python2.3/site-packages/Bio/ParserSupport.py", line 335, in read_and_call_until
>     line = safe_readline(uhandle)
>   File "/usr/lib/python2.3/site-packages/Bio/ParserSupport.py", line 411, in safe_readline
>     raise SyntaxError, "Unexpected end of stream."
> SyntaxError: Unexpected end of stream.
> 
> 
> ---------------------------------------
> 
> This is the script I used to get this result.  I have no idea what 's
> wrong. Has anyone a clue ? thanks in advance,
> 
> Pieter
> 
> 
> ---------------------------------------
> 
> from Bio.Blast import *
> from Bio.Blast.NCBIWWW import *
> 
> # here used to be a longer sequence
> sequence="MKIPNIGNVMNKFEILGVVGEGAYGVVLKCRHKETHEIV"
> 
> b_results = NCBIWWW.blast(program='tblastn',  database='nr', query=sequence, expect=0.001, entrez_query='Canis familiaris[ORGN]')
> parser = NCBIWWW.BlastParser()
> record = parser.parse(b_results)
> 

-- 
   ()
   (o_
   //\
<-V_/_ Demonic penguin


Win98 is called Win98 because 98 is the number of bugs occurring right after
inserting the CD.

From pieter at laeremans.org  Wed Apr 14 06:19:41 2004
From: pieter at laeremans.org (Pieter Laeremans)
Date: Wed Apr 14 06:24:29 2004
Subject: [BioPython] Transforming XML output in HTML ?
Message-ID: <871xmqdfpe.fsf@laeremans.org>

Hello,

I find it very useful to inspect the output from blast by hand from
time to time.  Therefore it would be nice if the XML output could be
converted to HTML.

Is there an XSLT stylesheet or something similar available somewhere that
makes this possible?

thanks in advance,

Pieter

From letondal at pasteur.fr  Wed Apr 14 12:39:16 2004
From: letondal at pasteur.fr (Catherine Letondal)
Date: Wed Apr 14 12:43:59 2004
Subject: [BioPython] Martel not installed in MacOSX biopython distribution ?
Message-ID: <200404141639.i3EGdGRw206939@electre.pasteur.fr>


Hi,

We have a recent Biopython installation (biopython-1.24.tar.gz) on a Mac OS X platform.

When using the parse function from module Bio.Clustalw, we get an error message saying 
that Martel is missing. Otherwise, all the biopython modules work well.
Also, when pointing to the Martel link (http://www.biopython.org/~dalke/Martel/
and http://www.bioinformatics.org/bradstuff/bp/api/Martel/index.html), we
get a 'not found' message.

Is there a known problem of Martel module missing when installing biopython?

Thanks in advance!
Please also include my colleague edeveaud@pasteur.fr in the reply :-)

Best,

-- 
Catherine Letondal -- Pasteur Institute Computing Center
From dalke at dalkescientific.com  Thu Apr 15 06:08:01 2004
From: dalke at dalkescientific.com (Andrew Dalke)
Date: Thu Apr 15 06:12:43 2004
Subject: [BioPython] NBN tutorial slides
Message-ID: <C73FBAA8-8EC4-11D8-B418-000393C92466@dalkescientific.com>

Two months ago I taught an intro. to programming class as
part of a bioinformatics course organized by the National
Bioinformatics Network in South Africa.  It was two weeks
long, two hours a day, six days a week, and all in Python.

More information about the class and my slides are now
available at http://www.dalkescientific.com/writings/NBN/

					Andrew
					dalke@dalkescientific.com

From chapmanb at uga.edu  Sun Apr 18 09:47:13 2004
From: chapmanb at uga.edu (Brad Chapman)
Date: Sun Apr 18 13:55:24 2004
Subject: [BioPython] Parsing error when parsing a blast report
In-Reply-To: <407C3580.6020703@free.fr>
References: <874qrokmg0.fsf@hades.kotnet.org> <407C3580.6020703@free.fr>
Message-ID: <20040418134713.GA8067@misterbd.agtec.uga.edu>

Hello Pieter and Bertrand;

Pieter:
> >I've tried to parse some output I got through NCBIWWW.Blast.  But I
> >receive the following error:
[...]
> >SyntaxError: Unexpected end of stream.
[...]
> >This is the script I used to get this result.  I have no idea what 's
> >wrong. Has anyone a clue ? thanks in advance,

The problem isn't the parser, but rather BLASTing against NCBI. This
problem is fixed in CVS and will be in the next release -- Catherine
reported it back in March:

http://portal.open-bio.org/pipermail/biopython/2004-March/001903.html

So you can either use CVS, or the quick fix is to change:

b_results = NCBIWWW.blast(program='tblastn',  database='nr', 
query=sequence, expect=0.001, entrez_query='Canis familiaris[ORGN]')

to:

b_results = NCBIWWW.blast(program='tblastn',  database='nr', 
query=sequence, expect=0.001, entrez_query='Canis familiaris[ORGN]',
format_type = "HTML")
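
One way to confirm this diagnosis before parsing is to look at what the server actually returned. The following is only a minimal sketch: it reuses the regular expression visible in the traceback (`<b>.?BLAST`), the helper name `looks_like_html_blast` is made up for this example and is not part of Biopython, and the two sample replies are invented. It is written for a current Python 3 rather than the Python 2.3 used in this thread.

```python
import re

# Pattern copied from the traceback: the HTML scanner keeps reading lines
# until it finds one that matches this expression.
BLAST_HTML_MARKER = re.compile(r'<b>.?BLAST')


def looks_like_html_blast(text):
    """Return True if any line carries the marker the HTML scanner looks for.

    Hypothetical helper for illustration, not a Biopython function.
    """
    return any(BLAST_HTML_MARKER.search(line) for line in text.splitlines())


# Invented sample replies: one HTML-formatted, one plain text.
html_reply = "<HTML>\n<b>TBLASTN 2.2.8</b>\n"
plain_reply = "TBLASTN 2.2.8\n\nQuery= ...\n"

print(looks_like_html_blast(html_reply))
print(looks_like_html_blast(plain_reply))
```

If the check fails on a real reply, the HTML parser would presumably read to the end of the stream without finding its marker, which matches the "Unexpected end of stream" error.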

Bertrand:
> So I started working on a parser for the XML 
> output. It's not completely over yet (missing multiple_alignment, 
> support of PSI-BLAST, and a lot of testing) but I can send you the file 
> if you want.

If you get this finished and working, we would definitely be willing
to accept it into Biopython -- a parser for XML blast output is
currently missing but desired.
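
For anyone wondering what the core of such a parser might look like, here is a rough sketch using only the standard library. It is not Bertrand's code. It assumes the element names used in NCBI's BLAST XML output (`Hit`, `Hit_accession`, `Hit_def`, `Hsp_evalue`); the sample document and the function name `summarize_hits` are made up for illustration, and it targets a current Python 3.

```python
import xml.etree.ElementTree as ET

# Invented sample, shaped like NCBI BLAST XML output (assumption).
SAMPLE = """<BlastOutput>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_hits>
        <Hit>
          <Hit_accession>EXAMPLE_1</Hit_accession>
          <Hit_def>made-up hit description</Hit_def>
          <Hit_hsps>
            <Hsp><Hsp_evalue>1e-05</Hsp_evalue></Hsp>
          </Hit_hsps>
        </Hit>
      </Iteration_hits>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>"""


def summarize_hits(xml_text):
    """Return (accession, description, best e-value) for every Hit element."""
    root = ET.fromstring(xml_text)
    hits = []
    for hit in root.iter("Hit"):
        # A hit can carry several HSPs; keep the smallest e-value.
        evalues = [float(e.text) for e in hit.iter("Hsp_evalue")]
        hits.append((hit.findtext("Hit_accession"),
                     hit.findtext("Hit_def"),
                     min(evalues) if evalues else None))
    return hits


for accession, description, evalue in summarize_hits(SAMPLE):
    print(accession, description, evalue)
```

A real parser would also need alignments and PSI-BLAST rounds, which are exactly the parts listed as unfinished above.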

Hope this helps -- let us know if that doesn't fix the problem or
there are any other questions.
Brad
From chapmanb at uga.edu  Sun Apr 18 09:53:39 2004
From: chapmanb at uga.edu (Brad Chapman)
Date: Sun Apr 18 14:01:49 2004
Subject: [BioPython] parsing error in Bio.Sequencing.Ace
In-Reply-To: <1082038875.2059.11.camel@osiris.biology.duke.edu>
References: <KOEALHFMKEJANKOPIOOICECHCAAA.yong27@bioinfo.sarang.net>
	<1082038875.2059.11.camel@osiris.biology.duke.edu>
Message-ID: <20040418135339.GB8067@misterbd.agtec.uga.edu>

Hi Hyung-Yong and Frank;
Thanks for handling the Ace parser questions -- glad the ol' parser
is still cooking along for everyone.

> Oh, there's also a typo in line 576, where it should read 'Missing
> header line in WA tag' instead of '... CT tag' :-(

Fixed in CVS.

> > But, I also modified ace.py because Iterator class does not support __iter__. 
> > I want to know why __iter__ is deprecated although it is very convenient.
> > 
> Good question. I can't remember having removed __iter__ by myself on
> purpose? Maybe I deleted it accidentally? Brad, do you know why
> __iter__ is out of the Iterator class? 

I'm not sure -- I guess it must have been left out accidentally.
Either way, I just added it back into CVS, so it should be there for
the next release.
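
For reference, supporting `for` loops on a class that already has a `next()` method returning None at the end takes one extra method. The sketch below is a stand-in, not the code from Ace.py: the class name `RecordIterator` and the list-backed storage are assumptions made so the example runs on its own.

```python
class RecordIterator:
    """Hand out one record per next() call and None when exhausted."""

    def __init__(self, records):
        self._records = list(records)
        self._pos = 0

    def next(self):
        """Return the next record, or None when there are no more."""
        if self._pos >= len(self._records):
            return None
        record = self._records[self._pos]
        self._pos += 1
        return record

    def __iter__(self):
        # iter(callable, sentinel) calls self.next repeatedly and stops
        # as soon as it returns the sentinel value None.
        return iter(self.next, None)


for contig in RecordIterator(["contig1", "contig2"]):
    print(contig)
```

The two-argument form of `iter()` means the existing `next()` logic does not have to change at all.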

Just let me know if there are any other changes or patches. Thanks
again.

Brad
From chapmanb at uga.edu  Sun Apr 18 10:24:25 2004
From: chapmanb at uga.edu (Brad Chapman)
Date: Sun Apr 18 14:33:15 2004
Subject: [BioPython] Martel not installed in MacOSX biopython distribution
	?
In-Reply-To: <200404141639.i3EGdGRw206939@electre.pasteur.fr>
References: <200404141639.i3EGdGRw206939@electre.pasteur.fr>
Message-ID: <20040418142425.GD8067@misterbd.agtec.uga.edu>

Hi Catherine;

> We have a recent biopython installation (biopython-1.24.tar.gz) on a MacOsX platform.
> 
> When using the parse function from module Bio.Clustalw, we get an error message saying 
> that Martel is missing. Otherwise, all the biopython modules work well.
> Also, when pointing to the Martel link (http://www.biopython.org/~dalke/Martel/
> and http://www.bioinformatics.org/bradstuff/bp/api/Martel/index.html), we
> get a 'not found' message.
> 
> Is there a known problem of Martel module missing when installing biopython?

Yes, I did clean up some bad code I wrote a while back in the
Clustalw module -- as of revision 1.8 of
Bio/Clustalw/clustal_format.py in CVS (since March). I had an import
check there from way back in the days when Martel was distributed
separately from Biopython, hence the bad URL as well. So this should
behave better in future releases -- sorry for the confusion with
that.

As to the symptoms of your problems, Martel should be installed by 
default with Biopython -- you can check this using:

>>> import Martel

at a python prompt. If all other Biopython modules are working and
Clustalw is not, then it is very likely that Martel is installed but
something else may be the problem.

So I guess the best way to proceed and check out the problem is:

1. Does 'import Martel' work? If not, what kind of error message are
you getting? The only current known problems with 1.24 and Martel
installation are that it can be a little strange if there is an old
Martel instance already installed into site-packages. This should be
fixed in the next release, but the quick fix is to remove
site-packages/Martel and install again.

2. Try and get the latest changes in Clustalw/clustal_format.py and
install this in place of the 1.24 version. Hopefully then you should
at least see the problem with the import Martel call and we can
diagnose the problem further.
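
Step 1 can also be done without triggering the import, which helps when a stale copy in site-packages is the suspect. This is a small sketch for a current Python 3; the helper name `locate_module` is made up here. On the Python 2.3 of this thread, `import Martel` followed by `print Martel.__file__` gives the same information.

```python
import importlib.util


def locate_module(name):
    """Return the file a top-level module would be loaded from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None  # not installed anywhere on sys.path
    return spec.origin


# Prints a path if Martel is installed, None otherwise.
print(locate_module("Martel"))
```

If the path points at an old site-packages/Martel directory rather than the one shipped with Biopython, removing it and reinstalling is the quick fix described in step 1.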

I hope this helps. Sorry about any problems and be sure to write
again if this doesn't clear things up.

Brad

From lpritc at scri.sari.ac.uk  Mon Apr 19 08:34:56 2004
From: lpritc at scri.sari.ac.uk (Leighton Pritchard)
Date: Mon Apr 19 08:39:50 2004
Subject: [BioPython] Different parse error in ace.py
Message-ID: <4083C770.6010306@scri.sari.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi all,

The Ace.py parser assumes that the DS tag will always be present in
Phrap output.  Unfortunately, it isn't always present in my output, and
so the parser gave me this traceback:

uncaught: (<class exceptions.SyntaxError at 0xbf51b74c>, <exceptions.SyntaxError instance at 0xbf03f4cc>)
Traceback (most recent call last):
  File "/usr/local/lib/python2.3/site-packages/ScriptFramework/script.py", line 241, in main
    self.run()
  File "/home/lpritc/Data/scripts/draw_ace.py", line 71, in run
    self.ace_file_record = self.load_record()       # Load ACE file
  File "/home/lpritc/Data/scripts/draw_ace.py", line 97, in load_record
    ace_file_record = parser.parse(self.input)
  File "/usr/local/lib/python2.3/site-packages/Bio/Sequencing/Ace.py", line 295, in parse
    rec=iter.next()
  File "/usr/local/lib/python2.3/site-packages/Bio/Sequencing/Ace.py", line 218, in next
    return self._parser.parse(File.StringHandle(data))
  File "/usr/local/lib/python2.3/site-packages/Bio/Sequencing/Ace.py", line 234, in parse
    self._scanner.feed(uhandle, self._consumer)
  File "/usr/local/lib/python2.3/site-packages/Bio/Sequencing/Ace.py", line 331, in feed
    self._scan_record(handle, consumer)
  File "/usr/local/lib/python2.3/site-packages/Bio/Sequencing/Ace.py", line 352, in _scan_record
    read_and_call(uhandle,consumer.ds,start='DS ')
  File "/usr/local/lib/python2.3/site-packages/Bio/ParserSupport.py", line 300, in read_and_call
    raise SyntaxError, errmsg
SyntaxError: Line does not start with 'DS ':
RT{

I've fixed it locally by changing line 352 in Ace.py to

###

if attempt_read_and_call(uhandle,consumer.ds,start='DS '):
    read_and_call(uhandle,consumer.ds,start='DS ')

###

so as to take the possible absence of a DS tag into account, though you
may want to do something more elegant with the real code ;)
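
A possibly simpler variant, sketched under an assumption that has not been checked against ParserSupport.py here: if attempt_read_and_call already consumes a matching line and invokes the consumer itself, and pushes a non-matching line back, then a single call handles both cases and the extra read_and_call would go looking for a second DS line. The classes and functions below are stand-ins written so the example runs alone on a current Python 3; they are not the Biopython implementations, and the DS payload is invented.

```python
class PushbackLines:
    """Minimal stand-in for a handle that lets one line be pushed back."""

    def __init__(self, lines):
        self._lines = list(lines)
        self._saved = []

    def readline(self):
        if self._saved:
            return self._saved.pop()
        return self._lines.pop(0) if self._lines else ""

    def saveline(self, line):
        self._saved.append(line)


def attempt_read_and_call(handle, method, start):
    """Stand-in: consume the next line and call method only if it matches."""
    line = handle.readline()
    if line.startswith(start):
        method(line)
        return True
    handle.saveline(line)  # leave the line for whoever reads next
    return False


def scan_optional_ds(handle, ds_lines):
    # Under the stated assumption, one call covers "DS present" and "DS absent".
    attempt_read_and_call(handle, ds_lines.append, start="DS ")


for lines in (["DS made-up payload\n", "RT{\n"], ["RT{\n"]):
    handle, found = PushbackLines(lines), []
    scan_optional_ds(handle, found)
    print(found, repr(handle.readline()))
```

In both cases the RT{ line is still available to the next scanning step, which is the behaviour the traceback shows was missing.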

- --
Dr Leighton Pritchard AMRSC
D104, PPI, Scottish Crop Research Institute
Invergowrie, Dundee, DD2 5DA, Scotland, UK
E: lpritc@scri.sari.ac.uk	W: http://bioinf.scri.sari.ac.uk/index.shtml
T: +44 (0)1382 568579		F: +44 (0)1382 568578
PGP key FEFC205C: GPG key E58BA41B: http://www.keyserver.net
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFAg8dvL1gZ+OWLpBsRAllhAJ4kgHrygUcTVozoyKH2XlLTBE+07ACfRWZ1
5T0Ztu63f1flGaxHJSs0QZA=
=zTh4
-----END PGP SIGNATURE-----

From fkauff at duke.edu  Mon Apr 19 09:05:47 2004
From: fkauff at duke.edu (Frank Kauff)
Date: Mon Apr 19 09:11:53 2004
Subject: [BioPython] Different parse error in ace.py
In-Reply-To: <4083C770.6010306@scri.sari.ac.uk>
References: <4083C770.6010306@scri.sari.ac.uk>
Message-ID: <1082379947.2059.3.camel@osiris.biology.duke.edu>

Hi Leighton,

On Mon, 2004-04-19 at 08:34, Leighton Pritchard wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Hi all,
> 
> The Ace.py parser assumes that the DS tag will always be present in
> Phrap output.  Unfortunately, it isn't always present in my output, and
> so the parser gave me this traceback:
> 

Thanks for pointing this out. Ace parsing is learning by doing -
unfortunately there's no proper description of which tags appear when
and where... I indeed assumed that there's always a DS. I'll update the
code; so far your workaround seems to be the right thing to do.

Frank

-- 
Frank Kauff
Dept. of Biology
Duke University
Box 90338
Durham, NC 27708
USA

Phone 919-660-7382
Fax 919-660-7293

From ciccio at unical.it  Mon Apr 19 09:43:09 2004
From: ciccio at unical.it (ciccio@unical.it)
Date: Mon Apr 19 09:49:02 2004
Subject: [BioPython] random numbers
Message-ID: <1082382189.4083d76d575a0@webmail.unical.it>

 
Hi all,
is it possible to generate random numbers from a gamma distribution
with both the mean and the shape parameter fixed (in Python or
Biopython)?
 
Thank you 
 
ernesto 
 

-------------------------------------------------
This mail sent through IMP: http://horde.org/imp/

From fkauff at duke.edu  Mon Apr 19 09:54:28 2004
From: fkauff at duke.edu (Frank Kauff)
Date: Mon Apr 19 09:59:01 2004
Subject: [BioPython] Different parse error in ace.py
In-Reply-To: <1082379947.2059.3.camel@osiris.biology.duke.edu>
References: <4083C770.6010306@scri.sari.ac.uk>
	<1082379947.2059.3.camel@osiris.biology.duke.edu>
Message-ID: <1082382868.2059.11.camel@osiris.biology.duke.edu>

Leighton,

On Mon, 2004-04-19 at 09:05, Frank Kauff wrote:
> Hi Leighton,
> 
> On Mon, 2004-04-19 at 08:34, Leighton Pritchard wrote:
> > -----BEGIN PGP SIGNED MESSAGE-----
> > Hash: SHA1
> > 
> > Hi all,
> > 
> > The Ace.py parser assumes that the DS tag will always be present in
> > Phrap output.  Unfortunately, it isn't always present in my output, and
> > so the parser gave me this traceback:
> > 
> 
> Thanks for pointing this out. Ace parsing is learning by doing -
> unfortunately there's no proper description of which tags appear when
> and where... I indeed assumed that there's always a DS. I'll update the
> code; so far your workaround seems to be the right thing to do.
> 

Just had a look at the parser, and unfortunately it seems to be a
bigger problem. The way the read data (rd, qa, ds) is currently stored
assumes that there is the same number of rd, qa and ds records for each
read. Storing an empty DS would be a possibility, but is pretty
inelegant. Anyway, I was feeling more and more uncomfortable with the
way Ace.py stores the data for a record - now I have a reason for
restructuring it :-)
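
To illustrate the storage problem, here is one possible shape for such a restructuring: keep one object per read, with the tags as optional fields, instead of three parallel lists that must stay the same length. This is purely a sketch and not the design that ended up in Ace.py; the class names, field names and sample strings are all hypothetical, and it uses dataclasses from a current Python 3.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Read:
    """All tags belonging to one read; absent tags stay None."""
    rd: str
    qa: Optional[str] = None
    ds: Optional[str] = None  # None when the DS tag is missing


@dataclass
class Contig:
    name: str
    reads: List[Read] = field(default_factory=list)


contig = Contig("Contig1")
contig.reads.append(Read(rd="RD read1 ...", qa="QA ...", ds="DS ..."))
contig.reads.append(Read(rd="RD read2 ...", qa="QA ..."))  # no DS line

# Which reads lack a DS tag?
print([read.ds is None for read in contig.reads])
```

With this layout a missing DS no longer shifts the indices of the later reads.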

F.

-- 
Frank Kauff
Dept. of Biology
Duke University
Box 90338
Durham, NC 27708
USA

Phone 919-660-7382
Fax 919-660-7293

From anewgene at hotpop.com  Mon Apr 19 10:14:28 2004
From: anewgene at hotpop.com (CL Wu)
Date: Mon Apr 19 10:17:56 2004
Subject: [BioPython] random numbers
In-Reply-To: <1082382189.4083d76d575a0@webmail.unical.it>
References: <1082382189.4083d76d575a0@webmail.unical.it>
Message-ID: <4083DEC4.9060909@hotpop.com>

Does it work for you?

 >>> import random
 >>> random.gammavariate(1,2)
1.5041091518144103
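
To fix the mean rather than the scale: `random.gammavariate(alpha, beta)` takes the shape as `alpha` and the scale as `beta`, and the mean of that distribution is alpha * beta, so the scale can be derived as mean / shape. A minimal sketch for a current Python 3 (the wrapper name `gamma_with_mean` and the chosen numbers are just for illustration):

```python
import random


def gamma_with_mean(mean, shape, rng=random):
    """Draw one gamma variate with the given mean and shape parameter."""
    scale = mean / shape  # because mean = shape * scale
    return rng.gammavariate(shape, scale)


# Seeded generator so the run is repeatable.
rng = random.Random(42)
samples = [gamma_with_mean(3.0, 2.0, rng) for _ in range(20000)]

# The sample mean should come out close to the requested mean of 3.0.
print(sum(samples) / len(samples))
```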

Chunlei

ciccio@unical.it wrote:

> 
>Hi all, 
>is there the possibility to generate random numbers according to a gamma 
>distribution with both mean and shape parameters fixed? (in python or 
>biopython) 
> 
>Thank you 
> 
>ernesto 
> 
>
>-------------------------------------------------
>This mail sent through IMP: http://horde.org/imp/
>

From anewgene at hotpop.com  Mon Apr 19 14:25:53 2004
From: anewgene at hotpop.com (CL Wu)
Date: Mon Apr 19 14:37:35 2004
Subject: [BioPython] random numbers
In-Reply-To: <1082389538.4083f4227715f@webmail.unical.it>
References: <1082382189.4083d76d575a0@webmail.unical.it>
	<4083DEC4.9060909@hotpop.com>
	<1082389538.4083f4227715f@webmail.unical.it>
Message-ID: <408419B1.8040709@hotpop.com>

An HTML attachment was scrubbed...
URL: http://portal.open-bio.org/pipermail/biopython/attachments/20040419/57df7087/attachment.htm
From crocha at dc.uba.ar  Mon Apr 19 17:28:52 2004
From: crocha at dc.uba.ar (Cristian S. Rocha)
Date: Mon Apr 19 17:37:26 2004
Subject: [BioPython] BioPython and Zope.
Message-ID: <1082410131.16662.12.camel@numero2>

Hello everybody,

I'm trying to parse FASTA string input using Zope. I made a Zope
external method to interface with BioPython modules. This method works
fine on the command line, but it doesn't work in Zope.

The code is:

----------------
import StringIO
from Bio import SeqRecord
from Bio.SeqIO import FASTA

def SequenceRead(SequenceString):
	# Wrap the string in a file-like object for the FASTA reader
	input = StringIO.StringIO(SequenceString)
	reader = FASTA.FastaReader(input)   # Error line
	seq = reader.next()
	while seq:
		print "> %s" % seq.id
		seq = reader.next()
	return "!"

if __name__ == "__main__":
	# Command-line test: call the function defined above
	print SequenceRead(
""">123
ATAGGGGATGATAGGAT
>456
GGATGAGGAGCGATGCG
"""
	)
----------------

The error is:

AttributeError: 'module' object has no attribute 'FastaReader'

The traceback is:

"""
Traceback (innermost last):
  Module ZPublisher.Publish, line 98, in publish
  Module ZPublisher.mapply, line 88, in mapply
  Module ZPublisher.Publish, line 39, in call_object
  Module OFS.DTMLMethod, line 127, in __call__
  Module DocumentTemplate.DT_String, line 474, in __call__
  Module DocumentTemplate.DT_Try, line 140, in render
  Module DocumentTemplate.DT_Try, line 183, in render_try_except
  Module DocumentTemplate.DT_Util, line 201, in eval
   - __traceback_info__: SequenceText
  Module <string>, line 1, in <expression>
  Module Shared.DC.Scripts.Bindings, line 306, in __call__
  Module Shared.DC.Scripts.Bindings, line 343, in _bindAndExec
  Module Products.PythonScripts.PythonScript, line 307, in _exec
  Module None, line 15, in Translate
   - <PythonScript at /biodb/TranslateSequence/Translate>
   - Line 15
  Module Products.ExternalMethod.ExternalMethod, line 224, in __call__
   - __traceback_info__: (('> Input.\r\nATGATGA\r\n> Output\r\nATGGGGAT',), {}, None)
  Module /usr/lib/zope/Extensions/BioTools.py, line 9, in SequenceRead
AttributeError: 'module' object has no attribute 'FastaReader'
"""

Do you have any idea why it's happening?

Thanks,
Cristian.

-- 
Lic. Cristian S. Rocha.
<crocha@dc.uba.ar>
Departamento de Computación. FCEyN. UBA.
Pabellon I. Cuarto 9.
Ciudad Universitaria.
(1428) Buenos Aires. Argentina.
Tel: +54-11-4576-3390/96 int 714
Tel/Fax: +54-11-4576-3359
Cel: 15-5-607-9192
From thamelry at binf.ku.dk  Wed Apr  7 11:03:19 2004
From: thamelry at binf.ku.dk (Thomas Hamelryck)
Date: Tue Apr 20 10:22:38 2004
Subject: [BioPython] Bio.PDB news
In-Reply-To: <20040331003137.GH29401@evostick.agtec.uga.edu>
References: <406450B7.6060401@mitre.org>
	<20040331003137.GH29401@evostick.agtec.uga.edu>
Message-ID: <200404071703.19364.thamelry@binf.ku.dk>


Hi everybody,

Bio.PDB now has a class that deals with superimposing 
crystal structures (Superimposer - see CVS).

Brad, I am working fiercely on the Bio.PDB manual, class
documentation and a FAQ. So keep the documentation 
police away for a while!

Cheers,

-Thomas

From cymon at duke.edu  Tue Apr 20 11:06:07 2004
From: cymon at duke.edu (Cymon Cox)
Date: Tue Apr 20 11:10:39 2004
Subject: [BioPython] BioPython and Zope.
In-Reply-To: <1082410131.16662.12.camel@numero2>
References: <1082410131.16662.12.camel@numero2>
Message-ID: <1082473567.16022.30.camel@isis.biology.duke.edu>

Hi Cristian,

When you add an External Method to your Zope application you assign it
directly to a function contained in a module. I don't think the imports
at the module level are ever made.

In short try:

def SequenceRead(SequenceString):
	import StringIO
	from Bio import SeqRecord
	from Bio.SeqIO import FASTA
	input = StringIO.StringIO(SequenceString)
	reader = FASTA.FastaReader(input)   # Error line
	seq = reader.next()
etc...
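
The same pattern in a self-contained form, as a minimal sketch that
assumes nothing about Zope itself: every import lives inside the function
body, so it runs whenever the function is called, however the host loaded
the module. The FASTA handling below is a hypothetical standard-library
stand-in (written for Python 3), not the Bio.SeqIO.FASTA reader.

```python
def read_fasta_ids(sequence_text):
    """Return the identifier of each FASTA record found in sequence_text."""
    # Import inside the function so the import runs on every call,
    # even if the host never executed the module-level statements.
    import io

    ids = []
    for line in io.StringIO(sequence_text):
        if line.startswith(">"):
            # The identifier is the first word after the '>' marker.
            words = line[1:].split()
            ids.append(words[0] if words else "")
    return ids


print(read_fasta_ids(">123\nATAGGGGATGATAGGAT\n>456\nGGATGAGGAGCGATGCG\n"))
```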

Cheers, Cymon


-- 
Cymon Cox <cymon@duke.edu>
Duke University

From jchuang8 at itsa.ucsf.edu  Wed Apr 21 21:00:07 2004
From: jchuang8 at itsa.ucsf.edu (Jer-Yee John Chuang)
Date: Wed Apr 21 21:04:37 2004
Subject: [BioPython] PSIBlastParser behavior
Message-ID: <200404220100.i3M107fI026284@itsa.ucsf.edu>

Hi,

I am observing an unexpected behavior using the PSIBlastParser.  I am
doing a simple PSI-Blast run:

Code:
--------------------------------------------------------
blastOut, errorInfo= NCBIStandalone.blastpgp(myBlastExe,myBlastDB,
                                             myBlastFile,
                                             expectation=myEValue,
                                             npasses=myNPasses)
myParser= NCBIStandalone.PSIBlastParser()
myRecord= myParser.parse(blastOut)


Error message:
--------------------------------------------------------
Traceback (most recent call last):
  File "locateHomologuesPsi_post.py", line 49, in ?
    myRecord= myParser.parse(myFile)
  File "c:\python23\lib\site-packages\Bio\Blast\NCBIStandalone.py", line 557, in parse
    self._scanner.feed(handle, self._consumer)
  File "c:\python23\lib\site-packages\Bio\Blast\NCBIStandalone.py", line 98, in feed
    self._scan_database_report(uhandle, consumer)
  File "c:\python23\lib\site-packages\Bio\Blast\NCBIStandalone.py", line 413, in _scan_database_report
    read_and_call(uhandle, consumer.database, start='  Database')
  File "c:\python23\lib\site-packages\Bio\ParserSupport.py", line 300, in read_and_call
    raise SyntaxError, errmsg
SyntaxError: Line does not start with '  Database':
Results from round 1


I wrote the PSI-Blast report to a file, then tried deleting lines and
calling the PSIBlastParser on the remainder.  I found that the parser
expects a line beginning with the word "Searching" before the line
"Results from round" (this is defined in the NCBIStandalone _Scanner
class, in def _scan_descriptions).  Once I correct this (by manually
adding the "Searching" line to the PSI-Blast report, or by commenting out
the relevant lines in NCBIStandalone.py), the PSIBlastParser works fine.
However, the report that blastpgp generates doesn't contain this
"Searching" line.  Is there something that I am missing here?
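
The manual fix described above could be automated roughly as follows.
This is only a sketch (Python 3, standard library only): the function
name and the filler text are hypothetical, and it assumes that every
"Results from round" line should be preceded by its own "Searching" line,
which is a guess based on the behaviour described above rather than on
the scanner's documentation.

```python
def add_searching_line(report_text, marker="Results from round",
                       filler="Searching...done"):
    """Insert a 'Searching' line before each round marker that lacks one."""
    fixed = []
    seen_searching = False
    for line in report_text.splitlines(keepends=True):
        if line.startswith("Searching"):
            seen_searching = True
        elif line.startswith(marker):
            if not seen_searching:
                # Assumption: any line starting with 'Searching' satisfies
                # the scanner; the rest of the filler text is arbitrary.
                fixed.append(filler + "\n")
            # Assumption: each round needs its own 'Searching' line.
            seen_searching = False
        fixed.append(line)
    return "".join(fixed)
```

The patched text can then be wrapped in a file-like object and handed to
the parser in place of the original blastpgp output.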

Thank you for your time.

Cheers,
John


-----------
John Chuang
UCSF, MB S516JJ 
500 16th St.
San Francisco, CA 94143
jchuang8@itsa.ucsf.edu


From crocha at dc.uba.ar  Thu Apr 22 17:46:05 2004
From: crocha at dc.uba.ar (Cristian S. Rocha)
Date: Thu Apr 22 18:00:19 2004
Subject: [BioPython] Problems with Zope.
Message-ID: <1082670364.22677.6.camel@numero2>

Hello,

I'm trying to use BioPython in an External Method in Zope, but it returns
"ImportError: cannot import name FastaReader" (and the same for other
functions). I tried some "tricks" but nothing changed.

Then I modified my external method to find out what is happening to the
symbol table, with the following results:

The program:

"""
def SequenceRead(SequenceString):
        import StringIO
        import Bio.SeqIO.FASTA

        return vars(Bio.SeqIO.FASTA).keys()
"""

When I run it at the command line the function returns:

"""
['Bio', 'string', 'Seq', 'FastaReader', '__builtins__', '__file__',
'SeqRecord', 'FastaWriter', '__name__', 'os', '__doc__']
"""

That's OK. But when I run it in Zope the function returns:

"""
['Bio', 'string', 'Seq', '__builtins__', '__name__', '__file__', 'os',
'__doc__']
"""

The names "FastaReader", "SeqRecord" and "FastaWriter" are missing from
the symbol table!!!
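
The comparison can also be done mechanically. A small sketch (Python 3)
that uses only the two key lists quoted above; the variable names are
made up for the illustration:

```python
# Keys reported when the function runs at the command line.
command_line_names = ['Bio', 'string', 'Seq', 'FastaReader', '__builtins__',
                      '__file__', 'SeqRecord', 'FastaWriter', '__name__',
                      'os', '__doc__']

# Keys reported when the same function runs inside Zope.
zope_names = ['Bio', 'string', 'Seq', '__builtins__', '__name__',
              '__file__', 'os', '__doc__']

# Names present at the command line but absent under Zope.
missing = sorted(set(command_line_names) - set(zope_names))
print(missing)  # ['FastaReader', 'FastaWriter', 'SeqRecord']
```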

What's wrong?

Thxs,
Cristian.

-- 
Lic. Cristian S. Rocha.
<crocha@dc.uba.ar>
Departamento de Computación. FCEyN. UBA.
Pabellon I. Cuarto 9.
Ciudad Universitaria.
(1428) Buenos Aires. Argentina.
Tel: +54-11-4576-3390/96 int 714
Tel/Fax: +54-11-4576-3359
Cel: 15-5-607-9192
From ciccio at unical.it  Fri Apr 23 04:51:17 2004
From: ciccio at unical.it (ciccio@unical.it)
Date: Fri Apr 23 04:57:54 2004
Subject: [BioPython] parametric bootstrap
Message-ID: <1082710277.4088d905b6916@webmail.unical.it>

 
Hi all,
do you know how to analyze parametric bootstrap results to obtain a
P value (by means of Biopython or Python)?

Thanks

ernesto


From chunlei.wu at uth.tmc.edu  Fri Apr 23 01:03:33 2004
From: chunlei.wu at uth.tmc.edu (Chunlei Wu)
Date: Fri Apr 23 08:30:51 2004
Subject: [BioPython] RecordFile.py
Message-ID: <4088A3A5.8010901@uth.tmc.edu>

Hi group,

I just tried "RecordFile.py", but it failed for both the fasta file and
the genbank file I tested.

 >>> rec_h=RecordFile.RecordFile(open(r"gb_test.txt" ),'LOCUS','\\')
or
 >>> rec_h=RecordFile.RecordFile(open(r"gb_test.txt" ),'>','')

both returned the same error:

 >>> rec_h.read()
Traceback (most recent call last):
  File "<interactive input>", line 1, in ?
  File "C:\Python23\Lib\site-packages\Bio\RecordFile.py", line 83, in read
    text = self._in_record_state( args, keywds )
  File "C:\Python23\Lib\site-packages\Bio\RecordFile.py", line 120, in 
_in_record_state
    requested_text = text
UnboundLocalError: local variable 'text' referenced before assignment

I checked the code, but it is not obvious to me how to fix it.

Actually, I wrote a simple script a while ago using Bio.File's UndoHandle
for the same purpose. It looks much simpler and is maybe not as powerful
as RecordFile.py, but it does work for me.  I am posting it here and hope
it is worth sharing with you.

Best,

Chunlei Wu
-------------- next part --------------
#Chunlei Wu 07/30/2003
'''
FlatRecHandle is a class simulating a file handle for Flatfile format record file,
using record as a reading unit instead of line.
FlatRecHandle.readrecord() returns one record per call.
'''


from Bio import File,Fasta

class FlatRecHandle:
    '''A FileHandle for Flatfile format record file, using record as a reading unit instead of line.
       start_marker is the marker of the start of each record:
           ">" for Fasta format record,
           "LOCUS" for GenBank format record, etc.
       stop_marker is the marker of the stop of each record, if None, the record stops till next start_marker or file end.
         e.g.:
           None for Fasta format record,
           "//" for GenBank format record, etc.
        return '' if reaching eof.'''

    def __init__(self,handle,start_marker=None,stop_marker=None):
        self._handle = File.UndoHandle(handle)
        self.start_marker=start_marker
        self.stop_marker=stop_marker
        
        
    def readrecord(self):
        '''return one record at one time,just like readline().
           return '' if reaching eof.'''        
        is_record=0
        saved_record=''
        while 1:
            line=self._handle.readline()

            if line == '': ##reach eof.
                if self.stop_marker is not None and is_record :                    
                    print 'Warning: This record may be incomplete. No stop marker("%s") found,but reach EOF!' % self.stop_marker
                    break
                else:
                    break
                
            if line[:len(self.start_marker)] == self.start_marker:
                is_record=1
            if is_record:
                saved_record += line
                if self.stop_marker is None:
                    next_line=self._handle.peekline()
                    if next_line[:len(self.start_marker)] == self.start_marker or next_line == '':
                        break
                else:
                    if line[:len(self.stop_marker)] == self.stop_marker:
                        break
        return saved_record          
                
    def rewind(self):
        '''rewind the handler pointer to the beginning.'''
        return self._handle.seek(0)

    def tell(self):
        return self._handle.tell()

    def close(self):
        return self._handle.close()

    def closed(self):
        # 'closed' is an attribute of file objects, not a method.
        return self._handle.closed

    def readrecords(self):
        '''return list of records,just like readlines()'''
        
        rec_list=[]
        while 1:
            rec=self.readrecord()
            if rec == '':
                break
            rec_list.append(rec)
        return rec_list

    
def fasta_handle(in_f_handle):
    '''return a FlatRecHandle for fasta format.
       input is a fasta format file handle.'''

    return FlatRecHandle(in_f_handle,">")

def fasta_iterator(fastafile_handle):
    '''return a Fasta file iterator using Bio.Fasta
    input is a fasta format file handle.'''
    parser=Fasta.RecordParser()
    return Fasta.Iterator(fastafile_handle,parser)

def gb_handle(in_f_handle):
    '''return a FlatRecHandle for GenBank format.'''

    return FlatRecHandle(in_f_handle,"LOCUS","//")
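
# The same record-splitting idea can also be written as a generator.
# The sketch below is an illustration only (Python 3, standard library,
# no Bio.File.UndoHandle); iter_records is a hypothetical name and not
# part of the script above. Lines before the first start marker are
# skipped, and a record cut short by EOF is still yielded.

```python
import io


def iter_records(handle, start_marker, stop_marker=None):
    """Yield one record (as a string) at a time from a flat-file handle."""
    record = []
    for line in handle:
        if line.startswith(start_marker) and stop_marker is None and record:
            # Without a stop marker, a new start marker closes the
            # previous record.
            yield "".join(record)
            record = []
        if record or line.startswith(start_marker):
            record.append(line)
            if stop_marker is not None and line.startswith(stop_marker):
                yield "".join(record)
                record = []
    if record:
        # Flush whatever is left when EOF is reached.
        yield "".join(record)


fasta = io.StringIO(">123\nATAGGGGATGATAGGAT\n>456\nGGATGAGGAGCGATGCG\n")
for rec in iter_records(fasta, ">"):
    print(repr(rec))
```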

From crocha at dc.uba.ar  Fri Apr 23 12:16:39 2004
From: crocha at dc.uba.ar (Cristian S. Rocha)
Date: Fri Apr 23 12:22:10 2004
Subject: [BioPython] Solved BioPython & Zope problem and a Bug?
Message-ID: <1082736999.23447.67.camel@numero2>

Hello,

I solved my problem with BioPython in Zope. The main problem was that
Zope can't import some of the functions defined in certain modules. These
modules import other modules using relative imports, and Zope can't
resolve them. For example:

Bio.config.FormatRegistry.py has the following relative import:

import _support

I changed it to:

from Bio.config import _support

and Zope started to import the functions in FormatRegistry.py.

I did the same for the ReseekFile import in FormatIO.py.

The second problem is Bio.MultiProc.copen. I really don't know why, but
Zope enters an infinite loop when I ask for the symbol table of the
module. To work around this problem I commented out its import in the
Bio.config._support module.
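
For illustration, here is the absolute-import form in a throwaway
package. This is a sketch under stated assumptions: demo_pkg, registry
and VALUE are made-up names, and the package is written to a temporary
directory purely so that the example is self-contained. It targets
Python 3, where the implicit relative form ("import _support" inside a
package) is no longer supported at all.

```python
import importlib
import os
import sys
import tempfile


def build_demo_package(root):
    """Write a tiny package whose modules import each other absolutely."""
    pkg = os.path.join(root, "demo_pkg")
    os.mkdir(pkg)
    files = {
        "__init__.py": "",
        "_support.py": "VALUE = 42\n",
        # Absolute import: does not depend on how the module was loaded.
        "registry.py": "from demo_pkg import _support\n"
                       "VALUE = _support.VALUE\n",
    }
    for name, body in files.items():
        with open(os.path.join(pkg, name), "w") as handle:
            handle.write(body)


root = tempfile.mkdtemp()
build_demo_package(root)
sys.path.insert(0, root)
importlib.invalidate_caches()
registry = importlib.import_module("demo_pkg.registry")
print(registry.VALUE)
```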

Thanks,
Cristian.

-- 
Lic. Cristian S. Rocha.
<crocha@dc.uba.ar>
Departamento de Computación. FCEyN. UBA.
Pabellon I. Cuarto 9.
Ciudad Universitaria.
(1428) Buenos Aires. Argentina.
Tel: +54-11-4576-3390/96 int 714
Tel/Fax: +54-11-4576-3359
Cel: 15-5-607-9192

From crocha at dc.uba.ar  Mon Apr 26 09:35:24 2004
From: crocha at dc.uba.ar (Cristian S. Rocha)
Date: Mon Apr 26 09:46:29 2004
Subject: [BioPython] Standard API to sequence parser?
Message-ID: <1082986523.31151.5.camel@numero2>

Simple question:

Does a standard API exist for parsing sequence files in different formats?

Thxs,
Cristian.

-- 
Lic. Cristian S. Rocha.
<crocha@dc.uba.ar>
Departamento de Computación. FCEyN. UBA.
Pabellon I. Cuarto 9.
Ciudad Universitaria.
(1428) Buenos Aires. Argentina.
Tel: +54-11-4576-3390/96 int 714
Tel/Fax: +54-11-4576-3359
Cel: 15-5-607-9192
From sbassi at asalup.org  Mon Apr 26 12:43:59 2004
From: sbassi at asalup.org (Sebastian Bassi)
Date: Mon Apr 26 12:49:25 2004
Subject: [BioPython] Standard API to sequence parser?
In-Reply-To: <1082986523.31151.5.camel@numero2>
References: <1082986523.31151.5.camel@numero2>
Message-ID: <408D3C4F.8060702@asalup.org>

Cristian S. Rocha wrote:
> Simple question:
> Does a standard API exist for parsing sequence files in different formats?

In the BioPython cookbook there is an example of what you are looking
for: a 3-line converter from one format to another (GenBank to FASTA,
but you can adapt it to your needs).

-- 
Best regards,
//=\ Sebastian Bassi - Diplomado en Ciencia y Tecnologia, UNQ   //=\
                 http://Bioinformatica.info
From Myriam.Vezain at loria.fr  Tue Apr 27 08:58:30 2004
From: Myriam.Vezain at loria.fr (Myriam Vezain)
Date: Tue Apr 27 09:02:53 2004
Subject: [BioPython] Looking for functions
In-Reply-To: <408E587A.5060209@loria.fr>
References: <1082986523.31151.5.camel@numero2> <408E587A.5060209@loria.fr>
Message-ID: <408E58F6.5010604@loria.fr>

Myriam Vezain wrote:

> Hello,
>
> I am looking for several functions:
> - convert a swissprot file to a fasta file,
> - find EcoRI restriction sites in a fasta sequence.
>
> Can you help me find those functions?
> I tried this one:
>
> from Bio.SeqIO import FASTA
> from Bio.SwissProt import SProt
> from sys import *
>
> def convert_sp_fasta(infile,outfile):
>    """
>    convert a SwissProt file into a Fasta formatted file
>    """
>    in_h = open(infile)
>    sp = SProt.Iterator(in_h, SProt.SequenceParser())
>    out_h = FASTA.FastaWriter(outfile)
>    sequence = sp.next()
>    out_h.write(sequence)
>    in_h.close()
>    out_h.close()
>
> But it was not a success.
>
> Thank you.
>
> Myriam.
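
For the second item, a minimal sketch that uses only the Python 3
standard library rather than any Biopython API. EcoRI recognizes the
sequence GAATTC; the helper names read_fasta and find_sites are
hypothetical, and only the given strand is searched (the site is
palindromic, so the reverse strand carries it at the same positions).

```python
ECORI_SITE = "GAATTC"  # EcoRI recognition sequence


def read_fasta(text):
    """Parse FASTA text into a list of (title, sequence) pairs."""
    records = []
    title, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            if title is not None:
                records.append((title, "".join(chunks)))
            title, chunks = line[1:], []
        elif line and title is not None:
            chunks.append(line)
    if title is not None:
        records.append((title, "".join(chunks)))
    return records


def find_sites(sequence, site=ECORI_SITE):
    """Return the 1-based start position of every occurrence of site."""
    sequence = sequence.upper()
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start + 1)
        # Step by one so that overlapping occurrences are also reported.
        start = sequence.find(site, start + 1)
    return positions


for title, sequence in read_fasta(">demo\nAAGAATTCGG\nGAATTC\n"):
    print(title, find_sites(sequence))
```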


From dlondon at ebi.ac.uk  Wed Apr 28 06:06:37 2004
From: dlondon at ebi.ac.uk (Darin London)
Date: Wed Apr 28 06:11:00 2004
Subject: [BioPython] BOSC 2nd Call For Papers
Message-ID: <Pine.LNX.4.44.0404281104180.4053-100000@parrot.ebi.ac.uk>

 {Please pass the word!}
 
 SECOND CALL FOR SPEAKERS
 
 BOSC PROGRAM & CONTACT INFO
 
 * Web: http://www.open-bio.org/bosc2004/
 * Email: bosc@open-bio.org
 * Online registration: https://www.cteusa.com/iscb3/
  
 The program committee is currently seeking abstracts for talks at BOSC 
 2004. BOSC is a great opportunity for you to tell the community about 
 your use, development, or philosophy of open source software development 
 in bioinformatics. The committee will select several submitted abstracts 
 for 25-minute talks and others for shorter "lightning" talks. Accepted 
 abstracts will be published on the BOSC web site.
 
 If you are interested in speaking at BOSC 2004, 
 please send us:
 
 * an abstract (no more than a few paragraphs)
 * a URL for the project page, if applicable
 * information about the open source license used for your software or 
   your release plans.

*** Abstracts for formal presentations must be received by 5-May-2004. ***

 LIGHTNING-TALK SPEAKERS WANTED!
 
 The program committee is currently seeking speakers for the lightning 
 talks at BOSC 2004. Lightning talks are quick - only five minutes 
 long - and a great opportunity for you to give people a quick 
 summary of your open source project, code, idea, or vision of the future.

 If you are interested in giving a lightning talk at BOSC 2004, 
 please send us:

 * a brief title and summary (one or two lines)
 * a URL for the project page, if applicable
 * information about the open source license used for your software or 
   your release plans.

 We will accept entries on-line until BOSC starts, but
 space for demos and lightning talks is limited.
    
 SOFTWARE DEMONSTRATIONS WANTED!
 If you are involved in the development of Open Source Bioinformatics 
 Software, you are invited to provide a short demonstration to attendees 
 of BOSC 2004.

 If you are interested in giving a software demonstration at BOSC 2004,
 please send us:

 * a brief title and summary (one or two lines)
 * a URL for the project page, if applicable
 * Internet connectivity requirements (e.g. a website application served
   on the world wide web, or a web-based client application).

 We will accept entries on-line until BOSC starts, but
 space for demos and lightning talks is limited. 

** Because the mission of the OBF is to promote Open Source software, we 
will favor submissions for projects that apply a recognized Open Source 
License, or adhere to the general Open Source Philosophy.
   See the following websites for further details:
   http://www.opensource.org/licenses/
   http://www.opensource.org/docs/definition.php


cheers,

-- 
Darin London dlondon@ebi.ac.uk    European Bioinformatics Institute, 
+44 (0)1223 49 2566               Wellcome Trust Genome Campus, Hinxton 
+44 (0)1223 49 4468 (fax)         Cambridgeshire CB10 1SD, UK

From stephandamen at hotmail.com  Thu Apr 29 05:43:56 2004
From: stephandamen at hotmail.com (Stephan Damen)
Date: Thu Apr 29 09:07:09 2004
Subject: [BioPython] Robustness of parsing
Message-ID: <BAY2-DAV57g2oHzGkzM00014bb2@hotmail.com>

Dear sir / madam,

I have a question about the robustness of parsing, in my case of UniGene
at NCBI. I have been assigned to write a parser that puts the UniGene flat
files into an existing database structure. Before starting to write code I
thought I'd better search the web for existing solutions. That is how I
found your site and its parser for UniGene.

At my work we already have a parser for these flat files, written in C.
The only problem with this parser is that it will stop working if the
structure of UniGene changes, for instance if a new field is added or if
a relation changes from 1-to-1 to 1-to-many.

My question about biopython: does it have the same problem? If so, how
quickly do updates become available? If biopython also has this problem,
I want to write a more generic solution, perhaps in Python.

Kind regards,

Stephan Damen
Project leader at UL

From lpritc at scri.sari.ac.uk  Thu Apr 29 10:03:14 2004
From: lpritc at scri.sari.ac.uk (Leighton Pritchard)
Date: Thu Apr 29 10:07:31 2004
Subject: [BioPython] GenBank parser
Message-ID: <40910B22.6040705@scri.sari.ac.uk>


Hi,

I've noticed an oddity in the GenBank FeatureParser (CVS installation
19/4).  While parsing the Salmonella typhi file NC_003198.gbk, my way of
dealing with 'gene' tags fell over.  This turned out to be because the
GenBank file contains entries with valueless tags such as /partial and
/pseudo.  The current parser concatenates these tags with the following
tag, e.g. for:

     CDS             1449249..1450391
                     /partial
                     /gene="fdnG"
                     /note="Similar to part of Escherichia coli formate
                     dehydrogenase, nitrate-inducible, major subunit fdnG
                     SW:FDNG_ECOLI (P24183; P78261) (1015 aa) fasta scores:
                     E(): 0, 94.4% id in 376 aa"
                     /pseudo
                     /codon_start=1
                     /transl_table=11

it returns a set of qualifiers which include the tags "partial gene" and
"pseudo codon_start".  This probably isn't what was intended by the
authors ;)

I haven't got a fix for the parser, but my workaround in the code was:

##################

qualifiers = cds.qualifiers             # Shorthand for qualifiers
# We need to account for use of qualifiers, e.g. in
# NC_003198.gbk, the /partial and /pseudo tags often have no
# associated value - the BioPython GenBank feature parser lumps the
# two together into a single tag, e.g. 'partial gene' and
# 'pseudo codon_start'.  This buggers up our processing below,
# so the solution is to split tags by the ' ' space character,
# and add a qualifier comprising only the last item in the
# resulting list
for key in qualifiers.keys():
    if key.count(' '):
        qualifiers[key.split(' ')[-1]] = qualifiers[key]

###################

...I wasn't bothered about the partial or pseudo tags for my script
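
If the valueless tags do matter, the same idea can be extended to keep
them. The following is only a sketch (Python 3) operating on a plain
dict: split_merged_qualifiers is a hypothetical helper, and giving the
valueless tags an empty-string value is my assumption, not something the
parser does.

```python
def split_merged_qualifiers(qualifiers):
    """Return a copy of qualifiers with merged keys split apart."""
    fixed = {}
    for key, value in qualifiers.items():
        parts = key.split(" ")
        # Every part except the last is a valueless tag such as
        # 'partial' or 'pseudo'; record it with an empty value.
        for flag in parts[:-1]:
            fixed.setdefault(flag, "")
        # The last part is the tag that actually owns the value.
        fixed[parts[-1]] = value
    return fixed


merged = {"partial gene": ["fdnG"], "pseudo codon_start": ["1"],
          "transl_table": ["11"]}
print(split_merged_qualifiers(merged))
```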

-- 
Dr Leighton Pritchard AMRSC
D104, PPI, Scottish Crop Research Institute
Invergowrie, Dundee, DD2 5DA, Scotland, UK
E: lpritc@scri.sari.ac.uk	W: http://bioinf.scri.sari.ac.uk/index.shtml
T: +44 (0)1382 568579		F: +44 (0)1382 568578
PGP key FEFC205C: GPG key E58BA41B: http://www.keyserver.net

From crocha at dc.uba.ar  Thu Apr 29 17:11:19 2004
From: crocha at dc.uba.ar (Cristian S. Rocha)
Date: Thu Apr 29 17:16:54 2004
Subject: [BioPython] FormatIO + Fasta parser + BioDB.
Message-ID: <1083273079.14904.37.camel@numero2>

Hello,

I'm writing a procedure to store files in a BioDB, but I get the
following error:

"""
...
  File "/usr/lib/python2.2/site-packages/BioSQL/Loader.py", line 209, in _load_bioentry_table
    if record.id.find('.') >= 0: # try to get a version from the id
AttributeError: 'NoneType' object has no attribute 'find'
"""

and the procedure is:

"""
def SequenceStoreFile(SeqFile, database, format='genbank'):
	server = BioSeqDatabase.open_database(driver='MySQLdb', user='bio',
		passwd='bio', host='localhost', db='bio')
	if server[database]:
		db = server[database]
	else:
		db = server.new_database(database)
	formatter = FormatIO.FormatIO("SeqRecord", formats[format])
	itr = formatter.readFile(SeqFile, format=formats[format])
	db.load(itr)
	return

if __name__ == "__main__":
	SequenceStoreFile(open('example.fasta'), 'estC', 'fasta')
"""

I suspect this is because I don't define the title2ids function for the
Fasta parser. If I'm right, how can I tell the FormatIO module to use a
title2ids function?
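
To make clear what I mean by a title2ids function, here is a minimal
sketch (Python 3). It assumes the callback receives the FASTA title
without the leading ">" and returns an (id, name, description) tuple;
that contract should be checked against the parser's documentation, and
the sketch does not show how to hook the function into FormatIO.

```python
def title2ids(title):
    """Split a FASTA title into (id, name, description)."""
    words = title.split(None, 1)
    if not words:
        # An empty title would leave the record without an id.
        raise ValueError("empty FASTA title, cannot derive an id")
    identifier = words[0]
    description = words[1].strip() if len(words) > 1 else ""
    # Assumption: the same token serves as both id and name.
    return identifier, identifier, description


print(title2ids("seq1 example sequence"))
```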

Thanks,
Cristian.

-- 
Lic. Cristian S. Rocha.
<crocha@dc.uba.ar>
Departamento de Computación. FCEyN. UBA.
Pabellon I. Cuarto 9.
Ciudad Universitaria.
(1428) Buenos Aires. Argentina.
Tel: +54-11-4576-3390/96 int 714
Tel/Fax: +54-11-4576-3359
Cel: 15-5-607-9192

From mdehoon at ims.u-tokyo.ac.jp  Fri Apr 30 02:58:29 2004
From: mdehoon at ims.u-tokyo.ac.jp (Michiel Jan Laurens de Hoon)
Date: Fri Apr 30 03:02:51 2004
Subject: [BioPython] Lowess function for nonparametric regression
Message-ID: <4091F915.2060101@ims.u-tokyo.ac.jp>

Dear Biopythoneers,

Recently I wrote a pure Python implementation of the Lowess function for
nonparametric regression. For an example, see
http://bonsai.ims.u-tokyo.ac.jp/~mdehoon/lowess.png
in which the red line is a nonparametric regression curve fitted to the
scatterplot. The value on the x-axis is the mean of replicated gene
expression measurements; the y-axis is the associated measurement error.
Such plots are used to find out how large the measurement error typically
is for a given magnitude of a measured gene expression level.

If this function is useful for other computational biologists, I can
submit it to Biopython. In that case, which module would this fall under?
Or can I just make a Lowess.py under Bio? The code is very short, about 25
lines of Python for the actual function.

--Michiel

-- 
Michiel de Hoon, Assistant Professor
University of Tokyo, Institute of Medical Science
Human Genome Center
4-6-1 Shirokane-dai, Minato-ku
Tokyo 108-8639
Japan
http://bonsai.ims.u-tokyo.ac.jp/~mdehoon


From nomy2020 at yahoo.com  Fri Apr 30 00:26:00 2004
From: nomy2020 at yahoo.com (Bzy Bee)
Date: Fri Apr 30 07:46:13 2004
Subject: [BioPython] HSPs in Blast parser
Message-ID: <20040430042600.6570.qmail@web90001.mail.scd.yahoo.com>


Hi

 
I am stuck on parsing a BlastN output and would appreciate some help. I am
working on multiple HSPs for a single hit. For example, if there are two
HSPs found for one hit, I need to find where the query and subject end for
one HSP and then compare that with the query and subject start for the
next HSP, e.g. in the following example:

 
>test_seq1
          Length = 424

 Score =  841 bits (424), Expect = 0.0
 Identities = 424/424 (100%)
 Strand = Plus / Plus

                                                                       
Query: 1   ggactggttcgtcgtttacaagctgccggcccacacagggtcgggagatgcgacgcagaa 60
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 1   ggactggttcgtcgtttacaagctgccggcccacacagggtcgggagatgcgacgcagaa 60

                                                                       
Query: 61  cggcctgcggtacaagtactttgacgaacactcagaagactggagcgacggcgtggggtt 120
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 61  cggcctgcggtacaagtactttgacgaacactcagaagactggagcgacggcgtggggtt 120

                                                                       
 Score =  226 bits (114), Expect = 2e-58
 Identities = 141/150 (94%)
 Strand = Plus / Plus

                                                                       
Query: 275 ccagctcgcctttgtgctctacaatgaccaaccgcctaaatgcagcgagtgtaaggactc 334
           ||||||||||||||||||||||||||||||||||||||||| |||||||| |||||||||
Sbjct: 513 ccagctcgcctttgtgctctacaatgaccaaccgcctaaatccagcgagtctaaggactc 572

                                                                       
Query: 335 ttgcagtcgtgggcacacgaagggtgtgctgctcctggaccaagaagggggcttgtggtt 394
           || ||||||||||||||||||||||||||||||||||||||||||||||||||| |||||
Sbjct: 573 ttccagtcgtgggcacacgaagggtgtgctgctcctggaccaagaagggggcttctggtt 632


I am interested in where Query and Sbjct ended in the first HSP (i.e. 120,
120) and where they started in the second HSP (i.e. 275, 513).

 
I have noticed that in the blast parser one can iterate through each HSP
for every single hit, but I am not sure how to treat two HSPs of a single
hit as related, and iterate through them in order to find the query (and
subject) end of one and the query (and subject) start of the other.
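
To make the goal concrete, this is roughly the computation I am after,
sketched in Python 3 on plain tuples. It assumes the four coordinates of
each HSP have already been pulled out of the parser somehow (that is the
part I am unsure about) and that both HSPs are Plus / Plus. The Hsp tuple
and the junctions function are made-up names; the numbers are the ones
from the example above.

```python
from collections import namedtuple

# Hypothetical container for the coordinates of one HSP.
Hsp = namedtuple("Hsp", "query_start query_end sbjct_start sbjct_end")


def junctions(hsps):
    """Pair each HSP with the next one along the query."""
    ordered = sorted(hsps, key=lambda hsp: hsp.query_start)
    pairs = []
    for left, right in zip(ordered, ordered[1:]):
        pairs.append({
            # (end of one HSP, start of the next HSP)
            "query": (left.query_end, right.query_start),
            "sbjct": (left.sbjct_end, right.sbjct_start),
            # Number of unaligned positions between the two HSPs.
            "query_gap": right.query_start - left.query_end - 1,
            "sbjct_gap": right.sbjct_start - left.sbjct_end - 1,
        })
    return pairs


hits = [Hsp(275, 394, 513, 632), Hsp(1, 120, 1, 120)]
for pair in junctions(hits):
    print(pair)
```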

 
Any help would be highly appreciated.

 
Thanks

 
Jawad Ali

