From katel at worldpath.net  Sun Oct  7 22:10:50 2001
From: katel at worldpath.net (Cayte)
Date: Sat Mar  5 14:43:05 2005
Subject: [Biopython-dev] IntelliGenetics parser
Message-ID: <001401c14f9e$748fe920$499403cf@g0fjl>

  Tonight I ran into a snag with the iterator.  The problem is the starting tag, which is a semicolon.  The IntelliGenetics format starts with a block of comment lines, each beginning with a semicolon.  So the iterator interprets each comment as the start of a new record.  Should I write a custom iterator?

                                                                   Cayte
From jchang at SMI.Stanford.EDU  Mon Oct  8 01:11:36 2001
From: jchang at SMI.Stanford.EDU (Jeffrey Chang)
Date: Sat Mar  5 14:43:05 2005
Subject: [Biopython-dev] IntelliGenetics parser
In-Reply-To: <001401c14f9e$748fe920$499403cf@g0fjl>
References: <001401c14f9e$748fe920$499403cf@g0fjl>
Message-ID: <p05101001b7e6e5e3449a@[192.168.0.4]>

I'm not familiar with the IntelliGenetics format.  Could you provide 
a sample of it, so that people will be able to provide more comments?

Thanks,
Jeff


At 10:10 PM -0400 10/7/01, Cayte wrote:
>   Tonight I ran into a snag with the iterator.  The problem is the 
>starting tag which is a semicolon.  The IntelliGenetics format 
>starts with a block of comment lines, each beginning with a 
>semicolon.  So the iterator interprets each comment as the start of 
>a new record.  Should I write a custom iterator?
>
>                                                                    Cayte


From katel at worldpath.net  Mon Oct  8 13:45:21 2001
From: katel at worldpath.net (Cayte)
Date: Sat Mar  5 14:43:05 2005
Subject: [Biopython-dev] IntelliGenetics parser
References: <001401c14f9e$748fe920$499403cf@g0fjl> <p05101001b7e6e5e3449a@[192.168.0.4]>
Message-ID: <000b01c15021$01988480$6b72bbd1@g0fjl>

----- Original Message -----
From: "Jeffrey Chang" <jchang@SMI.Stanford.EDU>
To: "Cayte" <katel@worldpath.net>; <biopython-dev@biopython.org>
Sent: Monday, October 08, 2001 1:11 AM
Subject: Re: [Biopython-dev] IntelliGenetics parser


> I'm not familiar with the IntelliGenetics format.  Could you provide
> a sample of it, so that people will be able to provide more comments?
>
> Thanks,
  It's the same as the MASE format.  I committed a sample under
Tests\IntelliGenetics.  I saw a way to write the iterator with a simple state
machine so I used that approach.  The files are checked in but I need to check the
test output closely.
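
A minimal sketch of such a state-machine iterator. It assumes (this is not confirmed in the thread) that a record is a run of ';' comment lines followed by its non-comment lines, so a new record starts only when a comment line follows a non-comment line. The name iter_ig_records is hypothetical and this is not the checked-in code:

```python
def iter_ig_records(lines):
    """Yield each record as a list of lines (trailing newlines removed).

    Assumption: a record is a block of ';' comment lines followed by
    non-comment lines. A ';' line starts a new record only when it
    comes after at least one non-comment line.
    """
    record = []
    in_comments = True  # state: still inside the leading comment block
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # ignore blank lines
        is_comment = line.startswith(";")
        if is_comment and not in_comments:
            # A comment after body lines: the previous record is complete.
            yield record
            record = []
        in_comments = is_comment
        record.append(line)
    if record:
        yield record
```

With this approach consecutive comment lines never split a record; only the transition from body lines back to a comment line does.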

                                                            Cayte

>
>


From chapmanb at arches.uga.edu  Mon Oct  8 23:39:40 2001
From: chapmanb at arches.uga.edu (Brad Chapman)
Date: Sat Mar  5 14:43:05 2005
Subject: [Biopython-dev] GenBank parser fails (on large files?)
In-Reply-To: <20010928081702.A973@ci350185-a.athen1.ga.home.com>
References: <200109281141.f8SBfDM188776@electre.pasteur.fr> <20010928081702.A973@ci350185-a.athen1.ga.home.com>
Message-ID: <20011008233940.B13537@ci350185-a.athen1.ga.home.com>

Michel:
> [Talking about the translation]
> > Note, incidentally, that this is a bit ugly, because the \012's and spaces 
> > should have been cleaned out 

me:
> I agree with you here -- I haven't yet done any work at massaging
> the feature value information. I'll think about a good way to do
> this (I'm sure there are other cases where this also needs to be
> done), and try to get something done on it this weekend.

I finally managed to come up with a good (in my opinion, of course)
way to handle the problem of selectively cleaning up values based on
their type. Basically, what I did was add a Bio.GenBank.utils module
that has a FeatureValueCleaner class. Right now this class is quite
simple (it just deals with the translation problem mentioned), but
could be extended quite easily to deal with other special cases as
they come up.

You can use this class by passing it as a feature_cleaner argument
to the FeatureParser, i.e.:

from Bio import GenBank
from Bio.GenBank.utils import FeatureValueCleaner

parser = GenBank.FeatureParser(feature_cleaner =
                               FeatureValueCleaner())

Right now this is not enabled by default, but I'm definitely 
open to opinions about whether or not it should be. 

Michel, I'd be happy to hear if this does what you'd like it to. If
you have additional things that need cleaning up, I'd be more than
happy to accept patches against utils.py adding these things. The
utils.py module is attached, along with the patch against
__init__.py. These are also checked into CVS. 
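
As a rough illustration of how another special case might be added, here is a sketch that extends the cleaner to /note values. The FeatureValueCleaner below is a simplified stand-in that mirrors the dispatch logic of the attached utils.py so the example runs on its own; NoteCleaner and _clean_note are hypothetical names, and joining notes with single spaces is an assumption about what one would want:

```python
class FeatureValueCleaner:
    """Stand-in mirroring the dispatch logic of the attached utils.py."""
    keys_to_process = ["translation"]

    def __init__(self, to_process=keys_to_process):
        self._to_process = to_process

    def clean_value(self, key_name, value):
        # Only keys listed for processing are touched.
        if key_name in self._to_process:
            cleaner = getattr(self, "_clean_%s" % key_name, None)
            if cleaner is None:
                raise AssertionError("No function to clean key: %s" % key_name)
            value = cleaner(value)
        return value

    def _clean_translation(self, value):
        # Concatenate the pieces into one long protein string.
        return "".join(value.split())


class NoteCleaner(FeatureValueCleaner):
    """Hypothetical extension that also tidies /note values."""
    keys_to_process = ["translation", "note"]

    def __init__(self, to_process=keys_to_process):
        FeatureValueCleaner.__init__(self, to_process)

    def _clean_note(self, value):
        # Collapse newlines and runs of spaces into single spaces.
        return " ".join(value.split())
```

Because clean_value looks up "_clean_<key>" by name, supporting a new key only needs a new method plus an entry in the list of keys to process.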

Hope this works for you.
Brad
-- 
PGP public key available from http://pgp.mit.edu/
-------------- next part --------------
*** __init__.py.orig	Thu Sep 27 16:00:49 2001
--- __init__.py	Mon Oct  8 23:16:50 2001
***************
*** 239,245 ****
  class FeatureParser:
      """Parse GenBank files into Seq + Feature objects.
      """
!     def __init__(self, debug_level = 0, use_fuzziness = 1):
          """Initialize a GenBank parser and Feature consumer.
  
          Arguments:
--- 239,246 ----
  class FeatureParser:
      """Parse GenBank files into Seq + Feature objects.
      """
!     def __init__(self, debug_level = 0, use_fuzziness = 1, 
!                  feature_cleaner = None):
          """Initialize a GenBank parser and Feature consumer.
  
          Arguments:
***************
*** 249,262 ****
          you can set this as high as two and see exactly where a parse fails.
          o use_fuzziness - Specify whether or not to use fuzzy representations.
          The default is 1 (use fuzziness).
          """
          self._scanner = _Scanner(debug_level)
          self.use_fuzziness = use_fuzziness
  
      def parse(self, handle):
          """Parse the specified handle.
          """
!         self._consumer = _FeatureConsumer(self.use_fuzziness)
          self._scanner.feed(handle, self._consumer)
          return self._consumer.data
  
--- 250,268 ----
          you can set this as high as two and see exactly where a parse fails.
          o use_fuzziness - Specify whether or not to use fuzzy representations.
          The default is 1 (use fuzziness).
+         o feature_cleaner - A class which will be used to clean out the
+         values of features. This class must implement the function 
+         clean_value. GenBank.utils has a "standard" cleaner class.
          """
          self._scanner = _Scanner(debug_level)
          self.use_fuzziness = use_fuzziness
+         self._cleaner = feature_cleaner
  
      def parse(self, handle):
          """Parse the specified handle.
          """
!         self._consumer = _FeatureConsumer(self.use_fuzziness, 
!                                           self._cleaner)
          self._scanner.feed(handle, self._consumer)
          return self._consumer.data
  
***************
*** 398,409 ****
      Attributes:
      o use_fuzziness - specify whether or not to parse with fuzziness in
      feature locations.
      """
!     def __init__(self, use_fuzziness):
          _BaseGenBankConsumer.__init__(self)
          self.data = SeqRecord(None, id = None)
  
          self._use_fuzziness = use_fuzziness
  
          self._seq_type = ''
          self._seq_data = []
--- 404,418 ----
      Attributes:
      o use_fuzziness - specify whether or not to parse with fuzziness in
      feature locations.
+     o feature_cleaner - a class that will be used to provide specialized
+     cleaning-up of feature values.
      """
!     def __init__(self, use_fuzziness, feature_cleaner = None):
          _BaseGenBankConsumer.__init__(self)
          self.data = SeqRecord(None, id = None)
  
          self._use_fuzziness = use_fuzziness
+         self._feature_cleaner = feature_cleaner
  
          self._seq_type = ''
          self._seq_data = []
***************
*** 856,861 ****
--- 865,872 ----
          if self._cur_qualifier_key:
              key = self._cur_qualifier_key
              value = self._cur_qualifier_value
+             if self._feature_cleaner is not None:
+                 value = self._feature_cleaner.clean_value(key, value)
              # if the qualifier name exists, append the value
              if self._cur_feature.qualifiers.has_key(key):
                  self._cur_feature.qualifiers[key].append(value)
-------------- next part --------------
"""Useful utilities for helping in parsing GenBank files.
"""
# standard library
import string

class FeatureValueCleaner:
    """Provide specialized capabilities for cleaning up values in features.

    This class is designed to provide a mechanism to clean up and process
    values in the key/value pairs of GenBank features. This is useful 
    because in cases like:
        
         /translation="MED
         YDPWNLRFQSKYKSRDA"

    you'll end up with a value with \012s and spaces in it like:
        "MED\012 YDPWNL..."

    which you probably don't want. 
    
    This cleaning needs to be done on a case by case basis since it is
    impossible to interpret whether you should be concatenating everything
    (as in translations), or combining things with spaces (as might be
    the case with /notes).
    """
    keys_to_process = ["translation"]
    def __init__(self, to_process = keys_to_process):
        """Initialize with the keys we should deal with.
        """
        self._to_process = to_process

    def clean_value(self, key_name, value):
        """Clean the specified value and return it.

        If the value is not specified to be dealt with, the original value
        will be returned.
        """
        if key_name in self._to_process:
            try:
                cleaner = getattr(self, "_clean_%s" % key_name)
                value = cleaner(value)
            except AttributeError:
                raise AssertionError("No function to clean key: %s" 
                                     % key_name)
        return value

    def _clean_translation(self, value):
        """Concatenate a translation value to one long protein string.
        """
        translation_parts = value.split()
        return "".join(translation_parts)
From chapmanb at arches.uga.edu  Tue Oct  9 01:00:50 2001
From: chapmanb at arches.uga.edu (Brad Chapman)
Date: Sat Mar  5 14:43:05 2005
Subject: [Biopython-dev] SeqIO
In-Reply-To: <y9vadzf3oke.fsf@delphinus.cbs.dtu.dk>
References: <y9vhetz5sj8.fsf@genome.cbs.dtu.dk> <20010926224754.E27721@ci350185-a.athen1.ga.home.com> <y9vadzf3oke.fsf@delphinus.cbs.dtu.dk>
Message-ID: <20011009010050.E13537@ci350185-a.athen1.ga.home.com>

Hi Thomas;
Hope you're doing well!

[I ask how many features we want to keep between conversions]
> All of them. I think each GenBank feature has an exact equivalence in EMBL
> and SwissProt (GenPept). So that leaves us just with the definition of the
> corresponding feature names.

[relatedly, I ask in a confusing manner about a "specialized
converter"]
> I don't know if I understood this question...

What I mean is that I'm not sure how I would plug in "lossless"
EMBL->GenBank conversion into the current scheme. I can write a
generic writer that will convert a basic SeqRecord to a simple
GenBank (no features). But the way ReadSeq.Convert works now is that
I only get a record, and don't know the starting format. In order to
have "smart" EMBL->GenBank I need to know that the format is EMBL,
so I can look for the conversions.

It seems like a simple thing to do would be to add an optional
second argument to write, so that I could do something like:

def write(self, record, starting_format = None):
    if starting_format == "embl":
        _do_embl_to_genbank()
    elif starting_format == "swissprot":
        _do_swissprot_to_genbank()
    else:
        _do_generic_to_genbank()

Does this make sense and seem like a good idea? Or am I still making
no darn sense?
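
One way this could be sketched is with a dispatch table keyed on the starting format, so that new "smart" conversions are added without growing the if/elif chain. The class name GenBankWriter and the _from_* helpers are hypothetical placeholders that only return a label; real converters would map the feature names:

```python
class GenBankWriter:
    """Hypothetical writer; the converter bodies are placeholders."""

    def __init__(self):
        # Map a known starting format to its specialised converter.
        self._converters = {
            "embl": self._from_embl,
            "swissprot": self._from_swissprot,
        }

    def write(self, record, starting_format=None):
        # Unknown or missing formats fall back to the generic converter.
        converter = self._converters.get(starting_format, self._from_generic)
        return converter(record)

    def _from_embl(self, record):
        return "embl->genbank: %s" % record

    def _from_swissprot(self, record):
        return "swissprot->genbank: %s" % record

    def _from_generic(self, record):
        return "generic->genbank: %s" % record
```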

[I ask about duplicated SeqRecord stuff]
> I copied everything so that I could play around without breaking e.g. your
> code. Now I think the changes are actually backward compatible - so we
> could move it back.

Yeah, you are more than welcome to add additions that are
back-compatible to the SeqRecord stuff. This will help eliminate
duplication (and thus lots of confusion for me :-).

> P.S. is anybody going to the Atlanta meeting in November ?

Well, I am not going (I was kind of scared away by the
"Bioinformatics after Human Genome" sub-title), but I am only an
hour away from Atlanta so I'll definitely "be in the area" :-). Many
of the talks look good though, so I may try to sneak in to listen to
a couple of them.

It's-okay-for-graduate-students-to-be-cheap-ly yr's,
Brad
-- 
PGP public key available from http://pgp.mit.edu/

From chapmanb at arches.uga.edu  Tue Oct  9 03:27:09 2001
From: chapmanb at arches.uga.edu (Brad Chapman)
Date: Sat Mar  5 14:43:05 2005
Subject: [Biopython-dev] Implementation of Application interface
Message-ID: <20011009032708.G13537@ci350185-a.athen1.ga.home.com>

Hello all;
I thought this might be of interest to Davide and others interested
in accessing applications through Biopython.

We talked a while back about a generic interface for specifying
command lines and running programs through biopython, in a thread
starting here:

http://www.biopython.org/pipermail/biopython-dev/2001-August/000476.html

Well, I've been working in biopython-corba on implementing
interfaces to remote programs (through Novella,
http://industry.ebi.ac.uk/novella/) and wrote up a "biopython-like"
interface that implements what we were talking about.

The code for this is in biopython-corba CVS in
BioCorba/Bio/Application/__init__.py, and on-line here (sorry 'bout
the long URL):

http://cvs.biopython.org/cgi-bin/viewcvs/viewcvs.cgi/biopython-corba/BioCorba/Bio/Application/__init__.py?rev=1.1&content-type=text/vnd.viewcvs-markup&cvsroot=biopython
                    
Anyways, I thought a working implementation of what we talked about
might be of interest to people, and once we
get the generic application stuff working in biopython, we can
synchronize these interfaces so the biopython-corba stuff models as
closely as possible the biopython code.

Comments on whether I accurately implemented what we talked about,
etc, are very welcome.

Brad
-- 
PGP public key available from http://pgp.mit.edu/

From jchang at SMI.Stanford.EDU  Wed Oct 10 03:56:30 2001
From: jchang at SMI.Stanford.EDU (Jeffrey Chang)
Date: Sat Mar  5 14:43:05 2005
Subject: [Biopython-dev] recent checkins to CVS
Message-ID: <p05101000b7e9ae7b6734@[192.168.0.4]>

Hello everyone,

I've checked in some new stuff into the CVS tree:

1.  I've implemented the fastpairwise dynamic programming code in C. 
It runs much faster now and is probably about as fast as it will get 
without making some more assumptions.

2.  There are 2 new modules in Bio.Tools.Classification: 
LogisticRegression and MaxEntropy.

3.  There is now some preliminary support for the NLM's XML format 
for Medline.  There's a Martel format definition and some code to 
index the files.  However, getting a parser to put things into a data 
structure will take some work and may not happen soon...

Enjoy, and let me know if anything breaks!

Jeff

From reillywu at yahoo.com  Fri Oct 12 23:29:28 2001
From: reillywu at yahoo.com (Chunlei Wu)
Date: Sat Mar  5 14:43:05 2005
Subject: [Biopython-dev] The 'year' attribute of parsed Medline record return empty
Message-ID: <20011013032928.78364.qmail@web20504.mail.yahoo.com>

Hi, all,
    Here is a sample code:
==========
....
cur_record=medline_dict[id]
print cur_record.year
=========

    The 'year' attribute always returns an empty string,
while the 'publication_date' attribute returns the correct
whole date string. When I looked into the __init__.py,
I found that the 'year' attribute corresponds to the 'YR'
qualifier. But 'YR' doesn't exist in most Medline
format records. Although we can get the year value easily
from cur_record.publication_date[:4], I think it's
better to give the proper value to the 'year' attribute.

Thanks

Chunlei Wu


From jchang at SMI.Stanford.EDU  Sat Oct 13 17:26:44 2001
From: jchang at SMI.Stanford.EDU (Jeffrey Chang)
Date: Sat Mar  5 14:43:05 2005
Subject: [Biopython-dev] The 'year' attribute of parsed Medline record
 return empty
In-Reply-To: <20011013032928.78364.qmail@web20504.mail.yahoo.com>
References: <20011013032928.78364.qmail@web20504.mail.yahoo.com>
Message-ID: <p05101001b7ee6152da48@[171.65.33.250]>

This is as designed.  The members of the Record class are supposed to 
mirror the information given in the MEDLARS format.  If there's no YR 
line, then the year member of the record is empty.

If you need the year member, it should be pretty simple to make a 
parser that uses it.  For example, you could do (untested):

from Bio import Medline

class MyParserWithYear:
    def parse(self, handle):
        rec = Medline.RecordParser().parse(handle)
        # No YR line: fall back to the start of the publication date
        if not rec.year:
            rec.year = rec.publication_date[:4]
        return rec


Jeff


At 8:29 PM -0700 10/12/01, Chunlei Wu wrote:
>Hi, all,
>     Here is a sample code:
>==========
>....
>cur_record=medline_dict[id]
>print cur_record.year
>=========
>
>     The 'year' attribute always returns an empty string,
>while the 'publication_date' attribute returns the correct
>whole date string. When I looked into the __init__.py,
>I found that the 'year' attribute corresponds to the 'YR'
>qualifier. But 'YR' doesn't exist in most Medline
>format records. Although we can get the year value easily
>from cur_record.publication_date[:4], I think it's
>better to give the proper value to the 'year' attribute.
>
>Thanks
>
>Chunlei Wu
>


From katel at worldpath.net  Sun Oct 14 00:48:58 2001
From: katel at worldpath.net (Cayte)
Date: Sat Mar  5 14:43:05 2005
Subject: [Biopython-dev] bio formats
Message-ID: <000701c1546b$8aeb6100$010a0a0a@cadence.com>

This looks like a great web site:

http://newfish.mbl.edu/Course/Software/FileFormats


                           Cayte


From katel at worldpath.net  Sun Oct 14 01:06:18 2001
From: katel at worldpath.net (Cayte)
Date: Sat Mar  5 14:43:05 2005
Subject: [Biopython-dev] another great web page
Message-ID: <000701c1546d$f66b9a60$010a0a0a@cadence.com>

http://www.sander.embl-ebi.ac.uk/Services/webin/help/webin-align/align_format_help.html

                                             Cayte


From katel at worldpath.net  Sun Oct 14 01:16:47 2001
From: katel at worldpath.net (Cayte)
Date: Sat Mar  5 14:43:05 2005
Subject: [Biopython-dev] Pir format
Message-ID: <000c01c1546f$6d3e7f80$010a0a0a@cadence.com>

  Maybe I'll try PIR next.  The other formats (MSF, PHYLIP, Nexus) contain
phylogenetic info.  I'm not sure how to fit phylogenetic data in with what
we have so far.  Annotation?  Or should we define phylogenetic classes?

  So far the MASE output seems fine.  But I've checked only 3 sequences so
far.  I'll wait for a rainy day before I slog through a letter-by-letter
check of some 30 sequences. :)


                                                        Cayte


From mkersz at pasteur.fr  Mon Oct 15 11:59:16 2001
From: mkersz at pasteur.fr (Michel Kerszberg)
Date: Sat Mar  5 14:43:05 2005
Subject: [Biopython-dev] Re: cleaning features
Message-ID: <3260000.1003161556@cricri>

Dear Brad,

The Bio.GenBank FeatureValueCleaner utility is what the doctor prescribed!
Personally, I would vote for applying it by default.

Meanwhile, I discovered that the GenBank BLAST parser stalls at the 
interactive map when this is included in the HTML file. I guess the parser 
should ignore anything enclosed in <PRE> and </PRE> tags that includes a 
"#graphical-overview" string. Mind you, it is no big deal to take this out by 
hand, or to check the right options when doing the BLAST!
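
A possible way to do that as a preprocessing step, sketched under the assumption that the map sits entirely inside one <PRE>...</PRE> block that mentions "#graphical-overview". The function name strip_graphical_overview is hypothetical and this is not how the Biopython parser is written:

```python
import re

# Non-greedy match of one <PRE>...</PRE> block, across newlines.
_PRE_BLOCK = re.compile(r"<PRE>.*?</PRE>", re.IGNORECASE | re.DOTALL)


def strip_graphical_overview(html):
    """Drop <PRE> blocks that contain the graphical overview map."""
    def _drop_if_overview(match):
        block = match.group(0)
        # Keep ordinary <PRE> blocks, remove only the image-map one.
        return "" if "#graphical-overview" in block else block

    return _PRE_BLOCK.sub(_drop_if_overview, html)
```

The cleaned text could then be handed to the parser in place of the original page.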

Also, parsing of TBLASTX records stalls due to the unusual format of the 
final information:

Matrix: BLOSUM62
Number of Hits to DB: 10,447,982,379
Number of Sequences: 988209
Number of extensions: 209322370
Number of successful extensions: 17383937
Number of sequences better than 1.0e-50: 110
length of database: 1,426,479,391
effective HSP length: 60
effective length of database: 1,367,186,851
effective search space used: 869530837236
frameshift window, decay const: 50,  0.5
T: 13
A: 40
X1: 16 ( 7.3 bits)
X2: 0 ( 0.0 bits)
S1: 41 (21.7 bits)
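
The message does not say which of these lines trips the parser. As a sketch only, a tolerant reader for such "key: value" lines could accept thousands separators in integers and pass anything else through as text. The name parse_stat_line is hypothetical:

```python
def parse_stat_line(line):
    """Split a 'key: value' line; return None if there is no colon.

    Values made only of digits and commas become ints (commas are
    treated as thousands separators, which is an assumption). All
    other values are returned as stripped strings.
    """
    key, sep, value = line.partition(":")
    if not sep:
        return None
    value = value.strip()
    digits = value.replace(",", "")
    if digits.isdigit():
        return key.strip(), int(digits)
    return key.strip(), value
```

For example, "Number of Hits to DB: 10,447,982,379" yields an integer, while "frameshift window, decay const: 50,  0.5" is kept as the string "50,  0.5".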

Thanks for the good work! I appreciate biopython more and more (not to 
speak of python)

Best regards,

Michel


From jchang at SMI.Stanford.EDU  Mon Oct 15 14:05:52 2001
From: jchang at SMI.Stanford.EDU (Jeffrey Chang)
Date: Sat Mar  5 14:43:05 2005
Subject: [Biopython-dev] Re: cleaning features
In-Reply-To: <3260000.1003161556@cricri>
References: <3260000.1003161556@cricri>
Message-ID: <p05101003b7f0d5588faf@[171.65.33.250]>

>Also, parsing of TBLASTX records stalls due to the unusual format of 
>the final information:
>
>Matrix: BLOSUM62
>Number of Hits to DB: 10,447,982,379
>Number of Sequences: 988209
>Number of extensions: 209322370
>Number of successful extensions: 17383937
>Number of sequences better than 1.0e-50: 110
>length of database: 1,426,479,391
>effective HSP length: 60
>effective length of database: 1,367,186,851
>effective search space used: 869530837236
>frameshift window, decay const: 50,  0.5
>T: 13
>A: 40
>X1: 16 ( 7.3 bits)
>X2: 0 ( 0.0 bits)
>S1: 41 (21.7 bits)

Thanks for pointing this out.  The Standalone parser gets used more 
often, so fixes make it into there more often than the web one.  I'll 
update the WWW parser, and it should fix this problem.

jeff

From biopython-bugs at bioperl.org  Tue Oct 23 18:50:52 2001
From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org)
Date: Sat Mar  5 14:43:05 2005
Subject: [Biopython-dev] Notification: incoming/44
Message-ID: <200110232250.f9NMoqB13198@pw600a.bioperl.org>

JitterBug notification

new message incoming/44

Message summary for PR#44
	From: gec@compbio.berkeley.edu
	Subject: Raised no existant error?
	Date: Tue, 23 Oct 2001 18:50:52 -0400
	0 replies 	0 followups

====> ORIGINAL MESSAGE FOLLOWS <====

>From gec@compbio.berkeley.edu Tue Oct 23 18:50:52 2001
Received: from localhost (localhost [127.0.0.1])
	by pw600a.bioperl.org (8.11.2/8.11.2) with ESMTP id f9NMoqB13192
	for <biopython-bugs@pw600a.bioperl.org>; Tue, 23 Oct 2001 18:50:52 -0400
Date: Tue, 23 Oct 2001 18:50:52 -0400
Message-Id: <200110232250.f9NMoqB13192@pw600a.bioperl.org>
From: gec@compbio.berkeley.edu
To: biopython-bugs@bioperl.org
Subject: Raised no existant error?

Full_Name: Gavin Crooks
Module: SCOP/Dom.py
Version: 
OS: 
Submission from: sienna.berkeley.edu (128.32.236.51)


When fed a corrupt file Dom.DomainParser will 
attempt to raise "error", but error hasn't been 
defined.

NameError: global name 'error' is not defined
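
A small sketch of the failure mode and a fix: raising an undefined name produces a NameError instead of the intended parse error, so the code should raise an exception class that exists. The function parse_record and its two-field record layout are made up for illustration and are not the SCOP Dom format:

```python
def parse_record(line):
    """Split a record into exactly two whitespace-separated fields."""
    fields = line.split()
    if len(fields) != 2:
        # Raise a defined exception class rather than a bare name
        # such as `error`, which would itself fail with NameError.
        raise SyntaxError("malformed record: %r" % line)
    return tuple(fields)
```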


From biopython-bugs at bioperl.org  Tue Oct 23 18:54:40 2001
From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org)
Date: Sat Mar  5 14:43:05 2005
Subject: [Biopython-dev] Notification: incoming/45
Message-ID: <200110232254.f9NMseB13272@pw600a.bioperl.org>

JitterBug notification

new message incoming/45

Message summary for PR#45
	From: gec@compbio.berkeley.edu
	Subject: PDB sequence numbers can be negative
	Date: Tue, 23 Oct 2001 18:54:38 -0400
	0 replies 	0 followups

====> ORIGINAL MESSAGE FOLLOWS <====

>From gec@compbio.berkeley.edu Tue Oct 23 18:54:38 2001
Received: from localhost (localhost [127.0.0.1])
	by pw600a.bioperl.org (8.11.2/8.11.2) with ESMTP id f9NMscB13266
	for <biopython-bugs@pw600a.bioperl.org>; Tue, 23 Oct 2001 18:54:38 -0400
Date: Tue, 23 Oct 2001 18:54:38 -0400
Message-Id: <200110232254.f9NMscB13266@pw600a.bioperl.org>
From: gec@compbio.berkeley.edu
To: biopython-bugs@bioperl.org
Subject: PDB sequence numbers can be negative

Full_Name: Gavin Crooks
Module: SCOP/Location.py
Version: 
OS: 
Submission from: sienna.berkeley.edu (128.32.236.51)


PDB residue sequence numbers can, on occasion, be
negative. e.g. 1B9N. SCOP domains sometimes start
on negative sequence numbers. This breaks the
location parser in Bio.SCOP.Location.py
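
To illustrate why negative numbers are awkward, here is a sketch that assumes a residue range is written as "start-end". Splitting on "-" breaks as soon as either number is negative, whereas a regular expression with an optional sign on each number does not. The name parse_residue_range is hypothetical, and insertion codes such as "12A" are not handled:

```python
import re

# Optional minus sign on each number, with a '-' separator in between.
_RANGE = re.compile(r"^(-?\d+)-(-?\d+)$")


def parse_residue_range(text):
    """Return (start, end) as ints for a range such as '-4-200'."""
    match = _RANGE.match(text)
    if match is None:
        raise ValueError("not a residue range: %r" % text)
    return int(match.group(1)), int(match.group(2))
```

By contrast, "-4-200".split("-") gives ['', '4', '200'], which loses the sign.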


From biopython-bugs at bioperl.org  Tue Oct 23 18:56:44 2001
From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org)
Date: Sat Mar  5 14:43:05 2005
Subject: [Biopython-dev] Notification: incoming/46
Message-ID: <200110232256.f9NMuiB13336@pw600a.bioperl.org>

JitterBug notification

new message incoming/46

Message summary for PR#46
	From: gec@compbio.berkeley.edu
	Subject: PDB sequence numbers can be negative
	Date: Tue, 23 Oct 2001 18:56:44 -0400
	0 replies 	0 followups

====> ORIGINAL MESSAGE FOLLOWS <====

>From gec@compbio.berkeley.edu Tue Oct 23 18:56:44 2001
Received: from localhost (localhost [127.0.0.1])
	by pw600a.bioperl.org (8.11.2/8.11.2) with ESMTP id f9NMuiB13330
	for <biopython-bugs@pw600a.bioperl.org>; Tue, 23 Oct 2001 18:56:44 -0400
Date: Tue, 23 Oct 2001 18:56:44 -0400
Message-Id: <200110232256.f9NMuiB13330@pw600a.bioperl.org>
From: gec@compbio.berkeley.edu
To: biopython-bugs@bioperl.org
Subject: PDB sequence numbers can be negative

Full_Name: Gavin Crooks
Module: SCOP/Location.py
Version: 
OS: 
Submission from: sienna.berkeley.edu (128.32.236.51)


PDB residue sequence numbers can, on occasion, be
negative. e.g. 1B9N. SCOP domains sometimes start
on negative sequence numbers. This breaks the
location parser in Bio.SCOP.Location.py


From biopython-bugs at bioperl.org  Tue Oct 23 23:19:42 2001
From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org)
Date: Sat Mar  5 14:43:05 2005
Subject: [Biopython-dev] Notification: incoming/47
Message-ID: <200110240319.f9O3JgB15062@pw600a.bioperl.org>

JitterBug notification

new message incoming/47

Message summary for PR#47
	From: gec@compbio.berkeley.edu
	Subject: Tutorial typos
	Date: Tue, 23 Oct 2001 23:19:41 -0400
	0 replies 	0 followups

====> ORIGINAL MESSAGE FOLLOWS <====

>From gec@compbio.berkeley.edu Tue Oct 23 23:19:41 2001
Received: from localhost (localhost [127.0.0.1])
	by pw600a.bioperl.org (8.11.2/8.11.2) with ESMTP id f9O3JfB15056
	for <biopython-bugs@pw600a.bioperl.org>; Tue, 23 Oct 2001 23:19:41 -0400
Date: Tue, 23 Oct 2001 23:19:41 -0400
Message-Id: <200110240319.f9O3JfB15056@pw600a.bioperl.org>
From: gec@compbio.berkeley.edu
To: biopython-bugs@bioperl.org
Subject: Tutorial typos

Full_Name: Gavin Crooks
Module: Tutorial.tex
Version: 
OS: 
Submission from: sdn-ar-013casfrmp012.dialsprint.net (158.252.217.14)


The tutorial contains a few minor bugs.

Page 5: "Installation of FreeBSD" should be "Installation on FreeBSD"

Page 6: The first sentence of section 1.3.3 does not make sense.

Everywhere: "ie." should be "i.~e.~TheNextWord", or "i.~e.,"

Page 11 : "created for free for you" should be "created for free"?

Page 43ish: Some HTML has worked its way into the tex file, producing some odd
symbols. Plus some of the numbers have hats on.


From biopython-bugs at bioperl.org  Wed Oct 24 13:17:43 2001
From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org)
Date: Sat Mar  5 14:43:05 2005
Subject: [Biopython-dev] Notification: incoming/48
Message-ID: <200110241717.f9OHHhB21139@pw600a.bioperl.org>

JitterBug notification

new message incoming/48

Message summary for PR#48
	From: gec@compbio.berkeley.edu
	Subject: Unclosed file
	Date: Wed, 24 Oct 2001 13:17:43 -0400
	0 replies 	0 followups

====> ORIGINAL MESSAGE FOLLOWS <====

>From gec@compbio.berkeley.edu Wed Oct 24 13:17:43 2001
Received: from localhost (localhost [127.0.0.1])
	by pw600a.bioperl.org (8.11.2/8.11.2) with ESMTP id f9OHHgB21133
	for <biopython-bugs@pw600a.bioperl.org>; Wed, 24 Oct 2001 13:17:43 -0400
Date: Wed, 24 Oct 2001 13:17:43 -0400
Message-Id: <200110241717.f9OHHgB21133@pw600a.bioperl.org>
From: gec@compbio.berkeley.edu
To: biopython-bugs@bioperl.org
Subject: Unclosed file

Full_Name: Gavin Crooks
Module: ParserSupport.AbstractParser
Version: 
OS: 
Submission from: sdn-ar-005casfrmp182.dialsprint.net (158.252.212.184)


AbstractParser.parse_file(self,filename) does not close the file it opens.
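
A sketch of closing the file explicitly. The AbstractParser below is a simplified stand-in, not the actual Bio.ParserSupport class; the with statement guarantees the handle is closed even if parse raises:

```python
class AbstractParser:
    """Simplified stand-in for a parser base class."""

    def parse(self, handle):
        raise NotImplementedError("subclasses must implement parse")

    def parse_file(self, filename):
        # The handle is closed when the block exits, error or not.
        with open(filename) as handle:
            return self.parse(handle)
```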


From biopython-bugs at bioperl.org  Wed Oct 24 19:50:14 2001
From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org)
Date: Sat Mar  5 14:43:05 2005
Subject: [Biopython-dev] Notification: incoming/48
Message-ID: <200110242350.f9ONoEB24543@pw600a.bioperl.org>

JitterBug notification

jchang changed notes

Message summary for PR#48
	From: gec@compbio.berkeley.edu
	Subject: Unclosed file
	Date: Wed, 24 Oct 2001 13:17:43 -0400
	0 replies 	0 followups
	Notes: It gets closed implicitly as the reference in parse goes out of scope.  However,
you're right that it's better to be done explicitly, so I've made the changes in
the file.

Thanks,
Jeff




From biopython-bugs at bioperl.org  Wed Oct 24 19:50:14 2001
From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org)
Date: Sat Mar  5 14:43:05 2005
Subject: [Biopython-dev] Notification: incoming/48
Message-ID: <200110242350.f9ONoEB24547@pw600a.bioperl.org>

JitterBug notification

jchang moved PR#48 from incoming to fixed-bugs
Message summary for PR#48
	From: gec@compbio.berkeley.edu
	Subject: Unclosed file
	Date: Wed, 24 Oct 2001 13:17:43 -0400
	0 replies 	0 followups
	Notes: It gets closed implicitly as the reference in parse goes out of scope.  However,
you're right that it's better to be done explicitly, so I've made the changes in
the file.

Thanks,
Jeff




From biopython-bugs at bioperl.org  Wed Oct 24 19:54:54 2001
From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org)
Date: Sat Mar  5 14:43:05 2005
Subject: [Biopython-dev] Notification: incoming/44
Message-ID: <200110242354.f9ONsrB24695@pw600a.bioperl.org>

JitterBug notification

jchang changed notes

Message summary for PR#44
	From: gec@compbio.berkeley.edu
	Subject: Raised no existant error?
	Date: Tue, 23 Oct 2001 18:50:52 -0400
	0 replies 	0 followups
	Notes: Oops, you're right.  Changed to SyntaxError.

Jeff


====> ORIGINAL MESSAGE FOLLOWS <====

>From gec@compbio.berkeley.edu Tue Oct 23 18:50:52 2001
Received: from localhost (localhost [127.0.0.1])
	by pw600a.bioperl.org (8.11.2/8.11.2) with ESMTP id f9NMoqB13192
	for <biopython-bugs@pw600a.bioperl.org>; Tue, 23 Oct 2001 18:50:52 -0400
Date: Tue, 23 Oct 2001 18:50:52 -0400
Message-Id: <200110232250.f9NMoqB13192@pw600a.bioperl.org>
From: gec@compbio.berkeley.edu
To: biopython-bugs@bioperl.org
Subject: Raised no existant error?

Full_Name: Gavin Crooks
Module: SCOP/Dom.py
Version: 
OS: 
Submission from: sienna.berkeley.edu (128.32.236.51)


When fed a corrupt file Dom.DomainParser will 
attempt to raise "error", but error hasn't been 
defined.

NameError: global name 'error' is not defined
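
A minimal sketch of the failure mode described above, written for a current Python. The function names are hypothetical and are not taken from Dom.py. It shows why raising an undefined name surfaces as NameError instead of the intended parse error.

```python
def parse_record_broken(line):
    """Hypothetical parser step that raises a name nobody defined."""
    if not line.startswith("d"):
        # 'error' is looked up only when this line runs, so the bug
        # stays hidden until a corrupt record is actually seen.
        raise error("corrupt record")
    return line


def parse_record_fixed(line):
    """Same check, raising an exception class that exists."""
    if not line.startswith("d"):
        raise SyntaxError("corrupt record: %r" % line)
    return line


def exception_name(func, line):
    """Return the name of the exception func(line) raises, or None."""
    try:
        func(line)
    except Exception as exc:
        return type(exc).__name__
    return None
```

Calling exception_name(parse_record_broken, "x") reports NameError, while the fixed variant reports SyntaxError for the same input.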


From biopython-bugs at bioperl.org  Wed Oct 24 19:54:54 2001
From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org)
Date: Sat Mar  5 14:43:05 2005
Subject: [Biopython-dev] Notification: incoming/44
Message-ID: <200110242354.f9ONssB24699@pw600a.bioperl.org>

JitterBug notification

jchang moved PR#44 from incoming to fixed-bugs
Message summary for PR#44
	From: gec@compbio.berkeley.edu
	Subject: Raised no existant error?
	Date: Tue, 23 Oct 2001 18:50:52 -0400
	0 replies 	0 followups
	Notes: Oops, you're right.  Changed to SyntaxError.

Jeff



From biopython-bugs at bioperl.org  Wed Oct 24 19:56:24 2001
From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org)
Date: Sat Mar  5 14:43:05 2005
Subject: [Biopython-dev] Notification: incoming/46
Message-ID: <200110242356.f9ONuOB24799@pw600a.bioperl.org>

JitterBug notification

jchang changed notes

Message summary for PR#46
	From: gec@compbio.berkeley.edu
	Subject: PDB sequence numbers can be negative
	Date: Tue, 23 Oct 2001 18:56:44 -0400
	0 replies 	0 followups
	Notes: dup of 45


====> ORIGINAL MESSAGE FOLLOWS <====

>From gec@compbio.berkeley.edu Tue Oct 23 18:56:44 2001
Received: from localhost (localhost [127.0.0.1])
	by pw600a.bioperl.org (8.11.2/8.11.2) with ESMTP id f9NMuiB13330
	for <biopython-bugs@pw600a.bioperl.org>; Tue, 23 Oct 2001 18:56:44 -0400
Date: Tue, 23 Oct 2001 18:56:44 -0400
Message-Id: <200110232256.f9NMuiB13330@pw600a.bioperl.org>
From: gec@compbio.berkeley.edu
To: biopython-bugs@bioperl.org
Subject: PDB sequence numbers can be negative

Full_Name: Gavin Crooks
Module: SCOP/Location.py
Version: 
OS: 
Submission from: sienna.berkeley.edu (128.32.236.51)


PDB residue sequence numbers can, on occasion, be
negative. e.g. 1B9N. SCOP domains sometimes start
on negative sequence numbers. This breaks the
location parser in Bio.SCOP.Location.py


From biopython-bugs at bioperl.org  Wed Oct 24 19:56:24 2001
From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org)
Date: Sat Mar  5 14:43:05 2005
Subject: [Biopython-dev] Notification: incoming/46
Message-ID: <200110242356.f9ONuOB24803@pw600a.bioperl.org>

JitterBug notification

jchang moved PR#46 from incoming to fixed-bugs
Message summary for PR#46
	From: gec@compbio.berkeley.edu
	Subject: PDB sequence numbers can be negative
	Date: Tue, 23 Oct 2001 18:56:44 -0400
	0 replies 	0 followups
	Notes: dup of 45



From biopython-bugs at bioperl.org  Wed Oct 24 19:57:16 2001
From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org)
Date: Sat Mar  5 14:43:05 2005
Subject: [Biopython-dev] Notification: incoming/49
Message-ID: <200110242357.f9ONvGB24883@pw600a.bioperl.org>

JitterBug notification

new message incoming/49

Message summary for PR#49
	From: Jeffrey Chang <jchang@SMI.Stanford.EDU>
	Subject: Re: [Biopython-dev] Notification: incoming/46
	Date: Wed, 24 Oct 2001 16:58:30 -0700
	0 replies 	0 followups

====> ORIGINAL MESSAGE FOLLOWS <====

>From jchang@SMI.Stanford.EDU Wed Oct 24 19:57:15 2001
Received: from crg-gw.Stanford.EDU (root@crg-gw.Stanford.EDU [171.65.32.201])
	by pw600a.bioperl.org (8.11.2/8.11.2) with ESMTP id f9ONvAB24866
	for <biopython-bugs@bioperl.org>; Wed, 24 Oct 2001 19:57:15 -0400
Received: from [171.65.33.250] (air11-smi.Stanford.EDU [171.65.33.250])
	by crg-gw.Stanford.EDU (8.11.5/8.11.5) with ESMTP id f9ONvEC09544
	for <biopython-bugs@bioperl.org>; Wed, 24 Oct 2001 16:57:14 -0700 (PDT)
Mime-Version: 1.0
X-Sender: jchang@smi.stanford.edu (Unverified)
Message-Id: <p05101004b7fd060a5fa5@[171.65.33.250]>
In-Reply-To: <200110232256.f9NMuiB13336@pw600a.bioperl.org>
References: <200110232256.f9NMuiB13336@pw600a.bioperl.org>
Date: Wed, 24 Oct 2001 16:58:30 -0700
To: biopython-bugs@bioperl.org
From: Jeffrey Chang <jchang@SMI.Stanford.EDU>
Subject: Re: [Biopython-dev] Notification: incoming/46
Content-Type: text/plain; charset="us-ascii" ; format="flowed"

Hi Gavin,

Could you send me a sample of this?  It'll be helpful to have a test 
case to test fixes.

Thanks,
Jeff

>JitterBug notification
>
>new message incoming/46
>
>Message summary for PR#46
>	From: gec@compbio.berkeley.edu
>	Subject: PDB sequence numbers can be negative
>	Date: Tue, 23 Oct 2001 18:56:44 -0400
>	0 replies	0 followups
>
>====> ORIGINAL MESSAGE FOLLOWS <====
>
>>From gec@compbio.berkeley.edu Tue Oct 23 18:56:44 2001
>Received: from localhost (localhost [127.0.0.1])
>	by pw600a.bioperl.org (8.11.2/8.11.2) with ESMTP id f9NMuiB13330
>	for <biopython-bugs@pw600a.bioperl.org>; Tue, 23 Oct 2001 
>18:56:44 -0400
>Date: Tue, 23 Oct 2001 18:56:44 -0400
>Message-Id: <200110232256.f9NMuiB13330@pw600a.bioperl.org>
>From: gec@compbio.berkeley.edu
>To: biopython-bugs@bioperl.org
>Subject: PDB sequence numbers can be negative
>
>Full_Name: Gavin Crooks
>Module: SCOP/Location.py
>Version:
>OS:
>Submission from: sienna.berkeley.edu (128.32.236.51)
>
>
>
>PDB residue sequence numbers can, on occasion, be
>negative. e.g. 1B9N. SCOP domains sometimes start
>on negative sequence numbers. This breaks the
>location parser in Bio.SCOP.Location.py
>
>
>_______________________________________________
>Biopython-dev mailing list
>Biopython-dev@biopython.org
>http://biopython.org/mailman/listinfo/biopython-dev


From biopython-bugs at bioperl.org  Wed Oct 24 20:49:43 2001
From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org)
Date: Sat Mar  5 14:43:05 2005
Subject: [Biopython-dev] Notification: incoming/50
Message-ID: <200110250049.f9P0nhB25253@pw600a.bioperl.org>

JitterBug notification

new message incoming/50

Message summary for PR#50
	From: "Gavin E. Crooks" <gec@compbio.berkeley.edu>
	Subject: Re: [Biopython-dev] Notification: incoming/49
	Date: Wed, 24 Oct 2001 17:40:52 -0700
	0 replies 	0 followups

====> ORIGINAL MESSAGE FOLLOWS <====

>From gec@sienna.berkeley.edu Wed Oct 24 20:49:43 2001
Received: from sienna.berkeley.edu (IDENT:root@sienna.Berkeley.EDU [128.32.236.51])
	by pw600a.bioperl.org (8.11.2/8.11.2) with ESMTP id f9P0ngB25247
	for <biopython-bugs@bioperl.org>; Wed, 24 Oct 2001 20:49:42 -0400
Received: from localhost (localhost [[UNIX: localhost]])
	by sienna.berkeley.edu (8.9.3/8.9.3) id RAA03432
	for biopython-bugs@bioperl.org; Wed, 24 Oct 2001 17:49:42 -0700
From: "Gavin E. Crooks" <gec@compbio.berkeley.edu>
Reply-To: gec@compbio.berkeley.edu
Organization: Very Little
To: biopython-bugs@bioperl.org
Subject: Re: [Biopython-dev] Notification: incoming/49
Date: Wed, 24 Oct 2001 17:40:52 -0700
X-Mailer: KMail [version 1.0.29]
Content-Type: text/plain
References: <200110242357.f9ONvGB24883@pw600a.bioperl.org>
In-Reply-To: <200110242357.f9ONvGB24883@pw600a.bioperl.org>
MIME-Version: 1.0
Message-Id: <01102417494205.14420@sienna.berkeley.edu>
Content-Transfer-Encoding: 8bit


How about  "A:-1-126", direct from SCOP...
16118	px	a.4.5.8	d1b9ma1	1b9m A:-1-126

I am in the middle of updating the SCOP module, and I have already
refactored that code, and fixed this bug. And I've written a nice shiny
unit test.  But I was concerned that this same bug could crop up elsewhere.
It's the kind of obscure boundary case that could trip up any code working 
with PDB sequence numbers.
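
A minimal sketch of parsing such a location, assuming the form is an optional chain letter and colon followed by start-end, where either number may carry a minus sign. The function name and pattern are illustrative only, not the code in Bio.SCOP, and insertion codes are ignored.

```python
import re

# Assumed shape: optional "<chain>:" prefix, then "<start>-<end>".
# Each number may have its own leading minus sign.
_LOCATION = re.compile(
    r"^(?:(?P<chain>\w):)?(?P<start>-?\d+)-(?P<end>-?\d+)$"
)


def parse_location(text):
    """Split a location such as 'A:-1-126' into (chain, start, end)."""
    match = _LOCATION.match(text)
    if match is None:
        raise ValueError("unrecognised location: %r" % text)
    return (
        match.group("chain"),
        int(match.group("start")),
        int(match.group("end")),
    )


# Splitting on "-" alone cannot tell the sign from the separator:
naive = "-1-126".split("-")
```

Here naive comes out as three pieces with an empty first element, which is the kind of result that breaks code expecting exactly two numbers.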

Gavin

gec@compbio.berkeley.edu
http://threeplusone.com

> Hi Gavin,
> 
> Could you send me a sample of this?  It'll be helpful to have a test 
> case to test fixes.
> 
> Thanks,
> Jeff
>
> >Full_Name: Gavin Crooks
> >Module: SCOP/Location.py
> >Version:
> >OS:
> >Submission from: sienna.berkeley.edu (128.32.236.51)
> >
> >PDB residue sequence numbers can, on occasion, be
> >negative. e.g. 1B9N. SCOP domains sometimes start
> >on negative sequence numbers. This breaks the
> >location parser in Bio.SCOP.Location.py
>


From gec at compbio.berkeley.edu  Wed Oct 24 21:03:46 2001
From: gec at compbio.berkeley.edu (Gavin E. Crooks)
Date: Sat Mar  5 14:43:05 2005
Subject: [Biopython-dev] Notification: incoming/48
In-Reply-To: <200110242350.f9ONoEB24543@pw600a.bioperl.org>
References: <200110242350.f9ONoEB24543@pw600a.bioperl.org>
Message-ID: <01102418073807.14420@sienna.berkeley.edu>

The new code doesn't work as intended, since parse() may raise an exception.

This

    def parse_file(self, filename):
        h = open(filename)
        retval = self.parse(h)
        h.close()
        return retval

should be

    def parse_file(self, filename):
        h = open(filename)
        try:
            return self.parse(h)
        finally:
            h.close()
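
To illustrate the point, here is a self-contained sketch for a current Python. TrackingParser is a hypothetical stand-in, not the real AbstractParser, and its parse() always fails so the exceptional path is exercised.

```python
import os
import tempfile


class TrackingParser:
    """Hypothetical stand-in for a parser class with parse_file()."""

    def __init__(self):
        self.last_handle = None

    def parse(self, handle):
        # Simulate a parser that fails on corrupt input.
        raise ValueError("corrupt record")

    def parse_file(self, filename):
        h = open(filename)
        self.last_handle = h  # kept only so the demo can inspect it
        try:
            return self.parse(h)
        finally:
            # Runs whether parse() returns normally or raises.
            h.close()


def demo():
    """Return True if the file was closed despite parse() raising."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    parser = TrackingParser()
    try:
        parser.parse_file(path)
    except ValueError:
        pass
    finally:
        os.remove(path)
    return parser.last_handle.closed
```

With the close() placed after the call to parse() and no try/finally, the same demo would leave the handle open until it was garbage collected.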
 
Gavin

p.s. The viewcvs diff appears to be broken.


On Wed, 24 Oct 2001, you wrote:
> JitterBug notification
> 
> jchang changed notes
> 
> Message summary for PR#48
> 	From: gec@compbio.berkeley.edu
> 	Subject: Unclosed file
> 	Date: Wed, 24 Oct 2001 13:17:43 -0400
> 	0 replies 	0 followups
> 	Notes: It gets closed implicitly as the reference in parse goes out of scope.  However,
> you're right that it's better to be done explicitly, so I've made the changes in
> the file.
> 
> Thanks,
> Jeff
> 
> 
> ====> ORIGINAL MESSAGE FOLLOWS <====
> 
> From gec@compbio.berkeley.edu Wed Oct 24 13:17:43 2001
> Received: from localhost (localhost [127.0.0.1])
> 	by pw600a.bioperl.org (8.11.2/8.11.2) with ESMTP id f9OHHgB21133
> 	for <biopython-bugs@pw600a.bioperl.org>; Wed, 24 Oct 2001 13:17:43 -0400
> Date: Wed, 24 Oct 2001 13:17:43 -0400
> Message-Id: <200110241717.f9OHHgB21133@pw600a.bioperl.org>
> From: gec@compbio.berkeley.edu
> To: biopython-bugs@bioperl.org
> Subject: Unclosed file
> 
> Full_Name: Gavin Crooks
> Module: ParserSupport.AbstractParser
> Version: 
> OS: 
> Submission from: sdn-ar-005casfrmp182.dialsprint.net (158.252.212.184)
> 
> 
> AbstractParser.parse_file(self,filename) does not close the file it opens.
> 
> 

From biopython-bugs at bioperl.org  Wed Oct 24 21:56:27 2001
From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org)
Date: Sat Mar  5 14:43:05 2005
Subject: [Biopython-dev] Notification: incoming/47
Message-ID: <200110250156.f9P1uRB25817@pw600a.bioperl.org>

JitterBug notification

chapmanb changed notes

Message summary for PR#47
	From: gec@compbio.berkeley.edu
	Subject: Tutorial typos
	Date: Tue, 23 Oct 2001 23:19:41 -0400
	0 replies 	0 followups
	Notes: Thanks for the pointers! All are fixed in the tex file and on
the web.


====> ORIGINAL MESSAGE FOLLOWS <====

>From gec@compbio.berkeley.edu Tue Oct 23 23:19:41 2001
Received: from localhost (localhost [127.0.0.1])
	by pw600a.bioperl.org (8.11.2/8.11.2) with ESMTP id f9O3JfB15056
	for <biopython-bugs@pw600a.bioperl.org>; Tue, 23 Oct 2001 23:19:41 -0400
Date: Tue, 23 Oct 2001 23:19:41 -0400
Message-Id: <200110240319.f9O3JfB15056@pw600a.bioperl.org>
From: gec@compbio.berkeley.edu
To: biopython-bugs@bioperl.org
Subject: Tutorial typos

Full_Name: Gavin Crooks
Module: Tutotial.tex
Version: 
OS: 
Submission from: sdn-ar-013casfrmp012.dialsprint.net (158.252.217.14)


The tutorial contains a few minor bugs.

Page 5: "Installation of FreeBSD" should be "Installation on FreeBSD"

Page 6: The first sentence of section 1.3.3 does not make sense.

Everywhere: "ie." should be "i.~e.~TheNextWord", or "i.~e.,"

Page 11 : "created for free for you" should be "created for free"?

Page 43ish: Some HTML has worked its way into the tex file, producing some odd
symbols. Plus some of the numbers have hats on.


From biopython-bugs at bioperl.org  Wed Oct 24 21:56:27 2001
From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org)
Date: Sat Mar  5 14:43:05 2005
Subject: [Biopython-dev] Notification: incoming/47
Message-ID: <200110250156.f9P1uRB25821@pw600a.bioperl.org>

JitterBug notification

chapmanb moved PR#47 from incoming to fixed-bugs
Message summary for PR#47
	From: gec@compbio.berkeley.edu
	Subject: Tutorial typos
	Date: Tue, 23 Oct 2001 23:19:41 -0400
	0 replies 	0 followups
	Notes: Thanks for the pointers! All are fixed in the tex file and on
the web.



From pewilkinson at informaxinc.com  Wed Oct 24 22:19:04 2001
From: pewilkinson at informaxinc.com (Peter Wilkinson)
Date: Sat Mar  5 14:43:06 2005
Subject: [Biopython-dev] mxTextools install and biopython 2.1
In-Reply-To: <200110241603.f9OG32B20618@pw600a.bioperl.org>
Message-ID: <005501c15cfb$6b776ce0$3ac53604@l001696w00>

Does anyone know why mxTextTools is strangely configured? If ActiveState
Python 2.1 comes with Martel, and we install Biopython in the root of the
install as Bio:

How is mxTextTools supposed to be linked up properly, and how and where is it
installed?  I had a problem with my install and I had to redo it. I cannot
figure it out.

anyone?

Peter



From idoerg at cc.huji.ac.il  Thu Oct 25 03:55:12 2001
From: idoerg at cc.huji.ac.il (Iddo Friedberg)
Date: Sat Mar  5 14:43:06 2005
Subject: [Biopython-dev] mxTextools install and biopython 2.1
In-Reply-To: <005501c15cfb$6b776ce0$3ac53604@l001696w00>
Message-ID: <Pine.GSO.4.33_heb2.09.0110250946290.7307-200000@new-shum>

Hi Peter,

On Wed, 24 Oct 2001, Peter Wilkinson wrote:

:
: Does anyone know why the mxTexttools is strangely configured? If activestate
: Python 2.1 comes with Martel, and we install Biopython in the root of the
: install as Bio:
:
: How is mxTexttools supposed to be linked up properly, how and where is it
: installed?  I had a problem with my install and I had to redo it. I can not
: figure it out.
:

Hi Peter,

I remember running into the same problems myself, though I'm a bit fuzzy
about the whys and wherefores of the solution. Anyhow, it works for me,
so I checked my

/usr/local/lib/python2.1/site-packages/mx tree. Apparently I put a dummy
__init__.py file under the mx/ directory, and the rest is under mx/TextTools

I'm attaching the tree scheme. I hope this helps.

Iddo


--

Iddo Friedberg                                  | Tel: +972-2-6757374
Dept. of Molecular Genetics and Biotechnology   | Fax: +972-2-6757308
The Hebrew University - Hadassah Medical School | email: idoerg@cc.huji.ac.il
POB 12272, Jerusalem 91120                      |
Israel                                          |
http://bioinfo.md.huji.ac.il/marg/people-home/iddo/


-------------- next part --------------
From idoerg@arrakis.md.huji.ac.il Thu Oct 25 09:53:00 2001
Date: Thu, 25 Oct 2001 09:49:33 +0200
From: Iddo Friedberg <idoerg@arrakis.md.huji.ac.il>
To: idoerg@cc.huji.ac.il

/usr/local/lib/python2.1/site-packages/mx/
|-- TextTools
|   |-- Constants
|   |   |-- Sets.py
|   |   |-- Sets.pyc
|   |   |-- TagTables.py
|   |   |-- TagTables.pyc
|   |   |-- __init__.py
|   |   `-- __init__.pyc
|   |-- Doc
|   |   `-- mxTextTools.html
|   |-- Examples
|   |   |-- HTML.py
|   |   |-- HTML.pyc
|   |   |-- Loop.py
|   |   |-- Python.py
|   |   |-- RTF.py
|   |   |-- RegExp.py
|   |   |-- Tim.py
|   |   |-- Words.py
|   |   |-- __init__.py
|   |   |-- __init__.pyc
|   |   |-- altRTF.py
|   |   `-- pytag.py
|   |-- LICENSE
|   |-- Makefile.pkg
|   |-- README
|   |-- TextTools.py
|   |-- TextTools.pyc
|   |-- __init__.py
|   |-- __init__.pyc
|   `-- mxTextTools
|       |-- Makefile
|       |-- Makefile.pre
|       |-- Makefile.pre.in
|       |-- Setup
|       |-- Setup.in
|       |-- __init__.py
|       |-- __init__.pyc
|       |-- config.c
|       |-- mx.h
|       |-- mxTextTools.c
|       |-- mxTextTools.def
|       |-- mxTextTools.h
|       |-- mxTextTools.o
|       |-- mxTextTools.pyd
|       |-- mxTextTools.so
|       |-- mxbmse.c
|       |-- mxbmse.h
|       |-- mxbmse.o
|       |-- mxh.h
|       |-- mxpyapi.h
|       |-- mxstdlib.h
|       |-- mxte.c
|       |-- mxte.h
|       |-- mxte.o
|       |-- sedscript
|       `-- test.py
|-- __init__.py
`-- __init__.pyc

5 directories, 54 files
From adalke at mindspring.com  Thu Oct 25 04:18:57 2001
From: adalke at mindspring.com (Andrew Dalke)
Date: Sat Mar  5 14:43:06 2005
Subject: [Biopython-dev] mxTextools install and biopython 2.1
Message-ID: <0abc01c15d2d$b234c9c0$0301a8c0@josiah.dalkescientific.com>

Peter Wilkinson:
>Does anyone know why the mxTexttools is strangely configured? If activestate
>Python 2.1 comes with Martel, and we install Biopython in the root of the
>install as Bio:

ActiveState Python comes with Martel?  That's news to me!
I suspect that's a typo, and you meant "comes with mxTextTools."
Here's the list of extensions shipped with their Python 2.1
  http://aspn.activestate.com/ASPN/Downloads/ActivePython/Extensions/

I don't know anything about that distribution so I can't give
you any pointers about it.

>How is mxTexttools supposed to be linked up properly, how and where is it
>installed?  I had a problem with my install and I had to redo it. I can not
>figure it out.
>
>anyone?

What went wrong?  If it comes with ActiveState Python then it should
just work.  If you're installing mxTextTools from scratch, it should
be a matter of following the instructions.  It's distutils enabled,
right?  (Just checked.  Yes.)  So "python setup.py install" should
do things just fine.

It gets installed in the standard installation location.  On Unix
machines it's something like
   /usr/local/lib/python2.1/site-packages
(where the "/usr/local" comes from the installation prefix and
is usually "/usr" for Linux boxes, and where the "2.1" comes
from the Python version number.)

(There are a few other places it could be installed which would
still work.  They are almost never used.)
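
A quick way to check this on a current Python is sketched below. It is only an assumption-laden illustration: describe_install is a made-up helper name, and the importlib and sysconfig modules it relies on did not exist in Python 2.1.

```python
import importlib.util
import sysconfig


def describe_install(package="mx"):
    """Return (site_packages_dir, package_location_or_None)."""
    # Where pure-Python packages are installed for this interpreter.
    site_packages = sysconfig.get_paths()["purelib"]

    # find_spec on a top-level name searches sys.path without
    # importing the package; it returns None if nothing is found.
    spec = importlib.util.find_spec(package)
    location = None
    if spec is not None:
        # Packages list their directories here; plain modules
        # only have an origin file.
        locations = spec.submodule_search_locations
        location = list(locations)[0] if locations else spec.origin
    return site_packages, location
```

If the second value is None, the interpreter being run cannot see the package at all, which usually means it was installed for a different Python or under a different prefix.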

Just copy&paste your work session.  That should be enough for
me or someone else on the list to figure out what's munged up.

                    Andrew
                    dalke@dalkescientific.com


From biopython-bugs at bioperl.org  Thu Oct 25 09:36:28 2001
From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org)
Date: Sat Mar  5 14:43:06 2005
Subject: [Biopython-dev] Notification: incoming/51
Message-ID: <200110251336.f9PDaRB30900@pw600a.bioperl.org>

JitterBug notification

new message incoming/51

Message summary for PR#51
	From: crm17@cornell.edu
	Subject: biopython-1.00a3/Bio/__init__.py error
	Date: Thu, 25 Oct 2001 09:36:27 -0400
	0 replies 	0 followups

====> ORIGINAL MESSAGE FOLLOWS <====

>From crm17@cornell.edu Thu Oct 25 09:36:27 2001
Received: from localhost (localhost [127.0.0.1])
	by pw600a.bioperl.org (8.11.2/8.11.2) with ESMTP id f9PDaRB30894
	for <biopython-bugs@pw600a.bioperl.org>; Thu, 25 Oct 2001 09:36:27 -0400
Date: Thu, 25 Oct 2001 09:36:27 -0400
Message-Id: <200110251336.f9PDaRB30894@pw600a.bioperl.org>
From: crm17@cornell.edu
To: biopython-bugs@bioperl.org
Subject: biopython-1.00a3/Bio/__init__.py error

Full_Name: Chris Myers
Module: biopython-1.00a3/Bio/__init__.py
Version: biopython-1.00a3
OS: Linux
Submission from: sowhat.tc.cornell.edu (128.84.162.75)


the __all__ definition in 
biopython-1.00a3/Bio/__init__.py is missing a
comma between "Alphabet" and "Blast"

__all__ = [
    "Align",
    "Alphabet"
    "Blast",
# ...
]

The result is that 

from Bio import *

chokes, claiming:

  AttributeError: 'Bio' module has no 
  attribute 'AlphabetBlast'

Adding the missing comma fixes this.
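
The underlying behaviour is that Python joins adjacent string literals into one string. A small sketch, using only the three names quoted in the report (the rest of the real list is not reproduced here):

```python
# Adjacent string literals are concatenated by the parser, so a
# missing comma silently merges two list entries into one.
broken = [
    "Align",
    "Alphabet"   # no comma here, so this literal joins the next one
    "Blast",
]

fixed = [
    "Align",
    "Alphabet",
    "Blast",
]
```

The broken list has two entries, the second being "AlphabetBlast", which matches the attribute name in the error message.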


From biopython-bugs at bioperl.org  Thu Oct 25 09:37:29 2001
From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org)
Date: Sat Mar  5 14:43:06 2005
Subject: [Biopython-dev] Notification: incoming/52
Message-ID: <200110251337.f9PDbTB30989@pw600a.bioperl.org>

JitterBug notification

new message incoming/52

Message summary for PR#52
	From: crm17@cornell.edu
	Subject: biopython-1.00a3/Bio/__init__.py error
	Date: Thu, 25 Oct 2001 09:37:29 -0400
	0 replies 	0 followups

====> ORIGINAL MESSAGE FOLLOWS <====

>From crm17@cornell.edu Thu Oct 25 09:37:29 2001
Received: from localhost (localhost [127.0.0.1])
	by pw600a.bioperl.org (8.11.2/8.11.2) with ESMTP id f9PDbTB30983
	for <biopython-bugs@pw600a.bioperl.org>; Thu, 25 Oct 2001 09:37:29 -0400
Date: Thu, 25 Oct 2001 09:37:29 -0400
Message-Id: <200110251337.f9PDbTB30983@pw600a.bioperl.org>
From: crm17@cornell.edu
To: biopython-bugs@bioperl.org
Subject: biopython-1.00a3/Bio/__init__.py error

Full_Name: Chris Myers
Module: biopython-1.00a3/Bio/__init__.py
Version: biopython-1.00a3
OS: Linux
Submission from: sowhat.tc.cornell.edu (128.84.162.75)


the __all__ definition in 
biopython-1.00a3/Bio/__init__.py is missing a
comma between "Alphabet" and "Blast"

__all__ = [
    "Align",
    "Alphabet"
    "Blast",
# ...
]

The result is that 

from Bio import *

chokes, claiming:

  AttributeError: 'Bio' module has no 
  attribute 'AlphabetBlast'

Adding the missing comma fixes this.


From biopython-bugs at bioperl.org  Thu Oct 25 15:22:41 2001
From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org)
Date: Sat Mar  5 14:43:06 2005
Subject: [Biopython-dev] Notification: incoming/51
Message-ID: <200110251922.f9PJMeB01895@pw600a.bioperl.org>

JitterBug notification

chapmanb changed notes

Message summary for PR#51
	From: crm17@cornell.edu
	Subject: biopython-1.00a3/Bio/__init__.py error
	Date: Thu, 25 Oct 2001 09:36:27 -0400
	0 replies 	0 followups
	Notes: Thanks for pointing this out. Fixed in CVS (my easiest fix ever :-)



From biopython-bugs at bioperl.org  Thu Oct 25 15:22:41 2001
From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org)
Date: Sat Mar  5 14:43:06 2005
Subject: [Biopython-dev] Notification: incoming/51
Message-ID: <200110251922.f9PJMfB01899@pw600a.bioperl.org>

JitterBug notification

chapmanb moved PR#51 from incoming to fixed-bugs
Message summary for PR#51
	From: crm17@cornell.edu
	Subject: biopython-1.00a3/Bio/__init__.py error
	Date: Thu, 25 Oct 2001 09:36:27 -0400
	0 replies 	0 followups
	Notes: Thanks for pointing this out. Fixed in CVS (my easiest fix ever :-)



From biopython-bugs at bioperl.org  Thu Oct 25 15:23:39 2001
From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org)
Date: Sat Mar  5 14:43:06 2005
Subject: [Biopython-dev] Notification: incoming/52
Message-ID: <200110251923.f9PJNdB01997@pw600a.bioperl.org>

JitterBug notification

chapmanb changed notes

Message summary for PR#52
	From: crm17@cornell.edu
	Subject: biopython-1.00a3/Bio/__init__.py error
	Date: Thu, 25 Oct 2001 09:37:29 -0400
	0 replies 	0 followups
	Notes: Duplicate of bug 51


====> ORIGINAL MESSAGE FOLLOWS <====

>From crm17@cornell.edu Thu Oct 25 09:37:29 2001
Received: from localhost (localhost [127.0.0.1])
	by pw600a.bioperl.org (8.11.2/8.11.2) with ESMTP id f9PDbTB30983
	for <biopython-bugs@pw600a.bioperl.org>; Thu, 25 Oct 2001 09:37:29 -0400
Date: Thu, 25 Oct 2001 09:37:29 -0400
Message-Id: <200110251337.f9PDbTB30983@pw600a.bioperl.org>
From: crm17@cornell.edu
To: biopython-bugs@bioperl.org
Subject: biopython-1.00a3/Bio/__init__.py error

Full_Name: Chris Myers
Module: biopython-1.00a3/Bio/__init__.py
Version: biopython-1.00a3
OS: Linux
Submission from: sowhat.tc.cornell.edu (128.84.162.75)


the __all__ definition in 
biopython-1.00a3/Bio/__init__.py is missing a
comma between "Alphabet" and "Blast"

__all__ = [
    "Align",
    "Alphabet"
    "Blast",
# ...
]

The result is that 

from Bio import *

chokes, claiming:

  AttributeError: 'Bio' module has no 
  attribute 'AlphabetBlast'

Adding the missing comma fixes this.


From biopython-bugs at bioperl.org  Thu Oct 25 15:23:40 2001
From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org)
Date: Sat Mar  5 14:43:06 2005
Subject: [Biopython-dev] Notification: incoming/52
Message-ID: <200110251923.f9PJNeB02001@pw600a.bioperl.org>

JitterBug notification

chapmanb moved PR#52 from incoming to fixed-bugs
Message summary for PR#52
	From: crm17@cornell.edu
	Subject: biopython-1.00a3/Bio/__init__.py error
	Date: Thu, 25 Oct 2001 09:37:29 -0400
	0 replies 	0 followups
	Notes: Duplicate of bug 51



From katel at worldpath.net  Thu Oct 25 20:02:46 2001
From: katel at worldpath.net (Cayte)
Date: Sat Mar  5 14:43:06 2005
Subject: [Biopython-dev] Last chance to clam PIR parser
Message-ID: <000801c15db1$8c21bb60$3170bbd1@g0fjl>

  I've finished checking the MASE (IntelliGenetics) parser and I would like to start on the PIR parser.  Let me know if you are currently working on it.

                         Cayte
From katel at worldpath.net  Thu Oct 25 20:21:17 2001
From: katel at worldpath.net (Cayte)
Date: Sat Mar  5 14:43:06 2005
Subject: [Biopython-dev] Phylogenetic rats nests( oops I meant trees )
Message-ID: <001401c15db4$21ffe740$3170bbd1@g0fjl>

Many of the bioformats represent phylogenetic data as well as sequence or path data.  I can think of a number of problems with phylogenetic data.

1.  A number of types of relationships are possible.  A sequence may be a descendant or an ancestor of another sequence.  They both may have a common ancestor.  They may have converged to the same pattern.  They may have hopped across species.  Whatever the arguments against transgenic species, the assertion that it never happens in nature ain't so!  The latest issue of Natural History describes how the vertebrate immune system may have once been a parasite.

2.  Researchers often don't agree among themselves what these relationships are.  Should the links contain epistemology links that describe the level of confidence and the methodology plus journal references?

3.  Links will change as new research unfolds.  This will be a maintenance issue (luckily not ours).  But we need the ability to easily change links and remove dead links.  Should a mechanism for the storage of historical information be provided?

4.  What if an intermediate is found between an ancestor-descendant pair?  Should we delete the old link?  Then the annotation will be lost.  Should the old link contain pointers to the new links?

5.  Should we limit our scope to just sequences?
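
To make questions 2 through 4 concrete, here is a rough sketch of one way a link could carry its own annotation and history. Everything in it (the `Link` class, its field names, the `insert_intermediate` helper) is hypothetical and only meant as a starting point for discussion, not a proposed Biopython API. The idea is that an old link is never deleted; it just points at whatever replaced it, so its annotation survives.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Link:
    """One asserted relationship between two sequences (hypothetical design)."""
    source: str
    target: str
    relation: str                       # e.g. "ancestor", "convergent", "lateral-transfer"
    method: str = ""                    # how the relationship was inferred
    confidence: Optional[float] = None  # whatever measure the source reports, if any
    references: List[str] = field(default_factory=list)
    superseded_by: List["Link"] = field(default_factory=list)

    @property
    def is_current(self):
        # A link stays around as history once something has replaced it.
        return not self.superseded_by


def insert_intermediate(link, intermediate):
    """Split a link at a newly found intermediate, keeping the old link as history."""
    first = Link(link.source, intermediate, link.relation)
    second = Link(intermediate, link.target, link.relation)
    link.superseded_by.extend([first, second])
    return first, second


old = Link("seqA", "seqC", "ancestor", method="parsimony", confidence=0.8)
first, second = insert_intermediate(old, "seqB")
print(old.is_current, first.target, second.source)   # False seqB seqB
```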

Please share your thoughts on this.

                                                                 Cayte
From adalke at mindspring.com  Fri Oct 26 00:27:32 2001
From: adalke at mindspring.com (Andrew Dalke)
Date: Sat Mar  5 14:43:06 2005
Subject: [Biopython-dev] Last chance to clam PIR parser
Message-ID: <0d0201c15dd6$885a7a60$0301a8c0@josiah.dalkescientific.com>

Hey Cayte,

Oops!  I forgot to follow up when you mentioned this the first time.
There's a set of parsers in Martel (pre-CVS version, not yet
integrated into the current code base).  One of them is for PIR.

The last Martel release I see was version 0.5
  http://www.biopython.org/~dalke/Martel-0.5.tar.gz
and the PIR format definition for Martel is at
  http://www.biopython.org/~dalke/Martel-0.5/formats/PIR_3_0.py

I don't recall the status of the parser.  Searching through
the back emails I see a thread of mine titled "PIR parsing" so
you might want to start from
 http://biopython.org/pipermail/biopython-dev/2000-December/000186.html

BTW, in other news I'm getting closer to signing a development
agreement to work more on Biopython, which means spending time
finishing up Martel, working on format definitions, etc.

                    Andrew


From thomas at cbs.dtu.dk  Fri Oct 26 03:10:35 2001
From: thomas at cbs.dtu.dk (Thomas Sicheritz-Ponten)
Date: Sat Mar  5 14:43:06 2005
Subject: [Biopython-dev] Phylogenetic rats nests( oops I meant trees )
In-Reply-To: "Cayte"'s message of "Thu, 25 Oct 2001 20:21:17 -0400"
References: <001401c15db4$21ffe740$3170bbd1@g0fjl>
Message-ID: <y9vzo6eev10.fsf@genome.cbs.dtu.dk>

"Cayte" <katel@worldpath.net> writes:

> Many of the bioformats represent phylogenetic data as well as sequence or path
> data.  I can think of a number of problems with
> phylogenetic data.


I do not really understand your questions. Are you concerned about how to
store and convert sequence formats containing alignments, are you planning a
huge phylogeny database project, or are you trying to address philosophical
aspects of molecular evolution :-) ?

The sequence is the base object. An alignment represents one solution,
among many, for linearizing sequences. A phylogenetic tree is a
way of clustering sequences considering evolutionary changes. The
reconstruction of a phylogenetic tree is most of the time based on a
sequence alignment and depends on how you interpret the data and which
method you use (Distance [UPGMA and Neighbor Joining], Maximum Parsimony
and Maximum Likelihood).
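
As an illustration of the distance-based route, here is a minimal UPGMA sketch in Python 3. It assumes the sequences are already aligned and of equal length, uses the plain fraction of mismatches as the distance, and returns only the topology as nested tuples (no branch lengths). The function names and the toy sequences are invented for the example.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of aligned positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def upgma(seqs):
    """Cluster aligned sequences with UPGMA and return the topology as nested tuples.

    seqs maps a name to an aligned sequence (all the same length).
    Branch lengths are not tracked in this sketch.
    """
    sizes = {name: 1 for name in seqs}          # cluster -> number of leaves
    dist = {frozenset(pair): p_distance(seqs[pair[0]], seqs[pair[1]])
            for pair in combinations(seqs, 2)}
    while len(sizes) > 1:
        # Join the two closest clusters; sort by repr for a stable left/right order.
        a, b = sorted(min(dist, key=dist.get), key=repr)
        merged = (a, b)
        total = sizes[a] + sizes[b]
        for c in sizes:
            if c == a or c == b:
                continue
            # Size-weighted mean of the old distances (the UPGMA average).
            dist[frozenset((merged, c))] = (
                dist[frozenset((a, c))] * sizes[a]
                + dist[frozenset((b, c))] * sizes[b]) / total
        # Drop every distance that still refers to one of the joined clusters.
        dist = {k: v for k, v in dist.items() if a not in k and b not in k}
        del sizes[a], sizes[b]
        sizes[merged] = total
    return next(iter(sizes))


toy = {"A": "AAAAAAAA", "B": "AAAAAAAT", "C": "TTTTAAAA"}
print(upgma(toy))   # ('C', ('A', 'B'))
```

A different distance measure or a different method (Neighbor Joining, parsimony, likelihood) can give a different tree from the same alignment, which is the point made above about interpretation.
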
> 
> 1.  A number of types of relationships are possible.  A sequence may be a
> descendant or an ancestor of another sequence.
> 
> They both may have a common ancestor.  They may have converged to the same
> pattern.  They may have hopped across species.  Whatever the arguments against
> transgenic species, the assertion that it never happens in nature ain't so!  The
> latest issue of Natural History describes how the vertebrate immune system may have
> once been a parasite.

Lateral (horizontal) gene transfer [LGT] is very common; the biggest known events
are the origin of eukaryotic mitochondria from alpha-proteobacteria and
the origin of plant chloroplasts from cyanobacteria. There is a still
ongoing flow of genetic material between Thermotoga (eubacteria) and
Pyrococcus (archaea), which makes it hard to tell the original "owner" of a
gene. 
But: LGT mainly affects our _interpretation_ of sequence data.

> 2.  Researchers often don't agree among themselves what these relationships are.
> Should the links contain epistemology links that describe the level of confidence
> and the methodology plus journal references?

Who is going to decide the level of confidence ? 
a referee ? 
the bootstrap values ?
There is no way to prove a phylogenetic relationship with sequences only.

> 3.  Links will change as new research unfolds.  This will be a maintenance issue
> (luckily not ours).  But we need the ability to easily change links and remove dead
> links.  Should a mechanism for the storage of historical information be provided?

New phylogenetic trees with new information and biological interpretation
will emerge and be published ... which will result in new sequence entries.
> 
> 4. What if an intermediate is found between an ancestor-descendant pair?
> Should we delete the old link?  Then the annotation will be lost.  Should
> the old link contain pointers to the new links?

What exactly are "links" ? Is this a synonym for nodes in the tree or
hyperlinks (XRef's) in e.g. EMBL annotations ?

> 5.  Should we limit our scope to just sequences?

What is the original problem description ? If you are planning normal
sequence/alignment/tree format storage, then you should not include
additional interpretations and views which are not found in the original
experiment (experiment =alignment, tree reconstruction, evolutionary interpretation)

On the other hand, if you are thinking about an internal phylogeny database
which gets dynamically updated during e.g. the coordination of sequencing
projects, then trees should be reconstructed after each change and
contradicting node annotations should get logged inside the database.

Could you please mail me your intended scope ?

maybe-I-should-start-the-day-with-coffee-instead-of-biopython-postings'ly yr's 
thomas

-- 
Sicheritz-Ponten Thomas, Ph.D  CBS, Department of Biotechnology
thomas@biopython.org           The Technical University of Denmark
CBS:  +45 45 252489            Building 208, DK-2800 Lyngby
Fax   +45 45 931585            http://www.cbs.dtu.dk/thomas

	De Chelonian Mobile ... The Turtle Moves ...

From klindner at tality.com  Fri Oct 26 13:26:42 2001
From: klindner at tality.com (Kathy Lindner)
Date: Sat Mar  5 14:43:06 2005
Subject: [Biopython-dev] Re: Phylogenetic scope
Message-ID: <000701c15e43$61492d50$bda58c9e@pc-klindner1.cadence.com>

  The scope is simply to represent phylogenetic data if it comes with a sequence.  Some formats (NEXUS, for example) support phylogenetic data.  No databases, please. :)
                                                                           Cayte
From katel at worldpath.net  Sat Oct 27 00:19:15 2001
From: katel at worldpath.net (Cayte)
Date: Sat Mar  5 14:43:06 2005
Subject: [Biopython-dev] Last chance to clam PIR parser
References: <0d0201c15dd6$885a7a60$0301a8c0@josiah.dalkescientific.com>
Message-ID: <003e01c15e9e$8b931340$ea70bbd1@g0fjl>


----- Original Message ----- 
From: "Andrew Dalke" <adalke@mindspring.com>
To: <biopython-dev@biopython.org>
Sent: Friday, October 26, 2001 12:27 AM
Subject: Re: [Biopython-dev] Last chance to clam PIR parser


> The last Martel release I see was version 0.5
>   http://www.biopython.org/~dalke/Martel-0.5.tar.gz
> and the PIR format definition for Martel is at
>   http://www.biopython.org/~dalke/Martel-0.5/formats/PIR_3_0.py
> 
   What formats can I work on without a collision? GCG/MSF?

  The PIR format described in 
www.sander.embl-ebi.ac.uk/Services/webin/help/webin-align/align_format_help.html#pir

seems different and more FASTA-like than the way you describe PIR.  Are there different renditions of PIR?


                       Cayte
From adalke at mindspring.com  Sat Oct 27 00:34:04 2001
From: adalke at mindspring.com (Andrew Dalke)
Date: Sat Mar  5 14:43:06 2005
Subject: [Biopython-dev] Last chance to clam PIR parser
Message-ID: <10a001c15ea0$9ca4aa20$0301a8c0@josiah.dalkescientific.com>

Cayte:
>  The PIR format described in
> www.sander.embl-ebi.ac.uk/Services/webin/help/webin-align/align_format_help.html#pir
>
> seems different and more fastalike than the way you describe PIR.  Are
> there different renditions of pir?

Oh, right.  Yes, there are.  The PIR format I have is for the CODATA
format, which includes a lot more data than the NBRF format you're
probably thinking of.

PIR releases data in three formats:
  CODATA -- human readable (meaning it has 2D formatting to make
      it easier to find the different sections), hard to parse --
      even with Martel
  NBRF -- machine readable, hard for humans to read
  XML (new) -- somewhere in the middle
These are linked from
  http://pir.georgetown.edu/pirwww/dbinfo/pirpsd.html
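
For orientation, a minimal sketch of reading the NBRF flavour in plain Python. It assumes the layout as I understand it: a `>` line carrying a two-character type code and an identifier separated by `;`, then a one-line description, then sequence lines with the sequence terminated by `*`. The `parse_nbrf` name and the sample record are invented for illustration; this is not the Martel definition and it does no validation.

```python
def parse_nbrf(lines):
    """Yield (type_code, identifier, description, sequence) for each NBRF record.

    Records that never reach a terminating '*' are silently dropped.
    """
    header = None        # (type_code, identifier) of the record being read
    description = None   # None until the line after the header has been seen
    chunks = []
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith(">"):
            # Header line, e.g. ">P1;SOME_ID"
            code, _, ident = line[1:].partition(";")
            header = (code, ident)
            description = None
            chunks = []
        elif header is not None and description is None:
            # The single line after the header is free-text description.
            description = line.strip()
        elif header is not None:
            text = line.strip()
            if text.endswith("*"):
                chunks.append(text[:-1])
                sequence = "".join(chunks).replace(" ", "")
                yield header[0], header[1], description, sequence
                header = None
            else:
                chunks.append(text)


# Made-up record, only to show the shape of the input.
sample = """>P1;EXAMPLE1
hypothetical example protein
  MDITIHNPLI RRPLFSWLAP
  SRIFDQIFGE*
"""
for record in parse_nbrf(sample.splitlines()):
    print(record)
```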

It seems most people want to read the NBRF format, not the CODATA
one.  But I did the CODATA one because I was thinking about the
"convert to HTML" ability of Martel.  Also, there were more fields
in the CODATA format than the other two -- at least, there were
fields there which were undocumented.  I sent some email to the
PIR people on them but never got a response about what they meant.

                    Andrew
                    dalke@dalkescientific.com


From katel at worldpath.net  Sat Oct 27 01:04:29 2001
From: katel at worldpath.net (Cayte)
Date: Sat Mar  5 14:43:06 2005
Subject: [Biopython-dev] Last chance to clam PIR parser
References: <10a001c15ea0$9ca4aa20$0301a8c0@josiah.dalkescientific.com>
Message-ID: <005001c15ea4$dd44b6c0$ea70bbd1@g0fjl>

----- Original Message -----
From: "Andrew Dalke" <adalke@mindspring.com>
To: <biopython-dev@biopython.org>
Sent: Saturday, October 27, 2001 12:34 AM
Subject: Re: [Biopython-dev] Last chance to clam PIR parser


> Cayte:
> >  The PIR format described in
> > www.sander.embl-ebi.ac.uk/Services/webin/help/webin-align/align_format_help.html#pir
> >
> > seems different and more fastalike than the way you describe PIR.  Are
> > there different renditions of pir?
>
> Oh, right.  Yes, there are.  The PIR format I have is for the CODATA
> format, which includes a lot more data than the NBRF format you're
> probably thinking of.
>
  So should I write a parser for NBRF?  With the recession, I'm working 2 days a
week.  Might as well take advantage of the free time; it won't last.

                                    Cayte


From adalke at mindspring.com  Sat Oct 27 00:49:41 2001
From: adalke at mindspring.com (Andrew Dalke)
Date: Sat Mar  5 14:43:06 2005
Subject: [Biopython-dev] Last chance to clam PIR parser
Message-ID: <10b801c15ea2$cabbc220$0301a8c0@josiah.dalkescientific.com>

Cayte:
>  So should I write a parser for NBRF?

Yes.

I think more people are interested in importing data into
their DBMS than in marking it up into HTML.

                    Andrew


From j.joung at AptusGenomics.com  Tue Oct 30 14:17:29 2001
From: j.joung at AptusGenomics.com (Jeong Joung)
Date: Sat Mar  5 14:43:06 2005
Subject: [Biopython-dev] RE: Parsing Protein GenBank Records
In-Reply-To: <20010918212532.A3580@ci350185-a.athen1.ga.home.com>
Message-ID: <OIEFKBIBGGKFMCHEPFMJAEPNCAAA.j.joung@AptusGenomics.com>

Hello,

Thanks for your help. The updated parser now works well for most REFSEQ
proteins. I came across several REFSEQ protein records where the parser
still fails on a UNIX machine. The following is the error message:

Traceback (most recent call last):
    entry = parser.parse(gb_handle)
  File "/usr/.../Bio/GenBank/__init__.py", line 281, in parse
    self._scanner.feed(handle, self_consumer)
  File "/usr/.../Bio/GenBank/__init__.py", line 1143, in feed
    self._parser.parseFile(handle)
  File "/usr/.../Martel/Parser.py", line 226, in parseFile
    self.parseString(fileobj.read())
  File "/usr/.../Martel/Parser.py", line 254, in parseString
    self._err_handler.fatalError(result)
  File "/usr/.../python2.1/xml/sax/handler.py", line 38, in fatalError
    raise exception
ParserPositionException: error parsing at or beyond character 2889
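
One way to narrow this down is to look at the record text around the reported character offset. The helper below is a small sketch in plain Python that does not depend on Biopython or Martel. The `show_context` name and the toy record are made up, and since the message says "at or beyond", the marked line is only an approximate location.

```python
def show_context(text, offset, width=2):
    """Return the lines around a character offset, marking the line that holds it."""
    # Zero-based index of the line containing the offset.
    line_no = text.count("\n", 0, offset)
    lines = text.split("\n")
    first = max(0, line_no - width)
    last = min(len(lines), line_no + width + 1)
    out = []
    for i in range(first, last):
        marker = "->" if i == line_no else "  "
        out.append("%s %4d: %s" % (marker, i + 1, lines[i]))
    return "\n".join(out)


# Toy text standing in for a record that failed to parse.
record_text = "LOCUS a\nDEFINITION b\nBADLINE c\nORIGIN\n//\n"
print(show_context(record_text, record_text.index("BADLINE")))
```

With a real record, read the whole entry into a string and pass the offset from the error message (2889 in the traceback above).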

Any help will be greatly appreciated.

Thank You,
Jeong

-----Original Message-----
From: Brad Chapman [mailto:chapmanb@arches.uga.edu]
Sent: Tuesday, September 18, 2001 9:26 PM
To: Jeong Joung
Cc: biopython-dev@biopython.org
Subject: Re: Parsing Protein GenBank Records


Hi Joung;
(ccing this to biopython-dev since this is relevant to everyone)

> I'm having trouble parsing GenBank records obtained from the protein
> database. The parser works fine for nucleotide GenBank records, but not for
> protein records. I would appreciate it very much if you can guide me in
> the right direction for parsing such records.
>
> Here is the code and the error that I get back.
>
> >>> parser = GenBank.RecordParser()
> >>> ncbi = GenBank.NCBIDictionary(database='Protein')
> >>> rec = ncbi['6754304']

The parser does work for proteins in general, but does fail badly on
this particular REFSEQ sequence. In the past, REFSEQ stuff has been
only "sort of" GenBank format, and this record is no exception. It
has a lot of formatting problems (has no identifier for the sequence
type in the LOCUS line, has extra DBSOURCE tag, has non-standard
feature table types and keys (Protein, Region, region_name)).
Anyways, it is a big non-standard formatting mess.

I've fixed the GenBank parser to be able to handle this, and checked
the changes into CVS. Diffs to the relevant files (Record.py,
__init__.py and genbank_format.py in Bio.GenBank) are also attached
to this file in case you don't have CVS access.

Thanks for the bug report. Hope this works for you!

Brad
--
PGP public key available from http://pgp.mit.edu/


From j.joung at AptusGenomics.com  Tue Oct 30 15:48:19 2001
From: j.joung at AptusGenomics.com (Jeong Joung)
Date: Sat Mar  5 14:43:06 2005
Subject: [Biopython-dev] Parsing Protein GenBank Records
Message-ID: <OIEFKBIBGGKFMCHEPFMJIEPNCAAA.j.joung@AptusGenomics.com>

Hi,

I just found out that this problem occurs on some REFSEQ nucleotide records
as well.

Thank You,
Jeong
