[Biopython-dev] Error in tutorial program

abc at palantir.chem.emory.edu abc at palantir.chem.emory.edu
Fri Oct 18 08:01:39 EDT 2002


Hi,

I've been working through the Tutorial document and noticed that the
Cypripedioideae FASTA parsing example fails on some input.  The
Doc/examples/fasta_consumer.py script works if you use the
ls_orchid.fasta file from the same directory.  However, if you run the
script on the output of your own Entrez search, you get an error like this:


Traceback (most recent call last):
  File "./x.py", line 32, in ?
    extract_organisms('/home/abc/x.fasta', 95)
  File "./x.py", line 27, in extract_organisms
    scanner.feed(file_to_parse, consumer)
  File "/usr/local/lib/python2.2/site-packages/Bio/Fasta/__init__.py", line 225, in feed
    self._scan_record(uhandle, consumer)
  File "/usr/local/lib/python2.2/site-packages/Bio/Fasta/__init__.py", line 229, in _scan_record
    self._scan_title(uhandle, consumer)
  File "/usr/local/lib/python2.2/site-packages/Bio/Fasta/__init__.py", line 234, in _scan_title
    read_and_call(uhandle, consumer.title, start='>')
  File "/usr/local/lib/python2.2/site-packages/Bio/ParserSupport.py", line 331, in read_and_call
    raise SyntaxError, errmsg
SyntaxError: Line does not start with '>':
TCAGCGGTGGCTCACTGACTGGGTTGCATCCAAGTGGCCGTCACCGCCCATGGGGTTGACGTGCCTCCAA

The problem occurs because the individual records in the search output that I
get aren't separated by blank lines like the ones in ls_orchid.fasta.  The
Bio.Fasta._Scanner class can't handle this format unless the file handle you
pass to it implements the `saveline' method (although the docstring doesn't
say so).  fasta_consumer.py just passes a regular Python file object.
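
To illustrate the failure mode described above, here is a minimal,
self-contained sketch.  It does not use Biopython: PushbackHandle and
read_record are hypothetical stand-ins for File.UndoHandle and _Scanner.feed,
and the rule that a record ends at a blank line or at the next '>' line is an
assumption about how the scanner behaves, not a copy of its code.

```python
import io


class PushbackHandle:
    """Hypothetical stand-in for Bio.File.UndoHandle (not the real class)."""

    def __init__(self, handle):
        self._handle = handle
        self._saved = []

    def readline(self):
        # Return a pushed-back line first, if there is one.
        if self._saved:
            return self._saved.pop(0)
        return self._handle.readline()

    def saveline(self, line):
        # Push a line back so the next readline() returns it again.
        self._saved.insert(0, line)


def read_record(handle):
    """Read one FASTA record, wrapping plain handles on every call."""
    if isinstance(handle, PushbackHandle):
        uhandle = handle
    else:
        # A fresh wrapper per call: anything pushed back is lost on return.
        uhandle = PushbackHandle(handle)
    title = uhandle.readline()
    if not title.startswith(">"):
        raise ValueError("Line does not start with '>': " + title)
    sequence = []
    while True:
        line = uhandle.readline()
        if not line.strip():
            break  # blank line or end of file ends the record
        if line.startswith(">"):
            uhandle.saveline(line)  # this line belongs to the next record
            break
        sequence.append(line.strip())
    return title[1:].strip(), "".join(sequence)


DATA = ">rec1\nACGT\n>rec2\nTTGA\nCCAA\n"  # no blank line between records

# Plain handle: the second title line is consumed and then thrown away.
plain = io.StringIO(DATA)
first = read_record(plain)
try:
    read_record(plain)
    plain_failed = False
except ValueError:
    plain_failed = True

# Shared pushback handle: the saved title line survives between calls.
shared = PushbackHandle(io.StringIO(DATA))
records = [read_record(shared), read_record(shared)]
```

With blank lines between records the scanner never has to read ahead, which
would explain why ls_orchid.fasta parses with a plain file object.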

Below are a couple of simple fixes.  The _Scanner patch potentially breaks
existing code, so maybe it's not so good.

Regards,

Ben




--- Doc/examples/fasta_consumer.py	2001/02/01 05:45:28	1.3
+++ Doc/examples/fasta_consumer.py	2002/10/17 15:31:36
@@ -1,6 +1,8 @@
 from Bio.ParserSupport import AbstractConsumer
 from Bio import Fasta
+from Bio.File import UndoHandle
 
+
 import string
 
 class SpeciesExtractor(AbstractConsumer):
@@ -19,9 +21,9 @@
 def extract_organisms(file, num_records):
     scanner = Fasta._Scanner()
     consumer = SpeciesExtractor()
-
-    file_to_parse = open(file, 'r')
 
+    file_to_parse = UndoHandle(open(file, 'r'))
+    
     for fasta_record in range(num_records):
         scanner.feed(file_to_parse, consumer)





--- Bio/Fasta/__init__.py	2001/07/05 23:56:49	1.4
+++ Bio/Fasta/__init__.py	2002/10/17 15:30:47
@@ -211,15 +211,13 @@
     def feed(self, handle, consumer):
         """feed(self, handle, consumer)
 
-        Feed in FASTA data for scanning.  handle is a file-like object
-        containing FASTA data.  consumer is a Consumer object that will
-        receive events as the FASTA data is scanned.
+        Feed in FASTA data for scanning.  handle is an instance of
+        File.UndoHandle containing FASTA data.  consumer is a Consumer
+        object that will receive events as the FASTA data is scanned.
 
         """
-        if isinstance(handle, File.UndoHandle):
-            uhandle = handle
-        else:
-            uhandle = File.UndoHandle(handle)
+        assert isinstance(handle, File.UndoHandle)
+        uhandle = handle
         
         if uhandle.peekline():
             self._scan_record(uhandle, consumer)



