[BioPython] QIO
Andrew Dalke
dalke@bioreason.com
Tue, 05 Oct 1999 23:13:06 -0600
Tim Peters <tim_one@email.msn.com> said:
| 3 > 2-3, even if you don't read the latter as subtraction <wink>.
Weeelllll, some necks are longer than others. I do believe a
camel's neck is longer than that of a python (and just *where* is
the neck on a python?)
>Without changing anything in Python, you can get a major speed
> boost by "chunking" the input, as in:
>
>BUFSIZE = 100000
>while 1:
>    lines = f.readlines(BUFSIZE)
>    if not lines:
>        break
>    for line in lines:
>        xxx
>
Ewan Birney (of bioperl) wants to do something along the lines of
a cookbook for bioinformatics. An equivalent for Python should
include this snippet.
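A sketch of how that snippet might be packaged for such a cookbook. The name chunked_lines and the in-memory sample file are illustrative assumptions, not part of Tim's post; the only call relied on is the size hint to readlines().

```python
import io

BUFSIZE = 100000  # size hint in bytes, the value used in Tim's snippet


def chunked_lines(f, bufsize=BUFSIZE):
    """Yield lines from f, fetching roughly bufsize bytes of whole lines per call."""
    while True:
        lines = f.readlines(bufsize)  # stops once about bufsize bytes are read
        if not lines:                 # empty list means end of file
            break
        for line in lines:
            yield line


# Illustrative use on an in-memory file (an assumption for the demo).
sample = io.StringIO("a\nb\nc\n")
result = list(chunked_lines(sample, bufsize=2))
```

Wrapping the loop in a generator keeps the boilerplate in one place, so the caller is back to a plain "for line in ..." loop.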
>I'm surprised people don't do this more often -- I suppose because it's 5
>lines of boilerplate instead of 4 <wink>.
First off, I haven't gotten into the habit of using that idiom.
There are quite a few places where I could/should have done things
that way.
However, there are certain types of parsers I've written which
work like:
    infile = open("spam.txt")
    header = parse_header(infile)
    content = []
    while 1:
        data = parse_content(infile)
        if not data:
            break
        content.append(data)
that is, the input file object is passed around to different
routines which each expect to consume a line at a time.
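To make that pattern concrete, here is a minimal sketch of two such routines. The file layout (key: value header lines ended by a blank line, records ended by a "//" line) and both function bodies are hypothetical, invented only for illustration; the original parsers are not shown in this message.

```python
import io


def parse_header(infile):
    """Hypothetical: read 'key: value' lines up to the first blank line."""
    header = {}
    while True:
        line = infile.readline()
        if not line.strip():          # blank line or end of file ends the header
            break
        key, value = line.split(":", 1)
        header[key.strip()] = value.strip()
    return header


def parse_content(infile):
    """Hypothetical: read one record, ended by a '//' line or end of file."""
    record = []
    while True:
        line = infile.readline()
        if not line or line.strip() == "//":
            break
        record.append(line.rstrip("\n"))
    return record


# An in-memory stand-in for spam.txt (an assumption for the demo).
infile = io.StringIO("name: spam\nversion: 1\n\nA\nB\n//\nC\n//\n")
header = parse_header(infile)
content = []
while True:
    data = parse_content(infile)
    if not data:
        break
    content.append(data)
```

Each routine calls readline() on the shared file object, which is why the readlines(BUFSIZE) idiom does not drop in directly: a chunk fetched by one routine would hide lines from the next.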
Actually, when I did this sort of work in Perl I ended up passing
long @lists of lines around, and popping off the list to consume.
Made it easy when I needed to push back for look-aheads.
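A rough Python analogue of that Perl approach, as a sketch only. The LineStack class, the read_record function, and the ">"-titled record format are all hypothetical names and assumptions chosen for the example.

```python
class LineStack:
    """Hypothetical helper: hand out lines one at a time, with push-back."""

    def __init__(self, lines):
        # Store reversed so the next line to consume sits at the end,
        # where list.pop() and list.append() are cheap.
        self._lines = list(reversed(lines))

    def pop(self):
        """Return the next line, or "" when nothing is left (like readline)."""
        if self._lines:
            return self._lines.pop()
        return ""

    def push(self, line):
        """Put a line back so the next pop() returns it again."""
        self._lines.append(line)


def read_record(stack):
    """Collect a title line plus everything up to the next '>' title line."""
    record = [stack.pop()]
    while True:
        line = stack.pop()
        if not line:
            break
        if line.startswith(">"):
            stack.push(line)  # the look-ahead went one line too far; undo it
            break
        record.append(line)
    return record


stack = LineStack([">seq1\n", "ACGT\n", ">seq2\n", "TTGA\n"])
first = read_record(stack)
second = read_record(stack)
```

The push-back is what a bare file object lacks: once readline() has consumed the next record's title line, there is no cheap way to hand it back.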
>not-opposed-to-fast-input-ly y'rs - tim
Of course, I also want to take a look at Andrew Kuchling's memory
mapped IO to see how well that works. And I'll have time to start
looking at this in only a few more weeks....
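For reference, a sketch of line-at-a-time reading over a memory-mapped file. It assumes the mmap module in today's Python standard library is a fair stand-in for the code mentioned above; the mmap_lines name and the temporary sample file are illustrative only, and no speed claim is implied.

```python
import mmap
import os
import tempfile


def mmap_lines(path):
    """Yield lines (as bytes) from a read-only memory map of the file at path."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; mapping an empty file raises ValueError.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            while True:
                line = m.readline()
                if not line:  # b"" signals the end of the mapping
                    break
                yield line


# Demo on a temporary file (an assumption for the example).
fd, path = tempfile.mkstemp()
try:
    with os.fdopen(fd, "wb") as out:
        out.write(b"first\nsecond\n")
    lines = list(mmap_lines(path))
finally:
    os.remove(path)
```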
Andrew
dalke@acm.org