[Biopython-dev] Martel changes
Jeffrey Chang
jchang at smi.stanford.edu
Fri Dec 14 02:01:59 EST 2001
On Wed, Dec 12, 2001 at 01:05:55PM -0700, Andrew Dalke wrote:
> Me:
> >> Is anyone using the iterator facility in Martel?
>
> Jeff:
> >Yes. I'm using it in Bio/Medline/NLMMedlineXML to parse the
> >XML-formatted PubMed records. Each XML file contains about ~30000
> >records and is too big to keep in memory at once.
Oops, I just looked over the code. I'm in fact not using the
iterator, but the RecordReader. Sorry about the confusion!
[adding Word, Integer, ... as built-in expressions]
> When do you use Unprintable? When do you use Punctuation?
I use them both for matching things in English text. Sometimes the
text contains unprintable characters from foreign character sets.
> My 'Float' isn't very powerful, as it only understands
> numbers of the form (with optional +/-)
> 1
> 1.
> 1.2
> .2
>
> It doesn't handle things like 1E-3, or IEEE values
> like NaN or +Inf. I could (and probably should) support
> the first of these. I'm not sure if I should the second.
It gets pretty complicated, e.g.
1.315E2.24
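
To make the cases above concrete, here is a rough sketch using Python's
standard re module rather than Martel itself. The names SIMPLE_FLOAT,
EXP_FLOAT, is_simple_float and is_exp_float are made up for illustration
and are not part of Martel. The exponent rule (optional sign, integer
digits only) is an assumption, and NaN and Inf are deliberately left out.

```python
import re

# The four forms from the quoted message, each with an optional sign:
#   1    1.    1.2    .2
SIMPLE_FLOAT = r"[+-]?(?:\d+(?:\.\d*)?|\.\d+)"

# Assumed extension: an optional exponent such as E-3. The exponent is
# restricted to an optionally signed integer, so 1.315E2.24 is rejected.
EXP_FLOAT = SIMPLE_FLOAT + r"(?:[eE][+-]?\d+)?"


def is_simple_float(text):
    """True if the whole string is one of the four simple forms."""
    return re.fullmatch(SIMPLE_FLOAT, text) is not None


def is_exp_float(text):
    """True if the whole string is a simple form plus an optional exponent."""
    return re.fullmatch(EXP_FLOAT, text) is not None


if __name__ == "__main__":
    for s in ["1", "1.", "1.2", ".2", "-1.2", "1E-3", "1.315E2.24", "NaN", "+Inf"]:
        print(s, is_simple_float(s), is_exp_float(s))
```

Under these assumptions 1E-3 is accepted only by the extended pattern,
while 1.315E2.24, NaN and +Inf are rejected by both.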
Jeff