[Biopython-dev] Martel based replacement for Fasta _Scanner
Brad Chapman
chapmanb at arches.uga.edu
Sun Oct 1 10:48:31 EDT 2000
Hey all;
As I mentioned yesterday, I went ahead and wrote a
Martel-based replacement for the current _Scanner framework that Jeff
wrote for Fasta parsing. I mainly wanted to do this so that I
could see how Martel-based parsing could fit in with the nice
Scanner/Consumer framework that Jeff set up.
The approach I took was to let Martel do the low-level parsing,
and then generate the appropriate scanner events from a SAX handler
that looks at the XML generated by Martel. So all I really did was
rewrite the _Scanner to use Martel.
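To make the idea concrete, here is a rough sketch of a SAX handler that turns XML elements into scanner-style events. It is not the attached code: the XML is a hand-written stand-in for Martel's output, the tag names (fasta, record, title, sequence) are assumptions, and RecordingConsumer is a hypothetical toy consumer whose method names are made up for illustration.

```python
import xml.sax

# Hand-written stand-in for the XML a Martel format might produce for a
# two-record Fasta file. The tag names are assumptions for illustration.
SAMPLE_XML = b"""<fasta>
<record><title>seq1 first test</title><sequence>ACGT</sequence><sequence>TTGA</sequence></record>
<record><title>seq2</title><sequence>GGCC</sequence></record>
</fasta>"""


class RecordingConsumer:
    """Hypothetical consumer that just records the events it receives."""

    def __init__(self):
        self.events = []

    def start_sequence(self):
        self.events.append(("start_sequence", None))

    def title(self, text):
        self.events.append(("title", text))

    def sequence(self, text):
        self.events.append(("sequence", text))

    def end_sequence(self):
        self.events.append(("end_sequence", None))


class EventHandler(xml.sax.ContentHandler):
    """Translate XML elements into calls on a consumer."""

    def __init__(self, consumer):
        super().__init__()
        self._consumer = consumer
        self._text = None  # character buffer, active only inside a leaf tag

    def startElement(self, name, attrs):
        if name == "record":
            self._consumer.start_sequence()
        elif name in ("title", "sequence"):
            self._text = []

    def characters(self, content):
        # SAX may deliver text in several chunks, so collect them.
        if self._text is not None:
            self._text.append(content)

    def endElement(self, name):
        if name in ("title", "sequence"):
            text = "".join(self._text)
            self._text = None
            # Dispatch to consumer.title(...) or consumer.sequence(...).
            getattr(self._consumer, name)(text)
        elif name == "record":
            self._consumer.end_sequence()


consumer = RecordingConsumer()
xml.sax.parseString(SAMPLE_XML, EventHandler(consumer))
```

After this runs, consumer.events holds the events for both records in order, so the consumer never needs to know that the low-level parsing was done elsewhere.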
I attached two files to this mail which show this in action:
1. Fasta.py -> This is a replacement for Bio/Fasta/Fasta.py. It just
replaces _Scanner and adds a SAX handler class to turn the Martel XML
into Scanner events.
2. fasta_format.py -> This should be put in Bio/Fasta, and is the
Martel-based regexp for reading Fasta files. My regular expressions suck, so
this got pretty ugly, especially when I was trying to deal with that
annoying DOS line break stuff in the test suite. I'm quite open to
suggestions for making this nicer!
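For anyone who wants to play with the line break problem without Martel, here is a small sketch using Python's standard re module instead. It is not the expression in fasta_format.py; the pattern, the group names, and the parse_fasta helper are all assumptions for illustration. The trick is just to allow an optional "\r" before every "\n", and to keep matching until the input runs out rather than stopping after one record.

```python
import re

# One Fasta record: a ">" title line followed by zero or more sequence
# lines. "\r?\n" accepts both Unix and DOS line endings.
RECORD = re.compile(
    r">(?P<title>[^\r\n]*)\r?\n"                    # title line
    r"(?P<body>(?:[^>\r\n][^\r\n]*(?:\r?\n|$))*)"   # sequence lines
)


def parse_fasta(text):
    """Return a list of (title, sequence) tuples for every record in text."""
    records = []
    for match in RECORD.finditer(text):
        # Drop line breaks (and any stray whitespace) inside the sequence.
        sequence = "".join(match.group("body").split())
        records.append((match.group("title"), sequence))
    return records


# Mixed DOS and Unix line endings in the same input.
sample = ">seq1 test\r\nACGT\r\nTTGA\r\n>seq2\nGGCC\n"
records = parse_fasta(sample)
```

With the sample above, records contains one entry per record in the input, with the DOS line endings stripped out of both the titles and the sequences.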
This should work almost exactly the same as the earlier _Scanner class,
except that it parses everything that gets fed into it (instead
of just one record from a file, as before). So all of the tests work with
the new parser, but test_Fasta will fail in the regression test because
of this different behavior.
Feedback on all of this would be very welcome!
Brad
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Fasta.py
Type: application/x-unknown
Size: 11129 bytes
Desc: not available
Url : http://portal.open-bio.org/pipermail/biopython-dev/attachments/20001001/ff075c6e/Fasta.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fasta_format.py
Type: application/x-unknown
Size: 1219 bytes
Desc: not available
Url : http://portal.open-bio.org/pipermail/biopython-dev/attachments/20001001/ff075c6e/fasta_format.bin