[BioPython] Question about Martial

Andrew Dalke dalke@acm.org
Thu, 22 Mar 2001 11:53:51 -0700


>Can't find the Martel's developer's contact on the docs,
>so I'm going to send this here...

That would be me.  Guess I should start putting my
name and self-promotional links to my company web site
as comments in the code and docs.  Oh wait, my company
doesn't have a web site yet.  :)


>Anyone know how to make Martel's parser NOT forward
>non-named events? Such as found in, say, Martel.Str().

I'm not sure exactly what you mean by this.  I see that
Brad gave one answer, but I think you are asking for
something different, which is to not get characters()
events outside of tag names.

In its most literal interpretation, that isn't possible,
since I think everything in XML needs to be singly
rooted, so everything is enclosed in a top-level tag.

However, what I think you want is something like:

from xml.sax import handler

class FilterText(handler.ContentHandler):
    # Forward events to the wrapped handler only for the listed
    # tags and for the text found inside them.
    def __init__(self, tags, handler):
        self._tags = tags
        self._handler = handler
        self._count = 0    # how many listed tags are currently open
    def startElement(self, name, attrs):
        if name in self._tags:
            self._count += 1
            self._handler.startElement(name, attrs)
    def endElement(self, name):
        if name in self._tags:
            self._handler.endElement(name)
            self._count -= 1
    def characters(self, text):
        # drop any text that is not inside one of the listed tags
        if self._count:
            self._handler.characters(text)
    def startDocument(self):
        self._handler.startDocument()
    def endDocument(self):
        self._handler.endDocument()

Then instead of

  my_handler = WhatEver()
  parser.setContentHandler(my_handler)

you would do:

  my_handler = WhatEver()
  my_handler = FilterText(["ac_number", "entry_name", "sequence"],
                          my_handler)
  parser.setContentHandler(my_handler)

However, using this filter isn't necessarily the best thing
for every case.  Take a look at the builder/ subdir of the
distribution.  That has a pretty nice example of how to 
build biopython SProt objects.  In that case I combine
the filtering with something which accumulates characters()
so the endElement for useful tags becomes quite simple.
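
To give a rough idea of what combining the two looks like, here
is a minimal sketch.  It is not the code from builder/; the class
name CollectText, the "fields" attribute and the sample XML are
all made up for illustration, and it assumes the wanted tags are
never nested inside each other.  It uses the standard xml.sax
parser on a hand-written string in place of a Martel parser.

```python
from xml.sax import handler, parseString


class CollectText(handler.ContentHandler):
    """Keep the text of selected tags and ignore everything else."""

    def __init__(self, tags):
        handler.ContentHandler.__init__(self)
        self._tags = tags
        self._parts = None   # list of text chunks while inside a wanted tag
        self.fields = {}     # tag name -> list of collected strings

    def startElement(self, name, attrs):
        if name in self._tags:
            self._parts = []

    def characters(self, text):
        # characters() can be called several times for one element,
        # so collect the chunks and join them at the end tag.
        if self._parts is not None:
            self._parts.append(text)

    def endElement(self, name):
        if name in self._tags:
            self.fields.setdefault(name, []).append("".join(self._parts))
            self._parts = None


# Hand-written XML standing in for the events a parser would send.
doc = (b"<entry>ID <entry_name>TEST_HUMAN</entry_name> "
       b"AC <ac_number>P12345</ac_number>;</entry>")
collector = CollectText(["entry_name", "ac_number"])
parseString(doc, collector)
print(collector.fields)
```

Because the text is joined in endElement, the code that uses a
field only has to look at one complete string per tag.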

                    Andrew
                    dalke@acm.org