[Biopython-dev] Re: Gobase

Cayte katel at worldpath.net
Wed Aug 30 03:08:56 EDT 2000


> So to answer your question, you may want to try to create a Gobase parser
> in Martel, and then let us know what you think.  It would be a good test
> case, and probably helpful to Andrew to know whether it can handle the
> format.
>
> Jeff
>
>
>
> On Mon, 21 Aug 2000, Cayte wrote:
>
> >    I've been looking into Gobase, a mitochondrial database, and
> > wondering whether to use a line oriented or a streaming approach.
> > The Gobase pages don't use as much formatting as Rebase, so the
> > ParserSupport routines would work.  But the streaming lets the utility
> > strip off all the HTML, so the user doesn't have to delete the
> > preamble.  The streaming is also less brittle if the format should
> > change.  On the other hand, it's more bug prone because it removes
> > linefeeds before they can be used as delimiters.
> >
> >
   Unfortunately, I already started Gobase.  By generalizing, I was able to
scrunch the code I used for Rebase.

Instead of things like:

    def _scan_methylation(self, text, consumer ):
        start = string.find( text, 'Base (Type of methylation):' )
        if( start != -1 ):
            end = string.find( text, 'REBASE enzyme #:' )
            next_item = text[ start:end ]
            consumer.methylation( next_item )

I coded:

    def _scan_field(self, text, field, next_field = None ):
        start = string.find( text, field )
        if( start == -1 ):
            return None
        if( next_field == None ):
            # no following field given: take a fixed 40-character window
            end = start + 40
        else:
            end = string.find( text, next_field )
            if( end == -1 ):
                return None
        # slice from the field label up to the next label (or window end)
        next_item = text[ start:end ]
        return( next_item )
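
For anyone reading this later, here is a minimal sketch of the same generalized scanner as a standalone function using string methods, with a usage example.  The name scan_field, the default_width parameter, and the sample text are illustrative assumptions, not part of the Rebase or Gobase code.  One deliberate difference from the method above: the search for the next label starts after the current label, so an earlier occurrence of next_field is not picked up.

```python
def scan_field(text, field, next_field=None, default_width=40):
    """Return the text from `field` up to `next_field`, or None if not found."""
    start = text.find(field)
    if start == -1:
        return None
    if next_field is None:
        # No following label given: fall back to a fixed-width window.
        end = start + default_width
    else:
        # Look for the next label only after the current one.
        end = text.find(next_field, start + len(field))
        if end == -1:
            return None
    return text[start:end]


# Made-up sample text, shaped like a page with the HTML already stripped.
sample = "Base (Type of methylation): 3(5) REBASE enzyme #: 993"

methylation = scan_field(sample, "Base (Type of methylation):", "REBASE enzyme #:")
enzyme = scan_field(sample, "REBASE enzyme #:")
print(methylation)
print(enzyme)
```

With this shape, each per-field scanner reduces to one call with a pair of labels, and the result can be handed to the matching consumer method.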

   But XML is a-comin' and we'll need something like Martel.  I plan to get
familiar with it.  I may use it for my next parser, when I get a round
tuit.  Maybe www.methdb.de (methylation database).

                                                                    Cayte




