[BioPython] new parser questions

Thu, 4 Apr 2002 21:55:40 -0800

Have you taken a look at Martel?  We're starting to move our parsers
to it.  It's quite robust and fast.  It might make things easier to
strip the HTML first, though...
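
As a rough illustration of the "strip the HTML first" idea, here is a minimal sketch using the HTMLParser class from the standard library of current Python 3 (html.parser). The tag names it watches (<b>, <font color=...>) are only assumptions about how the mass-spec report marks bold and red entries, and TextWithStyle and extract_chunks are hypothetical names, not part of Biopython or Martel.

```python
from html.parser import HTMLParser


class TextWithStyle(HTMLParser):
    """Drop the tags, keep the text, and remember whether it was bold/red."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.bold_depth = 0    # number of currently open <b> tags
        self.font_stack = []   # one entry per open <font>: True if it is red
        self.chunks = []       # (text, is_bold, is_red) tuples

    def handle_starttag(self, tag, attrs):
        if tag == "b":
            self.bold_depth += 1
        elif tag == "font":
            # Assumption: red entries are marked with <font color="red">.
            color = (dict(attrs).get("color") or "").lower()
            self.font_stack.append(color in ("red", "#ff0000"))

    def handle_endtag(self, tag):
        if tag == "b" and self.bold_depth:
            self.bold_depth -= 1
        elif tag == "font" and self.font_stack:
            self.font_stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(
                (text, self.bold_depth > 0, any(self.font_stack))
            )


def extract_chunks(lines):
    """Feed the parser piece by piece so a large file is never held whole."""
    parser = TextWithStyle()
    for line in lines:
        parser.feed(line)
    parser.close()
    return parser.chunks
```

Since extract_chunks accepts any iterable of strings, an open file object can be passed in directly. The resulting (text, is_bold, is_red) tuples could then go to a line-oriented parser; note that text spanning several input lines may arrive as more than one chunk.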

Jeff

On Thu, Apr 04, 2002 at 07:12:15PM +0200, Danny Navarro wrote:
> Hi all,
> 
> Now that I can use Biopython modules from Zope, I'll try to do the
> following project:
> 
> A database accessible through the web which will serve the results of
> high-throughput mass-spec experiments in a handy way. I'd like to use
> Biopython modules to provide biology services. The first step would be a
> parser to extract information from the mass-spec results.
> 
> The files with the experiment results are in simple HTML format. They
> are huge, ~60 MBytes each. I'll attach one sample.
> 
> The parser will extract protein names and peptide sequences with their
> scores, reporting which protein each peptide comes from, whether it is
> red and/or bold, and its delta.
> 
> I have written some parsers but they were not very robust. With this
> parser my first thought was to use the htmlparser Python module.
> 
> Is it better to reuse something from ParserSupport.py?
> 
> Should I use the scanner and consumer framework? 
> 
> Any hints about how to properly start the parser design for this kind
> of file?
> 
> I'll put the whole project on sourceforge.net soon. Then you can see
> exactly what the project is about.
> 
> If anybody is interested in participating with me in this project, it'd
> be great!
> 
> Danny