[Bioperl-l] bioperl pulls, xml parsers push, and things get complicated
Stephen Gordon Lenk
slenk at emich.edu
Wed Jul 19 16:04:16 EDT 2006
Hi,
I have found that POE fails to execute a periodic task after 32
iterations when run in a Perl thread; the failure is consistent on
both XP and OS X. If I knew how to write up a defect for Perl I would
do so (hint? how is this done - I'm *not* asking for an RTFM answer),
and I am probably remiss for not having done it. I was going to write
messages to a Controller Area Network (CAN) to control automotive
widgets from Perl, but I wound up using a C executable (piped to from
Perl) with its own threads to do this. Oh yes, I believe that bio lab
systems can be done this way as well.
But ... POE is really neat if you think in state machine terms. I have
an alternate architecture for my test harness (Perlizer) that would
use POE to run tests with CAN and GPIB.
Steve Lenk
----- Original Message -----
From: Robert Buels <rmb32 at cornell.edu>
Date: Wednesday, July 19, 2006 3:30 pm
Subject: Re: [Bioperl-l] bioperl pulls, xml parsers push, and things
get complicated
> POE is a really neat thing, I didn't know about it before. Something
> tells me, however, that I would have trouble convincing people to
> install POE as a dependency for a genomethreader output parser. ;-)
> I hope I'll have the opportunity to use it sometime.
>
> For the curious, here's a nice intro to POE:
> http://perl.com/pub/a/2001/01/poe.html
> And the POE main site:
> http://poe.perl.org/
>
> Rob
>
> aaron.j.mackey at GSK.COM wrote:
> > There are 3rd generation XML "Pull" parsers (also called "StAX" for
> > Streaming API for XML), but they seem to still be stuck in Java land
> > (e.g. "MXP1")
> >
> > You could probably use POE to set up a state machine that used
> > XML::Twig to "push" units of XML content onto a stack, to be read by
> > your "next_*" pull method (where the XML::Twig push "stalled" until
> > the "next_*" method was called, and vice versa).
> >
> > -Aaron
> >
> > bioperl-l-bounces at lists.open-bio.org wrote on 07/18/2006 08:06:02 PM:
> >
> >> Hi all,
> >>
> >> Here's a kind of abstract question about Bioperl and XML parsing:
> >>
> >> I'm thinking about writing a bioperl parser for genomethreader XML,
> >> and I'm sort of mulling over the 'impedance mismatch' between the
> >> way bioperl Bio::*IO::* modules work and the way all of the current
> >> XML parsers work. Bioperl uses a 'pull' model, where every time you
> >> want a new chunk of stuff, you call $io_object->next_thing. All the
> >> XML parsers (including XML::SAX, XML::Parser::PerlSAX and XML::Twig)
> >> use a 'push' model, where every time they parse a chunk, they call
> >> _your_ code, usually via a subroutine reference you've given to the
> >> XML parser when you start it up.
> >>
> >> From what I can tell, current Bioperl IO modules that parse XML are
> >> using push parsers to parse the whole document, holding stuff in
> >> memory, then spoon-feeding it in chunks to the calling program when
> >> it calls next_*(). This is fine until the input XML gets really big,
> >> in which case you can quickly run out of memory.
> >>
> >> Does anybody have good ideas for nice, robust ways of writing a
> >> bioperl IO module for really big input XML files? There don't seem
> >> to be any perl pull parsers for XML. All I've dug up so far would be
> >> having the XML push parser running in a different thread or process,
> >> pushing chunks of data into a pipe or similar structure that blocks
> >> the progress of the push parser until the pulling bioperl code wants
> >> the next piece of data, but there are plenty of ugly issues with
> >> that, whether one were to use perl threads for it (aaagh!) or fork
> >> and push some kind of intermediate format through a pipe or socket
> >> between the two processes (eek!).
> >>
> >> So, um, if you've read this far, do you have any ideas?
> >>
> >> Rob
> >>
> >> --
> >> Robert Buels
> >> SGN Bioinformatics Analyst
> >> 252A Emerson Hall, Cornell University
> >> Ithaca, NY 14853
> >> Tel: 503-889-8539
> >> rmb32 at cornell.edu
> >> http://www.sgn.cornell.edu
> >>
> >>
> >> _______________________________________________
> >> Bioperl-l mailing list
> >> Bioperl-l at lists.open-bio.org
> >> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >>
> >>
> >
> >
> >
>
>
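
For readers following the thread: the approach Rob describes in his
original question (a push parser running in a separate thread, pushing
chunks into a structure that blocks until the pulling code asks for
the next piece) might look roughly like the sketch below. It is
written in Python rather than Perl purely to show the control flow,
and it is not taken from any Bioperl module. The names PullReader and
next_record, and the <hit> element, are hypothetical and do not
reflect the actual genomethreader XML schema. It also does not address
the POE-in-threads failure Steve reports above.

```python
import io
import queue
import threading
import xml.sax

_DONE = object()  # sentinel the parser thread pushes when input is exhausted


class _RecordHandler(xml.sax.ContentHandler):
    """SAX (push) handler: gathers the text of each record element."""

    def __init__(self, out_queue, tag):
        super().__init__()
        self._queue = out_queue
        self._tag = tag
        self._chunks = None  # None while we are outside a record element

    def startElement(self, name, attrs):
        if name == self._tag:
            self._chunks = []

    def characters(self, content):
        if self._chunks is not None:
            self._chunks.append(content)

    def endElement(self, name):
        if name == self._tag:
            # put() blocks while the queue is full, which stalls the
            # push parser until the consumer pulls the previous record.
            self._queue.put("".join(self._chunks))
            self._chunks = None


class PullReader:
    """Hypothetical pull-style wrapper around a push (SAX) parser."""

    def __init__(self, stream, tag="hit"):
        # maxsize=1: at most one finished record is buffered at a time.
        self._queue = queue.Queue(maxsize=1)
        self._error = None
        # daemon=True so an abandoned, blocked parser thread does not
        # keep the process alive.
        self._thread = threading.Thread(
            target=self._run, args=(stream, tag), daemon=True
        )
        self._thread.start()

    def _run(self, stream, tag):
        try:
            xml.sax.parse(stream, _RecordHandler(self._queue, tag))
        except Exception as exc:  # remember parse errors for the consumer
            self._error = exc
        finally:
            self._queue.put(_DONE)

    def next_record(self):
        """Return the next record's text, or None at end of input."""
        item = self._queue.get()
        if item is _DONE:
            self._queue.put(_DONE)  # keep signalling "done" on later calls
            if self._error is not None:
                raise self._error
            return None
        return item


if __name__ == "__main__":
    doc = b"<hits><hit>alpha</hit><hit>beta</hit></hits>"
    reader = PullReader(io.BytesIO(doc))
    record = reader.next_record()
    while record is not None:
        print(record)
        record = reader.next_record()
```

Because the queue holds a single record, memory use stays bounded no
matter how large the input is, at the cost of one thread hand-off per
record. A fork-and-pipe variant would follow the same shape, with the
queue replaced by a pipe and some serialization of each record.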