[Bioperl-pipeline] Bioperl-pipeline

06 Aug 2002 20:30:11 +0800

Sending this again because it was held for having a suspicious header.

On Tue, 2002-08-06 at 20:22, Shawn wrote:
> Hi Mat,
> one can subscribe to the list at:
> 
> http://bioperl.org/mailman/listinfo/bioperl-pipeline
> 
> I think most of your questions were captured in Elia's mail.
> 
> >I have mySQL loaded (should I move to postgres?  
> >Chris Mungall seemed to say that it was better, haven't heard other endorsements one way or the other yet)  
> 
> I have no experience with postgres; mysql works great for us. I think
> what Chris pointed out was that postgres allows certain programmatic
> queries, which cuts some latency because you don't have to issue
> multiple queries.
> 
> > In the meantime, I think me getting the infrastructure ready for the
> > pipeline will be good, maybe testing something out?
> 
> Yup, sounds good. I will commit a blast pipeline, then you can try it out.
> You will need to have the bioperl-run package as well.
> 
> In the meantime, you can run the pipeline tests to see whether they
> pass.
> 
> The first time, do this:
>   perl Makefile.PL
>   make
> 
> Then run:
>   make test
> and see whether it runs.
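
The build-and-test steps above can be wrapped in one small shell function. This is only a sketch: the name `build_and_test` and its optional dry-run argument are made up for illustration and are not part of bioperl-pipeline; the three commands themselves are the ones listed above.

```shell
#!/bin/sh
# Sketch only: build_and_test is a hypothetical helper, not part of bioperl-pipeline.
# Usage: build_and_test <checkout-dir> [echo]
# Passing "echo" as the second argument prints the commands instead of running them.
build_and_test() {
    dir=$1
    run=${2:-}
    (
        cd "$dir" || exit 1
        $run perl Makefile.PL &&   # generate the Makefile (needed the first time)
        $run make &&               # build
        $run make test             # run the test suite
    )
}
```

For example, `build_and_test bioperl-pipeline echo` only prints the three commands, which is a cheap way to check the sequence before running it for real with `build_and_test bioperl-pipeline`.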
> 
> Before doing this you will need to modify 
> bioperl-pipeline/Bio/Pipeline/PipeConf.pm
> 
> Ensure that NFSTMP_DIR points to a dir that is accessible and writable.
> Ensure that BATCH_MOD is correct, either LSF or PBS. Hmm, I remember now:
> you don't have load-sharing software, right? You will need something
> like this unless there is some way you can submit the jobs from one
> machine. Then you can encapsulate that in a BatchSubmission object.
> But the tests will skip if you don't, so it's OK for now.
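
A quick way to sanity-check the two settings before running the tests might look like the sketch below. Both helper names are hypothetical, and since the layout of PipeConf.pm is not shown in this thread, the first helper only greps for the two setting names rather than parsing the file.

```shell
#!/bin/sh
# Sketch only: both helpers are hypothetical and not part of bioperl-pipeline.

# Print the lines of a config file that mention the two settings named above.
show_pipeconf_settings() {
    grep -nE 'NFSTMP_DIR|BATCH_MOD' "$1"
}

# Succeed only if the given directory exists and is writable by the current user.
check_tmp_dir() {
    if [ -d "$1" ] && [ -w "$1" ]; then
        echo "ok: $1 is a writable directory"
    else
        echo "problem: $1 is missing or not writable" >&2
        return 1
    fi
}
```

One would run `show_pipeconf_settings bioperl-pipeline/Bio/Pipeline/PipeConf.pm`, read off the directory configured for NFSTMP_DIR, and pass that path to `check_tmp_dir`.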
> 
> > I don't have your slides (will you be posting them at the OBF or bioperl-site), but any other servers needed?  Apache
> 
> Attached are my slides. Yeah, I will pass them to Chris D for the bioperl
> site.
> 
> > tomcat (guess not, no java?)  Can I go check cvs for an ERD? (Already have BioSQL loaded, but what of GFD
> 
> Nope, only mysql needed. ERD? hehe.. good idea...we might come up with
> one.
> 
> 
> 
> shawn
> 
> 
> On Tue, 2002-08-06 at 04:53, Wiepert, Mathieu wrote:
> > Hi Shawn,
> > 
> > Glad you got the list started.  I have a friend down at Wustl, who is interested in this as well.  It was his research that generated the process for us.  If you could add him to the list, or else give me the url to have him enroll, that would be great. (rfreimut@im.wustl.edu)
> > 
> > We will have to flesh this out re the cron job.  I see that aspect as taking a normal pipeline job and then just cron'ing it, which I think is what you are saying?  So a cron wrapper on an already existing pipeline path?
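
The "cron wrapper on an existing pipeline" idea can be sketched as a small shell function that refuses to start a new run while the previous one is still going. Everything named here is an assumption for illustration: `run_locked`, the lock path, and the script paths are hypothetical, and the command that actually starts the pipeline is not specified in this thread.

```shell
#!/bin/sh
# Sketch only: run_locked and every path shown here are hypothetical.
# Example crontab entry (runs the wrapper at 02:00 every day):
#   0 2 * * * /path/to/pipeline_cron.sh

# Run a command unless a previous run is still holding the lock directory.
# mkdir either creates the directory or fails, so only one caller gets the lock.
run_locked() {
    lockdir=$1
    shift
    if ! mkdir "$lockdir" 2>/dev/null; then
        echo "previous run still active, skipping" >&2
        return 0
    fi
    status=0
    "$@" || status=$?      # run the wrapped command and remember its exit status
    rmdir "$lockdir"       # release the lock whether or not the command succeeded
    return "$status"
}
```

The wrapper script called from cron would then contain a single line such as `run_locked /tmp/pipeline.lock /path/to/run_pipeline.sh`, where `run_pipeline.sh` stands in for whatever command launches the already-configured pipeline.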
> > 
> > Regarding 2, I can help with trying to write first passes at any runnable that I might need (it sounds simple enough), but will wait for Martin's attempt to define the interface as you suggest.
> > 
> > I have been updating the bioperl-pipeline code from anonymous cvs.
> > 
> > BTW, I didn't get this except as direct from you, nothing from a bioperl-pipeline list.  Just FYI...
> > 
> > Thanks,
> > -Mat
> > 
> > -----Original Message-----
> > From: Shawn [mailto:shawnh@fugu-sg.org]
> > Sent: Monday, August 05, 2002 8:41 PM
> > To: Wiepert, Mathieu
> > Cc: Elia Stupka; kiran; bioperl-pipeline@bioperl.org
> > Subject: Re: bioperl-pipeline for the small lab
> > 
> > 
> > Hi Mat,
> > Great to hear from you. We are most interested to see how we can help
> > you out. We want to encourage as much use of the pipeline as possible,
> > and that will give us tremendous support in terms of validating our
> > design and incorporating new features as new requirements arise.
> > From your mail, I will try to summarize what you propose:
> > 
> > 1) The daemon pipeline, which I can see as a cron job, is essentially a
> > one-stage pipeline. But we might have one or more runnables for the
> > logic of doing the diffs and annotating your database based on new hits
> > etc. Yup, it's interesting to me to develop this functionality, which is
> > very generalizable.
> > 
> > 2) The second pipeline seems like a series of blast pipelines running
> > in parallel, then using the hits to run framesearch and TFASTA. We can
> > write bioperl wrappers for those, or you could help out too :) Wait for
> > the proposal of the new Bio::Tools::Run::Analysis interface that I think
> > Martin Senger is proposing for writing these wrappers, which should be
> > quite nice and clean. Outputs can be dumped to a database or csv as you
> > wish :) In terms of GFD, it seems that we are mostly storing
> > hits/featurepairs for your searches, which is now doable and which we
> > are refining. You probably also want to store your genes in the db as
> > well. For your sequences, you may want to store them in BioSQL.
> > 
> > When I get back, we can come up with more concrete plans. I will try to
> > write up some configurations for your pipeline :)
> > 
> > It looks like an interesting start. FYI, we just created a
> > bioperl-pipeline mailing list, and I hope you don't mind that I have
> > cc'ed this mail to the list.
> > 
> > >I can write some sort of docs for you on how to do this for the small
> > >lab?
> > :) We will definitely take you up on that.
> > 
> > Are you still at ISMB? Let's talk some more if you want.
> > Great mail.
> > 
> > cheers,
> > shawn
> > 
> > 
> > 
>