[Biopython-dev] biopython web interface

Massimo Di Pierro mdipierro at cs.depaul.edu
Sun May 1 04:51:23 UTC 2011


Hello Andrea

I am looking at something a little different from what you are doing, but we should definitely collaborate.
I am trying to identify tasks that are not domain specific that could benefit more than one scientific community.

It seems to me that all scientific communities have data, have programs (whether in Python or not is irrelevant to me), and have a workflow.
They all need:
1) a tool to post the data online in a semi-automated fashion
2) a tool to share data easily (both via web interface and scripting via web service) with access control
3) a way to annotate the data as in a CMS
4) a mechanism to connect data with a workflow so that certain programs are executed automatically when new data is uploaded to the system. The programs may require user input, so it should be possible to register a task (a program) by describing what input data and what user input it needs, and the system should automatically generate an interface.
5) an interface to local clusters and grid resources to submit computing jobs to
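
Point 4 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual system: the names Task, REGISTRY, register, form_fields and tasks_for are hypothetical, and the "interface" is reduced to a list of form-field descriptions derived from the declared user inputs.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """A registered program, described by the data and user input it needs."""
    name: str
    input_types: list  # file types the program consumes, e.g. ["csv"]
    user_inputs: dict = field(default_factory=dict)  # parameter name -> Python type


REGISTRY = {}  # hypothetical in-memory registry: task name -> Task


def register(task):
    """Store a task description so the system can find it later."""
    REGISTRY[task.name] = task
    return task


def form_fields(task):
    """Derive a simple form description from the declared user inputs."""
    widgets = {int: "number", float: "number", bool: "checkbox", str: "text"}
    return [
        {"name": name, "widget": widgets.get(kind, "text")}
        for name, kind in task.user_inputs.items()
    ]


def tasks_for(file_type):
    """Names of tasks that could be triggered when such a file is uploaded."""
    return [t.name for t in REGISTRY.values() if file_type in t.input_types]


# Example registration (hypothetical task and parameters).
register(Task("histogram", ["csv"], {"bins": int, "title": str}))
print(form_fields(REGISTRY["histogram"]))
print(tasks_for("csv"))
```

The point of the design is that the program author only declares what the task needs; the form and the trigger on upload are derived from that declaration.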

I do not have the resources or the expertise to build an interface specific to biopython, but I think we should collaborate: if what I am doing is general enough (and I am not sure it is unless we talk more about it), it could be used to create an interface to biopython with minimal programming.

I understand your focus is on algorithms, but I need to start from the data. In my experience it is very difficult to automate a workflow of algorithms if there is no standard exchange format for the data.

The first things I would need to understand are:
- does biopython handle some standard file formats? What do they contain? How can they be recognized? Can you send me a few examples?
- is there a graph of which algorithms run on which file types?
- what are the most common algorithms? Can you point me to the source?

I like to think of the system as something that will represent the workflow as a graph. Each file type is a node. An algorithm is a link.
If a node is an image, a csv file, an xml file, a movie, a vtk file, etc., the system will be able to represent it (show it).
Links "define" the file type. As long as you have a standard, you will be able to register your algorithms and the system will know what to do.
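
The graph idea can be sketched like this. It is a minimal illustration, assuming each registered algorithm declares one input and one output file type; the file types and algorithm names below are made up, and breadth-first search is my choice here for finding a chain of algorithms, not something the system is known to use.

```python
from collections import deque

# Hypothetical registrations: (input file type, algorithm name, output file type).
ALGORITHMS = [
    ("raw", "calibrate", "csv"),
    ("csv", "plot", "png"),
    ("csv", "summarize", "xml"),
]


def build_graph(algorithms):
    """File types are nodes; each algorithm is a directed, labelled link."""
    graph = {}
    for source, name, target in algorithms:
        graph.setdefault(source, []).append((name, target))
        graph.setdefault(target, [])
    return graph


def pipeline(graph, start, goal):
    """Breadth-first search for the shortest chain of algorithms from start to goal."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for name, target in graph.get(node, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, path + [name]))
    return None  # no chain of registered algorithms connects the two types


graph = build_graph(ALGORITHMS)
print(pipeline(graph, "raw", "png"))
```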

The whole graph is built automatically, without programming, by introspecting your folders and identifying your files. You will be able to annotate your folders using a markup language to augment the information.
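
A rough sketch of the introspection step, assuming file types are identified by extension alone (a real system would probably also look at file contents). The function name introspect and the "unknown" bucket are hypothetical.

```python
import os
from collections import defaultdict


def introspect(root):
    """Walk a directory tree and group file paths by lowercase extension."""
    by_type = defaultdict(list)
    for folder, _subfolders, files in os.walk(root):
        for filename in files:
            extension = os.path.splitext(filename)[1].lower().lstrip(".")
            # Files without an extension go into a catch-all bucket.
            by_type[extension or "unknown"].append(os.path.join(folder, filename))
    return {kind: sorted(paths) for kind, paths in by_type.items()}


if __name__ == "__main__":
    # Summarize the current directory: file type -> number of files.
    for kind, paths in sorted(introspect(".").items()):
        print(kind, len(paths))
```

Each key of the result would become a node of the graph; the annotations in the markup language could then override or refine what the extension suggests.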

In my approach starting from the data is critical. My approach does not fly if you do not have standard file formats.

Massimo







P.S. Are you Italian?

On Apr 30, 2011, at 12:03 PM, Andrea Pierleoni wrote:

> 
>> 
>> Message: 3
>> Date: Fri, 29 Apr 2011 08:34:34 -0500
>> From: Massimo Di Pierro <mdipierro at cs.depaul.edu>
>> Subject: [Biopython-dev] biopython web interface
>> To: <biopython-dev at biopython.org>
>> Message-ID: <57629245-F184-4143-8B18-80E69BC2C351 at cs.depaul.edu>
>> Content-Type: text/plain; charset="us-ascii"
>> 
>> Hello everybody,
>> 
>> I am new to biopython and I have some silly questions.
>> 
>> Does biopython have a web interface?
>> If not, would you be interested in helping to develop one?
>> What kind of features would you be interested in?
>> 
>> Reason for my question: I am a physicist and a professor of CS. I am
>> working with a few different groups to build a unified platform to bring
>> scientific data online. The main idea is that of having a tool that
>> requires no programming and scientists can use to introspect an existing
>> directory and turn it into dynamical web pages. Those pages can then be
>> edited and re-organized like a CMS. The system should be able to
>> recognize basic file types, group, tag and categorize them. It should then
>> be possible to register algorithms, run them on the server, create a
>> workflow. The system will also have an interface for mobile.
>> 
>> Here is a first prototype for physics data that interfaces with the
>> National Energy Research Computing Center:
>> http://tests.web2py.com/nersc
>> 
>> Since we are doing this it would be great to have as many communities on
>> board as possible so that we can write specs that are broad enough.
>> We can do all the work or you can help us if you want.
>> 
>> So, if you have a wish list please share it with me.
>> 
>> Personally, I need to be educated on biopython since I do not fully
>> understand what are the basic file types it handles, what are the most
>> popular algorithms it provides, nor am I familiar with the typical usage
>> workflow.
>> 
>> Massimo
>> 
>> 
>> 
> 
> 
> Hi Massimo,
> BioPython itself is a python library, but a web interface would make many
> functions available to biological scientists with no programming expertise.
> There are some parts of the library that cope well with a
> web-interface/server,
> in particular the BioSQL modules.
> The BioSQL schema is a relational database model to store biological data.
> I do have working code for using the BioPython BioSQL functions (and more)
> with
> the web2py DAL, and I'm working on a complete web2py-based opensource
> webserver to store and manage biological sequences/entities.
> If you (or any other) are interested and want to contribute, let me know.
> There are many things in common between what I'm doing and what you want
> to do,
> so maybe it's a good idea to work together.
> 
> Andrea Pierleoni
> 
> 
> 




