[Biopython] Fwd: Why so few recipes in the cookbook?

Peter biopython at maubp.freeserve.co.uk
Mon Dec 21 12:03:34 UTC 2009


I just checked with Daniel to make sure he was happy for me to
forward this back to the mailing list.

Peter
---------- Forwarded message ----------
From: Peter <biopython at maubp.freeserve.co.uk>
Date: Fri, Dec 18, 2009 at 10:42 PM
Subject: Re: [Biopython] Why so few recipes in the cookbook?
To: Daniel Silvestre <daniel at dim.fm.usp.br>


Hi Daniel,

Do you mind if I send this to the list too?

2009/12/18 Daniel Silvestre <daniel at dim.fm.usp.br>:
>>
>> I confess I don't know or use the full power of the Entrez website,
>> although that is in part since I can do clever stuff via their API ;)
>
> This is exactly what we want to do when we get to the Entrez interface.
> But, the information "How to submit a complex query" is hidden (and
> scattered) under many layers of web pages.
>
> The ability to do such things in a more customized way is the dream of
> every life science guy.

This is partly down to the NCBI's Entrez documentation - a lot of
the examples in the Biopython tutorial took some serious exploration
to get working, including trawling the net for other Entrez users (in
other languages). I hope that we've managed to make things clearer.

> While this tutorial is enough for CS-oriented guys, it's a really big
> step to grasp such information for people from other communities.
> That's why I'm always a little confused about the idea behind bio
> projects. If the idea is programming for scientists, the approach is
> way too CS.

You are probably right in that the Bio* projects do cater more to a
programming scientist than a wet biologist - not that there aren't
people that can and do both. You have to be able to program to
take full advantage of any of the Bio* kits. However, there are a
number of front ends, webpages, etc. which use them internally.

>> Now here I'd like a little clarification about what you want to do.
>>
>> My guess would be something I have considered working up
>> into a cookbook recipe, based on stuff I have already done:
>> Taking a small genome (viral or prokaryote), doing simple
>> gene predictions (e.g. ORF finding, pick first start codon,
>> or maybe calling a command line tool to do it for us), then
>> taking the predicted peptides and BLASTing them, then
>> making a GenBank file with these predicted features and
>> sticking a summary of the BLAST results in their annotation.
>>
>> However, while this is a reasonable first step, there are
>> downsides to encouraging this sort of naive approach to
>> annotation - the example would ideally have a "Further
>> Reading" section, see for example Schnoes et al. 2009.
>> http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1000605
>>
>
> That's exactly my point. Without a complete recipe with a
> specific motivation and a clearly stated problem behind it,
> people will continue with this kind of behavior.

We agree here.
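
For concreteness, the "ORF finding, pick first start codon" step from the
outline quoted above could be sketched roughly as follows. This is only an
illustration and not code from this thread: the function name find_orfs,
the min_codons parameter, and the decision to scan just the three forward
frames with the standard stop codons are all assumptions made for the
example. It is written in plain Python so that it runs without Biopython
installed.

```python
# Naive ORF finder sketch: forward strand only, standard stop codons.
# find_orfs and min_codons are hypothetical names, not Biopython API.

STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"


def find_orfs(dna, min_codons=2):
    """Return (start, end, frame) tuples for ORFs on the forward strand.

    In each frame an ORF runs from the first ATG seen after the previous
    stop codon up to and including the next in-frame stop codon.
    start and end are 0-based, end-exclusive indices into dna.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Walk the frame one codon at a time, ignoring a trailing partial codon.
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == START_CODON:
                start = i  # pick the first start codon
            elif start is not None and codon in STOP_CODONS:
                end = i + 3
                if (end - start) // 3 >= min_codons:
                    orfs.append((start, end, frame))
                start = None  # look for the next ORF in this frame
    return orfs


if __name__ == "__main__":
    for start, end, frame in find_orfs("CCATGAAATAGGG"):
        print(frame, start, end)
```

A real recipe would also need to cover the reverse strand, alternative
genetic codes and start codons, and the downsides of naive annotation
discussed above; those are deliberately left out of this sketch.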

> And I don't see why one needs to start simple. The first time I
> entered a molbio lab was to carry out an old-fashioned gene cloning
> procedure. This is a simple procedure. How does this compare to the
> simple examples we see in bio project tutorials?

I see the tutorial as a teaching aid, and many of the cookbook
examples also. For someone learning to program, an overly
complicated example is intimidating. This is not to say we can't
have some complex cookbook entries too.

i.e. I think that for learning to program, you need to start simple,
and build up the complexity gradually.

>>> These are real problems faced by the common biologist. The tasks in
>>> the proposed tutorial and cookbook snippets are already handled by a
>>> lot of web tools. It's absolutely necessary to show that biopython can
>>> increase the power and range of a biologist's everyday work, which
>>> can possibly be automated.
>>>
>>> I have some examples to obtain statistics over genome sequences which
>>> address complete examples (including globbing filelists, retrieving from
>>> online databases, etc.) and can prepare them as a recipe. But, I could
>>> use some help . . .
>>
>> If you start a cookbook entry on the wiki, and some outline
>> code, I'm sure we can as a group contribute ideas and tips
>> (particularly in the code, but maybe in the approach too). Or,
>> if you would rather, discuss some specific ideas here on the
>> mailing list first.
>>
>> Note that some of these topics would be ideal for an OBF
>> project wide set of examples, with reference solutions in
>> Biopython, BioPerl, BioJava, BioRuby, etc. That is however
>> a much much bigger task.
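
As a possible starting point for the kind of outline code mentioned above
(globbing a list of files and computing statistics over the sequences in
them), here is a minimal sketch. It is not Daniel's code: the choice of
sequence length and GC fraction as the statistics, and the names
read_fasta, gc_fraction and summarise, are assumptions made purely for
illustration. The small FASTA reader is hand-written so that the example
runs with the standard library alone; in a Biopython recipe one would
normally use Bio.SeqIO.parse(handle, "fasta") instead.

```python
import glob
import os


def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file (hypothetical helper)."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif line:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def gc_fraction(seq):
    """Fraction of G and C letters in seq; 0.0 for an empty sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def summarise(pattern):
    """Return (filename, header, length, GC fraction) for every record
    in every file matching the glob pattern, in sorted file order."""
    rows = []
    for path in sorted(glob.glob(pattern)):
        for header, seq in read_fasta(path):
            rows.append((os.path.basename(path), header, len(seq), gc_fraction(seq)))
    return rows


if __name__ == "__main__":
    # "genomes/*.fasta" is a placeholder path, not one from this thread.
    for row in summarise("genomes/*.fasta"):
        print(*row, sep="\t")
```

Retrieving the files from an online database first, as mentioned above,
would be a separate step that is not shown here.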
>
> I think that there is no need to worry about big things right now. By
> the very nature of programming, people will mirror ideas from one
> another. I've tried a similar approach in the bioperl community. But,
> for the pragmatic life scientist, perl is overly expressive while python
> has a much higher first encounter acceptance rate (I'm not sure why,
> though...).
>
> My idea is not a master blaster cookbook, just to assemble simple ideas
> that work for the everyday user, be this guy a CS person or a life
> scientist.
>
> How does this sound to you?

Wonderful :)

(And I would agree with you that Python is probably easier to teach
to beginners than Perl)

Peter


