[BioSQL-l] Plone4bio 1.0 and BioSQL

Thu Oct 1 15:32:01 UTC 2009

James wrote:
> Peter wrote:
>> James wrote:
>>>
>>> * issue #2: The imagemap shown under the 'Features' tab is generated
>>> using bioperl from a genbank file emitted by biopython. This is a flaw,
>>> and means lots of info is lost (my biosql db is used to serve protein
>>> sequence DAS annotation, so it has URLs, scores, and lots of notes).
>>
>> That is a curious and roundabout way of doing things, with many
>> data transformations, each risking the loss of information.
>
> I can understand why it was done - if you already have an image renderer
> that eats genbank, it's the shortest path :)

That could easily be the case.

>> It would be possible to use Biopython's GenomeDiagram module to
>> draw the image directly (although the style and capabilities would
>> differ). I've done this for an in-house TurboGears-based BioSQL
>> front end, and it was fine for prokaryotic organisms.
>
> Sounds good... I was sure there was a python way to go here. Happy
> to test any alternative you can provide ;)

Without me installing my own copy of Plone4bio, I would at least need
to see a sample PNG image to try and mimic. However, from Ivan's
email it sounds like they need features we don't currently support.

>> Another more elegant alternative would be to call a BioPerl script which
>> talks to the BioSQL database directly to get the data to draw the image.
>
> Definitely. It does incur the overhead of creating a new database connection
> and instantiating another object representation of the same biosql records.
> The latter isn't really a problem, but the former could have scalability
> implications.

It does look like Plone already has a Biopython (DB)SeqRecord object
in memory, so yes, constructing the BioPerl equivalent from the database
might be a bit of a waste.

>> Can you point me at the relevant files in Plone4bio to see their code?
>> I agree with your general point that a pluggable rendering option might
>> be best, but that would be a question for the Plone4bio team to debate.
>
> The bioperl bits that generate images/maps for genbank files are here:
> https://www.plone4bio.org/trac/browser/plone4bio.base/trunk/src/plone4bio/base/png/perl
>
> The python that does the piping is here:
> https://www.plone4bio.org/trac/browser/plone4bio.base/trunk/src/plone4bio/base/png/seqrecord.py
>
> Looking at the code again, I can see that there are well defined interfaces
> - so in principle, plugging in other instances should be fairly easy.

I'm sure they could also spit out the database primary keys and pass
them to the BioPerl script, which can use bioperl-db to talk to the database.
It may be that the current GenBank file route is faster though ;)

> My issues are here:
> Patch for genbank parser: https://www.plone4bio.org/trac/ticket/8

Could you attach an example of the problem GenBank files being
generated? Before we blame the BioPerl parser, we should check
that Biopython is producing a valid GenBank file. Offhand, I'm not
sure if feature types are allowed to have spaces in them, for example.
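
A quick way to find such cases in a generated file is sketched below. It
assumes the usual fixed-column feature table layout (feature key in columns
6-20, location from column 22) and simply reports keys containing
whitespace; whether those are actually invalid is the open question above.
The function name is made up for illustration.

```python
def suspicious_feature_keys(genbank_text):
    """Return (line_number, key) pairs for feature keys containing whitespace.

    Assumes the fixed-column GenBank feature table layout; hypothetical helper.
    """
    problems = []
    in_features = False
    for number, line in enumerate(genbank_text.splitlines(), start=1):
        if line.startswith("FEATURES"):
            in_features = True
            continue
        # Any other line starting in column 1 (ORIGIN, CONTIG, //) ends the table.
        if in_features and line[:1].strip():
            in_features = False
        if not in_features:
            continue
        # Feature key lines have five leading spaces and the key from column 6;
        # qualifier lines are indented further and are skipped here.
        if line.startswith("     ") and len(line) > 5 and line[5] != " ":
            key = line[5:20].strip()
            if any(ch.isspace() for ch in key):
                problems.append((number, key))
    return problems
```

Running this over the text of an exported record and printing the result
would show which feature types are worth checking first.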

James - how was your BioSQL database populated?

Peter