[BioPython] [bip] [OT] Revision control and databases

Bruce Southey bsouthey at gmail.com
Thu Oct 23 13:55:49 UTC 2008


Giovanni Marco Dall'Olio wrote:
> Hi,
> I have a question (well, it's not directly related to biopython or 
> pygr, but to scientific computing).
>
> I have always used flat files to store results and data for my 
> bioinformatics analyses, but now (as I was saying in another thread) I 
> would like to start using a database to do that.
Of course Biopython's BioSQL interface may provide a starting point.
> The problem is I don't know if databases do Revision Control.
> When I used flat files, I used to save all the results in a git 
> repository and, every time something was changed or calculated again, 
> I committed it.
> Do you know how to do this with databases? Does MySQL provide support 
> for revision control?
> Thanks :)
I think you are asking the wrong questions because it depends on what 
you want to do and what you actually store. There are a number of 
questions that you need to ask yourself about what you really need to do 
(knowing you have used git helps refine these). Examples include:
How often do you use the old versions in your git repository?
How do you use the old revisions in your git repository?
Do you even use the information of an older version if a newer version 
exists?
Do you actually determine when 'something was changed or calculated 
again' or is this partly determined by an external source like a Genbank 
or UniProt update? (At least in a database approach you could automate 
this.)
How many users can make changes?
How often do you have conflicts?
Are the conflicts hard to solve?

Revision control may be overkill for your use because it aims to 
handle many tasks, including change conflicts among multiple users, 
rather than the needs of a single user.  If you don't need all these 
fancy features then you can use a database. If you just want to store 
and retrieve a version then a database will do, but you need to at 
least force the inclusion of date and comment fields for the stored 
versions to be useful.
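
As a rough illustration of that last point, here is a minimal sketch of a 
table that refuses to store a version without a date and a comment. It 
assumes SQLite through Python's standard sqlite3 module rather than MySQL, 
and the table name (result_versions), the column names and the helper 
functions are all hypothetical, made up for this example.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical schema: each row is one stored version of a named result.
# The date and the comment are both mandatory, and the CHECK constraint
# rejects comments that are empty or only whitespace.
SCHEMA = """
CREATE TABLE IF NOT EXISTS result_versions (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    name       TEXT NOT NULL,
    payload    TEXT NOT NULL,
    created_at TEXT NOT NULL,
    comment    TEXT NOT NULL CHECK (length(trim(comment)) > 0)
)
"""


def open_store(path=":memory:"):
    """Open (or create) the store and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def save_version(conn, name, payload, comment):
    """Append a new version; older versions are never overwritten."""
    created_at = datetime.now(timezone.utc).isoformat()
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO result_versions (name, payload, created_at, comment) "
            "VALUES (?, ?, ?, ?)",
            (name, payload, created_at, comment),
        )
    return cur.lastrowid


def latest(conn, name):
    """Return (payload, created_at, comment) of the newest version, or None."""
    return conn.execute(
        "SELECT payload, created_at, comment FROM result_versions "
        "WHERE name = ? ORDER BY id DESC LIMIT 1",
        (name,),
    ).fetchone()


def history(conn, name):
    """Return (id, created_at, comment) for every version, oldest first."""
    return conn.execute(
        "SELECT id, created_at, comment FROM result_versions "
        "WHERE name = ? ORDER BY id",
        (name,),
    ).fetchall()
```

Calling save_version(conn, "blast_hits", data, "rerun after UniProt update") 
adds a row instead of replacing the previous one, so history() plays roughly 
the role of a commit log for a single user. It does nothing about merging or 
conflicts between users.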


Regards
Bruce


