[GSoC] GSoC 2014 queries and inputs

Ujjwal Thaakar ujjwalthaakar at gmail.com
Mon Mar 17 16:26:50 EDT 2014


If we call it BioRuby, I think it should work with any Ruby
implementation: CRuby, JRuby, Rubinius, etc. I'm not sure it's a good
idea to constrain people to JRuby!


On 18 March 2014 01:48, Francesco Strozzi <francesco.strozzi at gmail.com> wrote:

> I don't think it's necessary.  If you would like to use JRuby, there is
> the Picard API ( http://picard.sourceforge.net ) which you can reuse
> right away. It's fast and well tested.
>
> All the best.
> Francesco
> On 17 Mar 2014 20:38, "Ujjwal Thaakar" <ujjwalthaakar at gmail.com>
> wrote:
>
>> Would we have to write a new VCF parser in Ruby?
>>
>>
>>
>> On 15 March 2014 17:33, Ujjwal Thaakar <ujjwalthaakar at gmail.com> wrote:
>>
>> > Hi,
>> > My name is Ujjwal. I'm a 21-year-old student from India, and I'm
>> > interested in contributing to BioRuby this year. I have some
>> > questions about the listed project idea.
>> >
>> >    1. Can you give me some more use cases for this tool, and some
>> >    specific functional requirements we'd like to see? What we need
>> >    to mine determines the data structure of our persistence layer
>> >    and therefore which database engine to use.
>> >    2. When you say a RESTful API, do we want to deploy this on a
>> >    server with a backing database, together with a Ruby gem that
>> >    communicates with the API? I presume we also want people to be
>> >    able to compare our hosted VCF files with their local VCF files.
>> >    3. Although this is a *BioRuby* project, I presume the server
>> >    doesn't necessarily need to be written in Ruby? As mentioned,
>> >    Scala or JRuby could be used. I would suggest we have a look at
>> >    Go too.
>> >
>> > To give you some background about me: I was a GSoC intern last year
>> > for Ruby on Rails, where I implemented a RESTful collection routing
>> > API. I am an intermediate Ruby programmer. I have also been
>> > interested in synthetic biology for about a year now and have some
>> > lab experience, so I understand the basics of biology and
>> > specifically genetic engineering. I am a computer science undergrad
>> > and have taken a course on data engineering. I also have experience
>> > working with REST APIs and am building one right now for my startup.
>> >
>> > I have been thinking about the database, and I think Neo4j will be
>> > a great fit. It's not heavy like Oracle and does not need
>> > installation. It's portable and can be started and stopped easily
>> > on the machine. It has a low memory footprint and support for
>> > SPARQL too, although its native query language, Cypher, will do the
>> > trick for us right now. We can also run embedded instances using
>> > JRuby, which are super fast. I'm the maintainer of the most popular
>> > Neo4j Ruby bindings and am also in the process of rewriting the
>> > next version of neo4j-core. It will allow us to make all sorts of
>> > queries and do data mining at an incredible speed while being
>> > incredibly portable and light. All logic can then reside within the
>> > gem itself, and we do not need any backend. It should be fast
>> > enough since we'll be dealing directly with Java objects made
>> > available through JRuby. I have a fair idea of how fast this is,
>> > and it's really fast, although working with such huge files will
>> > bring different challenges. We don't need a separate database
>> > server for the embedded version. All we need is the jars, which
>> > fortunately are available as a gem, so all we have to do is include
>> > them as dependencies and our database is ready! I don't think it
>> > will be this easy with any other database while giving us the same
>> > speed, power and capabilities!
>> >
>> > I've started working on the proposal and will upload it in a couple of
>> > days for your feedback. This is going to be incredibly fun :)
>> >
>> > BTW, what is the user base of BioRuby like? What does it lack
>> > compared with other bio libraries such as Biopython?
>> >
>> > How much biology do I need to understand for this project, or will
>> > I learn as we go along?
>> >
>> > --
>> > Thanks
>> > Ujjwal
>> >
>>
>>
>>
>> --
>> Thanks
>> Ujjwal
>> _______________________________________________
>> GSoC mailing list
>> GSoC at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/gsoc
>>
>


-- 
Thanks
Ujjwal

