[Biojava-dev] BioJava 3 Begins - Volunteers please!

Richard Holland holland at eaglegenomics.com
Mon Oct 20 08:23:17 UTC 2008


Good point, and the answer is no, it doesn't really matter! So I will remove
the singleton-ish-ness of Alphabet.
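
For anyone following the thread, here is a minimal sketch of the non-singleton
approach described in Mark's quoted message: value-based equals() plus an
optional interning cache, much like String.intern(). SimpleAlphabet, its
fields, and intern() are hypothetical names made up for illustration; this is
not the Alphabet class committed to the biojava3 branch.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical value-style alphabet; NOT the actual BioJava 3 Alphabet class.
final class SimpleAlphabet {
    // Cache of canonical instances, keyed by value equality.
    private static final Map<SimpleAlphabet, SimpleAlphabet> CACHE =
            new ConcurrentHashMap<>();

    private final String name;
    private final Set<Character> symbols;

    SimpleAlphabet(String name, String symbolChars) {
        this.name = name;
        Set<Character> s = new LinkedHashSet<>();
        for (char c : symbolChars.toCharArray()) {
            s.add(c);
        }
        this.symbols = Collections.unmodifiableSet(s);
    }

    // Returns the canonical instance equal to 'a', registering 'a' itself
    // if no equal alphabet has been seen before.
    static SimpleAlphabet intern(SimpleAlphabet a) {
        SimpleAlphabet existing = CACHE.putIfAbsent(a, a);
        return existing != null ? existing : a;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof SimpleAlphabet)) return false;
        SimpleAlphabet other = (SimpleAlphabet) o;
        return name.equals(other.name) && symbols.equals(other.symbols);
    }

    @Override
    public int hashCode() {
        return 31 * name.hashCode() + symbols.hashCode();
    }
}

class AlphabetDemo {
    public static void main(String[] args) {
        SimpleAlphabet a = new SimpleAlphabet("DNA", "ACGT");
        SimpleAlphabet b = new SimpleAlphabet("DNA", "ACGT");
        System.out.println(a.equals(b)); // equal by value
        System.out.println(a == b);      // but distinct instances
        // After interning, both resolve to one shared instance.
        System.out.println(SimpleAlphabet.intern(a) == SimpleAlphabet.intern(b));
    }
}
```

With this shape, two copies of the DNA alphabet cost only a little memory, and
code that compares with equals() behaves the same whether or not the instances
went through the cache.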



2008/10/20 Mark Schreiber <markjschreiber at gmail.com>

> Hi -
>
> Just a comment ...
>
> Does an alphabet need to be a Singleton in this new paradigm? If it
> does then do you want to have an equals() method? Currently you could
> have:
>
> Alphabet a; Alphabet b;
>
> a.equals(b) //true;
> a == b //false
>
> Unless there is a strong reason why Alphabet needs to be a Singleton I
> don't think it should be (Singletons make life hard when transporting
> between JVMs).  You can get a similar kind of behaviour with caching
> where it doesn't hurt if there is more than one instance of an equal
> alphabet but when they pass through the cache they can get cleaned up
> (like the interning behaviour of Strings).
>
> Put it this way. If I have two copies of the DNA alphabet will it
> matter (other than a bit of memory waste)?
>
> - Mark
>
> On Mon, Oct 20, 2008 at 8:18 AM, Richard Holland
> <holland at eaglegenomics.com> wrote:
> > Hi all,
> >
> > I've just committed some new code to the biojava3 branch of the
> > biojava-live subversion repository. It's the foundations of a brand new
> > alphabet+symbol set of classes, and an example of how to use them to
> > represent DNA. You'll notice that the new code is very lightweight and
> > allows for a lot more flexibility than the old code - for instance, the
> > concept of Alphabet has changed radically. It also makes much more
> > extensive use of the Collections API.
> >
> > I haven't got any test cases or usage examples yet but give me a shout if
> > you don't understand the code and I'll explain how it works. (Hint:
> > SymbolFormat is there to convert Strings into SymbolList objects, and
> > vice versa).
> >
> > So, now we want some volunteers! We're starting from scratch here so
> > there's a lot of work to do. The whole of BioJava needs 'translating' into
> > BJ3, whether it be copy-and-paste existing classes and modify them to suit
> > the new style, or write completely new ones to provide equivalent
> > functionality.
> >
> >
> > I'll post an example of how to do file parsing soon, probably starting
> > with FASTA. In the meantime, a good place to start would be for people to
> > design object models to represent their favourite data types (e.g.
> > Genbank, or microarray data). Utility classes to manipulate those objects
> > would be great too.
> >
> > The object models need to be normalised as much as possible - e.g. if
> > your data has a lot of comments, and the order of those comments is
> > important, then give your object model a collection of comment objects.
> > The object model for each data type should be completely independent and
> > use basic data types wherever possible (e.g. store sequences as strings,
> > don't attempt to parse them into anything fancy like SymbolLists). The
> > closer the object model is to the original data format, the better.
> > There's going to be clever tricks when it comes to converting data between
> > different object models (e.g. Genbank to INSDSeq), which I will explain
> > later when I put the file parsing examples up.
> >
> > You'll notice how the biojava3 branch uses Maven instead of Ant. This is
> > because we want to make it as modular as possible, so if you want to
> > write microarray stuff, create a new microarray sub-project (as per the
> > dna example that's already there). This way if someone only wants the
> > microarray bit of BJ3, they only need install the appropriate JAR file and
> > can ignore the rest. (The 'core' module is for stuff that is so generic it
> > could be used anywhere, or is used in every single other module.)
> >
> > If coding isn't your cup of tea, then we would very much welcome testers
> > (particularly those who enjoy writing test cases!), documenters
> > (particularly code commenters), translators (for internationalisation of
> > the code), and of course all those who wish to contribute ideas and
> > suggestions no matter how off-the-wall they might be. In particular if
> > you'd like to take charge of an area of the development process, e.g.
> > Documentation Chief, or Protein Champion, then that would be much
> > appreciated.
> >
> > I'm very much looking forward to working with everyone on this. Good
> > luck, and happy coding!
> >
> > cheers,
> > Richard
> >
> > PS. Please don't forget to attach the appropriate licence to your code.
> > You can copy-and-paste it from the existing classes I just committed this
> > evening.
> >
> > PPS. For those who are worried about backwards compatibility - this was
> > discussed on the lists a while back and it was made clear that BJ3 is a
> > clean break. However, the existing code will continue to be maintained
> > and bugfixed for a couple of years so you don't have to upgrade if you
> > don't want to - it just won't have any new features developed for it. This
> > is largely because it'll probably take just that long to write all the new
> > BJ3 code. When we do decide to desupport the existing BJ code, plenty of
> > notice will be given (i.e. years as opposed to months).
> >
> >
> > --
> > Richard Holland, BSc MBCS
> > Finance Director, Eagle Genomics Ltd
> > M: +44 7500 438846 | E: holland at eaglegenomics.com
> > http://www.eaglegenomics.com/
> > _______________________________________________
> > biojava-dev mailing list
> > biojava-dev at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/biojava-dev
> >
>
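
The object-model advice in the quoted message (keep the model close to the
file format, hold comments in an ordered collection, store the sequence as a
plain String) might look roughly like the sketch below. SimpleRecord and its
members are hypothetical names chosen for illustration, and the accession
used in main() is a made-up placeholder; none of this is taken from the
biojava3 branch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical, format-neutral record model; not a BioJava 3 class.
final class SimpleRecord {
    private final String accession;
    // Sequence kept as a raw String rather than parsed into symbols.
    private final String sequence;
    // ArrayList preserves the order in which comments appeared in the file.
    private final List<String> comments = new ArrayList<>();

    SimpleRecord(String accession, String sequence) {
        this.accession = accession;
        this.sequence = sequence;
    }

    void addComment(String comment) {
        comments.add(comment);
    }

    String getAccession() {
        return accession;
    }

    String getSequence() {
        return sequence;
    }

    // Read-only view so callers cannot reorder or drop comments by accident.
    List<String> getComments() {
        return Collections.unmodifiableList(comments);
    }
}

class SimpleRecordDemo {
    public static void main(String[] args) {
        // "X00001" is a placeholder accession, not a real database entry.
        SimpleRecord record = new SimpleRecord("X00001", "ACGTACGT");
        record.addComment("first comment line");
        record.addComment("second comment line");
        for (String c : record.getComments()) {
            System.out.println(c);
        }
        System.out.println(record.getSequence());
    }
}
```

Converting the String sequence into a SymbolList would then be a separate
step outside the model, which keeps each data type's model independent.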



-- 
Richard Holland, BSc MBCS
Finance Director, Eagle Genomics Ltd
M: +44 7500 438846 | E: holland at eaglegenomics.com
http://www.eaglegenomics.com/


