[Biojava-dev] HashMap ==> LinkedHashMap in HashSequenceDB

Thomas Down thomas.a.down at googlemail.com
Wed Nov 18 12:14:39 UTC 2009


I'd be utterly shocked if this turned out to be a noticeable difference.

             Thomas.
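
For anyone who wants to check this on their own machine, a rough timing
sketch could look like the following. It is not a rigorous benchmark (a
harness such as JMH would be needed for that), and the class name, method
name, key format and element count are all made up for illustration; none
of them come from biojava.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of biojava: times n put() calls on a map.
final class MapPutTiming {

    // Inserts n entries with distinct String keys and returns the elapsed
    // wall-clock time in nanoseconds.
    static long timePutsNanos(Map<String, Integer> map, int n) {
        long start = System.nanoTime();
        for (int i = 0; i < n; i++) {
            map.put("seq" + i, i);
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int n = 100_000; // arbitrary size chosen for the sketch

        // A few warm-up rounds so the JIT has compiled the hot loop.
        for (int round = 0; round < 5; round++) {
            timePutsNanos(new HashMap<>(), n);
            timePutsNanos(new LinkedHashMap<>(), n);
        }

        long hashNanos = timePutsNanos(new HashMap<>(), n);
        long linkedNanos = timePutsNanos(new LinkedHashMap<>(), n);
        System.out.printf("HashMap: %d ms, LinkedHashMap: %d ms%n",
                hashNanos / 1_000_000, linkedNanos / 1_000_000);
    }
}
```

The absolute numbers will vary with JVM, heap settings and key type, so
treat the output as an order-of-magnitude check only.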

On Wed, Nov 18, 2009 at 12:07 PM, Matias Piipari
<matias.piipari at gmail.com> wrote:

> According to this performance benchmark:
> http://www.artima.com/weblogs/viewpost.jsp?thread=122295
>
> "*LinkedHashMap* tends to be slower than *HashMap* for insertions because
> it maintains the linked list (to preserve insertion order) in addition to
> the hashed data structure. Because of this list, iteration is faster."
>
> Relative to HashMap, LinkedHashMap takes approx. 2.x times as long for put,
> approx. 1.3x for get, and 0.6x for iteration.
> I wouldn't expect any real-life performance change from this (key-value
> insertion into a sequence database object is probably not a common
> performance bottleneck in apps using biojava; am I correct?)
>
> On Mon, Nov 16, 2009 at 2:59 PM, Richard Holland
> <holland at eaglegenomics.com> wrote:
>
> > Sounds like a plan - are there any performance implications?
> >
> > On 16 Nov 2009, at 14:53, Matias Piipari wrote:
> >
> > > Hello
> > >
> > > I'd like to propose a change to org.biojava.bio.seq.db.HashSequenceDB:
> > > swap the map implementation used to store the sequence-by-id map from
> > > HashMap to LinkedHashMap. This would allow iterating the sequences in a
> > > defined order (the order in which they were added to the map).
> > >
> > > Best wishes
> > > Matias
> > >
> > >
> > > An example patch relative to rev 7059:
> > >
> > > --- src/org/biojava/bio/seq/db/HashSequenceDB.java      (revision 7059)
> > > +++ src/org/biojava/bio/seq/db/HashSequenceDB.java      (working copy)
> > > @@ -22,8 +22,8 @@
> > > package org.biojava.bio.seq.db;
> > >
> > > import java.io.Serializable;
> > > -import java.util.HashMap;
> > > import java.util.Iterator;
> > > +import java.util.LinkedHashMap;
> > > import java.util.Map;
> > > import java.util.Set;
> > >
> > > @@ -203,6 +203,6 @@
> > >   public HashSequenceDB(org.biojava.bio.seq.db.IDMaker idMaker, String name) {
> > >     this.idMaker = idMaker;
> > >     this.name = name;
> > > -    this.sequenceByID = new HashMap();
> > > +    this.sequenceByID = new LinkedHashMap();
> > >   }
> > > }
> > > _______________________________________________
> > > biojava-dev mailing list
> > > biojava-dev at lists.open-bio.org
> > > http://lists.open-bio.org/mailman/listinfo/biojava-dev
> >
> > --
> > Richard Holland, BSc MBCS
> > Operations and Delivery Director, Eagle Genomics Ltd
> > T: +44 (0)1223 654481 ext 3 | E: holland at eaglegenomics.com
> > http://www.eaglegenomics.com/
> >
> >
>
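
To make the proposal at the bottom of the thread concrete, here is a
minimal sketch of the behaviour LinkedHashMap would add: iteration follows
the order in which keys were first inserted. It uses a plain
Map<String, String> with placeholder values rather than the real
HashSequenceDB, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class, not part of biojava.
final class InsertionOrderDemo {

    // Inserts the given ids into a LinkedHashMap and returns the keys in
    // the order the map iterates them.
    static List<String> idsInIterationOrder(String... ids) {
        Map<String, String> sequenceByID = new LinkedHashMap<>();
        for (String id : ids) {
            sequenceByID.put(id, "sequence for " + id); // placeholder value
        }
        return new ArrayList<>(sequenceByID.keySet());
    }

    public static void main(String[] args) {
        // prints [seqC, seqA, seqB]
        System.out.println(idsInIterationOrder("seqC", "seqA", "seqB"));
    }
}
```

With a plain HashMap the iteration order is unspecified and depends on the
hash codes of the keys, so the same calls could list the ids in a different
order.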
