[Biojava-dev] HashMap ==> LinkedHashMap in HashSequenceDB

Matias Piipari matias.piipari at gmail.com
Wed Nov 18 12:07:40 UTC 2009


According to this performance benchmark:
http://www.artima.com/weblogs/viewpost.jsp?thread=122295

"*LinkedHashMap* tends to be slower than *HashMap* for insertions because it
maintains the linked list (to preserve insertion order) in addition to the
hashed data structure. Because of this list, iteration is faster."

Relative to HashMap, LinkedHashMap takes approx 2.x the time for put, approx
1.3x for get, and 0.6x for iteration.
I wouldn't expect real-life performance changes from this (key-value
insertion into a sequence database object is probably not a common performance
bottleneck in apps using biojava - am I correct?).
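
To illustrate the ordering behaviour behind the proposal, here is a minimal
stand-alone sketch. It does not use HashSequenceDB itself; the class name
InsertionOrderSketch, the helper idsInIterationOrder and the ids seqC/seqA/seqB
are made up for the example. It only shows that LinkedHashMap iterates its
keys in insertion order, whereas HashMap makes no such promise.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-alone demo; not part of BioJava.
class InsertionOrderSketch {

    // Puts the ids into the given map in argument order, then returns the
    // keys in the order in which the map iterates over them.
    static List<String> idsInIterationOrder(Map<String, String> map, String... ids) {
        for (String id : ids) {
            map.put(id, "sequence for " + id); // placeholder value, not a real Sequence
        }
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        String[] ids = {"seqC", "seqA", "seqB"};

        // LinkedHashMap: iteration follows insertion order (seqC, seqA, seqB).
        System.out.println(idsInIterationOrder(new LinkedHashMap<>(), ids));

        // HashMap: iteration order is unspecified and may differ from insertion order.
        System.out.println(idsInIterationOrder(new HashMap<>(), ids));
    }
}
```

With LinkedHashMap the keys come back as seqC, seqA, seqB regardless of their
hash codes, which is the "defined order" the patch is after.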

On Mon, Nov 16, 2009 at 2:59 PM, Richard Holland
<holland at eaglegenomics.com> wrote:

> Sounds like a plan - are there any performance implications?
>
> On 16 Nov 2009, at 14:53, Matias Piipari wrote:
>
> > Hello
> >
> > I'd like to propose a change to org.biojava.bio.seq.db.HashSequenceDB: swap
> > the map implementation used to store the sequence-by-id map from HashMap to
> > LinkedHashMap. This would allow iterating the sequences in a defined order
> > (the order in which they were added to the map).
> >
> > Best wishes
> > Matias
> >
> >
> > An example patch relative to rev 7059:
> >
> > --- src/org/biojava/bio/seq/db/HashSequenceDB.java      (revision 7059)
> > +++ src/org/biojava/bio/seq/db/HashSequenceDB.java      (working copy)
> > @@ -22,8 +22,8 @@
> > package org.biojava.bio.seq.db;
> >
> > import java.io.Serializable;
> > -import java.util.HashMap;
> > import java.util.Iterator;
> > +import java.util.LinkedHashMap;
> > import java.util.Map;
> > import java.util.Set;
> >
> > @@ -203,6 +203,6 @@
> >   public HashSequenceDB(org.biojava.bio.seq.db.IDMaker idMaker, String name) {
> >     this.idMaker = idMaker;
> >     this.name = name;
> > -    this.sequenceByID = new HashMap();
> > +    this.sequenceByID = new LinkedHashMap();
> >   }
> > }
> > _______________________________________________
> > biojava-dev mailing list
> > biojava-dev at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/biojava-dev
>
> --
> Richard Holland, BSc MBCS
> Operations and Delivery Director, Eagle Genomics Ltd
> T: +44 (0)1223 654481 ext 3 | E: holland at eaglegenomics.com
> http://www.eaglegenomics.com/
>
>


