Collection · Java

HashMap – How it works?


On very high level, Hash Map is used to store data in form of key value pair.

Hash Map is a Hash table based implementation of the Map interface. This implementation provides all of the optional map operations, and permits null values and the null key. (The HashMap class is roughly equivalent to Hashtable, except that it is unsynchronized and permits nulls.) This class makes no guarantees as to the order of the map; in particular, it does not guarantee that the order will remain constant over time.” … from Java API

After reading this definition, some question comes in my mind:

  1. How hash map store data internally?
  2. What happen when I try to store some new information in map?
  3. How hash map find my data?

And when I tried to explore it, I find it more and more interesting.

HashMap has a static class named Entry which implements Map.Entry interface. The Entry class looks like:

static class Entry implements Map.Entry {
    final Object key;
    Object value;
    final int hash;
    Entry next;
    Entry (int i, Object obj, Object obj1, Entry entry) {
        value = obj1;
        next = entry;
        key = obj;
        hash = i;
    } 
    // Other methods
}

Every time we insert a <key,value> into hashmap using .put() method, a new Entry object is created (not true is some cases. if key already exists, then it just replace the value).  Map internally used two data structures to manage/store data:

  1. Array
  2. Link List

This image shows how hashmap manage data. Here

  • Each index of array is a bucket
  • To identify the bucket for any <key,value>, Hash map use key.hashCode() and perform some operation:
    Bucket (index) =HashMap.indexFor (HashMap.hash(key.hashCode()), entryArray.length)
    It means, two keys with different hashCode can fall under same bucket.
  • If a bucket is empty (table[i] is null) then the Entry object is simply inserted at ith position
    table[i] = new Entry(hash, key, value, null)
  • If a bucket has some values, then the following algo is used:
    Entry entry = table[i]Table[i] = new Entry(hash,key,value,entry)It means, the latest entry resides on the top of the bucket.
  • If a key already exists in a map, then it just replace the value and return the old value
    If (entry.key == key || key.equals(entry.key)
    Then
        Object oldValue = entry.value;
        Entry.value = newValue;
        Return oldValue;
    End if
    
  • In case of new key,  map add the key, value (a new Entry object with key value) and return null
  • Entry.hash is not the hashCode of key, but HashMap use its own hash algo to create hash based on key.hashCode(). So it’s like:
    entry.hash = HashMap.hash(key.hashCode());
  • HashMap allow single null as key, in case of null key, it create a dummy object and use it as key.
    static final Object NULL_KEY = new Object();
  • hashMap.get(Obj) and hashMap.containsKey(obj), both method iterate through the map. Avoid to use extra call of containsKey().
    If (null != map.containsKey(obj)) {
        myObj = map.get(obj);
        //Perform other operations
    }

    This can be easily replaced by:

    myMap  = map.get(obj);
    if (null != myMap) {
        // Perform other operations
    }
    

     

24 thoughts on “HashMap – How it works?

    1. @Javin,
      What use is there in exposing a candidate who doesnt know the details of hash map? Who really cares when they know when and where to use it. Sounds like self righteous bs to me

  1. Can u provide the realistic example of how the key value pairs are arranged in the image given by u above. That would help me better understanding.

  2. Pingback: JavaPins
  3. Nice article. Thanks!

    Why is hash also stored in Entry? Is it to avoid recomputing during rehashing?

    >> HashMap.indexFor (hash, entryArray.length)

    Do you know what does the indexFor implementation do? Does it just compute “mod (hash, length)”?

    1. Because of its datastructure, we need some object which can contain multiple values (#, key, value, next), so a entry class is used for same to represent a single entry in hashmap.

  4. How two keys with different hashcode could fall in the same bucket?? Could you please elaborate? It’s not possible..

    1. The .hashCode() method return a key which is used as bucket identifier. Multiple objects can have same hashcode (Though you can override this behavior and can implement your own algo to generate a unique key).
      The intent is to reduce the length of first level array, otherwise we would have endup with a flat datastructure (a arraylist of objects) which means poor performance in lookup of objects.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s