HashMap vs HashTable vs ConcurrentHashMap
This article is more-or-less like the pre-requisite to understand the ConcurrentHashMaps and why were they introduced when we already had HashTables and HashMaps. All these 3 are the Data structures to store the key-value pairs, with the difference of their behavior in multi-threading systems. Let's study each of them briefly:
HashMaps:
Stores key-value pairs, and is best suitable in non-threaded applications, as this is not synchronized and hence not thread-safe. For more details of its working, see https://goo.gl/tMVF3g. Its methods are not synchronized, so please resort to other Maps for multithreaded applications, as explained below.
Collections.SynchronizedMap:
As explained above, the HashMap is not synchronized, so what if we can wrap its methods in the synchronized block? That is exactly what Collections.SynchronizedMap does for us. So to still use HashMaps and get some synchronization, we can use Collections.synchronizedMap(HashMap), which does nothing fancy but provides the Synchronization wrapper on its methods as:
HashTables:
As the java documentation states: Hashtable is a synchronized implementation of Map. Think of this as same as Collections.SynchronizedMap, as they both do provide the same degree of synchronization and the internal implementation is also almost the same.
So it looks the same as Collections.SynchronizedMap(Map) implementation.
If they both do the same function and have the same implementation, then why was Collections.SynchronizedMap introduced?
The method Collections.SynchronizedMap(Map) returns an instance of SynchronizedMap which is an inner class to the Collections class. This class has all its methods in a Synchronized block with a mutex. The difference lies in the mutex here. The inner class SynchronizedMap has two constructors, one which takes only Map as an argument and another which takes a Map and an Object (mutex) as an argument. By default, if one uses the first constructor of passing only a Map, this is used as a mutex. Though, the developer is allowed to pass another object of the mutex as a second argument by which the lock on the Map methods would be only on that Object and hence less restrictive than HashTable.
Hence, HashTable uses method level synchronization but Collections.SynchronizedMap(Map) provides flexibility to developer lock on provided mutex with Synchronized block.
There are other differences in the HashMap/Hashtables also like Hashtable does not allow Null Values, is the part of legacy Dictionary class, etc. but we are focusing only in terms of multithreading. Can refer to https://stackoverflow.com/questions/40471/differences-between-hashmap-and-hashtable and related posts for more differences.
ConcurrentHashMap:
Finally comes the ConcurrentHashMapwhich provides the best of synchronization among all the different Maps. ConcurrentHashMap is a thread-safe collection and intended to be used as primary Map implementation especially for multi-threaded and Concurrent environments. Thread-safety is achieved by dividing the whole Map into different partitions based upon Concurrency level and only locking particular portion instead of locking the whole Map, as done in HashTables/SynchronizedMaps.
As opposed to the HashTables where every read/write operation needs to acquire the lock, there is no locking at the object level in ConcurrentHashMaps and is much finer granular at a hashmap bucket level.
It never locks the whole Map, instead, it divides the map into segments and locking is done on these segments. ConcurrentHashMap is separated into different regions(default-16) and locks are applied to them. When setting data in a particular segment, the lock for that segment is obtained. This means that two updates can still simultaneously execute safely if they each affect separate buckets, thus minimizing lock contention and so maximizing performance.
The internal implementation of ConcurrentHashMaps will be discussed in the upcoming thread. Stay tuned :)