ConcurrentHashMap Internal Working in Java

Anmol Sehgal
3 min readDec 24, 2018

--

Suggest going through https://goo.gl/GQ1cQF before digging into the internal working of ConcurrentHashMap. Have explained there about why we need to resort to ConcurrentHashMap and its alternatives present in Java.

As opposed to the HashTables where every read/write operation needs to acquire the lock, there is no locking at the object level in CHM and locking is much granular at a hashmap bucket level.
CHM never locks the whole Map, instead, it divides the map into segments and locking is done on these segments. CHM is separated into different regions(default-16) and locks are applied to them. When setting data in a particular segment, the lock for that segment is obtained. This means that two updates can still simultaneously execute safely if they each affect separate buckets, thus minimizing lock contention and so maximizing performance.

Let's discuss the Segments:

The segment in CHM is nothing but “a specialized hash table”.

This is the pruned down version of the Segments. Observe that it is a reentrant lock.

So any write operation(remove/put/clear etc) will work in 3 steps:
1. Wait until it gets the lock on that Segment.
2. Do the operation.
3. Release the lock after it.

So the working of the Segments is pretty easy, and straightforward. It acts as the core part of CHM, having the functionality to do different operations of Map, and handling the locking mechanism also.

Now we only need to integrate these segments into the CHM, so that whenever CHM methods are called, they act in accordance with different segments, supporting multi-level write Operations- each segment obtaining its own lock and doing its operation independently of other Segments.

CHM Internal Implementation:

As discussed above, the Segment is itself responsible for write operations, wherein it obtains the lock, performs the write operation, and releases the lock after it. Simultaneously many different segments can do the same operation, each taking its own lock.

The CHM has the array of these segments, signifying the degree of synchronization it can handle. The default size is 16, meaning max 16 threads can work at a time.
Each thread can work on each segment during high concurrency and at most 16 threads can operate at max which simply maintains 16locks to guard each bucket of the ConcurrentHashMap.
For every Write operation, CHM methods will find which segment to go to and will call that segment’s method. Line: 40–54 are all write operations.

For the read operations, like Get/Size/isEmpty etc, it does not require any locks, so the CHM itself has their implementation.
i.e. for the write operations, the Segment(inner class) has real implementations, where the CHM methods internally call the segment methods.
But for Read operations, the CHM itself has the real implementation.

For example, the put operation in CHM is:

where it calls the put method implemented in Segment.

Whereas the get method in CHM is like:

To simplify this code, just think of it like its calculating the Segment and then iterating over all the Entries in that Segment until it founds the match. i.e. it itself has the implementation, thus requires no lock for the Read operation.

Summing up:

Different threads can work on the corresponding maintaining different locks to guard their bucket.
The read operations do not require any lock, and hence are faster.
The write operations first wait until they get the lock, then releases the lock after the operation is performed. Note that many write operations can take place at the same time, depending on the length of the segments[].

--

--

Responses (1)