AtomicInteger, Volatile, Synchronized and Raw: Performance and safety

Not getting race conditions worked out can
cause a bit of trouble...
Synchronizing a single value between threads is the corner stone of  a lot of high performance thread safe algorithms; but which is the best way to do it?

The super naive programmer might just define a shared int field. This does not work because the JVM makes no guarantees at all about updates to such a field from different threads.

A slightly more experienced developer might realise that if a field is marked 'volatile' then the JVM is guaranteed to synchronise its value between all threads before any read or write. This can solve a lot of race conditions but not all. The code below increments a shared field from multiple threads (10 actually). It does this for the volatile integer using the ++ precrement operator. But it does not work!

The reason it does not work is that the unary ++ precrement operator is not actually one operation even though it looks like it is in the source code. What actually happens is something like this:

  1. get the value of x
  2. increase the value gotten
  3. set the value of x to the new value

What looks in the code like a single operation is actually three. Consider two threads performing this operation. Something like this can happen:

  1. Thread A gets x the value 20
  2. Thread B gets the value 20
  3. Thread A updates the value to 21
  4. Thread B updates the value to 21
  5. Thread A sets x value to 21
  6. Thread B sets the x value to 21

The code's intent is that x should end up at 22 but it only makes it to 21!

The old solution to this was to use synchronization. By placing the increment in a synchronized block or method we ensure that only one thread at a time tries to increment the value and so the race does not happen. This is much slower than using a volatile. However, now that there is the java.util.concurrent.atomic.AtomicInteger class we can use that which has a guaranteed thread consistent and thread safe incr method.

So, here is the code to compare the effectiveness and performance of these four approaches:

package com.nerdscentral.threading;

import java.util.ArrayList;
import java.util.concurrent.atomic.AtomicInteger;

public class CounterDemo {

    private static AtomicInteger safeCounter;
    private static volatile int unsafeCounter;
    private static int brokenCounter = 0;
    private static int lockedCounter = 0;
    private static volatile boolean print;

    private static void reset(){
        safeCounter = new AtomicInteger(0);
        unsafeCounter = 0;
        brokenCounter = 0;
        lockedCounter = 0;
    }
    
    private static synchronized void incr(){
        ++lockedCounter;
    }
 
    /**
     * @param args
     */
    public static void main(String[] args) {
        for (int i = 0; i < 2; ++i) {
            run10(new Runnable() {
                @Override
                public void run() {
                    doSafeCount();
                }
            });

            run10(new Runnable() {
                @Override
                public void run() {
                    doUnSafeCount();
                }
            });

            run10(new Runnable() {
                @Override
                public void run() {
                    doBrokenCount();
                }
            });
            
            run10(new Runnable() {
                @Override
                public void run() {
                    doLockedCount();
                }
            });
        }
    }
    private static void run10(Runnable r){
        print=false;
        reset();
        for(int i=0;i<10;++i)run10Inner(r);
        print=true;
        reset();
        System.out.println("Running:");
        run10Inner(r);
    }
    private static void run10Inner(Runnable r){
        ArrayList<Thread> l=new ArrayList<>();
        for(int i=0;i<10;++i){
            Thread t=new Thread(r);
            l.add(t);
            t.start();
        }
        for(Thread t:l){
            try {
                t.join();
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
    }
    
    private static void doSafeCount(){
        long t = System.currentTimeMillis();
        for(int i=0;i<1000000;++i){
            safeCounter.incrementAndGet();
        }
        if(print)System.out.println("Time for safe: " + (System.currentTimeMillis()-t) + " ms count: " + safeCounter.get());
    }

    private static void doUnSafeCount(){
        long t = System.currentTimeMillis();
        for(int i=0;i<1000000;++i){
            ++unsafeCounter;
        }
        if(print)System.out.println("Time for unsafe: " + (System.currentTimeMillis()-t) + " ms count: " + unsafeCounter);
    }


    private static void doBrokenCount(){
        long t = System.currentTimeMillis();
        for(int i=0;i<1000000;++i){
            ++brokenCounter;
        }
        if(print)System.out.println("Time for unsafe: " + (System.currentTimeMillis()-t) + " ms count: " + brokenCounter);
    }

    private static void doLockedCount(){
        long t = System.currentTimeMillis();
        for(int i=0;i<1000000;++i){
            incr();
        }
        if(print)System.out.println("Time for locked: " + (System.currentTimeMillis()-t) + " ms count: " + lockedCounter);
    }
    
}

On my I5 dual core (hyperthreading) laptop using Java 7 64 bit (Windows 7) I get these results:

Running:
Time for safe: 167 ms count: 4077610
Time for safe: 277 ms count: 6840788
Time for safe: 237 ms count: 7414475
Time for safe: 224 ms count: 8673310
Time for safe: 332 ms count: 8959272
Time for safe: 390 ms count: 9565839
Time for safe: 394 ms count: 9651832
Time for safe: 362 ms count: 9690484
Time for safe: 265 ms count: 9703624
Time for safe: 269 ms count: 10000000
Running:
Time for unsafe: 83 ms count: 1904174
Time for unsafe: 88 ms count: 2055691
Time for unsafe: 74 ms count: 2622605
Time for unsafe: 77 ms count: 2689059
Time for unsafe: 115 ms count: 2630179
Time for unsafe: 123 ms count: 2964788
Time for unsafe: 95 ms count: 3116405
Time for unsafe: 63 ms count: 3118322
Time for unsafe: 105 ms count: 3371756
Time for unsafe: 74 ms count: 3421776
Running:
Time for unsafe: 4 ms count: 1100794
Time for unsafe: 5 ms count: 1262734
Time for unsafe: 5 ms count: 1321789
Time for unsafe: 5 ms count: 1386841
Time for unsafe: 5 ms count: 2287001
Time for unsafe: 4 ms count: 2434221
Time for unsafe: 5 ms count: 2492480
Time for unsafe: 5 ms count: 2540790
Time for unsafe: 4 ms count: 3544195
Time for unsafe: 3 ms count: 3687263
Running:
Time for locked: 2209 ms count: 8424848
Time for locked: 2219 ms count: 8463942
Time for locked: 2548 ms count: 9684650
Time for locked: 2558 ms count: 9723639
Time for locked: 2565 ms count: 9744971
Time for locked: 2607 ms count: 9899460
Time for locked: 2608 ms count: 9908236
Time for locked: 2619 ms count: 9956980
Time for locked: 2620 ms count: 9963695
Time for locked: 2624 ms count: 10000000
Running:
Time for safe: 277 ms count: 7006247
Time for safe: 292 ms count: 7393084
Time for safe: 274 ms count: 7726587
Time for safe: 292 ms count: 8187426
Time for safe: 352 ms count: 8833463
Time for safe: 357 ms count: 8962597
Time for safe: 356 ms count: 9702507
Time for safe: 327 ms count: 9798052
Time for safe: 364 ms count: 9969450
Time for safe: 332 ms count: 10000000
Running:
Time for unsafe: 85 ms count: 1449270
Time for unsafe: 87 ms count: 1937214
Time for unsafe: 74 ms count: 2419149
Time for unsafe: 78 ms count: 1957854
Time for unsafe: 116 ms count: 2288905
Time for unsafe: 117 ms count: 2296963
Time for unsafe: 94 ms count: 2444976
Time for unsafe: 66 ms count: 2552693
Time for unsafe: 69 ms count: 2720608
Time for unsafe: 101 ms count: 2720608
Running:
Time for unsafe: 3 ms count: 945521
Time for unsafe: 4 ms count: 1359628
Time for unsafe: 5 ms count: 1517924
Time for unsafe: 3 ms count: 2151517
Time for unsafe: 5 ms count: 2702603
Time for unsafe: 4 ms count: 2826532
Time for unsafe: 5 ms count: 3370870
Time for unsafe: 6 ms count: 3437013
Time for unsafe: 4 ms count: 3911118
Time for unsafe: 4 ms count: 4094906
Running:
Time for locked: 2751 ms count: 9272425
Time for locked: 2909 ms count: 9795518
Time for locked: 2913 ms count: 9809248
Time for locked: 2945 ms count: 9911106
Time for locked: 2949 ms count: 9926976
Time for locked: 2951 ms count: 9928963
Time for locked: 2958 ms count: 9957494
Time for locked: 2960 ms count: 9964128
Time for locked: 2964 ms count: 9977498
Time for locked: 2967 ms count: 10000000

These show that only the AtomicInteger and locked (synchronized) approaches actually work. However, there is a price to pay. The simplest (and completely broken) approach is 10 times faster than using a volatile. The first approach which actually works is AtomicInteger which is 3 to 4 times slower than volatile. Using a synchronized method is very much slower (around 7 times) again.

This really shows the benefit of the AtomicInteger class. Not unsurprisingly, if we drop the thread count to 1 to remove contention and increase the loop count by 10 the results are very different. The order of approaches is the same but the difference is less. Also, the synchronized method performs massively better. This is expected as the JVM uses fast user space locks where possible (which is possible on Windows) which do not require a context switch for an uncontested point of mutual exclusion. Here are the results:

Running:
Time for safe: 113 ms count: 10000000
Running:
Time for unsafe: 94 ms count: 10000000
Running:
Time for unsafe: 21 ms count: 10000000
Running:
Time for locked: 259 ms count: 10000000
Running:
Time for safe: 113 ms count: 10000000
Running:
Time for unsafe: 88 ms count: 10000000
Running:
Time for unsafe: 21 ms count: 10000000
Running:
Time for locked: 251 ms count: 10000000

In conclusion: if you are are wanting to use an integer value for some form of synchronization between threads, use AtomicInteger!