Originally posted on: http://geekswithblogs.net/akraus1/archive/2015/01/26/161276.aspx
Since most of us use multiple threads, we get all sorts of race conditions, which we can solve with proper locking and concurrent data structures. A simple case: you have concurrent access to an object which exposes a Dictionary<string,string>, and you get exceptions back when concurrent writers add the same data to it. You then need to either coordinate the writers or use a different dictionary that allows concurrent adds. As it turns out, ConcurrentDictionary is just the collection you were searching for, and your exceptions are gone. You let the tests run. All green; check it in and submit it to the main branch.
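A minimal sketch of that kind of change (the class and key names here are illustrative, not from any real code base): Dictionary<string,string>.Add throws on duplicate keys and is not safe for concurrent writers, while ConcurrentDictionary.TryAdd is safe from any number of threads and simply returns false for a key that is already present.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

class Demo
{
    static void Main()
    {
        var data = new ConcurrentDictionary<string, string>();

        Parallel.For(0, 1000, i =>
        {
            // Many threads race to add the same 100 keys; for each key exactly
            // one writer wins and the rest get false instead of an exception.
            data.TryAdd("key" + (i % 100), "value" + i);
        });

        Console.WriteLine(data.Count); // 100 distinct keys
    }
}
```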
A long time later, at the end of the project, the until now ignored performance issues tend to get identified and fixed. Someone calls you: your check-in from a long time ago is the root cause of a major performance issue, and you are showing up as Mount Everest in the VirtualAlloc reports.
How can that be? The problem with the concurrent collections, which claim to be largely lock free, is that to achieve uncontended writes they allocate their key data structures per core. This means you are allocating memory proportional to the number of cores. That is not a problem if you have relatively few instances around, but if you exchange the dictionary inside a data structure that is heavily allocated, things get much slower. You can test it for yourself with a small sample app like this:
```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Diagnostics;

class DataObject // Original version
{
    public IDictionary<string, string> Data { get; private set; }

    public DataObject()
    {
        Data = new Dictionary<string, string>();
    }
}

class FixedDataObject // "Fixed" version with ConcurrentDictionary
{
    public IDictionary<string, string> Data { get; private set; }

    public FixedDataObject()
    {
        Data = new ConcurrentDictionary<string, string>();
    }
}

class Program
{
    static void Main(string[] args)
    {
        List<DataObject> dataList = new List<DataObject>();
        List<FixedDataObject> dataListFixed = new List<FixedDataObject>();
        var sw = Stopwatch.StartNew();
        const int Runs = 300 * 1000;
        for (int i = 0; i < Runs; i++)
        {
            if (args.Length == 0) // No args: use old version
            {
                dataList.Add(new DataObject());
            }
            else // Some args passed: use "fixed" version
            {
                dataListFixed.Add(new FixedDataObject());
            }
        }
        sw.Stop();
        Console.WriteLine("{0:N0} did take {1:F2}s", Runs, sw.Elapsed.TotalSeconds);
        GC.KeepAlive(dataList);
        GC.KeepAlive(dataListFixed);
    }
}
```
Old Version
300.000 did take 0,14s
Fixed Version with Concurrent Dictionary
300.000 did take 1,08s
Update
On a 24 core server it takes 3,3s and it consumes 1,4GB of memory. Data structures which allocate proportionally to the number of cores for the sake of thread performance do not sound like a good idea to me.
This is over 7 times slower and introduces tons of GCs in your application, limiting what you can do in parallel even further. The worst part is that the bigger the server you run it on, the slower it will get (more cores means more memory to allocate!). On a slow dev machine with 2 or 4 cores you might not notice it much. But if you deploy this on real server hardware with 40 cores, you have a really bad performance problem at hand. That is the reason why you need to do performance testing on virtually every hardware configuration you support.
The fix was to go back to a plain dictionary with a lock. ConcurrentDictionary does let you pass a concurrency level to its constructor, which controls how many internal locks (and hence isolated-write segments) it allocates, but even with a reduced level it was still a factor of two slower.
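A sketch of what "a plain dictionary with a lock" can look like — SynchronizedData is an illustrative name, not the class from the real code base — plus the ConcurrentDictionary constructor overload that takes an explicit concurrency level:

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;

// Illustrative fix: a plain Dictionary guarded by a single lock object,
// instead of a per-core array of locks and buckets.
class SynchronizedData
{
    readonly Dictionary<string, string> _data = new Dictionary<string, string>();
    readonly object _sync = new object();

    public void AddOrUpdate(string key, string value)
    {
        lock (_sync)
        {
            _data[key] = value; // the indexer overwrites instead of throwing on duplicates
        }
    }

    public bool TryGetValue(string key, out string value)
    {
        lock (_sync)
        {
            return _data.TryGetValue(key, out value);
        }
    }
}

class Alternative
{
    // If you stay with ConcurrentDictionary, this overload caps the number
    // of internal locks; the capacity value here is an arbitrary example.
    static readonly ConcurrentDictionary<string, string> Small =
        new ConcurrentDictionary<string, string>(concurrencyLevel: 1, capacity: 31);
}
```

The single lock trades some write contention for a far smaller per-instance footprint, which is the right trade when you allocate hundreds of thousands of these objects.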
So beware: by simply replacing non-concurrent data structures with their lock-free but much more memory hungry counterparts, you might be making your system much slower. In effect, the bigger the server got, the slower we were loading data. That is definitely not what the customer paid for.