Beware of SortedSet<T> and a custom Comparer

Originally posted on: http://geekswithblogs.net/akraus1/archive/2016/03/22/173599.aspx

When dealing with custom data structures, funny things can happen. This is especially true if multiple threads can mutate state and several locks are in the mix. Code review is quite challenging with such code, and some bugs still slip through. After quite a lot of debugging, the final bug turned out to have nothing to do with multithreading but with a problematic IComparable implementation. What will the following small program print as output?

using System;
using System.Collections.Generic;

namespace SortedSetTest
{
    class Data : IComparable<Data>
    {
        public byte[] Payload
        {
            get;
            set;
        }

        public int CompareTo(Data other)
        {
            return this.Payload == other.Payload ? 0 : 1;
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            var set = new SortedSet<Data>();

            var payload = new byte[10];
            var payload2 = new byte[20];
            var payload3 = new byte[30];

            var d1 = new Data { Payload = payload };
            var d2 = new Data { Payload = payload2 };
            var d3 = new Data { Payload = payload3 };

            set.Add(d1);
            set.Add(d2);
            set.Add(d3);

            set.Remove(d1);
            set.Remove(d3);
            set.Remove(d2);

            Console.WriteLine($"Final Count: {set.Count}");
        }
    }
}

You can choose from

0
1
2
3
Other
Exception

And the winner is:

Final Count: 1

This was unexpected. If you break the rules of the IComparable interface you can get anything from never-terminating sorts to silent data structure corruption. The documentation of the generic interface is not particularly helpful, but the non-generic version spells the rules out explicitly:

1) A.CompareTo(A) must return zero.
2) If A.CompareTo(B) returns zero, then B.CompareTo(A) must return zero.
3) If A.CompareTo(B) returns zero and B.CompareTo(C) returns zero, then A.CompareTo(C) must return zero.
4) If A.CompareTo(B) returns a value other than zero, then B.CompareTo(A) must return a value of the opposite sign.
5) If A.CompareTo(B) returns a value x not equal to zero, and B.CompareTo(C) returns a value y of the same sign as x, then A.CompareTo(C) must return a value of the same sign as x and y.
6) By definition, any object compares greater than (or follows) null, and two null references compare equal to each other.

If you break any of these rules you get undefined behavior, which depends entirely on the collection class you use. Never take shortcuts when implementing the IComparable interface. In my case rules 4 and 6 were violated.
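Some of these rules can be checked mechanically. The following is a minimal sketch, assuming a handful of sample values is enough to expose a violation; `ComparableContract`, `FindViolations` and `BrokenData` are hypothetical names made up for this illustration, not BCL types. It checks only rules 1, 2 and 4 by brute force.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of the BCL): brute-force check of
// rules 1, 2 and 4 over a handful of sample values.
static class ComparableContract
{
    public static List<string> FindViolations<T>(params T[] samples)
        where T : IComparable<T>
    {
        var violations = new List<string>();
        for (int i = 0; i < samples.Length; i++)
        {
            // Rule 1: every value must compare equal to itself.
            if (samples[i].CompareTo(samples[i]) != 0)
                violations.Add($"Rule 1: sample {i} is not equal to itself");

            for (int j = i + 1; j < samples.Length; j++)
            {
                int ab = Math.Sign(samples[i].CompareTo(samples[j]));
                int ba = Math.Sign(samples[j].CompareTo(samples[i]));

                // Rules 2 and 4: the two signs must be exact opposites
                // (zero pairs with zero).
                if (ab != -ba)
                    violations.Add($"Rule 2/4: samples {i} and {j} compare as {ab} and {ba}");
            }
        }
        return violations;
    }
}

// Same comparison logic as the Data class in the quiz.
class BrokenData : IComparable<BrokenData>
{
    public byte[] Payload { get; set; }
    public int CompareTo(BrokenData other) => Payload == other.Payload ? 0 : 1;
}

static class Program
{
    static void Main()
    {
        var samples = new[]
        {
            new BrokenData { Payload = new byte[10] },
            new BrokenData { Payload = new byte[20] },
            new BrokenData { Payload = new byte[30] },
        };

        // Each of the three pairs compares as 1 in both directions.
        foreach (string violation in ComparableContract.FindViolations(samples))
            Console.WriteLine(violation);
    }
}
```

A check like this fits well into a unit test for any type that implements IComparable<T>. It cannot prove that an implementation is correct, but it catches the kind of shortcut taken here.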

Proof for rule 4: A and B hold different non-null arrays

A.CompareTo(B) = 1

B.CompareTo(A) = 1, which rule 4 forbids: the result would have to be negative.

The spirit of this rule is that in order to sort things you need a working < operator, and that requires this property. SortedSet<T> uses a red-black tree for data storage. Since the < comparison is broken, the tree operations that rely on a working comparison can fail in subtle ways, such as sometimes failing to remove an existing item from the collection. With a comparison that never returns a negative value, a lookup can never descend into a left subtree, so an item that ends up there after rebalancing can no longer be found.
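One way to make the silent failure visible is to look at the return value of Remove, which is true only if the item was found and removed. The sketch below reruns the quiz with that check added. The `Quiz` class and its `Run` method are hypothetical names for this illustration, and which call fails depends on the internal shape of the tree, so treat the per-call results as implementation dependent.

```csharp
using System;
using System.Collections.Generic;

class Data : IComparable<Data>
{
    public byte[] Payload { get; set; }

    // Deliberately broken, as in the quiz: never returns a negative value.
    public int CompareTo(Data other) => Payload == other.Payload ? 0 : 1;
}

static class Quiz
{
    // Adds three items, removes them in the quiz's order, logs the result
    // of every Remove call and returns the final count.
    public static int Run()
    {
        var d1 = new Data { Payload = new byte[10] };
        var d2 = new Data { Payload = new byte[20] };
        var d3 = new Data { Payload = new byte[30] };

        var set = new SortedSet<Data> { d1, d2, d3 };

        foreach (var (name, item) in new[] { ("d1", d1), ("d3", d3), ("d2", d2) })
        {
            bool removed = set.Remove(item);
            Console.WriteLine($"Remove({name}) returned {removed}, Count = {set.Count}");
        }
        return set.Count;
    }
}

static class Program
{
    static void Main()
    {
        Console.WriteLine($"Final Count: {Quiz.Run()}");
    }
}
```

In production code, an assertion on the return value of Remove for items that are known to be in the set would have pointed at the comparison method much earlier than the debugging session described above.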

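So how can this be fixed? That all depends on what you treat as equal. If you care only about array sizes and not their contents, two different arrays of the same length count as equal and the set keeps only one of them. In that case your CompareTo method becomes: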

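        public int CompareTo(Data other)
        {
            // Rule 6: any object compares greater than null.
            if (other == null)
            {
                return 1;
            }

            int lret = 0;
            if (Object.ReferenceEquals(Payload, other.Payload))
            {
                lret = 0; // same array instance, or both payloads are null
            }
            else if (other.Payload == null)
            {
                lret = 1;
            }
            else if (Payload == null)
            {
                lret = -1;
            }
            else
            {
                lret = Payload.Length.CompareTo(other.Payload.Length);
            }
            return lret;
        }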

The fixed version also follows rule 6 for null values, which is good practice. For some time I thought that SortedSet<T> was broken, but as usual the BCL classes were fine and the fault was in our code. How hard can it be to write a comparison method? It turns out there are six rules to follow, which is quite a lot for a seemingly simple function.
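If the contents of the arrays matter rather than just their lengths, one possible alternative is a separate comparer passed to the SortedSet<T> constructor. This is a minimal sketch and not the code from the original project; `ByteArrayContentComparer` is a hypothetical name, and the choice of byte-by-byte ordering with the shorter array first on a shared prefix is an assumption.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical comparer: orders byte arrays by content, byte by byte,
// and puts the shorter array first when one is a prefix of the other.
sealed class ByteArrayContentComparer : IComparer<byte[]>
{
    public int Compare(byte[] x, byte[] y)
    {
        if (ReferenceEquals(x, y)) return 0; // same instance, or both null
        if (x == null) return -1;            // null sorts before everything else
        if (y == null) return 1;

        int common = Math.Min(x.Length, y.Length);
        for (int i = 0; i < common; i++)
        {
            int c = x[i].CompareTo(y[i]);
            if (c != 0) return c;
        }
        return x.Length.CompareTo(y.Length);
    }
}

static class Program
{
    static void Main()
    {
        var set = new SortedSet<byte[]>(new ByteArrayContentComparer());

        set.Add(new byte[] { 1, 2, 3 });
        set.Add(new byte[] { 1, 2, 3 }); // same contents: treated as a duplicate
        set.Add(new byte[] { 1, 2 });
        Console.WriteLine(set.Count);    // prints 2

        // A different instance with equal contents is enough to remove the entry.
        set.Remove(new byte[] { 1, 2, 3 });
        Console.WriteLine(set.Count);    // prints 1
    }
}
```

Keeping the ordering in an IComparer<T> leaves the element type free of comparison logic and lets different sets order the same data differently. The price is that every comparison may walk the whole array, and that the arrays must not be modified while they are in the set.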
