Dictionaries C# Help

Dictionaries represent a sophisticated data structure that allows you to access an element based on a key. Dictionaries are also known as hash tables or maps. The main feature of dictionaries is fast lookup based on keys. You can also add and remove items freely. a bit like a List<T>, but without the performance overhead of having to shift subsequent items in memory.

It shows a simplified representation of a dictionary. Here employee-ids such as B4711 are the keys added to the dictionary. The key is transformed into a hash. With the hash a number is created to associate an index with the values. The index then contains a link to the value. The figure is simplified because it is possible that a single index entry can be associated with multiple values, and the index can be stored as a tree.

Capture

The .NET Framework offers several dictionary classes. The main class you can use is Dictionary<TKey. TVal ue>. This class offers nearly the same properties and methods as SortedList<TKey. TVeolue> discussed earlier; that’s why they are not repeated here.

Key Type

A type that is used as a key in the dictionary must override the method GetHashCode () of the Obj0ect class. Whenever a dictionary class needs to work out where an item should be located, it calls the GetHashCode () method. The int that is returned by GetHashCode () is used by the dictionary to calculate an index of where to place the element. Wedon’t go into this part of the algorithm. What you should know is that it involves prime numbers, so the capacity of a dictionary is a prime number.

The implementation of GetHashCode () must satisfy these requirements:

  1. The same object should always return the same value.
  2.  Different objects can return the same value.
  3.  It should execute as quickly as possible; it must be inexpensive to compute.
  4.  It must not throw exceptions.
  5.  It should use at least one instance field.
  6.  The hash code value should be evenly distributed across the entire range of numbers that an int can store.
  7.  At best, the hash code should not change during the lifetime of the object.

A good performance of the dictionary is based on a good implementation of the method GetHashCode ( ) 

What’s the reason for having hash code values evenly distributed across the range of integers? If two keys return hashes that give the same index, the dictionary class needs to start looking for the nearest available free location to store the second item – and will have to do some searching in order to retrieve this item later on. This is obviously going to hurt performance, and clearly, if lots of your keys are tending to give the same indexes for where they should be stored, this kind of clash becomes more likely. However, because of the way that Microsoft’s part of the algorithm works, this risk is minimized when the calculated values are evenly distributed between int .Min value and int .Max Value.

Besides having an implementation of GetHashCode ( ), the key type also must implement the I Equality. EquaLs () method or override the Equal s () method from the Object class. Because different key objects may return the same hash code, the method Equals () is used by the dictionary comparing keys. The dictionary examines if two keys A and Bare equal; it invokes A. Equals (B). This means that you must ensure that the following is always true:

If A. Equals (B) is true, then x, GetHashCode () and B. GetHashCode () must always return the same hash code.

This probably seems’a fairly subtle point, but it is crucial. If you contrived some way of overriding these methods so that the preceding statement was not always true, a dictionary that uses instances of this class as its keys would simply not work properly. Instead, you’d find funny things happening. For example, you might place an object in the dictionary and then discover that you could never retrieve it, or you might try to retrieve an entry and have the wrong entry returned.

For this reason, the C# compiler will display a compilation warning if you supply an override for Equals () but don’t supply an override for GetHashCode ( ).

For System. Object this condition is true, because Equals () simply compares references, and GetHashCode () actually returns a hash that is based solely on the address of the object. This means that hash tables based on a key that doesn’t override these methods will work correctly. However, the problem with this way of doing things is that keys are regarded as equal only if they are the same object. That means that when you place an object in the dictionary, you then have to hang onto the reference to the key. You can’t simply instantiate another key object later with the same value. If you don’t override Equals () and GetHashCode (), the type is not very convenient to use in a dictionary.

Incidentally, System. String implements the interface IEquatable and overloads GetHashcode () appropriately. Equals () provides value comparison, and GetHashCode () returns a hash based on the value of the string. Strings can be used conveniently as keys in dictionaries.

Number types such as Int32 also implement the interface IEquatable and overload GetHashCode (). However, the hash code returned by these types simply maps to the value. If the number you would like to use as a key is not itself distributed around the possible values of an integer, using integers as keys doesn’t fulfill the rule of evenly distributing key values to get the best performance. Int32 is not meant to be used in a dictionary.

If you need to use a key type that does not implement IEquatable and override GetHashCode according to the key values you store in the dictionary, you can create a comparer implementing the interface IEquali tyComparer<T>. IEquali tyComparer<T> defines the methods GetHashCode () and Equals () with an argument of the object passed, so you can offer an implementation different from the object type itself. An overload of the Dictionary<TKey, TVal1le> constructor allows passing.an object implementing IEquali tyComparer<T>. If such an object is assigned to the dictionary, this class is used to generate the hash codes and compare the keys.

Dictionary Example

The dictionary example is a program that sets up a dictionary of employees. The dictionary is indexed by Employee ld objects, and each item stored in the dictionary is an Employee object that stores details of an employee.

The struct Employee ld is implemented to define a key to be used in a dictionary. The members of the class are a prefix character and a number for the employee. Both of these variables are read-only and can be initialized only in the constructor. A key within the dictionary shouldn’t change, and this way that is guaranteed. The fields are filled within the constructor. The ToString () method is overloaded to get a string representation of the employee ID. As required for a key type, Employee ld implements the interface IEquatable and overloads the method GetHashCode ().

Capture

Capture

The Equals () method that is defined by the IEquatable<T> interface compares the values of two Employee ld objects and returns true if the both values are the same. Instead of implementing the Equals () method from the IEquatable<T> interface, you can also override the Equals () method from the Obj ect class:

public bool Equals (Employeeld other)
{
if (other == null) return false;
return (prefix == other.prefix &&
number == other.number);

}

With the number variable, a value from 1 to around 190,000 is expected for the ’employees. This doesn’t fill the range of an integer, The algorithm used by GetHashCode () shifts the number 16 bits to the left, then does an XOR with the original number, and finally multiplies the result by the hex value 15051505. The hash code is fairly distributed across the range of an integer.

public override int GetHashCode()
{
return (number A number « 16) • Ox15051505;

}

On the Internet, you can find a lot more complex algorithms that have a better distribution across the integer range. You can also use the GetHashCode () method of a string to return a hash.

The Employee class is a simple entity class containing the name, salary, and ID of the employee. The constructor initializes all values, and the method ToString () returns a string representation of an instance. The implementation of ToString () uses a format string to create the string representation for performance reasons.

Capture

In the Main () method of the sample application, a new Dictionary<TKey, TValue> instance is created, where the key is of type Employeeld and the value is of type Employee. The constructor
allocates a capacity of 31 elements. Remember, the capacity is based on prime numbers. However, when you assign a value that is not a prime number, you don’t need to worry. The Dictionary<TKey, TValue> class itself takes the next prime number that follows the integer passed to the constructor to allocate the capacity. The employee objects and IDs are created and added to the dictionary with the Add () method. Instead of using the Add () method, you can also use the indexer to add keys and values to the dictionary, as shown with the employees Carl and Matt:

Capture

After the entries are added to the dictionary, inside a while loop employees are read from the dictionary. The user is asked to enter an employee number to store in the variable userlnput. The user can exit the application by entering X. If the key is in the dictionary, it is examined with the TryGetValue () method of the Dictionary<TKey, TValue> class. TryGetValue () returns true ifthe key is found and false otherwise. If the value is found, the value associated with the key is stored in the employee variable. This value is written to the console.

You can also use an indexer of the Dictionary<TKey, TValue> class instead ofTryGetvalue ( ) to access a value stored in the dictionary. However, if the key is not found, the indexer throws an exception of type KeyNotFoundException.

while (true)
{
Console.Write(
“Enter employee id (X to exit» ‘);
string userInput = Console.ReadLine();
userInput = userInput.ToUpper();
if (userInput == “X”) break;
EmployeeId id = new EmployeeId(userInput);
Employee employee;
if (!employees.TryGetValue(id,
out employee))
Console.WriteLine(“Employee with id ” +
“{O} does not exist”, id);
}
else
Console.WriteLine(employee);

}

}

}

Running the application produces the following output:

Enter employee 10 (format:A999999, X to exit» C7102
C007102: Jeff Gordon $5,164,580.00
Enter employee ID (format:A999999, X to exit» F7908
F007908: Carl Edwards $3,285,710.00
Enter empltyee 10 (format:A999999, X to exit)> X

Lookup

Dictionary<TKey, TVal ue> suppo~ts only one value per key. The new Class Lookup<TKey, TElement> that is part of .NET 3.5 resembles a Dictionary<TKey, TValue> but maps keys to a collection of values. This class is implemented in the assembly System. Core and defined with the namespace System. Linq.

Properties and methods of Lookup<TKey, TElement> are described in the following table:

Capture

Lookup<TI<ey, TElement> cannot be created like a normal dictionary. Instead, you have to invoke the method ToLookup () that returns a Lookup<TKey, TElement> object. The method ToLookup () is an extension method that is available with every class implementing IEnumerable<T>. In the following example, a list of Racer objects is filled. Because List<T> implements IEnumerable<T>, the ToLookup () method can be invoked on the racers list. This method requires a delegate of type Func<TSource, TKey> that defines the selector of the key. Here the racers are selected based on the country by using the Lambda expression r => r. country. The foreach loop accesses only the racers from Australia by using the indexer.

Capture

The output shows the racers from Australia:

Alan Jones 
Jack Brabham.

Other Dictionary Classes

Dictionary<TKey, TValue> is the major dictionary class from the framework. There are some more classes, and of course there are also some non-generic dictionary classes.

Dictionaries that are based on the Object type and are available since .NET 1.0 are described in the following table.

Capture

Capture

Since .NET 2.0, generic dictionaries are preferred over object-based dictionaries:

Capture

SortedDictionary<TKey, TValue> and SortedList<TKey, TValue> have similar functionality. But because SortedList<TKey, TValue> is implemented as a list that is based on an array and SortedDictionary<Tkey, TValue> is implemented as a dictionary, the classes have different characteristics:

  1.  SortedList<Tltey, TValue> uses less memory than SortedDictionary<TKey, ‘P’lalue>.
  2.  SortedDictio~ary<TKey, TValue> has faster insertion and removal of elements.
  3.  When populating the collection with already sorted data, SortedList<TKey, TVa:ue> is faster, if capacity changes are not needed.

SortedList consumes less memory than SortedDictionary. SortedDictionary is faster for inserts and the removal of unsorted data.

Posted on October 29, 2015 in Collections

Share the Story

Back to Top