Dictionaries in C#: How To Use The Different Variations

In our journey through the world of dictionaries in C#, we’ve covered the basics and moved beyond to intermediate concepts. Now, it’s time to dive into the deep(er) end and explore some of the more advanced features and performance considerations of dictionaries.

This article will shed light on more of the intricacies of dictionaries, from working with sorted dictionaries and custom objects as keys to understanding the performance implications of various operations. We’ll also introduce the ConcurrentDictionary, a thread-safe variant of the dictionary. But if you need a refresher on some dictionary basics, check this video out:

YouTube player

So, buckle up as we delve into more advanced techniques and introduce some performance aspects of dictionaries in C#!


Working with Sorted Dictionaries

A SortedDictionary<TKey, TValue> is a collection of key/value pairs that are sorted on the key. Unlike the standard Dictionary<TKey, TValue>, which doesn’t guarantee any specific order of elements, a SortedDictionary ensures that the pairs are in ascending order based on the key.

Example:

SortedDictionary<string, int> ages = new SortedDictionary<string, int>();
ages.Add("John", 25);
ages.Add("Anna", 30);
ages.Add("Zoe", 20);

foreach (var pair in ages)
{
  Console.WriteLine($"{pair.Key}: {pair.Value}");
}

Output:

Anna: 30 John: 25 Zoe: 20

Using Custom Objects as Keys For Dictionaries in C#

While it’s common to use primitive types like string or int as dictionary keys, you can also use custom objects. However, when doing so, it’s crucial to override the GetHashCode and Equals methods to ensure proper functioning.

Example:

public class Person
{
  public string Name { get; set; }
  public int Age { get; set; }

  public override bool Equals(object obj)
  {
    Person other = obj as Person;
    return other != null && Name == other.Name && Age == other.Age;
  }

  public override int GetHashCode()
  {
    return Name.GetHashCode() ^ Age;
  }
}

Dictionary<Person, string> personJobMap = new Dictionary<Person, string>();

Overriding GetHashCode and Equals are extremely error-prone, and even if done without errors, it’s common for developers to use ineffective hash code calculations. Consider this for homework: the logical exclusive OR operator (denoted by ^) is used between Name.GetHashCode and the Age property, but will this result in a good key?

Something to consider to help avoid this issue is looking at record types. While it still might not be ideal to actually use more complex data for dictionary keys, the record type has some comparison logic built in for you so you can avoid some of these pitfalls.


Handling Collisions and Understanding Hashing

Every key in a dictionary has an associated hash code, which determines its position in the underlying data structure. Sometimes, two different keys might have the same hash code, leading to a collision. The dictionary handles this internally by using a linked list for such keys. Understanding this mechanism is crucial when using custom objects as keys since the quality of the hash function can directly impact the dictionary’s performance.

In essence, a good hash function should distribute keys uniformly across the underlying array, minimizing collisions and ensuring efficient operations. If we’ve incorrectly implemented GetHashCode or Equals, or we’ve got hash codes that are not well distributed, we can certainly encounter issues with our dictionaries.


How Dictionaries in C# Handle Large Amounts of Data

Dictionaries use a technique called hashing to quickly locate a data record given its search key. When you add an item, the dictionary calculates the hash of the key and uses it as an index to store the value. This means that, in theory, the time it takes to find a value (or to add a new one) is constant, regardless of the number of items in the dictionary.

However, as the dictionary grows and more items are added, the chances of collisions (where two keys have the same hash) increase. When a collision occurs, the dictionary needs to find a new slot for the key, which can slow down the insertion time.

Example:

Dictionary<int, string> largeDictionary = new Dictionary<int, string>();

for (int i = 0; i < 1000000; i++)
{
  largeDictionary.Add(i, $"Value {i}");
}

In the above example, even though we’re adding a million items, the Add operation remains relatively fast because of the hashing mechanism.


Performance Implications of Various Operations

  • Addition: As mentioned, the Add operation is generally fast. However, if the dictionary needs to resize (because it’s reached its capacity), there can be a performance hit as it creates a new, larger underlying array and copies the items.
  • Retrieval: Retrieving values using keys is incredibly efficient due to the hashing mechanism. The dictionary can quickly compute the hash of the key and jump directly to the memory location where the value is stored.
  • Removal: Removing items is also efficient. However, it’s worth noting that the dictionary doesn’t shrink when items are removed. If you’re adding and removing items frequently, the dictionary might end up using more memory than necessary.
  • Traversal: Iterating over the dictionary using a foreach loop is linear in time, i.e., the time taken is proportional to the number of items.

Introducing The ConcurrentDictionary For Multi-Threading

In multi-threaded applications, where multiple threads might be accessing and modifying a dictionary simultaneously, using a standard Dictionary<TKey, TValue> can lead to race conditions and unexpected behaviors. Enter ConcurrentDictionary, a thread-safe variant of the dictionary.

ConcurrentDictionary is part of the System.Collections.Concurrent namespace and is designed to handle multiple threads. As the class name suggests, it’s a concurrent dictionary variation designed specifically for this situation. It uses fine-grained locking, ensuring that only a small portion of the dictionary is locked at a time, leading to better performance in multi-threaded scenarios. We’ll take a closer look at this later.

Example:

ConcurrentDictionary<int, string> concurrentDict = new ConcurrentDictionary<int, string>();
concurrentDict.TryAdd(1, "One");

The same operations that we perform on a normal dictionary are available to us, but they will all have slight variations to them due to the current nature of the class. That means you will notice slight differences in naming and return types, but ultimately, a concurrent dictionary has the same functionality.


Why and When to Use A Concurrent Dictionary

If you’re developing an application where multiple threads might access the dictionary, using a ConcurrentDictionary is a safer bet. The word safer is used here and not safe because while the class might be thread-safe, you’ll still need to understand the performance implications of using such a collection across multiple threads.

For instance, in web applications where multiple user requests can access shared data, or in parallel processing scenarios, a ConcurrentDictionary can prevent potential data corruption or race conditions. Instead of developers trying to implement their own locking and mutual exclusion for the dictionary class, the ConcurrentDictionary class handles this for us.


Differences between Dictionary and ConcurrentDictionary

  • Thread Safety: The most significant difference is that ConcurrentDictionary is designed to be thread-safe, while a regular Dictionary is not.
  • Methods: ConcurrentDictionary has methods like TryAdd, TryRemove, and TryUpdate, which are atomic. This means that the check for the key’s existence and the addition/removal/update of the key-value pair happen as a single, uninterrupted operation.
  • Performance: In single-threaded scenarios, a regular Dictionary is faster because it doesn’t have the overhead of locking. However, in multi-threaded scenarios, ConcurrentDictionary can offer better performance due to its fine-grained locking mechanism.

Example:

Parallel.For(0, 100000, i => { concurrentDict.TryAdd(i, $"Value {i}"); });

In the above example, we’re using Parallel.For to add items to the ConcurrentDictionary from multiple threads simultaneously. This operation would be unsafe with a regular Dictionary. Another question that we’d want to ask ourselves here is: is this actually faster? Keep in mind that yes, this dictionary implementation is thread-safe, but it does have to lock to provide that safety!


Wrapping Up Dictionaries in C#

Dictionaries in C# are undeniably powerful tools, offering both beginners and seasoned developers a versatile data structure to efficiently store and manage key-value pairs. From the foundational concepts of a simple Dictionary to the advanced, thread-safe capabilities of ConcurrentDictionary, we’ve covered a majority of typical use cases for dictionaries in C#.

However, understanding the nuances of dictionaries, such as their performance characteristics and the intricacies of hashing, can significantly elevate your usage of them. The introduction of ConcurrentDictionary showcases the adaptability of the .NET framework, ensuring developers have the right tools for both single-threaded and multi-threaded scenarios, but there’s still work to use it efficiently.

As with any tool or concept in programming, the real mastery lies in practice. So, whether you’re building a simple application or a complex, multi-threaded system, take the time to experiment with dictionaries. Now that you know several varieties of dictionaries to use, you can determine when their usage is appropriate. And remember: when in doubt about performance, you can always benchmark your code.

If you thought this was helpful, consider subscribing to my weekly newsletter to stay up to date on future articles!

author avatar
Nick Cosentino Principal Software Engineering Manager
Principal Software Engineering Manager at Microsoft. Views are my own.

Leave a Reply