BrandGhost

UNEXPECTED 87% Performance Boost! - C# Collection Initializers

In C#, there are multiple ways to create collections. Collection initializers and collection expressions both create new collections with items in them. But have you ever wondered how the two compare on performance? Let's use BenchmarkDotNet to benchmark our C# code and see if collection expressions behave differently!
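For reference, here is a minimal sketch of the two syntaxes being compared. The class and method names (FruitLists, WithInitializer, WithExpression) are made up for illustration and are not from the video; collection expressions need C# 12 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical names, used only for this illustration.
public static class FruitLists
{
    // Traditional collection initializer: "new" plus curly braces.
    public static List<string> WithInitializer() =>
        new List<string> { "Apple", "Banana", "Orange" };

    // Collection expression (C# 12+): square brackets, with the
    // concrete type taken from the method's return type.
    public static List<string> WithExpression() =>
        ["Apple", "Banana", "Orange"];

    public static void Main()
    {
        List<string> a = WithInitializer();
        List<string> b = WithExpression();

        // Both lists hold the same three items in the same order.
        Console.WriteLine(a.SequenceEqual(b));
    }
}
```

Both methods return a List<string> with identical contents, so any timing difference comes from how the list is built, not from what it contains.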
Transcript
If you've been working with C# for a while, you know that there's a handful of different ways that we can initialize collections, and more recently we've been given even more options to do so. Recently I saw a post on LinkedIn from Dave Callan that made me think: wait, how the heck is it possible that there can be that big of a difference between instantiating collections just because the syntax looks a little different?

Hi, I'm Nick Cosentino, and I'm a Principal Software Engineering Manager at Microsoft. In this video we're going to be going over benchmarks for collection initializers. I know that might not sound so interesting, but when you see the performance differences between these different ways that you can instantiate collections, you might change your mind. As always, the goal with these types of videos is not to tell you that you have to go change how you're coding. It's really to show you that you can go benchmark things to learn about them. Before I jump over to this performance claim, just a quick reminder to check the pinned comment for a link to my newsletter and my courses on Dometrain. That said, let's go check this out.

All right, so I've created a blog article about this because I also wanted to document it in written form. The original post from Dave Callan has these two examples. The first is the baseline, a regular list: a benchmark that creates a list with three elements (apple, banana, and orange) using the traditional collection initializer. When we look at the collection expression for a list, we see apple, banana, and orange, the same three exact elements. It's returning a list here, and the only difference is that we're using the syntax that has square brackets. That's it. We don't use new List<string> with curly braces; it's just the square brackets. Both of these make a list of strings with the same three items in it.

Now here's the catch. If I scroll up a little bit, here's the benchmark result that he posted. Dave is awesome for posting really interesting benchmarks. I love looking at his content because every time I think, man, I really didn't expect to see that. There's always something really cool. In this case, if we look at the ratio column, with the regular list being the baseline, his claim from his benchmarks was that the collection expression list was 1.58 times faster. That's a dramatic improvement. I know we're talking about really small lists here, and I know this could be outside of context for many of your applications. This is why I said I'm not trying to tell you to do it a different way. But it's really interesting to see, and it got me curious, so I said: look, I'm going to go benchmark a whole bunch of these and see what the difference is.

So I'm going to jump over to Visual Studio now. We'll go through some of the benchmark code so you can see how it's set up, we'll look at the lists in particular to see whether we get similar results to what Dave has, and then we'll talk about those. Again, the point is that I want you to see how the benchmarks are set up, and to remind you that you can do this on your own code and try to prove out different things. I'm encouraging you to do that.

I'm not going to walk through the setup for BenchmarkDotNet. I'm just going to point out that's what I'm using, and we have the benchmark runner at the top. Then I'm going to go through the list benchmarks that we have here. A few things I want to point out: this data is going to be reused in a few different benchmarks, and it's the same set of three elements, just like Dave had in his example. I have a couple of variations that aren't the same as what Dave had, but ultimately they're all list variations.

I'm starting with a classic collection initializer. This is exactly what Dave had; I just put it on different lines, nothing different there. Then I have a variation of that where we explicitly set the capacity. There are three elements here, so I set it to three. In the case above, we don't set the capacity at all. I don't think there should be a difference, but why not try it? I'm curious now that Dave's posted this. I didn't think those two things would be different anyway.

Scrolling a little lower, we have the collection expression. This is the second example that Dave had, the one that's supposed to go really fast in comparison. Then I wanted to try out a few variations that aren't the exact same thing. In this case we're starting with three explicit items, so I wanted to say: if we copied that data from an array, could we go make a list, and what would the performance characteristics of that be, especially compared to the baseline case? And very similarly, if instead of copying from an array we use an iterator, do we get different characteristics? I would expect this to be slower, because just creating and using an iterator has overhead, especially with only three items. If I scroll back up, you can see the iterator here. It's an enumerable, but it is truly an iterator because it's yield returning those things back. So those are the two copy constructors that we have.

And then I was like, look, Dave's example blew my mind. Those two things should be the exact same. So I wanted to say: if we make a list using that same syntax but don't put the items in it in the first place, and we manually add them, what's the performance of that like? And similarly, if we do it with new and put in the capacity, what's that going to look like? I can't put a capacity up here: if I try to put three, it doesn't allow it, and you can see the squiggly line. So I can't do an exact comparison, but when I new it up I can provide the capacity. So I did that and then added those three things manually.

So we're looking at a few different flavors. We have the two that Dave was calling out, I've added a variation of that with capacity included, then I did two copy constructor variations, and then these two variations where we're manually adding stuff. Why not? We're already here, so we might as well run the benchmarks.

Okay, now you've seen the setup. This is the part where I go run this video for another 25 minutes. You can hang around here and wait for these benchmarks to finish. I'm going to go check the food in the oven and I'll be right back.

Let's go ahead and check out the numbers. The important column that we want to focus on is the Ratio column in the middle. Mine is a little bit different from Dave's: mine doesn't say how much faster, it says the fraction of the time that it took. So a higher number in the ratio column is worse; it means it took longer to run. At the very top we have the default case, with no capacity being set but three items provided. This is the first example that Dave had, and it has a ratio of 1. That's our baseline.

The first variation, just setting a capacity of three, takes almost half the time. Really that's the punchline of this whole video: just by setting the capacity you're almost doubling the performance. This is faster than all of the other variations that we have here. But I still want you to stick around so we can talk through this and make more sense of it, because if you leave now you won't see whether Dave was right or wrong, and that's what you really want to see.

Next is the collection expression, the one that Dave also included (so maybe I shouldn't have put these one after the other, but still, hang around). We can see that we have a 0.64 ratio. That's basically right on par with what Dave had if we invert it to get the speed multiplier, so 1.5625. In his example, if I scroll back up through my blog (sorry for your eyes), he had 1.58, almost the exact same number. So Dave was spot on: you can go way faster than the normal collection initializer by using the square brackets. But you can also use the traditional one and just put the capacity in too, and apparently that's going to be even faster.

Now, the copy constructors: they're slower. We can see these two in a row. I figured that was probably going to be the case, but here's what I thought might be interesting to consider. I knew the iterator one was going to be slower. I didn't know how much slower, but I knew it would be, because iterators can suck when you have a small amount of data and you need to set up the iterator in the first place. So slower, not surprised. The copy constructor using an array as the data source is a little more interesting to me. I would have thought that, with it being a concrete collection where we know the count, there might be an optimization behind the scenes that sets the capacity and therefore makes adding items pretty fast. I expected it to be maybe a little slower because it has to iterate over the array, but I also thought there might be some type of array optimization behind the scenes. I didn't want to spoil it for myself and check; I just wanted to run the benchmark and see. So I'm quite surprised that it's almost twice as slow. Not quite, but it's slower for sure than the baseline.

These next two results are interesting, and I'm pretty shocked. They're the ones that manually add items into a collection. The no-capacity one uses the new format with the square brackets, and it's slower. We're manually adding stuff with no capacity set, so I'm not really surprised: it's about 10% slower, with a 1.1 ratio. But look at the next line. This one also adds the three items manually, one by one, but we did set a capacity, and it takes a little over half the time of its counterpart right above it. That means manually adding three things is actually faster than a traditional collection initializer without a capacity. In fact, according to these benchmarks, it's even faster than the example that Dave gave. These two numbers, 0.61 and 0.64, are extremely close. If I ran these again they might fluctuate, and maybe they're right on par with each other, but in this particular run it came out a little bit faster. Regardless, the fact that it's on par is shocking to me. I would not have expected that manually adding three things, even with the capacity set, would be faster.

So let's look through the three fastest ways that we have here. The fastest is a classic collection initializer with a capacity set. That's kind of weird to me, and I want to call it out, because I'm not sure why this would be complicated to do at compile time. I would figure that if we listed out a few things and they were constants, we should know what the capacity is. So I would think there could be some type of compiler optimization that injects that number as the capacity. I don't work on the .NET team. I'm sure the people who do know why that can't be done, or there's some other reason for it. I'm sure someone's thought of it, but I don't know the answer. So it's a little surprising to me that this is the fastest way in this list. I'm not saying there aren't faster ways.

The next one is one of the last ones we looked at: the classic list declaration, setting the capacity and then manually adding the three items. That was literally the next fastest way. And the third fastest is what Dave showed in his example, which got me to put this together in the first place. I still find it completely fascinating that just declaring it with a different syntax, square brackets instead of curly braces, gives you 0.64 compared to 1. It's pretty dramatic.

And look, I need to say it again, because I'm sure someone watching this will get into the comments saying, well, this is completely contrived, who cares about little optimizations like this, you shouldn't be wasting your time trying to micro-optimize. I get it. That's not my point. I thought this was interesting data, and I wanted to explore it and try it out. Like I said at the beginning of this video, the whole point is not to tell you that you'd better go switch all your code to square brackets instead of curly braces. The point of the video is to say: if you're curious, you have all of the tools at your disposal to go measure this stuff. In fact, that's exactly what I recommend you do if you're trying to get more speed out of what you're building: profile your code and benchmark it against the alternatives. If you want to understand how to do that better, you can watch this video next. Thanks, and I'll see you next time.
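The ratio arithmetic used in the transcript can be sketched as follows. The helper names (RatioMath, SpeedMultiplier, PercentTimeChange) are hypothetical; the only inputs are the ratios quoted in the video, where a ratio is the time a variation took divided by the time the baseline took.

```csharp
using System;

// Hypothetical helpers for reading a benchmark "Ratio" column.
public static class RatioMath
{
    // ratio = time of this variation / time of the baseline.
    // Inverting it gives "how many times faster than the baseline".
    public static double SpeedMultiplier(double ratio) => 1.0 / ratio;

    // Extra time relative to the baseline, in percent.
    // Positive means slower than the baseline, negative means faster.
    public static double PercentTimeChange(double ratio) => (ratio - 1.0) * 100.0;

    public static void Main()
    {
        // Collection expression, ratio 0.64: about 1.56x the baseline speed,
        // close to the 1.58 quoted from Dave's post.
        Console.WriteLine(SpeedMultiplier(0.64));

        // Manual adds without capacity, ratio 1.1: roughly 10% more time.
        Console.WriteLine(PercentTimeChange(1.1));
    }
}
```

Read this way, a ratio below 1 is a speedup and a ratio above 1 is a slowdown, which is why a higher number in the ratio column is worse.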

Frequently Asked Questions

What are collection initializers in C# and why are they important?

Collection initializers in C# allow you to create and initialize collections in a more concise way. They are important because they can improve code readability and, as I demonstrated in this video, can also have significant performance implications depending on how they're used.
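One thing that can be checked directly, although the video does not investigate it, is how large the backing array ends up for each style. The sketch below uses the real List<T>.Capacity property; the class and method names are made up for illustration. It is offered as a possible contributing factor, not as a confirmed explanation of the benchmark gap.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical demo class; only List<T> and its Capacity property are real APIs.
public static class CapacityDemo
{
    // No capacity given: the list starts empty and grows on the first Add.
    public static int InitializerCapacity() =>
        new List<string> { "Apple", "Banana", "Orange" }.Capacity;

    // Capacity given up front: the backing array is sized once.
    public static int InitializerWithCapacity() =>
        new List<string>(3) { "Apple", "Banana", "Orange" }.Capacity;

    // Collection expression (C# 12+): the compiler decides how to build the list.
    public static int ExpressionCapacity()
    {
        List<string> list = ["Apple", "Banana", "Orange"];
        return list.Capacity;
    }

    public static void Main()
    {
        Console.WriteLine($"Initializer, no capacity: {InitializerCapacity()}");
        Console.WriteLine($"Initializer, capacity 3:  {InitializerWithCapacity()}");
        Console.WriteLine($"Collection expression:    {ExpressionCapacity()}");
    }
}
```

An initializer without a capacity is a constructor call followed by one Add per item, so the list has to grow from empty while it is being filled. Whether that accounts for the measured difference is exactly the kind of question a benchmark on your own code can answer.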

How did you benchmark the performance of different collection initialization methods?

I used BenchmarkDotNet to set up various tests comparing different ways to initialize collections. I created several scenarios, including traditional collection initializers, collection expressions, and variations with explicitly set capacities, to see how they performed against each other.
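A setup along these lines could look like the sketch below. It assumes the BenchmarkDotNet NuGet package is referenced and the project targets C# 12 or later. The class and method names are made up for illustration, and the exact code from the video may differ; only the seven scenarios themselves come from the transcript.

```csharp
using System.Collections.Generic;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

// Hypothetical benchmark class covering the seven list variations discussed.
public class ListBenchmarks
{
    // Shared source data for the copy-constructor variations.
    private static readonly string[] Items = ["Apple", "Banana", "Orange"];

    // A true iterator: items are produced lazily with yield return.
    private static IEnumerable<string> Iterate()
    {
        yield return "Apple";
        yield return "Banana";
        yield return "Orange";
    }

    // Baseline: classic collection initializer with no capacity.
    [Benchmark(Baseline = true)]
    public List<string> Initializer_NoCapacity() =>
        new List<string> { "Apple", "Banana", "Orange" };

    [Benchmark]
    public List<string> Initializer_WithCapacity() =>
        new List<string>(3) { "Apple", "Banana", "Orange" };

    [Benchmark]
    public List<string> CollectionExpression() =>
        ["Apple", "Banana", "Orange"];

    [Benchmark]
    public List<string> CopyFromArray() => new List<string>(Items);

    [Benchmark]
    public List<string> CopyFromIterator() => new List<string>(Iterate());

    [Benchmark]
    public List<string> ManualAdd_NoCapacity()
    {
        List<string> list = [];
        list.Add("Apple");
        list.Add("Banana");
        list.Add("Orange");
        return list;
    }

    [Benchmark]
    public List<string> ManualAdd_WithCapacity()
    {
        var list = new List<string>(3);
        list.Add("Apple");
        list.Add("Banana");
        list.Add("Orange");
        return list;
    }
}

public static class Program
{
    // Run in Release configuration; BenchmarkDotNet warns about Debug builds.
    public static void Main() => BenchmarkRunner.Run<ListBenchmarks>();
}
```

Each benchmark returns the list it builds so the work cannot be optimized away, and marking one method as the baseline is what produces the Ratio column in the results table.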

Should I change my existing code to use collection expressions instead of traditional initializers based on your findings?

I'm not suggesting you should change your existing code just for the sake of optimization. The goal of this video was to highlight the performance differences and encourage you to benchmark your own code. If you find that using collection expressions or setting capacities improves performance in your specific case, then it might be worth considering.

These FAQs were generated by AI from the video transcript.