Shocking Iterator Performance Benchmarks in C# dotnet
March 17, 2023
• 980 views
Should you be using an iterator or a materialized collection for your data-fetching API? In this video, we look at the performance characteristics of both by examining iterator benchmarks in C# dotnet compared with materialized collections.
When you're done with this video, you should follow up with this one:
https://www.youtube.com/watch?v=0s_VMhZSOwQ
You can also follow along in the corresponding blog post here:
https://www.devleader.ca/2023/03/17/shocking-results-from-collection-and-iterator-benchmarks/
Want the source for this video? Check it out here:
https://github.com/ncosentino/DevLeader/tree/master/AllAboutEnumerables/AllAboutEnumerables.BasicIteratorBenchmarks
For more information about enumerables, iterators, and collections, see:
https://www.youtube.com/watch?v=RR7Cq0iwNYo&list=PLzATctVhnsgjE3qOsbkPaC1NxXD605wOC
View Transcript
I ran some benchmarks on iterators versus materialized Collections and the results were shocking okay but in all seriousness some of the results were exactly as I expected but actually I was pretty surprised by some of the other results that I saw in the output if you haven't checked out my video that compares iterators materialized Collections and the pitfalls they both have that I've seen in production code stay right to the end because you'll want to check that out for more context so before I walk us through the Benchmark setup and The Benchmark output I want to hear from you in the comments what your thoughts are about iterators versus materialized Collections and what their performance characteristics are going to look like and without wasting any more time let's just jump right into the code alright I have the all about innumerables solution pulled up from
the dev leader GitHub repository I'll have a link for that in the description so if you want to go check it out clone it run the code yourself you're welcome to go do that and really I wanted to be able to put this together because I was following up on some of the other content I was creating where I was looking at iterators and materialized Collections and actually some of the challenges that you can encounter with both of these but full disclaimer in my own personal and professional use I almost always stick to having a streaming iterator based API in my experience based on the trade-offs that I've seen in production code having a streaming API leveraging iterators has led to more flexible usage of apis of course I'm not saying it's perfect but this is the pattern that I've generally stuck to when I
put these benchmarks together I was very confident that I was going to see a particular output and when I was looking through the results I did confirm what I expected to see which was great it was a nice little confidence boost however there was one characteristic that I was extremely surprised about legitimately and I want to be able to walk through these examples and then go through the output so that you can see what I saw so let's go ahead and scroll through what we have going on here if you're not familiar with it I am using benchmark.net in my opinion it's an awesome way to set up benchmarks that you can keep re-running it's highly configurable and even takes care of things like warm-up runs so you don't have to worry so much about consistency if you have a one-off thing that just seemed
to perform better when you ran a benchmark alright so I'm using top level statements here and you can see that there's no program class I just have the actual code kind of starting off here so I do have a configuration here that I'm creating for benchmark.net and then basically I'm going to be just running the benchmarks that I have and the benchmarks are a class so you can see below right here I just have this little check here that runs after because I've seen particular cases where the Benchmark ends up finishing really quick and it's because there wasn't actually an error in how I configured it and then it looks like it succeeded but there was actually nothing wrong so the more interesting part is actually this Benchmark class just to call it out I am going to be using.net 7.0 and you can see
that I just have the benchmarks class flagged with that attribute now the different scenarios we're going to be looking at are an iterator we're going to be looking at a method that populates a list and returns a list and then we're going to look at a very similar method but the return type is not going to be a list it will actually be an I read only collection I added in this case after because I was speculating about some of the performance characteristics I was seeing between these two things next I just have this configured to run a thousand a hundred thousand and ten million in terms of the size of the data set and then this other property here is just such that it will iterate through the different types of implementations that I have but benchmark.net will basically go through the permutations of
these for me automatically next you'll see that we have this setup it's marked as Global so it's going to run once for the duration of the entire Benchmark and really I'm just going to be creating an array called data set it's an array of integers and I fill it with random numbers and I've seeded it with this constant seed just so that between runs there's no variability it's just one less thing for me to point at and say I wonder if this was any different in this particular case I don't really think it matters I'm just trying to practice doing that okay the three different methods that we're going to be running are now on my screen so you can see right here highlighted the first one that we're going to have up here is the one that returns as a read-only collection and you
can see that it will populate a list right right here when it Returns the results the return type is I read only collection the next one right below is almost the exact same literally the only difference is that the return type is a list of integers and finally the third one is similar but it's an iterator so instead of populating and materializing a list we would actually just be iterating through okay so let's look at the different scenarios that we're going to run against these three different methods the first scenario that we're going to look at is actually just calling the link method any on the result of these functions next we'll actually run the link count method on these the third one we have here is calling to array which is also a link extension method what this one should be doing is actually
making a copy of the result set and putting it into an array an interesting variation of the previous one that I just wanted to experiment with was actually taking half of the size of the collection and then trying to copy that to an array using two array from link why did I do this I was just kind of curious to see if there would be any different performance characteristics so we might be able to observe between this and the previous one and finally I'm just going to run a simple for each Loop across these three different methods you might be asking why didn't I go with just a for loop with an index and that's because with an iterator I actually can't do that so I just wanted to pick something that would be common across the three all right at this point cue up
your drum roll because I'm not going to make you sit through running these benchmarks I already have that done if you do try running this locally you'll notice that it takes a few minutes so instead of making you stare at a console I will edit this video and we'll jump right to the results now so if you've never seen the output from benchmark.net dropped into the console like this it's nice and colorful there's a ton of columns with a ton of numbers so let's walk through it slowly to see what we have going on here the first part of the table that's output here is going to be the for each test that we're going to be running across the three different scenarios if we look over at the implementation column I apologize but the name that I chose to use in my enum does
look pretty messy but this one that actually is trimmed is the one that has a read-only The Collection return type and it's important to note that as we walk through these you'll notice a similar pattern where we're going to be testing with a one thousand one hundred thousand and ten Million result set and then each of the different implementations kind of in the same order so this will repeat as we go through and all that's going to be changing if you look at the left side is the method that we're using all right so when we look at the first little sample that we have here I'm looking at the mean column and if we look at the performance characteristics it looks like populate as list is actually the most performant in fact it's half the time that an iterator took you'll also notice that
it's significantly faster than when we returned as an I read only collection which is very interesting and I'll explain that in a moment the other piece of data that I'm really interested in is actually this far right column that says allocated and this is going to be the amount of memory that was allocated to run the scenario so at a glance we said that the I read only collection return type is definitely the slowest and if we look far to the right in terms of memory usage it's right on par with the listings example so that means from an implementation perspective if we're considering runtime performance and memory usage right now we're kind of really just comparing the iterator versus the materialize list so while the materialize list in this example was twice as fast as the iterator it was two orders of magnitude more
memory allocated so right when I saw this first example I was already pretty shocked because I totally expected that the memory allocated was going to be significantly better for the iterator than the list however the runtime performance when we look at the mean column is really what shocked me here if we look a little bit lower at the larger data sets the hundred thousand count is actually kind of interesting because the iterator and list are actually quite comparable if we look at the amount of memory allocated the iterator still only allocates 48 bytes but the list is allocating significantly more memory once we get up to a 10 million data set size the list continues to outperform the iterator but the amount of memory allocated just keeps ballooning while the iterator Still Remains under 100 bytes allocated so why is the performance of that so
shocking to me and what do I think is actually happening here well let's jump back to the code and I want to go look at the iterator implementation and the get results using list as list implementation so like I said the amount of memory allocated is not the thing that's surprising me here it's the performance characteristics when we look at the iterator we have to go enumerate the entire data set and yield back each item that's going to mean that when we have a 10 million data set size we have to do 10 million iterations to return these back out and because this is an iterator and we have a for each Loop wrapping it it kind of acts like a pipeline so really you're only doing 10 million iterations in total however if we go look at the list implementation this is going to
explain why it's shocking to me we basically still would have to do the same 10 million iterations even before we leave this method and I'll say that again we have to do 10 million iterations when the data set is 10 million items just to populate the list of results that we want to return however once we have that entirely new populated list that took 10 million iterations we're gonna go run a 4-H loop on that 10 Million result set and do another 10 million iterations this will mean that in total for the list we're going to do 20 million iterations and it's still more performant than the iterator to me that's completely baffling because we're doing twice as many iterations in total but I think we have two things going on here the first part is that iterators in general are less performant because of
how they're structured with the yielding mechanism in my opinion I wouldn't think that alone is enough that it would be slower than when you're doubling up the total number of iterations but I think the other thing that's going on here is that we have a little bit of magic going on with the list return type in the latest versions of dot net we're actually able to use for each loops on lists and arrays and they're treated as read-only spam fans and I think that's where we're getting an enormous performance boost because when we're actually doing that outer for each Loop that's also going to be using a read-only span and going super fast in the case of the iterator for the outer for each loop we're not going to have the luxury of using read-only spans because it only sees it as an innumerable and
this is actually where I wanted to try proving my hypothesis by adding this other method that looks very similar but returns as an I read only collection the mechanism for using read only spans as that performance optimization in for loops and for each Loops works on lists and arrays however the return type here is an I read only collection of ins it is not a list and it's not an array so if we jump back over to the numbers and we look and we compare the performance of the I read only collection return type versus the list you can see that it's significantly faster when it's returned as a list and I think that that's proving my hypothesis that it's able to use read-only spans on the list return type getting a huge performance boost when we compare it to the I read only collection
return type now let's go ahead and start checking out the other methods that we wanted to exercise here so something that's really powerful when we're contrasting the iterator approach versus the materialize collection approach is the link method any if you've watched my other videos on this I'm not suggesting that if you have code that's returning enormous data sets when you're accessing something like a database that the only way to make that feel good is that you have to basically move to an iterator instead of materializing a collection because you can absolutely make the argument that you should just reduce the size of the query and do paging or something like that however in practice I often see that people are doing things like calling link methods like any count other things like that so my goal here is not to tell you that one is
always better than the other I just wanted to show you with some data points about how these behave okay so the performance characteristics of the any link method are actually exactly what I would have expected and that's because if we look at the iterator approach it is always way faster than the other implementations the reason that's the case is because it's using a streaming API and as soon as it recognizes that it has a single item it's able to terminate doing the iteration but when you're materializing an entire list you actually have to fully populate that list before you can even ask if it has anything in it this is also Illustrated if we look at the memory allocated column because just to check if we have anything in the results that are returned from those methods both methods that have to fully populate a
list allocate way more memory than the iterator does so this one checks out for my expectations the count method from the link name space was also kind of interesting and I think it's quite comparable to the for each example that we looked at in the beginning as expected the iterators don't allocate much memory at all while the materialized collections of course have to make an entire copy of the data set and allocate way more memory however the runtime performance results for this were also a little bit surprising to me when I think about the implementation of the iterator approach again because it's like a streaming pipeline I would have expected that it does one full enumeration and counts as it goes and actually in hindsight I should have caught this about the list and I read only collection approaches but I was thinking that it
would have to do two full enumerations one to populate the list one to actually count through the list but I totally forgot that the link implementation has an optimization for collections and if those collections actually have a length or a count on them it's able to use that and short circuit instead of actually doing an entire full enumeration of that data set so yes across the board you will notice that the iterator is less performant than the list and the I read-only collection return types but given that they're very comparable in terms of runtime if we're looking at the amount of memory allocated I would say that the amount of memory allocated is a way bigger deal breaker when you're considering materializing an entire collection versus just being able to use an iterator now I want to skip to the bottom one that actually calls
two array from the link name space before we go to the one right above that that does half to an array so this example was interesting to me because in my head I'm thinking that if we have an iterator or a materialized collection we're going to be making an entire copy of that data set using two array this should imply that we essentially have to do a full enumeration of that data set and allocate that memory for the new collection so yes the iterators have been performing really well in terms of memory allocation but this would be an example where we actually have to allocate a bunch of memory just for the result of what we're trying to perform over that iterator and then again my expectation here would have been that when we're using materialize Collections and calling to array it would have to
do an entire second enumeration over that materialized collection if we jump over to the memory allocated across the board when we're looking at iterators they're always going to be allocating way less memory for this and that's of course because inside the iterator itself it is never materializing another additional collection whereas the other implementations have to materialize a full collection before even returning that part was totally as expected the performance characteristics absolutely were not what I was expecting here the iterator approach for the 1000 element data set is three times as long to run as the other two when we bump that up to a hundred thousand items it is three times as long to run the iterator than the other two and finally when we're up at 10 million items it's just less than twice the time it takes to run the other two one
hundred percent not what I expected and that's because again I was expecting that the amount of iterations we would have to do on the materialize collection implementations would be literally twice as many enumerations as with the iterator and even if that is the case for how it's working under the hood I think again we're getting lucky because the implementation of two array is probably heavily optimized for collections that already have memory allocated in the case of the iterator we're paying the performance hit of using the iterator in general to yield back items but then we also have to go construct a collection and because we don't actually know the entire size of the data set I'm assuming that it's paying a performance penalty to keep resizing that collection to be able to fit the results of the iterator alright if we jump to the very
last example that I want to talk through today and we look at the take half so this is going to use the take Link method to take 50 percent of the size of the collection and then call to array on that we get an interesting blend of results here the reason I wanted to implement this one and examine the characteristics of it is because I think that this type of thing even though it might be a little bit contrived in terms of taking half there's actually a pattern that I see coming up a lot in real code and that's actually that we get a materialized collection then we're running link methods on it and even though you had that materialized collection once you start running link methods on it you're now dealing with iterators again everything in link is leveraging these streaming iterator apis and
it kind of proved that to you I think these results actually speak to that you'll recall that when we called two array on the iterator it was way less performant than the other two but when we look at the take half and then call into array all of these perform roughly the same if we look at the larger result sets you'll notice that at a hundred thousand items again they're all almost the exact same in terms of runtime once we got up to about 10 million then yes we start to see the iterator pull ahead and if we jump over to the memory allocated across the board we're always going to have the iterator allocating far less memory but it still has to allocate some because the result of what we're doing is copying that output to an array so again to highlight why I
think this one is interesting is even if you take into consideration all the other stuff that I was talking about in terms of hey look it looks like using iterators is allocating less memory but if you're going to be using a list you're getting these performance boosts in some cases regardless of what you're doing when you think about how people actually consume the API that you're creating you might get completely different performance characteristics than you would have been expecting so sure you might jump right to well I guess that means we can't use any link and I'm not here to tell you how to go design your code either all that I'm trying to illustrate to you is that if you have use cases where your API is being consumed and it was a materialized collection under the hood and you were doing that for
some performance benefits if the other developer is calling your API end up running link on the result of that they're essentially just undermining a lot of the performance gains that you were trying to create so in summary we got to look at the performance characteristics and memory characteristics of using iterators versus materialized collections When I Was preparing for all of this I went in super confidently that I was going to see iterators below the performance and and memory characteristics completely out of the water compared to materialized collections however yes I was pleasantly surprised to know that I was right about the amount of memory that was allocated for iterators but I was completely shocked about the performance characteristics when we looked at iterators and materialized collections again I'm not here to tell you that one is right and one is wrong I'm just trying to
provide you with some data so that you can design your software in a way that makes sense for you thanks so much for watching if you like this video please give it a thumbs up subscribe to the channel and of course as promised for sticking right to the end you can check out my other video about the pitfalls of both iterators and materialize collections by checking this out here foreign
Frequently Asked Questions
What were the main findings from the benchmarks comparing iterators and materialized collections?
I found that while iterators allocate significantly less memory than materialized collections, their performance in terms of execution time was surprisingly not as favorable as I expected. In fact, the materialized lists performed better in terms of speed despite requiring more iterations overall.
Why do iterators seem to perform worse than materialized collections despite using less memory?
The performance of iterators was surprising because even though they allocate less memory, they require yielding each item one at a time, which can slow down execution. In contrast, materialized collections can leverage optimizations in the .NET framework that allow for faster iteration, especially when using read-only spans.
How can the results of these benchmarks impact the design of my APIs?
These results suggest that if you're designing APIs that return collections, you should consider how consumers of your API might use those collections. If they end up running LINQ methods on materialized collections, it could negate the performance benefits you were aiming for, so it's essential to think about the overall usage patterns.
These FAQs were generated by AI from the video transcript.