C# Iterators Are AWESOME But Why I HATE Them
December 22, 2023
• 558 views
In this video, we're going to learn how to create a lazy enumerable in C# called an iterator. An iterator produces its items only when they're accessed, instead of computing them all immediately.
This is a useful technique if you want to defer or skip expensive computations, or if you need to avoid holding an entire data set in memory. I'll demonstrate how to create an iterator in C#, and then use it to list all the items in a collection.
And I've loved using IEnumerable for the lazy behavior, but I'm moving away from IEnumerable for this one simple reason. All right, so today we're going to be talking about IEnumerables in C#, and specifically we're going to dive into how they're used as return types. I'll explain how their usage can lead to lazy behavior, but also how it can lead to eager behavior as well. As we go through some examples together, I'll elaborate on why I've liked using them in the past and also why I'm trying to move away from them going forward. Before moving on, just a quick note to check that pinned comment for a link to my free weekly newsletter. There's an exclusive article every week and early access to some of these videos that you see on YouTube. All right, let's jump over to Visual Studio. All right, on
my screen I just have a really simple application that's going to be looking at this NumberGetter. We're going to check out the implementation in just a moment, but I want to walk through what this high-level program does. As you can see, there's a stopwatch here, and what we're going to do is check out an implementation of a method that uses IEnumerable. We're going to check out the lazy version of that and just write that time out to the console, and then basically do the exact same thing, except we're going to look at an eager implementation of the same method. So let's go check out what this NumberGetter does. On my screen we can see both implementations within the NumberGetter class: we have the lazy implementation at the top and the eager one below.
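The code on screen is not captured in the transcript. The following is a minimal reconstruction from the spoken description, not the video's actual source: the class name `NumberGetter` and the method `GetNumbersLazy` are mentioned in the video, while `GetNumbersEager`, the static modifiers, and the exact layout of the timing program are assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;

// Time the lazy call, then the eager call, as the walkthrough describes.
var stopwatch = Stopwatch.StartNew();
var lazy = NumberGetter.GetNumbersLazy();
Console.WriteLine($"Lazy:  {stopwatch.ElapsedMilliseconds} ms");

stopwatch.Restart();
var eager = NumberGetter.GetNumbersEager();
Console.WriteLine($"Eager: {stopwatch.ElapsedMilliseconds} ms");

public static class NumberGetter
{
    // Lazy: `yield return` makes this method an iterator, so each
    // one-second delay is paid only when a caller asks for that item.
    public static IEnumerable<int> GetNumbersLazy()
    {
        Thread.Sleep(1000);
        yield return 1;
        Thread.Sleep(1000);
        yield return 2;
        Thread.Sleep(1000);
        yield return 3;
    }

    // Eager (assumed name): same return type, but the whole
    // three-second delay is paid inside the call, before the
    // array is handed back.
    public static IEnumerable<int> GetNumbersEager()
    {
        Thread.Sleep(3000);
        return new[] { 1, 2, 3 };
    }
}
```

The `Thread.Sleep` calls stand in for the latency of a database or file read; the two methods have identical signatures, which is the point the rest of the video builds on.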
To explain what I mean by lazy: we have IEnumerable as the return type, but we're making it what's called an iterator, and it's an iterator because we're using this yield keyword syntax here. If you're not super familiar with IEnumerable and how we can transform it into an iterator by using the yield keyword, a super quick rundown is that once we use the yield keyword inside of a method that returns IEnumerable, it essentially converts it into what's called an iterator, and that means that the only type of return that we can have inside of here is one that we yield back out. We can also yield break to stop an IEnumerable, or an iterator in this case. The yield keyword is what provides us the lazy behavior behind this IEnumerable implementation here. So if we go to jump down to the
bottom version here, which is the eager one that I have on the screen, the similarity is that it's still an IEnumerable as the return type, but it's not going to be lazy. It's not going to be lazy because it's just going to return all of the values at once, 1, 2, and 3, inside of an array. Now, you might be asking why I have a Thread.Sleep on here, and that's because when we go to run this I want to walk through some different examples of this behavior, and I'm just using a Thread.Sleep to give us a little bit more of a delay so we can see what's happening as we run the code. What I want you to keep in the back of your mind is that if we're considering loading some data from a data source like a database
or even reading things from a file, there's going to be a little bit of a delay when we're doing that compared to just having stuff in memory. So as we're going through this, consider that we're getting some records from a database or reading some files, and the different records that we're pulling back all have a little bit of a delay. This is going to be exaggerated, because hopefully reading a single record from a database is not going to take you a full second, which would mean in this case a whole three seconds to get three records from a database. But we're exaggerating it just so that we can see it. In reality there will be some amount of delay, hopefully very small, but still something that exists. Let's go run this application and see the difference in behavior for
lazy and eager. All right, the results are done and they're really interesting. We see the lazy implementation took 0 milliseconds to run, and the eager one took a full 3 seconds and a little bit of overhead, 135 milliseconds on top of that. But how is it possible that the lazy one was zero? How did it take no time at all to go run the lazy one? Well, I tricked you a little bit, and I'm sorry. The reason that it didn't take any time at all for the lazy implementation is because we never evaluated the iterator. I'm going to explain that a little bit more when we go back to the code, but that's a really key point that's going to come up a little bit later, and it's why IEnumerable as a return type, especially with iterators versus not, can get
a little bit weird, so keep that in mind as we go back to the code. The issue that we have lies right here on line six, where we say GetNumbersLazy. Again, if you're not familiar with IEnumerables and iterators, this is something that you probably never would have noticed, so I don't blame you if you were surprised by it taking zero time at all. But because this is a lazy implementation, because it has the yield keyword, therefore making it an iterator, we need to evaluate it in order for it to actually spend the time that it takes to get that data. Simply materializing the results here by calling ToArray, doing a foreach loop over it, or even calling ToList will in fact force this to evaluate, and that means if I go run this now we can go see
the difference in the time it takes. All right, the results are in, and now the lazy one is also taking about 3 seconds. We can see that these extra few milliseconds here at the end are a little bit different, and it's really hard to tell if this is just some overhead with the stopwatch or something else. Later on we'll see that when we go to benchmark this we'll have some differences in performance, but that's going to be in a follow-up video, so stay to the end, and we'll have more videos in this series to explore this. The point here is that when we're forcing the lazy one to evaluate, we're in fact getting similar performance to when we're doing it eagerly. We have to pay that penalty of the Thread.Sleep that I put in for each of those records, and because
there were three, each being 1 second, that's why we get that time that we see. So now that you've seen this lazy behavior, I just want to talk about why, historically, I've really liked working with IEnumerable. This goes back to working in a product that had a lot of records that we were pulling into a user interface for users to explore. One of the challenges was that we had developers who were effectively forcing an entire data set to be materialized and then trying to put that into the user interface, and when I say entire data set, I mean hundreds of thousands or millions of records. This was of course problematic, because we couldn't pull all of that data into the user interface all at once. So yes, of course we could have added different implementations of how we were fetching
the data under the hood, but more on that in just a moment; I want to talk about why using IEnumerables really made things work for us. We saw in the most recent example that putting ToArray on the lazy version of the IEnumerable, which is an iterator, forced the evaluation of that enumerable. But there were many cases in the application I was working in where people just wanted to test if data was there, or they wanted to get the first one out of a data set. Because they were looking at an IEnumerable, they said, well, it's getting the data; if I just ask it for the first one, it should be good to go. Let's go back to the code and use something like Any or First, these LINQ methods that we have, and see what happens between
the lazy and the eager versions. So back in the code here, I'm just going to replace this ToArray with Any, and down here I'm also going to do the same thing, because we're only interested in checking to see if there's anything inside of this enumerable. I mentioned that you could also use First. These will have similar behaviors in this conversation, so I'm not going to demonstrate both; what we'll see with Any is going to be similar to First. As we can see, the run time is very different for lazy versus eager here. They're both using Any on the result that we're pulling back, but when we use the lazy one to check if there's any, it only took us 1 second; when we did the same thing on eager, it took us the full 3 seconds. So why does that happen?
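The three observations so far (the bare call costs nothing, ToArray pays for everything, Any pays for one item) can be reproduced without waiting on real delays. This is a sketch under stated assumptions: the `Records` class, its `Fetched` counter, and the `FetchRecord` helper are hypothetical stand-ins for the one-second sleeps in the video, so that the amount of work done can be counted instead of timed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Calling the iterator method runs none of its body yet.
IEnumerable<int> numbers = Records.GetLazy();
Console.WriteLine($"After the call:  {Records.Fetched} fetched"); // 0

// ToArray() (like ToList() or a foreach) walks the whole sequence.
int[] all = numbers.ToArray();
Console.WriteLine($"After ToArray(): {Records.Fetched} fetched"); // 3

// Any() asks for a single item and then stops enumerating.
Records.Fetched = 0;
bool anyLazy = Records.GetLazy().Any();
Console.WriteLine($"Lazy + Any():    {Records.Fetched} fetched"); // 1

// The eager version has done all its work before Any() sees the array.
Records.Fetched = 0;
bool anyEager = Records.GetEager().Any();
Console.WriteLine($"Eager + Any():   {Records.Fetched} fetched"); // 3

public static class Records
{
    // Hypothetical counter standing in for the Thread.Sleep delays.
    public static int Fetched;

    private static int FetchRecord(int id)
    {
        Fetched++;
        return id;
    }

    public static IEnumerable<int> GetLazy()
    {
        yield return FetchRecord(1);
        yield return FetchRecord(2);
        yield return FetchRecord(3);
    }

    public static IEnumerable<int> GetEager()
    {
        return new[] { FetchRecord(1), FetchRecord(2), FetchRecord(3) };
    }
}
```

Swapping `Any()` for `First()` gives the same counts, since both stop after the first item of the lazy sequence.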
Well, the answer lies in this yield keyword here. Because this implementation, the lazy version, is an iterator, because it has a yield keyword, all that had to happen to prove there was anything inside of this IEnumerable was simply getting to this first yield. We had to wait that full 1 second to be able to yield return the number one, but once we did that, the enumeration was able to terminate. We didn't have to go through the rest of this implementation to be able to see if there was anything inside of it. In the eager variation, because I wanted to make these comparable, instead of waiting 1 second individually to get each item, I just waited the full 3 seconds, and because we have to do that to even see this collection come back, that means we're paying the full performance penalty right
up front. So thinking back to the example that I was mentioning, in the product I was working in, the developers were calling methods that were fetching data, and because those methods were materializing full sets of data to pull back to the user interface, they were paying a big penalty not only in the run time to get those records but also in terms of memory, because the entire collection had to be materialized. In this simple example we're not even looking at the memory footprint. However, in the first example up here, just by having the yield, we were able to effectively skip all of this code from running once we got one item yielded back. So I did mention a little bit earlier that in my example from where I was working, yes, we could have implemented new methods to go be able
to check things like Any or to get the first record from a data set. We could have implemented more specific things for those scenarios. And I've loved using IEnumerable for the lazy behavior, but I'm moving away from IEnumerable for this one simple reason: API readability. In both of these examples that we looked at, if you're the caller of the method, you have no way to know if it's going to be eager or lazy, and that means that you don't know the performance implications of calling it. You need to make assumptions about it. So do you always assume that it's going to be eager? Do you always assume that it's going to be lazy? Or do you have some other way to know if it's going to be one or the other? The reality is you can always find
out if you have access to the code, but that means that as developers we have to pay this tax: we have to go dig into the code to see what's happening. So if you see IEnumerable as the return type, you can't really stop there; you have to dig deeper to understand fully what's happening under the hood. If you keep digging deeper and seeing method calls that also return IEnumerable, you've got to keep going until you finally get to the base scenario where you can see: am I materializing a full list and returning it, or am I yield returning at some point? Because if you're yield returning at some point, the entire call stack all the way up now becomes lazy; you're dealing with iterators. However, if at any point you're materializing some return value that's coming back up that call
stack, from an iterator to a collection or from a collection to a collection, you're not going to have a lazy implementation. It's not going to be an iterator, because there's no yielding back of individual items. If that sounds kind of complex, it's because it is, but it's all buried behind the same API. To you as the caller of these methods, all that you see is IEnumerable of some type, then the method name and the parameters. You have no idea under the hood if it's lazy or eager until you go to run it or until you go to dig into the code. Now, with a lot of the stuff that I share on my channel, I'm trying to explain these different programming concepts to you, but at the same time I'm trying to layer in real, practical software engineering guidance. And for me there were
too many headaches, too many instances of people not understanding this behavior with lazy versus materialized lists, and it made it really hectic for people to do the right thing. So yes, IEnumerable did allow us to do things like First or Any and not materialize full data sets that were ridiculously big, but the other side effects that we didn't look at in detail are things like people accidentally lazily evaluating things on the UI thread: it was an IEnumerable the whole way up, and they finally materialized it on a blocking UI thread. Oops. Another great one is that when you have a lazily evaluated IEnumerable, so an iterator in this case, you could be fetching data from a database when you do things like calling Any or First on it, and it's not materialized and stored. What ends up happening
is that people end up calling the same thing in different spots, and it's the same pointer, that same function pointer to this iterator, and they go run the database query multiple times because it was never materialized and stored. These are all characteristics of a lazy IEnumerable, and they can be used to your advantage, but when you don't know what you're dealing with, they can cause a lot of problems that you weren't expecting. So we did see that we can have an IEnumerable that's lazy and we can have an IEnumerable return type that's also eager. I mentioned right near the beginning of this video that we do need to look at benchmarks for both of these, because the run times that we were seeing were comparable, but they're at millisecond granularity. I'm doing some benchmarking kind
of in the console with a stopwatch; it's not super granular, and we're not looking at the memory footprint either. So we will want to look at the benchmarks, but before we look at the benchmarks, I really think it's important to look at using IEnumerable or other collection types both as parameters and return types. So when that video is ready, you can check it out here next. Thanks, and I'll see you next time.
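The API readability point made earlier, that laziness depends on what happens further down the call chain and not on the signature, can be sketched as follows. Everything here is hypothetical illustration and not code from the video: the `Repository` class, its `Fetched` counter, and all three method names are assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Both wrappers return IEnumerable<int>; only running them (or reading
// their source) reveals which one has already done the work.
var lazy = Repository.ReadOddRecords();
Console.WriteLine(Repository.Fetched); // 0: nothing has been read yet

var eager = Repository.ReadOddRecordsMaterialized();
Console.WriteLine(Repository.Fetched); // 3: every record was read inside the call

public static class Repository
{
    // Hypothetical counter standing in for per-record I/O cost.
    public static int Fetched;

    // Base of the chain: an iterator, because it uses yield return.
    public static IEnumerable<int> ReadRecords()
    {
        for (int id = 1; id <= 3; id++)
        {
            Fetched++;
            yield return id;
        }
    }

    // Stays lazy: Where() is itself deferred, so nothing runs here.
    public static IEnumerable<int> ReadOddRecords() =>
        ReadRecords().Where(id => id % 2 == 1);

    // Same signature, but ToList() materializes the sequence,
    // so the caller gets an eager collection.
    public static IEnumerable<int> ReadOddRecordsMaterialized() =>
        ReadRecords().Where(id => id % 2 == 1).ToList();
}
```

A caller looking only at the two signatures has no way to tell these apart, which is the "tax" of having to dig into the implementation.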
Frequently Asked Questions
What is the main difference between lazy and eager implementations of IEnumerable in C#?
The main difference lies in how they process data. A lazy implementation, which uses the yield keyword, only retrieves data when it's actually needed, meaning it can return results without processing the entire dataset upfront. In contrast, an eager implementation processes and returns all data at once, which can lead to longer wait times if the dataset is large.
Why do you dislike using IEnumerable despite its benefits?
I find that while IEnumerable provides great lazy loading capabilities, it can lead to confusion regarding performance implications. Developers often can't tell if a method will return data eagerly or lazily just by looking at the return type, which can result in unexpected performance issues and increased complexity in understanding the code.
How can I avoid performance pitfalls when using IEnumerable in my applications?
To avoid performance pitfalls, I recommend being cautious about how you use IEnumerable. Always check if the methods you call will evaluate the data eagerly or lazily. Use methods like ToArray or ToList to force evaluation when necessary, and be mindful of where you call these methods to prevent blocking the UI thread or making multiple unnecessary database calls.
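As a rough illustration of the "multiple unnecessary database calls" pitfall, the sketch below enumerates the same lazy sequence twice and then materializes it once. The `Database` class, `QueryRecords`, and the `QueriesRun` counter are hypothetical stand-ins for a real data source, not part of the video's code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Each LINQ call on the un-materialized iterator starts the body over.
IEnumerable<int> query = Database.QueryRecords();
bool hasAny = query.Any();   // first run of the "query"
int first = query.First();   // second run of the same "query"
Console.WriteLine(Database.QueriesRun); // 2

// Materializing once and reusing the list runs it a single extra time.
List<int> records = Database.QueryRecords().ToList();
bool hasAnyStored = records.Any();
int firstStored = records[0];
Console.WriteLine(Database.QueriesRun); // 3

public static class Database
{
    // Hypothetical counter: how many times the "query" was started.
    public static int QueriesRun;

    public static IEnumerable<int> QueryRecords()
    {
        QueriesRun++; // runs on the first MoveNext of every enumeration
        yield return 1;
        yield return 2;
        yield return 3;
    }
}
```

Whether to materialize is still a trade-off: `ToList()` avoids repeated queries but pays the full time and memory cost up front, which is exactly what `Any()` or `First()` on a lazy sequence avoids.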
These FAQs were generated by AI from the video transcript.