I Messed Up - MoreLINQ Batch Benchmarks Fixed
August 19, 2024
Nobody's perfect (except my wife, our dogs, and our cats), and there are bound to be errors here and there in my videos.
Fortunately, I have an awesome audience.
In a previous video, I showed some benchmarks from MoreLINQ regarding batching.
It turns out that I forgot to parameterize one of the benchmarks!
Big thanks to a viewer for pointing this out to me, so I created a follow-up video to:
1) Take some responsibility for the error
2) Show you the corrected benchmarks
3) Remind you that another pair of eyes is helpful
Transcript
All right, so it looks like I messed up again, and I'm hoping this is a good reminder that having people review your code is a very valuable thing. Hi, my name is Nick Cosentino, and I'm a Principal Software Engineering Manager at Microsoft. In a previous video I was talking about using MoreLINQ's batching operation, and we were looking at benchmarks comparing MoreLINQ against some other variations, like hand-rolling our own implementation. I made a mistake in my benchmarks, and a viewer pointed it out in the comments, so I'm very thankful for that. It's a small little bug, but it unfortunately changes the results of the benchmarks, so I wanted to make this follow-up video to do these benchmarks justice. Again, huge thanks to the viewer for leaving a comment about this and not attacking me for it, because yes, I am only human and this kind of thing will happen. If you like these kinds of videos, remember to subscribe to the channel and check out the pinned comment for my courses on Dometrain. With that said, let's jump over to Visual Studio, look at the mistake I made, and then look at the new, updated benchmarks so we can compare these things properly.

All right, so in the previous video, which I'll link up top if you haven't seen it yet, I was introducing the MoreLINQ Batch operation that we can use. That video explains how we can go through collections, or IEnumerables, and essentially chunk them up into smaller batches that we want to iterate across. It covered those details and then went into some of the benchmarking results comparing MoreLINQ to some naive implementations, which I have in this code as well. But I want to focus on the mistake. If we look right here on line 61, I have a batch of 100, a fixed-size batch, which on the surface doesn't seem like a problem. But if we look at the other benchmarks, we can see that they're using this parameter called batch size, and up top we can see that I have two parameters, collection size and batch size, which are supposed to be adjusted for every benchmark that we're running. So when we started to look at the results, we had one outlier that was always fixed at 100, while the other two benchmarks were properly cycling through these different batch sizes.
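As a rough sketch of the kind of setup being described, here is what the bug and its fix look like in a BenchmarkDotNet class. The class name, member names, and parameter values are illustrative assumptions, not the actual source from the video:

```csharp
using System.Linq;
using BenchmarkDotNet.Attributes;
using MoreLinq;

public class BatchBenchmarks
{
    private int[] _source = System.Array.Empty<int>();

    // These parameters are supposed to vary for every benchmark run.
    [Params(10, 1_000, 1_000_000)]
    public int CollectionSize { get; set; }

    [Params(1, 100, 100_000)]
    public int BatchSize { get; set; }

    // Allocating the collection in setup keeps it out of the
    // per-benchmark allocation numbers.
    [GlobalSetup]
    public void Setup() =>
        _source = Enumerable.Range(0, CollectionSize).ToArray();

    [Benchmark]
    public void MoreLinqBatch_Buggy()
    {
        // The bug: a hardcoded 100 ignores the BatchSize parameter,
        // so this benchmark never varied along with the others.
        foreach (var batch in _source.Batch(100)) { }
    }

    [Benchmark]
    public void MoreLinqBatch_Fixed()
    {
        // The fix: use the BatchSize parameter like the other benchmarks.
        foreach (var batch in _source.Batch(BatchSize)) { }
    }
}
```

With the hardcoded value, BenchmarkDotNet still reports a row for every BatchSize combination, which is why the outlier looked plausible at a glance: the parameter column changed, but the measured work never did.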
So both manual approaches had the parameter, but this one down here was fixed at 100; it does need to say batch size. Again, huge thanks to the viewer for pointing that out, leaving a comment, and doing so in a way that wasn't attacking me, because of course I want to make this right and show you the correct information. I have no affiliation with MoreLINQ or anything else going on here. I'm not trying to make it stand out as good or bad; I'm just trying to give you information to work with, and hopefully teach you how you can use benchmarks and tools like that for your own code. So yes, a very good lesson to take away here is that having other people review your code can be very helpful. With that said, I'm not going to walk through all of the benchmarks or what batching does again. I just want to jump over to the benchmark results, and then we'll walk through them; if there's any more analysis or thought process for why some of the numbers look the way they do, we can start comparing and talking about some of the details of what's going on with batching.

Before we continue, this is just a quick reminder that I do have courses available on Dometrain. If you're just getting started in your programming journey and you want to learn C#, you can head over to Dometrain: I have a Getting Started in C# course, approximately 5 hours of content taking you from absolutely no experience to being able to program in C#. After that, I have my deep dive course, which will take you to the next level with another 6 hours of content so that you can start building basic applications. Head over to Dometrain and check it out. Let's head back to the video.

I'm going to pull up trusty Photoshop here. Unfortunately, I didn't take a screenshot of the other metrics from the previous video, so when I cycle through them you'll see that when I toggle this layer, it's a little bit more blurry. I apologize for that, and I will read out the numbers for you, because that magenta color in particular looks a little fuzzy. What I've done is gone through with some red and highlighted where there are, in my opinion, some big differences. What's super interesting, and I'm going to start with it, is that the Allocated column on the far right is very, very comparable. So let's have a quick look through, starting at the top. The blurry version that's on right now is from the previous video; you can see I've just taken a screenshot, so in the bottom right you can see where my head is overlaid, if it's not currently being covered by my head in the current video. This blurry version is the previous video. We can see that in basically every single case, except for a couple about midway down here, the MoreLINQ implementation is using significantly less memory. The only time that wasn't really happening is where my cursor is right now, and otherwise a little bit lower in the very bottom right-hand corner. There it's very close, but not equal; it's a little bit more with MoreLINQ, but it's not winning by a landslide. Otherwise, it was either the same or winning by a landslide.

So for the Allocated column, if I toggle this and we look at the new data, it's still basically always winning by a landslide or very comparable, so I didn't have anything extra interesting to call out. If we do a quick scan, MoreLINQ is winning by a landslide for all of these at the top, where we're doing a collection size of 10. When we start going a little lower here, just double checking: winning, winning, winning, winning, by a pretty large margin. Going even further, especially at this batch size of one for this collection size of, what is that, 1 million, there's obviously going to be a lot more allocation going on, but we can see that it's still very significantly improved for MoreLINQ. Very, very good. It's pretty comparable in this next batch at the bottom here, and then half the memory allocation, and then we basically see that same thing repeated for the next set of collection sizes. I did trim off the bottom just because I couldn't get a full screenshot from the previous video, but hopefully you can see the pattern going on there. There is not a big difference at all from the previous run, when we were fixed at a batch size.

I should call out that I think the reason for that is, when we're talking about how much RAM is being used here, I'm allocating the collection outside of the benchmark method. So the collection itself that we're iterating across should not count toward the benchmark's allocation; that's a separate thing from the benchmark. It makes sense that the size of the batch is the thing that determines how much memory is being used, because as we're iterating and yielding back each batch, we're ideally only holding one batch at a time. That seems to be how MoreLINQ is able to get this to work, which is pretty impressive, and it's pretty consistent. My implementation, which was kind of naive, isn't quite like that; it's pretty close, but I think there are some other optimizations I could do. So again, on the allocation side, not a lot of big differences. But let's jump back over to the Mean column, because I think we'll probably still see that MoreLINQ beats me in every single situation, which is what I would expect, but
I do want to show by how much, because I think that's changing a whole bunch. Again, we'll focus on the red here, and let me get this deselected. Starting with the first one, we can see the MoreLINQ batching: I had, what's this, 30,000 ns and 22,000 ns, while they were only at about three and a half. But that was a batch size of one, and I had them at a batch size of 100, remember. So if I toggle this off, we can see that when it's truly a batch size of one, their performance is worse, but it's still better than what I had. It's better by about half the runtime, instead of being way, way, way better, so that's still pretty good. This next one kind of works the opposite way: their batch size, even though it says 1,000, is technically 100, because it was hardcoded. So this was showing that they were basically on par with what I had, slightly better, but basically on par. If I toggle this off, we can see that when it's actually a batch size of 1,000, they are significantly more effective. That's a huge win for them compared to my code.

Jumping to the next one: they would have been at a batch size of 100, because it was hardcoded, and this is now a batch size of 100,000. A huge improvement; you can see it's not even really changing the runtime between the previous one and the next one coming up. If we put this layer back on, it was still way faster than what I had when it was fixed at 100 for the batch size, but when you do it properly, with the batch size increasing proportionally, it saves even more runtime. So a huge improvement just by doing this properly, which is pretty interesting to see.

Next up, down here we can see the MoreLINQ batching that says a batch size of one; this would have been 100, and this is the new result where it's truly using a batch size of one. It's a pretty poor running time, but still about half the runtime that I had with the manual streaming approach. Putting this back on, the blurry version is when it used a batch size of 100. A batch size of 100 is going to be more effective than a batch size of one, because it has to yield back fewer times; it's doing two orders of magnitude fewer of those yields. So we would expect this to be better, and that's why, when I take this off, the performance here looks worse; it's still better than my implementation, though. In the middle here, we basically end up with roughly the same runtime; if I put this back on, all these numbers are quite comparable, so nothing too interesting. I did want to call out this MoreLINQ batching here, this next red highlight: this one is a quarter of the runtime when it's truly using the correct batch size. The original results, where it was still using 100, were still better than my code, but I just wanted to show that there is a bit of a discrepancy.

So this one is a quarter, and that one was basically comparable. When you have a smaller batch size, you have to yield back more times, so there is likely going to be more overhead for that, more copying for each batch. It was better than my code, but it was still not as performant as doing it the proper way with the right batch size. The same type of pattern shows up in the results that follow, so there's not really a huge value-add for me to step through all of it, but now that your eyes are in this bottom area in the middle of the screen, I'll toggle it again: same idea, when we're truly using the right batch size, going from one to 100 and the other way, you see this scale properly.

So overall, this is a miss on my part, and I do apologize for that. It's a small bug, and I probably should have taken a little bit more time to spot it in the code. So again, my apologies, and huge thanks to the viewer. If you ever see this kind of thing in my code or in my videos, please just let me know. Like I said, I'm not trying to trick you; I do spend a lot of time putting content together, but it's easy to make a mistake. My intention is never to mislead, and I would much rather follow up with a video and spend the extra time doing it. Keep in mind that it costs me to make these videos; I don't make money off of them, maybe someday, so when I make a follow-up video, I'm just trying to do the right thing. Hopefully this was helpful. I will have, and this is going to be weird to say, more MoreLINQ benchmarks and comparisons coming up, so there are bound to be other errors in the future, unfortunately. I do appreciate you keeping your eyes open and trying to call this stuff out. Thanks so much for watching, and stay tuned for the next set of videos as we go through MoreLINQ and check out these benchmarks. I'll see you next time.
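The streaming behavior discussed in the transcript, holding only one batch in memory at a time via yield return, can be sketched as a naive extension method. This is an illustrative sketch of the general technique, not MoreLINQ's actual implementation and not the exact code benchmarked in the video:

```csharp
using System;
using System.Collections.Generic;

public static class BatchExtensions
{
    // Naive streaming batcher: buffers at most one batch at a time,
    // so allocation scales with the batch size, not the collection size.
    public static IEnumerable<T[]> BatchNaive<T>(
        this IEnumerable<T> source, int size)
    {
        if (size < 1)
            throw new ArgumentOutOfRangeException(nameof(size));

        var bucket = new List<T>(size);
        foreach (var item in source)
        {
            bucket.Add(item);
            if (bucket.Count == size)
            {
                // Yield a copy, then reuse the bucket for the next batch.
                yield return bucket.ToArray();
                bucket.Clear();
            }
        }

        // Don't forget the final partial batch.
        if (bucket.Count > 0)
            yield return bucket.ToArray();
    }
}
```

This also illustrates why a batch size of one is the worst case in the mean-time results: the iterator yields once per element and copies a one-element array each time, while a larger batch size yields far fewer times for the same input.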
Frequently Asked Questions
What mistake did you make in the previous benchmarks?
In the previous benchmarks, I mistakenly hardcoded the batch size to 100 instead of using a variable batch size parameter. This caused the results to be inaccurate, as it didn't reflect the intended variability in batch sizes across the benchmarks.
How did the viewer help you with your mistake?
A viewer pointed out the error in the comments, and I am very thankful for that. They brought it to my attention in a constructive way, which allowed me to correct the mistake and provide accurate information in this follow-up video.
What can viewers expect in future videos regarding MoreLINQ?
In future videos, I plan to continue exploring MoreLINQ benchmarks and comparisons. I appreciate viewers keeping an eye out for any potential errors, as I aim to provide accurate and helpful content, and I welcome any feedback to improve my work.

These FAQs were generated by AI from the video transcript.
