BrandGhost

Combining Collections in C# - LINQ Zip and MoreLINQ Zip Methods

What's one of the most common ways we combine collections? Something like appending or concatenating, which is pretty typical. But what about when you want to combine two or more sets of data item by item, stepping through the collections in parallel rather than one after the other? This operation is known as Zip in LINQ, and in this video we'll dive into how it works and what we have access to in MoreLINQ!

Transcript

When we're working with collections, it's pretty common that we want to be able to add things to them or concatenate them together to combine them. But there are situations where we want to combine them in a different way, maybe where we want to zip them together.

Hi, my name is Nick Cosentino and I'm a principal software engineering manager at Microsoft. In this video, we're going to be talking about the LINQ Zip method along with the MoreLINQ ZipShortest and ZipLongest methods. We're going to go through the very basics here, and then I'm going to show you how we can try to build our own, just so we can see what it looks like before we start using the shortcut of LINQ. If you stay right to the end of this video, we'll be diving into the benchmarks for comparing all of these different approaches. A quick reminder that if you find this kind of content useful, remember to subscribe to the channel and check out that pinned comment for my courses on Dometrain. With that said, let's jump over to Visual Studio and start working with some collections.

On my screen here I have two collections, source1 and source2. I will briefly talk about some of these Enumerable methods that we have going on here, but if you're not familiar with IEnumerable, iterators, and things like that, I do have plenty of videos available on my YouTube channel. You can just search for enumerable, IEnumerable, or collections and you'll see a whole bunch of really basic tutorials. The first collection that we're going to have up here is going to be from 1 to 5, and then I'm going to have a second collection that goes from the numbers 1 to 10, but what I'm also doing here is reversing it. I am going to be dealing with some strings here, so I'm just converting these numbers into their string representation, and that'll make a little bit more sense as we go ahead with some of these examples. The other thing you'll notice is that I am making these enumerables into full-on arrays, so I am materializing these enumerables into collections.

Okay. When we're talking about being able to zip things together, the idea is that, instead of putting one collection followed by the other, we want to combine the collections into something new. That means that for each one of the items in a collection, we want to take the first one from the first collection and the first one from the second collection and make them into something new, and then we would go on to the second one from the first collection and the second one from the second collection and put them together. Instead of having one for loop for the first one and one for loop for the second one to combine them and step through them, we actually want to be able to step through them at the same time. That's conceptually what we're trying to achieve here.

The other thing that you'll notice is that I purposefully made these different lengths, right? One is five items, the other one is 10 items. This is because it's an interesting constraint or edge case that we have to think about: if we're trying to combine these collections and they are of different lengths, what would the expected behavior be?

Before we continue on, this is just a quick reminder that I do have courses available on Dometrain. If you're just getting started in your programming journey and you want to learn C#, you can head over to Dometrain. I have a getting started in C# course; it's approximately 5 hours of content, taking you from absolutely no experience to being able to program in C#. After that, I have my deep dive course, which will take you to the next level with another 6 hours of content so that you can start building basic applications. Head over to Dometrain and check it out. Let's head back to the video.

When we go to look at the LINQ implementation versus the MoreLINQ implementation, LINQ will basically take the shortest of the two. So if you had 1 to 5 and 1 to 10, the resulting collection will only be five items. However, with MoreLINQ we get ZipShortest and ZipLongest, and that means that we have the option to pick the behavior that we want.

But I want to start off by doing a bit of a naive implementation, just to show you how we might approach this. If we know that we have two collections here and we have two different lengths, probably what we want to consider is that we're going to take the minimum between the two of these, and then we can start stepping through them. So to start things off, we would do something like a count where we take the minimum of the two different collection sizes, so source1.Length and source2.Length right after. Now, once we execute lines 1 through 15 here, we have the shortest length.

From there, it's going to be relatively simple. We're going to step through both of these at the same time with a for loop. So we would do for int i = 0, i being less than the count, and then increment i. From there, we'd be able to grab the item at that index from each collection.

But we haven't yet talked about what the resulting item is supposed to be. We have in this case a string and a string; what should we be doing with them? When we're dealing with the Zip function, this extension method that we get from LINQ (and MoreLINQ as well), you're able to pass in your own selector, and that gives you the option to take the first element and the second element and decide how you want to combine them. In my case, I'm just going to make a tuple (however you'd like to pronounce it), and then we're going to add that into a list. So I'm going to put that list up here. This list, called results, has an item type that is a tuple of string and string. Then I'm going to add into the results. We'll see if Copilot can do it right away for us. It kind of did, there we go. So, Copilot, thank you. We're going to make a new tuple in here: we're going to take the first source and take the item at position i, and we're going to take the second source and take the item at position i as well.

Now, at the end of running this for loop, we will have a collection that is populated with the results all the way up to the shortest of the two. In this case it's five, so we won't even touch positions 6 to 10 in the second collection.

But there's something that's different about this approach that we just walked through. This is a very naive, almost pseudocode approach to how we might do Zip, but the big difference is that I had to make a materialized collection. I created this collection ahead of time and then started adding in the elements. When we're dealing with things like LINQ, we are not creating a new collection unless we materialize it. So unless you write ToArray or ToList, like I have here on line 11 and line 6, when we're working with LINQ these are still just IEnumerables, and it has this pipelining effect or streaming approach, whereas with this example I created a whole other collection ahead of time. So that is a very big difference. Again, if you're not familiar with iterators, please do a search on my channel; I do have videos explaining yield return, iterators, and IEnumerable. So this is one big fundamental difference, but this should demonstrate the concept of what zipping is attempting to do.

With this example, we were also working with a little bit of extra information that, when we're in LINQ, we don't necessarily have the luxury of working with, and that is the collection being materialized. Both of the collections that we had were full-on arrays. That means that we can index into them, and it also means that we have the length of them. When we're dealing with IEnumerables in LINQ, we don't have that luxury. If we have an IEnumerable, we don't know the count, and we also don't have the ability to index into an arbitrary point in an IEnumerable. So while this was a bit of a naive approach, it also meant that we were able to take advantage of some extra things that LINQ is not able to rely on when dealing strictly with IEnumerables.

Jumping down to the bottom of this file, I already created an extension method that deals purely with IEnumerables. Again, this is just to demonstrate a very naive approach for how we can step through two IEnumerables at the same time. If you're not familiar with extension methods, or you're not familiar with IEnumerables and generic types, this might look a little bit confusing, because you'll see a lot of things like T1, T2, and TResult; there's a selector function that has three type parameters. This might look pretty confusing, but I'll explain what's going on here.

Like the method that we just created, we're going to be dealing with two IEnumerables, and you can see that I have source1 and source2. The this keyword just allows us to create an extension method, which means that when we go to call this, we're able to say ManualZipShortest after the name of one of the collections. So it would look like source1.ManualZipShortest; then we're able to pass in source2, and then we can pass in a selector function. source2 is just going to be some other IEnumerable. Again, we're dealing with IEnumerable for this example and not arrays or lists, so they're not necessarily materialized collections.

This Func here called selector: the way that this works is it's going to take in one element that is of type T1 and another element that is of type T2, and it must return a TResult. In the example that we made before this, we were taking a string for T1 and a string for T2, and then we had a tuple of two strings as the return type. So hopefully that explains what this syntax looks like, although I do admit that it's a little bit gnarly if you're not familiar with type parameters.

The next part that we're going to look at is a very different way that we're looping through these two things. If you recall, I mentioned that we got to cheat a little bit earlier when we were dealing with arrays, and that's because we knew the length and we could index into the array. We simply can't do that if we're dealing with IEnumerables. We don't know if it's backed by an array or a list, or if it's truly just some connection to a database or to some web service. We don't know that, and we can't tell unless we were to dive into the code, so we need to assume that we can't do that.

IEnumerable allows us to access an enumerator. You'll see on lines 93 and 94, if I highlight both of them here, we're getting an enumerator for each of these. You'll notice that both of these on lines 93 and 94 also have a using statement out front, and that's because they are IDisposable, so we need to make sure that we're cleaning up after ourselves, as in general with anything in C# that's marked as IDisposable.

Now, the while loop that we have here is different from the for loop that we had, but it's achieving the same purpose. In the first example, we knew the count and we said, hey look, let's just loop until we reach that count. In this case, all that we're saying is loop as long as we're not at the end of either of these two IEnumerables. We don't need to know the count ahead of time to find the minimum between the two. We can simply keep trying to walk forward, and as soon as either of them is not able to walk forward, we stop the iteration.

You'll notice that I did mark this as ManualZipShortest, and the example that we made at the very beginning was also an implementation that did the shortest. It might be a little bit easier to see here, but you'll notice this && symbol, so we're saying walk through both of these while we can move next in one and we can move next in the other. If we wanted to do a manual zip longest, we would be able to do something like this and essentially say as long as we can move forward in one or the other. Now, it's going to be a little bit more complicated to do a zip longest, because when we go to ask for the current item, we need to have a default value that we're zipping alongside. We aren't going to do that for now, but I just wanted to highlight that line 95 in particular is one spot you might want to consider if you were doing zip longest instead.

Now if we look at lines 97 through 100 here, we are calling our selector callback and passing in the current item from the first collection and the current item from the second collection. The result of executing this selector is going to be however we decide to combine these two things. You'll notice that I do have a yield return here. yield return combined with an IEnumerable gives us what's called an iterator, and this is another part that's very different from the method we made at the beginning of this video, because in this case we are yield returning and not making a whole collection ahead of time. So this truly acts like a pipeline, yielding one thing back at a time, whereas the first example made an entire collection before we could even start to work with it. That's going to be one of the fundamental differences. I recognize that if you're not familiar with iterators and yield return, this kind of stuff is a little bit weird, but I do recommend you play around with it, and like I said, there are other videos on my channel that you can check out to understand this in more depth.

Okay, so we were able to build a naive implementation up here. I'm going to comment this out, and we are able to go ahead and use our other approach that has an extension method. But it's about time that we look at the built-in stuff that we have, so I'm going to introduce the LINQ variation and the MoreLINQ variation as well.

The example program that we will be running is going to start off with the built-in LINQ Zip method, and all of these examples are going to have a very similar syntax. You'll see source1 zipping with source2, and this syntax, again, I admit looks a little bit nasty if you're not familiar with anonymous delegates, but this is essentially going to be the item from collection one and the item from collection two. Think about stepping through that for loop: each time that we step through, we're going to have an item one and an item two, and the right-hand side of this arrow is just how we're going to combine them. So this is going to make a tuple that has x and y, in this case the string from the first one and the string from the second one combined into a tuple. Arguably this syntax looks kind of ridiculous, because it looks like we have the same thing on the left- and the right-hand side of this anonymous delegate. We're also going to be putting that into an array, and then we're going to write this information out to the console.

The next two methods are ZipShortest and ZipLongest, right here, and these come from MoreLINQ. You'll notice that the way that we call these is literally the exact same except for the method names, and we're just going to print this information out to the console as well. All of these will be combining the two collections by making a tuple of string and string. Finally, there's ManualZipShortest, which is the extension method that I walked you through. This also has the exact same calling style; the difference is the name of the method that we're calling.

ManualZipShortest should behave, in theory, just like ZipShortest, which should behave, in theory, just like the Zip method that is built in. ZipLongest is going to be the outlier here, and that's because it will take the longest of the two collections and backfill the empty spots with a default value. Hopefully that makes sense, and what we're going to do now is run these and look through the results.

Starting things off with the Zip method that is built into LINQ in C#, we can see that we have the numbers 1 through 5, and then 10 through 6 on the other side. I did purposefully reverse these so we could see differences in the numbers; that way you wouldn't see one beside one all the way through and be confused that maybe I just passed in the same collection. Truly, it's working backwards in the second collection. So when we zip them together, this is going to be the first element of the first collection and the first element of the second collection, and so on and so forth: one counts up, one counts down as we go through. But you'll notice that because the shortest collection had five elements, we only get up to five.

If we go to ZipLongest, and this is going to be from MoreLINQ, it's the first MoreLINQ method that we're looking at in this video. For the output, you'll notice that we do the same thing, but once we hit six (technically that would be here), we don't have a sixth item in the first collection, so it will put in the default value. The default value of a string is null, so it's printing out null here. Then you'll notice that we keep going all the way through until we've iterated through the second collection, which is longer. It had 10 elements in it, so we get the numbers 10 through 1 printed out on the right-hand side, and the left-hand side only goes up to five.

If we look at the final two examples, these behave exactly the same as we saw with the built-in LINQ Zip method.

Just a quick recap: the way that we're able to combine two collections, or in this case IEnumerables, by zipping them is calling MoreLINQ's ZipShortest or ZipLongest if you want to be using MoreLINQ, or we just have the built-in Zip method. I did also illustrate how we can make our own, as well as a naive implementation that wasn't streaming these things together but making a whole collection ahead of time. Now, I did mention that if you stay to the end of this video we could check out some benchmarks, so when that video is ready, you can check it out right here. Thanks, and I'll see you next time.
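
The naive, non-streaming version described in the transcript might look roughly like the sketch below. The on-screen code is not part of this page, so the variable names (source1, source2, count, results) and the use of value tuples are assumptions based on how the walkthrough describes them.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed setup: "1".."5" and "10".."1" as materialized string arrays.
string[] source1 = Enumerable.Range(1, 5).Select(n => n.ToString()).ToArray();
string[] source2 = Enumerable.Range(1, 10).Reverse().Select(n => n.ToString()).ToArray();

// Arrays expose Length and an indexer, which a plain IEnumerable<T> does not.
int count = Math.Min(source1.Length, source2.Length);

// The whole result list is built up front instead of being streamed.
List<(string, string)> results = new();
for (int i = 0; i < count; i++)
{
    results.Add((source1[i], source2[i]));
}

foreach (var (left, right) in results)
{
    Console.WriteLine($"{left} - {right}");
}
```

Because the loop stops at the smaller of the two lengths, items 6 through 10 of the second array are never read.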

Frequently Asked Questions

What is the purpose of the LINQ Zip method in C#?

The LINQ Zip method is used to combine two collections by merging their elements together. It takes elements from both collections in pairs and creates a new collection based on a specified selector function. This allows us to process items from both collections simultaneously, rather than one after the other.
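
As a minimal sketch of that answer, the built-in Zip call with a tuple-producing selector could look like this. The collection contents mirror the ones described in the video; the variable names are assumptions.

```csharp
using System;
using System.Linq;

// Assumed setup: "1".."5" and "10".."1" as string arrays.
string[] source1 = Enumerable.Range(1, 5).Select(n => n.ToString()).ToArray();
string[] source2 = Enumerable.Range(1, 10).Reverse().Select(n => n.ToString()).ToArray();

// The selector decides how each pair is combined; here it builds a tuple.
// Zip itself is lazy; ToArray is what materializes the result.
(string, string)[] zipped = source1.Zip(source2, (x, y) => (x, y)).ToArray();

foreach (var (left, right) in zipped)
{
    Console.WriteLine($"{left} - {right}");
}
// Prints five lines, from "1 - 10" down to "5 - 6".
```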

What happens when the collections being zipped have different lengths?

When zipping collections of different lengths using the built-in LINQ Zip method, the resulting collection will only contain elements up to the length of the shorter collection. However, with MoreLINQ, we have options like ZipShortest and ZipLongest, which allow us to choose whether we want to stop at the shortest length or continue to the longest, filling in default values for missing elements.
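
To illustrate the "continue to the longest" behavior without depending on the MoreLINQ package, here is a hand-rolled sketch. ManualZipLongest is a hypothetical name used only for this example; it is not MoreLINQ's implementation, and it simply follows the idea from the video of looping while either side can still move forward and substituting a default value for the side that has run out.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

string[] source1 = Enumerable.Range(1, 5).Select(n => n.ToString()).ToArray();
string[] source2 = Enumerable.Range(1, 10).Reverse().Select(n => n.ToString()).ToArray();

// The default for string is null, so the selector renders it as the text "null".
var longest = source1
    .ManualZipLongest(source2, (x, y) => (x ?? "null", y ?? "null"))
    .ToArray();

foreach (var (left, right) in longest)
{
    Console.WriteLine($"{left} - {right}");
}
// Prints ten lines: "1 - 10" through "5 - 6", then "null - 5" through "null - 1".

public static class ZipLongestSketch
{
    // Hypothetical helper: keeps going until BOTH sequences are exhausted.
    public static IEnumerable<TResult> ManualZipLongest<T1, T2, TResult>(
        this IEnumerable<T1> first,
        IEnumerable<T2> second,
        Func<T1?, T2?, TResult> selector)
    {
        using IEnumerator<T1> e1 = first.GetEnumerator();
        using IEnumerator<T2> e2 = second.GetEnumerator();

        bool has1 = true, has2 = true;
        while (true)
        {
            // Only advance a side that has not already run out.
            has1 = has1 && e1.MoveNext();
            has2 = has2 && e2.MoveNext();
            if (!has1 && !has2) yield break;

            // Backfill the exhausted side with default(T).
            yield return selector(
                has1 ? e1.Current : default,
                has2 ? e2.Current : default);
        }
    }
}
```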

Can you explain the difference between using LINQ Zip and creating a manual zip implementation?

Using LINQ Zip is straightforward because it handles the iteration and pairing for you. A manual implementation requires you to manage that logic yourself, which is more complex but shows how zipping works under the hood. My first manual version indexes into arrays and builds the whole result list up front, so it does not stream. My extension-method version works purely on IEnumerable with enumerators and yield return, so it streams results one at a time, like LINQ does.
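
A streaming manual version along the lines described in the video might be sketched as follows. The method name ManualZipShortest follows the transcript; the exact signature, the class name, and the CountForever helper are assumptions made for this example. Zipping against an endless sequence is used here only to show that nothing is materialized ahead of time.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical endless source: only a streaming zip can consume this safely.
static IEnumerable<int> CountForever()
{
    for (int i = 1; ; i++) yield return i;
}

string[] letters = { "a", "b", "c" };

var pairs = CountForever()
    .ManualZipShortest(letters, (n, s) => $"{n}{s}")
    .ToList();

Console.WriteLine(string.Join(", ", pairs)); // 1a, 2b, 3c

public static class ZipShortestSketch
{
    public static IEnumerable<TResult> ManualZipShortest<T1, T2, TResult>(
        this IEnumerable<T1> source1,
        IEnumerable<T2> source2,
        Func<T1, T2, TResult> selector)
    {
        // Enumerators are IDisposable, hence the using declarations.
        using IEnumerator<T1> e1 = source1.GetEnumerator();
        using IEnumerator<T2> e2 = source2.GetEnumerator();

        // && stops as soon as either sequence cannot move forward.
        while (e1.MoveNext() && e2.MoveNext())
        {
            yield return selector(e1.Current, e2.Current);
        }
    }
}
```

Swapping the array-based for loop in here would not work, since an endless IEnumerable has no length to take the minimum of.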

These FAQs were generated by AI from the video transcript.