June 25th in JavaScript by .

Quick Tip : Find Matching Items In Two Arrays With JavaScript


I was asked an interesting question a couple of days ago. The question was as follows:

If you have two arrays of say, numbers, and you wanted to list only the numbers that exist in both. How would you go about doing this in JavaScript?

The answer was not immediately evident but, I had a basic idea of how one might go about this so, I jumped in and started coding. jQuery has a very useful utility function called inArray that, as expected, allows you to test whether a specific element exists in an array.  I did not want to include the entire jQuery library literally for just one function so, I decided to have a look at the jQuery source and the implementation code for the inArray function. Below is the function:

inArray: function( elem, array ) {
	if ( indexOf ) {
		return indexOf.call( array, elem );
	}

	for ( var i = 0, length = array.length; i < length; i++ ) {
		if ( array[ i ] === elem ) {
			return i;
		}
	}
	return -1;
}

The indexOf variable used in the first if statement get’s assigned the value, which will be a function, of Array.prototype.indexOf earlier on in the jQuery source. indexOf is a new function that has been added to the Array object in ECMAScript 5 and implemented in JavaScript 1.6. It will either return the index at which it found the item in the array or, -1 if the item was not found.

Browser support for this function is very good but, Internet Explorer 8 and below does not offer support for this hence the fallback for loop if the if condition fails. With this piece of the puzzle in place, we can proceed to write the matching function:

var arrayMatcher = function(firstArray, secondArray) {
	var results = [],
	arrayLength = firstArray.length;

	for(i = 0; i < arrayLength; i++) {
		if(inArray(firstArray[i], secondArray) > -1) {
			results.push(firstArray[i]);
		}
	}
	return results;
};

Pretty darn simple, let’s step through the code. The arrayMatcher function first of all takes two arrays as parameters, note that the arrays does not have to be of equal length, just ensure that you pass it the longer of the two as the first parameter. Next we create two variables, the results array that will contain the matched items as well as storing the length of the first array in the arrayLength variable.

Storing the length of the array and then using the resulting variable in your for loops is a general best practice to optimize performance, however small an optimization it might be. Reason for this is that the length of the array does not have to be calculated every time we go through the loop.

Inside our for loop we test if the current item in the first array exists in the second using the inArray function. If it does, we go ahead and push it onto the results array. Once our loop completes, we simply return the result to the calling function. And that, in a nutshell, is that. I hope you find this little utility useful. You can grab the code from my Utils repository on Github.

Image courtesy: Comrade_S

  • Anonymous

    What about getting length of the minimal-length array?

    If, for example, firstArray has 100 elements, and secondArray has 2 elements.

    `for loop` will be looping through 100 elements of firstArray, however looping through secondArray is much better in this case.

    I’d make like this:
    var results = [],
    arrayLength = Math.min (firstArray.length, secondArray.length),
    minArray, maxArray;
    if (arrayLength == firstArray.length) {
    minArray = firstArray;
    maxArray = secondArray;
    } else {
    minArray = secondArray;
    maxArray = firstArray;
    }

    for(i = 0; i -1) {
    results.push(minArray[i]);
    }
    }
    return results;

    • Anonymous

      Good idea, one can always simply loop through the shortest of the two arrays. Thanks for the comment

      • Pete

        You talk about the array length optimization a bit in this, but implement it a second, longer way to the way the jQuery sample was already implementing it.

        You may want to confirm that both methods actually implement it, otherwise some people may think the jQuery way is wrong :)

        • Anonymous

          Hey Pete,

          Good point. Thing is, both ways is perfectly fine as long as you use one of them. Thing is, unless you specifically tick the box to allow this, both  JSLint and JSHint complains about having the var initialization inside the for loop so, this is why I declare mine outside.

          Thanks for your comment.

  • http://profiles.google.com/emmecinque Mike McNally

    This is what’s called an “n-squared” algorithm, which means that the amount of time it takes is (roughly) proportional to the square of the size of the arrays.  For two arrays with 100 elements, this code will require about 5000 comparisons to be run; if they’re 500 elements long, it’ll require about 125,000.

    There are better ways to do this.  In JavaScript, the fastest way might be to use the values from one of the arrays to build an object, with the values as property names and something like the boolean constant “true” as the value.  Then the loop can leverage the efficient property lookup code inside the runtime system.

    Another alternative would be to sort both arrays before starting.  Sorting can be done pretty efficiently, with far less work than an amount proportional to the square of the list length.  Once the arrays are sorted, the matches can be found with a single pass through the lists.

    • Anonymous

      Thanks for the suggestions Mike, will definitely look into them.

    • Anonymous

      Mike, are you here referring to the inArray function?

      • http://profiles.google.com/emmecinque Mike McNally

        Well “inArray” performs a simple linear search for the target value.  That means that if the target is in the array, you’ll average out a finding it after searching through half the array (of course actual data might skew that, but it’s a good enough rough estimate).  Of course, for targets that are not in the array, you won’t find that out without searching the entire thing.

        Now, if you’re looking for a whole array full of targets inside another array, you’ll be doing that work a bunch of times.  That’s how the computational cost estimate is derived – by thinking in rough terms about the work that’s going on. It’s also a start on figuring out ways to save time and do the work more efficiently.

        So, “yes”, I was referring to the “inArray” function :-)

        • Anonymous

          The inArray function I borrowed from jQuery as is commented in the code on Github. I had a look at sorting the arrays before I pass it of to inArray, thing is, because it is a utilty, you will not know whether the two arrays are numeric, string or a combination.

          If you do a non-numeric sort on an array of numbers, you can actually end up messing the array order up more then fixing it: http://goo.gl/1WSbK

          Still, would like to do some tests to see the performance difference one can gain from sorting etc. if the format of the data is known. 

Performance Optimization WordPress Plugins by W3 EDGE