Javascript uniq?
May 14th, 2007
protoype
Protoype is a cool javascript framework very usefull for dynamic webapp creation; however, the uniq algorithm in prototype.js is rather slow for large sets, infact it is O(n2) (quadratic).
This happens because uniq does not expects a sorted array and thus all elements are compared to all elements in the array (the array.include does a foreach for each value). And secondly, the array.concat function ‘will yield a new, intermediary array every time it encounters a new value (a value that wasn’t already in the result array)’.
1 2 3 4 5 |
uniq: function() { return this.inject([], function(array, value) { return array.include(value) ? array : array.concat([value]); }); }, |
stl
If you look at the c++ stl (standard template libary), this problem is solved in O(n) (linear) time, so a solution is near: convert a little bit of c++ to javascript!
(Note: there is a jsstl available on the web, but it does not implement the algorithm part of the stl, only the containers & iterators …)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
// TEMPLATE FUNCTION unique template<class _FwdIt> inline _FwdIt unique(_FwdIt _First, _FwdIt _Last) { // remove each matching previous for (_FwdIt _Firstb; (_Firstb = _First) != _Last && ++_First != _Last; ) if (*_Firstb == *_First) { // copy down for (; ++_First != _Last; ) if (!(*_Firstb == *_First)) *++_Firstb = *_First; return (++_Firstb); } return (_Last); } |
(The above is Copyright© 1992-2002 by P.J. Plauger.)
solution
This translates fairly forward to:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 |
// Removes redundant elements from the array function unique_inplace(an_array) { var first = 0; var last = an_array.length; for(var firstb; (firstb = first) != last && ++first != last; ) { if(an_array[firstb] == an_array[first]) { for(; ++first != last; ) { if (!(an_array[firstb] == an_array[first])) { an_array[++firstb] = an_array[first]; } } ++firstb; an_array.length = firstb; return; } an_array.length = firstb; } |
And this does not take forever on an array of more than 10 elements. It works on the concept of ranges: since the input array is sorted, all similar items will be grouped together; so if we look for the first of a group of similar, we can then skip the others. In the end will we will have an array without the doubles.
One last caveat: you will need to sort your array before you use this function!
Sorry, comments are closed for this article.