Best Approach - Eliminate Duplicates from Array

A good challenge for all JavaScript developers is eliminating duplicates from an array. Many approaches for removing duplicate or repeating elements from an array in JavaScript are available across the web. There is no straightforward way to do this, so it requires some programming. There is no built-in array method that removes duplicates :(. At the time of writing this post, I could not find a post that compares all these methods in terms of performance, so it is difficult to say which method outperforms the others.

 

We made an honest attempt in this post to identify the best approach to eliminate duplicates from an array in JavaScript. We tested the approaches available on the web against a standard array containing a random collection of 575000 elements and measured how each elimination method performs. Let us first define a way to fill our array with 575000 random elements.


<html>
<head>
</head>
<body>
<script language='Javascript'>
var myarray=new Array();
var rand_no;
for(var i=0;i<575000;i++) {
rand_no = Math.random();
rand_no = rand_no * 100;
rand_no = Math.ceil(rand_no);
myarray.push(rand_no);
}
alert(myarray.length); // Only for our confirmation that the array is indeed populated with 575000 elements.
</script>
</body>
</html>


So, we are ready to give it a go. Our array now holds 575000 elements, all random integers between 0 and 100. The objective is to get a filtered array with all the duplicates removed, and to find the best approach for doing so. After comparing the methods, we found one that is really fast. The complete code is provided below:

<html>
<head>
</head>
<body>
<div id='startime'></div>
<div id='endtime'></div>
<script language='Javascript'>
var myarray=new Array();
var rand_no;
for(var i=0;i<575000;i++) {
rand_no = Math.random();
rand_no = rand_no * 100;
rand_no = Math.ceil(rand_no);
myarray.push(rand_no);
}
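var startTime = Number(new Date()); // timestamp in milliseconds before the run
document.write(startTime);
document.write('<br>');
var resarray=eliminateDuplicates(myarray);
var endTime = Number(new Date()); // timestamp in milliseconds after the run
document.write(endTime);
document.write('<br>');
document.write('Elapsed: ' + (endTime - startTime) + ' ms');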
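// Removes duplicates by using each array value as an object key.
// Note: object keys are strings, so the returned values are strings (e.g. '42').
function eliminateDuplicates(arr) {
  var i,
      len=arr.length,
      out=[],
      obj={};

  // A repeated value overwrites the same key, so each value is stored only once.
  for (i=0;i<len;i++) {
    obj[arr[i]]=0;
  }
  // Collect the distinct keys into the result array.
  for (i in obj) {
    out.push(i);
  }
  return out;
}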
</script>
</body>
</html>
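
To make the idea behind eliminateDuplicates easier to see outside the HTML page, here is a minimal sketch of the same object-key technique on a small array. The name eliminateDuplicatesAsNumbers is hypothetical and not part of the original code. It assumes the input contains only numbers, because object keys are always strings and have to be converted back.

```javascript
// Same technique as eliminateDuplicates above: each value becomes an object key,
// and an object can hold a given key only once.
function eliminateDuplicatesAsNumbers(arr) { // hypothetical helper, assumes numeric input
  var seen = {};
  var out = [];
  var i;
  for (i = 0; i < arr.length; i++) {
    seen[arr[i]] = 0; // a repeated value overwrites the same key
  }
  for (var key in seen) {
    // Skip anything inherited from the prototype chain.
    if (Object.prototype.hasOwnProperty.call(seen, key)) {
      out.push(Number(key)); // keys are strings, so convert them back to numbers
    }
  }
  return out;
}

var sample = [3, 1, 3, 2, 1];
var unique = eliminateDuplicatesAsNumbers(sample);
console.log(unique); // the three distinct values 1, 2 and 3
```

The Number conversion is only appropriate for numeric arrays. For arrays of strings, the original eliminateDuplicates can be used as it is.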

The approximate time taken by this method to process an array of 575000 elements is 350 milliseconds. The performance of this method for other array sizes is documented below:

Array size (elements)    Approximate time (ms)
575000                   350
675000                   380
775000                   420
800000                   450
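
The figures above could be reproduced with a small timing harness along the following lines. This is only a sketch: makeRandomArray and timeRun are hypothetical helper names that do not appear in the original code, and the timings will differ from machine to machine and from browser to browser.

```javascript
// Hypothetical helper: builds an array of `size` random integers,
// using the same Math.ceil(Math.random() * 100) formula as the post.
function makeRandomArray(size) {
  var arr = [];
  for (var i = 0; i < size; i++) {
    arr.push(Math.ceil(Math.random() * 100));
  }
  return arr;
}

// Same function as in the complete listing above.
function eliminateDuplicates(arr) {
  var i, len = arr.length, out = [], obj = {};
  for (i = 0; i < len; i++) {
    obj[arr[i]] = 0;
  }
  for (i in obj) {
    out.push(i);
  }
  return out;
}

// Hypothetical helper: times a single run for a given array size.
function timeRun(size) {
  var data = makeRandomArray(size);   // array creation is not included in the timing
  var start = Date.now();
  var result = eliminateDuplicates(data);
  var elapsed = Date.now() - start;   // elapsed time in milliseconds
  return { size: size, uniqueCount: result.length, elapsedMs: elapsed };
}

var sizes = [575000, 675000, 775000, 800000];
var results = sizes.map(timeRun);
results.forEach(function (r) {
  console.log(r.size + ' elements: ' + r.elapsedMs + ' ms, ' + r.uniqueCount + ' unique values');
});
```

Timing each size in the same run keeps the comparison fair, although a single measurement per size is still noisy. Repeating each run a few times and averaging would give steadier numbers.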

This method is very good and is worth using. [The method is based on input from the Stack Overflow website.] If you can find any other approach, please post it in the comments.
