Duplicate values are a very common problem in datasets, particularly when data is compiled from two different sources. In these circumstances a VLOOKUP is a useful way to compare lists and check for duplicate information.
However, it is not the method that I would recommend because it has a couple of disadvantages compared to the simpler method that I will show you now. The main limitation of the VLOOKUP method is that it is really only useful for removing duplicates when comparing lists from two different sources of data; it is not suitable when a single set of data contains duplicates within it.
Additionally, the alternative method can remove duplicate rows when the information is spread across several cells within a row. So in this article we will show you how to remove duplicates using the more efficient method, and we also include the instructions for removing duplicates using a VLOOKUP.
Method 1 (What We Recommend)
Method 1 is very quick and simple and uses a function that is on the ribbon of Excel, under the Data tab in the Data Tools section.
In the example below we have a ledger with a list of people that owe money. Within this ledger Hayley Pearson appears three times. However, one of the entries is unique because it has a different value associated with it.
To remove the duplicates the following steps are required:
Step 1 Highlight the data
Step 2 Select Data on the ribbon and then Remove Duplicates under the Data Tools section of the ribbon
Step 3 Select the columns that you wish to base the removal of duplicates on and click OK. In this example I want to select both the name and the amount owing, but it is also possible to select the name only. In that case two of the three rows containing Hayley Pearson would be removed, along with the values associated with those entries.
Once the data has been removed you receive a message indicating the number of duplicates that have been found and removed. An example of this message is provided below.
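As a separate illustration outside Excel, the logic behind Method 1 can be sketched in a few lines of Python. This is only a minimal sketch: the function name `remove_duplicates`, the column names and the amounts are hypothetical, and it assumes (as Excel does) that the first occurrence of each combination is the one that is kept.

```python
def remove_duplicates(rows, key_columns):
    """Keep the first row for each distinct combination of key_columns."""
    seen = set()
    kept = []
    for row in rows:
        # Build the comparison key from the selected columns only.
        key = tuple(row[col] for col in key_columns)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


# Hypothetical ledger: the amounts are invented for illustration.
ledger = [
    {"Name": "Hayley Pearson", "Amount": 100},
    {"Name": "John Smith", "Amount": 50},
    {"Name": "Hayley Pearson", "Amount": 100},
    {"Name": "Hayley Pearson", "Amount": 250},
]

# Selecting both columns removes only the exact repeat.
by_both = remove_duplicates(ledger, ["Name", "Amount"])

# Selecting the name only removes two of the three Hayley Pearson rows.
by_name = remove_duplicates(ledger, ["Name"])

print(len(ledger) - len(by_both), "duplicate removed,", len(by_both), "rows remain")
print(len(ledger) - len(by_name), "duplicates removed,", len(by_name), "rows remain")
```

Comparing the two results shows why the choice of columns in Step 3 matters: selecting the name only also discards the entry with the different amount.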
Method 2 (Using A Vlookup)
To illustrate how a VLOOKUP can be used to find duplicates, we have an example with two lists that may contain some of the same values, shown in the image below.
Step 1 Create a VLOOKUP where list 1 is the table array and list 2 contains the lookup values. Copy the formula down to ensure each value in list 2 is tested.
Step 2 If a value in list 2 is not in list 1 then the VLOOKUP will return a #N/A error.
Step 3 Apply a filter to the data by selecting row one in this example and then selecting Home >>> Sort & Filter >>> Filter
Step 4 Click on the filter arrow, uncheck all data except #N/A and click OK. This will provide a list of the values in list 2 that are not duplicated in list 1.
Step 5 If the data is required for use in another area of the spreadsheet it can be copied at this point and pasted into the area required. The entries hidden by the filter will not be transferred when the data is copied, giving a unique list of values.
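The steps above can also be sketched in Python to show what the lookup and the filter are doing. In Excel the Step 1 formula would typically take a form such as =VLOOKUP(B2, $A$2:$A$10, 1, FALSE), where the cell references are hypothetical. In the sketch below the helper `vlookup_exact` and the names in both lists are likewise invented for illustration, and the comparison is case-sensitive, whereas Excel ignores case when matching text.

```python
NA = "#N/A"  # stands in for the error Excel shows when no match is found


def vlookup_exact(lookup_value, table_array):
    """Return the matching value from table_array, or "#N/A" if there is none."""
    for value in table_array:
        if value == lookup_value:
            return value
    return NA


# Hypothetical lists: list 1 is the table array, list 2 holds the lookup values.
list_1 = ["Anna Lee", "Ben Ortiz", "Cara Diaz"]
list_2 = ["Ben Ortiz", "Dan Wu", "Eve Park"]

# Steps 1 and 2: test every value in list 2 against list 1.
results = [(name, vlookup_exact(name, list_1)) for name in list_2]

# Steps 3 and 4: keep only the rows where the lookup returned #N/A.
not_in_list_1 = [name for name, found in results if found == NA]

print(not_in_list_1)
```

The filtered result contains only the list 2 entries that have no match in list 1, which is the set of rows left visible after Step 4.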
Notes About This Method
As mentioned earlier in the article, there are several disadvantages associated with using a VLOOKUP to create a unique list of data with no duplicate values in it, which are as follows:
- A larger number of steps is required to complete the process.
- A VLOOKUP cannot remove duplicates when a single list has duplicate values within it. It is only suitable for comparing lists.
- A VLOOKUP can only evaluate unique entries based on a single point of data. With Method 1 it is possible to assess multiple cells simultaneously.
- A VLOOKUP does not detect duplicate values that already exist within the list you are comparing, so they can carry through to the result, which was the case in the example used above.
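The last point can be illustrated with a short Python sketch. The names are hypothetical; the point is only that a value repeated within list 2 survives the #N/A filter, so a further de-duplication pass (the equivalent of Method 1) is still needed.

```python
# Hypothetical lists: "Dan Wu" appears twice within list 2.
list_1 = ["Anna Lee", "Ben Ortiz"]
list_2 = ["Ben Ortiz", "Dan Wu", "Dan Wu"]

# Equivalent of filtering on #N/A: keep list 2 values that are absent from list 1.
values_in_list_1 = set(list_1)
filtered = [name for name in list_2 if name not in values_in_list_1]

# The repeated entry is still present after the filter.
print(filtered)

# dict.fromkeys keeps the first occurrence of each value and preserves order.
deduplicated = list(dict.fromkeys(filtered))
print(deduplicated)
```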