How to Find Duplicates Between Two Excel Files
Step-by-step guide to finding duplicate data across two spreadsheets using different methods.
Introduction
Finding duplicate data between two Excel files is a common challenge. Whether you're merging customer lists, comparing inventory databases, or identifying duplicate transactions, knowing how to efficiently find duplicates can save hours of manual work.
In this comprehensive guide, we'll walk through multiple methods to find duplicates between two Excel files, from simple online tools to advanced Excel formulas. Choose the method that best fits your skill level and needs.
Method 1: Using DiffSheets (Fastest & Easiest)
DiffSheets is an online tool that automatically highlights all differences and duplicates between two Excel files.
How to find duplicates with DiffSheets:
- Go to diffsheets.com
- Upload your first Excel file (File A)
- Upload your second Excel file (File B)
- Click "Find Difference"
- Review the results:
- Green rows = only in File B (potential duplicates if they match File A)
- Yellow rows = modified between files
- Unchanged rows = exact duplicates
Step-by-step walkthrough:
Step 1: Choose your comparison mode
- Use "Key Column" mode if you have a unique identifier (like ID, Email, or SKU)
- Use "Position" mode for row-by-row comparison
- Use "LCS" mode for automatic intelligent alignment
Step 2: Identify duplicate patterns
- Look for unchanged rows (these are exact duplicates)
- Check yellow rows for partial duplicates (same ID but different values)
Pros:
- No Excel formulas needed
- Visual, color-coded results
- Works with files of any size
- 100% private - data never leaves your browser
- Handles XLSX, XLS, CSV formats
- Free
Cons:
- Requires internet to load (but processing is local)
- View-only (can't edit within the tool)
Best for: Most users. This is the fastest way to find duplicates without any Excel expertise.
Method 2: Using VLOOKUP Formula
VLOOKUP is Excel's classic lookup function, perfect for finding if values from one file exist in another.
How to use VLOOKUP to find duplicates:
- Open both Excel files
- In File A, add a new column called "Duplicate Check"
- Enter this formula:
=IF(ISERROR(VLOOKUP(A2,[File2.xlsx]Sheet1!$A:$A,1,FALSE)),"Unique","Duplicate")
Replace:
A2with your key column reference[File2.xlsx]Sheet1!$A:$Awith your second file reference
What this does:
- Searches for each value from File A in File B
- Returns "Duplicate" if found
- Returns "Unique" if not found
Example:
File A (Customers.xlsx):
| Name | |
|---|---|
| john@email.com | John |
| jane@email.com | Jane |
File B (Subscribers.xlsx):
| john@email.com |
Formula result: john@email.com = "Duplicate", jane@email.com = "Unique"
Pros:
- Works within Excel
- No additional tools needed
- Can be customized
Cons:
- Requires Excel knowledge
- Complex for multiple columns
- Slow with large datasets (10,000+ rows)
- Must have both files open simultaneously
Best for: Excel users comfortable with formulas who need to work within Excel.
Method 3: Using COUNTIF Formula
COUNTIF counts how many times a value appears, making it perfect for finding duplicates.
How to use COUNTIF:
- Combine both files into one spreadsheet (copy File B below File A)
- Add a helper column
- Use this formula:
=COUNTIF($A$2:$A$1000,A2)
If the result is greater than 1, it's a duplicate.
Better approach - comparing two separate ranges:
=COUNTIF([File2.xlsx]Sheet1!$A:$A,A2)>0
This returns TRUE if the value exists in File 2.
Finding duplicates only:
Add a filter: =IF(COUNTIF([File2.xlsx]Sheet1!$A:$A,A2)>0,"Duplicate","")
Pros:
- Simpler than VLOOKUP
- Easy to understand
- Good for counting frequency
Cons:
- Requires combining data or external references
- Not suitable for multi-column matching
- Performance issues with large datasets
Best for: Simple duplicate detection on a single column.
Method 4: Using Conditional Formatting
Excel's conditional formatting can visually highlight duplicates across two ranges.
How to use conditional formatting:
- Copy data from File B and paste it below File A data
- Select all data (from both files)
- Go to Home > Conditional Formatting > Highlight Cells Rules > Duplicate Values
- Choose formatting (e.g., red fill)
- Click OK
For comparing two separate sheets:
- In File A, select your data range
- Go to Conditional Formatting > New Rule > Use a formula
- Enter:
=COUNTIF(Sheet2!$A:$A,A1)>0 - Set formatting (e.g., yellow fill)
- Apply
Pros:
- Visual highlighting
- No formulas in cells
- Quick setup
- Immediate visual feedback
Cons:
- Requires combining data for best results
- Only visual - doesn't extract duplicates
- Can be slow with very large datasets
- Formatting can be lost when copying
Best for: Visual identification of duplicates when you need to see them highlighted in context.
Method 5: Using Power Query
Power Query (Get & Transform Data) is a powerful Excel feature for advanced data manipulation.
How to find duplicates with Power Query:
- Load both files into Power Query:
- Data > Get Data > From File > From Workbook
- For each file, click "Transform Data"
- Merge queries:
- Home > Merge Queries
- Select matching columns
- Choose "Left Anti" join to find values in File A NOT in File B
- Or "Inner" join to find duplicates
- Load results back to Excel
Finding exact duplicates:
- Merge with "Inner Join" on all columns
- Results show only rows that exist in both files
Finding unique values:
- Use "Left Anti Join"
- Results show rows only in File A
Pros:
- Very powerful for complex scenarios
- Can handle millions of rows
- Refreshable queries
- No formulas needed in cells
Cons:
- Steeper learning curve
- Only available in Excel 2016+ / Microsoft 365
- Overkill for simple tasks
Best for: Advanced users dealing with large datasets or complex matching logic.
Comparison Table: Which Method Should You Use?
| Method | Difficulty | Speed | Large Files | Multi-Column | Visual Output |
|---|---|---|---|---|---|
| DiffSheets | Easy | Fast | Excellent | Yes | Excellent |
| VLOOKUP | Medium | Medium | Poor | Limited | Poor |
| COUNTIF | Easy | Medium | Poor | No | Poor |
| Conditional Formatting | Easy | Medium | Medium | No | Excellent |
| Power Query | Hard | Fast | Excellent | Yes | Medium |
Practical Example: Finding Duplicate Customers
Scenario: You have two customer lists and want to find duplicates based on email address.
Using DiffSheets (Recommended):
- Upload both files to diffsheets.com
- Select "Key Column" mode
- Choose "Email" as the key column
- Review results - unchanged rows are exact duplicates
Using VLOOKUP: In CustomerList1.xlsx, add column: =IF(ISERROR(VLOOKUP(A2,[CustomerList2.xlsx]Sheet1!$A:$A,1,0)),"New","Duplicate")
Using COUNTIF: =IF(COUNTIF([CustomerList2.xlsx]Sheet1!$A:$A,A2)>0,"Duplicate","Unique")
Result interpretation:
- "Duplicate" = Email exists in both files
- "Unique" or "New" = Email only in File 1
Advanced Tips for Finding Duplicates
Tip 1: Case-sensitive matching By default, Excel comparisons are case-insensitive. For case-sensitive: =IF(COUNTIF(EXACT(A2,Sheet2!$A:$A))>0,"Duplicate","Unique")
Tip 2: Fuzzy matching For finding similar (not exact) duplicates, consider DiffSheets' LCS algorithm or use fuzzy lookup add-ins.
Tip 3: Multi-column matching To match on multiple columns (e.g., First Name + Last Name):
VLOOKUP approach: Create a helper column: =A2&B2 Then VLOOKUP on the helper column
DiffSheets approach: Use key column on concatenated fields or compare all columns
Tip 4: Removing duplicates After identifying duplicates:
- Excel: Data > Remove Duplicates
- Filter for "Duplicate" and delete rows
- Use Power Query to filter them out
Conclusion: Best Method for Finding Duplicates
Choose DiffSheets if:
- You want the fastest solution
- You have large files
- You need visual comparison
- Privacy is important
- You're not an Excel expert
Choose VLOOKUP if:
- You're comfortable with Excel formulas
- You need to work within Excel
- You want to keep the results in your spreadsheet
Choose COUNTIF if:
- You need a simple count of duplicates
- You're matching a single column only
Choose Conditional Formatting if:
- You want visual highlighting
- You need to see duplicates in context
- You're working with smaller datasets
Choose Power Query if:
- You're an advanced Excel user
- You have very large datasets
- You need repeatable, refreshable queries
Our recommendation: For 90% of users, DiffSheets is the best choice. It's free, fast, private, and requires zero Excel knowledge. Visit diffsheets.com to try it now.
Ready to compare?
Compare your spreadsheets in seconds. Free, private, no sign-up required.
Try it now