How to Find Duplicates Between Two Excel Files

Step-by-step guide to finding duplicate data across two spreadsheets using different methods.

2025-01-166 min read

Introduction

Finding duplicate data between two Excel files is a common challenge. Whether you're merging customer lists, comparing inventory databases, or identifying duplicate transactions, knowing how to efficiently find duplicates can save hours of manual work.

In this comprehensive guide, we'll walk through multiple methods to find duplicates between two Excel files, from simple online tools to advanced Excel formulas. Choose the method that best fits your skill level and needs.

Method 1: Using DiffSheets (Fastest & Easiest)

DiffSheets is an online tool that automatically highlights all differences and duplicates between two Excel files.

How to find duplicates with DiffSheets:

  1. Go to diffsheets.com
  2. Upload your first Excel file (File A)
  3. Upload your second Excel file (File B)
  4. Click "Find Difference"
  5. Review the results:
  6. Green rows = only in File B (potential duplicates if they match File A)
  7. Yellow rows = modified between files
  8. Unchanged rows = exact duplicates

Step-by-step walkthrough:

Step 1: Choose your comparison mode

  • Use "Key Column" mode if you have a unique identifier (like ID, Email, or SKU)
  • Use "Position" mode for row-by-row comparison
  • Use "LCS" mode for automatic intelligent alignment

Step 2: Identify duplicate patterns

  • Look for unchanged rows (these are exact duplicates)
  • Check yellow rows for partial duplicates (same ID but different values)

Pros:

  • No Excel formulas needed
  • Visual, color-coded results
  • Works with files of any size
  • 100% private - data never leaves your browser
  • Handles XLSX, XLS, CSV formats
  • Free

Cons:

  • Requires internet to load (but processing is local)
  • View-only (can't edit within the tool)

Best for: Most users. This is the fastest way to find duplicates without any Excel expertise.

Method 2: Using VLOOKUP Formula

VLOOKUP is Excel's classic lookup function, perfect for finding if values from one file exist in another.

How to use VLOOKUP to find duplicates:

  1. Open both Excel files
  2. In File A, add a new column called "Duplicate Check"
  3. Enter this formula:

=IF(ISERROR(VLOOKUP(A2,[File2.xlsx]Sheet1!$A:$A,1,FALSE)),"Unique","Duplicate")

Replace:

  • A2 with your key column reference
  • [File2.xlsx]Sheet1!$A:$A with your second file reference

What this does:

  • Searches for each value from File A in File B
  • Returns "Duplicate" if found
  • Returns "Unique" if not found

Example:

File A (Customers.xlsx):

EmailName
john@email.comJohn
jane@email.comJane

File B (Subscribers.xlsx):

Email
john@email.com

Formula result: john@email.com = "Duplicate", jane@email.com = "Unique"

Pros:

  • Works within Excel
  • No additional tools needed
  • Can be customized

Cons:

  • Requires Excel knowledge
  • Complex for multiple columns
  • Slow with large datasets (10,000+ rows)
  • Must have both files open simultaneously

Best for: Excel users comfortable with formulas who need to work within Excel.

Method 3: Using COUNTIF Formula

COUNTIF counts how many times a value appears, making it perfect for finding duplicates.

How to use COUNTIF:

  1. Combine both files into one spreadsheet (copy File B below File A)
  2. Add a helper column
  3. Use this formula:

=COUNTIF($A$2:$A$1000,A2)

If the result is greater than 1, it's a duplicate.

Better approach - comparing two separate ranges:

=COUNTIF([File2.xlsx]Sheet1!$A:$A,A2)>0

This returns TRUE if the value exists in File 2.

Finding duplicates only:

Add a filter: =IF(COUNTIF([File2.xlsx]Sheet1!$A:$A,A2)>0,"Duplicate","")

Pros:

  • Simpler than VLOOKUP
  • Easy to understand
  • Good for counting frequency

Cons:

  • Requires combining data or external references
  • Not suitable for multi-column matching
  • Performance issues with large datasets

Best for: Simple duplicate detection on a single column.

Method 4: Using Conditional Formatting

Excel's conditional formatting can visually highlight duplicates across two ranges.

How to use conditional formatting:

  1. Copy data from File B and paste it below File A data
  2. Select all data (from both files)
  3. Go to Home > Conditional Formatting > Highlight Cells Rules > Duplicate Values
  4. Choose formatting (e.g., red fill)
  5. Click OK

For comparing two separate sheets:

  1. In File A, select your data range
  2. Go to Conditional Formatting > New Rule > Use a formula
  3. Enter: =COUNTIF(Sheet2!$A:$A,A1)>0
  4. Set formatting (e.g., yellow fill)
  5. Apply

Pros:

  • Visual highlighting
  • No formulas in cells
  • Quick setup
  • Immediate visual feedback

Cons:

  • Requires combining data for best results
  • Only visual - doesn't extract duplicates
  • Can be slow with very large datasets
  • Formatting can be lost when copying

Best for: Visual identification of duplicates when you need to see them highlighted in context.

Method 5: Using Power Query

Power Query (Get & Transform Data) is a powerful Excel feature for advanced data manipulation.

How to find duplicates with Power Query:

  1. Load both files into Power Query:
  2. Data > Get Data > From File > From Workbook
  3. For each file, click "Transform Data"
  4. Merge queries:
  5. Home > Merge Queries
  6. Select matching columns
  7. Choose "Left Anti" join to find values in File A NOT in File B
  8. Or "Inner" join to find duplicates
  9. Load results back to Excel

Finding exact duplicates:

  1. Merge with "Inner Join" on all columns
  2. Results show only rows that exist in both files

Finding unique values:

  1. Use "Left Anti Join"
  2. Results show rows only in File A

Pros:

  • Very powerful for complex scenarios
  • Can handle millions of rows
  • Refreshable queries
  • No formulas needed in cells

Cons:

  • Steeper learning curve
  • Only available in Excel 2016+ / Microsoft 365
  • Overkill for simple tasks

Best for: Advanced users dealing with large datasets or complex matching logic.

Comparison Table: Which Method Should You Use?

MethodDifficultySpeedLarge FilesMulti-ColumnVisual Output
DiffSheetsEasyFastExcellentYesExcellent
VLOOKUPMediumMediumPoorLimitedPoor
COUNTIFEasyMediumPoorNoPoor
Conditional FormattingEasyMediumMediumNoExcellent
Power QueryHardFastExcellentYesMedium

Practical Example: Finding Duplicate Customers

Scenario: You have two customer lists and want to find duplicates based on email address.

Using DiffSheets (Recommended):

  1. Upload both files to diffsheets.com
  2. Select "Key Column" mode
  3. Choose "Email" as the key column
  4. Review results - unchanged rows are exact duplicates

Using VLOOKUP: In CustomerList1.xlsx, add column: =IF(ISERROR(VLOOKUP(A2,[CustomerList2.xlsx]Sheet1!$A:$A,1,0)),"New","Duplicate")

Using COUNTIF: =IF(COUNTIF([CustomerList2.xlsx]Sheet1!$A:$A,A2)>0,"Duplicate","Unique")

Result interpretation:

  • "Duplicate" = Email exists in both files
  • "Unique" or "New" = Email only in File 1

Advanced Tips for Finding Duplicates

Tip 1: Case-sensitive matching By default, Excel comparisons are case-insensitive. For case-sensitive: =IF(COUNTIF(EXACT(A2,Sheet2!$A:$A))>0,"Duplicate","Unique")

Tip 2: Fuzzy matching For finding similar (not exact) duplicates, consider DiffSheets' LCS algorithm or use fuzzy lookup add-ins.

Tip 3: Multi-column matching To match on multiple columns (e.g., First Name + Last Name):

VLOOKUP approach: Create a helper column: =A2&B2 Then VLOOKUP on the helper column

DiffSheets approach: Use key column on concatenated fields or compare all columns

Tip 4: Removing duplicates After identifying duplicates:

  • Excel: Data > Remove Duplicates
  • Filter for "Duplicate" and delete rows
  • Use Power Query to filter them out

Conclusion: Best Method for Finding Duplicates

Choose DiffSheets if:

  • You want the fastest solution
  • You have large files
  • You need visual comparison
  • Privacy is important
  • You're not an Excel expert

Choose VLOOKUP if:

  • You're comfortable with Excel formulas
  • You need to work within Excel
  • You want to keep the results in your spreadsheet

Choose COUNTIF if:

  • You need a simple count of duplicates
  • You're matching a single column only

Choose Conditional Formatting if:

  • You want visual highlighting
  • You need to see duplicates in context
  • You're working with smaller datasets

Choose Power Query if:

  • You're an advanced Excel user
  • You have very large datasets
  • You need repeatable, refreshable queries

Our recommendation: For 90% of users, DiffSheets is the best choice. It's free, fast, private, and requires zero Excel knowledge. Visit diffsheets.com to try it now.

Ready to compare?

Compare your spreadsheets in seconds. Free, private, no sign-up required.

Try it now
How to Find Duplicates Between Two Excel Files | DiffSheets