Delete Duplicate Emails SQL Challenge - MySQL Solution

1. Introduction

This blog post addresses a common database management task in MySQL: deleting duplicate email entries while retaining the entry with the smallest id. This operation is essential for maintaining data integrity and avoiding redundancies.

Problem

We are tasked with writing a MySQL query to delete all duplicate email entries in the Person table, ensuring that only the entry with the smallest id for each email remains.

Database Table:

Person Table:
id email
1 john@example.com
2 bob@example.com
3 john@example.com

3. MySQL Solution

DELETE p1 FROM Person p1
JOIN Person p2
ON p1.email = p2.email AND p1.id > p2.id;

Output:

id email
1 john@example.com
2 bob@example.com

Explanation:

1. The query uses a self-join on the Person table.

2. It joins two instances of the table (p1 and p2) on the email field.

3. The DELETE statement removes rows from p1 where there exists a row in p2 with the same email but a smaller id.

4. This ensures that only the entry with the smallest id for each email is retained, effectively removing duplicates.


Comments