Duplicate Emails SQL Challenge - PostgreSQL Solution

1. Introduction

This post explores how to handle a common data query problem in PostgreSQL – identifying duplicate email addresses. This issue is frequently encountered in database management and data cleaning processes.

Problem

The goal is to write a PostgreSQL query that finds all duplicate email addresses in a given table.

Database Table:

Person Table:
Id Email
1 alice@example.com
2 bob@example.com
3 alice@example.com

3. PostgreSQL Solution

SELECT Email
FROM Person
GROUP BY Email
HAVING COUNT(Email) > 1;

Output:

Email
alice@example.com

Explanation:

1. The query aggregates the records in the Person table by Email.

2. It utilizes the GROUP BY clause to group the records.

3. The HAVING clause is then used to filter out the groups, keeping only those with more than one occurrence of the same email.

4. The result is a list of emails that are present multiple times in the table.


Comments