In this article, I am going to explain how to delete duplicate records in an SQL server. I'll also explain how you can delete/remove duplicate records from the table with Common Table Expression in the SQL server. And also show you how you can find duplicate records from the table in the SQL server as well as what is CTE(Common Table Expression) and how to use CTE(Common Table Expression) in SQL Server.
In my previous article, I explained How to Remove Duplicate Records From a Table Using UNION And EXCEPT Operator in SQL Server with an example.
What is CTE(Common Table Expression) in SQL Server?
In SQL Server CTE(Common Table Expression) is introduced in SQL Server 2005, and it is used to create a recursive query. It is a temporary named result set that can be referenced within a SELECT, INSERT, UPDATE, or DELETE statement as well as it is preferred to use as Subquery/View.
In SQL server view is a virtual table based on the result set of an SQL statement and Subquery is a query within the query and we also can say it is an inner/nested query.
Syntex
;With Alias_Tablename (column1,column2...,columnsN) AS ( Your Query )
Types of Common Table Expressions (CTE) in SQL Server
In SQL Servers Common Table Expressions (CTE) have two different types recursive and non-recursive. Where recursive CTE can be explained in three different parts Anchor Query, Separator, and Recursive Query.
What is Anchor Query?
Anchor Query will give the base data for the CTE and this is the first statement that is executed.
What Is Separator?
This is the middle part wherein we generally use an EXCEPT, UNION, UNION ALL, and a few more operators.
What is Recursive Query?
Recursive Query is the Common Table Expression(CTE) query that refers to the same CTE by recursion, and this is the main part of recursive CTE.
Requirement
- Create a temporary with some dummy records.
- Insert duplicate records in the created temporary table.
- Remove duplicate records from the created table using CTE(Common Table Expressions).
Implementation
As per requirement first, we will create/declare a temporary table with the name "tblEmployees" in the SQL server.
DECLARE @tblEmployees AS TABLE ( EmpId int NOT NULL PRIMARY KEY, CompanyId int NOT NULL, FirstName varchar(50) NOT NULL, LastName varchar(50) NOT NULL, Designation varchar(50) NOT NULL, Country varchar(50) NOT NULL, Date_Of_Birth DATETIME NOT NULL )
Now, we will insert some dummy duplicate records on the table for demonstration purposes.
INSERT INTO @tblEmployees VALUES (1,10798, 'Nikunj', 'Satasiya', 'Sr.Software Engineer','India','1996-04-08 00:00:00.000') INSERT INTO @tblEmployees VALUES (2,10798, 'Nikunj', 'Satasiya', 'Sr.Software Engineer','India','1996-04-08 00:00:00.000') INSERT INTO @tblEmployees VALUES (3,10798, 'Hiren', 'Dobariya', 'Sr.Software Engineer','India','1996-03-10 00:00:00.000') INSERT INTO @tblEmployees VALUES (4,10798, 'Ronak', 'Rabadiya', 'Sr.Software Engineer','India','1993-04-09 00:00:00.000') INSERT INTO @tblEmployees VALUES (5,10798, 'Hiren', 'Dobariya', 'Sr.Software Engineer','India','1996-03-10 00:00:00.000') INSERT INTO @tblEmployees VALUES (6,10798, 'Nikunj', 'Satasiya', 'Sr.Software Engineer','India','1996-04-08 00:00:00.000') INSERT INTO @tblEmployees VALUES (7,10798, 'Sneha', 'Patel', 'Support Executive','India','1995-02-12 00:00:00.000') INSERT INTO @tblEmployees VALUES (8,10798, 'Sheetal', 'Patel', 'Web Designer','India','1995-02-12 00:00:00.000') INSERT INTO @tblEmployees VALUES (9,10798, 'Ronak', 'Rabadiya', 'Sr.Software Engineer','India','1993-04-09 00:00:00.000') INSERT INTO @tblEmployees VALUES (10,10798, 'Ronak', 'Patel', 'Sr.Software Engineer','India','1995-11-12 00:00:00.000')
If you execute the select statement and retrieve the result set from the @tblEmployees table then you found there is duplication in the @tblEmployees table.
SELECT * FROM @tblEmployees ORDER BY EmpId
Result Set
SQL Server Duplicate Records |
As per our requirement, we will remove these duplicate entries/records from the @tblEmployees table with CTE(Common Table Expression).
;WITH MyEmployees (FirstName,DuplicateCount) AS ( -- Generate RowNumber based on Employee FirstName, LastName Wise SELECT FirstName,ROW_NUMBER() OVER(PARTITION by FirstName, LastName ORDER BY EmpId) AS DuplicateCount FROM @tblEmployees ) --Remove Duplicate Records From MyEmployees (Common Table Expressions) DELETE FROM MyEmployees WHERE DuplicateCount > 1
If you analyzed the above SQL statement then I have declared one Common Table Expressions(CTE) with the name "MyEmployees". As per the syntax of CTE, I have written a query where I have generated a row number partition by Employee First and Last Name order by Employee Identity and give an alias to the column of row number as DuplicateCount.
Finally, I deleted these rows from the CTE table "MyEmployees" where DuplicateCount is greater than 1.
Now, If you execute the select statement and retrieve the result set from the @tblEmployees table, you will find no duplication in the @tblEmployees table and all the duplicate records/rows are removed from your base table @tblEmployees.
SELECT * FROM @tblEmployees ORDER BY EmpId
- In SQL Server Common Table Expression(CTE) improves your code readability.
- Common Table Expression(CTE) provides recursive programming.
- CTE is making the maintenance of complex queries easy.
- It makes code maintainability easier.
- It can be used in stored procedures, functions, triggers as well as even views also.
- CTE can be used as a Table or a View and we can perform SELECT, INSERT, UPDATE or DELETE operations on the table.
Summary
SQL Server CTE(Common Table Expression) creates a recursive query. It is a temporary named result set that can be referenced within a SELECT, INSERT, UPDATE, or DELETE statement as well as it is preferred to use as Subquery/View and it improves the code readability, maintenance of complex queries, code maintainability and provides recursive programming.