Difference between revisions of "Oracle/Find duplicates"
(New page: Find and remove duplicate rows from a table One of the most important features of Oracle is the ability to detect and remove duplicate rows from a table. While many Oracle DBA place prima...) |
|||
| (6 intermediate revisions by the same user not shown) | |||
| Line 5: | Line 5: | ||
The most effective way to detect duplicate rows is to join the table against itself as shown below. | The most effective way to detect duplicate rows is to join the table against itself as shown below. | ||
< | <syntaxhighlight lang="sql"> | ||
SELECT | SELECT * | ||
FROM my_table A | |||
WHERE rowid > | |||
(SELECT min(rowid) | |||
FROM | FROM my_table B | ||
WHERE B.key1 = A.key1 | |||
WHERE | AND B.key2 = A.key2 | ||
AND B.key3 = A.key3); | |||
</syntaxhighlight> | |||
</ | |||
Please note that you must specify all of the columns that make the row a duplicate in the SQL where clause. Once you have detected the duplicate rows, you may modify the SQL statement to remove the duplicates as shown below: | Please note that you must specify all of the columns that make the row a duplicate in the SQL where clause. Once you have detected the duplicate rows, you may modify the SQL statement to remove the duplicates as shown below: | ||
< | <syntaxhighlight lang="sql"> | ||
DELETE FROM | DELETE FROM table_name A | ||
WHERE A.rowid > ANY | |||
WHERE | (SELECT B.rowid | ||
FROM table_name B | |||
WHERE A.col1 = B.col1 | |||
AND A.col2 = B.col2); | |||
</syntaxhighlight> | |||
; | |||
</ | |||
You can also detect and delete duplicate rows using Oracle analytic functions: | You can also detect and delete duplicate rows using Oracle analytic functions: | ||
< | <syntaxhighlight lang="sql"> | ||
DELETE FROM customer | |||
WHERE rowid IN | |||
(SELECT rowid FROM | |||
(SELECT rowid | |||
( | , row_number() | ||
rowid, | OVER (PARTITION BY custnbr ORDER BY custnbr) dup | ||
FROM customer) | |||
WHERE dup > 1); | |||
</syntaxhighlight> | |||
</ | |||
As we see, there are several ways to detect and delete duplicate rows from Oracle tables. | As we see, there are several ways to detect and delete duplicate rows from Oracle tables. | ||
| Line 62: | Line 44: | ||
Match null values. Instead of the following: | Match null values. Instead of the following: | ||
< | <syntaxhighlight lang="sql"> | ||
DELETE FROM | DELETE FROM table_name A | ||
WHERE A.rowid > ANY | |||
WHERE | (SELECT B.rowid | ||
FROM table_name B | |||
WHERE A.col1 = B.col1 | |||
AND A.col2 = B.col2); | |||
</syntaxhighlight> | |||
; | |||
</ | |||
I needed to do the following to get rid of all the dupes: | I needed to do the following to get rid of all the dupes: | ||
< | <syntaxhighlight lang="sql"> | ||
DELETE FROM | DELETE FROM table_name A | ||
WHERE A.rowid > ANY | |||
WHERE | (SELECT B.rowid | ||
FROM table_name B | |||
WHERE (A.col1 = B.col1 OR (A.col1 is null AND B.col1 is null)) | |||
AND (A.col2 = B.col2 OR (A.col2 is null AND B.col2 is null))); | |||
</syntaxhighlight> | |||
[[Category:Oracle]] | |||
; | |||
</ | |||
Latest revision as of 11:11, 15 November 2022
Find and remove duplicate rows from a table
One of the most important features of Oracle is the ability to detect and remove duplicate rows from a table. While many Oracle DBA place primary key referential integrity constraints on a table, many shops do not use RI because they need the flexibility.
The most effective way to detect duplicate rows is to join the table against itself as shown below.
SELECT *
FROM my_table A
WHERE rowid >
(SELECT min(rowid)
FROM my_table B
WHERE B.key1 = A.key1
AND B.key2 = A.key2
AND B.key3 = A.key3);
Please note that you must specify all of the columns that make the row a duplicate in the SQL where clause. Once you have detected the duplicate rows, you may modify the SQL statement to remove the duplicates as shown below:
DELETE FROM table_name A
WHERE A.rowid > ANY
(SELECT B.rowid
FROM table_name B
WHERE A.col1 = B.col1
AND A.col2 = B.col2);
You can also detect and delete duplicate rows using Oracle analytic functions:
DELETE FROM customer
WHERE rowid IN
(SELECT rowid FROM
(SELECT rowid
, row_number()
OVER (PARTITION BY custnbr ORDER BY custnbr) dup
FROM customer)
WHERE dup > 1);
As we see, there are several ways to detect and delete duplicate rows from Oracle tables.
Match null values. Instead of the following:
DELETE FROM table_name A
WHERE A.rowid > ANY
(SELECT B.rowid
FROM table_name B
WHERE A.col1 = B.col1
AND A.col2 = B.col2);
I needed to do the following to get rid of all the dupes:
DELETE FROM table_name A
WHERE A.rowid > ANY
(SELECT B.rowid
FROM table_name B
WHERE (A.col1 = B.col1 OR (A.col1 is null AND B.col1 is null))
AND (A.col2 = B.col2 OR (A.col2 is null AND B.col2 is null)));