Difference between revisions of "Oracle/Find duplicates"
m |
|||
| (3 intermediate revisions by the same user not shown) | |||
| Line 5: | Line 5: | ||
The most effective way to detect duplicate rows is to join the table against itself as shown below. | The most effective way to detect duplicate rows is to join the table against itself as shown below. | ||
< | <syntaxhighlight lang="sql"> | ||
SELECT | SELECT * | ||
FROM my_table A | |||
FROM | |||
WHERE rowid > | WHERE rowid > | ||
(SELECT min(rowid) | (SELECT min(rowid) | ||
FROM | FROM my_table B | ||
WHERE B.key1 = A.key1 | WHERE B.key1 = A.key1 | ||
AND B.key2 = A.key2 | AND B.key2 = A.key2 | ||
AND B.key3 = A.key3); | AND B.key3 = A.key3); | ||
</ | </syntaxhighlight> | ||
Please note that you must specify all of the columns that make the row a duplicate in the SQL where clause. Once you have detected the duplicate rows, you may modify the SQL statement to remove the duplicates as shown below: | Please note that you must specify all of the columns that make the row a duplicate in the SQL where clause. Once you have detected the duplicate rows, you may modify the SQL statement to remove the duplicates as shown below: | ||
< | <syntaxhighlight lang="sql"> | ||
DELETE FROM table_name A | DELETE FROM table_name A | ||
WHERE A.rowid > ANY | WHERE A.rowid > ANY | ||
| Line 27: | Line 25: | ||
WHERE A.col1 = B.col1 | WHERE A.col1 = B.col1 | ||
AND A.col2 = B.col2); | AND A.col2 = B.col2); | ||
</ | </syntaxhighlight> | ||
You can also detect and delete duplicate rows using Oracle analytic functions: | You can also detect and delete duplicate rows using Oracle analytic functions: | ||
< | <syntaxhighlight lang="sql"> | ||
DELETE FROM customer | DELETE FROM customer | ||
WHERE rowid IN | WHERE rowid IN | ||
| Line 40: | Line 38: | ||
FROM customer) | FROM customer) | ||
WHERE dup > 1); | WHERE dup > 1); | ||
</ | </syntaxhighlight> | ||
As we see, there are several ways to detect and delete duplicate rows from Oracle tables. | As we see, there are several ways to detect and delete duplicate rows from Oracle tables. | ||
| Line 46: | Line 44: | ||
Match null values. Instead of the following: | Match null values. Instead of the following: | ||
< | <syntaxhighlight lang="sql"> | ||
DELETE FROM table_name A | DELETE FROM table_name A | ||
WHERE A.rowid > ANY | WHERE A.rowid > ANY | ||
| Line 53: | Line 51: | ||
WHERE A.col1 = B.col1 | WHERE A.col1 = B.col1 | ||
AND A.col2 = B.col2); | AND A.col2 = B.col2); | ||
</ | </syntaxhighlight> | ||
I needed to do the following to get rid of all the dupes: | I needed to do the following to get rid of all the dupes: | ||
< | <syntaxhighlight lang="sql"> | ||
DELETE FROM table_name A | DELETE FROM table_name A | ||
WHERE A.rowid > ANY | WHERE A.rowid > ANY | ||
| Line 64: | Line 62: | ||
WHERE (A.col1 = B.col1 OR (A.col1 is null AND B.col1 is null)) | WHERE (A.col1 = B.col1 OR (A.col1 is null AND B.col1 is null)) | ||
AND (A.col2 = B.col2 OR (A.col2 is null AND B.col2 is null))); | AND (A.col2 = B.col2 OR (A.col2 is null AND B.col2 is null))); | ||
</ | </syntaxhighlight> | ||
[[Category:Oracle]] | |||
Latest revision as of 11:11, 15 November 2022
Find and remove duplicate rows from a table
One of the most important features of Oracle is the ability to detect and remove duplicate rows from a table. While many Oracle DBA place primary key referential integrity constraints on a table, many shops do not use RI because they need the flexibility.
The most effective way to detect duplicate rows is to join the table against itself as shown below.
SELECT *
FROM my_table A
WHERE rowid >
(SELECT min(rowid)
FROM my_table B
WHERE B.key1 = A.key1
AND B.key2 = A.key2
AND B.key3 = A.key3);
Please note that you must specify all of the columns that make the row a duplicate in the SQL where clause. Once you have detected the duplicate rows, you may modify the SQL statement to remove the duplicates as shown below:
DELETE FROM table_name A
WHERE A.rowid > ANY
(SELECT B.rowid
FROM table_name B
WHERE A.col1 = B.col1
AND A.col2 = B.col2);
You can also detect and delete duplicate rows using Oracle analytic functions:
DELETE FROM customer
WHERE rowid IN
(SELECT rowid FROM
(SELECT rowid
, row_number()
OVER (PARTITION BY custnbr ORDER BY custnbr) dup
FROM customer)
WHERE dup > 1);
As we see, there are several ways to detect and delete duplicate rows from Oracle tables.
Match null values. Instead of the following:
DELETE FROM table_name A
WHERE A.rowid > ANY
(SELECT B.rowid
FROM table_name B
WHERE A.col1 = B.col1
AND A.col2 = B.col2);
I needed to do the following to get rid of all the dupes:
DELETE FROM table_name A
WHERE A.rowid > ANY
(SELECT B.rowid
FROM table_name B
WHERE (A.col1 = B.col1 OR (A.col1 is null AND B.col1 is null))
AND (A.col2 = B.col2 OR (A.col2 is null AND B.col2 is null)));