Difference between revisions of "Oracle/Find duplicates"

Latest revision as of 11:11, 15 November 2022

Find and remove duplicate rows from a table

One of the most important features of Oracle is the ability to detect and remove duplicate rows from a table. While many Oracle DBA place primary key referential integrity constraints on a table, many shops do not use RI because they need the flexibility.

The most effective way to detect duplicate rows is to join the table against itself as shown below.

SELECT *
  FROM my_table A
 WHERE rowid >
   (SELECT min(rowid)
      FROM my_table B
     WHERE B.key1 = A.key1
       AND B.key2 = A.key2
       AND B.key3 = A.key3);

Please note that you must specify all of the columns that make the row a duplicate in the SQL where clause. Once you have detected the duplicate rows, you may modify the SQL statement to remove the duplicates as shown below:

DELETE FROM table_name A
 WHERE A.rowid > ANY
   (SELECT B.rowid
      FROM table_name B
     WHERE A.col1 = B.col1
       AND A.col2 = B.col2);

You can also detect and delete duplicate rows using Oracle analytic functions:

DELETE FROM customer
 WHERE rowid IN
   (SELECT rowid FROM 
     (SELECT rowid
           , row_number()
        OVER (PARTITION BY custnbr ORDER BY custnbr) dup
        FROM customer)
     WHERE dup > 1);

As we see, there are several ways to detect and delete duplicate rows from Oracle tables.

Match null values. Instead of the following:

DELETE FROM table_name A
 WHERE A.rowid > ANY
   (SELECT B.rowid
      FROM table_name B
     WHERE A.col1 = B.col1
       AND A.col2 = B.col2);

I needed to do the following to get rid of all the dupes:

DELETE FROM table_name A
 WHERE A.rowid > ANY
  (SELECT B.rowid
     FROM table_name B
    WHERE (A.col1 = B.col1 OR (A.col1 is null AND B.col1 is null))
      AND (A.col2 = B.col2 OR (A.col2 is null AND B.col2 is null)));

@@ Line 5: / Line 5: @@
 The most effective way to detect duplicate rows is to join the table against itself as shown below.
-<source lang="oracle8">
+<syntaxhighlight lang="sql">
-SELECT BOOK_UNIQUE_ID
+SELECT *
-     , PAGE_SEQ_NBR
+   FROM my_table A
-     , IMAGE_KEY
-   FROM page_image A
   WHERE rowid >
     (SELECT min(rowid)
-       FROM page_image B
+       FROM my_table B
       WHERE B.key1 = A.key1
         AND B.key2 = A.key2
         AND B.key3 = A.key3);
-</source>
+</syntaxhighlight>
 Please note that you must specify all of the columns that make the row a duplicate in the SQL where clause. Once you have detected the duplicate rows, you may modify the SQL statement to remove the duplicates as shown below:
-<source lang="oracle8">
+<syntaxhighlight lang="sql">
 DELETE FROM table_name A
   WHERE A.rowid > ANY
@@ Line 27: / Line 25: @@
       WHERE A.col1 = B.col1
         AND A.col2 = B.col2);
-</source>
+</syntaxhighlight>
 You can also detect and delete duplicate rows using Oracle analytic functions:
-<source lang="oracle8">
+<syntaxhighlight lang="sql">
 DELETE FROM customer
   WHERE rowid IN
@@ Line 40: / Line 38: @@
          FROM customer)
       WHERE dup > 1);
-</source>
+</syntaxhighlight>
 As we see, there are several ways to detect and delete duplicate rows from Oracle tables.
@@ Line 46: / Line 44: @@
 Match null values. Instead of the following:
-<source lang="oracle8">
+<syntaxhighlight lang="sql">
 DELETE FROM table_name A
   WHERE A.rowid > ANY
@@ Line 53: / Line 51: @@
       WHERE A.col1 = B.col1
         AND A.col2 = B.col2);
-</source>
+</syntaxhighlight>
 I needed to do the following to get rid of all the dupes:
-<source lang="oracle8">
+<syntaxhighlight lang="sql">
 DELETE FROM table_name A
   WHERE A.rowid > ANY
@@ Line 64: / Line 62: @@
      WHERE (A.col1 = B.col1 OR (A.col1 is null AND B.col1 is null))
        AND (A.col2 = B.col2 OR (A.col2 is null AND B.col2 is null)));
-</source>
+</syntaxhighlight>
+[[Category:Oracle]]

Difference between revisions of "Oracle/Find duplicates"

Latest revision as of 11:11, 15 November 2022

Navigation menu

Search