ThinkTibits!: Bulk Insert Skip / Ignore Duplicate Rows Oracle Example

In this short tutorial, we will see how to ignore duplicate records from a table during a bulk insert operation in Oracle. The example works on 11g Oracle databases and above and utilizes the hint ignore_row_on_dupkey_index . First things first, why do we need this tutorial?

The problems with Bulk Inserts

Let us say you have a table with a unique key and millions of row. And you perform a bulk insert from another table, and you have a scenario where some of the records inserted are already available in your original table. During such cases, Oracle will reject your insert and throw a unique key constraint violation error. Just because you have a few duplicate rows in your insert set, your complete insert will fail. You will then find the duplicate rows through some means, and ignore them in your insert which is a painful process. This hint simplifies your insert greatly and we will see how it does that, in the following section.

Step -1 : Create a Test Table / Insert Some Rows.

Let us create a test table to explain this and add some rows into it. Execute the statements provided below and have your table ready.

CREATE TABLE TEMP_1
(
MY_ID NUMBER PRIMARY KEY,
MY_TEXT VARCHAR2(10)
)
/
INSERT INTO TEMP_1 VALUES (1,'A')
INSERT INTO TEMP_1 VALUES (2,'B')
INSERT INTO TEMP_1 VALUES (3,'C')
INSERT INTO TEMP_1 VALUES (4,'D')
INSERT INTO TEMP_1 VALUES (5,'E')
/
COMMIT
/

There is a unique key on column MY_ID.

Step-2: Bulk Insert - without Hint - Standard Method

Let us say you attempt to make an insert as per below.

INSERT INTO TEMP_1
SELECT ROWNUM,'A' FROM ALL_OBJECTS WHERE ROWNUM <20

You will get the following exception on the screen: ORA-00001: Unique Constraint Violated

There are so many websites / forums that discusses how you can work around this problem by detecting duplicates before the insert. Instead of reinventing them here, let us go to the new hint based solution direct.

New Approach - Using ignore_row_on_dupkey_index for Bulk Insert

The new approach would be to use the hint ignore_row_on_dupkey_index during the bulk insert. You should also pass the table name and the index name as parameters inside the hint. The new SQL statement is therefore;

INSERT /*+ IGNORE_ROW_ON_DUPKEY_INDEX(TEMP_1,SYS_C0011527) */ INTO TEMP_1
SELECT ROWNUM,'A' FROM ALL_OBJECTS WHERE ROWNUM <20

When you run this statement, it will insert 14 rows and will not touch the existing rows (i.e.1 - 5 ). This is a big time saving feature from Oracle. It prevents the collision of rows during insert and takes the rows that don't complain on the target tables.

Give your try on this hint, and post your experience.

2 comments:

AnonymousNovember 3, 2015 at 10:45 AM
The thing is that it does not work from Java. I tested the code using a single record. The same INSERT with hint works fine in SQL Dev but throws an exception in Java.
Any ideas?
AnonymousJuly 18, 2016 at 9:48 AM
The same is not working using DB link. Any other option for DB Link?