Difference between revisions of "Oracle/Shuffle data"

From YavInWiki
Jump to navigation Jump to search
Line 4: Line 4:
-- create backup of address table (used as data source)
-- create backup of address table (used as data source)
CREATE TABLE address_tmp as SELECT * FROM address;
CREATE TABLE address_tmp as SELECT * FROM address;
-- shuffle the columns name and street (separately)
-- shuffle the columns name and street (separately)
MERGE INTO address t1
MERGE INTO address t1
Line 16: Line 17:
UPDATE
UPDATE
   SET t1.name = t2.name, t1.street = t2.street;
   SET t1.name = t2.name, t1.street = t2.street;
-- drop the backup table (if no longer needed as backup)
-- drop the backup table (if no longer needed as backup)
drop table address_tmp;
drop table address_tmp;

Revision as of 11:57, 5 May 2011

The following statement(s) creates a copy of an existing table (address), randomly shuffles each (selected) column within the temporary table and puts them back into the original table. Afterwards the data within the original table (selected columns only) are randomly shuffled. The individual random min/max values are mandatory since without them all columns (here name and street) would be 'shifted' together (the pair would stay together).

-- create backup of address table (used as data source)
CREATE TABLE address_tmp as SELECT * FROM address;

-- shuffle the columns name and street (separately)
MERGE INTO address t1
 USING (SELECT address_id,
               nvl(lead(name) over (order by dbms_random.value(1,1)),
                   first_value(name) over (order by dbms_random.value(1,1))) name,
               nvl(lead(street) over (order by dbms_random.value(2,2)),
                   first_value(street) over (order by dbms_random.value(2,2))) street
          FROM address_tmp) t2
    ON (t1.address_id = t2.address_id)
WHEN matched THEN
UPDATE
   SET t1.name = t2.name, t1.street = t2.street;

-- drop the backup table (if no longer needed as backup)
drop table address_tmp;

If no temp table is used in the above example, data will (most likely) be lost since the select statement will work a changing table so that some information will be multiplied and some will be lost.