Friday, October 8, 2010

Barbie, Strawberry Shortcake

Oracle Direct Insertion (DIRECT INSERT)

direct insertion of records in tables are made with INSERT and MERGE statements (the insert) or an application that uses OCI direct interface (eg sqlloader). When you need to insert a large amount of rows in an optimum time, it is necessary to sacrifice some functionality at the expense of speed. The performance improvement in the insertion is not free and there are certain requirements that must be met and certain consequences to consider before using. Direct insertion activated in two ways:

- Adding hint / * + APPEND * / in the INSERT INTO ... SELECT ..
- Adding hint / * + APPEND * / INSERT INTO .. VALUES .. (In 11g R1)
- Adding hint / * + APPEND_VALUES * / INSERT INTO .. VALUES .. (In 11g R2)
- Running the parallel insert

In the following cases can not use the direct insertion

- The table has a trigger to modify active trip with inserts.
- The table has a foreign key change enabled. - The table is a table indexed change.
- The table to modify is stored in a cluster. - The table contains column change type object type.


Then I compare the normal insert with putting direct focus on redo and undo space consumed in each case. Also I will show how the direct mode "miss" the cache buffer. Just this last is the key to accelerating the inserts, as new blocks are put together in memory and added to the table directly without using the cache. During the insertion does not increase the HWM and only updates the commit the transaction. For this reason you can not perform any additional operation on the modified table until it has been confirmed directly insert transaction. If we try to execute any statement that references a table after inserting direct mode Oracle generates the error: "ORA-12838: Unable to read / modify an object after modifying it in parallel."

For example I create a table T with two columns. I'll use a column CHAR (500) to be used more blocks without having to load many rows.
 


drop table t create table t (x int, and char (500))



Now inserting 100000 records in the normal way: INSERT


---------- CONVENTIONAL ---------


insert into t select rownum, dbms_random.string ('a', 20)
from dual connect by rownum


The insert delayed: 10.8 seconds.

select round (ms.value/1024/1024)
redo_size from v $ MYSTAT ms, v $ statname

WHERE sn = sn.STATISTIC ms.STATISTIC # #
and sn.NAME = 'redo size'
redo
The amount used was: SELECT
53MB

t . used_ublk undo_size
* 8 \u200b\u200bFROM v $ transaction t, v $ session s WHERE
t.addr = s.taddr
AND s.audsid = userenv ('sessionid') 1104k


undo
The amount used was: 1104Kb


With this query we check if the buffer was used for the insert:


select count (1)
from v $ bh bh, b
WHERE user_objects
bh.OBJD = ob.data_object_id
and ob.object_name = 'T'
and bh.STATUS! = 'free'
and bh.CLASS # = 1




7174 blocks all blocks are loaded into cache for insertion.

Let's see what happens with the direct mode insert: INSERT DIRECT





-------------- insert / * + APPEND * / into t select rownum
, dbms_random.string ('a', 20) 
from dual connect by rownum


10.3s


It took only five seconds less than the conventional insert. This can become discouraged to use direct mode, and you do not see much improvement, but it is important to note that the time difference is going to notice more when we work with a number of records Most importantly, I would say the order of millions.


select round (ms.value/1024/1024) redo_size
from v $ MYSTAT ms, v $ statname
WHERE sn = sn.STATISTIC ms.STATISTIC # #
and sn.NAME = 'redo size'

now consumed a little more redo, but see the undo:



t.used_ublk SELECT * 8 FROM undo_size AND s.audsid = userenv ('sessionid')

8kb


With direct insert only 8k undo consumed, ie 138 times less than with the conventional insert.



select count (1)
from v $ bh bh, b
WHERE user_objects
bh.OBJD = ob.data_object_id
and ob.object_name = 'T'
and bh.STATUS! = 'free'
and bh.CLASS # = 1



 0 Blocks were not used cache blocks, precisely this was expected because as I said above does not use the direct insert buffer. 
As seen with the direct insert significantly decreased undo space. To reduce the redo must also perform the insert on a table that has disabled the logging, ie is in NOLOGGING (if the base is in NOARCHIVELOG already have disabled the logging for the tables cdo is inserted into live mode).


INSERT DIRECT NOLOGGING

--------------------------------- insert / * + APPEND * / into t
select rownum, dbms_random.string ('a', 20)
 from dual connect by rownum 


9.6s


was reduced insertion time, was now 9.6s
<= 100000

select round (ms.value/1024 / 1024) from v $ redo_size MYSTAT
ms, v $ statname

WHERE sn = sn.STATISTIC ms.STATISTIC # #
and sn.NAME = 'redo size'

0


redo consumption was zero. While this sounds very good, we must bear in mind that disable logging (actually is minimized, since internal operations run HWM or add Extent generate redo and undo) causes to an event of failure we can not perform recovery of newly inserted data. It is advisable to perform a logical backup of the tables or a RMAN incremental backup immediately after the direct load. You can not bypass the redo for tables housed in tablespaces to force logging.


select round (ms.value/1024/1024) redo_size
from v $ MYSTAT ms, v $ statname

WHERE sn = sn.STATISTIC ms.STATISTIC # #
and sn.NAME = 'redo size'

8 kb

undo remained exactly the same as the previous test.
 
select round (ms.value/1024/1024) from
redo_size v $ MYSTAT ms, v $ statname

WHERE sn = sn.STATISTIC ms.STATISTIC # #
and sn.NAME = 'redo size'

0 blocks


nor is loaded into memory blocks.

Recall that for the 3 previous tests used a table without indexes. Let's see what happens when you add an index to the table T

 
t_idx create index on t (x)

insert / * + APPEND * / into t select rownum
, dbms_random.string ('a', 20)
from dual connect by rownum

<= 100000
11.2.s

Clearly the insert delayed, but since you had to keep the index updated during insertions.

 select round (ms.value/1024/1024) redo_size 
from v $ MYSTAT ms, v $ statname

WHERE sn = sn.STATISTIC ms.STATISTIC # #
and sn.NAME = 'redo size'

7

redo space now consumed was 0, remains little but now it's 7kb, as he had to store redo information for the index. SELECT


undo_size
 t.used_ublk * 8 FROM v $ transaction t, v $ session s WHERE 
t.addr = s.taddr
AND s.audsid = userenv ('sessionid')

1600 kb


The undo consumption also increased since records were kept for the new index undo
tablita
The following summarizes the results of the tests:

 




In summary, I recommend using wherever possible the direct mode especially when you need to bulk load (ETL processes) and performance on load is the main objective. Also, if possible, and being aware of what it means to use NOLOGGING, we also recommend configuring the receiving table in NOLOGGING. Finally, you should disable the indices (put them into unusable) because as we saw in the latter case, the maintenance of the indices during insertion generates more undo / redo and slows the overall burden. After the insertion will be to rebuild the indexes unusable.


0 comments:

Post a Comment