Performance issue updating 9 million records using a script

Question

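We ran the script below to update several columns of a table in an Oracle 11g database (11.2.0.3), and it took about 61 hours to complete. That is surprising, because we use BULK COLLECT and FORALL for the actual update and have also enabled parallel DML. We also tried updating by rowid rather than by an indexed column, on the assumption that it would be faster. Any suggestions for speeding this up would be great. Here is the script: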

ALTER session enable parallel dml;
DECLARE
  i NUMBER;
  j NUMBER := 0;
  TYPE tab_type IS TABLE OF ROWID INDEX BY BINARY_INTEGER;
  tab_id tab_type;

  CURSOR c1 IS
    SELECT /*+ parallel(na,DEFAULT) */
           rowid
      FROM sample_table na
       FOR UPDATE SKIP LOCKED;
BEGIN
  OPEN c1;
  LOOP
    FETCH c1 BULK COLLECT INTO tab_id LIMIT 10000;
    EXIT WHEN tab_id.COUNT = 0;

    FORALL i IN 1..tab_id.COUNT
      UPDATE sample_table
         SET col1 = 'XXX'
           , col2 = 'XXX'
           , col3 = 'XXX'
           , col4 = 'XXX'
           , col5 = 'XXX'
           , col6 = 'XXX'
       WHERE rowid = tab_id(i);

    j := j + 1;
    IF MOD(j, 1000) = 0 THEN    -- Commit every 1000 records
      COMMIT;
    END IF;
  END LOOP;

  CLOSE c1;
END;
/

Answer 1

I'm not entirely sure this will affect your run time, but it won't hurt. Also, your code suggests a few misunderstandings.

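First, a FORALL statement does not create a loop. It runs as a single statement that passes the entire collection to the SQL engine once.
This also means your commit interval is not the 1000 rows you intended: j is incremented once per 10,000-row batch, so mod(j, 1000) = 0 first becomes true after 1000 batches, that is, after 10 million rows.
The index variable (i) in the statement is local to it and is only accessible within the scope of the FORALL statement. So the declared variable i is not the one used in the FORALL and is not needed. No error is raised because of the scoping rules.
Since there is no commit after the loop exits, the last set is not committed unless the number of batches is an exact multiple of the commit interval. With about 9 million rows the loop runs only about 900 batches, so the commit inside the loop never fires and the block itself commits nothing.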
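To see how often the commit in the original block actually fires, the counter logic can be replayed without touching any table. This is only a sketch: the 9,000,000 row count is taken from the question title, and the names k_total_rows, l_batches and l_commits are made up for the illustration.

```sql
-- In SQL*Plus, run "set serveroutput on" first to see the output.
declare
  k_total_rows constant pls_integer := 9000000;  -- assumed table size (from the title)
  k_batch_size constant pls_integer := 10000;    -- the LIMIT used in the question
  l_remaining  pls_integer := k_total_rows;
  l_batches    pls_integer := 0;                 -- plays the role of j
  l_commits    pls_integer := 0;
begin
  while l_remaining > 0 loop
    -- one FETCH ... LIMIT plus one FORALL handles a whole batch
    l_remaining := l_remaining - least(k_batch_size, l_remaining);
    l_batches   := l_batches + 1;

    -- same test as in the question: true only on every 1000th batch
    if mod(l_batches, 1000) = 0 then
      l_commits := l_commits + 1;
    end if;
  end loop;

  dbms_output.put_line('batches=' || l_batches || ' commits=' || l_commits);
end;
/
```

Under these assumptions it prints batches=900 commits=0, so the whole update runs as a single transaction.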

With all that in mind, you could try:

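declare
  type tab_type is table of rowid;
  tab_id tab_type;

  k_buffer_limit constant pls_integer := 10000;

  -- No FOR UPDATE here: a commit releases the locks of a FOR UPDATE cursor,
  -- and the next fetch would then fail with ORA-01002 (fetch out of sequence).
  -- Trade-off: rows locked by other sessions now make the update wait
  -- instead of being skipped.
  cursor c1 is
    select /*+ parallel(na,DEFAULT) */
           rowid
      from sample_table na;
begin
  open c1;
  loop
    fetch c1 bulk collect into tab_id limit k_buffer_limit;

    -- FORALL sends the whole batch to the SQL engine as one statement;
    -- with an empty batch it does nothing.
    forall i in 1 .. tab_id.count
      update sample_table
         set col1 = 'XXX'
           , col2 = 'XXX'
           , col3 = 'XXX'
           , col4 = 'XXX'
           , col5 = 'XXX'
           , col6 = 'XXX'
       where rowid = tab_id(i);

    commit;  -- once per batch, including the final partial one
    exit when tab_id.count < k_buffer_limit;
  end loop;

  close c1;
end;
/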

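BULK COLLECT/FORALL processing is a trade-off between context switches and memory usage. Reducing context switches is a good thing, but the gain can be outweighed by the memory the collection needs. Your process may be going into waits to obtain enough memory. You could test this by lowering the buffer size.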
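One way to check whether the buffer size matters is to time the same loop with different LIMIT values on a small sample. The block below is a rough trial under assumptions, not a tuned solution: the 100,000-row sample, the candidate limit of 1000, and the names k_sample_rows, l_limit and l_start are all hypothetical, and it rolls back at the end so the data is left unchanged.

```sql
-- In SQL*Plus, run "set serveroutput on" first to see the timing.
declare
  type rowid_tab is table of rowid;
  l_ids rowid_tab;

  k_sample_rows constant pls_integer := 100000;  -- assumed sample size
  l_limit       pls_integer := 1000;             -- candidate buffer size; rerun with other values
  l_start       pls_integer;

  cursor c1 is
    select rowid
      from sample_table
     where rownum <= k_sample_rows;
begin
  l_start := dbms_utility.get_time;  -- elapsed-time counter in hundredths of a second

  open c1;
  loop
    fetch c1 bulk collect into l_ids limit l_limit;
    exit when l_ids.count = 0;

    forall i in 1 .. l_ids.count
      update sample_table
         set col1 = 'XXX'
           , col2 = 'XXX'
           , col3 = 'XXX'
           , col4 = 'XXX'
           , col5 = 'XXX'
           , col6 = 'XXX'
       where rowid = l_ids(i);
  end loop;
  close c1;

  dbms_output.put_line('limit=' || l_limit || ' elapsed_s=' ||
                       (dbms_utility.get_time - l_start) / 100);

  rollback;  -- timing trial only: undo the sample update
end;
/
```

Running it a few times with l_limit set to, say, 100, 1000 and 10000 gives a rough idea of where larger batches stop helping on your system.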

sql

plsql

oracle11g