Bulk update start date based on previous end date
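I'm trying to do a bulk update on a table with about 700K records. I need to update the effective start date with the previous record's effective end date. I'm running into performance problems with the update statement when I use a subquery. Even with a date filter (7/1/2016-7/15/2016, about 2K records), it takes over an hour to run. I've tried it as a simple update statement, as an insert statement, and as a loop. The explain plan using ROWID instead of account_dim_key (the PK on the table) is much more optimized; however, I get an error saying the subquery returns more than one row. I'm not sure why this would happen with ROWID.
ID is the natural key on the table, and account_dim_key is the PK and is unique. Both are indexed. The table is a Type 2 SCD.
- How can I modify the update statement so that it works with ROWID?
- Would an update using FORALL be better? If so, how would I write it? (I'm new to PL/SQL and not familiar with arrays.)
Update statement using ROWID (fails with ORA-01427 "single-row subquery returns more than one row", but has the best explain plan)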
UPDATE DEXWHS.D_ACCOUNT_VEEVA
   SET effective_end_dt =
          (SELECT prev_dt
             FROM (SELECT LAG (effective_end_dt, 1, effective_start_dt)
                             OVER (PARTITION BY account_dim_key
                                   ORDER BY effective_start_dt)
                             AS prev_dt,
                          ROWID AS rid
                     FROM dexwhs.d_account_veeva ac2) a
            WHERE a.rid = ROWID)
Update statement using account_dim_key (runs, but does not have the best explain plan)
UPDATE DEXWHS.D_ACCOUNT_VEEVA
   SET effective_end_dt =
          (SELECT prev_dt
             FROM (SELECT LAG (effective_end_dt, 1, effective_start_dt)
                             OVER (PARTITION BY id
                                   ORDER BY effective_start_dt, account_dim_key)
                             AS prev_dt,
                          account_dim_key AS rid
                     FROM dexwhs.d_account_veeva ac2) a
            WHERE a.rid = account_dim_key)
Loop update
CREATE OR REPLACE PROCEDURE PREV_UPDT
IS
   CURSOR c1
   IS
        SELECT account_dim_key,
               id,
               active_flag,
               effective_end_dt,
               effective_start_dt,
               created_date,
               last_modified_date,
               -- Previous row's end date within the same id; the alias is
               -- required so the value can be read as r1.prev_dt.
               (SELECT prev_dt
                  FROM (SELECT LAG (effective_end_dt, 1, effective_start_dt)
                                  OVER (PARTITION BY id
                                        ORDER BY effective_start_dt, account_dim_key)
                                  AS prev_dt,
                               account_dim_key AS rid
                          FROM dexwhs.d_account_veeva ac2) a
                 WHERE a.rid = src.account_dim_key)
                  AS prev_dt
          FROM dexwhs.d_account_veeva src
      ORDER BY id, effective_start_dt, account_dim_key;

   r1   c1%ROWTYPE;
BEGIN
   OPEN c1;

   LOOP
      FETCH c1 INTO r1;
      EXIT WHEN c1%NOTFOUND;
      DBMS_OUTPUT.PUT_LINE ('id=' || r1.id);

      -- One single-row update per fetched record.
      UPDATE dexwhs.D_ACCOUNT_VEEVA trgt
         SET trgt.effective_start_dt = r1.prev_dt,      -- was r1.prev_date, which the cursor does not define
             trgt.audit_last_update_dt = SYSDATE        -- stray trailing comma removed
       WHERE trgt.account_dim_key = r1.account_dim_key;

      DBMS_OUTPUT.PUT_LINE ('r1.id_found');
   END LOOP;

   CLOSE c1;
END PREV_UPDT;
/
If account_dim_key is the primary key, then try MERGE:
MERGE INTO dexwhs.d_account_veeva a
USING (
    SELECT account_dim_key,
           LAG (effective_end_dt, 1, effective_start_dt)
               OVER (PARTITION BY account_dim_key
                     ORDER BY effective_start_dt)
               AS prev_dt
      FROM dexwhs.d_account_veeva
) b
ON (a.account_dim_key = b.account_dim_key)
WHEN MATCHED THEN UPDATE SET a.effective_end_dt = b.prev_dt;
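One caveat about the MERGE above: because account_dim_key is unique, PARTITION BY account_dim_key puts every row in its own partition, so LAG always falls back to its default (the row's own effective_start_dt). The question also describes updating the start date, while the statement sets effective_end_dt. A possible variant, offered only as a sketch, partitions by id as in the question's second UPDATE, writes effective_start_dt, and joins on ROWID so that the ROWID reference is never left unqualified. It assumes that the previous row within each id is the intended source, that audit_last_update_dt should be stamped as in the loop version, and that nothing else modifies the table while the statement runs.

```sql
-- Sketch under the assumptions stated above; not tested against the real table.
MERGE INTO dexwhs.d_account_veeva tgt
USING (
    SELECT ROWID AS rid,
           -- End date of the previous version of the same id; the first
           -- version of each id keeps its own start date (LAG default).
           LAG (effective_end_dt, 1, effective_start_dt)
               OVER (PARTITION BY id
                     ORDER BY effective_start_dt, account_dim_key)
               AS prev_dt
      FROM dexwhs.d_account_veeva
) src
ON (tgt.ROWID = src.rid)
WHEN MATCHED THEN
    UPDATE SET tgt.effective_start_dt   = src.prev_dt,
               tgt.audit_last_update_dt = SYSDATE;
```

The source query is evaluated once for the whole statement, so the window function runs a single time instead of once per updated row as in the correlated-subquery UPDATEs.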
The query is bound to take some time, because it updates the entire table.
A composite index on the (account_dim_key, effective_start_dt) columns might speed up the LAG ... OVER (PARTITION BY account_dim_key ORDER BY effective_start_dt) part.
CREATE INDEX some_name
    ON dexwhs.d_account_veeva (account_dim_key, effective_start_dt);
However, Oracle may ignore this index and prefer a full table scan, because the subquery reads the whole table.
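On the FORALL part of the question: a single set-based statement such as the MERGE above is usually faster than any PL/SQL loop, but if the work has to be done in batches, BULK COLLECT with FORALL is the usual pattern. The block below is a minimal sketch, not a tuned implementation. The batch size c_batch, the type names, and the variable names are all hypothetical, and it assumes the previous row within each id is the intended source and that no other session changes the table while it runs.

```sql
DECLARE
    -- Collections ("arrays") holding one batch of row addresses and new dates.
    TYPE t_rid_tab IS TABLE OF ROWID;
    TYPE t_dt_tab  IS TABLE OF dexwhs.d_account_veeva.effective_start_dt%TYPE;

    l_rids  t_rid_tab;
    l_dts   t_dt_tab;

    c_batch CONSTANT PLS_INTEGER := 10000;  -- hypothetical batch size

    -- The window function is computed once, over the data as of OPEN time.
    CURSOR c1 IS
        SELECT ROWID,
               LAG (effective_end_dt, 1, effective_start_dt)
                   OVER (PARTITION BY id
                         ORDER BY effective_start_dt, account_dim_key)
          FROM dexwhs.d_account_veeva;
BEGIN
    OPEN c1;

    LOOP
        -- Fetch up to c_batch rows at a time into the two collections.
        FETCH c1 BULK COLLECT INTO l_rids, l_dts LIMIT c_batch;
        EXIT WHEN l_rids.COUNT = 0;

        -- Send the whole batch of updates to the SQL engine in one call.
        FORALL i IN 1 .. l_rids.COUNT
            UPDATE dexwhs.d_account_veeva
               SET effective_start_dt   = l_dts (i),
                   audit_last_update_dt = SYSDATE
             WHERE ROWID = l_rids (i);
    END LOOP;

    CLOSE c1;
    -- No COMMIT here: commit or roll back after checking the result.
END;
/
```

Compared with the row-by-row loop in the question, this avoids a context switch between PL/SQL and SQL for every record and drops the per-row DBMS_OUTPUT calls.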