Update multiple rows in a table with different values
I have the following schema:
DROP SCHEMA IF EXISTS s CASCADE;
CREATE SCHEMA s;
CREATE TABLE "s"."t1"
(
"c1" BigSerial PRIMARY KEY,
"c2" BigInt NOT NULL,
"c3" BigInt
)
WITH (OIDS=FALSE);
INSERT INTO s.t1 (c2, c3) VALUES (10, 100);
INSERT INTO s.t1 (c2, c3) VALUES (20, 200);
INSERT INTO s.t1 (c2, c3) VALUES (30, 300);
INSERT INTO s.t1 (c2, c3) VALUES (40, 400);
PREPARE updateplan (BigInt, BigInt) AS
update s.t1
SET c3 = $2
WHERE c2 = $1;
EXECUTE updateplan (20, 250);
PREPARE updatearrayplan(BigInt[], BigInt[]) AS
for i in size($1)
DO
update s.t1
SET c3 = $2[$i]
WHERE c2 = $1[$i]
END FOR
EXECUTE updatearrayplan({20, 30}, {275, 375})
/* 20, 200 -> 20, 275 */
/* 30, 300 -> 30, 375 */
After executing updatearrayplan, I expect the rows to end up with these values: 20 -> 275, 30 -> 375.
Is there a way to update multiple rows with different column values passed in as arrays? Also, is it guaranteed that the order of the arrays will be maintained?
Try:
WITH arrays AS(
SELECT * from
unnest(
ARRAY[20, 30],
ARRAY[275, 375]
) as xy(x,y)
)
UPDATE t1
SET c3 = a.y
FROM arrays a
WHERE c2 = a.x;
See the description of the unnest function here: click
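As a minimal sketch, the same idea applied directly to the s.t1 table from the question (the explicit ::bigint[] casts and the ORDER BY in the check query are my additions, and the multi-argument form of unnest needs PostgreSQL 9.4 or later):
WITH arrays AS (
    SELECT x, y
    FROM unnest(
        ARRAY[20, 30]::bigint[],   -- values matched against c2
        ARRAY[275, 375]::bigint[]  -- new values for c3, paired by position
    ) AS xy(x, y)
)
UPDATE s.t1
SET c3 = a.y
FROM arrays a
WHERE c2 = a.x;
-- check: c2 = 20 should now have c3 = 275 and c2 = 30 should have c3 = 375
SELECT c2, c3 FROM s.t1 ORDER BY c2;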
Edit
@kordiroko Sorry. I spent the whole day modifying your solution and couldn't make it work.
It may be that your PostgreSQL version is older. I tested it on version 9.5, and it only took me a few minutes to get it working: just copy/paste and change the two parameters in the query:
create table t1(
c2 BIGINT,
c3 bigint
);
insert into t1( c2, c3 )
select x, x * 100
from generate_series( 1,1000000 ) x;
CREATE OR REPLACE FUNCTION updatefunc1(BigInt[], BigInt[])
RETURNS void as $$
BEGIN
FOR i IN array_lower($1, 1) .. array_upper($1, 1)
LOOP
update t1
SET c3 = $2[i]
WHERE c2 = $1[i];
END LOOP;
END;
$$
LANGUAGE plpgsql;
CREATE OR REPLACE FUNCTION updatefunc2(BigInt[], BigInt[])
RETURNS void as $$
BEGIN
WITH arrays AS(
SELECT * from
unnest( $1, $2 ) as xy(x,y)
)
UPDATE t1
SET c3 = a.y
FROM arrays a
WHERE c2 = a.x;
END;
$$
LANGUAGE plpgsql;
select updatefunc1(ARRAY[20], ARRAY[275]);
select updatefunc2(ARRAY[30], ARRAY[555]);
select * from t1 where c2 in (20,30);
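If I read the test setup correctly (t1 is filled with c3 = c2 * 100, then updatefunc1 changes the row with c2 = 20 and updatefunc2 the row with c2 = 30), the last check query should return something like:
 c2 | c3
----+-----
 20 | 275
 30 | 555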
Let me know if this is correct or if there is a better solution.
Quite correct, but... a bit slow.
I tested your function with only 100 records:
select updatefunc1(
array( select * from generate_series(1,100)),
array( select 22222 from generate_series(1,100))
);
It took 12 seconds:
Result  (cost=20.00..20.31 rows=1 width=0) (actual time=12259.095..12259.096 rows=1 loops=1)
  Output: updatefunc1(($0)::bigint[], ($1)::bigint[])
  InitPlan 1 (returns $0)
Now compare that with my function, but for 100,000 records:
select updatefunc2(
array( select * from generate_series(1,100000)),
array( select 22222 from generate_series(1,100000))
);
The result is 1 second 150 milliseconds:
Result  (cost=20.00..20.31 rows=1 width=0) (actual time=1150.018..1150.123 rows=1 loops=1)
  Output: updatefunc2(($0)::bigint[], ($1)::bigint[])
  InitPlan 1 (returns $0)
The results above mean that your function is:
( 12 / 100 ) / ( 1.150 / 100000 ) = 10434.78
times slooooooooower,
and in percentage terms that is a mere 1,043,400 % slower.
Edit 2
My version is 9.2.15. It throws syntax errors.
Here is a version that should work on older versions of PostgreSQL:
CREATE OR REPLACE FUNCTION updatefunc3(BigInt[], BigInt[])
RETURNS void as $$
BEGIN
WITH arrays AS(
SELECT arr1[ rn ] as x, arr2[ rn ] as y
FROM (
SELECT $1 as arr1, $2 as arr2, generate_subscripts($1, 1) As rn
) x
)
UPDATE t1
SET c3 = a.y
FROM arrays a
WHERE c2 = a.x;
END;
$$
LANGUAGE plpgsql;
select updatefunc3(ARRAY[40,82,77], ARRAY[333,654]);
select * from t1 where c2 in (40,82,77);
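To see how the subscript-based pairing behaves on its own, this standalone query (only an illustration, not part of the function above) expands the two arrays from the call above into (x, y) rows:
SELECT arr1[rn] AS x, arr2[rn] AS y
FROM (
    SELECT ARRAY[40,82,77]::bigint[] AS arr1,
           ARRAY[333,654]::bigint[]  AS arr2,
           generate_subscripts(ARRAY[40,82,77]::bigint[], 1) AS rn
) x;
-- yields (40,333), (82,654), (77,NULL): the subscripts come from the first array,
-- so the missing third element of the second array becomes NULL, and the UPDATE in
-- updatefunc3 would then set c3 to NULL for the row with c2 = 77 (if such a row exists)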
The speed test for updating 100,000 rows is:
select updatefunc3(
array( select * from generate_series(1,100000)),
array( select 22222 from generate_series(1,100000))
);
Result  (cost=20.00..20.31 rows=1 width=0) (actual time=1361.358..1361.460 rows=1 loops=1)
  Output: updatefunc3(($0)::bigint[], ($1)::bigint[])
  InitPlan 1 (returns $0)
The time to update 100k rows is under 1.5 seconds.
Edit 3
@kordiko: Could you please tell me why your query is so much better? My function goes through each row and updates the elements one by one. Your function also appears to do the same. Is it that all the equivalent rows are updated simultaneously in your query?
That is because my function runs only one UPDATE command regardless of the number of elements in the arrays, while your function updates the elements one by one: for 100 elements it runs 100 UPDATE commands, for 1000 elements it runs 1000 UPDATE commands.
I ran the tests on a table with 1,000,000 rows and no indexes. In my function the update reads the table contents only once (doing a full table scan) and updates the matching rows. Your function performs 100 updates, and each of them does a full table scan.
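As a rough sketch for seeing the difference yourself (against the t1 test table created above; the two-argument unnest form needs PostgreSQL 9.4+), compare the plan of the single set-based UPDATE with one of the statements the loop runs per element:
-- set-based: one statement, one scan of t1 joined against the unnested arrays
EXPLAIN ANALYZE
UPDATE t1
SET c3 = a.y
FROM unnest(ARRAY[20, 30]::bigint[], ARRAY[275, 375]::bigint[]) AS a(x, y)
WHERE c2 = a.x;
-- loop-based: the plpgsql FOR loop effectively issues one of these per array element
EXPLAIN ANALYZE
UPDATE t1 SET c3 = 275 WHERE c2 = 20;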
If you create an index on c2, the runtime of your function drops dramatically; see the test below (note that the number of elements in this test was increased from 100 to 100,000):
create INDEX t1_c2_ix on t1( c2 );
select updatefunc1(
array( select * from generate_series(1,100000)),
array( select 22222 from generate_series(1,100000))
);
Result  (cost=20.00..20.31 rows=1 width=0) (actual time=3430.536..3430.636 rows=1 loops=1)
  Output: updatefunc1(($0)::bigint[], ($1)::bigint[])
  InitPlan 1 (returns $0)
Now the time is only about 3.5 seconds.
Testing my function after creating the index:
select updatefunc3(
array( select * from generate_series(1,100000)),
array( select 22222 from generate_series(1,100000))
);
Result  (cost=20.00..20.31 rows=1 width=0) (actual time=1270.619..1270.724 rows=1 loops=1)
  Output: updatefunc3(($0)::bigint[], ($1)::bigint[])
  InitPlan 1 (returns $0)
The time stays about the same, and it is still more than 100% faster than your function.
My answer:
CREATE OR REPLACE FUNCTION s.updatefunc1(BigInt[], BigInt[])
RETURNS void as $$
BEGIN
FOR i IN array_lower($1, 1) .. array_upper($1, 1)
LOOP
update s.t1
SET c3 = $2[i]
WHERE c2 = $1[i];
END LOOP;
END;
$$
LANGUAGE plpgsql;
select s.updatefunc1(ARRAY[20], ARRAY[275]);
This does work, and I get the answer I wanted:
SELECT c2, c3 FROM s.t1;
c2 | c3
----+-----
10 | 100
30 | 300
40 | 400
20 | 275 --> Updated
Let me know if this is correct or if there is a better solution.