Running total of positive and negative numbers where the sum cannot go below zero

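This is a SQL question.

I have a column of numbers that can be positive or negative, and I am trying to find a way to get a running total of that column where the total cannot go below zero.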

Date       | Number | Desired | Actual
2020-01-01 |    8   |   8     |    8
2020-01-02 |   11   |  19     |   19
2020-01-03 |   30   |  49     |   49
2020-01-04 |  -10   |  39     |   39
2020-01-05 |  -12   |  27     |   27
2020-01-06 |   -9   |  18     |   18
2020-01-07 |  -26   |   0     |   -8
2020-01-08 |    5   |   5     |   -3
2020-01-09 |  -23   |   0     |  -26
2020-01-10 |   12   |  12     |  -14
2020-01-11 |   14   |  26     |    0

I have tried many different window functions, but have not found a way to keep the running total from going negative.

Any help would be greatly appreciated.


Edit - added a date column to indicate the ordering

You can use the CASE operator and the SIGN function to achieve this...

CASE SIGN(my computed expression) WHEN -1 THEN 0 ELSE my computed expression END AS Actual

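Unfortunately, there is no way to do this without looping through the records one at a time. That, in turn, requires something like a recursive CTE.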

-- Note: Postgres needs "with recursive" here and a typed anchor value (NULL::int).
with t as (
      select t.*, row_number() over (order by date) as seqnum
      from mytable t
     ),
     cte as (
      select NULL as number, 0 as desired, 0 as seqnum
      union all
      select t.number,
             (case when cte.desired + t.number < 0 then 0
                   else cte.desired + t.number
              end),
             cte.seqnum + 1
      from cte join
           t
           on t.seqnum = cte.seqnum + 1
     )
select cte.*
from cte
where cte.number is not null;

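I would only recommend this approach if your data is fairly small. Then again, if you have to do this, there are not many options other than going through the table one row at a time.

Here is a db<>fiddle (using Postgres).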
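
As a rough illustration of the recurrence that the recursive CTE above evaluates one row at a time, here is a minimal sketch in plain JavaScript. The function name `clampedRunningTotal` is made up for this example, and the input is the Number column from the question.

```javascript
// Each step adds the next value to the previous total and resets the
// total to zero whenever the result would be negative.
function clampedRunningTotal(numbers) {
  const totals = [];
  let total = 0; // plays the role of the anchor row (desired = 0)
  for (const n of numbers) {
    total = Math.max(0, total + n);
    totals.push(total);
  }
  return totals;
}

const numbers = [8, 11, 30, -10, -12, -9, -26, 5, -23, 12, 14];
console.log(clampedRunningTotal(numbers));
// prints [8, 19, 49, 39, 27, 18, 0, 5, 0, 12, 26]
```

Each total depends on the previous clamped total, which is why the SQL version has to walk the rows in order.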

This can be done with a user-defined table function (UDTF) that "manages" the state you want to carry:

CREATE OR REPLACE FUNCTION non_neg_sum(val float) RETURNS TABLE (out_sum float)
LANGUAGE JAVASCRIPT AS
'{
  processRow: function (row, rowWriter) {
    this.sum += row.VAL;
    if(this.sum < 0)
        this.sum = 0;
    rowWriter.writeRow({OUT_SUM: this.sum})
  },
  initialize: function() {
    this.sum = 0;
  }
}';

And use it like this:

WITH input AS
(
    SELECT *
    FROM VALUES ('2020-01-01', 8, 8),
        ('2020-01-02', 11, 19 ),
        ('2020-01-03', 30, 49 ),
        ('2020-01-04',-10, 39 ),
        ('2020-01-05',-12, 27 ),
        ('2020-01-06', -9, 18 ),
        ('2020-01-07',-26,  0 ),
        ('2020-01-08',  5,  5 ),
        ('2020-01-09',-23,  0 ),
        ('2020-01-10', 12, 12 ),
        ('2020-01-11', 14, 26 ) d(day,num,wanted)
)
SELECT d.*
    ,sum(d.num)over(order by day) AS simple_sum
    ,j.*
FROM input AS d,
  TABLE(non_neg_sum(d.num::float) OVER (ORDER BY d.day)) j
ORDER BY day
;

Which gives the result:

DAY          NUM     WANTED    SIMPLE_SUM    OUT_SUM
2020-01-01   8       8         8             8
2020-01-02   11      19        19            19
2020-01-03   30      49        49            49
2020-01-04   -10     39        39            39
2020-01-05   -12     27        27            27
2020-01-06   -9      18        18            18
2020-01-07   -26     0         -8            0
2020-01-08   5       5         -3            5
2020-01-09   -23     0         -26           0
2020-01-10   12      12        -14           12
2020-01-11   14      26        0             26
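
To see how the handler in non_neg_sum carries its state from row to row, the sketch below runs the same handler logic outside the database. The `runPartition` helper is a hypothetical stand-in for the engine and a simplification: it assumes `initialize` is called once per partition and `processRow` once per row, in the order given by the OVER clause.

```javascript
// The handler object from the function body above, with the same logic.
const handler = {
  processRow: function (row, rowWriter) {
    this.sum += row.VAL;
    if (this.sum < 0) this.sum = 0;
    rowWriter.writeRow({ OUT_SUM: this.sum });
  },
  initialize: function () {
    this.sum = 0;
  },
};

// Hypothetical stand-in for the engine (an assumption, not a real API):
// initialize once, feed the rows in order, collect what gets written.
function runPartition(h, values) {
  const out = [];
  const rowWriter = { writeRow: (r) => out.push(r.OUT_SUM) };
  h.initialize();
  for (const v of values) h.processRow({ VAL: v }, rowWriter);
  return out;
}

console.log(runPartition(handler, [8, 11, 30, -10, -12, -9, -26, 5, -23, 12, 14]));
// prints [8, 19, 49, 39, 27, 18, 0, 5, 0, 12, 26]
```

The printed values match the OUT_SUM column above. The running state lives in `this.sum`, which is reset only when `initialize` runs.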

Another UDF solution:

select d, x, conditional_sum(x) from values 
  ('2020-01-01',   8), 
  ('2020-01-02',  11), 
  ('2020-01-03',  30), 
  ('2020-01-04', -10), 
  ('2020-01-05', -12), 
  ('2020-01-06',  -9), 
  ('2020-01-07', -26), 
  ('2020-01-08',   5), 
  ('2020-01-09', -23), 
  ('2020-01-10',  12), 
  ('2020-01-11',  14)
  t(d,x)
order by d;

where conditional_sum is defined as:

create or replace function conditional_sum(X float) 
returns float 
language javascript 
volatile
as 
$$
    if (!('sum' in this)) this.sum = 0
    return this.sum = (X+this.sum)<0 ? 0 : this.sum+X 
$$;

Demo:

WITH input AS
(   SELECT *
    FROM (VALUES 
        ('2020-01-01', 8, 8),
        ('2020-01-02', 11, 19 ),
        ('2020-01-03', 30, 49 ),
        ('2020-01-04',-10, 39 ),
        ('2020-01-05',-12, 27 ),
        ('2020-01-06', -9, 18 ),
        ('2020-01-07',-26,  0 ),
        ('2020-01-08',  5,  5 ),
        ('2020-01-09',-23,  0 ),
        ('2020-01-10', 12, 12 ),
        ('2020-01-11', 14, 26 ),
        ('2020-01-12', 3,  26 )) AS d (day,num,wanted)
)
SELECT *, sum(num)over(order by day) AS CUM_SUM,
       CASE SIGN(sum(num)over(order by day)) 
          WHEN -1 THEN 0 
          ELSE sum(num)over(order by day) 
       END AS Actual
FROM   input 
ORDER BY day;

Returns:

day                num      wanted     CUM_SUM      Actual
---------- ----------- ----------- ----------- -----------
2020-01-01           8           8           8           8
2020-01-02          11          19          19          19
2020-01-03          30          49          49          49
2020-01-04         -10          39          39          39
2020-01-05         -12          27          27          27
2020-01-06          -9          18          18          18
2020-01-07         -26           0          -8           0
2020-01-08           5           5          -3           0
2020-01-09         -23           0         -26           0
2020-01-10          12          12         -14           0
2020-01-11          14          26           0           0
2020-01-12           3          26           3           3

I added one more row to your test values to show that the final conditional sum is 3.
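
The Actual and wanted columns above diverge from 2020-01-08 onward. The following sketch, with function names invented for this example, puts the two calculations side by side: clamping the ordinary running total (what the CASE SIGN(...) expression does) versus restarting the total from zero (what the question describes).

```javascript
const nums = [8, 11, 30, -10, -12, -9, -26, 5, -23, 12, 14, 3];

// Plain running total, with negative values shown as 0. The underlying
// total still carries the deficit forward.
function clampOfRunningTotal(values) {
  let total = 0;
  return values.map((v) => {
    total += v;
    return total < 0 ? 0 : total;
  });
}

// Running total that itself restarts from 0 when it would go negative.
function runningTotalWithFloor(values) {
  let total = 0;
  return values.map((v) => (total = Math.max(0, total + v)));
}

console.log(clampOfRunningTotal(nums));
// prints [8, 19, 49, 39, 27, 18, 0, 0, 0, 0, 0, 3]
console.log(runningTotalWithFloor(nums));
// prints [8, 19, 49, 39, 27, 18, 0, 5, 0, 12, 26, 29]
```

The first output reproduces the Actual column. The second follows the floor-at-zero rule from the question, under which the added 2020-01-12 row comes to 29.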