teradata, reset when, partition by, order by
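I need help understanding the code below. I have never seen RESET WHEN used in Teradata. What does RESET WHEN do in Teradata? I understand the PARTITION BY and ORDER BY parts. I was also unsure why this wasn't partitioned by PARTITION BY A.ACCT_DIM_NB, A.DAY_TIME_DIM_NB ORDER BY A.TXN_POSTING_SEQ. Also, is ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW just using the whole partitioned window?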
Removed
I was also unsure why this wasn't partitioned by PARTITION BY Y.ACCT_DIM_NB, Y.DAY_TIME_DIM_NB ORDER BY Y.DAY_TIME_DIM_NB, Y.TXN_POSTING_SEQ
No idea, but that would return a different result (Y.DAY_TIME_DIM_NB is not needed in the ORDER BY because the data is already partitioned by it).
Also, is ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW just using the whole partitioned window?
No. It is exactly the same as ROWS UNBOUNDED PRECEDING, i.e. a syntax variation for a cumulative max. The whole partition would be ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING.
What does RESET WHEN do in Teradata?
RESET WHEN is a Teradata extension that adds partitions dynamically. It is a shorter syntax for two (in your case) or three nested OLAP functions:
-- using RESET WHEN
MAX(A.RUN_BAL_AM)
OVER (PARTITION BY A.ACCT_DIM_NB
ORDER BY A.DAY_TIME_DIM_NB, A.TXN_POSTING_SEQ
RESET WHEN A.CS_TXN_CD NOT IN ('072','075','079','107','111','112','139','181','318')
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS EOD_BAL_AM
-- Same result using Standard SQL
SELECT
Max(dt.RUN_BAL_AM)
Over (PARTITION BY dt.ACCT_DIM_NB, dt.dynamic_partition
ORDER BY dt.DAY_TIME_DIM_NB, dt.TXN_POSTING_SEQ
ROWS BETWEEN Unbounded Preceding AND CURRENT ROW) AS EOD_BAL_AM
FROM
(
SELECT
-- pass through the columns the outer query needs
A.ACCT_DIM_NB, A.DAY_TIME_DIM_NB, A.TXN_POSTING_SEQ, A.RUN_BAL_AM,
-- this cumulative sum over 0/1 assigns a new value for each series of rows based on the CASE
Sum(CASE WHEN A.CS_TXN_CD NOT IN ('072','075','079','107','111','112','139','181','318') THEN 1 ELSE 0 END)
Over (PARTITION BY A.ACCT_DIM_NB
ORDER BY A.DAY_TIME_DIM_NB, A.TXN_POSTING_SEQ
ROWS Unbounded Preceding) AS dynamic_partition
FROM ...
) AS dt
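To see why the cumulative sum reproduces RESET WHEN, here is a minimal Python sketch of the same two-step idea. It is not Teradata code: the function name and the sample balances are made up for illustration, it handles a single account only, and it assumes the rows are already sorted in the ORDER BY order and that the row on which the condition is true starts the new group.

```python
from itertools import accumulate

def running_max_with_reset(values, reset_flags):
    """Emulate a cumulative MAX with RESET WHEN for one partition.

    values      -- the column being maximised, already in ORDER BY order
    reset_flags -- True on rows where the RESET WHEN condition holds
    """
    # Step 1: cumulative sum over 0/1 flags gives the "dynamic_partition" id.
    # A row whose flag is True increments the id, so it starts a new group.
    group_ids = list(accumulate(1 if flag else 0 for flag in reset_flags))

    # Step 2: cumulative max that restarts whenever the group id changes.
    result = []
    current_group = None
    current_max = None
    for value, group_id in zip(values, group_ids):
        if group_id != current_group:
            current_group, current_max = group_id, value
        else:
            current_max = max(current_max, value)
        result.append(current_max)
    return group_ids, result

# Hypothetical balances for one account; the fourth row triggers a reset.
balances = [100, 80, 120, 50, 70, 60]
resets = [False, False, False, True, False, False]
group_ids, eod = running_max_with_reset(balances, resets)
print(group_ids)  # [0, 0, 0, 1, 1, 1]
print(eod)        # [100, 100, 120, 50, 70, 70]
```

The earlier maximum of 120 is forgotten at the fourth row, which is the whole point of the reset.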
What does RESET WHEN do in Teradata?
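It resets the window accumulation when the clause is true. There are plenty of examples of this on the web, but in your case I imagine (I have never seen it used with max) it effectively defines a point from which the max is calculated: every time a transaction code (CS_TXN_CD) that is not in the given list is encountered, the max is calculated only from that point onward.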
I was also unsure why this wasn't partitioned by PARTITION BY Y.ACCT_DIM_NB, Y.DAY_TIME_DIM_NB ORDER BY Y.DAY_TIME_DIM_NB, Y.TXN_POSTING_SEQ .
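Why do you think it should be? Partitioning and ordering are very different things. If you have a banking system you would probably partition by account, but if you are preparing a bank statement you would order the transactions by date.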
Also, is ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW just using the whole partitioned window?
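It defines the stretch of rows the accumulator should look at to come up with its answer. In your case the max is calculated only from the preceding rows up to and including the current row. Unbounded preceding means all rows since the start of the partition. Current row means exactly that. Other valid examples might be: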
ROWS BETWEEN 200 preceding and current row
ROWS BETWEEN 10 preceding and 20 following
ROWS BETWEEN current row and unbounded following
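As a rough illustration of how those frame clauses pick rows, here is a small Python sketch. The helper rows_frame and its parameters are hypothetical, not part of any database API; it only shows which row positions (0-based, inclusive) fall inside a ROWS frame for a given current row, clipped to the partition.

```python
def rows_frame(n_rows, i, preceding=None, following=0):
    """Return (start, end) inclusive row positions of the frame for row i.

    n_rows    -- number of rows in the partition
    preceding -- rows before the current row, None means UNBOUNDED PRECEDING
    following -- rows after the current row, None means UNBOUNDED FOLLOWING
    """
    start = 0 if preceding is None else max(0, i - preceding)
    end = n_rows - 1 if following is None else min(n_rows - 1, i + following)
    return start, end

# A partition of 1000 rows, current row at position 500.
print(rows_frame(1000, 500, preceding=None, following=0))   # (0, 500)
print(rows_frame(1000, 500, preceding=200, following=0))    # (300, 500)
print(rows_frame(1000, 500, preceding=10, following=20))    # (490, 520)
print(rows_frame(1000, 500, preceding=0, following=None))   # (500, 999)
```

The first call corresponds to ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW; the other three correspond to the examples listed above.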
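Because your window is defined only over preceding rows, the max stays at the largest value seen so far as you move down the row order, until a new maximum appears in the data. For example: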
Data,max
3,3
2,3
1,3
4,4
1,4
3,4
1,4
5,5
4,5
2,5
4,5
9,9
5,9
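As you go from top to bottom, as soon as the current row holds a value greater than the known max, it becomes the new max. Without the restriction to preceding rows, i.e. if the max were taken over the whole data set, the max reported on every row would be 9.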
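The difference between the two frames can be checked with a few lines of Python on the same Data column. This is only a sketch of the arithmetic, not of how the database evaluates the window.

```python
from itertools import accumulate

data = [3, 2, 1, 4, 1, 3, 1, 5, 4, 2, 4, 9, 5]

# ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW: running (cumulative) max.
cumulative_max = list(accumulate(data, max))

# ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING: whole-partition max.
whole_partition_max = [max(data)] * len(data)

print(cumulative_max)       # [3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 9, 9]
print(whole_partition_max)  # [9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9]
```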