What is the default window frame for window functions?
Running the following code:
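import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.sum
import spark.implicits._  // toDF and the $"..." column syntax (already in scope in spark-shell)

val sales = Seq(
  (0, 0, 0, 5),
  (1, 0, 1, 3),
  (2, 0, 2, 1),
  (3, 1, 0, 2),
  (4, 2, 0, 8),
  (5, 2, 2, 8))
  .toDF("id", "orderID", "prodID", "orderQty")

// Window ordered by id, with no explicit frame
val orderedByID = Window.orderBy($"id")
// Sum of orderQty over that window
val totalQty = sum($"orderQty").over(orderedByID).as("running_total")
val salesTotalQty = sales.select($"*", totalQty).orderBy($"id")
salesTotalQty.show()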
The result is:
+---+-------+------+--------+-------------+
| id|orderID|prodID|orderQty|running_total|
+---+-------+------+--------+-------------+
|  0|      0|     0|       5|            5|
|  1|      0|     1|       3|            8|
|  2|      0|     2|       1|            9|
|  3|      1|     0|       2|           11|
|  4|      2|     0|       8|           19|
|  5|      2|     2|       8|           27|
+---+-------+------+--------+-------------+
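No window frame is defined in the code above. It looks like the default window frame is rowsBetween(Window.unboundedPreceding, Window.currentRow).
I'm not sure whether my understanding of the default window frame is correct.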
Default frame specification depends on other aspects of a given window definition:
- if the ORDER BY clause is specified and the function accepts the frame specification, then the frame specification is defined by RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW,
- otherwise the frame specification is defined by ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING.
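In the question's example the ordering column id is unique, so a RANGE frame and a ROWS frame produce the same running total, which is why the default looks like rowsBetween. The difference shows up when the ordering column has ties. Here is a minimal sketch of that, assuming a local SparkSession; the app name and the column aliases default_frame, range_frame and rows_frame are made up for illustration:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.sum

// Assumption: a local session for a standalone run; in spark-shell `spark` already exists.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("default-frame-demo")
  .getOrCreate()
import spark.implicits._

val sales = Seq(
  (0, 0, 0, 5),
  (1, 0, 1, 3),
  (2, 0, 2, 1),
  (3, 1, 0, 2),
  (4, 2, 0, 8),
  (5, 2, 2, 8))
  .toDF("id", "orderID", "prodID", "orderQty")

// Order by orderID, which contains duplicates, so the two frame types can differ.
val byOrder = Window.orderBy($"orderID")
val rangeFrame = byOrder.rangeBetween(Window.unboundedPreceding, Window.currentRow)
val rowsFrame = byOrder.rowsBetween(Window.unboundedPreceding, Window.currentRow)

val result = sales.select(
  $"id", $"orderID", $"orderQty",
  sum($"orderQty").over(byOrder).as("default_frame"),  // no explicit frame
  sum($"orderQty").over(rangeFrame).as("range_frame"), // explicit RANGE frame
  sum($"orderQty").over(rowsFrame).as("rows_frame")    // explicit ROWS frame
).orderBy($"id")

result.show()
```

With a RANGE frame, all rows that share the current row's orderID are peers and are included in the sum, so default_frame and range_frame should both read 9 for the three orderID 0 rows, 11 for orderID 1 and 27 for the two orderID 2 rows. rows_frame accumulates one row at a time instead, and the order in which tied rows are visited is not guaranteed, so its values within a group of ties can vary between runs.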