ClickHouse array: find the longest chain of a repeating number in an array
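In ClickHouse, I have a column containing arrays of Int16 elements. I am looking for a way to find the longest chain of the repeated number 1.
For example, in the array [0,1,1,1,5,1,1,1,1,1,2] the longest chain of repeated 1s is 5 elements long. Is there a way to do this with the existing functions?
Try this query: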
SELECT
    /* The number to search for. */
    data.1 AS number,
    /* The source array. */
    data.2 AS array,
    /* Running length of the current chain: +1 on a match; on a mismatch, -index drives
       the sum below zero, which arrayCumSumNonNegative resets to 0. */
    arrayCumSumNonNegative((x, index) -> x = number ? 1 : -index, array, arrayEnumerate(array)) AS partiallySumArray,
    /* The longest chain is the largest running length. */
    arrayReduce('max', partiallySumArray) AS result
FROM
(
    /* Test data set. */
    SELECT arrayJoin([
        /* Search for 1. */
        (1, []),
        (1, [0, 2, 2, 2, 5]),
        (1, [0, 1, 1, 1, 5, 1, 1, 1, 1, 1, 2]),
        (1, [1, 1, 1, 2, 3, 4, 5, 1, 1]),
        (1, [-5, 100, 1, 1, 0, 1, 1, 1]),
        (1, [1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 0]),
        /* Search for 5. */
        (5, []),
        (5, [0, 2, 2, 2, 55]),
        (5, [5, 5, 10, 300, 5, 77, 5])
    ]) AS data
)
FORMAT Vertical
/* Result:
Row 1:
──────
number: 1
array: []
partiallySumArray: []
result: 0
Row 2:
──────
number: 1
array: [0,2,2,2,5]
partiallySumArray: [0,0,0,0,0]
result: 0
Row 3:
──────
number: 1
array: [0,1,1,1,5,1,1,1,1,1,2]
partiallySumArray: [0,1,2,3,0,1,2,3,4,5,0]
result: 5
Row 4:
──────
number: 1
array: [1,1,1,2,3,4,5,1,1]
partiallySumArray: [1,2,3,0,0,0,0,1,2]
result: 3
Row 5:
──────
number: 1
array: [-5,100,1,1,0,1,1,1]
partiallySumArray: [0,0,1,2,0,1,2,3]
result: 3
Row 6:
──────
number: 1
array: [1,1,0,1,1,1,1,1,1,0,0]
partiallySumArray: [1,2,0,1,2,3,4,5,6,0,0]
result: 6
Row 7:
──────
number: 5
array: []
partiallySumArray: []
result: 0
Row 8:
──────
number: 5
array: [0,2,2,2,55]
partiallySumArray: [0,0,0,0,0]
result: 0
Row 9:
──────
number: 5
array: [5,5,10,300,5,77,5]
partiallySumArray: [1,2,0,0,1,0,1]
result: 2
*/
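To see why the lambda works, it can help to look at the intermediate values for a single array. The query below is only a sketch and not part of the original answer: the aliases arr, steps, runLengths and longestRun are made up for illustration, and the separate arrayMap step is added purely to expose what the lambda produces before the cumulative sum is taken. It assumes the example array from the question, cast to Array(Int16).

```sql
WITH CAST([0, 1, 1, 1, 5, 1, 1, 1, 1, 1, 2], 'Array(Int16)') AS arr
SELECT
    /* +1 for every element equal to 1, minus the 1-based position otherwise. */
    arrayMap((x, i) -> if(x = 1, 1, -toInt64(i)), arr, arrayEnumerate(arr)) AS steps,
    /* Cumulative sum that is reset to 0 whenever it would go negative. */
    arrayCumSumNonNegative(steps) AS runLengths,
    /* The largest running length is the longest chain. */
    arrayReduce('max', runLengths) AS longestRun
FORMAT Vertical

/* Expected values:
steps:      [-1,1,1,1,-5,1,1,1,1,1,-11]
runLengths: [0,1,2,3,0,1,2,3,4,5,0]
longestRun: 5
*/
```

The penalty has to be at least as large as the longest chain that could precede it. Using the position works because at position i the running sum can be at most i - 1, so subtracting i always pushes it below zero and forces the reset.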