How to use a ClickHouse partition value in a SQL query?
I have a table partitioned by a tuple, with partitions (0, 0), (0, 1), (1, 0), (1, 1), (2, 0), (2, 1), (3, 0), ...
CREATE TABLE my_table
(
id Int32,
a Int32,
b Float32,
c Int32
)
ENGINE = MergeTree
PARTITION BY
(
intDiv(id, 1000000),
a < 20000 AND b > 0.6 AND c >= 100
)
ORDER BY id;
I only need rows from the partitions (<any number>, 1), and I am looking for a way to use the partition value in a query, for example:
SELECT *
FROM my_table
WHERE my_table.partition[2] == 1;
Does ClickHouse have such a feature?
Version 21.6 added the virtual columns _partition_id and _partition_value, which can help here:
SELECT
*,
_partition_id,
_partition_value
FROM my_table
WHERE (_partition_value.2) = 1
What is wrong with filtering on the partition expression itself?
where (a < 20000 AND b > 0.6 AND c >= 100) = 1
insert into my_table select 1, 3000000, 0, 0 from numbers(100000000);
insert into my_table select 1, 0, 10, 200 from numbers(100);
SET send_logs_level = 'debug';
set force_index_by_date=1;
select sum(id) from my_table where (a < 20000 AND b > 0.6 AND c >= 100) = 1;
...Selected 1/7 parts by partition key...
┌─sum(id)─┐
│     100 │
└─────────┘
1 rows in set. Elapsed: 0.002 sec.
That said, (_partition_value.2) = 1 will be faster, because it does not need to read the columns a, b, and c for filtering.
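To make the comparison concrete, here is a minimal sketch of the virtual-column variant of the test query, preceded by a lookup in system.parts to see which partition tuples actually exist. It assumes ClickHouse 21.6 or later and that my_table lives in the current database; neither query comes from the answers above, and no output is shown because it depends on the data loaded.

```sql
-- List the active parts of my_table and the partition tuple each one belongs to.
-- Assumption: my_table is in the current database.
SELECT partition, partition_id, name, rows
FROM system.parts
WHERE database = currentDatabase()
  AND table = 'my_table'
  AND active
ORDER BY partition_id;

-- Same aggregate as the test above, but filtered on the partition value itself
-- (second tuple element), so columns a, b and c are not needed for the filter.
SELECT sum(id)
FROM my_table
WHERE (_partition_value.2) = 1;
```

Running the second query with send_logs_level = 'debug', as in the test above, should show how many parts were selected.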