How to use ClickHouse partition value in SQL query?

Question

I have a table with tuple partitions: (0, 0), (0, 1), (1, 0), (1, 1), (2, 0), (2, 1), (3, 0), ...

CREATE TABLE my_table
(
    id Int32,
    a Int32,
    b Float32,
    c Int32
)
ENGINE = MergeTree
PARTITION BY
(
    intDiv(id, 1000000),
    a < 20000 AND b > 0.6 AND c >= 100
)
ORDER BY id;

I only need rows from the partitions (<any number>, 1), and I am looking for a way to use the partition value in a query, for example:

SELECT *
FROM my_table
WHERE my_table.partition[2] == 1;

Does ClickHouse have such a feature?

Answer 1

The virtual columns _partition_id and _partition_value, added in version 21.6, can help here:

SELECT
    *,
    _partition_id,
    _partition_value
FROM my_table
WHERE (_partition_value.2) = 1
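
To see which partition tuples actually exist before filtering on them, the same virtual columns can be grouped on. This is a minimal sketch, assuming a ClickHouse version in which _partition_id and _partition_value are available (21.6 or later, per the answer above); row_count is only an illustrative alias.

```sql
-- List every partition of my_table with its tuple value and row count.
-- _partition_value is the tuple produced by the PARTITION BY expression,
-- so its second element is the result of
-- (a < 20000 AND b > 0.6 AND c >= 100).
SELECT
    _partition_id,
    _partition_value,
    count() AS row_count  -- illustrative alias, not from the original post
FROM my_table
GROUP BY
    _partition_id,
    _partition_value
ORDER BY _partition_id;
```

Partitions whose second tuple element is 1 are the ones the WHERE clause above selects.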

Answer 2

What is wrong with

where (a < 20000 AND b > 0.6 AND c >= 100) = 1

?

insert into my_table select 1, 3000000, 0, 0 from numbers(100000000);
insert into my_table select 1, 0, 10, 200 from numbers(100);

SET send_logs_level = 'debug';
set force_index_by_date=1;

select sum(id) from my_table where (a < 20000 AND b > 0.6 AND c >= 100) = 1;
           
...Selected 1/7 parts by partition key...

┌─sum(id)─┐
│     100 │
└─────────┘
1 rows in set. Elapsed: 0.002 sec.

That said, (_partition_value.2) = 1 will be faster, because it does not need to read the columns a, b, and c to filter.
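
Partition pruning can also be checked without reading the debug log. The following sketch assumes a ClickHouse version that supports EXPLAIN with the indexes setting. The plan should include a section for the partition key showing how many parts were selected; the exact layout of the output varies between versions.

```sql
-- Show which indexes (including the partition key) are used to prune parts
-- for the predicate that repeats the partition expression.
EXPLAIN indexes = 1
SELECT sum(id)
FROM my_table
WHERE (a < 20000 AND b > 0.6 AND c >= 100) = 1;
```

Running the same EXPLAIN with WHERE (_partition_value.2) = 1 allows the two filtering approaches to be compared on the same data.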
