ClickHouse ASOF LEFT JOIN: right table Nullable column is not implemented

I tried to find an answer in the documentation and on other forums, but could not work it out. Documentation: https://clickhouse.com/docs/en/sql-reference/statements/select/join/

My table is similar to the one in a related Stack Overflow question.

I want to get the order price just prior to each trade (and return the order timestamp together with the information about the trade).

With the code below I get the following error: SQL Error [48]: ClickHouse exception, code: 48, DB::Exception: ASOF join over right table Nullable column is not implemented

I tried adding an IS NOT NULL condition, but it made no difference.

Code:

WITH 
orders as(
    SELECT order_timestamp, order_price, product_id
    FROM order_table
    WHERE
        ( order_timestamp >= toInt32(toDateTime64('2022-01-02 10:00:00.000', 3))*1000
            AND order_timestamp <= toInt32(toDateTime64('2022-01-02 12:00:00.000', 3))*1000)
        AND product_id = 'SPXFUT'
        AND order_timestamp IS NOT NULL),
trades as (
    SELECT 
       product_id,
       trade_timestamp,
       price
    FROM trades_table
    WHERE
         trade_timestamp >= toInt32(toDateTime64('2021-12-02 10:00:00.000', 3))*1000 AND trade_timestamp <= toInt32(toDateTime64('2021-12-02 12:00:00.000', 3))*1000
         AND product_id = 'SPXFUT'
         AND trade_timestamp IS NOT NULL),
results as(
SELECT 
    tt.product_id,
    tt.trade_timestamp,
    tt.price,
    o.order_timestamp,
    o.order_price
    FROM trades tt
    ASOF LEFT JOIN orders o
    ON (tt.product_id = o.product_id ) AND (tt.trade_timestamp >= o.order_timestamp ))
SELECT *
FROM results

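The current implementation of ASOF LEFT JOIN requires the right-side column used in the inequality to be of a non-Nullable type. The type comes from the table definition, not from the rows that survive a filter, so if your order_table declares order_timestamp as something like Nullable(Int64) (which it presumably does), ClickHouse will refuse to run the query and raise an exception like: ClickHouse exception, code: 48, DB::Exception: ASOF join over right table Nullable column is not implemented.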
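
To confirm that the declared type is the cause, it may help to inspect it directly. This is a minimal sketch that assumes the table lives in the current database; the names order_table and order_timestamp are taken from the question.

```sql
-- Declared type of the join column; a result such as Nullable(Int64)
-- would explain the exception.
SELECT name, type
FROM system.columns
WHERE database = currentDatabase()
  AND table = 'order_table'
  AND name = 'order_timestamp';

-- A WHERE filter removes NULL rows but does not change the column type,
-- so this is expected to still report a Nullable(...) type.
SELECT toTypeName(order_timestamp)
FROM order_table
WHERE order_timestamp IS NOT NULL
LIMIT 1;
```

The second query shows why the IS NOT NULL condition in the question has no effect on the error: the check is made against the column type, not against the values that remain after filtering.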

As a solution, you can wrap o.order_timestamp in the assumeNotNull function inside the ASOF LEFT JOIN condition:

ON (tt.product_id = o.product_id ) AND (tt.trade_timestamp >= assumeNotNull(o.order_timestamp) ))

But keep the behavior of assumeNotNull in mind (see its documentation): for a NULL input it returns a non-null value of the type (the default value), which might give wrong results for rows where o.order_timestamp is NULL. The result is implementation specific and can cause further problems (see "assumeNotNull and friends").
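
One way to limit that risk is to drop the NULL rows in the same subquery that unwraps the value, so the unwrapped column only contains values that were never NULL. Below is a sketch of the orders CTE under that assumption. The alias order_ts is hypothetical and is deliberately different from the column name, so that the WHERE clause keeps referring to the original column rather than to the alias.

```sql
WITH orders AS (
    SELECT
        assumeNotNull(order_timestamp) AS order_ts,  -- non-Nullable type after unwrapping
        order_price,
        product_id
    FROM order_table
    WHERE product_id = 'SPXFUT'
      AND order_timestamp IS NOT NULL                -- rows with a NULL timestamp are excluded
)
SELECT order_ts, order_price, product_id
FROM orders;
```

With a CTE shaped like this, the join condition would compare tt.trade_timestamp >= o.order_ts and would no longer need a wrapper in the ON clause.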

Another solution is to use ifNull and supply a suitable replacement for missing values, which avoids the potential problems of assumeNotNull:

ON (tt.product_id = o.product_id ) AND (tt.trade_timestamp >= ifNull(o.order_timestamp,0) ))

The last suggestion is to change the column's data type to a non-Nullable one:

ALTER TABLE order_table MODIFY COLUMN order_timestamp <TypeName>;

But this depends on your use case; if order_timestamp must be able to accept NULL values, I would suggest the second solution.
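
If changing the column type is an option, existing NULLs probably need to be handled first, because the conversion to a non-Nullable type can fail when NULL values are present. The following is only a sketch under stated assumptions: Int64 as the target type and 0 as the replacement value are placeholders, and ALTER TABLE ... UPDATE is a mutation that cannot be applied to key columns.

```sql
-- 1. How many rows would be affected?
SELECT count() FROM order_table WHERE order_timestamp IS NULL;

-- 2. Replace the NULLs (runs as a mutation; 0 is a placeholder value).
ALTER TABLE order_table UPDATE order_timestamp = 0 WHERE order_timestamp IS NULL;

-- 3. Once the mutation has finished, switch to the non-Nullable type (Int64 is assumed).
ALTER TABLE order_table MODIFY COLUMN order_timestamp Int64;
```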