Multiple left joins with aggregation on same table causes huge performance hit in SAP HANA

Question

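I am joining two tables on HANA. To get some statistics, I LEFT JOIN the items table three times to obtain the total count, the number of processed entries, and the number of errors, as shown below.

This is a development system, and the items table holds only 1,500 entries. Even so, the query below runs for 17 seconds.

When I remove any one of the three aggregation terms (but leave the corresponding JOIN in place), the query executes almost immediately.

I have also tried adding indexes on the fields used in the specific JOINs, but that made no difference.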

select rk.guid, rk.run_id, rk.status, rk.created_at, rk.created_by, 
count( distinct rp.guid ), 
count( distinct rp2.guid ), 
count( distinct rp3.guid )
    from zbsbpi_rk as rk
    left join zbsbpi_rp as rp
      on rp.header = rk.guid
    left join zbsbpi_rp as rp2
      on rp2.header = rk.guid
     and rp2.processed = 'X'
    left join zbsbpi_rp as rp3
      on rp3.header = rk.guid
     and rp3.result_status = 'E'
    where rk.run_id = '0000000010'
    group by rk.guid, run_id, status, created_at, created_by
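
One plausible explanation for the slowdown, offered here as an assumption and not something measured on HANA, is row fan-out: because all three joins match on the header only, a header with n items, p processed items and e error items yields n * p * e intermediate rows before the grouping runs. The sketch below reproduces the join shape in SQLite through Python's sqlite3 module. The table and column names come from the query above; the data and the helper names `build_demo_db` and `intermediate_rows` are made up for illustration.

```python
import sqlite3


def build_demo_db(n_items=30):
    """Create an in-memory database with one header and n_items items (made-up data)."""
    con = sqlite3.connect(":memory:")
    con.execute("create table zbsbpi_rk (guid text, run_id text)")
    con.execute(
        "create table zbsbpi_rp "
        "(guid text, header text, processed text, result_status text)"
    )
    con.execute("insert into zbsbpi_rk values ('H1', '0000000010')")
    rows = [
        # every 2nd item is processed, every 3rd item is an error
        (f"I{i}", "H1", "X" if i % 2 == 0 else "", "E" if i % 3 == 0 else "S")
        for i in range(n_items)
    ]
    con.executemany("insert into zbsbpi_rp values (?, ?, ?, ?)", rows)
    return con


# Same join shape as the question, but counting the rows produced
# by the joins instead of aggregating them.
FANOUT_SQL = """
select count(*)
  from zbsbpi_rk as rk
  left join zbsbpi_rp as rp  on rp.header = rk.guid
  left join zbsbpi_rp as rp2 on rp2.header = rk.guid and rp2.processed = 'X'
  left join zbsbpi_rp as rp3 on rp3.header = rk.guid and rp3.result_status = 'E'
 where rk.run_id = '0000000010'
"""


def intermediate_rows(n_items=30):
    """Return how many rows the three joins produce for a single header."""
    con = build_demo_db(n_items)
    return con.execute(FANOUT_SQL).fetchone()[0]


print(intermediate_rows(30))  # 30 items * 15 processed * 10 errors = 4500
```

With only 30 items the joins already produce 4,500 rows, so the growth is roughly cubic in the number of items per header. Whether HANA actually materializes all of these rows depends on its optimizer, which this sketch does not model.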

Answer 1

I think you can rewrite the query to improve performance:

select rk.guid, rk.run_id, rk.status, rk.created_at, rk.created_by, 
count( distinct rp.guid ), 
count( distinct (CASE WHEN rp.processed = 'X' then rp.guid else null end) ), 
count( distinct (CASE WHEN rp.result_status = 'E' then rp.guid else null end))
    from zbsbpi_rk as rk
    left join zbsbpi_rp as rp
      on rp.header = rk.guid
where rk.run_id = '0000000010'
    group by rk.guid, run_id, status, created_at, created_by

I'm not entirely sure whether the count distinct case construct works on HANA, but you can give it a try.
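
As a quick sanity check of the idea, the following sketch compares the original three-join query with the single-join `count( distinct case ... )` rewrite on a tiny made-up data set. It runs on SQLite via Python's sqlite3 module, so it only shows that the two formulations agree logically; it does not confirm HANA syntax support or HANA performance. The sample rows and the variable names are assumptions for illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
create table zbsbpi_rk (guid text, run_id text);
create table zbsbpi_rp (guid text, header text, processed text, result_status text);
-- made-up data: H1 has three items, H2 has none
insert into zbsbpi_rk values ('H1', '0000000010'), ('H2', '0000000010');
insert into zbsbpi_rp values
  ('I1', 'H1', 'X', 'S'),
  ('I2', 'H1', 'X', 'E'),
  ('I3', 'H1', '',  'S');
""")

# Shape of the query from the question (three joins on the header).
THREE_JOINS = """
select rk.guid,
       count(distinct rp.guid), count(distinct rp2.guid), count(distinct rp3.guid)
  from zbsbpi_rk as rk
  left join zbsbpi_rp as rp  on rp.header = rk.guid
  left join zbsbpi_rp as rp2 on rp2.header = rk.guid and rp2.processed = 'X'
  left join zbsbpi_rp as rp3 on rp3.header = rk.guid and rp3.result_status = 'E'
 where rk.run_id = '0000000010'
 group by rk.guid
 order by rk.guid
"""

# Shape of the rewrite from this answer (one join, conditional counts).
CONDITIONAL = """
select rk.guid,
       count(distinct rp.guid),
       count(distinct case when rp.processed = 'X' then rp.guid end),
       count(distinct case when rp.result_status = 'E' then rp.guid end)
  from zbsbpi_rk as rk
  left join zbsbpi_rp as rp on rp.header = rk.guid
 where rk.run_id = '0000000010'
 group by rk.guid
 order by rk.guid
"""

three_join_result = con.execute(THREE_JOINS).fetchall()
conditional_result = con.execute(CONDITIONAL).fetchall()
print(three_join_result == conditional_result)  # True
```

A `case` expression without an `else` branch yields NULL for non-matching rows, and `count( distinct ... )` skips NULLs, which is why the explicit `else null` in the answer's query is optional.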

Answer 2

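Sorry, I forgot that I had already posted this question here. After not getting any joy here, I posted the same question on answers.sap.com: https://answers.sap.com/questions/172096/multiple-left-joins-with-aggregation-on-same-table.html

I eventually figured out the solution, and it was a bit of a "doh!" moment: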

  select rk.guid, rk.run_id, rk.status, rk.created_at, rk.created_by,
    count( distinct rp.guid ), 
    count( distinct rp2.guid ), 
    count( distinct rp3.guid )
    from zbsbpi_rk as rk
    join zbsbpi_rp as rp
      on rp.header = rk.guid
    left join zbsbpi_rp as rp2
      on rp2.guid = rp.guid
     and rp2.processed = 'X'
    left join zbsbpi_rp as rp3
      on rp3.guid = rp.guid
     and rp3.result_status = 'E'
    where rk.run_id = '0000000010'
    group by rk.guid, run_id, status, created_at, created_by

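The subsequent left joins only need to join to the first join on the same table, because the first join already contains a superset of all the records anyway.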
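
The effect of chaining the joins on the item guid can be sketched as follows, again using SQLite through Python's sqlite3 module with made-up data, and assuming `guid` is unique in `zbsbpi_rp`. Each item row then matches at most one `rp2` row and at most one `rp3` row, so the joins produce one row per item instead of a product. The sketch also shows one behavioral difference: with a plain `join` on `rp`, a header without any items drops out of the result, whereas keeping `left join` there (a variation not taken from the answer) returns it with zero counts. The `{first_join}` placeholder and the variable names are illustrative only.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
create table zbsbpi_rk (guid text, run_id text);
create table zbsbpi_rp (guid text, header text, processed text, result_status text);
-- made-up data: H1 has three items, H2 has none
insert into zbsbpi_rk values ('H1', '0000000010'), ('H2', '0000000010');
insert into zbsbpi_rp values
  ('I1', 'H1', 'X', 'S'),
  ('I2', 'H1', 'X', 'E'),
  ('I3', 'H1', '',  'S');
""")

# rp2 and rp3 are joined to rp by item guid, as in the query above.
# count(*) exposes how many rows the joins produce per header.
CHAINED = """
select rk.guid, count(*),
       count(distinct rp.guid), count(distinct rp2.guid), count(distinct rp3.guid)
  from zbsbpi_rk as rk
  {first_join} zbsbpi_rp as rp on rp.header = rk.guid
  left join zbsbpi_rp as rp2 on rp2.guid = rp.guid and rp2.processed = 'X'
  left join zbsbpi_rp as rp3 on rp3.guid = rp.guid and rp3.result_status = 'E'
 where rk.run_id = '0000000010'
 group by rk.guid
 order by rk.guid
"""

# Inner join on rp, as posted: H2 (no items) is not returned.
inner_result = con.execute(CHAINED.format(first_join="join")).fetchall()
# Left join on rp: H2 is returned with zero counts.
left_result = con.execute(CHAINED.format(first_join="left join")).fetchall()

print(inner_result)  # [('H1', 3, 3, 2, 1)]
print(left_result)   # [('H1', 3, 3, 2, 1), ('H2', 1, 0, 0, 0)]
```

For H1 the joins yield 3 rows, one per item, where the header-based joins from the question would yield 3 * 2 * 1 = 6 for the same data.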


sap

hana