Joining specific days of data to a daily time series (Teradata SQL)

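I had a hard time summarizing what I'm trying to do in the title, but my example should make it clear. I'm trying to write a more efficient query in Teradata that will be used in Tableau. I can get it done with the brute-force-and-ignorance approach, but after a while I run out of spool space, so I need to make it more efficient.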

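Say I have two tables, a customer table with customer attributes and a daily balance table (it's more complicated than this, but this is the important part). I want to write a query that returns the daily balances for each customer, along with additional columns that hold that customer's balances on specific days, regardless of the date field in the final table.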

Example:

Customer Table

CustID | CustState | CustType | ...
001    | NY        | A        | ...
002    | CA        | B        | ...    
003    | NC        | C        | ...

Balance Table

CustID | Date      | Balance
001    |04/01/2018 | 100
001    |04/02/2018 | 105
001    |04/03/2018 | 110
002    |04/01/2018 | 5000
002    |04/02/2018 | 15000
002    |04/03/2018 | 25

Final Query Result

CustID | CustState | Date      | Balance | Balance42 | Balance43
001    | NY        |04/01/2018 | 100     | 105       | 110
001    | NY        |04/02/2018 | 105     | 105       | 110
001    | NY        |04/03/2018 | 110     | 105       | 110
002    | CA        |04/01/2018 | 5000    | 15000     | 25
002    | CA        |04/02/2018 | 15000   | 15000     | 25
002    | CA        |04/03/2018 | 25      | 15000     | 25

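As you can see, the first four columns are straightforward, and the last two hold the balances for 4/2/2018 and 4/3/2018 respectively. I'm currently doing this as shown below, using multiple joins/subqueries to fetch the specific balances: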

select a.CustID
  , a.CustState
  , b.Date
  , sum(b.Balance) as Balance
  , c.Balance as Balance42
  , d.Balance as Balance43

from Customer a

inner join Balance b on a.CustID=b.CustID

inner join (
  select aa.CustID
    , sum(bb.Balance) as Balance
  from Customer aa
  inner join Balance bb on aa.CustID=bb.CustID
  where aa.CustType in ('A','B')
    and bb.Date=DATE '2018-04-02'
  group by aa.CustID
) c on a.CustID=c.CustID 

inner join (
  select aa.CustID
    , sum(bb.Balance) as Balance
  from Customer aa
  inner join Balance bb on aa.CustID=bb.CustID
  where aa.CustType in ('A','B')
    and bb.Date=DATE '2018-04-03'
  group by aa.CustID
) d on a.CustID=d.CustID

where a.CustType in ('A','B')

group by a.CustID
  , a.CustState
  , b.Date
  , c.Balance
  , d.Balance

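Is there a way to do this more efficiently with just one join/subquery? When I add too many joins/subqueries I start running out of spool space, but I have a specific business reason for wanting the final result structured this way.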

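I'm not sure I fully understand what you're trying to do, but it seems you should be able to do it in a single statement, using case expressions for your last two calculations: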

select a.CustID
  , a.CustState
  , b.Date
  , sum(b.Balance) as Balance
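  , sum(case when b.Date = DATE '2018-04-02' then b.Balance else null end) as Balance42
  , sum(case when b.Date = DATE '2018-04-03' then b.Balance else null end) as Balance43
from Customer a

inner join Balance b on a.CustID=b.CustID

where a.CustType in ('A','B')

-- non-aggregated columns must be grouped; note that with b.Date in the
-- group by, each conditional sum is non-null only on its own date's row
group by a.CustID
  , a.CustState
  , b.Date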

You need conditional aggregation, but in your case it is based on a windowed aggregate:

select a.CustID
  , a.CustState
  , b.Date
  , sum(b.Balance) as Balance 

  , max(case when b.Date=DATE '2018-04-02' then sum(b.Balance) end)
    over (partition by a.CustID) as Balance42

  , max(case when b.Date=DATE '2018-04-03' then sum(b.Balance) end)
    over (partition by a.CustID) as Balance43

from Customer a

inner join Balance b on a.CustID=b.CustID

where a.CustType in ('A','B')

group by a.CustID
  , a.CustState
  , b.Date
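
The nesting of sum inside max ... over works because the windowed aggregate is evaluated after the group by, so it sees one row per customer and day. A sketch that spells out the two steps with a derived table, which should return the same rows (the alias dly is made up here for illustration, and this has not been benchmarked for spool usage):

```sql
select dly.CustID
  , dly.CustState
  , dly.Date
  , dly.Balance
  -- step 2: copy the one matching day's balance onto every row of the customer
  , max(case when dly.Date=DATE '2018-04-02' then dly.Balance end)
    over (partition by dly.CustID) as Balance42
  , max(case when dly.Date=DATE '2018-04-03' then dly.Balance end)
    over (partition by dly.CustID) as Balance43
from (
  -- step 1: one row per customer and day
  select a.CustID
    , a.CustState
    , b.Date
    , sum(b.Balance) as Balance
  from Customer a
  inner join Balance b on a.CustID=b.CustID
  where a.CustType in ('A','B')
  group by a.CustID
    , a.CustState
    , b.Date
) dly
```

A customer with no row on one of the two dates still appears in the output; the corresponding column is simply NULL, because the case expression never matches within that partition.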

Alternative query without OLAP functions (works only if Customer.CustID is the PK):

with x as (
  select a.CustID
    , a.CustState
    , b.Date
    , sum(b.Balance) as Balance 
  from Customer a
  inner join Balance b on a.CustID=b.CustID
  where a.CustType in ('A','B')
  group by a.CustID
    , a.CustState
    , b.Date
)
select x.CustID
    , x.CustState
    , x.Date
    , x.Balance
    , d1.Balance as Balance42
    , d2.Balance as Balance43
from x
inner join x d1 on d1.CustID = x.CustID and d1.Date=DATE '2018-04-02'
inner join x d2 on d2.CustID = x.CustID and d2.Date=DATE '2018-04-03'
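
One caveat with the inner self-joins: a customer that has no balance row on one of the two dates drops out of the result entirely. If such customers should be kept, with NULL in the missing column, the self-joins can be written as left joins instead. A sketch under that assumption, reusing the same CTE:

```sql
with x as (
  select a.CustID
    , a.CustState
    , b.Date
    , sum(b.Balance) as Balance
  from Customer a
  inner join Balance b on a.CustID=b.CustID
  where a.CustType in ('A','B')
  group by a.CustID
    , a.CustState
    , b.Date
)
select x.CustID
    , x.CustState
    , x.Date
    , x.Balance
    , d1.Balance as Balance42  -- NULL when the customer has no 2018-04-02 row
    , d2.Balance as Balance43  -- NULL when the customer has no 2018-04-03 row
from x
-- the date filters stay in the on clause; moving them to a where clause
-- would turn the left joins back into inner joins
left join x d1 on d1.CustID = x.CustID and d1.Date=DATE '2018-04-02'
left join x d2 on d2.CustID = x.CustID and d2.Date=DATE '2018-04-03'
```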