Fill in missing dates in a SQL table and carry forward the last market value

I have the following table:

Account Netflow FeeAmount   Income  TWR MarketValue Date
33L951572   0.00    0.00    0.00    0.00    375645.74   3/31/2004
33L951572   5547.31 0.00    0.00    0.08    338817.64   12/31/2004
33L951572   13250.45    0.00    35.00   0.01    322791.22   12/31/2005
33L951572   344.12  0.00    310.66  0.02    328899.02   1/31/2006
33L951572   6168.03 0.00    69.78   0.03    326221.04   2/28/2006
33L951572   140.50  0.00    186.62  0.01    328616.53   3/31/2006

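I need this table to have a row for every month end, with the date always being the month-end date. However, there are gaps between the dates. You can see, for example, that 3/31/2004 jumps to 12/31/2004, and 12/31/2004 then jumps to 12/31/2005, after which the data is monthly.

I want to insert a row of zeros for every missing month. However, I also want each inserted row to carry the last known market value, whatever it was before the gap.

Ideally, the table would look like this: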

Account Netflow FeeAmount   Income  TWR MarketValue Date
33L951572   0.00    0.00    0.00    0.000   375,645.74  3/31/2004
33L951572   0.00    0.00    0.00    0.000   375,645.74  4/30/2004
33L951572   0.00    0.00    0.00    0.000   375,645.74  5/31/2004
33L951572   0.00    0.00    0.00    0.000   375,645.74  6/30/2004
33L951572   0.00    0.00    0.00    0.000   375,645.74  7/31/2004
33L951572   0.00    0.00    0.00    0.000   375,645.74  8/31/2004
33L951572   0.00    0.00    0.00    0.000   375,645.74  9/30/2004
33L951572   0.00    0.00    0.00    0.000   375,645.74  10/31/2004
33L951572   0.00    0.00    0.00    0.000   375,645.74  11/30/2004
33L951572   5,547.31    0.00    0.00    0.077   338,817.64  12/31/2004
33L951572   0.00    0.00    0.00    0.000   338,817.64  1/31/2005
33L951572   0.00    0.00    0.00    0.000   338,817.64  2/28/2005
33L951572   0.00    0.00    0.00    0.000   338,817.64  3/31/2005
33L951572   0.00    0.00    0.00    0.000   338,817.64  4/30/2005
33L951572   0.00    0.00    0.00    0.000   338,817.64  5/31/2005
33L951572   0.00    0.00    0.00    0.000   338,817.64  6/30/2005
33L951572   0.00    0.00    0.00    0.000   338,817.64  7/31/2005
33L951572   0.00    0.00    0.00    0.000   338,817.64  8/31/2005
33L951572   0.00    0.00    0.00    0.000   338,817.64  9/30/2005
33L951572   0.00    0.00    0.00    0.000   338,817.64  10/31/2005
33L951572   0.00    0.00    0.00    0.000   338,817.64  11/30/2005
33L951572   13,250.45   0.00    35.00   0.006   322,791.22  12/31/2005
33L951572   344.12  0.00    310.66  0.019   328,899.02  1/31/2006
33L951572   6,168.03    0.00    69.78   0.026   326,221.04  2/28/2006
33L951572   140.50  0.00    186.62  0.007   328,616.53  3/31/2006

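The query Clockwork-Muse provided below works perfectly if there is only one attribute to run the logic on. The attribute in the first example is Account.

However, I realized that some of my data needs to be partitioned by a second criterion, AssetClassCode, meaning that there are sub-attributes within an account. Here is the example again, with that attribute added: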

Account Netflow FeeAmount   Income  TWR AssetClassCode  AssetClass  MarketValue Date
33L951572   0   5   0   0.87947 1   Cash    1000    3/31/2004
33L951572   0   6   45  0.25564 2   Equity  2000    3/31/2004
33L951572   0   8   5   0.90677 3   Fixed   3000    3/31/2004
33L951572   123 5   2   0.29787 1   Cash    4000    7/30/2014
33L951572   456 4   4   0.55341 2   Equity  5000    7/30/2014
33L951572   657 2   45  0.10634 3   Fixed   6000    7/30/2014

This is the desired result:

Account Netflow FeeAmount   Income  TWR AssetClassCode  AssetClass   MarketValue    Date
    33L951572   0   5   0   0.88    1   Cash    1000    3/31/2004
    33L951572   0   6   45  0.26    2   Equity  2000    3/31/2004
    33L951572   0   8   5   0.91    3   Fixed   3000    3/31/2004
    33L951572   0   0   0   0.00    1   Cash    1000    4/30/2014
    33L951572   0   0   0   0.00    2   Equity  2000    4/30/2014
    33L951572   0   0   0   0.00    3   Fixed   3000    4/30/2014
    33L951572   0   0   0   0.00    1   Cash    1000    5/30/2014
    33L951572   0   0   0   0.00    2   Equity  2000    5/30/2014
    33L951572   0   0   0   0.00    3   Fixed   3000    5/30/2014
    33L951572   0   0   0   0.00    1   Cash    1000    6/30/2014
    33L951572   0   0   0   0.00    2   Equity  2000    6/30/2014
    33L951572   0   0   0   0.00    3   Fixed   3000    6/30/2014
    33L951572   123 5   2   0.30    1   Cash    4000    7/30/2014
    33L951572   456 4   4   0.55    2   Equity  5000    7/30/2014
    33L951572   657 2   45  0.11    3   Fixed   6000    7/30/2014

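Update

I am getting extra values. I created a new table called CAC_Codes that mirrors what you have in AssetClass. The relevant tables are now FTDateList as the calendar table, FTPerfCACCAssetClass with the various measures, and CAC_Codes with the asset class information.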

SELECT Account.accountID, 
       COALESCE(FTPerfCACCAssetClass.AccountNetDeposits, 0) AS netFlow, COALESCE(FTPerfCACCAssetClass.AccountFees, 0) AS feeAmount, 
       COALESCE(FTPerfCACCAssetClass.AccountIncome, 0) AS income, COALESCE(FTPerfCACCAssetClass.AccountReturn, 0) AS TWR,
       CAC_Codes.assetClassCode, CAC_Codes.assetClass,
       MarketValue.AccountMKV,
       Calendar.calendarDate
FROM (SELECT MAX(calendarDate) AS calendarDate
      FROM FTDateList
      GROUP BY calendarYear, calendarMonth) Calendar
CROSS JOIN (SELECT DISTINCT accountID
            FROM FTPerfCACCAssetClass) Account
CROSS JOIN CAC_Codes
LEFT JOIN FTPerfCACCAssetClass
       ON FTPerfCACCAssetClass.accountID = Account.accountID
          AND FTPerfCACCAssetClass.assetClassCode = CAC_Codes.assetClassCode
          AND FTPerfCACCAssetClass.EndDate = Calendar.calendarDate
JOIN (SELECT accountid, assetClassCode,
             AccountMKV,
             EndDate AS valueStartDate,
             LEAD(EndDate, 1, DATEADD(day, 1, EndDate)) OVER (PARTITION BY accountid, assetClassCode ORDER BY EndDate) AS valueEndDate
              FROM FTPerfCACCAssetClass) MarketValue
  ON MarketValue.accountID = Account.accountID
     AND MarketValue.assetClassCode = CAC_Codes.assetClassCode
     AND Calendar.calendarDate >= MarketValue.valueStartDate
     AND Calendar.calendarDate < MarketValue.valueEndDate
ORDER BY Account.accountID, Calendar.calendarDate, CAC_Codes.assetClassCode

However, the result I get looks like this:

accountID   netFlow feeAmount   income  TWR assetClassCode  assetClass  AccountMKV  calendarDate
100106  11532813.47000000000    0.00000000000   0.00000000000   0.00000000000   36  Domestic Large Cap  11532813.48000000000    2007-03-31
100106  11532813.47000000000    0.00000000000   0.00000000000   0.00000000000   36  Domestic Large Cap  11532813.48000000000    2007-03-31
100106  11532813.47000000000    0.00000000000   0.00000000000   0.00000000000   36  Domestic Large Cap  11532813.48000000000    2007-03-31
100106  11532813.47000000000    0.00000000000   0.00000000000   0.00000000000   36  Domestic Large Cap  11532813.48000000000    2007-03-31
100106  3055.94000000000    0.00000000000   1.38000000000   -0.06492600000  1   Cash and Money Market   2857.53000000000    2007-04-30
100106  3055.94000000000    0.00000000000   1.38000000000   -0.06492600000  1   Cash and Money Market   2857.53000000000    2007-04-30
100106  3055.94000000000    0.00000000000   1.38000000000   -0.06492600000  1   Cash and Money Market   2857.53000000000    2007-04-30
100106  3055.94000000000    0.00000000000   1.38000000000   -0.06492600000  1   Cash and Money Market   2857.53000000000    2007-04-30

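You need a dates or numbers table to fill in the gaps. I ran into a similar problem a while ago. See https://dba.stackexchange.com/questions/86435/filling-in-date-holes-in-grouped-by-date-sql-data

In your case, after selecting from the numbers/calendar table, you would have to run a subquery inside ISNULL to fetch the latest value. This can be very expensive. Something like this...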

SELECT ...
       ISNULL(t.TWR, 0) AS TWR,
       -- when this date has no row, fall back to the most recent earlier market value
       ISNULL(t.MarketValue,
              (SELECT TOP 1 prev.MarketValue
               FROM [Table] prev
               WHERE prev.Date <= c.Date
               ORDER BY prev.Date DESC)) AS MarketValue
FROM Calendar c WITH (NOLOCK)
LEFT JOIN [Table] t ON t.Date = c.Date
WHERE c.Date >= @StartDate AND c.Date < @EndDate

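One big issue is that you actually want to do two different things for each date:

  1. The "instantaneous" values for the row (fees, income, etc.).
  2. The "ongoing" value for the row (market value).

Now that we know what we're looking for, we can build our statement.

First, I'm assuming you have both a calendar table and an account table (or are only interested in one account and don't need the extra join). We need to massage the calendar data a little, but the account table should be fine as-is. These form the initial base of the query: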

SELECT Account.account, 
       -- instantaneous columns
       -- ongoing columns
       Calendar.calendarDate
FROM (SELECT MAX(calendarDate) AS calendarDate
      FROM Calendar
      GROUP BY calendarYear, calendarMonth) Calendar
CROSS JOIN Account

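This gives us a list of all accounts for all dates. You can add restrictions as needed (you might have future dates, after all), but the important part is getting the maximum date for each month. (Personally, I'd probably go with the first day of each month because it's far easier to index, but this works.) The resulting calendar derived table can probably be pulled into memory, since it's very small (12 rows a year!).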
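
If no calendar table is available, the month-end list can also be generated on the fly. The following is only a sketch, assuming SQL Server 2012 or later (for EOMONTH); @StartDate, @EndDate and the MonthEnds CTE are illustrative names and not part of the query above:

```sql
-- Hypothetical stand-in for the month-end calendar derived table.
-- @StartDate and @EndDate are illustrative parameters.
DECLARE @StartDate date = '2004-03-31',
        @EndDate   date = '2006-03-31';

WITH MonthEnds AS (
    SELECT EOMONTH(@StartDate) AS calendarDate
    UNION ALL
    -- Step one month forward and snap to that month's last day.
    SELECT EOMONTH(DATEADD(month, 1, calendarDate))
    FROM MonthEnds
    WHERE calendarDate < EOMONTH(@EndDate)
)
SELECT calendarDate
FROM MonthEnds
ORDER BY calendarDate
OPTION (MAXRECURSION 0);  -- lift the default limit of 100 recursion levels
```

For long date ranges a persisted calendar table is likely the better choice, since it can be indexed and reused.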

Next, get the "instantaneous" values. Now that we have our "base" data, a simple join is sufficient:

COALESCE(MarketData.netFlow, 0) AS netFlow, COALESCE(MarketData.feeAmount, 0) AS feeAmount, 
COALESCE(MarketData.income, 0) AS income, COALESCE(MarketData.TWR, 0) AS TWR, 
......
LEFT JOIN MarketData
       ON MarketData.marketDate = Calendar.calendarDate
          AND MarketData.account = Account.account

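...so if we have a row there, we show it. When we don't have a row, the value is 0.

Finally, we need the "ongoing" value. This one we have to collect separately. Now, normally you'd want to use something like LAG(marketValue)... unfortunately, the join of our "base" tables gives us a bunch of rows where marketValue is null, so the window would return that null rather than our "previous" value. We need to create a range-query table.

A range-query table holds an upper and a lower bound for a given key. For dates (as for all positive-range key values), the lower bound is inclusive (>=) and the upper bound is exclusive (<). Essentially, our upper bound here is the first instant at which we have a new market value (superseding the old one). We can get this with LEAD(...):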

MarketValue.marketValue,
........
JOIN (SELECT account, marketValue,
             marketDate AS valueStartDate,
             LEAD(marketDate, 1, DATEADD(day, 1, marketDate)) OVER (PARTITION BY account ORDER BY marketDate) AS valueEndDate
      FROM MarketData) MarketValue
  ON Calendar.calendarDate >= MarketValue.valueStartDate
     AND Calendar.calendarDate < MarketValue.valueEndDate
     AND MarketValue.Account = Account.account

Our MarketValue inline query returns a table that looks like this:

33L951572 | 375645.74 | 2004-03-31 | 2004-12-31

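...which we can join to for every row. Note how the join condition is constructed: it ensures there is no conflict between the "old" and the "new" marketValue. On the last row, because LEAD(...) would otherwise return null, we return the "next" day; because (again) we're using an exclusive upper bound, this makes our last entry the last joinable row.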
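
To look at the range table on its own, here is a small sketch that runs the inline query against three of the rows from the question. The @MarketData table variable is just a stand-in for the real table, introduced here for illustration (LEAD requires SQL Server 2012 or later):

```sql
-- Sample rows copied from the question; the table variable is illustrative only.
DECLARE @MarketData TABLE (
    account     varchar(20),
    marketValue decimal(18, 2),
    marketDate  date
);

INSERT INTO @MarketData (account, marketValue, marketDate)
VALUES ('33L951572', 375645.74, '2004-03-31'),
       ('33L951572', 338817.64, '2004-12-31'),
       ('33L951572', 322791.22, '2005-12-31');

-- Each value is valid from its own date up to (but not including) the next one.
SELECT account, marketValue,
       marketDate AS valueStartDate,
       LEAD(marketDate, 1, DATEADD(day, 1, marketDate))
           OVER (PARTITION BY account ORDER BY marketDate) AS valueEndDate
FROM @MarketData
ORDER BY account, valueStartDate;
```

The first row covers 2004-03-31 up to, but not including, 2004-12-31, matching the sample row shown earlier. The last row ends on 2006-01-01, the day after its own start date, so it matches only its own month end.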

Putting it all together yields:

SELECT Account.account, 
       COALESCE(MarketData.netFlow, 0) AS netFlow, COALESCE(MarketData.feeAmount, 0) AS feeAmount, 
       COALESCE(MarketData.income, 0) AS income, COALESCE(MarketData.TWR, 0) AS TWR,            
       MarketValue.marketValue,
       Calendar.calendarDate
FROM (SELECT MAX(calendarDate) AS calendarDate
      FROM Calendar
      GROUP BY calendarYear, calendarMonth) Calendar
CROSS JOIN Account
LEFT JOIN MarketData
       ON MarketData.marketDate = Calendar.calendarDate
          AND MarketData.account = Account.account   
JOIN (SELECT account, marketValue,
             marketDate AS valueStartDate,
             LEAD(marketDate, 1, DATEADD(day, 1, marketDate)) OVER (PARTITION BY account ORDER BY marketDate) AS valueEndDate
      FROM MarketData) MarketValue
  ON Calendar.calendarDate >= MarketValue.valueStartDate
     AND Calendar.calendarDate < MarketValue.valueEndDate
     AND MarketValue.Account = Account.account
ORDER BY Account.account, Calendar.calendarDate

SQL Fiddle Example

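(Don't forget the outer ORDER BY, otherwise rows may turn up where you least expect them!)

Modifying the query

For each additional partitioning criterion, or "repeat", there are a few simple steps to follow.

First, you need to add the "base" reference, to make sure all rows are present: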

-- I'm assuming you have a code reference table.  
-- Otherwise, create it like I did for the account table
CROSS JOIN AssetClass

  • Step 1b - Use this base reference for the columns in the SELECT, and possibly in the ORDER BY.

Second, you need to add the extra key value to the "child" table join conditions:

-- Because asset-class - 'Cash', etc - are _dependent_ values,
-- we only need the code key in this case
AND MarketData.assetClassCode = AssetClass.assetClassCode

Finally, you need to add the relevant column to the partition:

... OVER (PARTITION BY account, assetClassCode ORDER BY marketDate) ...

Resulting in:

SELECT Account.account, 
       COALESCE(MarketData.netFlow, 0) AS netFlow, COALESCE(MarketData.feeAmount, 0) AS feeAmount, 
       COALESCE(MarketData.income, 0) AS income, COALESCE(MarketData.TWR, 0) AS TWR,
       AssetClass.assetClassCode, AssetClass.assetClass,            
       MarketValue.marketValue,
       Calendar.calendarDate
FROM (SELECT MAX(calendarDate) AS calendarDate
      FROM Calendar
      GROUP BY calendarYear, calendarMonth) Calendar
CROSS JOIN Account
CROSS JOIN AssetClass
LEFT JOIN MarketData
       ON MarketData.account = Account.account
          AND MarketData.assetClassCode = AssetClass.assetClassCode
          AND MarketData.marketDate = Calendar.calendarDate 
JOIN (SELECT account, assetClassCode, marketValue,
             marketDate AS valueStartDate,
             LEAD(marketDate, 1, DATEADD(day, 1, marketDate)) OVER (PARTITION BY account, assetClassCode ORDER BY marketDate) AS valueEndDate
      FROM MarketData) MarketValue
  ON MarketValue.Account = Account.account
     AND MarketValue.assetClassCode = AssetClass.assetClassCode
     AND Calendar.calendarDate >= MarketValue.valueStartDate
     AND Calendar.calendarDate < MarketValue.valueEndDate
ORDER BY Account.account, Calendar.calendarDate, AssetClass.assetClassCode

SQL Fiddle Example

(Note that I adjusted the order of the conditions in the JOIN and LEFT JOIN to better reflect the "primary" keys in use: account and asset class code.)
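
The queries above implicitly assume at most one row per account, asset class code and date in the child table, and one row per code in the reference table. If the output contains repeated rows, as in the update to the question, one thing worth checking is whether that assumption holds. This is only a guess at a possible cause, since the underlying data isn't shown. A minimal sketch using the table and column names from the question:

```sql
-- Key combinations that occur more than once in the measures table.
SELECT accountID, assetClassCode, EndDate, COUNT(*) AS rowsPerKey
FROM FTPerfCACCAssetClass
GROUP BY accountID, assetClassCode, EndDate
HAVING COUNT(*) > 1
ORDER BY accountID, assetClassCode, EndDate;

-- Codes that occur more than once in the reference table.
SELECT assetClassCode, COUNT(*) AS rowsPerCode
FROM CAC_Codes
GROUP BY assetClassCode
HAVING COUNT(*) > 1
ORDER BY assetClassCode;
```

If either query returns rows, each duplicate multiplies the joined output, and the source rows would need to be deduplicated or aggregated first.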