Fill in missing dates in SQL Table and take last market value
I have the following table:
Account Netflow FeeAmount Income TWR MarketValue Date
33L951572 0.00 0.00 0.00 0.00 375645.74 3/31/2004
33L951572 5547.31 0.00 0.00 0.08 338817.64 12/31/2004
33L951572 13250.45 0.00 35.00 0.01 322791.22 12/31/2005
33L951572 344.12 0.00 310.66 0.02 328899.02 1/31/2006
33L951572 6168.03 0.00 69.78 0.03 326221.04 2/28/2006
33L951572 140.50 0.00 186.62 0.01 328616.53 3/31/2006
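I need this table to have one row for every month end, with the date always being the month-end date. However, there are gaps between the dates. You can see, for example, that 3/31/2004 jumps to 12/31/2004, and then 12/31/2004 jumps to 12/31/2005, after which the data is monthly.
I want to insert a row of zeros for every missing month. However, I also want to carry forward the last known market value, whatever it was before the gap.
Ideally, the table would look like this: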
Account Netflow FeeAmount Income TWR MarketValue Date
33L951572 0.00 0.00 0.00 0.000 375,645.74 3/31/2004
33L951572 0.00 0.00 0.00 0.000 375,645.74 4/30/2004
33L951572 0.00 0.00 0.00 0.000 375,645.74 5/31/2004
33L951572 0.00 0.00 0.00 0.000 375,645.74 6/30/2004
33L951572 0.00 0.00 0.00 0.000 375,645.74 7/31/2004
33L951572 0.00 0.00 0.00 0.000 375,645.74 8/31/2004
33L951572 0.00 0.00 0.00 0.000 375,645.74 9/30/2004
33L951572 0.00 0.00 0.00 0.000 375,645.74 10/31/2004
33L951572 0.00 0.00 0.00 0.000 375,645.74 11/30/2004
33L951572 5,547.31 0.00 0.00 0.077 338,817.64 12/31/2004
33L951572 0.00 0.00 0.00 0.000 338,817.64 1/31/2005
33L951572 0.00 0.00 0.00 0.000 338,817.64 2/28/2005
33L951572 0.00 0.00 0.00 0.000 338,817.64 3/31/2005
33L951572 0.00 0.00 0.00 0.000 338,817.64 4/30/2005
33L951572 0.00 0.00 0.00 0.000 338,817.64 5/31/2005
33L951572 0.00 0.00 0.00 0.000 338,817.64 6/30/2005
33L951572 0.00 0.00 0.00 0.000 338,817.64 7/31/2005
33L951572 0.00 0.00 0.00 0.000 338,817.64 8/31/2005
33L951572 0.00 0.00 0.00 0.000 338,817.64 9/30/2005
33L951572 0.00 0.00 0.00 0.000 338,817.64 10/31/2005
33L951572 0.00 0.00 0.00 0.000 338,817.64 11/30/2005
33L951572 13,250.45 0.00 35.00 0.006 322,791.22 12/31/2005
33L951572 344.12 0.00 310.66 0.019 328,899.02 1/31/2006
33L951572 6,168.03 0.00 69.78 0.026 326,221.04 2/28/2006
33L951572 140.50 0.00 186.62 0.007 328,616.53 3/31/2006
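The query provided by Clockwork-Muse below works perfectly if there is only one attribute to apply the logic to. In the first example, that attribute is Account.
However, I have realized that some of my data needs to be partitioned by a second criterion, AssetClassCode, meaning there are sub-attributes within an account. Here is the example again, with the attribute added: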
Account Netflow FeeAmount Income TWR AssetClassCode AssetClass MarketValue Date
33L951572 0 5 0 0.87947 1 Cash 1000 3/31/2004
33L951572 0 6 45 0.25564 2 Equity 2000 3/31/2004
33L951572 0 8 5 0.90677 3 Fixed 3000 3/31/2004
33L951572 123 5 2 0.29787 1 Cash 4000 7/30/2014
33L951572 456 4 4 0.55341 2 Equity 5000 7/30/2014
33L951572 657 2 45 0.10634 3 Fixed 6000 7/30/2014
This is the desired result:
Account Netflow FeeAmount Income TWR AssetClassCode AssetClass MarketValue Date
33L951572 0 5 0 0.88 1 Cash 1000 3/31/2004
33L951572 0 6 45 0.26 2 Equity 2000 3/31/2004
33L951572 0 8 5 0.91 3 Fixed 3000 3/31/2004
33L951572 0 0 0 0.00 1 Cash 1000 4/30/2014
33L951572 0 0 0 0.00 2 Equity 2000 4/30/2014
33L951572 0 0 0 0.00 3 Fixed 3000 4/30/2014
33L951572 0 0 0 0.00 1 Cash 1000 5/30/2014
33L951572 0 0 0 0.00 2 Equity 2000 5/30/2014
33L951572 0 0 0 0.00 3 Fixed 3000 5/30/2014
33L951572 0 0 0 0.00 1 Cash 1000 6/30/2014
33L951572 0 0 0 0.00 2 Equity 2000 6/30/2014
33L951572 0 0 0 0.00 3 Fixed 3000 6/30/2014
33L951572 123 5 2 0.30 1 Cash 4000 7/30/2014
33L951572 456 4 4 0.55 2 Equity 5000 7/30/2014
33L951572 657 2 45 0.11 3 Fixed 6000 7/30/2014
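UPDATE
I am getting redundant values. I created a new table called CAC_Codes, which mirrors what you have in AssetClass. The relevant tables are now FTDateList as the calendar table, FTPerfCACCAssetClass with the various measures, and CAC_Codes with the asset classification information.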
SELECT Account.accountID,
COALESCE(FTPerfCACCAssetClass.AccountNetDeposits, 0) AS netFlow, COALESCE(FTPerfCACCAssetClass.AccountFees, 0) AS feeAmount,
COALESCE(FTPerfCACCAssetClass.AccountIncome, 0) AS income, COALESCE(FTPerfCACCAssetClass.AccountReturn, 0) AS TWR,
CAC_Codes.assetClassCode, CAC_Codes.assetClass,
MarketValue.AccountMKV,
Calendar.calendarDate
FROM (SELECT MAX(calendarDate) AS calendarDate
FROM FTDateList
GROUP BY calendarYear, calendarMonth) Calendar
CROSS JOIN (SELECT DISTINCT accountID
FROM FTPerfCACCAssetClass) Account
CROSS JOIN CAC_Codes
LEFT JOIN FTPerfCACCAssetClass
ON FTPerfCACCAssetClass.accountID = Account.accountID
AND FTPerfCACCAssetClass.assetClassCode = CAC_Codes.assetClassCode
AND FTPerfCACCAssetClass.EndDate = Calendar.calendarDate
JOIN (SELECT accountid, assetClassCode,
AccountMKV,
EndDate AS valueStartDate,
LEAD(EndDate, 1, DATEADD(day, 1, EndDate)) OVER (PARTITION BY accountid, assetClassCode ORDER BY EndDate) AS valueEndDate
FROM FTPerfCACCAssetClass) MarketValue
ON MarketValue.accountID = Account.accountID
AND MarketValue.assetClassCode = CAC_Codes.assetClassCode
AND Calendar.calendarDate >= MarketValue.valueStartDate
AND Calendar.calendarDate < MarketValue.valueEndDate
ORDER BY Account.accountID, Calendar.calendarDate, CAC_Codes.assetClassCode
However, the results I get look like this:
accountID netFlow feeAmount income TWR assetClassCode assetClass AccountMKV calendarDate
100106 11532813.47000000000 0.00000000000 0.00000000000 0.00000000000 36 Domestic Large Cap 11532813.48000000000 2007-03-31
100106 11532813.47000000000 0.00000000000 0.00000000000 0.00000000000 36 Domestic Large Cap 11532813.48000000000 2007-03-31
100106 11532813.47000000000 0.00000000000 0.00000000000 0.00000000000 36 Domestic Large Cap 11532813.48000000000 2007-03-31
100106 11532813.47000000000 0.00000000000 0.00000000000 0.00000000000 36 Domestic Large Cap 11532813.48000000000 2007-03-31
100106 3055.94000000000 0.00000000000 1.38000000000 -0.06492600000 1 Cash and Money Market 2857.53000000000 2007-04-30
100106 3055.94000000000 0.00000000000 1.38000000000 -0.06492600000 1 Cash and Money Market 2857.53000000000 2007-04-30
100106 3055.94000000000 0.00000000000 1.38000000000 -0.06492600000 1 Cash and Money Market 2857.53000000000 2007-04-30
100106 3055.94000000000 0.00000000000 1.38000000000 -0.06492600000 1 Cash and Money Market 2857.53000000000 2007-04-30
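You need a dates or numbers table to fill in the gaps. I had a similar problem a while ago; see https://dba.stackexchange.com/questions/86435/filling-in-date-holes-in-grouped-by-date-sql-data.
In your case, after selecting from the numbers/calendar table, you would have to run a subquery inside ISNULL to get the latest value. This can be very expensive. Something like this...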
SELECT ...
ISNULL(t.TWR, 0) TWR,
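-- Correlate on c.Date: t.Date is NULL on the gap rows, so it cannot be used here.
ISNULL(t.MarketValue, (SELECT TOP 1 prev.MarketValue FROM [Table] prev WHERE prev.Date <= c.Date ORDER BY prev.Date DESC)) MarketValue
FROM Calendar c WITH (NOLOCK)
LEFT JOIN [Table] t ON t.Date=c.Date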
WHERE c.Date >= @StartDate AND c.Date < @EndDate
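One big problem is that you actually want two different things for each date:
- "Instant" values for the row (fees, income, etc.).
- Ongoing values for the column (market value).
Now that we know what we are looking for, we can build our statement.
First, I assume you have both a calendar table and an account table (or are only interested in one account and do not need the extra join). We need to massage the calendar data a little, but the account table should be fine as-is. These form the initial basis of the query: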
SELECT Account.account,
-- instantaneous columns
-- ongoing columns
Calendar.calendarDate
FROM (SELECT MAX(calendarDate) AS calendarDate
FROM Calendar
GROUP BY calendarYear, calendarMonth) Calendar
CROSS JOIN Account
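This gives us a list of all accounts for all dates. You can add restrictions as necessary (you may have future dates, after all), but the important part is getting the maximum date of each month. (Personally, I would probably go with the first day of each month, since that is much easier to index, but this works.) The resulting calendar query table can probably be pulled into memory; it is very small (12 rows a year!).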
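If no calendar table is available, the month-end dates could be generated on the fly instead. The following is only a sketch, assuming SQL Server 2012 or later (for EOMONTH); the @StartDate and @EndDate variables and the MonthEnds name are made up for illustration and do not come from the question.

```sql
-- Hypothetical date range; adjust to the span of your own data.
DECLARE @StartDate date = '20040331',
        @EndDate   date = '20060331';

-- Recursive CTE: start at the month end of @StartDate and step one month
-- at a time. EOMONTH(d, 1) returns the last day of the month after d,
-- so 2004-04-30 is followed by 2004-05-31 rather than 2004-05-30.
WITH MonthEnds AS (
    SELECT EOMONTH(@StartDate) AS calendarDate
    UNION ALL
    SELECT EOMONTH(calendarDate, 1)
    FROM MonthEnds
    WHERE calendarDate < EOMONTH(@EndDate)
)
SELECT calendarDate
FROM MonthEnds
ORDER BY calendarDate
OPTION (MAXRECURSION 0);  -- lift the default limit of 100 recursion levels
```

For the range used here this returns 25 rows, from 2004-03-31 through 2006-03-31. A real calendar table is still likely to be the better choice when one exists, since it can be indexed.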
Next, get the "instantaneous" values. Now that we have the "base" data, a simple join suffices:
COALESCE(MarketData.netFlow, 0) AS netFlow, COALESCE(MarketData.feeAmount, 0) AS feeAmount,
COALESCE(MarketData.income, 0) AS income, COALESCE(MarketData.TWR, 0) AS TWR,
......
LEFT JOIN MarketData
ON MarketData.marketDate = Calendar.calendarDate
AND MarketData.account = Account.account
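...so if we have a row there, we show it. When we don't have a row, the value is 0.
Lastly, we need the "ongoing" value. This one we have to gather separately. Normally, you would want to use something like LAG(marketValue)... unfortunately, the join against our "base" tables gives us a bunch of rows where marketValue is null, so the window function would return that null instead of our "previous" value. We need to create a range-query table.
A range-query table holds the lower and upper bounds for a given key. For dates (as for all positive-range key values), the lower bound is inclusive (>=) and the upper bound is exclusive (<). Essentially, our upper bound here is the first instant at which we have a new market value (one that supersedes the old one). We can get this with LEAD(...):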
MarketValue.marketValue,
........
JOIN (SELECT account, marketValue,
marketDate AS valueStartDate,
LEAD(marketDate, 1, DATEADD(day, 1, marketDate)) OVER (PARTITION BY account ORDER BY marketDate) AS valueEndDate
FROM MarketData) MarketValue
ON Calendar.calendarDate >= MarketValue.valueStartDate
AND Calendar.calendarDate < MarketValue.valueEndDate
AND MarketValue.Account = Account.account
Our MarketValue inline query returns a table that looks like this:
33L951572 | 375645.74 | 2004-03-31 | 2004-12-31
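... which we can join to for each row. Note how the join conditions are constructed: this ensures there is no conflict between the "old" and "new" marketValue. For the last row, where LEAD(...) would otherwise return a null, we return the "next" day; because (again) we are using an exclusive upper bound, this makes our last entry the last joinable row.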
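To make the range table concrete, here is a small self-contained sketch that builds it from the six sample rows in the question. The @MarketData table variable is a stand-in invented for this example; the column names follow the ones used in this answer, and the default for LEAD is the "next day" variant described above.

```sql
-- Hypothetical stand-in for the real table, loaded with the question's sample rows.
DECLARE @MarketData TABLE (account varchar(20), marketValue decimal(18, 2), marketDate date);

INSERT INTO @MarketData (account, marketValue, marketDate)
VALUES ('33L951572', 375645.74, '20040331'),
       ('33L951572', 338817.64, '20041231'),
       ('33L951572', 322791.22, '20051231'),
       ('33L951572', 328899.02, '20060131'),
       ('33L951572', 326221.04, '20060228'),
       ('33L951572', 328616.53, '20060331');

-- Each value is valid from its own date (inclusive) up to the date of the
-- next value (exclusive). The last row has no successor, so LEAD falls back
-- to the day after its own date.
SELECT account, marketValue,
       marketDate AS valueStartDate,
       LEAD(marketDate, 1, DATEADD(day, 1, marketDate))
           OVER (PARTITION BY account ORDER BY marketDate) AS valueEndDate
FROM @MarketData
ORDER BY account, valueStartDate;

-- valueStartDate | valueEndDate
-- 2004-03-31     | 2004-12-31
-- 2004-12-31     | 2005-12-31
-- 2005-12-31     | 2006-01-31
-- 2006-01-31     | 2006-02-28
-- 2006-02-28     | 2006-03-31
-- 2006-03-31     | 2006-04-01
```

Because the ranges are half-open, any given month-end date falls into exactly one of them, which is what keeps the join from producing duplicate rows.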
Putting it all together yields:
SELECT Account.account,
COALESCE(MarketData.netFlow, 0) AS netFlow, COALESCE(MarketData.feeAmount, 0) AS feeAmount,
COALESCE(MarketData.income, 0) AS income, COALESCE(MarketData.TWR, 0) AS TWR,
MarketValue.marketValue,
Calendar.calendarDate
FROM (SELECT MAX(calendarDate) AS calendarDate
FROM Calendar
GROUP BY calendarYear, calendarMonth) Calendar
CROSS JOIN Account
LEFT JOIN MarketData
ON MarketData.marketDate = Calendar.calendarDate
AND MarketData.account = Account.account
JOIN (SELECT account, marketValue,
marketDate AS valueStartDate,
LEAD(marketDate, 1, DATEADD(day, 1, marketDate)) OVER (PARTITION BY account ORDER BY marketDate) AS valueEndDate
FROM MarketData) MarketValue
ON Calendar.calendarDate >= MarketValue.valueStartDate
AND Calendar.calendarDate < MarketValue.valueEndDate
AND MarketValue.Account = Account.account
ORDER BY Account.account, Calendar.calendarDate
(Don't forget the outer ORDER BY, or rows may show up in places you least expect!)
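As a way to try the whole approach without touching real tables, the sketch below runs the same query shape against the sample data from the question. It rests on several assumptions: SQL Server 2012 or later, table variables (@Account, @MarketData) standing in for the real tables, and a recursive CTE standing in for the calendar table. None of these names come from the question.

```sql
-- Hypothetical stand-ins for the Account and MarketData tables.
DECLARE @Account TABLE (account varchar(20));
DECLARE @MarketData TABLE (account varchar(20), netFlow decimal(18, 2), feeAmount decimal(18, 2),
                           income decimal(18, 2), TWR decimal(9, 2),
                           marketValue decimal(18, 2), marketDate date);

INSERT INTO @Account (account) VALUES ('33L951572');

INSERT INTO @MarketData (account, netFlow, feeAmount, income, TWR, marketValue, marketDate)
VALUES ('33L951572',     0.00, 0.00,   0.00, 0.00, 375645.74, '20040331'),
       ('33L951572',  5547.31, 0.00,   0.00, 0.08, 338817.64, '20041231'),
       ('33L951572', 13250.45, 0.00,  35.00, 0.01, 322791.22, '20051231'),
       ('33L951572',   344.12, 0.00, 310.66, 0.02, 328899.02, '20060131'),
       ('33L951572',  6168.03, 0.00,  69.78, 0.03, 326221.04, '20060228'),
       ('33L951572',   140.50, 0.00, 186.62, 0.01, 328616.53, '20060331');

-- Month ends from 2004-03-31 through 2006-03-31, generated in place of a calendar table.
WITH Calendar AS (
    SELECT CAST('20040331' AS date) AS calendarDate
    UNION ALL
    SELECT EOMONTH(calendarDate, 1)
    FROM Calendar
    WHERE calendarDate < '20060331'
)
SELECT Account.account,
       -- instantaneous columns: 0 when the month has no row
       COALESCE(MarketData.netFlow, 0) AS netFlow, COALESCE(MarketData.feeAmount, 0) AS feeAmount,
       COALESCE(MarketData.income, 0) AS income, COALESCE(MarketData.TWR, 0) AS TWR,
       -- ongoing column: last known value from the range table
       MarketValue.marketValue,
       Calendar.calendarDate
FROM Calendar
CROSS JOIN @Account Account
LEFT JOIN @MarketData MarketData
       ON MarketData.marketDate = Calendar.calendarDate
      AND MarketData.account = Account.account
JOIN (SELECT account, marketValue,
             marketDate AS valueStartDate,
             LEAD(marketDate, 1, DATEADD(day, 1, marketDate))
                 OVER (PARTITION BY account ORDER BY marketDate) AS valueEndDate
      FROM @MarketData) MarketValue
  ON Calendar.calendarDate >= MarketValue.valueStartDate
 AND Calendar.calendarDate < MarketValue.valueEndDate
 AND MarketValue.account = Account.account
ORDER BY Account.account, Calendar.calendarDate;
```

Run as a single batch, this should return 25 rows, one per month end. The 2004-04-30 row, for instance, carries zeros in the flow columns and 375645.74 as marketValue, matching the desired output in the question.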
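Amending the query
For each additional partitioning criterion, or "repeat", a few simple steps are needed.
First, you need to add the "base" reference, to make sure all rows are present: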
-- I'm assuming you have a code reference table.
-- Otherwise, create it like I did for the account table
CROSS JOIN AssetClass
- Step 1b: use this base reference for the columns in SELECT, and possibly in ORDER BY.
Second, you need to add the extra key value to the "child" table join conditions:
-- Because asset-class - 'Cash', etc - are _dependent_ values,
-- we only need the code key in this case
AND MarketData.assetClassCode = AssetClass.assetClassCode
Lastly, you need to add the relevant column to the partitioning:
... OVER (PARTITION BY account, assetClassCode ORDER BY marketDate) ...
Resulting in:
SELECT Account.account,
COALESCE(MarketData.netFlow, 0) AS netFlow, COALESCE(MarketData.feeAmount, 0) AS feeAmount,
COALESCE(MarketData.income, 0) AS income, COALESCE(MarketData.TWR, 0) AS TWR,
AssetClass.assetClassCode, AssetClass.assetClass,
MarketValue.marketValue,
Calendar.calendarDate
FROM (SELECT MAX(calendarDate) AS calendarDate
FROM Calendar
GROUP BY calendarYear, calendarMonth) Calendar
CROSS JOIN Account
CROSS JOIN AssetClass
LEFT JOIN MarketData
ON MarketData.account = Account.account
AND MarketData.assetClassCode = AssetClass.assetClassCode
AND MarketData.marketDate = Calendar.calendarDate
JOIN (SELECT account, assetClassCode, marketValue,
marketDate AS valueStartDate,
LEAD(marketDate, 1, DATEADD(day, 1, marketDate)) OVER (PARTITION BY account, assetClassCode ORDER BY marketDate) AS valueEndDate
FROM MarketData) MarketValue
ON MarketValue.Account = Account.account
AND MarketValue.assetClassCode = AssetClass.assetClassCode
AND Calendar.calendarDate >= MarketValue.valueStartDate
AND Calendar.calendarDate < MarketValue.valueEndDate
ORDER BY Account.account, Calendar.calendarDate, AssetClass.assetClassCode
(Note that I have adjusted the order of the conditions in the JOIN and LEFT JOIN to better reflect the "primary" keys being used: account and asset class code.)