How can I merge two SELECT queries with counts over mostly identical columns when one set returns multiple rows?
I have two queries that work perfectly:
DECLARE @StartDate DATETIME = '2021-11-01 00:00:00';
DECLARE @EndDate DATETIME = '2022-03-16 23:59:59';
DECLARE @SalesEstimateTransactionTypeId INT = 16;
DECLARE @SalesOrderTransactionTypeId INT = 15;
SELECT
DATENAME(mm, GeneralJournal.[TransactionDate]) AS ReportingMonth,
DATEPART(mm, GeneralJournal.[TransactionDate]) AS MonthNumber,
DATEPART(yyyy, GeneralJournal.[TransactionDate]) AS ReportingYear,
COUNT(TransactionId) AS TransactionCount
FROM
GeneralJournal
WHERE
GeneralJournal.[TransactionDate] >= @StartDate
AND GeneralJournal.[TransactionDate] <= @EndDate
AND MasterRecord = 1
AND TransactionTypeId = @SalesEstimateTransactionTypeId
GROUP BY
DATEPART(yyyy, GeneralJournal.[TransactionDate]),
DATEPART(mm, GeneralJournal.[TransactionDate]),
DATENAME(mm,GeneralJournal.[TransactionDate]);
SELECT
DATENAME(mm, GeneralJournal.[TransactionDate]) AS ReportingMonth,
DATEPART(mm, GeneralJournal.[TransactionDate]) AS MonthNumber,
DATEPART(yyyy, GeneralJournal.[TransactionDate]) AS ReportingYear,
COUNT(DISTINCT TransactionId) AS ConversionCount
FROM
GeneralJournal
WHERE
GeneralJournal.[TransactionDate] >= @StartDate
AND GeneralJournal.[TransactionDate] <= @EndDate
AND MasterRecord = 0
AND TransactionTypeId = @SalesOrderTransactionTypeId
AND SEReferenceId > 0
GROUP BY
DATEPART(yyyy, GeneralJournal.[TransactionDate]),
DATEPART(mm, GeneralJournal.[TransactionDate]),
DATENAME(mm,GeneralJournal.[TransactionDate]);
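Note that the second query is different: it can return multiple rows per transaction, and in that case we only want to count each TransactionId once. The queries return the following results: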
ReportingMonth | MonthNumber | ReportingYear | TransactionCount |
---|---|---|---|
November | 11 | 2021 | 82 |
December | 12 | 2021 | 49 |
January | 1 | 2022 | 64 |
February | 2 | 2022 | 67 |
March | 3 | 2022 | 49 |

ReportingMonth | MonthNumber | ReportingYear | ConversionCount |
---|---|---|---|
November | 11 | 2021 | 42 |
December | 12 | 2021 | 27 |
January | 1 | 2022 | 31 |
February | 2 | 2022 | 50 |
March | 3 | 2022 | 24 |
What I actually need is to combine them like this:
ReportingMonth | MonthNumber | ReportingYear | TransactionCount | ConversionCount |
---|---|---|---|---|
November | 11 | 2021 | 82 | 42 |
December | 12 | 2021 | 49 | 27 |
January | 1 | 2022 | 64 | 31 |
February | 2 | 2022 | 67 | 50 |
March | 3 | 2022 | 49 | 24 |
I have tried almost everything I can think of (unions, joins, subqueries) but so far nothing has been quite right. This is the closest I have got:
SELECT
DATENAME(mm, GeneralJournal.[TransactionDate]) AS ReportingMonth,
DATEPART(mm, GeneralJournal.[TransactionDate]) AS MonthNumber,
DATEPART(yyyy, GeneralJournal.[TransactionDate]) AS ReportingYear,
SUM(CASE
WHEN TransactionTypeId = @SalesEstimateTransactionTypeId
AND MasterRecord = 1
THEN 1 ELSE 0
END) AS TransactionCount,
COUNT(CASE
WHEN TransactionTypeId = @SalesOrderTransactionTypeId
AND SEReferenceId > 0 THEN 1
END) AS ConversionCount
FROM
GeneralJournal
WHERE
GeneralJournal.[TransactionDate] >= @StartDate
AND GeneralJournal.[TransactionDate] <= @EndDate
AND TransactionTypeId IN (@SalesOrderTransactionTypeId, @SalesEstimateTransactionTypeId)
GROUP BY
DATEPART(yyyy, GeneralJournal.[TransactionDate]),
DATEPART(mm, GeneralJournal.[TransactionDate]),
DATENAME(mm,GeneralJournal.[TransactionDate]);
However, I cannot find a way to get distinct values for ConversionCount. The result returns the full count:
ReportingMonth | MonthNumber | ReportingYear | TransactionCount | ConversionCount |
---|---|---|---|---|
November | 11 | 2021 | 82 | 152 |
December | 12 | 2021 | 49 | 67 |
January | 1 | 2022 | 64 | 101 |
February | 2 | 2022 | 67 | 136 |
March | 3 | 2022 | 49 | 64 |
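Can anyone guide me toward a way to combine the two query results while keeping the DISTINCT on the conversion count? I must add that the answer has to be compatible with both SQL Server and VistaDB, whose syntax is a subset of T-SQL, because I am obliged to support both database engines with the same queries.
Edit - final solution
Based on Nick's excellent answer, I was able to embed the solution in my existing query code, which ensures that results are returned even for months with no records. It is shown here in case it helps others: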
DECLARE @StartDate DATETIME = '2021-11-01T00:00:00';
DECLARE @EndDate DATETIME = '2022-10-31T23:59:59';
DECLARE @SalesEstimateTransactionTypeId INT = 16;
DECLARE @SalesOrderTransactionTypeId INT = 15;
DECLARE @CurrentDate DATETIME;
DECLARE @Months TABLE(ReportingYear INT, MonthNumber INT, ReportingMonth VARCHAR (40));
-- Set the initial date
SET @CurrentDate = @StartDate
-- insert all dates into temp table
WHILE @CurrentDate <= @EndDate
BEGIN
INSERT INTO @Months VALUES(DATEPART(year, @CurrentDate), DATEPART(month, @CurrentDate), DATENAME(mm, @CurrentDate))
SET @CurrentDate = dateadd(mm, 1, @CurrentDate)
END;
SELECT ReportingMonth, ReportingYear, Coalesce(TransactionCount, 0) AS TransactionCount, Coalesce(ConversionCount,0) AS ConversionCount
FROM
(
SELECT months.[ReportingMonth], months.[ReportingYear], conversionData.[TransactionCount], conversionData.[ConversionCount]
FROM @Months months
LEFT JOIN
(
SELECT
ReportingMonth = DATENAME(mm, GeneralJournal.[TransactionDate]),
MonthNumber = DATEPART(mm, GeneralJournal.[TransactionDate]),
ReportingYear = DATEPART(yyyy, GeneralJournal.[TransactionDate]),
TransactionCount = SUM(CASE WHEN TransactionTypeId = @SalesEstimateTransactionTypeId AND GeneralJournal.[MasterRecord] = 1 THEN
1
ELSE
0
END
),
ConversionCount = COUNT(DISTINCT CASE WHEN GeneralJournal.[TransactionTypeId] = @SalesOrderTransactionTypeId
AND GeneralJournal.[SEReferenceId] > 0
AND GeneralJournal.[MasterRecord] = 0 THEN
GeneralJournal.[TransactionID]
END
)
FROM GeneralJournal
WHERE GeneralJournal.[TransactionDate] >= @StartDate
AND GeneralJournal.[TransactionDate] <= @EndDate
AND GeneralJournal.[TransactionTypeId] IN ( @SalesOrderTransactionTypeId, @SalesEstimateTransactionTypeId)
GROUP BY
DATEPART(yyyy, GeneralJournal.[TransactionDate]),
DATEPART(mm, GeneralJournal.[TransactionDate]),
DATENAME(mm, GeneralJournal.[TransactionDate])
) as conversionData
ON months.[ReportingYear] = conversionData.[ReportingYear] AND months.[MonthNumber] = conversionData.[MonthNumber]
) AS data;
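You can put both columns in the same query. It is more complicated because the WHERE clauses differ slightly, so you need to group, then group again, and use conditional aggregation to count the correct rows for each column.
Note the following:
- In theory you could use COUNT(DISTINCT CASE, but that is usually slower, because the compiler will not recognize what the CASE is doing and will perform a full sort instead.
- Grouping by a single EOMONTH calculation is faster than grouping by several date-part expressions. You can pull out the year and month in the SELECT.
- COUNT(TransactionId) returns the number of non-null TransactionId values. If TransactionId cannot be null, then COUNT(*) is the same.
- If TransactionDate has a time component, then you should use a half-open interval: >= AND <
- Use aliases on tables; it makes your queries more readable.
- Use whitespace, it's free.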
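To illustrate the half-open interval point, here is a minimal sketch in T-SQL, assuming SQL Server. The table variable @Sample and its single row are made up for the demonstration; only the two boundary values are taken from this thread.

```sql
-- Hypothetical sample: one row stamped in the last second of 16 March 2022.
DECLARE @Sample TABLE (TransactionDate DATETIME);
INSERT INTO @Sample (TransactionDate) VALUES ('2022-03-16T23:59:59.500');

SELECT
    -- A closed interval ending at 23:59:59 misses the fractional second.
    SUM(CASE WHEN TransactionDate <= '2022-03-16T23:59:59' THEN 1 ELSE 0 END) AS ClosedIntervalRows,
    -- A half-open interval ending at the next midnight includes it.
    SUM(CASE WHEN TransactionDate <  '2022-03-17T00:00:00' THEN 1 ELSE 0 END) AS HalfOpenRows
FROM @Sample;
```

On SQL Server this should return 0 for ClosedIntervalRows and 1 for HalfOpenRows, which is why the end date is moved to the following midnight and compared with <.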
DECLARE @StartDate DATETIME = '2021-11-01T00:00:00';
DECLARE @EndDate DATETIME = '2022-03-17T00:00:00';
DECLARE @SalesEstimateTransactionTypeId INT = 16;
DECLARE @SalesOrderTransactionTypeId INT = 15;
SELECT
DATENAME(month, g.mth) AS ReportingMonth,
DATEPART(month, g.mth) AS MonthNumber,
DATEPART(year , g.mth) AS ReportingYear,
SUM(TransactionCount) AS TransactionCount,
COUNT(CASE WHEN ConversionCount > 0 THEN 1 END) AS ConversionCount
FROM (
SELECT
EOMONTH(gj.TransactionDate) AS mth,
gj.TransactionId,
COUNT(CASE WHEN gj.MasterRecord = 1 AND gj.TransactionTypeId = @SalesEstimateTransactionTypeId THEN 1 END) AS TransactionCount,
COUNT(CASE WHEN gj.MasterRecord = 0 AND gj.TransactionTypeId = @SalesOrderTransactionTypeId AND gj.SEReferenceId > 0 THEN 1 END) AS ConversionCount
FROM GeneralJournal gj
WHERE gj.TransactionDate >= @StartDate
AND gj.TransactionDate < @EndDate
AND gj.TransactionTypeId IN (@SalesOrderTransactionTypeId, @SalesEstimateTransactionTypeId)
GROUP BY
EOMONTH(gj.TransactionDate),
TransactionId
) g
GROUP BY
mth;
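As a rough illustration of the two-level grouping, the following sketch runs the same shape of query against a made-up table variable, assuming SQL Server 2012 or later (for EOMONTH). The @gj table, its column types and its six rows are hypothetical; the type ids 16 and 15 are the ones from the question.

```sql
-- Hypothetical stand-in for GeneralJournal with just the columns the query needs.
DECLARE @gj TABLE (
    TransactionId     INT,
    TransactionDate   DATETIME,
    TransactionTypeId INT,
    MasterRecord      INT,
    SEReferenceId     INT
);

INSERT INTO @gj VALUES
    (1,  '2021-11-03T10:00:00', 16, 1, 0),  -- estimate
    (2,  '2021-11-05T09:30:00', 16, 1, 0),  -- estimate
    (10, '2021-11-10T12:00:00', 15, 0, 1),  -- order 10, first detail row
    (10, '2021-11-10T12:00:00', 15, 0, 1),  -- order 10, second detail row
    (11, '2021-11-20T08:00:00', 15, 0, 2),  -- order 11
    (12, '2021-11-21T08:00:00', 15, 0, 0);  -- order not created from an estimate

SELECT
    g.mth,
    SUM(g.TransactionCount) AS TransactionCount,
    -- Each TransactionId is one inner row, so counting rows counts it once.
    COUNT(CASE WHEN g.ConversionCount > 0 THEN 1 END) AS ConversionCount
FROM (
    SELECT
        EOMONTH(gj.TransactionDate) AS mth,
        gj.TransactionId,
        COUNT(CASE WHEN gj.MasterRecord = 1 AND gj.TransactionTypeId = 16 THEN 1 END) AS TransactionCount,
        COUNT(CASE WHEN gj.MasterRecord = 0 AND gj.TransactionTypeId = 15 AND gj.SEReferenceId > 0 THEN 1 END) AS ConversionCount
    FROM @gj gj
    GROUP BY
        EOMONTH(gj.TransactionDate),
        gj.TransactionId
) g
GROUP BY
    g.mth;
```

With this sample data the result should be a single row for 2021-11-30 with TransactionCount 2 and ConversionCount 2: order 10 has two detail rows but is counted once, and order 12 is excluded because its SEReferenceId is 0.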
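Your second query was close; I think it just had a few small omissions.
- You forgot MasterRecord = 0 in the ConversionCount CASE expression.
- Instead of returning 1 or 0 from the ConversionCount CASE, you should return TransactionID or NULL, so that you can still count distinct values.
- The ConversionCount COUNT is missing DISTINCT.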
- The NULL values produced by the ConversionCount CASE need no special handling: COUNT(DISTINCT ...) ignores NULL, so nothing has to be subtracted to compensate.
(Without some sample detail data to work with, I can't be 100% sure of the syntax here.)
Code
SELECT
ReportingMonth = DATENAME(mm, GeneralJournal.TransactionDate),
MonthNumber = DATEPART(mm, GeneralJournal.TransactionDate),
ReportingYear = DATEPART(yyyy, GeneralJournal.TransactionDate),
TransactionCount = SUM(CASE
WHEN TransactionTypeId = @SalesEstimateTransactionTypeId
AND MasterRecord = 1 THEN
1
ELSE
0
END
),
ConversionCount = COUNT(DISTINCT CASE
WHEN TransactionTypeId = @SalesOrderTransactionTypeId
AND SEReferenceId > 0
AND MasterRecord = 0 THEN
TransactionID
ELSE
NULL
END
) /* COUNT(DISTINCT ...) skips the NULLs, so no adjustment is needed */
FROM GeneralJournal
WHERE
GeneralJournal.TransactionDate >= @StartDate
AND GeneralJournal.TransactionDate <= @EndDate
AND TransactionTypeId IN (
@SalesOrderTransactionTypeId,
@SalesEstimateTransactionTypeId
)
GROUP BY
DATEPART(yyyy, GeneralJournal.TransactionDate),
DATEPART(mm, GeneralJournal.TransactionDate),
DATENAME(mm, GeneralJournal.TransactionDate);
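As a quick check of how COUNT(DISTINCT ...) treats the NULL that the CASE yields for non-matching rows, here is a tiny sketch, assuming SQL Server 2008 or later for the VALUES table constructor. The values are invented for the demonstration.

```sql
-- Hypothetical values: order 10 appears twice, order 11 once, and NULL
-- stands in for a row where the CASE expression did not match.
SELECT COUNT(DISTINCT v.TransactionId) AS ConversionCount
FROM (VALUES (10), (10), (11), (NULL)) AS v (TransactionId);
```

This should return 2: the duplicate 10 collapses to one value and the NULL is not counted at all.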