How can I merge two SELECT queries with counts over mostly identical columns when one set returns multiple rows?

I have two queries that work perfectly:

DECLARE @StartDate DATETIME = '2021-11-01 00:00:00';
DECLARE @EndDate DATETIME = '2022-03-16 23:59:59';
DECLARE @SalesEstimateTransactionTypeId INT = 16;
DECLARE @SalesOrderTransactionTypeId INT = 15;

SELECT 
    DATENAME(mm, GeneralJournal.[TransactionDate]) AS ReportingMonth, 
    DATEPART(mm, GeneralJournal.[TransactionDate]) AS MonthNumber, 
    DATEPART(yyyy, GeneralJournal.[TransactionDate]) AS ReportingYear,
    COUNT(TransactionId) AS TransactionCount
FROM 
    GeneralJournal 
WHERE 
    GeneralJournal.[TransactionDate] >= @StartDate 
    AND GeneralJournal.[TransactionDate] <= @EndDate 
    AND MasterRecord = 1 
    AND TransactionTypeId = @SalesEstimateTransactionTypeId
GROUP BY 
    DATEPART(yyyy, GeneralJournal.[TransactionDate]), 
    DATEPART(mm, GeneralJournal.[TransactionDate]),  
    DATENAME(mm,GeneralJournal.[TransactionDate]);

SELECT 
    DATENAME(mm, GeneralJournal.[TransactionDate]) AS ReportingMonth, 
    DATEPART(mm, GeneralJournal.[TransactionDate]) AS MonthNumber, 
    DATEPART(yyyy, GeneralJournal.[TransactionDate]) AS ReportingYear, 
    COUNT(DISTINCT TransactionId) AS ConversionCount
FROM 
    GeneralJournal 
WHERE 
    GeneralJournal.[TransactionDate] >= @StartDate 
    AND GeneralJournal.[TransactionDate] <= @EndDate 
    AND MasterRecord = 0 
    AND TransactionTypeId = @SalesOrderTransactionTypeId 
    AND SEReferenceId > 0
GROUP BY 
    DATEPART(yyyy, GeneralJournal.[TransactionDate]), 
    DATEPART(mm, GeneralJournal.[TransactionDate]),  
    DATENAME(mm,GeneralJournal.[TransactionDate]);

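Note that the count in the second query is DISTINCT, because that query can return multiple rows per transaction and in that case we only want to count each TransactionId once. These return the following results: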

ReportingMonth  MonthNumber  ReportingYear  TransactionCount
November        11           2021           82
December        12           2021           49
January         1            2022           64
February        2            2022           67
March           3            2022           49

ReportingMonth  MonthNumber  ReportingYear  ConversionCount
November        11           2021           42
December        12           2021           27
January         1            2022           31
February        2            2022           50
March           3            2022           24

What I actually need is to combine them like this:

ReportingMonth  MonthNumber  ReportingYear  TransactionCount  ConversionCount
November        11           2021           82                42
December        12           2021           49                27
January         1            2022           64                31
February        2            2022           67                50
March           3            2022           49                24

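I have tried just about everything I can think of (unions, joins, subqueries), but so far nothing has been quite right. This is the closest I have been able to get: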

SELECT 
    DATENAME(mm, GeneralJournal.[TransactionDate]) AS ReportingMonth, 
    DATEPART(mm, GeneralJournal.[TransactionDate]) AS MonthNumber, 
    DATEPART(yyyy, GeneralJournal.[TransactionDate]) AS ReportingYear, 
    SUM(CASE 
            WHEN TransactionTypeId = @SalesEstimateTransactionTypeId 
                 AND MasterRecord = 1 
               THEN 1 ELSE 0 
        END) AS TransactionCount, 
    COUNT(CASE 
              WHEN TransactionTypeId = @SalesOrderTransactionTypeId  
                   AND SEReferenceId > 0 THEN 1 
          END) AS ConversionCount
FROM 
    GeneralJournal 
WHERE 
    GeneralJournal.[TransactionDate] >= @StartDate 
    AND GeneralJournal.[TransactionDate] <= @EndDate 
    AND TransactionTypeId IN (@SalesOrderTransactionTypeId, @SalesEstimateTransactionTypeId)
GROUP BY 
    DATEPART(yyyy, GeneralJournal.[TransactionDate]), 
    DATEPART(mm, GeneralJournal.[TransactionDate]),    
    DATENAME(mm,GeneralJournal.[TransactionDate]);

However, I cannot find a way to get distinct values for ConversionCount. As a result, it returns the full row count:

ReportingMonth  MonthNumber  ReportingYear  TransactionCount  ConversionCount
November        11           2021           82                152
December        12           2021           49                67
January         1            2022           64                101
February        2            2022           67                136
March           3            2022           49                64

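Can anyone point me towards a way to combine the two query results while keeping the DISTINCT on the conversion count? I must add that the answer has to be compatible with both SQL Server and VistaDB, whose syntax is a subset of T-SQL, because I am obliged to support both database engines with the same queries.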

Edit - Final Solution

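Based on Nick's excellent answer, I was able to embed the solution into my existing query code, which ensures a result row even for months with no records. It is shown here in case it helps anyone else: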

DECLARE @StartDate DATETIME = '2021-11-01T00:00:00';
DECLARE @EndDate DATETIME = '2022-10-31T23:59:59';
DECLARE @SalesEstimateTransactionTypeId INT = 16;
DECLARE @SalesOrderTransactionTypeId INT = 15;

DECLARE @CurrentDate DATETIME;
DECLARE @Months TABLE(ReportingYear INT, MonthNumber INT, ReportingMonth VARCHAR (40));

-- Set the initial date
SET @CurrentDate = @StartDate
-- insert all dates into temp table
WHILE @CurrentDate <=  @EndDate
BEGIN
    INSERT INTO @Months VALUES(DATEPART(year, @CurrentDate), DATEPART(month, @CurrentDate), DATENAME(mm, @CurrentDate))
    SET @CurrentDate = dateadd(mm, 1, @CurrentDate)
END;

SELECT ReportingMonth, ReportingYear, Coalesce(TransactionCount, 0) AS TransactionCount, Coalesce(ConversionCount,0) AS ConversionCount
FROM
(
    SELECT months.[ReportingMonth], months.[ReportingYear], conversionData.[TransactionCount], conversionData.[ConversionCount]
    FROM @Months months
    LEFT JOIN
    (
        SELECT
        ReportingMonth      = DATENAME(mm, GeneralJournal.[TransactionDate]),
        MonthNumber         = DATEPART(mm, GeneralJournal.[TransactionDate]),
        ReportingYear       = DATEPART(yyyy, GeneralJournal.[TransactionDate]),
        TransactionCount    = SUM(CASE WHEN TransactionTypeId = @SalesEstimateTransactionTypeId AND GeneralJournal.[MasterRecord] = 1 THEN
                                        1
                                    ELSE
                                        0
                                END
                            ),
        ConversionCount     = COUNT(DISTINCT CASE WHEN GeneralJournal.[TransactionTypeId] = @SalesOrderTransactionTypeId
                                        AND GeneralJournal.[SEReferenceId] > 0
                                        AND GeneralJournal.[MasterRecord] = 0 THEN
                                        GeneralJournal.[TransactionID]
                                END
                            )
        FROM GeneralJournal
        WHERE GeneralJournal.[TransactionDate] >= @StartDate
            AND GeneralJournal.[TransactionDate] <= @EndDate
            AND GeneralJournal.[TransactionTypeId] IN ( @SalesOrderTransactionTypeId, @SalesEstimateTransactionTypeId)
        GROUP BY
            DATEPART(yyyy, GeneralJournal.[TransactionDate]),
            DATEPART(mm, GeneralJournal.[TransactionDate]),
            DATENAME(mm, GeneralJournal.[TransactionDate])
    ) as conversionData
    ON months.[ReportingYear] = conversionData.[ReportingYear] AND months.[MonthNumber] = conversionData.[MonthNumber]
) AS data;

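You can get both columns in the same query. It is made more complicated by the slightly different WHERE clauses, so you need to group, then group again, and use conditional aggregation to count the right rows for each column.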

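Note the following:

  • In theory you could do COUNT(DISTINCT CASE ...), but it is usually slower, because the compiler does not recognize what the CASE is doing and performs a full sort instead.
  • Grouping by a single EOMONTH calculation is faster than grouping by the separate year and month expressions. You can pull out the year and month in the SELECT.
  • COUNT(TransactionId) returns the number of non-null TransactionId values. If TransactionId cannot be null, then COUNT(*) gives the same result.
  • If TransactionDate has a time component, you should use a half-open interval: >= AND <
  • Use aliases on tables; it makes your queries more readable.
  • Use whitespace; it is free.
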
DECLARE @StartDate DATETIME = '2021-11-01T00:00:00';
DECLARE @EndDate DATETIME = '2022-03-17T00:00:00';
DECLARE @SalesEstimateTransactionTypeId INT = 16;
DECLARE @SalesOrderTransactionTypeId INT = 15;

SELECT
  DATENAME(month, g.mth) AS ReportingMonth,
  DATEPART(month, g.mth) AS MonthNumber,
  DATEPART(year , g.mth) AS ReportingYear,
  SUM(TransactionCount) AS TransactionCount,
  COUNT(CASE WHEN ConversionCount > 0 THEN 1 END) AS ConversionCount
FROM (
    SELECT
      EOMONTH(gj.TransactionDate) AS mth,
      gj.TransactionId,
      COUNT(CASE WHEN gj.MasterRecord = 1 AND gj.TransactionTypeId = @SalesEstimateTransactionTypeId THEN 1 END) AS TransactionCount,
      COUNT(CASE WHEN gj.MasterRecord = 0 AND gj.TransactionTypeId = @SalesOrderTransactionTypeId AND gj.SEReferenceId > 0 THEN 1 END) AS ConversionCount
    FROM GeneralJournal gj
    WHERE gj.TransactionDate >= @StartDate
      AND gj.TransactionDate <  @EndDate
      AND gj.TransactionTypeId IN (@SalesOrderTransactionTypeId, @SalesEstimateTransactionTypeId)
    GROUP BY
      EOMONTH(gj.TransactionDate),
      TransactionId
) g
GROUP BY
  mth;
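
To illustrate the half-open interval point, here is a minimal sketch against a hypothetical table variable. @Sample and its three rows are made up for illustration and are not taken from the question. A DATETIME value in the last second of the day falls outside a closed range ending at 23:59:59, but inside a half-open range ending at the next midnight.

```sql
-- Hypothetical sample data, for illustration only.
DECLARE @Sample TABLE (TransactionId INT, TransactionDate DATETIME);

INSERT INTO @Sample (TransactionId, TransactionDate) VALUES (1, '2022-03-16T12:00:00');
INSERT INTO @Sample (TransactionId, TransactionDate) VALUES (2, '2022-03-16T23:59:59.500'); -- last half second of the day
INSERT INTO @Sample (TransactionId, TransactionDate) VALUES (3, '2022-03-17T00:00:00');     -- first instant of the next day

-- Closed interval: the 23:59:59.500 row is later than the end bound, so it is missed.
SELECT COUNT(*) AS ClosedCount
FROM @Sample s
WHERE s.TransactionDate >= '2022-03-16T00:00:00'
  AND s.TransactionDate <= '2022-03-16T23:59:59';

-- Half-open interval: every row on 2022-03-16 is included, and nothing from 2022-03-17.
SELECT COUNT(*) AS HalfOpenCount
FROM @Sample s
WHERE s.TransactionDate >= '2022-03-16T00:00:00'
  AND s.TransactionDate <  '2022-03-17T00:00:00';
```

On SQL Server the first query should count only row 1, while the second counts rows 1 and 2. I have not checked this against VistaDB.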

Your combined query is close; I think there are just a few small omissions.

  1. You forgot MasterRecord = 0 in your ConversionCount CASE expression.
  2. Instead of returning 1 or 0 from your ConversionCount CASE, you should return the TransactionID or NULL, so that you can still count distinct values.
  3. You are missing the DISTINCT in your ConversionCount COUNT.
  4. You do not need to compensate for the NULLs produced by the CASE: COUNT(DISTINCT ...) counts only non-NULL values, so rows that do not match the condition are simply ignored.

(Without some sample detail data to work with, I can't be 100% sure of the syntax here.)

Code

SELECT
    ReportingMonth      = DATENAME(mm, GeneralJournal.TransactionDate),
    MonthNumber         = DATEPART(mm, GeneralJournal.TransactionDate),
    ReportingYear       = DATEPART(yyyy, GeneralJournal.TransactionDate),
    TransactionCount    = SUM(CASE
                                WHEN TransactionTypeId = @SalesEstimateTransactionTypeId
                                    AND MasterRecord = 1 THEN
                                    1
                                ELSE
                                    0
                            END
                        ),
    ConversionCount     = COUNT(DISTINCT CASE
                                WHEN TransactionTypeId = @SalesOrderTransactionTypeId
                                    AND SEReferenceId > 0
                                    AND MasterRecord = 0 THEN
                                    TransactionID
                                ELSE
                                    NULL /* NULLs are not counted by COUNT */
                            END
                        )
FROM    GeneralJournal
WHERE
    GeneralJournal.TransactionDate >= @StartDate
    AND GeneralJournal.TransactionDate <= @EndDate
    AND TransactionTypeId IN (
            @SalesOrderTransactionTypeId,
            @SalesEstimateTransactionTypeId
        )
GROUP BY
    DATEPART(yyyy, GeneralJournal.TransactionDate),
    DATEPART(mm, GeneralJournal.TransactionDate),
    DATENAME(mm, GeneralJournal.TransactionDate);
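
As a quick sanity check of the conditional aggregation, the following sketch runs the same CASE expressions over a handful of made-up rows. The @Sample table variable, its values, and the literal type ids 16 and 15 (copied from the question's variables) are assumptions for illustration only. It contrasts a plain COUNT over the CASE with COUNT(DISTINCT ...) returning the TransactionId.

```sql
-- Hypothetical sample rows, for illustration only.
DECLARE @Sample TABLE (
    TransactionId     INT,
    TransactionTypeId INT,
    MasterRecord      INT,
    SEReferenceId     INT
);

INSERT INTO @Sample VALUES (100, 16, 1, 0);   -- sales estimate, master record
INSERT INTO @Sample VALUES (200, 15, 0, 100); -- sales order 200, first detail row
INSERT INTO @Sample VALUES (200, 15, 0, 100); -- sales order 200, second detail row
INSERT INTO @Sample VALUES (201, 15, 0, 100); -- sales order 201, single detail row
INSERT INTO @Sample VALUES (202, 15, 0, 0);   -- sales order with no SEReferenceId, must not count

SELECT
    -- Estimates: one per matching master record.
    SUM(CASE WHEN s.TransactionTypeId = 16 AND s.MasterRecord = 1
             THEN 1 ELSE 0 END) AS TransactionCount,
    -- Counts every matching detail row, so order 200 is counted twice.
    COUNT(CASE WHEN s.TransactionTypeId = 15 AND s.MasterRecord = 0 AND s.SEReferenceId > 0
               THEN 1 END) AS ConversionRowCount,
    -- Counts each matching TransactionId once; non-matching rows yield NULL and are skipped.
    COUNT(DISTINCT CASE WHEN s.TransactionTypeId = 15 AND s.MasterRecord = 0 AND s.SEReferenceId > 0
                        THEN s.TransactionId END) AS ConversionCount
FROM @Sample s;
```

With these rows, SQL Server should return TransactionCount = 1, ConversionRowCount = 3 and ConversionCount = 2. The single-row INSERT statements are used only to stay within a conservative subset of T-SQL; whether VistaDB accepts every construct here is untested.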