PostgreSQL - Select only 1 row for each ID

Situation

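I work on a travel engine website and am writing a complex query that matches visitors' search queries with their bookings, based on IP address, destination, and date, so that I can calculate conversion rates later.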

Problem

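I need several conversion rates based on one parameter (in this case, the utm_source that I extract from the RequestUrl stored in the searches table). The problem is that some users run multiple searches from different places. Sometimes we get a utm_source in the request and sometimes we don't, and of course each booking should be matched only once. See the screenshot of the query result below for a better understanding: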

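Notice that rows 3 and 4 have the same booking ID, IP address, destination, and date, but different values in the Value column. I need to select only one of them, not both. Basically, if there is more than one row for a booking, I need to pick the one whose Value is not 'N/A'.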

My query:

SELECT DISTINCT "B"."Id" AS "BookingId", "PQ"."IPAddress", "PQ"."To", "PQ"."SearchDate", "PQ"."Value"
FROM
(
    SELECT DISTINCT "IPAddress", "To", "CreatedAt"::date AS "SearchDate", COALESCE(SUBSTRING("RequestUrl", 'utm_source=([^&]*)'), 'N/A') AS "Value"
    FROM dbo."PackageQueries"
    WHERE "SiteId" = '<The ID>'
    AND "CreatedAt" >= '<Start Date>'
    AND "CreatedAt" < '<End Date>'
) AS "PQ"
INNER JOIN dbo."Bookings" AS "B"
    ON "PQ"."IPAddress" = "B"."IPAddress"
    AND "B"."To" = "PQ"."To"
    AND "B"."BookingDate"::date = "PQ"."SearchDate"
WHERE "B"."SiteId" = '<The ID>'
AND "B"."BookingStatus" = 2
AND "B"."BookingDate" >= '<Start Date>'
AND "B"."BookingDate" < '<End Date>'
ORDER BY "B"."Id", "PQ"."IPAddress", "PQ"."To";

I found a solution, building on other answers, including: Postgres CASE in ORDER BY using an alias

My solution is as follows:

SELECT "BookingId", "IPAddress", "To", "SearchDate", "Value"
FROM
(
    SELECT DISTINCT
        "B"."Id" AS "BookingId",
        "PQ"."IPAddress",
        "PQ"."To",
        "PQ"."SearchDate",
        "PQ"."Value",
        RANK() OVER
        (
            PARTITION BY "B"."Id"
            ORDER BY
            CASE
                WHEN "PQ"."Value" = 'N/A' THEN 1
                ELSE 0
            END
        ) AS "RowNumber"
    FROM
    (
        SELECT DISTINCT "IPAddress", "To", "CreatedAt"::date AS "SearchDate", COALESCE(SUBSTRING("RequestUrl", 'utm_source=([^&]*)'), 'N/A') AS "Value"
        FROM dbo."PackageQueries"
        WHERE "SiteId" = '<Site ID>'
        AND "CreatedAt" >= '<Start Date>'
        AND "CreatedAt" < '<End Date>'
    ) AS "PQ"
    INNER JOIN dbo."Bookings" AS "B"
        ON "PQ"."IPAddress" = "B"."IPAddress"
        AND "B"."To" = "PQ"."To"
        AND "B"."BookingDate"::date = "PQ"."SearchDate"
    WHERE "B"."SiteId" = '<Site ID>'
    AND "B"."BookingStatus" = 2
    AND "B"."BookingDate" >= '<Start Date>'
    AND "B"."BookingDate" < '<End Date>'
) T
WHERE "RowNumber" = 1
ORDER BY "BookingId", "IPAddress", "To";

It is a bit long-winded, but it works well. I hope it helps others.
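To see the ranking step in isolation, here is a minimal sketch that runs without the real tables. The `matched` CTE and its rows are made up for illustration; it stands in for the joined search/booking result above. Within each booking, the CASE expression sorts 'N/A' last, so a row with a real utm_source gets rank 1 whenever one exists.

```sql
-- Hypothetical sample rows standing in for the joined result of
-- "PackageQueries" and "Bookings"; the values are invented.
WITH matched ("BookingId", "Value") AS (
    VALUES
        (101, 'N/A'),
        (101, 'google'),
        (102, 'N/A')
)
SELECT "BookingId", "Value"
FROM (
    SELECT
        "BookingId",
        "Value",
        -- 'N/A' maps to 1 and everything else to 0, so real values rank first.
        RANK() OVER (
            PARTITION BY "BookingId"
            ORDER BY CASE WHEN "Value" = 'N/A' THEN 1 ELSE 0 END
        ) AS "RowNumber"
    FROM matched
) AS t
WHERE "RowNumber" = 1
ORDER BY "BookingId";
-- Booking 101 keeps 'google'; booking 102 keeps 'N/A', its only row.
```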

Edit

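The story does not end there: in some cases I still got more than one value per booking. This happens when a booking matches several different utm_source values, because they all map to 0 in the CASE expression and RANK() gives tied rows the same rank. The answer is to modify the CASE expression so that it produces a unique number for each text value. The solution can be found here: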
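The linked solution is not reproduced here. As one possible way to break the tie, and not necessarily the approach from that link, the sketch below swaps RANK() for ROW_NUMBER() and adds "Value" as a second sort key. The `matched` CTE and its rows are again hypothetical. Which non-'N/A' value wins is an arbitrary choice (here, the alphabetically first one), so treat this as an assumption about what is acceptable for the conversion report.

```sql
-- Hypothetical sample rows: booking 101 now matches two real sources.
WITH matched ("BookingId", "Value") AS (
    VALUES
        (101, 'N/A'),
        (101, 'google'),
        (101, 'facebook'),
        (102, 'N/A')
)
SELECT "BookingId", "Value"
FROM (
    SELECT
        "BookingId",
        "Value",
        -- ROW_NUMBER() never assigns the same number twice in a partition;
        -- the second sort key makes the choice among real values deterministic.
        ROW_NUMBER() OVER (
            PARTITION BY "BookingId"
            ORDER BY
                CASE WHEN "Value" = 'N/A' THEN 1 ELSE 0 END,
                "Value"
        ) AS "RowNumber"
    FROM matched
) AS t
WHERE "RowNumber" = 1
ORDER BY "BookingId";
-- Booking 101 keeps 'facebook'; booking 102 keeps 'N/A'.
```

With plain RANK() and only the CASE expression, booking 101 would return both 'google' and 'facebook', which is the duplicate described above.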