Simplifying SELECT statement

So I have a statement that I believe should work... but it feels very suboptimal, and I can't for the life of me figure out how to optimise it.

I have the following tables:

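What I want to do is select all transfers, along with some data from their related transaction (one transaction to many transfers), for which I do not currently have a token price stored that relates to that transfer.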

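In the first part of the query, I get a list of all transfers and calculate the DIFF between the transaction timestamp and the nearest token price found. If no price is found, the diff is null (which is what I ultimately want to select). I allow 3 hours either side of the transaction timestamp - if nothing is found within that timespan, it will be null.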

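Secondly, I select from this set, first ensuring that diff is null, as this means the price is missing, and finally that TokenPriceAttempts either has no entry for an attempt to get a price, or, if it does, that it lists fewer than 5 attempts and the last attempt was over a week ago.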

The way I have laid this out results in essentially 3 identical/similar SELECT statements within the WHERE clause, which feels very suboptimal...

How could I improve this approach?

        WITH [transferDateDiff] AS 
        (
            SELECT
                [t1].[Id],
                [t1].[TransactionId],
                [t1].[From],
                [t1].[To],
                [t1].[Value],
                [t1].[Type],
                [t1].[ContractAddress],
                [t1].[TokenId],
                [t2].[Hash],
                [t2].[Timestamp],
                ABS(DATEDIFF(SECOND, [tp].[Timestamp], [t2].[Timestamp])) AS diff 
            FROM
                [dbo].[Transfers] AS [t1]
                LEFT JOIN
                    [dbo].[Transactions] AS [t2] 
                    ON [t1].[TransactionId] = [t2].[Id]
                LEFT JOIN
                    (
                        SELECT
                            * 
                        FROM
                            [dbo].[TokenPrices]
                    )
                    AS [tp] 
                    ON [tp].[ContractAddress] = [t1].[ContractAddress]
                    AND [tp].[Timestamp] >= DATEADD(HOUR, - 3, [t2].[Timestamp]) 
                    AND [tp].[Timestamp] <= DATEADD(HOUR, 3, [t2].[Timestamp]) 
            WHERE
                [t1].[Type] < 2
        )


        SELECT
            [tdd].[Id],
            [tdd].[TransactionId],
            [tdd].[From],
            [tdd].[To],
            [tdd].[Value],
            [tdd].[Type],
            [tdd].[ContractAddress],
            [tdd].[TokenId],
            [tdd].[Hash],
            [tdd].[Timestamp]
        FROM
            [transferDateDiff] AS tdd
        WHERE
            [tdd].[diff] IS NULL AND
            (
                (
                    SELECT
                        COUNT(*)
                    FROM
                        [dbo].[TokenPriceAttempts] tpa
                    WHERE
                        [tpa].[TransferId] = [tdd].[Id]
                )
                = 0 OR
                (
                    (
                        SELECT
                            COUNT(*)
                        FROM
                            [dbo].[TokenPriceAttempts] tpa
                        WHERE
                            [tpa].[TransferId] = [tdd].[Id]
                    )
                    < 5 AND
                    (
                        DATEDIFF(DAY,
                            (
                                SELECT
                                    MAX([tpa].[Created])
                                FROM
                                    [dbo].[TokenPriceAttempts] tpa
                                WHERE
                                    [tpa].[TransferId] = [tdd].[Id]
                            ),
                            CURRENT_TIMESTAMP
                        ) >= 7
                    )
                )
            )

I don't understand why you are doing this:

LEFT JOIN
(
    SELECT
        * 
    FROM
        [dbo].[TokenPrices]
)
AS [tp] 
ON [tp].[ContractAddress] = [t1].[ContractAddress]
AND [tp].[Timestamp] >= DATEADD(HOUR, - 3, [t2].[Timestamp]) 
AND [tp].[Timestamp] <= DATEADD(HOUR, 3, [t2].[Timestamp]) 

Isn't that just

LEFT JOIN [dbo].[TokenPrices] as TP ...

This:

SELECT
    COUNT(*)
FROM
    [dbo].[TokenPriceAttempts] tpa
WHERE
    [tpa].[TransferId] = [tdd].[Id]

could be another CTE rather than a subquery... In fact, any of your subqueries could be CTEs; that is part of what CTEs are for, making things easier to read.

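,TPA
AS
(
    -- A CTE cannot reference [tdd].[Id] the way a correlated subquery can,
    -- so aggregate per TransferId and join on that column instead
    SELECT [tpa].[TransferId], COUNT(*) AS Attempts
    FROM [dbo].[TokenPriceAttempts] tpa
    GROUP BY [tpa].[TransferId]
)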
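To make that concrete, here is a rough sketch of how such a CTE might be wired into the original query. It assumes the aggregate is grouped by TransferId and then LEFT JOINed, since a CTE cannot see [tdd].[Id] the way a correlated subquery does. The names [TPA], Attempts and MaxCreated are made up for illustration, and the select list is trimmed to the columns the filter needs.

```sql
WITH [transferDateDiff] AS
(
    -- Same joins as the first CTE in the question, with an abbreviated column list
    SELECT
        [t1].[Id],
        ABS(DATEDIFF(SECOND, [tp].[Timestamp], [t2].[Timestamp])) AS diff
    FROM [dbo].[Transfers] AS [t1]
        LEFT JOIN [dbo].[Transactions] AS [t2]
            ON [t1].[TransactionId] = [t2].[Id]
        LEFT JOIN [dbo].[TokenPrices] AS [tp]
            ON [tp].[ContractAddress] = [t1].[ContractAddress]
            AND [tp].[Timestamp] >= DATEADD(HOUR, -3, [t2].[Timestamp])
            AND [tp].[Timestamp] <= DATEADD(HOUR, 3, [t2].[Timestamp])
    WHERE [t1].[Type] < 2
),
[TPA] AS
(
    -- One row per transfer that has at least one attempt (hypothetical CTE name)
    SELECT
        [att].[TransferId],
        COUNT(*) AS Attempts,
        MAX([att].[Created]) AS MaxCreated
    FROM [dbo].[TokenPriceAttempts] AS [att]
    GROUP BY [att].[TransferId]
)
SELECT [tdd].[Id]
FROM [transferDateDiff] AS [tdd]
    LEFT JOIN [TPA]
        ON [TPA].[TransferId] = [tdd].[Id]
WHERE [tdd].[diff] IS NULL
    AND (
        [TPA].[TransferId] IS NULL   -- no attempts recorded at all
        OR (
            [TPA].[Attempts] < 5
            AND DATEDIFF(DAY, [TPA].[MaxCreated], CURRENT_TIMESTAMP) >= 7
        )
    )
```

Because the CTE only produces rows for transfers that have attempts, the "no attempts" case shows up as a NULL on the joined side rather than as a count of 0.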

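Here is an attempt to help simplify. I removed all the [brackets], which are really not needed unless you run into something like a reserved keyword, or columns with spaces in the name (a bad idea to begin with).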

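Anyhow, your main query had 3 instances of a select per Id. To eliminate that, I did a LEFT JOIN to a subquery that pulls all transfers of type < 2 and JOINs to the price attempts ONCE. This way, the result already has the count(*) and Max(Created) pre-aggregated in a single pass, for the same basis of transfers as your WITH CTE declaration. So you do not have to keep running 3 queries per row, and you are not querying the entire table of ALL transfers, just those with the same underlying type < 2 condition. The resulting subquery is aliased "PQ" (preQuery).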

This makes the outer WHERE clause easier to read, since the redundant per-Id counts are gone.

WITH transferDateDiff AS 
(
SELECT
        t1.Id,
        t1.TransactionId,
        t1.[From],   -- FROM and TO are reserved keywords, so these two keep their brackets
        t1.[To],
        t1.Value,
        t1.Type,
        t1.ContractAddress,
        t1.TokenId,
        t2.Hash,
        t2.Timestamp,
        ABS( DATEDIFF( SECOND, tp.Timestamp, t2.Timestamp )) AS diff 
    FROM
        dbo.Transfers t1
            LEFT JOIN dbo.Transactions t2
                ON t1.TransactionId = t2.Id
                LEFT JOIN dbo.TokenPrices tp
                    ON t1.ContractAddress = tp.ContractAddress
                    AND tp.Timestamp >= DATEADD(HOUR, - 3, t2.Timestamp)
                    AND tp.Timestamp <= DATEADD(HOUR, 3, t2.Timestamp)
    WHERE
        t1.Type < 2
)


SELECT
        tdd.Id,
        tdd.TransactionId,
        tdd.[From],
        tdd.[To],
        tdd.Value,
        tdd.Type,
        tdd.ContractAddress,
        tdd.TokenId,
        tdd.Hash,
        tdd.Timestamp
    FROM
        transferDateDiff tdd
            LEFT JOIN
            ( SELECT
                    t1.Id,
                    COUNT(*) Attempts,
                    MAX(tpa.Created) MaxCreated
                FROM
                    dbo.Transfers t1
                        JOIN dbo.TokenPriceAttempts tpa
                            on t1.Id = tpa.TransferId
                WHERE
                    t1.Type < 2
                GROUP BY
                    t1.Id ) PQ
                on tdd.Id = PQ.Id
    WHERE
            tdd.diff IS NULL 
        AND (   PQ.Attempts IS NULL
            OR PQ.Attempts = 0
            OR (    PQ.Attempts < 5
                AND DATEDIFF(DAY, PQ.MaxCreated, CURRENT_TIMESTAMP ) >= 7
                )
            )

Revised to fold the WITH CTE into a single query

SELECT
        t1.Id,
        t1.TransactionId,
        t1.[From],   -- FROM and TO are reserved keywords, so these two keep their brackets
        t1.[To],
        t1.Value,
        t1.Type,
        t1.ContractAddress,
        t1.TokenId,
        t2.Hash,
        t2.Timestamp
    FROM
        -- Now, this pre-query is left-joined to token price attempts
        -- so ALL Transfers of type < 2 are considered
        ( SELECT
                t1.Id,
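                -- COUNT(*) would count the null-extended row from the LEFT JOIN as 1;
                -- counting a tpa column gives 0 for transfers with no attempts
                COUNT(tpa.TransferId) Attempts,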
                MAX(tpa.Created) MaxCreated
            FROM
                dbo.Transfers t1
                    LEFT JOIN dbo.TokenPriceAttempts tpa
                        on t1.Id = tpa.TransferId
            WHERE
                t1.Type < 2
            GROUP BY
                t1.Id ) PQ
            -- Now, we can just directly join to transfers for the rest
            JOIN dbo.Transfers t1
                on PQ.Id = t1.Id
                -- and the rest from the WITH CTE construct
                LEFT JOIN dbo.Transactions t2
                    ON t1.TransactionId = t2.Id
                    LEFT JOIN dbo.TokenPrices tp
                        ON t1.ContractAddress = tp.ContractAddress
                        AND tp.Timestamp >= DATEADD(HOUR, - 3, t2.Timestamp)
                        AND tp.Timestamp <= DATEADD(HOUR, 3, t2.Timestamp)
    WHERE
            ABS( DATEDIFF( SECOND, tp.Timestamp, t2.Timestamp )) IS NULL
        AND (   PQ.Attempts = 0
            OR (    PQ.Attempts < 5
                AND DATEDIFF(DAY, PQ.MaxCreated, CURRENT_TIMESTAMP ) >= 7 )
            )
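One detail worth verifying in a LEFT JOIN pre-query like PQ is how the attempts are counted. COUNT(*) counts rows, so a transfer with no attempts still yields 1 from its null-extended row, whereas COUNT over a column from the joined table skips the NULL and yields 0. A small sketch, using table variables with made-up sample data so it can be run on its own:

```sql
-- Hypothetical sample data held in table variables so the sketch is self-contained
DECLARE @Transfers TABLE (Id INT PRIMARY KEY);
DECLARE @TokenPriceAttempts TABLE (TransferId INT, Created DATETIME);

INSERT INTO @Transfers (Id) VALUES (1), (2);

-- Transfer 1 has two attempts; transfer 2 has none
INSERT INTO @TokenPriceAttempts (TransferId, Created)
VALUES (1, '2022-01-01'), (1, '2022-01-02');

SELECT
    t.Id,
    COUNT(*)              AS CountStar,    -- counts the null-extended row too
    COUNT(tpa.TransferId) AS CountColumn,  -- ignores NULLs produced by the outer join
    MAX(tpa.Created)      AS MaxCreated
FROM @Transfers t
    LEFT JOIN @TokenPriceAttempts tpa
        ON t.Id = tpa.TransferId
GROUP BY t.Id;
```

For transfer 2 this returns CountStar = 1 and CountColumn = 0, with MaxCreated NULL, so only the column count works with a filter such as PQ.Attempts = 0.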