Trying to make a slope calculation work with SQL syntax

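I'm a relatively new SQL programmer, and I'm trying to get the code below to work in SQL. It calculates the slope of a given data set, following exactly the same logic as the Excel SLOPE function. The problem is that the count is not allowed, because the aggregates are nested. But if I create a subquery for the count and the sum, I have to group by x and y; otherwise the outer query has no x and y to calculate with.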

CREATE TABLE TEST (X FLOAT, Y FLOAT);

INSERT INTO TEST (X, Y) VALUES (1,4.10242258729964);
INSERT INTO TEST (X, Y) VALUES (2,4.57708865242591);
INSERT INTO TEST (X, Y) VALUES (3,5.16785670619896);
INSERT INTO TEST (X, Y) VALUES (4,6.88149559336059);

select sum((x-sum(x)/count(x))^2)/sum(((x-sum(x)/count(x))*(y-sum(y)/count(y))))
from TEST

You can calculate the slope from sum(x * x) and sum(x * y), together with avg(x), avg(y) and n:

SELECT avg(x) AS mx,sum(x*x) AS sx2,sum(x*y) as sxy,avg(y) as my, count(x) AS n
FROM test

Then you can use:

SELECT (sxy-n*mx*my)/(sx2 - n* mx*mx)
FROM
(    SELECT avg(x) AS mx,sum(x*x) AS sx2,sum(x*y) as sxy,avg(y) as my, count(x) AS n
     FROM test
) t;  -- the derived table needs an alias in SQL Server, MySQL and PostgreSQL
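
For reference, here is a sketch of why the sum-based expression gives the least-squares slope. The notation (x_i, y_i, bars for means) is standard shorthand and is not part of the answer's code; in the query, mx and my are the means, sx2 and sxy are the sums, and n is the row count.

```latex
\begin{aligned}
\text{slope}
  &= \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2} \\
\sum_i (x_i - \bar{x})(y_i - \bar{y})
  &= \sum_i x_i y_i - \bar{x}\sum_i y_i - \bar{y}\sum_i x_i + n\,\bar{x}\,\bar{y}
   = \sum_i x_i y_i - n\,\bar{x}\,\bar{y} \\
\sum_i (x_i - \bar{x})^2
  &= \sum_i x_i^2 - 2\,\bar{x}\sum_i x_i + n\,\bar{x}^2
   = \sum_i x_i^2 - n\,\bar{x}^2 \\
\text{slope}
  &= \frac{\sum_i x_i y_i - n\,\bar{x}\,\bar{y}}{\sum_i x_i^2 - n\,\bar{x}^2}
\end{aligned}
```

The last line matches (sxy - n*mx*my) / (sx2 - n*mx*mx) term by term, so a single pass of plain aggregates is enough and no nested aggregate is needed.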

I usually do something along these lines (I kept the syntax as simple as possible to avoid anything that might cause trouble, such as CTEs):

CREATE TABLE #test (x FLOAT, y FLOAT);
INSERT INTO #test SELECT 1, 4.10242258729964;
INSERT INTO #test SELECT 2, 4.57708865242591;
INSERT INTO #test SELECT 3, 5.16785670619896;
INSERT INTO #test SELECT 4, 6.88149559336059;
SELECT 
    (N * SUM_XY - SUM_X * SUM_Y) / (N * SUM_X2 - SUM_X * SUM_X) AS slope
FROM 
    (
    SELECT
        COUNT(*) AS N,
        SUM(x) AS SUM_X,
        SUM(x * x) AS SUM_X2,
        SUM(y) AS SUM_Y,
        SUM(y * y) AS SUM_Y2,
        SUM(x * y) AS SUM_XY
    FROM
        #test) a;

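I just ran this and then noticed that you already have another answer, from "SQL Hacks". I ran both versions and got exactly the same result, but the other version is shorter :D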
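
The two versions agree because one is the other with numerator and denominator both multiplied by n. A short sketch, in standard notation that is not taken from either answer:

```latex
\frac{\sum_i x_i y_i - n\,\bar{x}\,\bar{y}}{\sum_i x_i^2 - n\,\bar{x}^2}
  = \frac{n\sum_i x_i y_i - (n\bar{x})(n\bar{y})}{n\sum_i x_i^2 - (n\bar{x})^2}
  = \frac{n\sum_i x_i y_i - \sum_i x_i \sum_i y_i}{n\sum_i x_i^2 - \left(\sum_i x_i\right)^2}
```

The right-hand side is (N * SUM_XY - SUM_X * SUM_Y) / (N * SUM_X2 - SUM_X * SUM_X) from the query above.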

Here is a working version of the SQL you wrote:

SELECT sum((x-avgx)*(x-avgx)) / sum((x-avgx)*(y-avgy))
FROM TEST, (SELECT sum(X)/count(X) as avgx, sum(Y)/count(Y) as avgy FROM TEST) average;

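I looked up the Excel SLOPE function, and its definition is slightly different: numerator and denominator are swapped, so the sum of (x-avgx)*(y-avgy) is divided by the sum of (x-avgx)*(x-avgx):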

SELECT sum((x-avgx)*(y-avgy)) / sum((x-avgx)*(x-avgx))
FROM TEST, 
    (
        SELECT 
            sum(X)/count(X) as avgx, 
            sum(Y)/count(Y) as avgy 
        FROM TEST
    ) average;
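
If you want to check this without a database server, here is a minimal sketch that runs the query above, and the sum-based query from the earlier answer, against an in-memory SQLite database from Python. SQLite and the helper name slopes() are my own assumptions for illustration; other engines may differ in details such as temp-table syntax or whether a derived table needs an alias.

```python
import sqlite3

# Sample rows from the question.
ROWS = [
    (1, 4.10242258729964),
    (2, 4.57708865242591),
    (3, 5.16785670619896),
    (4, 6.88149559336059),
]

# Centered form: cross join every row with the one-row subquery of averages.
CENTERED_SQL = """
SELECT sum((x - avgx) * (y - avgy)) / sum((x - avgx) * (x - avgx))
FROM TEST,
     (SELECT sum(X) / count(X) AS avgx, sum(Y) / count(Y) AS avgy FROM TEST) average
"""

# Sum-based form from the earlier answer, with an alias on the derived table.
SUMS_SQL = """
SELECT (sxy - n * mx * my) / (sx2 - n * mx * mx)
FROM (SELECT avg(x) AS mx, sum(x * x) AS sx2, sum(x * y) AS sxy,
             avg(y) AS my, count(x) AS n
      FROM TEST) t
"""


def slopes():
    """Return (centered, sums): the slope computed by each query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE TEST (X FLOAT, Y FLOAT)")
        conn.executemany("INSERT INTO TEST (X, Y) VALUES (?, ?)", ROWS)
        centered = conn.execute(CENTERED_SQL).fetchone()[0]
        sums = conn.execute(SUMS_SQL).fetchone()[0]
    finally:
        conn.close()
    return centered, sums


if __name__ == "__main__":
    centered, sums = slopes()
    print("centered form:", centered)
    print("sum-based form:", sums)
```

Both queries should give a slope of roughly 0.8928 for this data, up to floating-point rounding.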

Hope this is what you need :)