Using GROUP BY and JOIN-ing non-aggregated columns in a query
I know there are a lot of similar questions, but reading them hasn't solved my problem. I'd appreciate some pointers.
Below is some sample data from my dummy table:
id | foo | bar | baz | moo | ins_date | percentage | yes | no | maybe
--: | :------- | :------ | :----- | :------- | :------------------ | ---------: | ---: | ---: | ----:
38 | foothing | bar_one | pizazz | amoosing | 2018-05-26 06:59:00 | 81 | 25 | 529 | 196
41 | foothing | bar_one | pizazz | amoosing | 2018-05-29 06:43:00 | 83 | 441 | 144 | 49
23 | foothing | bar_one | pizazz | amoosing | 2018-06-24 08:48:00 | 62 | 9 | 1 | 16
20 | foothing | bar_one | pizazz | amoosing | 2018-06-27 10:37:00 | 94 | 676 | 16 | 400
65 | foothing | bar_one | pizazz | amoosing | 2018-07-01 08:34:00 | 92 | 121 | 64 | 225
68 | foothing | bar_one | pizazz | amoosing | 2018-07-04 01:46:00 | 91 | 324 | 25 | 289
71 | foothing | bar_one | pizazz | amoosing | 2018-07-06 23:44:00 | 65 | 196 | 676 | 100
74 | foothing | bar_one | pizazz | amoosing | 2018-07-10 09:41:00 | 92 | 1024 | 121 | 81
77 | foothing | bar_one | pizazz | amoosing | 2018-07-13 06:47:00 | 64 | 576 | 169 | 1
96 | foothing | bar_one | pizazz | amoosing | 2018-08-02 10:34:00 | 78 | 1369 | 256 | 81
99 | foothing | bar_one | pizazz | amoosing | 2018-08-04 08:25:00 | 82 | 2809 | 9 | 256
102 | foothing | bar_one | pizazz | amoosing | 2018-08-07 06:49:00 | 87 | 576 | 9 | 676
105 | foothing | bar_one | pizazz | amoosing | 2018-08-10 03:29:00 | 68 | 4225 | 1089 | 196
108 | foothing | bar_one | pizazz | amoosing | 2018-08-13 03:59:00 | 92 | 1156 | 169 | 484
111 | foothing | bar_one | pizazz | amoosing | 2018-08-16 05:34:00 | 63 | 1764 | 100 | 108
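In a single query, I want to:
- Filter all rows down to an ins_date range of my choosing
- Get the maximum ins_date for each group of foo, bar, baz and moo
- Be able to filter rows by foo, bar, baz and moo
- Additionally show the non-grouped values in the query, such as percentage, yes, no and maybe.
Taken together, this is proving complicated. So far I've managed to achieve the first 3 points in the query below, which hopefully explains what I'm looking for: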
SELECT
    s.foo,
    s.bar,
    s.baz,
    s.moo,
    MAX(s.ins_date) mdate
FROM
(
    SELECT *
    FROM dummy
    WHERE ins_date
        -- My arbitrary date range goes here
        BETWEEN '2018-07-01 00:00:00'
        AND '2019-11-01 23:59:59'
) s
GROUP BY foo, bar, baz, moo
-- I could add other filters into the 'HAVING' clause
HAVING moo LIKE "%moo%"
    AND baz = "baz"
This gives the output:
foo | bar | baz | moo | mdate
:------- | :------ | :-- | :---------------- | :------------------
foothing | bar_one | baz | amoosing | 2018-11-29 05:31:00
foothing | bar_one | baz | mooman_being | 2019-04-21 10:31:00
foothing | bar_one | baz | strawberry_moosse | 2019-03-17 06:37:00
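In this example, if I were to change the date limits to show only dates between 2018-05-01 and 2018-05-29, the mdate in the first row would show 2018-05-29 06:43:00, because that is the latest (most recent) date within that range for that particular foo/bar/baz/moo grouping.
But I can't attach the other columns that are not part of that grouping. I've tried using a JOIN ...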
SELECT
    s1.foo,
    s1.bar,
    s1.baz,
    s1.moo,
    MAX(s1.ins_date) mdate,
    s2.percentage,
    s2.yes,
    s2.maybe,
    s2.no
FROM
(
    SELECT *
    FROM dummy
    WHERE ins_date
        -- My arbitrary date range goes here
        BETWEEN '2018-07-01 00:00:00'
        AND '2019-11-01 23:59:59'
) s1
-- Attempting to do a self-join to get the non-aggregated columns
INNER JOIN
(
    SELECT id, percentage, yes, maybe, no
    FROM dummy
) s2
ON s2.id = s1.id
GROUP BY foo, bar, baz, moo
-- I could add other filters into the 'HAVING' clause
HAVING moo LIKE "%moo%"
    AND baz = "baz"
But this returns the error:
Expression #6 of SELECT list is not in GROUP BY clause and contains nonaggregated column 's2.percentage' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by
How can I add the non-aggregated columns to the query without breaking it?
I'm using MySQL 5.7, so MySQL 8 options aren't available.
Fiddle below:
https://dbfiddle.uk/?rdbms=mysql_5.7&fiddle=1980dd582c2235dc0938cb14c781e3c6
Maybe by aggregating the additional columns?
SELECT foo, bar, baz, moo
, MAX(ins_date) AS mdate
, AVG(percentage) AS avg_perc
, MAX(yes) AS YEAHBABY
, MAX(maybe) AS MAYBEBABY
, MAX(no) AS NONONONONOOO
FROM dummy dum
WHERE ins_date BETWEEN '2018-07-01 00:00:00'
AND '2019-11-01 23:59:59'
AND moo LIKE '%moo%'
AND baz = 'baz'
GROUP BY foo, bar, baz, moo
foo | bar | baz | moo | mdate | avg_perc | YEAHBABY | MAYBEBABY | NONONONONOOO
:------- | :------ | :-- | :---------------- | :------------------ | -------: | -------: | --------: | -----------:
foothing | bar_one | baz | amoosing | 2018-11-29 05:31:00 | 82.8000 | 11236 | 625 | 841
foothing | bar_one | baz | mooman_being | 2019-04-21 10:31:00 | 70.0000 | 3969 | 16 | 121
foothing | bar_one | baz | strawberry_moosse | 2019-03-17 06:37:00 | 80.0000 | 23716 | 529 | 49
db<>fiddle here
Or join on the grouped fields and the maximum date.
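A minimal sketch of the join approach, assuming the dummy table from the question: a derived table (aliased latest here, a name chosen for illustration) computes the maximum ins_date per foo/bar/baz/moo group inside the date range, and the outer query joins back to dummy on the group columns plus that date to pick up the non-aggregated columns.

```sql
-- Sketch only: "latest" and "d" are illustrative aliases, not from the question.
SELECT d.foo, d.bar, d.baz, d.moo,
       d.ins_date AS mdate,
       d.percentage, d.yes, d.maybe, d.no
FROM dummy d
INNER JOIN (
    -- Latest ins_date per group within the chosen date range
    SELECT foo, bar, baz, moo, MAX(ins_date) AS mdate
    FROM dummy
    WHERE ins_date BETWEEN '2018-07-01 00:00:00'
                       AND '2019-11-01 23:59:59'
      AND moo LIKE '%moo%'
      AND baz = 'baz'
    GROUP BY foo, bar, baz, moo
) latest
    ON  latest.foo   = d.foo
    AND latest.bar   = d.bar
    AND latest.baz   = d.baz
    AND latest.moo   = d.moo
    AND latest.mdate = d.ins_date;
```

Because every selected column comes from a single row of dummy, nothing conflicts with only_full_group_by. If two rows in a group share the same maximum ins_date, both are returned, and groups with NULL in any of the grouping columns will not match on the equality join.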
Or use an emulated row_number.
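One way to emulate a row number in MySQL 5.7 without window functions is a correlated count: for each row, count how many rows of the same group within the date range have a later ins_date, then keep the rows where that count plus one equals 1. This is a sketch under the question's table and filters; the aliases ranked, d, d2 and rn are illustrative. Strictly speaking it behaves like RANK() rather than ROW_NUMBER(), since rows tied on ins_date get the same number.

```sql
-- Sketch only: emulates a per-group ranking with a correlated COUNT(*).
SELECT foo, bar, baz, moo,
       ins_date AS mdate,
       percentage, yes, maybe, no
FROM (
    SELECT d.*,
           (SELECT COUNT(*)
            FROM dummy d2
            WHERE d2.foo = d.foo
              AND d2.bar = d.bar
              AND d2.baz = d.baz
              AND d2.moo = d.moo
              AND d2.ins_date BETWEEN '2018-07-01 00:00:00'
                                  AND '2019-11-01 23:59:59'
              AND d2.ins_date > d.ins_date) + 1 AS rn  -- 1 = latest row in its group
    FROM dummy d
    WHERE d.ins_date BETWEEN '2018-07-01 00:00:00'
                         AND '2019-11-01 23:59:59'
      AND d.moo LIKE '%moo%'
      AND d.baz = 'baz'
) ranked
WHERE rn = 1;
```

Changing the final condition to rn <= 3 would return the three most recent rows per group instead. User variables are another common emulation, but the MySQL manual does not guarantee the evaluation order of expressions that assign them, so the correlated count is used here at the cost of more work per row.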
Or use EXISTS.
SELECT foo, bar, baz, moo
, ins_date
, percentage
, yes, maybe, no
FROM dummy dum
WHERE EXISTS (
SELECT 1
FROM dummy dum2
WHERE dum2.ins_date BETWEEN '2018-07-01 00:00:00'
AND '2019-11-01 23:59:59'
AND dum2.moo LIKE '%moo%'
AND dum2.baz = 'baz'
AND dum2.foo = dum.foo
AND dum2.bar = dum.bar
AND dum2.baz = dum.baz
AND dum2.moo = dum.moo
GROUP BY foo, bar, baz, moo
HAVING MAX(dum2.ins_date) = dum.ins_date
);
foo | bar | baz | moo | ins_date | percentage | yes | maybe | no
:------- | :------ | :-- | :---------------- | :------------------ | ---------: | ----: | ----: | --:
foothing | bar_one | baz | strawberry_moosse | 2019-03-17 06:37:00 | 80 | 23716 | 529 | 49
foothing | bar_one | baz | mooman_being | 2019-04-21 10:31:00 | 70 | 3969 | 16 | 121
foothing | bar_one | baz | amoosing | 2018-11-29 05:31:00 | 97 | 9025 | 361 | 1
db<>fiddle here