MySQL distinct JSON key values

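I have this MySQL table in which one column holds JSON with arbitrary keys/values. With the query below I can get the keys/values for every id, but as you can see, the result contains duplicate packages.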

CREATE TABLE `my_table` (
  `package` mediumtext NOT NULL,
  `id` varchar(255) NOT NULL,
  `time` timestamp NOT NULL DEFAULT current_timestamp(),
  KEY `id` (`id`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb3;
INSERT INTO my_table (id, time, package) VALUES
    ('myhost', '2022-05-08 09:00:00', '{"acl": "2.3.1-1", "apparmor": "2.0.4-2ubuntu2", "at": "3.2.5-1ubuntu1"}'),
    ('myhost', '2022-05-09 09:00:00', '{"acl": "2.3.1-1", "apparmor": "2.0.4-2ubuntu2", "at": "3.2.5-1ubuntu1"}'),
    ('myhost', '2022-05-10 09:00:00', '{"acl": "3.4.5-6", "apparmor": "2.0.4-2ubuntu2", "at": "3.2.5-1ubuntu1"}'),
    ('host123', '2022-05-10 09:00:00', '{"httpd": "2.4.6-97-el7.centos.5", "kpartx": "0.4.9-135.el7_9", "libcap": "2.22-11.el7"}');
select id, time, package from my_table;
+---------+---------------------+------------------------------------------------------------------------------------------+
| id      | time                | package                                                                                  |
+---------+---------------------+------------------------------------------------------------------------------------------+
| myhost  | 2022-05-08 09:00:00 | {"acl": "2.3.1-1", "apparmor": "2.0.4-2ubuntu2", "at": "3.2.5-1ubuntu1"}                 |
| myhost  | 2022-05-09 09:00:00 | {"acl": "2.3.1-1", "apparmor": "2.0.4-2ubuntu2", "at": "3.2.5-1ubuntu1"}                 |
| myhost  | 2022-05-10 09:00:00 | {"acl": "3.4.5-6", "apparmor": "2.0.4-2ubuntu2", "at": "3.2.5-1ubuntu1"}                 |
| host123 | 2022-05-10 09:00:00 | {"httpd": "2.4.6-97-el7.centos.5", "kpartx": "0.4.9-135.el7_9", "libcap": "2.22-11.el7"} |
+---------+---------------------+------------------------------------------------------------------------------------------+
SELECT     id,time,pkg,Json_unquote(Json_extract(package, Concat('$.', pkg))) AS version
FROM       my_table
CROSS JOIN json_table(Json_keys(package,'$'), '$[*]' columns (pkg text path '$')) j
ORDER BY   pkg;
+---------+---------------------+----------+-----------------------+
| id      | time                | pkg      | version               |
+---------+---------------------+----------+-----------------------+
| myhost  | 2022-05-08 09:00:00 | acl      | 2.3.1-1               |
| myhost  | 2022-05-09 09:00:00 | acl      | 2.3.1-1               |
| myhost  | 2022-05-10 09:00:00 | acl      | 3.4.5-6               |
| myhost  | 2022-05-08 09:00:00 | apparmor | 2.0.4-2ubuntu2        |
| myhost  | 2022-05-09 09:00:00 | apparmor | 2.0.4-2ubuntu2        |
| myhost  | 2022-05-10 09:00:00 | apparmor | 2.0.4-2ubuntu2        |
| myhost  | 2022-05-08 09:00:00 | at       | 3.2.5-1ubuntu1        |
| myhost  | 2022-05-09 09:00:00 | at       | 3.2.5-1ubuntu1        |
| myhost  | 2022-05-10 09:00:00 | at       | 3.2.5-1ubuntu1        |
| host123 | 2022-05-10 09:00:00 | httpd    | 2.4.6-97-el7.centos.5 |
| host123 | 2022-05-10 09:00:00 | kpartx   | 0.4.9-135.el7_9       |
| host123 | 2022-05-10 09:00:00 | libcap   | 2.22-11.el7           |
+---------+---------------------+----------+-----------------------+

How can I adjust the query to filter out the duplicate packages? I only want to keep one pkg + version row per id, chosen by time (the most recent one):
+---------+---------------------+----------+-----------------------+
| id      | time                | pkg      | version               |
+---------+---------------------+----------+-----------------------+
| myhost  | 2022-05-10 09:00:00 | acl      | 3.4.5-6               |
| myhost  | 2022-05-10 09:00:00 | apparmor | 2.0.4-2ubuntu2        |
| myhost  | 2022-05-10 09:00:00 | at       | 3.2.5-1ubuntu1        |
| host123 | 2022-05-10 09:00:00 | httpd    | 2.4.6-97-el7.centos.5 |
| host123 | 2022-05-10 09:00:00 | kpartx   | 0.4.9-135.el7_9       |
| host123 | 2022-05-10 09:00:00 | libcap   | 2.22-11.el7           |
+---------+---------------------+----------+-----------------------+

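You can use the ROW_NUMBER window function in a subquery to number the rows within each (id, pkg) group, ordered by time descending, and then keep only the first row of each group, which is the most recent one.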

Query #1

SELECT id,time,pkg,version
FROM (
 SELECT     id,time,pkg,Json_unquote(Json_extract(package, Concat('$.', pkg))) AS version,
            ROW_NUMBER() OVER(PARTITION BY id,pkg ORDER BY time DESC,Json_unquote(Json_extract(package, Concat('$.', pkg)))) rn
 FROM       my_table
 CROSS JOIN json_table(Json_keys(package,'$'), '$[*]' columns (pkg text path '$')) j
) t1
WHERE rn = 1;
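+---------+---------------------+----------+-----------------------+
| id      | time                | pkg      | version               |
+---------+---------------------+----------+-----------------------+
| host123 | 2022-05-10 09:00:00 | httpd    | 2.4.6-97-el7.centos.5 |
| host123 | 2022-05-10 09:00:00 | kpartx   | 0.4.9-135.el7_9       |
| host123 | 2022-05-10 09:00:00 | libcap   | 2.22-11.el7           |
| myhost  | 2022-05-10 09:00:00 | acl      | 3.4.5-6               |
| myhost  | 2022-05-10 09:00:00 | apparmor | 2.0.4-2ubuntu2        |
| myhost  | 2022-05-10 09:00:00 | at       | 3.2.5-1ubuntu1        |
+---------+---------------------+----------+-----------------------+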
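
If it helps to see what the `rn = 1` filter is doing, here is a rough Python sketch of the same logic run outside the database on the sample rows from the question. It is only an illustration, not part of the query: the helper name `latest_versions` is made up, and it assumes the timestamps are fixed-width `YYYY-MM-DD HH:MM:SS` strings so that plain string comparison matches chronological order.

```python
import json

# Sample rows copied from the question: (id, time, package JSON).
rows = [
    ("myhost", "2022-05-08 09:00:00",
     '{"acl": "2.3.1-1", "apparmor": "2.0.4-2ubuntu2", "at": "3.2.5-1ubuntu1"}'),
    ("myhost", "2022-05-09 09:00:00",
     '{"acl": "2.3.1-1", "apparmor": "2.0.4-2ubuntu2", "at": "3.2.5-1ubuntu1"}'),
    ("myhost", "2022-05-10 09:00:00",
     '{"acl": "3.4.5-6", "apparmor": "2.0.4-2ubuntu2", "at": "3.2.5-1ubuntu1"}'),
    ("host123", "2022-05-10 09:00:00",
     '{"httpd": "2.4.6-97-el7.centos.5", "kpartx": "0.4.9-135.el7_9", "libcap": "2.22-11.el7"}'),
]


def latest_versions(rows):
    """Hypothetical helper: keep one (time, version) per (id, pkg), newest time first."""
    best = {}
    for host_id, time, package in rows:
        # Expanding the JSON object plays the role of the JSON_TABLE cross join.
        for pkg, version in json.loads(package).items():
            key = (host_id, pkg)  # same grouping as PARTITION BY id, pkg
            current = best.get(key)
            # Mirrors ORDER BY time DESC, version: newer time wins,
            # and on equal times the smaller version string wins.
            if (current is None
                    or time > current[0]
                    or (time == current[0] and version < current[1])):
                best[key] = (time, version)
    # Return rows as (id, time, pkg, version), sorted by id and then pkg.
    return [(host_id, time, pkg, version)
            for (host_id, pkg), (time, version) in sorted(best.items())]


for row in latest_versions(rows):
    print(row)
```

On the sample data this keeps six rows, one per host and package, with `acl` for `myhost` taken from the 2022-05-10 row. Note that a package which only appears in an older row for a host would still be kept, exactly as with the `ROW_NUMBER` query.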
