Select multiple keys and values from a NESTED column

I am exporting Google Analytics 4 events to a BigQuery table. I can retrieve data based on a single key, but how do I get the value stored under another key in the same record?

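Specifically, I want to rank articles by view count and show the author's name in a separate column as supplementary data (the author's name is stored in the same record, but under a different key in the nested column).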

Environment

Google Analytics 4

Events are set up in Google Tag Manager

In the table schema, event_params.key holds keys such as article_name and author_name, and event_params.value.string_value holds the values I want to retrieve.

A preview of the table looks like this:

+-----+------------+-----------------+--------------+------------------+---------------------------------+
| Row | event_date | event_timestamp | event_name   | event_params.key | event_params.value.string_value |
+-----+------------+-----------------+--------------+------------------+---------------------------------+
| 1   | 20201127   | 160394324324231 | view_article | article_name     | My Article A                    |
|     |            |                 |              | author_name      | Author A                        |
|     |            |                 |              | random key1      | random value1                   |
|     |            |                 |              | random key2      | random value2                   |
| 2   | 20201127   | 160394324324112 | view_article | article_name     | My Article B                    |
|     |            |                 |              | author_name      | Author B                        |
|     |            |                 |              | random key1      | random value3                   |
|     |            |                 |              | random key2      | random value4                   |
 ...
+-----+------------+-----------------+--------------+------------------+---------------------------------+

What I tried

I was able to get the article ranking on its own, without the author name.

#standardSQL

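WITH _data AS (
    SELECT 
        value.string_value AS article_name 
    FROM 
        `my-new-project.analytics_000000000.events_*`, 
        UNNEST(event_params) 
    WHERE 
        -- event name as shown in the table preview
        event_name = 'view_article'
        -- keep only the parameter that holds the article name
        AND key = 'article_name'
)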

SELECT
    article_name,
    COUNT(*) AS cnt
FROM
    _data
GROUP BY
    1
ORDER BY
    2 DESC

Result:

+-----+--------------+-----+
| Row | article_name | cnt |
+-----+--------------+-----+
| 1   | My Article A | 20  |
| 2   | My Article D | 18  |
| 3   | My Article C | 11  |
| 4   | My Article B | 9   |
  ...
+-----+--------------+-----+

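I want to add an author_name column next to article_name here, so I thought CASE WHEN would be a good idea. However, author_name turned out to be null everywhere, which probably means the two keys are treated as separate records.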

#standardSQL

WITH _data AS (
    SELECT 
        CASE WHEN key = 'article_name' THEN value.string_value
        END AS article_name,
        CASE WHEN key = 'author_name' THEN value.string_value
        END AS author_name
    FROM 
        `my-new-project.analytics_000000000.events_*`, 
        UNNEST(event_params) 
    WHERE 
        key = 'article_name'
)

SELECT
    article_name,
    MAX(author_name),
    COUNT(*) AS cnt
FROM
    _data
GROUP BY
    1
ORDER BY
    3 DESC

Result:

+-----+--------------+-------------+-----+
| Row | article_name | author_name | cnt |
+-----+--------------+-------------+-----+
| 1   | My Article A | null        | 20  |
| 2   | My Article D | null        | 18  |
| 3   | My Article C | null        | 11  |
| 4   | My Article B | null        | 9   |
  ...
+-----+--------------+-------------+-----+

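When I GROUP BY author_name instead and sort in descending order, the author names appear correctly, but this time article_name is all null. Is it possible to get article_name and author_name from the same record, so that the author name appears next to the article name in the same ranking?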

It looks like you are filtering out the author_name rows. How about changing the WHERE clause in the CTE to WHERE key IN ('article_name', 'author_name')?

So your query should look like this:

#standardSQL

WITH _data AS (
    SELECT 
        CASE WHEN key = 'article_name' THEN value.string_value
        END AS article_name,
        CASE WHEN key = 'author_name' THEN value.string_value
        END AS author_name
    FROM 
        `my-new-project.analytics_000000000.events_*`, 
        UNNEST(event_params) 
    WHERE 
        key IN ( 'article_name' , 'author_name') 
)

SELECT
    article_name,
    MAX(author_name),
    COUNT(*) AS cnt
FROM
    _data
GROUP BY
    1
ORDER BY
    3 DESC
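
A caveat with the IN filter: UNNEST(event_params) still yields one row per parameter, so on any given row only one of the two CASE columns can be non-null. One variation that keeps the CTE shape is to unnest event_params twice, once per key. Below is a minimal sketch, assuming each event carries exactly one article_name and one author_name parameter; the inline events CTE is made-up sample data standing in for the real export table, and the aliases article and author are just illustrative names.

```sql
#standardSQL

-- Hypothetical sample data shaped like the GA4 export (only the fields used here).
WITH events AS (
    SELECT
        'view_article' AS event_name,
        [
            STRUCT('article_name' AS key, STRUCT(article AS string_value) AS value),
            STRUCT('author_name' AS key, STRUCT(author AS string_value) AS value)
        ] AS event_params
    FROM UNNEST([
        STRUCT('My Article A' AS article, 'Author A' AS author),
        STRUCT('My Article A' AS article, 'Author A' AS author),
        STRUCT('My Article B' AS article, 'Author B' AS author)
    ])
),

_data AS (
    SELECT
        article.value.string_value AS article_name,
        author.value.string_value AS author_name
    FROM
        events,
        -- Unnest the same array twice so both keys end up on one row per event.
        UNNEST(event_params) AS article,
        UNNEST(event_params) AS author
    WHERE
        event_name = 'view_article'
        AND article.key = 'article_name'
        AND author.key = 'author_name'
)

SELECT
    article_name,
    author_name,
    COUNT(*) AS cnt
FROM
    _data
GROUP BY
    1, 2
ORDER BY
    cnt DESC
-- On the sample rows: (My Article A, Author A, 2), then (My Article B, Author B, 1)
```

Because this behaves like an inner join, events that lack either key drop out of the count, and an event with a repeated key would be counted more than once.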

I think you are looking for a subselect solution:

SELECT
    (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'article_name') AS article_name,
    (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'author_name') AS author_name,
    COUNT(1) AS cnt
FROM 
    `my-new-project.analytics_000000000.events_*`
GROUP BY 1, 2
ORDER BY 3 DESC
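
To try the subselect approach without touching the real table, here is a small sketch on inline data. The events CTE is made-up sample data, and the event name view_article is taken from the table preview in the question. The added event_name filter is my assumption about what is wanted: without it, events that carry neither key would be grouped into a single (null, null) row.

```sql
#standardSQL

-- Hypothetical sample data: three article views plus one unrelated event.
WITH events AS (
    SELECT
        event_name,
        [
            STRUCT('article_name' AS key, STRUCT(article AS string_value) AS value),
            STRUCT('author_name' AS key, STRUCT(author AS string_value) AS value)
        ] AS event_params
    FROM UNNEST([
        STRUCT('view_article' AS event_name, 'My Article A' AS article, 'Author A' AS author),
        STRUCT('view_article' AS event_name, 'My Article A' AS article, 'Author A' AS author),
        STRUCT('view_article' AS event_name, 'My Article B' AS article, 'Author B' AS author)
    ])
    UNION ALL
    SELECT
        'page_view',
        [STRUCT('random key1' AS key, STRUCT('random value1' AS string_value) AS value)]
)

SELECT
    -- Each scalar subquery looks inside the current event's own array,
    -- so both values come from the same record.
    (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'article_name') AS article_name,
    (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'author_name') AS author_name,
    COUNT(*) AS cnt
FROM
    events
WHERE
    event_name = 'view_article'
GROUP BY
    1, 2
ORDER BY
    cnt DESC
-- On the sample rows: (My Article A, Author A, 2), then (My Article B, Author B, 1)
```

Against the real export, replace the events CTE with the events_* table; with a wildcard table, a filter on _TABLE_SUFFIX can limit which daily tables are scanned. Note that a scalar subquery raises an error if a key appears more than once within a single event.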