How to remove duplicate values on Google Data Studio

Question

I have a dimension named products (a column in a Google Sheet) with the following values:

product = [apple , apple_old_2019, pineapple , pineapple_old_2020, pineapple_old_2017 ...]

I then need to apply a regular expression that removes the pattern old_****, and then aggregate the values by name.

In Google Sheets I would replace the values and then use the UNIQUE formula, but Google Data Studio has no such function.

I created a custom field named Product_pre with the following formula:

REGEXP_EXTRACT(Product , '^(.+?)(_old_[0-9]{2}-[0-9]{4})' )

Then I created another custom field with the following formula:

CASE
    WHEN Product_pre_process is null THEN Product
    ELSE Product_pre_process 
END

The problem is that the result contains duplicate values:

product_processed = [apple , apple, pineapple , pineapple, pineapple ...]

How can I solve this?

Answer 1

1) Extract the first word
The REGEXP_EXTRACT function below does the trick (it extracts all characters from the start of each string up to the first instance of _):

REGEXP_EXTRACT(Product , "^([^_]*)")
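The pattern can be sanity-checked outside Data Studio. The following is only a sketch in Python: the `re` module is not the RE2 engine that Data Studio uses, although a pattern this simple should behave the same in both, and the sample values are those listed in the question. The helper name `extract_first_word` is made up for illustration.

```python
import re

# Sample values copied from the question.
products = [
    "apple",
    "apple_old_2019",
    "pineapple",
    "pineapple_old_2020",
    "pineapple_old_2017",
]

# Same pattern as in the REGEXP_EXTRACT call: everything before the first "_".
FIRST_WORD = re.compile(r"^([^_]*)")

def extract_first_word(value):
    """Return the text before the first underscore (hypothetical helper)."""
    # The pattern can match an empty string, so a match object is always returned.
    return FIRST_WORD.match(value).group(1)

cleaned = [extract_first_word(p) for p in products]
print(cleaned)
```

One caveat of this approach: a product name that itself contains an underscore would be cut at that underscore as well.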

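2) Aggregate
If the chart type is Table, removing the other dimensions and keeping only the newly created dimension causes the metric values to be aggregated by the two values in that dimension (apple and pineapple).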
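The effect of keeping only the new dimension can be pictured as a group-by. The sketch below imitates it in Python under stated assumptions: the metric values are invented for illustration, and summing is assumed as the aggregation, whereas in Data Studio the aggregation depends on how the metric is configured.

```python
import re
from collections import defaultdict

# Hypothetical rows of (Product, metric); the metric values are made up.
rows = [
    ("apple", 10),
    ("apple_old_2019", 5),
    ("pineapple", 7),
    ("pineapple_old_2020", 3),
    ("pineapple_old_2017", 2),
]

totals = defaultdict(int)
for product, metric in rows:
    # Same extraction as the new dimension: text before the first "_".
    key = re.match(r"^([^_]*)", product).group(1)
    # Rows that share a key collapse into one table row (sum assumed).
    totals[key] += metric

totals = dict(totals)
print(totals)
```

The duplicates disappear because the table groups rows by the dimension value, with no separate deduplication step.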

A Google Data Studio Report and a GIF image to visualize the above:

[GIF: How to remove duplicate values on google data studio]