How to add an extra field to the dataset in Vega-Lite
My dataset is an array of the following form:
[
{ "DATE" : "2020-01-02", "COUNTRY" : "Spain", "COUNT" : 110 },
{ ... },
{ ... }
]
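There are multiple countries and multiple days. There are no gaps in the dates.
I want to inject a field DAYS_PASSED (to then use it for the X axis), computed with the following algorithm:
- Look up the value of DAYS_PASSED for the previous day in the same country and assign it to a variable TEMP (if there is no previous day, assume 0);
- Compute DAYS_PASSED using the following formula: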
if TEMP > 0, then DAYS_PASSED = TEMP + 1
else-if COUNT > 100 then DAYS_PASSED = 1
else DAYS_PASSED = 0
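The recurrence above can be sketched in plain Python, as a reference to check any Vega-Lite version against. This is only a minimal sketch, not the preprocessing code actually used: the function name add_days_passed, the threshold parameter, and the three sample rows are illustrative assumptions.

```python
from itertools import groupby
from operator import itemgetter


def add_days_passed(rows, threshold=100):
    """Return copies of rows with DAYS_PASSED added (hypothetical helper)."""
    out = []
    # ISO dates sort correctly as strings, so sort by (COUNTRY, DATE).
    ordered = sorted(rows, key=itemgetter("COUNTRY", "DATE"))
    for _, group in groupby(ordered, key=itemgetter("COUNTRY")):
        temp = 0  # previous day's DAYS_PASSED; 0 when there is no previous day
        for row in group:
            if temp > 0:
                days = temp + 1
            elif row["COUNT"] > threshold:
                days = 1
            else:
                days = 0
            out.append({**row, "DAYS_PASSED": days})
            temp = days
    return out


# Illustrative rows only (not real data).
sample = [
    {"DATE": "2020-01-01", "COUNTRY": "Spain", "COUNT": 50},
    {"DATE": "2020-01-02", "COUNTRY": "Spain", "COUNT": 110},
    {"DATE": "2020-01-03", "COUNTRY": "Spain", "COUNT": 90},
]
sample_days = [r["DAYS_PASSED"] for r in add_days_passed(sample)]
print(sample_days)
```

Note that once the counter has started it keeps increasing, even if COUNT later drops back below the threshold, as on the third sample row.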
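So far I have done this in a preprocessing step (outside Vega-Lite), but I am wondering whether the calculation can be moved into Vega-Lite, perhaps by plugging in a JavaScript function somehow?
I would also like to expose the 100 in the COUNT > 100 condition in the chart, so that the user can adjust it, for example to 200.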
You can do this with a sequence of transforms; for example:
"transform": [
{"calculate": "toDate(datum.DATE)", "as": "date"},
{"calculate": "datum.COUNT <= 100", "as": "pre100"},
{
"joinaggregate": [{"op": "sum", "field": "pre100", "as": "offset"}],
"groupby": ["COUNTRY"]
},
{
"window": [{"op": "count", "as": "daysPassed"}],
"groupby": ["COUNTRY"],
"sort": [{"field": "date"}]
},
{"calculate": "max(0, datum.daysPassed - datum.offset)", "as": "daysPassed"}
],
Here is a more complete example, shown with a small dataset (vega editor):
{
"data": {
"values": [
{"DATE": "2020-02-02", "COUNTRY": "Spain", "COUNT": 50},
{"DATE": "2020-02-03", "COUNTRY": "Spain", "COUNT": 70},
{"DATE": "2020-02-04", "COUNTRY": "Spain", "COUNT": 110},
{"DATE": "2020-02-05", "COUNTRY": "Spain", "COUNT": 150},
{"DATE": "2020-02-06", "COUNTRY": "Spain", "COUNT": 200},
{"DATE": "2020-02-02", "COUNTRY": "Italy", "COUNT": 90},
{"DATE": "2020-02-03", "COUNTRY": "Italy", "COUNT": 100},
{"DATE": "2020-02-04", "COUNTRY": "Italy", "COUNT": 140},
{"DATE": "2020-02-05", "COUNTRY": "Italy", "COUNT": 190},
{"DATE": "2020-02-06", "COUNTRY": "Italy", "COUNT": 250}
]
},
"transform": [
{"calculate": "toDate(datum.DATE)", "as": "date"},
{"calculate": "datum.COUNT <= 100", "as": "pre100"},
{
"joinaggregate": [{"op": "sum", "field": "pre100", "as": "offset"}],
"groupby": ["COUNTRY"]
},
{
"window": [{"op": "count", "as": "daysPassed"}],
"groupby": ["COUNTRY"],
"sort": [{"field": "date"}]
},
{"calculate": "max(0, datum.daysPassed - datum.offset)", "as": "daysPassed"}
],
"concat": [
{
"mark": "line",
"encoding": {
"x": {"field": "DATE", "type": "temporal"},
"y": {"field": "COUNT", "type": "quantitative"},
"color": {"field": "COUNTRY", "type": "nominal"}
}
},
{
"mark": "line",
"transform": [{"filter": "datum.daysPassed > 0"}],
"encoding": {
"x": {"field": "daysPassed", "type": "quantitative"},
"y": {"field": "COUNT", "type": "quantitative"},
"color": {"field": "COUNTRY", "type": "nominal"}
}
}
]
}
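To see what the transform chain computes, it can help to trace the same steps outside Vega-Lite. The following is a rough Python sketch of the calculate / joinaggregate / window / calculate sequence applied to the example data; the function name days_passed_via_transforms and the threshold parameter are assumptions made for illustration. It flags COUNT <= threshold, so that a count exactly equal to the threshold does not start the counter, in line with the question's COUNT > 100 rule.

```python
from collections import defaultdict


def days_passed_via_transforms(rows, threshold=100):
    """Mimic the transform chain in plain Python (hypothetical helper)."""
    # calculate: flag days that are still at or below the threshold
    flagged = [{**r, "pre": r["COUNT"] <= threshold} for r in rows]

    # joinaggregate (groupby COUNTRY): number of flagged days per country
    offset = defaultdict(int)
    for r in flagged:
        offset[r["COUNTRY"]] += int(r["pre"])

    # window count (groupby COUNTRY, sorted by date): 1-based row position,
    # followed by the final calculate: max(0, position - offset)
    position = defaultdict(int)
    out = []
    for r in sorted(flagged, key=lambda r: (r["COUNTRY"], r["DATE"])):
        position[r["COUNTRY"]] += 1
        days = max(0, position[r["COUNTRY"]] - offset[r["COUNTRY"]])
        out.append({"DATE": r["DATE"], "COUNTRY": r["COUNTRY"],
                    "COUNT": r["COUNT"], "daysPassed": days})
    return out


# Same values as the example spec above.
counts = {
    "Spain": [50, 70, 110, 150, 200],
    "Italy": [90, 100, 140, 190, 250],
}
rows = [
    {"DATE": f"2020-02-0{day}", "COUNTRY": country, "COUNT": count}
    for country, values in counts.items()
    for day, count in enumerate(values, start=2)
]

result = days_passed_via_transforms(rows)
spain = [r["daysPassed"] for r in result if r["COUNTRY"] == "Spain"]
italy = [r["daysPassed"] for r in result if r["COUNTRY"] == "Italy"]
print(spain)
print(italy)
```

Because the offset is simply the total number of at-or-below-threshold days per country, this approach matches the question's day-by-day rule only when all of those days come before the first day above the threshold, as they would for a cumulative count. If COUNT can dip back under the threshold later, the two give different results.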