Concatenate rows based on a number (Google Refine, Excel/Google Spreadsheet)
My CSV file has a large number of rows that look like this:
name a,1
name b,1
name c,1
name d,2
name e,2
I need to concatenate the rows based on the number. The result should be:
name a|name b|name c
name d|name e
How can this be done in Google Refine or Excel/Google Spreadsheet?
I have been thinking about it but cannot find a way.
Thank you very much!
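If you can use Python, this is easy to do. In the code below, the names and groups are read from "input.csv", and the grouped names (along with the group) are written to "output.csv". A defaultdict is used to create the empty lists that store the group members.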
import collections
import csv

# Map each group number to the list of names that belong to it.
grouped = collections.defaultdict(list)

with open('input.csv', newline='') as fp:
    reader = csv.reader(fp)
    for row in reader:
        name, group = row
        grouped[group].append(name)

# Write one line per group: the group key followed by its names, separated by '|'.
with open('output.csv', 'w', newline='') as fp:
    writer = csv.writer(fp, delimiter='|')
    for key in sorted(grouped.keys()):
        writer.writerow([key] + grouped[key])
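Note that the code above writes the group number as the first field of each line. If only the names are wanted, exactly as in the question's expected result, a small variation might look like the sketch below. The function name concat_by_group and the in-memory sample are assumptions made for illustration, not part of the original answer.

```python
import collections
import csv
import io


def concat_by_group(rows, sep="|"):
    """Group names by their number and join each group's names with `sep`."""
    grouped = collections.defaultdict(list)
    for name, group in rows:
        grouped[group].append(name)
    # Dicts keep insertion order, so groups come out in first-seen order.
    return [sep.join(names) for names in grouped.values()]


# Sample data from the question, read from memory instead of a file.
sample = "name a,1\nname b,1\nname c,1\nname d,2\nname e,2\n"
lines = concat_by_group(csv.reader(io.StringIO(sample)))
print("\n".join(lines))
```

For the sample rows this prints "name a|name b|name c" followed by "name d|name e". The names are joined with a plain string join rather than csv.writer, so a name that itself contains "|" would not be quoted.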
Here is a proposal using OpenRefine. The only GREL formula I used is:
row.record.cells['myColumn'].value.join('|')
Here is the JSON, assuming your first column is named "myColumn" and your second column is named "number":
[
  {
    "op": "core/column-addition",
    "description": "Create column test at index 2 based on column number using expression grel:value",
    "engineConfig": {
      "mode": "row-based",
      "facets": [
        {
          "omitError": false,
          "expression": "isBlank(value)",
          "selectBlank": false,
          "selection": [
            {
              "v": {
                "v": false,
                "l": "false"
              }
            }
          ],
          "selectError": false,
          "invert": false,
          "name": "ee",
          "omitBlank": false,
          "type": "list",
          "columnName": "ee"
        }
      ]
    },
    "newColumnName": "test",
    "columnInsertIndex": 2,
    "baseColumnName": "number",
    "expression": "grel:value",
    "onError": "set-to-blank"
  },
  {
    "op": "core/column-move",
    "description": "Move column test to position 0",
    "columnName": "test",
    "index": 0
  },
  {
    "op": "core/blank-down",
    "description": "Blank down cells in column test",
    "engineConfig": {
      "mode": "row-based",
      "facets": [
        {
          "omitError": false,
          "expression": "isBlank(value)",
          "selectBlank": false,
          "selection": [
            {
              "v": {
                "v": false,
                "l": "false"
              }
            }
          ],
          "selectError": false,
          "invert": false,
          "name": "ee",
          "omitBlank": false,
          "type": "list",
          "columnName": "ee"
        }
      ]
    },
    "columnName": "test"
  },
  {
    "op": "core/column-addition",
    "description": "Create column concatenation at index 2 based on column myColumn using expression grel:row.record.cells['myColumn'].value.join('|')",
    "engineConfig": {
      "mode": "row-based",
      "facets": [
        {
          "omitError": false,
          "expression": "isBlank(value)",
          "selectBlank": false,
          "selection": [
            {
              "v": {
                "v": false,
                "l": "false"
              }
            }
          ],
          "selectError": false,
          "invert": false,
          "name": "ee",
          "omitBlank": false,
          "type": "list",
          "columnName": "ee"
        }
      ]
    },
    "newColumnName": "concatenation",
    "columnInsertIndex": 2,
    "baseColumnName": "myColumn",
    "expression": "grel:row.record.cells['myColumn'].value.join('|')",
    "onError": "set-to-blank"
  }
]
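To make the idea behind these operations easier to follow, here is a rough Python emulation of the steps: copy the number column, blank down repeated values so that each remaining non-blank cell starts a record, then join the names within each record. This is only an illustrative sketch, not OpenRefine code; the helper names blank_down and join_per_record are made up for this example. It returns one joined string per record, whereas OpenRefine stores the result in a new column. Like blank down itself, it assumes that rows with the same number are adjacent.

```python
def blank_down(values):
    """Replace a cell with None when it repeats the cell directly above it."""
    result = []
    previous = object()  # sentinel that never equals a real cell value
    for value in values:
        result.append(None if value == previous else value)
        previous = value
    return result


def join_per_record(keys, names, sep="|"):
    """Treat each non-blank key as the start of a record and join its names."""
    records = []
    for key, name in zip(keys, names):
        if key is not None or not records:
            records.append([])
        records[-1].append(name)
    return [sep.join(record) for record in records]


# Sample data from the question, split into its two columns.
names = ["name a", "name b", "name c", "name d", "name e"]
numbers = ["1", "1", "1", "2", "2"]

keys = blank_down(numbers)
joined = join_per_record(keys, names)
print(keys)
print(joined)
```

With the sample data, keys becomes ["1", None, None, "2", None] and joined becomes ["name a|name b|name c", "name d|name e"]. If the file is not already sorted by number, sort it first, otherwise the same number will start more than one record.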