Ramdajs, group array with arguments
The list to group:
const arr = [
{
"Global Id": "1231",
"TypeID": "FD1",
"Size": 160,
"Flöde": 55,
},
{
"Global Id": "5433",
"TypeID": "FD1",
"Size": 160,
"Flöde": 100,
},
{
"Global Id": "50433",
"TypeID": "FD1",
"Size": 120,
"Flöde": 100,
},
{
"Global Id": "452",
"TypeID": "FD2",
"Size": 120,
"Flöde": 100,
},
]
Function input specifying which keys to group on:
const columns = [
{
"dataField": "TypeID",
"summarize": false,
},
{
"dataField": "Size",
"summarize": false,
},
{
"dataField": "Flöde",
"summarize": true,
},
]
Expected output:
const output = [
{
"TypeID": "FD1",
"Size": 160,
"Flöde": 155 // 55 + 100
"nrOfItems": 2
},
{
"TypeID": "FD1",
"Size": 120,
"Flöde": 100,
"nrOfItems": 1
},
{
"TypeID": "FD2",
"Size": 120,
"Flöde": 100,
"nrOfItems": 1
}
]
// nrOfItems adds up to 4 (2 + 1 + 1), the total number of items.
The function:
const groupArr = (columns) => R.pipe(...);
"summarize"
属性 告诉 属性 是否应该总结。
数据集非常大,超过 100k 个项目。所以我不想重复不必要的东西。
I have looked at R.groupBy, but I'm not sure whether it applies here?
Maybe R.reduce? Storing the groups in the accumulator, summing the values and adding to the count if the group already exists? The groups would need to be found quickly, so store them as keys?
Or is vanilla JavaScript better in this case?
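A minimal sketch of the reduce idea described above, in plain JavaScript: one pass, with the groups stored under a string key in the accumulator. The names and the hard-coded key lists here are illustrative only, not the final `groupArr` function:

```javascript
// One-pass grouping sketch: groups live in a Map keyed by the
// joined group-column values; sums and counts are updated in place.
const rows = [
  { TypeID: "FD1", Size: 160, "Flöde": 55 },
  { TypeID: "FD1", Size: 160, "Flöde": 100 },
  { TypeID: "FD1", Size: 120, "Flöde": 100 },
  { TypeID: "FD2", Size: 120, "Flöde": 100 },
];
const groupKeys = ["TypeID", "Size"]; // columns with summarize: false
const sumKeys = ["Flöde"];            // columns with summarize: true

const keyOf = row => groupKeys.map(k => row[k]).join("|");

const grouped = rows.reduce((acc, row) => {
  const k = keyOf(row);
  if (!acc.has(k)) {
    // First row of a new group: copy the group columns, start the sums.
    const entry = { nrOfItems: 0 };
    groupKeys.forEach(g => { entry[g] = row[g]; });
    sumKeys.forEach(s => { entry[s] = 0; });
    acc.set(k, entry);
  }
  const entry = acc.get(k);
  entry.nrOfItems += 1;
  sumKeys.forEach(s => { entry[s] += row[s]; });
  return acc;
}, new Map());

const output = [...grouped.values()];
console.log(output);
```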
Answering with vanilla JavaScript first, since I'm not very familiar with the Ramda API. I'm fairly sure the approach translates closely to Ramda.
The code has comments explaining every step. I'll try to follow up with a Ramda rewrite.
const arr=[{"Global Id":"1231",TypeID:"FD1",Size:160,"Flöde":55},{"Global Id":"5433",TypeID:"FD1",Size:160,"Flöde":100},{"Global Id":"50433",TypeID:"FD1",Size:120,"Flöde":100},{"Global Id":"452",TypeID:"FD2",Size:120,"Flöde":100}],columns=[{dataField:"TypeID",summarize:!1},{dataField:"Size",summarize:!1},{dataField:"Flöde",summarize:!0}];
// The columns that don't summarize
// give us the keys we need to group on
const groupKeys = columns
.filter(c => c.summarize === false)
.map(g => g.dataField);
// We compose a hash function that creates
// a hash out of all the items' properties
// that are in our groupKeys
const groupHash = groupKeys
.map(k => x => x[k])
.reduce(
(f, g) => x => `${f(x)}___${g(x)}`,
() => "GROUPKEY"
);
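For example, with `groupKeys = ["TypeID", "Size"]`, the composed hash function behaves like this (a standalone copy of the same fold, for illustration):

```javascript
// Each key becomes a getter, and the getters are folded into a
// single string-building function, starting from "GROUPKEY".
const keys = ["TypeID", "Size"];
const hash = keys
  .map(k => x => x[k])
  .reduce(
    (f, g) => x => `${f(x)}___${g(x)}`,
    () => "GROUPKEY"
  );
console.log(hash({ TypeID: "FD1", Size: 160 })); // "GROUPKEY___FD1___160"
```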
// The columns that summarize tell us which
// properties to sum for the items within the
// same group
const sumKeys = columns
.filter(c => c.summarize === true)
.map(c => c.dataField);
// Again, we compose into a single function.
// This function concats two items, taking the
// "last" item and applying the sum
// logic only for keys in sumKeys
const concats = sumKeys
.reduce(
(f, k) => (a, b) => Object.assign(f(a, b), {
[k]: (a[k] || 0) + b[k]
}),
(a, b) => Object.assign({}, a, b)
)
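To illustrate, the composed concatenator merges two rows, letting the second row win for ordinary keys while summing the keys listed in sumKeys (standalone copy of the same fold):

```javascript
// With sumKeys = ["Flöde"], merging two rows overwrites ordinary
// keys but adds the Flöde values together.
const sumKeys = ["Flöde"];
const concats = sumKeys.reduce(
  (f, k) => (a, b) => Object.assign(f(a, b), {
    [k]: (a[k] || 0) + b[k]
  }),
  (a, b) => Object.assign({}, a, b)
);
const merged = concats(
  { TypeID: "FD1", Size: 160, "Flöde": 55 },
  { TypeID: "FD1", Size: 160, "Flöde": 100 }
);
console.log(merged); // { TypeID: "FD1", Size: 160, "Flöde": 155 }
```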
// Now, we take our data and group by the groupHash
const groups = arr.reduce(
(groups, x) => {
const k = groupHash(x);
if (!groups[k]) groups[k] = [x];
else groups[k].push(x);
return groups;
},
{}
);
// These are the keys we want our final objects to have...
const allKeys = ["nrTotal"]
.concat(groupKeys)
.concat(sumKeys);
// ...baked in to a helper to remove other keys
const cleanKeys = obj => Object.assign(
...allKeys.map(k => ({ [k]: obj[k] }))
);
// With the items neatly grouped, we can reduce each
// group using the composed concatenator
const items = Object
.values(groups)
.flatMap(
xs => cleanKeys(
xs.reduce(concats, { nrTotal: xs.length })
),
);
console.log(items);
Here's an attempt at porting this to Ramda, but I didn't get much further than replacing the vanilla JS methods with their Ramda equivalents. I'd love to hear about the cool utilities and functional concepts I've missed! I'm sure someone who knows Ramda better will chime in!
const arr=[{"Global Id":"1231",TypeID:"FD1",Size:160,"Flöde":55},{"Global Id":"5433",TypeID:"FD1",Size:160,"Flöde":100},{"Global Id":"50433",TypeID:"FD1",Size:120,"Flöde":100},{"Global Id":"452",TypeID:"FD2",Size:120,"Flöde":100}],columns=[{dataField:"TypeID",summarize:!1},{dataField:"Size",summarize:!1},{dataField:"Flöde",summarize:!0}];
const [ sumCols, groupCols ] = R.partition(
R.prop("summarize"),
columns
);
const groupKeys = R.pluck("dataField", groupCols);
const sumKeys = R.pluck("dataField", sumCols);
const grouper = R.reduce(
(f, g) => x => `${f(x)}___${g(x)}`,
R.always("GROUPKEY"),
R.map(R.prop, groupKeys)
);
const reducer = R.reduce(
(f, k) => (a, b) => R.mergeRight(
f(a, b),
{ [k]: (a[k] || 0) + b[k] }
),
R.mergeRight,
sumKeys
);
const allowedKeys = new Set(
[ "nrTotal" ].concat(sumKeys).concat(groupKeys)
);
const cleanKeys = R.pipe(
R.toPairs,
R.filter(([k, v]) => allowedKeys.has(k)),
R.fromPairs
);
const items = R.flatten(
R.values(
R.map(
xs => cleanKeys(
R.reduce(
reducer,
{ nrTotal: xs.length },
xs
)
),
R.groupBy(grouper, arr)
)
)
);
console.log(items);
<script src="https://cdnjs.cloudflare.com/ajax/libs/ramda/0.26.1/ramda.min.js"></script>
This was my original approach. Everything except summarize is a helper function, and I suppose they could be inlined if you really wanted to. I find the separation cleaner.
const getKeys = (val) => pipe (
filter (propEq ('summarize', val) ),
pluck ('dataField')
)
const keyMaker = (columns, keys = getKeys (false) (columns)) => pipe (
pick (keys),
JSON .stringify
)
const makeReducer = (
columns,
toSum = getKeys (true) (columns),
toInclude = getKeys (false) (columns),
) => (a, b) => ({
...mergeAll (map (k => ({ [k]: b[k] }), toInclude ) ),
...mergeAll (map (k => ({ [k]: (a[k] || 0) + b[k] }), toSum ) ),
nrOfItems: (a .nrOfItems || 0) + 1
})
const summarize = (columns) => pipe (
groupBy (keyMaker (columns) ),
values,
map (reduce (makeReducer (columns), {} ))
)
const arr = [{"Flöde": 55, "Global Id": "1231", "Size": 160, "TypeID": "FD1"}, {"Flöde": 100, "Global Id": "5433", "Size": 160, "TypeID": "FD1"}, {"Flöde": 100, "Global Id": "50433", "Size": 120, "TypeID": "FD1"}, {"Flöde": 100, "Global Id": "452", "Size": 120, "TypeID": "FD2"}]
const columns = [{"dataField": "TypeID", "summarize": false}, {"dataField": "Size", "summarize": false}, {"dataField": "Flöde", "summarize": true}]
console .log (
summarize (columns) (arr)
)
<script src="https://bundle.run/ramda@0.26.1"></script><script>
const {pipe, filter, propEq, pluck, pick, mergeAll, map, groupBy, values, reduce} = ramda</script>
There's a lot of overlap with Joe's solution, but also some real differences. His was already posted when I saw the question, but I didn't want it to influence my own approach, so I wrote the above before looking at it. Note how our hash functions differ. Mine JSON.stringifys values like {TypeID: "FD1", Size: 160}, while Joe's creates "GROUPKEY___FD1___160". I think I prefer mine for its simplicity. On the other hand, Joe's solution definitely handles nrOfItems better than mine does. I update it on every reduce iteration, and have to use || 0 to handle the initial case; Joe simply starts the fold with the already-known value. But overall the solutions are quite similar.
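The two key strategies can be compared side by side in plain JavaScript (an illustrative sketch, with a hand-rolled pick in place of Ramda's):

```javascript
// Comparing the two grouping keys for a single row.
const row = { "Global Id": "1231", TypeID: "FD1", Size: 160, "Flöde": 55 };
const groupKeys = ["TypeID", "Size"];

// JSON.stringify of the picked properties (my approach):
const pick = (keys, obj) =>
  Object.fromEntries(keys.map(k => [k, obj[k]]));
const jsonKey = JSON.stringify(pick(groupKeys, row));
console.log(jsonKey); // '{"TypeID":"FD1","Size":160}'

// Delimiter-joined string (Joe's approach):
const delimKey = ["GROUPKEY", ...groupKeys.map(k => row[k])].join("___");
console.log(delimKey); // "GROUPKEY___FD1___160"
```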
You mentioned wanting to reduce the number of passes over the data. The way I write Ramda code tends not to help with that. This code iterates the whole list once to group it into similar items, then iterates each of those groups to fold it into an individual value. (There is probably also a minor iteration inside values.) These could certainly be changed to combine the two iterations. It might even shorten the code. But in my opinion it would become harder to understand.
Update
I was curious about a single-pass approach, and found that I could reuse all the infrastructure I'd built for the multi-pass version, rewriting only the main function:
const summarize2 = (columns) => (
arr,
makeKey = keyMaker (columns),
reducer = makeReducer (columns)
) => values (reduce (
(a, item, key = makeKey (item) ) => assoc (key, reducer (key in a ? a[key]: {}, item), a),
{},
arr
))
console .log (
summarize2 (columns) (arr)
)
I wouldn't choose this over the original unless testing showed this code to be a bottleneck in my application. But it wasn't as complex as I'd feared, and it does everything in a single iteration (well, except for whatever values does). Interestingly, my helper code only works for this version because I never need to know the total size of a group up front; that wouldn't be the case if I'd used Joe's approach.