How to clean up an array of arrays with semi-duplicate values based on keys in PHP?
Suppose we are doing some kind of scraping and end up with duplicate and semi-duplicate results.
Given an input array that might look something like this:
$inputArr = [
[
'title' => 'Test0',
'desc' => 'Short Desc',
],
[
'title' => 'Test5',
'desc' => 'Short Desc',
],
[
'title' => 'Test0',
'desc' => 'Much Longer Than Short Desc',
],
[
'title' => 'Test0.5',
'desc' => 'Short Desc',
],
[
'title' => 'Test1',
'desc' => 'Short Desc',
],
[
'title' => 'Test1',
'desc' => 'Much Longer Than Short Desc',
],
[
'title' => 'Test1.5',
'desc' => 'Much Longer Than Short Desc',
],
[
'title' => 'Test3',
'desc' => 'Short Desc',
],
[
'title' => 'Test2',
'desc' => 'Short Desc',
],
[
'title' => 'Test3.75',
'desc' => 'Much Longer Than Short Desc',
],
[
'title' => 'Test3.25',
'desc' => 'Short Desc',
],
[
'title' => 'Test2',
'desc' => 'Much Longer Than Short Desc',
],
[
'title' => 'Test3',
'desc' => 'Much Longer Than Short Desc',
],
[
'title' => 'Test5',
'desc' => 'Much Longer Than Short Desc',
],
[
'title' => 'Test3.5',
'desc' => 'Short Desc',
],
[
'title' => 'Test4',
'desc' => 'Short Desc',
],
[
'title' => 'Test5',
'desc' => 'Much Longer Than Short Desc',
],
[
'title' => 'Test4.5',
'desc' => 'Short Desc',
],
[
'title' => 'Test4',
'desc' => 'Much Longer Than Short Desc',
],
[
'title' => 'Test5',
'desc' => 'Much Longer Than Short Desc',
],
];
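The resulting array must contain only one array for each title value, namely the one whose desc is the longest string. Where several arrays share the same title and their desc values have equal length, all but one of them must be removed.
For example, the final output should look like this: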
$resultArr = [
[
'title' => 'Test0',
'desc' => 'Much Longer Than Short Desc',
],
[
'title' => 'Test0.5',
'desc' => 'Short Desc',
],
[
'title' => 'Test1',
'desc' => 'Much Longer Than Short Desc',
],
[
'title' => 'Test1.5',
'desc' => 'Much Longer Than Short Desc',
],
[
'title' => 'Test2',
'desc' => 'Much Longer Than Short Desc',
],
[
'title' => 'Test3',
'desc' => 'Much Longer Than Short Desc',
],
[
'title' => 'Test3.25',
'desc' => 'Short Desc',
],
[
'title' => 'Test3.5',
'desc' => 'Short Desc',
],
[
'title' => 'Test3.75',
'desc' => 'Much Longer Than Short Desc',
],
[
'title' => 'Test4',
'desc' => 'Much Longer Than Short Desc',
],
[
'title' => 'Test4.5',
'desc' => 'Short Desc',
],
[
'title' => 'Test5',
'desc' => 'Much Longer Than Short Desc',
],
];
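I have tried several different solutions, but I don't like any of them. Whatever I come up with feels like a mess, and I think I'm missing an obvious and elegant solution.
I'm sure someone will suggest something cleaner than the sorting, looping, and filtering I have tried.
You can do it like this: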
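$result = [];
foreach ($inputArr as $item) {
    // Skip this item when a strictly longer desc is already stored for its title.
    if (isset($result[$item['title']]) && strlen($result[$item['title']]['desc']) > strlen($item['desc'])) {
        continue;
    }
    $result[$item['title']] = $item;
}
// Drop the title keys to get a plain list again.
$result = array_values($result);
print_r($result);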
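This builds a new associative array keyed by title. Loop over the original array: when the key already exists and the stored desc is longer than the current one, skip the item. Otherwise, store the current item in the result array, replacing any existing entry for that title.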
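To make the tie case concrete, here is a minimal sketch of the same idea wrapped in a function. The name keepLongestDesc and the sample data are hypothetical, not from the question. Note one difference: the loop above replaces the stored entry when the lengths are equal (the last one wins), whereas this sketch replaces only on a strictly longer desc, so the first one wins. Either choice satisfies "all but one".

```php
<?php
// Hypothetical helper: keeps one entry per title, preferring the longest desc.
function keepLongestDesc(array $items): array
{
    $byTitle = [];
    foreach ($items as $item) {
        $title = $item['title'];
        // Store the item when the title is new or its desc is strictly longer,
        // so on a tie the first entry seen is kept.
        if (!isset($byTitle[$title])
            || strlen($item['desc']) > strlen($byTitle[$title]['desc'])) {
            $byTitle[$title] = $item;
        }
    }
    // Drop the title keys and return a plain list.
    return array_values($byTitle);
}

// Made-up sample: 'Much longer' and 'Same length' are both 11 bytes long.
$sample = [
    ['title' => 'A', 'desc' => 'Short'],
    ['title' => 'B', 'desc' => 'Only one'],
    ['title' => 'A', 'desc' => 'Much longer'],
    ['title' => 'A', 'desc' => 'Same length'],
];
print_r(keepLongestDesc($sample));
```

Here title A ends up with 'Much longer', because the later entry of equal length does not replace it. strlen counts bytes; if desc may contain multibyte text and character counts matter, mb_strlen would be the usual substitute.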
You can also use array_reduce:
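$result = array_reduce($inputArr, function ($c, $i) {
    // Store the item when its title is new or its desc is longer than the stored one.
    if (!isset($c[$i['title']]) || strlen($c[$i['title']]['desc']) < strlen($i['desc'])) {
        $c[$i['title']] = $i;
    }
    return $c;
}, []); // start from an empty array rather than the default null
$result = array_values($result);
print_r($result);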
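Both versions keep titles in the order they were first seen, while the expected output in the question is listed in title order. If that ordering matters (an assumption; the question does not say so), a sort can be added as a final step. The sketch below uses a small stand-in for $result and plain strcmp, which happens to produce the order shown in the question for titles of this shape; other title formats may need a different comparison.

```php
<?php
// Assumed stand-in for the de-duplicated $result, in first-seen order.
$result = [
    ['title' => 'Test0', 'desc' => 'Much Longer Than Short Desc'],
    ['title' => 'Test5', 'desc' => 'Much Longer Than Short Desc'],
    ['title' => 'Test0.5', 'desc' => 'Short Desc'],
    ['title' => 'Test1', 'desc' => 'Much Longer Than Short Desc'],
];

// Order the entries by title using byte-wise string comparison.
usort($result, function (array $a, array $b): int {
    return strcmp($a['title'], $b['title']);
});

// Show just the titles to check the order.
print_r(array_column($result, 'title'));
```

This lists the titles as Test0, Test0.5, Test1, Test5.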