Retrieve and remove duplicate values from an associative array
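I have an associative array like the following:
$arr = [1 => 0, 2 => 1, 3 => 1, 4 => 2, 5 => 2, 6 => 3];
I want to remove the duplicated values from the initial array and return them in a new array, grouped by value. So I would end up with something like this:
$arr = [1 => 0, 6 => 3];
$new_arr = [[2 => 1, 3 => 1], [4 => 2, 5 => 2]];
Does PHP provide a function for this, and if not, how would I implement it?
I have tried: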
$array = [];
$array[1] = 5;
$array[2] = 5;
$array[3] = 4;
$array[5] = 6;
$array[7] = 7;
$array[8] = 7;
$counts = array_count_values($array);
print_r($counts);
$duplicates = array_filter($array, function ($value) use ($counts) {
    return $counts[$value] > 1;
});
print_r($duplicates);
$result = array_diff($array, $duplicates);
print_r($result);
This outputs:
[1] => 5
[2] => 5
[7] => 7
[8] => 7
and
[3] => 4
[5] => 6
This is almost what I want.
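Code
The following works for me... I make no promises about complexity or performance, but it shows the general idea... Also, I haven't written PHP in many years, so keep that in mind.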
<?php
function nubDups( $arr ) {
    $seen = [];
    $dups = [];
    foreach ( $arr as $k => $v ) {
        if ( array_key_exists( $v, $seen ) ) {
            // Duplicate found! Start the group with the first key seen.
            if ( !array_key_exists( $v, $dups ) )
                $dups[$v] = [$seen[$v]];
            $dups[$v][] = $k;
        } else {
            // First time seen, record!
            $seen[$v] = $k;
        }
    }
    // Keep only the values that never turned out to be duplicated.
    $uniques = [];
    foreach ( $seen as $v => $k ) {
        if ( !array_key_exists( $v, $dups ) ) $uniques[$k] = $v;
    }
    return [$uniques, $dups];
}
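function nubDups2( $arr ) {
    $seen = $dups = [];
    // each() was deprecated in PHP 7.2 and removed in PHP 8.0, so use foreach.
    foreach ( $arr as $k => $v )
        if ( key_exists( $v, $dups ) ) $dups[$v][] = $k;
        else if ( key_exists( $v, $seen ) ) $dups[$v] = [$seen[$v], $k];
        else $seen[$v] = $k;
    // $seen maps value => first key; drop the duplicated values, then flip back.
    return [array_flip( array_diff_key( $seen, $dups ) ), $dups];
}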
$arr = [0, 1, 4, 1, 2, 2, 3];
print_r( nubDups( $arr ) );
print_r( nubDups2( $arr ) );
Output (the same for both):
$ php Test.php
Array
(
    [0] => 0
    [2] => 4
    [6] => 3
)
Array
(
    [1] => Array
        (
            [0] => 1
            [1] => 3
        )
    [2] => Array
        (
            [0] => 4
            [1] => 5
        )
)
In short
- What remains after removal, given as [(k, v)]: [(0, 0), (2, 4), (6, 3)]
- The duplicates, given as [(v, [k])]: [(1, [1, 3]), (2, [4, 5])]
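The functions above return the duplicates as value => [keys], which is not quite the shape asked for in the question. As a rough sketch only, the grouping could instead be built on array_count_values(), as in the question's own attempt. The name splitDuplicates is made up for illustration, and the sketch assumes all values are ints or strings, since array_count_values() only counts those.

```php
<?php
// Hypothetical helper (the name is illustrative, not from the answer above).
// Returns [uniques, groups]: values that occur once keep their original keys;
// values that occur more than once are grouped together by value.
// Assumes every value is an int or a string.
function splitDuplicates(array $arr): array
{
    $counts  = array_count_values($arr);
    $uniques = [];
    $groups  = [];
    foreach ($arr as $k => $v) {
        if ($counts[$v] > 1) {
            $groups[$v][$k] = $v;   // group by value, keep the original key
        } else {
            $uniques[$k] = $v;
        }
    }
    // array_values() drops the value-based group keys so the result matches
    // the list-of-groups shape from the question.
    return [$uniques, array_values($groups)];
}

[$arr, $new_arr] = splitDuplicates([1 => 0, 2 => 1, 3 => 1, 4 => 2, 5 => 2, 6 => 3]);
print_r($arr);
print_r($new_arr);
```

With the question's input this should leave $arr as [1 => 0, 6 => 3] and $new_arr as [[2 => 1, 3 => 1], [4 => 2, 5 => 2]].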
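In Haskell
The PHP version above abuses hash tables (arrays keyed by value) for fast lookups.
A simpler version that does almost the same thing but ignores the indices, written in Haskell: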
-- | 'nubDupsBy': for a given list yields a pair where the fst contains
-- the list without any duplicates, and snd contains the duplicate elements.
-- This is determined by a user specified binary predicate function.
nubDupsBy :: (a -> a -> Bool) -> [a] -> ([a], [a])
nubDupsBy p = foldl f ([], [])
  where
    f (seen, dups) x
      | any (p x) seen = (seen, dups ++ [x])
      | otherwise      = (seen ++ [x], dups)
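For readers who want to compare it with the PHP code, here is a rough PHP counterpart of the Haskell function. It is a sketch, not part of the original answer: the $same parameter name is an assumption, and like the Haskell version it keeps the first occurrence of each value in the first list and scans linearly instead of using a hash lookup.

```php
<?php
// Rough PHP counterpart of the Haskell nubDupsBy (illustrative sketch).
// $same is a user-supplied equality predicate taking two elements.
function nubDupsBy(callable $same, array $xs): array
{
    $seen = [];
    $dups = [];
    foreach ($xs as $x) {
        // Linear scan: has an equivalent element been seen already?
        $isDup = false;
        foreach ($seen as $s) {
            if ($same($x, $s)) {
                $isDup = true;
                break;
            }
        }
        if ($isDup) {
            $dups[] = $x;
        } else {
            $seen[] = $x;
        }
    }
    return [$seen, $dups];
}

$strictEq = function ($a, $b) { return $a === $b; };
[$seen, $dups] = nubDupsBy($strictEq, [0, 1, 4, 1, 2, 2, 3]);
print_r($seen);
print_r($dups);
```

Unlike nubDups, a duplicated value still appears once in the first list here, which is the "almost the same thing" difference.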