PHP array_merge() with preference of first array and unique values only?
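I want to merge several arrays so that values from the first array take precedence and the result contains only unique values. Is there a faster way than combining array_merge(), array_unique(), and the + operator?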
function foo(...$params) {
    $a = [
        'col1',
        'col2_alias' => 'col2',
        'col3'
    ];
    $merged = array_merge($a, ...$params);
    $unique = array_unique($merged);
    print_r($merged);
    print_r($unique);
    print_r($a + $unique);
}

foo(
    ['col4', 'col5_alias' => 'col5', 'col6'],
    ['col7', 'col1', 'col5_alias' => 'col5', 'col2_alias' => 'col10']);
Merging the arrays alone gives me duplicate values and overwrites values from the first array:
Array
(
[0] => col1 // duplicate
[col2_alias] => col10 // overwritten
[1] => col3
[2] => col4
[col5_alias] => col5
[3] => col6
[4] => col7
[5] => col1 // duplicate
)
Using array_unique() removes the duplicate values, but it does not restore the overwritten value:
Array
(
[0] => col1
[col2_alias] => col10
[1] => col3
[2] => col4
[col5_alias] => col5
[3] => col6
[4] => col7
)
After applying the + operator, the array looks the way I want:
Array
(
[0] => col1
[col2_alias] => col2
[1] => col3
[2] => col4
[col5_alias] => col5
[3] => col6
[4] => col7
)
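To make the difference between the two operations concrete, here is a minimal sketch on a cut-down version of the arrays above (the variable names are only illustrative). array_merge() renumbers integer keys and lets later string keys overwrite earlier ones, while + keeps the left-hand value whenever a key already exists:

```php
<?php
$first  = ['col1', 'col2_alias' => 'col2'];
$second = ['col7', 'col2_alias' => 'col10'];

// array_merge(): integer keys are renumbered, so 'col7' is appended as key 1;
// the later 'col2_alias' overwrites the earlier one in place.
$merged = array_merge($first, $second);

// +: keys already present on the left win, so key 0 and 'col2_alias'
// keep their values from $first and nothing from $second is added.
$union = $first + $second;

print_r($merged);
print_r($union);
```

This is why the question merges first (to renumber the integer keys) and only then applies + to restore the first array's values.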
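I don't actually see any major problem with your script, and I'm not sure why you want to improve it. Still, I wrote an alternative implementation of your function that seems to run a bit faster. Take a look (I also added some extra arguments to exercise the functions):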
<?php
function foo(...$params) {
    $a = [
        'col1',
        'col2_alias' => 'col2',
        'col3'
    ];
    $merged = array_merge($a, ...$params);
    $unique = array_unique($merged);
    return $a + $unique;
}

function foo2(...$params) {
    $a = [
        'col1',
        'col2_alias' => 'col2',
        'col3'
    ];
    $merged = array_merge(array_diff(array_merge(...$params), $a), $a);
    return $merged;
}

$timeFoo = microtime(true);
for ($i = 0; $i < 1000000; $i++) {
    foo(
        ['col13', 'col5_alias' => 'col3', 'col8'],
        ['col21', 'col5_alias' => 'col1', 'col9'],
        ['col4', 'col5_alias' => 'col5', 'col6'],
        ['col7', 'col1', 'col5_alias' => 'col5', 'col2_alias' => 'col10']);
}
$timeFoo = microtime(true) - $timeFoo;

$timeFoo2 = microtime(true);
for ($i = 0; $i < 1000000; $i++) {
    foo2(
        ['col13', 'col5_alias' => 'col3', 'col8'],
        ['col21', 'col5_alias' => 'col1', 'col9'],
        ['col4', 'col5_alias' => 'col5', 'col6'],
        ['col7', 'col1', 'col5_alias' => 'col5', 'col2_alias' => 'col10']);
}
$timeFoo2 = microtime(true) - $timeFoo2;

echo "'foo' time: $timeFoo \n";
echo "'foo2' time: $timeFoo2 \n";
The results vary a little between runs, but not by much:
'foo' time: 3.4310319423676
'foo2' time: 2.5314350128174
So it gives us a performance gain of roughly 26%.
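Before relying on the faster variant, it may be worth checking that both approaches agree on the data you care about. The sketch below uses two hypothetical helpers, viaUnion() and viaDiff(), which mirror foo() and foo2() but take the preferred array as a parameter, and compares them on the arrays from the question:

```php
<?php
// Mirrors foo(): merge, de-duplicate, then restore the first array's values.
function viaUnion(array $a, array ...$params): array
{
    return $a + array_unique(array_merge($a, ...$params));
}

// Mirrors foo2(): drop values already in $a, then append $a.
function viaDiff(array $a, array ...$params): array
{
    return array_merge(array_diff(array_merge(...$params), $a), $a);
}

$a  = ['col1', 'col2_alias' => 'col2', 'col3'];
$p1 = ['col4', 'col5_alias' => 'col5', 'col6'];
$p2 = ['col7', 'col1', 'col5_alias' => 'col5', 'col2_alias' => 'col10'];

$union = viaUnion($a, $p1, $p2);
$diff  = viaDiff($a, $p1, $p2);

// Compare the values only, ignoring keys and order.
$u = array_values($union);
$d = array_values($diff);
sort($u);
sort($d);

var_dump($u === $d);                                  // same set of values
var_dump(array_keys($union) === array_keys($diff));   // key order differs
```

For this input the two return the same values, but the keys and their order differ. Note also that array_diff() only removes values that occur in $a; it does not de-duplicate values repeated among the other arrays under different keys, so the two functions are not interchangeable for every input.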
You are right that using array_merge, array_unique, and the + operator together is slow. I wrote some code to measure the speed of each combination.
Here is the code:
<?php
class ArraySpeeds
{
    public $la = ['col1', 'col2_alias' => 'col2', 'col3'];
    public $a = ['col4', 'col5_alias' => 'col5', 'col6'];
    public $b = ['col7', 'col1', 'col5_alias' => 'col5', 'col2_alias' => 'col10'];
    public $c = [];

    public function executionTime ($callback)
    {
        $start = microtime (true);
        for ($i = 0; $i < 1000000; $i++) {
            $callback ();
        }
        return round ((microtime (true) - $start) * 1000) . '/ms' . PHP_EOL;
    }

    public function getTimes ()
    {
        $array_merge_time = $this->executionTime (function () {
            $this->c[0] = array_merge ($this->la, $this->a, $this->b);
        });
        $array_unique_time = $this->executionTime (function () {
            $merged = array_merge ($this->la, $this->a, $this->b);
            $this->c[1] = array_unique ($merged);
        });
        $addition_time = $this->executionTime (function () {
            $merged = array_merge ($this->la, $this->a, $this->b);
            $unique = array_unique ($merged);
            $this->c[2] = $this->la + $unique;
        });
        $array_diff_time = $this->executionTime (function () {
            $merged = array_merge ($this->a, $this->b);
            $diffed = array_diff ($merged, $this->la);
            $this->c[3] = array_merge ($diffed, $this->la);
        });

        echo print_r ($this->c[0], true), PHP_EOL;
        echo print_r ($this->c[1], true), PHP_EOL;
        echo print_r ($this->c[2], true), PHP_EOL;
        natsort ($this->c[3]);
        echo print_r ($this->c[3], true), PHP_EOL;

        echo 'array_merge: ', $array_merge_time;
        echo 'array_unique: ', $array_unique_time;
        echo 'addition: ', $addition_time;
        echo 'array_diff: ', $array_diff_time;
    }
}

$arrayspeeds = new ArraySpeeds ();
$arrayspeeds->getTimes ();
Here is the output:
Array
(
[0] => col1
[col2_alias] => col10
[1] => col3
[2] => col4
[col5_alias] => col5
[3] => col6
[4] => col7
[5] => col1
)
Array
(
[0] => col1
[col2_alias] => col10
[1] => col3
[2] => col4
[col5_alias] => col5
[3] => col6
[4] => col7
)
Array
(
[0] => col1
[col2_alias] => col2
[1] => col3
[2] => col4
[col5_alias] => col5
[3] => col6
[4] => col7
)
Array
(
[3] => col1
[col2_alias] => col2
[4] => col3
[0] => col4
[col5_alias] => col5
[1] => col6
[2] => col7
)
array_merge: 403/ms
array_unique: 1039/ms
addition: 1267/ms
array_diff: 993/ms
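You can see that the execution time grows with each added function call. The combination of array_merge, array_unique, and the + operator is the slowest, taking roughly three times as long as array_merge alone (1267 ms versus 403 ms).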
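If you want to repeat this kind of measurement yourself, a small timing helper is enough. The sketch below is only an illustration: timeIt() is a hypothetical name, and it uses hrtime() (available since PHP 7.3) rather than microtime(), since hrtime() is a monotonic clock. Absolute numbers will differ by machine and PHP version.

```php
<?php
// Runs $fn repeatedly and returns the elapsed wall time in milliseconds.
function timeIt(callable $fn, int $iterations = 100000): float
{
    $start = hrtime(true);               // nanoseconds
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    return (hrtime(true) - $start) / 1e6; // nanoseconds to milliseconds
}

$la = ['col1', 'col2_alias' => 'col2', 'col3'];
$a  = ['col4', 'col5_alias' => 'col5', 'col6'];

$ms = timeIt(function () use ($la, $a) {
    array_merge($la, $a);
});

echo 'array_merge: ', round($ms, 2), ' ms', PHP_EOL;
```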
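Using array_diff, however, performs reasonably well (993 ms here) and yields the correct values, but not in the right order. Calling natsort on the result fixes the ordering for this data. Note that natsort orders by value and preserves keys, so the integer keys no longer run in sequence, as the last array in the output above shows.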
For example:
function foo (...$params)
{
    $a = [
        'col1',
        'col2_alias' => 'col2',
        'col3'
    ];
    $diff = array_diff (array_merge (...$params), $a);
    $merged = array_merge ($diff, $a);
    natsort ($merged);
    print_r ($merged);
}
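If sorting by value is not acceptable for your data, another option is a single pass that tracks the values already seen. This is only a sketch and has not been benchmarked against the versions above; mergePreferFirst() is a hypothetical name, and the code assumes every value is a string or an integer (a requirement of array_flip()). One behavioral difference to be aware of: when the same string key appears in several of the later arrays, this sketch keeps the first occurrence, whereas array_merge() keeps the last.

```php
<?php
function mergePreferFirst(array $first, array ...$others): array
{
    $result = $first;
    // Flip values into keys so membership checks are a hash lookup.
    // Assumption: all values are strings or integers.
    $seen = array_flip($first);

    foreach ($others as $other) {
        foreach ($other as $key => $value) {
            if (isset($seen[$value])) {
                continue; // value already present, skip the duplicate
            }
            if (is_string($key)) {
                if (array_key_exists($key, $result)) {
                    continue; // an earlier array already owns this key
                }
                $result[$key] = $value;
            } else {
                $result[] = $value; // append with the next integer key
            }
            $seen[$value] = true;
        }
    }
    return $result;
}

$result = mergePreferFirst(
    ['col1', 'col2_alias' => 'col2', 'col3'],
    ['col4', 'col5_alias' => 'col5', 'col6'],
    ['col7', 'col1', 'col5_alias' => 'col5', 'col2_alias' => 'col10']
);
print_r($result);
```

For the arrays from the question this produces the same keys, values, and order as the `$a + $unique` result, without needing a sort.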