Compare host name from array of URLs and get unique values

Question

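I need to compare URLs and remove duplicates from an array, but I only want to compare the host part of each URL. When comparing, I need to ignore the difference between http and https, the www prefix, and details such as a trailing slash. So when I have this array: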

    $urls = array(
        'http://www.google.com/test',
        'https://www.google.com/test',
        'https://www.google.com/example',
        'https://www.facebook.com/example',
        'http://www.facebook.com/example'
    );

the result should be only:

http://www.google.com/test
http://www.google.com/example
http://www.facebook.com/example

I tried comparing them like this:

$urls = array_udiff($urls, $urls, function ($a, $b) {
    return strcmp(
        preg_replace('|^https?://(www\.)?|', '', rtrim($a, '/')),
        preg_replace('|^https?://(www\.)?|', '', rtrim($b, '/'))
    );
});

But it returns an empty array.
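
For context: array_udiff() returns the values of the first array that are not present in the other arrays, so diffing an array against itself always comes back empty. The sketch below first reproduces that, then shows one possible alternative that keys each URL by its normalized form and keeps the first URL seen for each key. The helper name normalizeUrl is hypothetical; it only wraps the pattern and rtrim() call from the attempt above.

```php
<?php
$urls = array(
    'http://www.google.com/test',
    'https://www.google.com/test',
    'https://www.google.com/example',
    'https://www.facebook.com/example',
    'http://www.facebook.com/example',
);

// Hypothetical helper: reduce a URL to the form used for comparison,
// reusing the pattern and rtrim() call from the attempt above.
function normalizeUrl($url) {
    return preg_replace('|^https?://(www\.)?|', '', rtrim($url, '/'));
}

// Diffing an array against itself leaves nothing behind.
$diff = array_udiff($urls, $urls, function ($a, $b) {
    return strcmp(normalizeUrl($a), normalizeUrl($b));
});

// Key each URL by its normalized form; the first URL seen for a key wins.
$unique = array();
foreach ($urls as $url) {
    $key = normalizeUrl($url);
    if (!isset($unique[$key])) {
        $unique[$key] = $url;
    }
}
$unique = array_values($unique);

print_r($unique);
```

Because the first occurrence wins, the kept URLs retain whatever scheme they had in the input, which may differ from the scheme shown in the expected list above.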

Answer 1

Try this approach:

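<?php
// Collect the distinct host names found in a list of URLs.
function parseURLs(array $urls){
    $rs = [];
    foreach($urls as $url){
        // Returns the host as a string, null if absent, false if malformed
        $host = parse_url($url, PHP_URL_HOST);
        if(!is_string($host))
            continue;
        if(!in_array($host, $rs, true))
            $rs[] = $host;
    }
    return $rs;
}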

Then:

<?php
$urls = array(
    'http://www.google.com',
    'https://www.google.com',
    'https://www.google.com/',
    'https://www.facebook.com',
    'http://www.facebook.com'
);
$uniqueURLs = parseURLs($urls);
print_r($uniqueURLs);

/* result :
Array
(
    [0] => www.google.com
    [1] => www.facebook.com
)
*/
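
Note that the result above still contains the www prefix and returns host names rather than URLs. As a rough sketch of one way to go further, assuming it is acceptable to keep the first URL seen for each host, the hypothetical helper uniqueByHost below lower-cases the host and strips a leading "www." before comparing:

```php
<?php
// Hypothetical helper: return the first URL seen for each distinct host,
// treating "www.example.com" and "example.com" as the same host.
function uniqueByHost(array $urls) {
    $seen = array();
    foreach ($urls as $url) {
        $host = parse_url($url, PHP_URL_HOST);
        if (!is_string($host)) {
            continue; // malformed URL or no host component
        }
        // Host names are case-insensitive; drop one leading "www." label.
        $key = preg_replace('/^www\./', '', strtolower($host));
        if (!isset($seen[$key])) {
            $seen[$key] = $url;
        }
    }
    return $seen;
}

$result = uniqueByHost(array(
    'http://www.google.com',
    'https://www.google.com',
    'https://www.google.com/',
    'https://www.facebook.com',
    'http://www.facebook.com',
));
print_r($result);
```

The returned array is keyed by the normalized host, so array_values() can be applied if a plain list is preferred.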

Answer 2

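<?php
$urls = array(
    'http://www.google.com/test',
    'https://www.google.com/test',
    'https://www.google.com/example',
    'https://www.facebook.com/example',
    'http://www.facebook.com/example');

$MyArray = [];
for ($i = 0; $i < count($urls); $i++) {
    // Capture everything after "www." (host and path); the dot is escaped
    // so that it matches a literal "."
    preg_match_all('/www\.(.*)/', $urls[$i], $matches);

    // $matches[1] is itself an array of captures, so each stored entry
    // is a nested array
    if (!in_array($matches[1], $MyArray))
        $MyArray[] = $matches[1];
}

echo "<pre>";
print_r($MyArray);
echo "</pre>";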

The output is:

Array
(
    [0] => Array
        (
            [0] => google.com/test
        )

    [1] => Array
        (
            [0] => google.com/example
        )

    [2] => Array
        (
            [0] => facebook.com/example
        )

)

Then trim each entry so that only the host name remains.
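
A minimal sketch of that trimming step, staying with a regular expression: preg_match() (a single match instead of preg_match_all()) captures only the host, which gives a flat list. The pattern is an assumption for illustration; it treats the scheme and "www." as optional and does not handle URLs that carry user credentials.

```php
<?php
$urls = array(
    'http://www.google.com/test',
    'https://www.google.com/test',
    'https://www.google.com/example',
    'https://www.facebook.com/example',
    'http://www.facebook.com/example',
);

$hosts = array();
foreach ($urls as $url) {
    // Optional scheme, optional "www.", then capture everything up to the
    // first "/", ":", "?" or "#".
    if (preg_match('~^(?:https?://)?(?:www\.)?([^/:?#]+)~i', $url, $m)) {
        $hosts[] = strtolower($m[1]);
    }
}

// array_unique() keeps the first occurrence; array_values() reindexes.
$hosts = array_values(array_unique($hosts));
print_r($hosts);
```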

Answer 3

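You need to loop through the URLs, parse each one with PHP's parse_url() function, and use array_unique() to remove the duplicates from the array, so that we are comparing both the host and the path.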
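
As a quick illustration of what parse_url() hands back (the variable names here are only for this example), note that the "path" key is missing entirely when the URL has nothing after the host:

```php
<?php
// Split a URL into its components.
$parts = parse_url('https://www.google.com/test');
print_r($parts); // contains the keys "scheme", "host" and "path"

// A URL with nothing after the host has no "path" key at all.
$noPath = parse_url('https://www.google.com');
$hasPath = isset($noPath['path']);
var_dump($hasPath);
```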

I wrote a class for you:

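<?php
/** Get unique values from an array of URLs **/
class Parser {
    // URL parser function
    public function arrayValuesUrlParser($urls) {
        // Create container
        $parsed = [];
        // Loop through the URLs
        foreach($urls as $url) {
            $parse = parse_url($url);
            // Skip malformed URLs and URLs without a host
            if (!is_array($parse) || !isset($parse["host"])) {
                continue;
            }
            // Compare on host + path; the path may be absent
            $parsed[] = $parse["host"].($parse["path"] ?? "");
        }
        // Delete duplicates once, after the loop
        $result = array_unique($parsed);
        // Dump result
        print_r($result);
    }

}

?>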

Using the class:

<?php
// Include the Parser
include_once "Parser.php";

$urls = array(
    'http://www.google.com/test',
    'https://www.google.com/test',
    'https://www.google.com/example',
    'https://www.facebook.com/example',
    'http://www.facebook.com/example');
// Instantiate
$parse = new Parser();
$parse->arrayValuesUrlParser($urls);

?>

If you don't need separate files, you can do all of this in a single file, but in that case you must remove the include_once. This class is also on PHP Classes, just for fun!

Good luck!

php

arrays

unique

distinct