preg_match not working when trying to detect multiple URLs

I want to automatically detect links in a string and replace each one with [index of link]. For example, given a string like test https://www.google.com/ mmh http://whosebug.com/, the result should be test [0] mmh [1].

This is what I have tried so far:

$reg_exUrl = '/\b(?:(?:https?|ftp):\/\/|www\.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|]/i';
if(preg_match($reg_exUrl, $_POST['commento'], $url)) {
    for ($i = 0; $i < count($url); $i++) { 
        $_POST['commento'] = preg_replace($reg_exUrl, "[" . $i . "]", $_POST['commento']);
    }
}

But I keep getting test [0] mmh [0], and if I try var_dump(count($url)) the result is always 1. How can I fix this?

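A better solution here is to split the incoming string into an array of the non-URL strings that sit between the URLs, and then insert [$i] between each pair of consecutive non-URL segments.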

# better solution, perform a split.
function process_line2($input) {
    $regex_url = '/\b(?:(?:https?|ftp):\/\/|www\.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|]/i';
    # split the incoming string into an array of non-url segments
    # preg_split does not trim leading or trailing empty segments
    $non_url_segments = preg_split($regex_url, $input, -1);

    # inside the array, combine each successive non-url segment
    # with the next index
    $out = [];
    $count = count($non_url_segments);
    for ($i = 0; $i < $count; $i++) {
        # add the segment
        array_push($out, $non_url_segments[$i]);
        # add its index surrounded by brackets on all segments but the last one
        if ($i < $count -1) {
            array_push($out, '[' . $i . ']');
        }
    }
    # join strings with no whitespace
    return implode('', $out);
}
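To make the splitting step concrete, here is a minimal sketch of what preg_split returns for the sample string from the question. The variable names are only illustrative; the pattern is the same one used in process_line2.

```php
<?php
// Same URL pattern as in process_line2 above.
$regex_url = '/\b(?:(?:https?|ftp):\/\/|www\.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|]/i';

$input = "test https://www.google.com/ mmh http://whosebug.com/";

// Two URLs produce three non-URL segments. The last segment is an
// empty string because the input ends with a URL.
$segments = preg_split($regex_url, $input);

var_dump($segments);
```

Because the empty trailing segment is kept, there is always exactly one more segment than there are URLs, which is why the loop adds an index after every segment except the last.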

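preg_match only returns the first match, so it does not give you the number of URLs that match your regular expression. You need to use preg_match_all instead and extract the first element of the array it returns.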
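The difference can be checked with a short sketch, assuming the sample string from the question. The variable names here ($first, $all, and so on) are made up for illustration.

```php
<?php
$reg_exUrl = '/\b(?:(?:https?|ftp):\/\/|www\.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|]/i';
$subject = "test https://www.google.com/ mmh http://whosebug.com/";

// preg_match stops at the first match. $first contains the full match
// plus one entry per capture group; this pattern has no capture groups.
preg_match($reg_exUrl, $subject, $first);
$single_count = count($first);

// preg_match_all collects every match. With the default flags,
// index 0 of $all holds the list of full matches.
$total = preg_match_all($reg_exUrl, $subject, $all);
$all_count = count($all[0]);

echo $single_count, "\n"; // 1
echo $all_count, "\n";    // 2
```

Note that the return value of preg_match_all ($total here) is already the number of full matches, so it could be used as the loop bound directly.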

The second error is that you are not using the limit argument of preg_replace, so all of your URLs are replaced at once on the first pass.

From the documentation for preg_replace: http://php.net/manual/en/function.preg-replace.php

The parameters are

mixed preg_replace ( mixed $pattern , mixed $replacement , mixed $subject [, int $limit = -1 [, int &$count ]] )

In particular, the limit parameter defaults to -1 (no limit):

limit: The maximum possible replacements for each pattern in each subject string. Defaults to -1 (no limit).

You need to set an explicit limit of 1.
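As a small sketch of the effect, assuming the sample string from the question, compare one call with the default limit against one call with a limit of 1. The variable names are illustrative.

```php
<?php
$reg_exUrl = '/\b(?:(?:https?|ftp):\/\/|www\.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|]/i';
$subject = "test https://www.google.com/ mmh http://whosebug.com/";

// Default limit (-1): every URL is replaced in a single call.
$unlimited = preg_replace($reg_exUrl, '[0]', $subject);

// Explicit limit of 1: only the first URL is replaced.
$limited = preg_replace($reg_exUrl, '[0]', $subject, 1);

echo $unlimited, "\n"; // test [0] mmh [0]
echo $limited, "\n";   // test [0] mmh http://whosebug.com/
```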

To elaborate on replacing preg_match with preg_match_all: you need to extract the [0] component of the result, because preg_match_all returns an array of arrays. For example:

array(1) {
  [0]=>
  array(2) {
    [0]=>
    string(23) "https://www.google.com/"
    [1]=>
    string(20) "http://whosebug.com/"
  }
}

Here is an example with the fixes incorporated.

<?php 

# original function
function process_line($input) {

    $reg_exUrl = '/\b(?:(?:https?|ftp):\/\/|www\.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|]/i';
    if(preg_match($reg_exUrl, $input, $url)) {
        for ($i = 0; $i < count($url); $i++) { 
            $input = preg_replace($reg_exUrl, "[" . $i . "]", $input);
        }
    }

    return $input;

}

# function with fixes incorporated
function process_line1($input) {

    $reg_exUrl = '/\b(?:(?:https?|ftp):\/\/|www\.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|]/i';
    if(preg_match_all($reg_exUrl, $input, $url)) {
        $url_matches = $url[0];
        for ($i = 0; $i < count($url_matches); $i++) { 
            # add explicit limit of 1 to arguments of preg_replace
            $input = preg_replace($reg_exUrl, "[" . $i . "]", $input, 1);
        }
    }

    return $input;

}

$input = "test https://www.google.com/ mmh http://whosebug.com/";

$input = process_line1($input);

echo $input;

?>
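As an alternative to calling preg_replace once per URL, the numbering can also be done in a single pass with preg_replace_callback and a counter. This is a different technique from the one in the answer above, shown only as a sketch; the function name replace_urls_with_index is hypothetical.

```php
<?php
// Hypothetical helper, not part of the original answer.
function replace_urls_with_index(string $input): string {
    $regex_url = '/\b(?:(?:https?|ftp):\/\/|www\.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|]/i';
    $index = 0;

    // The callback runs once per match, in order, so the counter
    // captured by reference yields [0], [1], [2], ...
    $result = preg_replace_callback(
        $regex_url,
        function (array $match) use (&$index): string {
            return '[' . $index++ . ']';
        },
        $input
    );

    // preg_replace_callback returns null on a PCRE error; fall back
    // to the unchanged input in that case.
    return $result ?? $input;
}

$output = replace_urls_with_index("test https://www.google.com/ mmh http://whosebug.com/");
echo $output, "\n"; // test [0] mmh [1]
```

This avoids scanning the string once per URL and does not depend on the replacement text never matching the URL pattern itself.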