preg_match not working when trying to detect multiple URLs
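I want to automatically detect links in a string and replace each one with [index of link]. For example, given a string like test https://www.google.com/ mmh http://whosebug.com/, the result should be test [0] mmh [1].
So far I have tried this: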
$reg_exUrl = '/\b(?:(?:https?|ftp):\/\/|www\.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|]/i';
if (preg_match($reg_exUrl, $_POST['commento'], $url)) {
    for ($i = 0; $i < count($url); $i++) {
        $_POST['commento'] = preg_replace($reg_exUrl, "[" . $i . "]", $_POST['commento']);
    }
}
But I keep getting test [0] mmh [0], and if I try var_dump(count($url)) the result is always 1. How can I fix this?
A better solution here is to split the incoming string at each URL into an array of the non-URL segments, and then insert [$i] between successive non-URL segments.
# better solution, perform a split.
function process_line2($input) {
    $regex_url = '/\b(?:(?:https?|ftp):\/\/|www\.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|]/i';
    # split the incoming string into an array of non-url segments
    # preg_split does not trim leading or trailing empty segments
    $non_url_segments = preg_split($regex_url, $input, -1);
    # inside the array, combine each successive non-url segment
    # with the next index
    $out = [];
    $count = count($non_url_segments);
    for ($i = 0; $i < $count; $i++) {
        # add the segment
        array_push($out, $non_url_segments[$i]);
        # add its index surrounded by brackets on all segments but the last one
        if ($i < $count - 1) {
            array_push($out, '[' . $i . ']');
        }
    }
    # join strings with no whitespace
    return implode('', $out);
}
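To see why the loop skips the index after the last segment, it may help to look at what preg_split produces for the example string. This is only a small sketch; the variable names are made up for illustration.

```php
<?php
// Same pattern as in the question.
$regex_url = '/\b(?:(?:https?|ftp):\/\/|www\.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|]/i';
$input = 'test https://www.google.com/ mmh http://whosebug.com/';

// Without PREG_SPLIT_NO_EMPTY the trailing empty segment is kept,
// so two URLs always yield three segments.
$segments = preg_split($regex_url, $input, -1);

var_dump($segments); // "test ", " mmh ", ""
```

With N URLs there are N + 1 segments, which is why an index is inserted after every segment except the last.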
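preg_match only returns the first result, so it does not give you the number of URLs matching your regex. You need to use preg_match_all and take the first element of the array it fills in.
The second mistake is that you are not using the limit parameter of preg_replace, so all of your URLs are replaced at once.
From the preg_replace documentation: http://php.net/manual/en/function.preg-replace.php
The signature is
mixed preg_replace ( mixed $pattern , mixed $replacement , mixed $subject [, int $limit = -1 [, int &$count ]] )
In particular, the limit parameter defaults to -1 (no limit):
limit: The maximum possible replacements for each pattern in each subject string. Defaults to -1 (no limit).
You need to set an explicit limit of 1.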
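As a quick illustration of the limit parameter, here is a minimal sketch on a toy string (the string and variable names are made up for the example):

```php
<?php
$subject = 'a b a b a';

// Default limit (-1): every match is replaced in a single call.
$all = preg_replace('/a/', 'X', $subject);

// limit = 1: only the first (leftmost) match is replaced.
$first = preg_replace('/a/', 'X', $subject, 1);

echo $all, "\n";   // X b X b X
echo $first, "\n"; // X b a b a
```

This is why the original loop produced [0] everywhere: the first iteration already replaced every URL, leaving nothing for later iterations to match.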
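To elaborate on replacing preg_match with preg_match_all: you need to take the [0] component of the matches, because preg_match_all fills its matches argument with an array of arrays. For example: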
array(1) {
[0]=>
array(2) {
[0]=>
string(23) "https://www.google.com/"
[1]=>
string(20) "http://whosebug.com/"
}
}
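The difference between the two functions can be checked directly. The following is a small sketch using the pattern and example string from the question; the variable names are only illustrative.

```php
<?php
// Same pattern as in the question; it contains no capturing groups.
$pattern = '/\b(?:(?:https?|ftp):\/\/|www\.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|]/i';
$text = 'test https://www.google.com/ mmh http://whosebug.com/';

// preg_match stops at the first match, so $single holds only that one match.
preg_match($pattern, $text, $single);
$single_count = count($single);

// preg_match_all returns the number of full matches and puts them in $all[0].
$total = preg_match_all($pattern, $text, $all);
$all_count = count($all[0]);

var_dump($single_count, $total, $all_count); // int(1) int(2) int(2)
```

Since the pattern has no capturing groups, count($url) after preg_match is always 1 when there is a match, which matches what the question observed.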
Here is an example with the fixes incorporated.
<?php
# original function
function process_line($input) {
    $reg_exUrl = '/\b(?:(?:https?|ftp):\/\/|www\.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|]/i';
    if (preg_match($reg_exUrl, $input, $url)) {
        for ($i = 0; $i < count($url); $i++) {
            $input = preg_replace($reg_exUrl, "[" . $i . "]", $input);
        }
    }
    return $input;
}

# function with fixes incorporated
function process_line1($input) {
    $reg_exUrl = '/\b(?:(?:https?|ftp):\/\/|www\.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|]/i';
    # preg_match_all collects every match; the full matches are in $url[0]
    if (preg_match_all($reg_exUrl, $input, $url)) {
        $url_matches = $url[0];
        for ($i = 0; $i < count($url_matches); $i++) {
            # add explicit limit of 1 to arguments of preg_replace
            $input = preg_replace($reg_exUrl, "[" . $i . "]", $input, 1);
        }
    }
    return $input;
}

$input = "test https://www.google.com/ mmh http://whosebug.com/";
$input = process_line1($input);
echo $input;
?>
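As a possible alternative to both approaches above (not something the original code used), preg_replace_callback can number the matches in a single pass by incrementing a counter inside the callback. This is a minimal sketch; the function name replace_urls_with_index is hypothetical.

```php
<?php
// Hypothetical helper: replaces each URL with its zero-based index in brackets.
function replace_urls_with_index(string $input): string {
    // Same pattern as in the question.
    $pattern = '/\b(?:(?:https?|ftp):\/\/|www\.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|]/i';
    $i = 0;
    // The callback runs once per match, in order; $i is captured by reference
    // so it keeps counting across calls.
    $result = preg_replace_callback($pattern, function (array $m) use (&$i): string {
        return '[' . $i++ . ']';
    }, $input);
    // preg_replace_callback returns null on error; fall back to the input then.
    return $result ?? $input;
}

echo replace_urls_with_index('test https://www.google.com/ mmh http://whosebug.com/'), "\n";
// test [0] mmh [1]
```

This avoids scanning the string once per URL and does not depend on the replacement text never matching the pattern.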