How to exclude cases in PHP regex
I'm trying to capture specific data where a colon is present. I tried this:
preg_match_all("/^(.+):(.+)/im", $input_lines, $output_array);
on this input data:
last_name, first_name
bjorge philip: hello world
bjorge:world
kardashian, kim
some http://hi.com ok
jim https://hey.com yes
same http://www.vim.com:2018 why
it's about 20/08/2018 1:23 pm
time is 01:20:24 now
capture my name : my name is micky mouse
mercury, freddie
I need to be:
captured
capture me :
if you can
where is : freddie
freddie is not:
home
I need to capture these lines:
bjorge philip: hello world
bjorge:world
I need to be: captured
capture me : if you can
where is : freddie
freddie is not: home
capture my name : my name is micky mouse
and exclude any line that contains a time or a URL.
<?php
$input_lines="last_name, first_name
bjorge philip: hello world
bjorge:world
kardashian, kim
some http://hi.com ok
jim https://hey.com yes
same http://www.vim.com:2018 why
it's about 20/08/2018 1:23 pm
time is 01:20:24 now
capture my name : my name is micky mouse
mercury, freddie
I need to be:
captured
capture me :
if you can
where is : freddie
freddie is not:
home ";
// (?:^|\n) groups the two anchors so the lookahead and the capture group apply to both;
// an ungrouped ^|\n would let ^ match on its own and return an empty first result
preg_match_all("/(?:^|\n)(?![^:]*$|.*?https?:|.*\d:\d+)(.*?:\s*\r?\n.*|.*?:\s?.+)/",$input_lines,$output_array);
// \r? can be omitted from regex depending on system
// $output_array[1] holds the capture group, without the leading newline
foreach($output_array[1] as $output){
echo $output,"<br>";
}
Regex pattern breakdown:
(?:^|\n) #match at the beginning of $input_lines or after any newline
(?! #begin negative lookahead group
[^:]*$ #ignore when no colon remains before the end of the input
| #OR
.*?https?: #ignore lines with http: or https:
| #OR
.*\d:\d+ #ignore lines with digit, colon, one or more digits
) #end negative lookahead group
( #begin capture group
.*?:\s*\r?\n.* #capture 2 lines if the 1st line has a colon followed only by
# optional whitespace before the newline
| #OR
.*?:\s?.+ #capture 1 line when it contains a colon followed by
# 0 or 1 whitespace character, then 1 or more characters
) #end capture group
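The breakdown can also be kept inside the pattern itself. The following is a minimal sketch, not the code from the answer, that assumes PCRE's x (extended) flag, under which unescaped whitespace and # comments in the pattern are ignored. The ~ delimiter, the grouped (?:^|\n) prefix, the shortened sample, and the names $pattern, $sample and $matches are choices made for this illustration.

```php
<?php
// Sketch only: the pattern written in free-spacing mode (x flag).
// Single quotes keep \n, \r, \d and \s intact for the regex engine.
$pattern = '~
    (?:^|\n)            # start of the string, or just after a newline
    (?!                 # negative lookahead: reject this position if ...
        [^:]*$          #   no colon remains before the end of the input
      | .*?https?:      #   the line contains http: or https:
      | .*\d:\d+        #   the line contains digit, colon, digits
    )
    (                   # capture group 1
        .*?:\s*\r?\n.*  #   colon at the end of a line, plus the next line
      | .*?:\s?.+       #   colon followed by text on the same line
    )
~x';

// Shortened sample (an assumption for this example, not the full input).
$sample = "kardashian, kim\nbjorge:world\ntime is 01:20:24 now\nI need to be:\ncaptured";

preg_match_all($pattern, $sample, $matches);

// $matches[1] is ["bjorge:world", "I need to be:\ncaptured"]
print_r($matches[1]);
```

Reading group 1 instead of the full match leaves out the newline that (?:^|\n) consumes in front of every line after the first.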
On the sample input, the pattern returns:
bjorge philip: hello world
bjorge:world
capture my name : my name is micky mouse
I need to be: captured
capture me : if you can
where is : freddie
freddie is not: home
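The two-line entries in this list appear joined on one line, presumably because a browser collapses the newline inside each match when it is echoed with <br>. The sketch below makes that joining explicit so the result can be checked outside a browser. The function name capture_colon_lines, the variables $input and $result, and the whitespace-collapsing step are assumptions added for illustration, and the pattern uses the grouped (?:^|\n) prefix so that it yields no empty first match.

```php
<?php
// Hypothetical helper, not part of the original answer: runs the pattern and
// joins each two-line match into one line by collapsing whitespace runs.
function capture_colon_lines(string $text): array
{
    // The answer's pattern, with the leading alternation grouped.
    $pattern = '/(?:^|\n)(?![^:]*$|.*?https?:|.*\d:\d+)(.*?:\s*\r?\n.*|.*?:\s?.+)/';
    preg_match_all($pattern, $text, $matches);

    return array_map(function (string $match): string {
        // Turns "I need to be:\ncaptured" into "I need to be: captured".
        return trim(preg_replace('/\s+/', ' ', $match));
    }, $matches[1]);
}

// The sample input from the question, one array entry per line.
$input = implode("\n", [
    'last_name, first_name',
    'bjorge philip: hello world',
    'bjorge:world',
    'kardashian, kim',
    'some http://hi.com ok',
    'jim https://hey.com yes',
    'same http://www.vim.com:2018 why',
    "it's about 20/08/2018 1:23 pm",
    'time is 01:20:24 now',
    'capture my name : my name is micky mouse',
    'mercury, freddie',
    'I need to be:',
    'captured',
    'capture me :',
    'if you can',
    'where is : freddie',
    'freddie is not:',
    'home',
]);

$result = capture_colon_lines($input);
echo implode(PHP_EOL, $result), PHP_EOL;
```

Collapsing with /\s+/ also normalizes \r\n line endings, so the helper does not depend on the platform's newline convention.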
I spent a fair amount of time writing this solution for you. If the sample set isn't extended any further, I hope it earns your approval.