Remove other domains with Regex
I have a preg_replace that replaces every link in a string with "[link removed]":
/((https?:\/\/)?(\w+\.)+[a-z|A-Z]{2,}(:\d+)?((\/\w+)+(\.\w+)?)?\/?)/
Simplified:
http/https, subdomain, domain, tld, port, folder/file, extension, "/"
But I need to filter it so that nothing is replaced when the domain is "example.com", like this:
"http://notmydomain.com" -> "[link removed]"
"example.com" -> "example.com"
Use a negative lookahead assertion:
/((https?:\/\/)?(?![^:\/\s]*\bexample\.com)(\b\w+\.)+[a-z|A-Z]{2,}(:\d+)?((\/\w+)+(\.\w+)?)?\/?)/
Explanation:
(?! # Assert that it's impossible to match this from the current location:
[^:\/\s]* # Any number of characters except colon, slash or whitespace
\b # followed by a start-of-word anchor
example\.com # followed by example.com.
) # End of lookahead.
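To see the lookahead at work, here is a minimal sketch that runs the pattern above through preg_replace. The helper name stripForeignLinks and the third sample string are assumptions made for illustration; they are not part of the original post.

```php
<?php
// Pattern copied from the answer above. Single quotes keep the backslashes literal.
$pattern = '/((https?:\/\/)?(?![^:\/\s]*\bexample\.com)(\b\w+\.)+[a-z|A-Z]{2,}(:\d+)?((\/\w+)+(\.\w+)?)?\/?)/';

// Hypothetical helper: replaces every link that is not on example.com.
function stripForeignLinks(string $text, string $pattern): string
{
    // preg_replace returns null on a regex error; fall back to the input then.
    return preg_replace($pattern, '[link removed]', $text) ?? $text;
}

echo stripForeignLinks('http://notmydomain.com', $pattern), "\n";       // [link removed]
echo stripForeignLinks('example.com', $pattern), "\n";                  // example.com
echo stripForeignLinks('http://www.example.com/about', $pattern), "\n"; // unchanged
```

Note that the lookahead only checks that "example.com" begins at a word boundary somewhere in the host part, so subdomains such as www.example.com are kept as well.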
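In addition, I added another word boundary anchor before the \w+ part. Without it, the regex engine, after failing at the first character of example.com, would retry one character later and match xample.com.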
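The effect of that extra \b can be checked by comparing the pattern with and without it. This is only a sketch: the variable names and the helper replaceLinks are hypothetical, and the second pattern is the answer's pattern with the \b before \w+ removed.

```php
<?php
// The answer's pattern, with the word boundary before \w+.
$withBoundary = '/((https?:\/\/)?(?![^:\/\s]*\bexample\.com)(\b\w+\.)+[a-z|A-Z]{2,}(:\d+)?((\/\w+)+(\.\w+)?)?\/?)/';

// Same pattern minus that \b (for comparison only).
$withoutBoundary = '/((https?:\/\/)?(?![^:\/\s]*\bexample\.com)(\w+\.)+[a-z|A-Z]{2,}(:\d+)?((\/\w+)+(\.\w+)?)?\/?)/';

// Hypothetical helper wrapping preg_replace.
function replaceLinks(string $pattern, string $text): string
{
    return preg_replace($pattern, '[link removed]', $text) ?? $text;
}

// Without \b the match starts one character in, at "xample.com".
echo replaceLinks($withoutBoundary, 'example.com'), "\n"; // e[link removed]

// With \b no position inside the word can start a match.
echo replaceLinks($withBoundary, 'example.com'), "\n";    // example.com
```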
Test it live on regex101.com.