Exclude one word from a regex capture group
I have URLs like these:
https://example.com/en/app/893245
https://example.com/ru/app/wq23245
https://example.com/app/8984245
I only want to extract the word between com and app:
https://example.com/en/app/893245 -> en
https://example.com/ru/app/wq23245 -> ru
https://example.com/app/8984245 ->
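I tried to exclude app from the capture group, but the only way I could find is this:
.*com\/((?!app).*)\/app
Is it possible to do something like the following, but without capturing the word app? example\.com\/(\w+|?!app)\/
Rubular link: https://rubular.com/r/NnojSgQK7EuelE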
If you need a plain regex, you can use lookarounds:
/(?<=example\.com\/)\w+(?=\/app)/
Or, probably better in the context of a URL:
/(?<=example\.com\/)[^\/]+(?=\/app)/
See the Rubular demo.
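As a minimal sketch of how the second lookaround pattern behaves on the three URLs from the question: the constant name LANG_SEGMENT and the helper lang_segment are hypothetical names chosen for illustration, not part of the original answer.

```ruby
# Lookaround pattern from the answer above: match the path segment that is
# preceded by "example.com/" and followed by "/app".
# LANG_SEGMENT and lang_segment are hypothetical names for this sketch.
LANG_SEGMENT = /(?<=example\.com\/)[^\/]+(?=\/app)/

def lang_segment(url)
  # String#[] with a Regexp returns the whole match, or nil if nothing matches.
  url[LANG_SEGMENT]
end

urls = [
  'https://example.com/en/app/893245',
  'https://example.com/ru/app/wq23245',
  'https://example.com/app/8984245'
]

results = urls.map { |u| lang_segment(u) }
p results
# => ["en", "ru", nil]
```

Because the lookbehind and lookahead consume no characters, the whole match is already the wanted segment and no capture group is needed.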
In Ruby, you can use
strs = ['https://example.com/en/app/893245','https://example.com/ru/app/wq23245','https://example.com/app/8984245']
strs.each { |s|
p s[/example\.com\/(\w+)\/app/, 1]
}
# => ["en", "ru", nil]
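If the third URL should still count as a match with an empty result rather than nil, one possible variation is to make the language segment optional. This is a sketch under assumptions, not something from the answers here: the pattern, the constant name PATTERN and the helper lang_or_empty are all hypothetical.

```ruby
# Assumed variation: the language segment is optional, and the negative
# lookahead (?!app/) keeps the word "app" itself out of the capture group.
# PATTERN and lang_or_empty are hypothetical names for this sketch.
PATTERN = %r{example\.com/(?:(?!app/)([^/]+)/)?app/}

def lang_or_empty(url)
  m = PATTERN.match(url)
  return nil unless m   # not an example.com/.../app/ URL at all
  m[1].to_s             # group 1 is nil when the segment is absent, so "" here
end

p lang_or_empty('https://example.com/en/app/893245')   # => "en"
p lang_or_empty('https://example.com/ru/app/wq23245')  # => "ru"
p lang_or_empty('https://example.com/app/8984245')     # => ""
p lang_or_empty('https://example.org/en/page')         # => nil
```

This separates "matched, but no language segment" (empty string) from "did not match" (nil), which the single-capture version above cannot do.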
You can use sed:
sed -n -f script.sed yourinput.txt
with this inside script.sed:
s/.*com\/\(.*\)\/app.*/\1/p
Sample input:
https://example.com/en/app/893245
https://example.com/ru/app/wq23245
https://example.com/app/8984245
Sample output:
$ sed -n -f comapp.sed comapp.txt
en
ru