Numbered back references for groups with subgroups
I have the word 'fan(s)', which I want to replace with 'fanatic(s)' whenever it is preceded by one of the pronoun-verb combinations shown below.
gsub(
"(((s?he( i|')s)|((you|they|we)( a|')re)|(I( a|')m)).{1,20})(\\b[Ff]an)(s?\\b)",
'\\1\\2atic\\3',
'He\'s the bigest fan I know.',
perl = TRUE, ignore.case = TRUE
)
## [1] "He's the bigest He'saticHe's I know."
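I know the numbered back references are referring to the inner parentheses of the first group. Is there a way to make them refer only to the three outer sets of parentheses, where the three groups are, in pseudocode, (stuff before fan)(fan)(s\b)?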
I know the regex itself is valid, because it can replace the whole match. Only the back-reference part is the problem.
gsub(
"(((s?he( i|')s)|((you|they|we)( a|')re)|(I( a|')m)).{1,20})(\\b[Ff]an)(s?\\b)",
'',
'He\'s the bigest fan I know.',
perl = TRUE, ignore.case = TRUE
)
## [1] " I know."
Desired output:
## [1] "He's the bigest fanatic I know."
Examples for matching:
inputs <- c(
"He's the bigest fan I know.",
"I am a huge fan of his.",
"I know she has lots of fans in his club",
"I was cold and turned on the fan",
"An air conditioner is better than 2 fans at cooling."
)
outputs <- c(
"He's the bigest fanatic I know.",
"I am a huge fanatic of his.",
"I know she has lots of fanatics in his club",
"I was cold and turned on the fan",
"An air conditioner is better than 2 fans at cooling."
)
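As I understand it, the excess capturing groups are what is getting in your way. Turn the ones you are not interested in into non-capturing groups, or remove the ones that are entirely redundant: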
((?:s?he(?: i|')s|(?:you|they|we)(?: a|')re|I(?: a|')m).{1,20})\b(Fan)(s?)\b
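To see why the numbering matters, it can help to list what each numbered group actually captured. The following is only a minimal sketch using base R's regexec() and regmatches(); the variable names (x, nested, flat, groups_nested, groups_flat) are made up for illustration and are not part of the original question.

```r
x <- "He's the bigest fan I know."

# Original pattern: every "(" opens a numbered group, counted left to right.
nested <- "(((s?he( i|')s)|((you|they|we)( a|')re)|(I( a|')m)).{1,20})(\\b[Ff]an)(s?\\b)"
# Rewritten pattern: "(?:" groups without capturing, so only three numbers remain.
flat <- "((?:s?he(?: i|')s|(?:you|they|we)(?: a|')re|I(?: a|')m).{1,20})\\b(fan)(s?)\\b"

# regexec() reports the full match followed by one entry per capturing group.
groups_nested <- regmatches(x, regexec(nested, x, perl = TRUE, ignore.case = TRUE))[[1]]
groups_flat <- regmatches(x, regexec(flat, x, perl = TRUE, ignore.case = TRUE))[[1]]

# Drop the full match so that position i lines up with back reference number i.
groups_nested <- groups_nested[-1]
groups_flat <- groups_flat[-1]

print(groups_nested)
print(groups_flat)
```

In the nested pattern, positions 2 and 3 both hold "He's" and "fan" only shows up at position 10, which is why '\\1\\2atic\\3' produced "He'saticHe's". With the non-capturing version, positions 1 to 3 are exactly the three outer groups.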
Note that [Ff] can be reduced to just F or f, because you are using the ignore.case = TRUE argument.
gsub(
"((?:s?he(?: i|')s|(?:you|they|we)(?: a|')re|I(?: a|')m).{1,20})\\b(fan)(s?)\\b",
'\\1\\2atic\\3',
inputs,
perl = TRUE, ignore.case = TRUE
)
Output:
[1] "He's the bigest fanatic I know."
[2] "I am a huge fanatic of his."
[3] "I know she has lots of fans in his club"
[4] "I was cold and turned on the fan"
[5] "An air conditioner is better than 2 fans at cooling."
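As a quick check, the result can be compared element by element with the desired outputs from the question. This is a sketch only; the names pattern, desired, result and mismatch are introduced here for illustration.

```r
inputs <- c(
  "He's the bigest fan I know.",
  "I am a huge fan of his.",
  "I know she has lots of fans in his club",
  "I was cold and turned on the fan",
  "An air conditioner is better than 2 fans at cooling."
)
# Desired outputs, copied from the question.
desired <- c(
  "He's the bigest fanatic I know.",
  "I am a huge fanatic of his.",
  "I know she has lots of fanatics in his club",
  "I was cold and turned on the fan",
  "An air conditioner is better than 2 fans at cooling."
)

pattern <- "((?:s?he(?: i|')s|(?:you|they|we)(?: a|')re|I(?: a|')m).{1,20})\\b(fan)(s?)\\b"
result <- gsub(pattern, "\\1\\2atic\\3", inputs, perl = TRUE, ignore.case = TRUE)

# Indices where the actual result differs from what the question asked for.
mismatch <- which(result != desired)
print(mismatch)
print(inputs[mismatch])
```

Only the third element differs: "she has" is not among the pronoun-verb combinations in the pattern (it covers "is", "are" and "am" forms only), so that "fans" is left alone. If that replacement is wanted, the verb alternatives in the pattern would have to be extended accordingly.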