Using lookbehinds to extract groups of strings in string_extract()

Question

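library(stringr)

I tried following the suggested advice but couldn't solve my problem. Using stringr, I need to extract every character that follows the first alphabetic string plus an underscore.

The following extracts exactly what I don't want: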

str_extract("mean_q4.8_addiction_critCount", "(^[a-z]*_)")

# [1] "mean_"

What I want is:

# [1] "q4.8_addiction_critCount"

Based on the link I referred to above, I tried a positive lookbehind:

str_extract("mean_q4.8_addiction_critCount", "(?<=^[a-z]*_)\\w+")

But this throws an error:

# Error in stri_extract_first_regex(string, pattern, opts_regex = opts(pattern)) : 
#  Look-Behind pattern matches must have a bounded maximum length. (U_REGEX_LOOK_BEHIND_LIMIT)

And I don't know how to bound the maximum length.

Any advice is much appreciated.

Answer 1

Can't you do it the other way around? Remove everything up to and including the first underscore.

sub('.*?_', '', 'mean_q4.8_addiction_critCount')
#[1] "q4.8_addiction_critCount"
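
If you would rather stay within stringr, a rough equivalent is sketched below. It assumes the prefix consists only of lowercase letters, as in the question's own pattern; the variable names `x` and `trimmed` are only for illustration.

```r
library(stringr)

x <- "mean_q4.8_addiction_critCount"

# Remove the leading run of lowercase letters and the underscore after it.
# str_remove() drops only the first match, like sub() above.
trimmed <- str_remove(x, "^[a-z]*_")
trimmed
#[1] "q4.8_addiction_critCount"
```

Unlike '.*?_', this pattern is anchored at the start and restricted to letters, so a string with no alphabetic prefix is returned unchanged.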

As for the look-behind regex, you could extract everything after the first underscore:

stringr::str_extract("mean_q4.8_addiction_critCount", "(?<=_).*")
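
On the question of bounding the maximum length: the error comes from the `*` inside the lookbehind, which has no upper limit. A minimal sketch of two possible workarounds follows. The upper bound of 20 letters is an arbitrary assumption, not something taken from the question, and the variable names are only for illustration.

```r
library(stringr)

x <- "mean_q4.8_addiction_critCount"

# Option 1: replace the unbounded * with a bounded quantifier.
# The {1,20} limit is an assumed maximum prefix length; adjust it to your data.
bounded <- str_extract(x, "(?<=^[a-z]{1,20}_).*")

# Option 2: avoid the lookbehind and use a capture group instead.
# str_match() returns a matrix; column 2 holds the first captured group.
captured <- str_match(x, "^[a-z]*_(.*)")[, 2]

bounded
captured
```

Both should give "q4.8_addiction_critCount" for this input. The capture-group version needs no length assumption, which may make it the safer choice when prefix lengths vary.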

r

stringr

regex-lookarounds

tidyverse