Remove everything after any digit in a pattern

I am trying to use gsub to remove every character that follows a digit in each value of a column of my data frame:

Tumoral_stage   Methastatic_stage
    T1a                M0
    T1b                M0
    T2c                M0
    T3b                M0
    T1c                M0
    T2                 M0
    T3a                M1

I would like to get this data frame:

Tumoral_stage   Methastatic_stage
    T1                 M0
    T1                 M0
    T2                 M0
    T3                 M0
    T1                 M0
    T2                 M0
    T3                 M1

I would like to use gsub for this, but I don't know how to tell it to remove everything after any digit.

Use sub() with a positive lookbehind (this requires perl = TRUE):

x <- c("T1a", "T1b", "T2c", "T3b", "T1c", "T2", "T3a")

sub("(?<=[0-9]).+", "", x, perl = TRUE)

# [1] "T1" "T1" "T2" "T3" "T1" "T2" "T3"
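To apply the same call to the data frame from the question rather than to a standalone vector, something like the following should work. The name `df` is an assumption, since the question does not name the data frame.

```r
# Rebuild the question's data; `df` is a hypothetical name
df <- data.frame(
  Tumoral_stage     = c("T1a", "T1b", "T2c", "T3b", "T1c", "T2", "T3a"),
  Methastatic_stage = c("M0", "M0", "M0", "M0", "M0", "M0", "M1"),
  stringsAsFactors  = FALSE
)

# Drop everything that follows a digit in the Tumoral_stage column
df$Tumoral_stage <- sub("(?<=[0-9]).+", "", df$Tumoral_stage, perl = TRUE)

df$Tumoral_stage
# [1] "T1" "T1" "T2" "T3" "T1" "T2" "T3"
```

Values such as "T2" that already end in a digit are left unchanged, because `.+` needs at least one character after the digit to match.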

We can also use substr, which works here because the part to keep is always the first two characters:

substr(x, 1, 2)

Or str_remove from stringr, which strips the trailing non-digit characters:

library(stringr)
str_remove(x, "[^0-9]+$")
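If loading stringr is not an option, the same pattern should also work with base R's sub(), since it only strips a trailing run of non-digit characters. This is a minimal sketch on the same vector; `res` is just an illustrative name.

```r
x <- c("T1a", "T1b", "T2c", "T3b", "T1c", "T2", "T3a")

# Remove a trailing run of non-digits; values that already end in a digit are untouched
res <- sub("[^0-9]+$", "", x)
res
# [1] "T1" "T1" "T2" "T3" "T1" "T2" "T3"
```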

Data

x <- c("T1a", "T1b", "T2c", "T3b", "T1c", "T2", "T3a")

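Consider capturing the part you want to keep and using the backreference \1 (in an R string the backslashes must be doubled):

sub("(.*\\d)\\w", "\\1", x)
# [1] "T1" "T1" "T2" "T3" "T1" "T2" "T3"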
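The capturing pattern above appears to remove only a single word character after the last digit, which is enough for the sample data. If the suffix can be longer, or the stage number can have more than one digit, a variant that captures everything up to and including the first run of digits may be more robust. The vector `y` below is made-up test input, not data from the question.

```r
# Hypothetical values with a longer suffix and a two-digit number
y <- c("T1a", "T2", "T3abc", "T10b")

# Keep the leading non-digits plus the first run of digits, drop the rest
kept <- sub("^([^0-9]*[0-9]+).*$", "\\1", y, perl = TRUE)
kept
# [1] "T1"  "T2"  "T3"  "T10"
```

By comparison, the lookbehind version would turn "T10b" into "T1", because it starts deleting right after the first digit it finds.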