Create third column that might be based on one of two columns

Question

I have data like this (there are more columns, not shown here):

    df <- structure(list(email = c("lbelcher@place.org", "bbelchery@place.org",
        "b.smith@place.org", "jsmith1@place.org"), employee_number = c(123456,
        654321, 664422, 321458)), row.names = c(NA, -4L), class = c("tbl_df",
        "tbl", "data.frame"))

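I need to create a third column called "username". The username is usually just everything before the @ in their email, unless that name contains a period or a digit, in which case it should be their employee number.

In other words, I would like a result like this:

Any help would be appreciated!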

Answer 1

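We can use str_detect on the substring of 'email' (the part before the @) to check for a . or any digit. If one is found, return 'employee_number'; otherwise use str_remove to remove the suffix part of 'email' (the @ and everything after it).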

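library(dplyr)
library(stringr)

df <- df %>%
  mutate(username = case_when(
    # trimws() with whitespace = "@.*" strips the @ and everything after it;
    # if what remains contains a period or a digit, use the employee number
    str_detect(trimws(email, whitespace = "@.*"), "[.0-9]") ~ as.character(employee_number),
    # otherwise keep only the part of the email before the @
    TRUE ~ str_remove(email, "@.*")
  ))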

-output

df
# A tibble: 4 × 3
  email               employee_number username 
  <chr>                         <dbl> <chr>    
1 lbelcher@place.org           123456 lbelcher 
2 bbelchery@place.org          654321 bbelchery
3 b.smith@place.org            664422 664422   
4 jsmith1@place.org            321458 321458
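
To make the logic easier to follow, here is a minimal base R sketch of the same three steps, with sub, grepl and ifelse swapped in for the stringr and dplyr calls. The intermediate names local_part and needs_number are only illustrative, and the data is rebuilt as a plain data.frame so the snippet runs on its own.

```r
# Sample data from the question, rebuilt as a plain data.frame
df <- data.frame(
  email = c("lbelcher@place.org", "bbelchery@place.org",
            "b.smith@place.org", "jsmith1@place.org"),
  employee_number = c(123456, 654321, 664422, 321458)
)

# Step 1: keep only the part of the email before the @
# (local_part is an illustrative name, not from the original answer)
local_part <- sub("@.*", "", df$email)

# Step 2: flag local parts that contain a period or a digit
# (needs_number is likewise an illustrative name)
needs_number <- grepl("[.0-9]", local_part)

# Step 3: use the employee number (as text) where flagged,
# otherwise the local part of the email
df$username <- ifelse(needs_number,
                      as.character(df$employee_number),
                      local_part)
```

Inspecting local_part and needs_number before the final step is a quick way to check which rows fall back to the employee number.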

r

tidyverse