Is there any code using apply to optimize this?

Question

Load the libraries

library(engsoccerdata)
library(dplyr)
library(lubridate)

Extract the Liverpool data from the England league data

england$Date <- ymd(england$Date)
Liverpool.home <- england %>% filter(Date > '2001-08-01', home == 'Liverpool')
Liverpool.away <- england %>% filter(Date > '2001-08-01', visitor == 'Liverpool')

Create the points variable

Liverpool.home$points = 0

for(i in 1:nrow(Liverpool.home)){

  if(Liverpool.home[i,]$result == 'H'){
    Liverpool.home[i,]$points = 3
  }
  else if(Liverpool.home[i,]$result == 'D'){
    Liverpool.home[i,]$points = 1
  }

}

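I know that how to use the apply functions is a very boring and common question on Stack Overflow, but I couldn't solve this problem with apply. Is there any way to do it? :)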

Answer 1

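So you want to recode a character column into a numeric one. One option is simply ifelse, which is vectorized and convenient in this case. You don't want apply here, since it is meant for looping over the rows or columns of a matrix: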

Liverpool.home$points <- with(Liverpool.home, ifelse(result == "H", 3, 
                                                     ifelse(result == "D", 1, 0)))

head(Liverpool.home[c("result", "points")])

#  result points
#1      A      0
#2      A      0
#3      H      3
#4      D      1
#5      H      3
#6      H      3
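
Since the question asked specifically about the apply family, here is a rough sketch of what such a version might look like next to the vectorized one. The toy data frame and the helper points_one are made up for illustration and are not part of the original data or answer.

```r
# Toy stand-in for Liverpool.home (hypothetical data, not from engsoccerdata)
toy <- data.frame(result = c("A", "A", "H", "D", "H", "H"),
                  stringsAsFactors = FALSE)

# apply-family version: a scalar helper (hypothetical name) called once per element
points_one <- function(r) {
  if (r == "H") 3 else if (r == "D") 1 else 0
}
toy$points_vapply <- vapply(toy$result, points_one, numeric(1), USE.NAMES = FALSE)

# Vectorized version, as in the answer above
toy$points_ifelse <- with(toy, ifelse(result == "H", 3,
                                      ifelse(result == "D", 1, 0)))

# Both columns should hold the same values
identical(toy$points_vapply, toy$points_ifelse)
```

The vapply call still evaluates the helper one element at a time, so it is a loop in disguise; ifelse works on the whole column at once, which is why it is the more natural fit here.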

Answer 2

dplyr

The case_when function in dplyr ("a vectorised set of if and else ifs") is the equivalent of the SQL CASE WHEN statement. We need to use .$ inside mutate.

library(dplyr)
Liverpool.home %>% 
  mutate(points = case_when(.$result == 'H' ~ 3,
                            .$result == 'D' ~ 1,
                            TRUE ~ 0))
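
As far as I know, more recent dplyr releases (0.7.0 and later, if I recall correctly) let case_when refer to columns directly inside mutate, so the .$ prefix should no longer be required. A minimal sketch under that assumption, using a made-up toy data frame rather than the real Liverpool.home:

```r
library(dplyr)

# Toy stand-in for Liverpool.home (hypothetical data)
toy <- data.frame(result = c("A", "H", "D"), stringsAsFactors = FALSE)

# Columns referenced by bare name; assumes a dplyr version that supports this
out <- toy %>%
  mutate(points = case_when(result == "H" ~ 3,
                            result == "D" ~ 1,
                            TRUE ~ 0))
out
```

If this errors on an older installation, the .$ form shown in the answer is the fallback.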

sqldf

The CASE WHEN statement in SQL, run through sqldf:

library(sqldf)
# Single quotes mark SQL string literals; double quotes are reserved for identifiers
df <- sqldf("SELECT result,
                    CASE WHEN result = 'H' THEN 3
                         WHEN result = 'D' THEN 1
                         ELSE 0
                    END AS points
             FROM [Liverpool.home]")
head(df)

Output:

  result points
1      A      0
2      A      0
3      H      3
4      D      1
5      H      3
6      H      3

Answer 3

Try this.

transform(Liverpool.home, points = 3 * (result == "H") + (result == "D"))
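
The one-liner above relies on R coercing logical values to 0 and 1 in arithmetic. A small sketch of the intermediate steps, on a made-up vector rather than the real data:

```r
# Hypothetical result codes for three matches
result <- c("A", "H", "D")

is_home_win <- result == "H"   # logical vector: one TRUE/FALSE per match
is_draw     <- result == "D"

# In arithmetic, TRUE counts as 1 and FALSE as 0,
# so a home win contributes 3 points and a draw contributes 1
pts <- 3 * is_home_win + is_draw
pts
```

Because a match cannot be both a home win and a draw, at most one term is non-zero for each row, and an away win falls through to 0.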
