Dictionary and list comprehension in R
Is there a general way to make the following R code run faster? In Python, for example, a dictionary comprehension (see the equivalent below) would be a nice, faster alternative.
R
l1 <- 1:3
l2 <- c("a", "b", "c")
foo <- function(x) {return(5*x)}
bar <- list()
for (i in 1:length(l1)) bar[l2[i]] <- foo(l1[i])
Python
l1 = range(1, 4)
l2 = ["a", "b", "c"]
def foo(x):
    return 5*x
{b: foo(a) for a, b in zip(l1, l2)}
Since we are talking about speed, let's run some benchmarks:
library(microbenchmark)
microbenchmark(op = {for (i in 1:length(l1)) bar[l2[i]] <- foo(l1[i])},
               lapply = setNames(lapply(l1, foo), l2),
               vectorised = setNames(as.list(foo(l1)), l2))
Unit: microseconds
       expr   min    lq     mean median     uq    max neval
         op 7.982 9.122 10.81052  9.693 10.548 36.206   100
     lapply 5.987 6.557  7.73159  6.842  7.270 55.877   100
 vectorised 4.561 5.132  6.72526  5.417  5.987 80.964   100
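Before comparing timings, it helps to confirm that the three approaches build the same object on the small input. The following is a minimal sketch of that check; the names via_lapply and via_vector are introduced here purely for illustration and are not in the original post.

```r
l1 <- 1:3
l2 <- c("a", "b", "c")
foo <- function(x) 5 * x

# Loop version from the question: fill a list one name at a time
bar <- list()
for (i in seq_along(l1)) bar[l2[i]] <- foo(l1[i])

# lapply version: call foo once per element, then attach the names
via_lapply <- setNames(lapply(l1, foo), l2)

# Vectorised version: call foo once on the whole vector, then split into a list
via_vector <- setNames(as.list(foo(l1)), l2)

# Both comparisons should report that the results match
print(identical(bar, via_lapply))
print(identical(via_lapply, via_vector))
```

The vectorised form only works because foo uses arithmetic that already operates on whole vectors.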
But timings on such small inputs don't mean much, so I increased the vector length to 10,000, where you really see the difference:
l <- 10000
l1 <- seq_len(l)
l2 <- sample(letters, l, replace = TRUE)
microbenchmark(op = {bar <- list(); for (i in 1:length(l1)) bar[l2[i]] <- foo(l1[i])},
               lapply = setNames(lapply(l1, foo), l2),
               vectorised = setNames(as.list(foo(l1)), l2),
               times = 100)
Unit: microseconds
       expr       min        lq       mean     median        uq       max neval
         op 30122.865 33325.788 34914.8339 34769.8825 36721.428 41515.405   100
     lapply 13526.397 14446.078 15217.5309 14829.2320 15351.933 19241.767   100
 vectorised   199.559   259.997   349.0544   296.9155   368.614  3189.523   100
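One caveat about this larger benchmark: l2 is sampled from letters with replacement, so it contains repeated names. The loop assigns by name and therefore overwrites earlier entries, as a Python dict would, while setNames keeps every element and simply repeats the names. The sketch below illustrates the difference on a smaller input; the seed, the length of 100 and the names vec and dict are assumptions made for the example only.

```r
set.seed(1)  # assumption: fixed seed so the sketch is reproducible
foo <- function(x) 5 * x
l1 <- seq_len(100)
l2 <- sample(letters, 100, replace = TRUE)

# Loop version: assignment by name replaces any earlier entry with that name
bar <- list()
for (i in seq_along(l1)) bar[l2[i]] <- foo(l1[i])

# Vectorised version: keeps all 100 elements, names may repeat
vec <- setNames(as.list(foo(l1)), l2)

print(length(bar))  # number of distinct names, at most 26
print(length(vec))  # 100

# To get dict-like "last value wins" behaviour from the vectorised result,
# keep only the last occurrence of each name
dict <- vec[!duplicated(names(vec), fromLast = TRUE)]

# Same key/value pairs as the loop result, once both are ordered by name
print(identical(bar[order(names(bar))], dict[order(names(dict))]))
```

If the keys are known to be unique, this extra step is unnecessary and the plain setNames call is enough.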
But going by what others have said, the result doesn't have to be a list. If you drop the list requirement:
microbenchmark(setNames(foo(l1), l2))
Unit: microseconds
                  expr    min      lq     mean  median     uq      max neval
 setNames(foo(l1), l2) 22.522 23.8045 58.06888 25.0875 48.322 1427.417   100
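A named vector can still be used much like a dictionary, since elements can be looked up by name. The sketch below shows that usage on the small input. It also shows one possible fallback, vapply, for a function that only accepts a single value at a time; foo_scalar is a hypothetical function made up for this example and is not part of the original question.

```r
foo <- function(x) 5 * x
l1 <- 1:3
l2 <- c("a", "b", "c")

# Named numeric vector instead of a list
res <- setNames(foo(l1), l2)

print(res[["b"]])        # look up a single key
print(res[c("a", "c")])  # look up several keys at once

# Hypothetical function that is NOT vectorised: `if` needs a single value
foo_scalar <- function(x) if (x > 1) 5 * x else 0

# vapply calls it once per element and checks each result is one numeric value
res2 <- setNames(vapply(l1, foo_scalar, numeric(1)), l2)
print(res2)
```

Note that vapply still makes one function call per element, so it should be expected to perform closer to the lapply variant than to the fully vectorised one.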