如何从 try 函数产生错误的列表中删除那些元素?
How to remove those elements from a list for which the try function produced error?
我想做什么
我有一组车辆运动数据。因此,每个车辆ID重复多次。我正在使用 purrr::map()
函数分别为每辆车拟合一个 segmented::segmented()
回归模型。因为,该模型可能不适合每辆车(可能是由于给定车辆的数据点很少),它可能会产生错误。因此,我将 segmented()
函数包装在 try()
中。
数据
以下是我的data
的结构:
data = my_df %>% split(., .$per.Vehicle.ID2)
> str(data, max.level = 1)
List of 2
$ 3.544.534:Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 30 obs. of 4 variables:
$ 3.553.545:Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 72 obs. of 4 variables:
这是 data
再现性列表:
> dput(data)
structure(list(`3.544.534` = structure(list(per.Vehicle.ID2 = c("3.544.534",
"3.544.534", "3.544.534", "3.544.534", "3.544.534", "3.544.534",
"3.544.534", "3.544.534", "3.544.534", "3.544.534", "3.544.534",
"3.544.534", "3.544.534", "3.544.534", "3.544.534", "3.544.534",
"3.544.534", "3.544.534", "3.544.534", "3.544.534", "3.544.534",
"3.544.534", "3.544.534", "3.544.534", "3.544.534", "3.544.534",
"3.544.534", "3.544.534", "3.544.534", "3.544.534"), Time = c(307.1,
307.7, 308.3, 308.9, 309.5, 310.1, 310.7, 311.3, 311.9, 312.5,
313.1, 313.7, 314.3, 314.9, 315.5, 316.1, 316.7, 317.3, 317.9,
318.5, 319.1, 319.7, 320.3, 320.9, 321.5, 322.1, 322.7, 323.3,
323.9, 324.5), svel.mps_mean = c(NA, NA, NA, NA, NA, NA, NA,
12.7755159281222, 12.5036616661267, 12.2395719427147, 11.9923745340627,
11.7738694424139, 11.558525429244, 11.3191973673818, 11.0522994308264,
10.7788324802049, 10.5051145516082, 10.2349319889114, 9.97501528086885,
9.73263129457631, 9.49870470544252, 9.26388495185967, 9.03860711857004,
NA, NA, NA, NA, NA, NA, NA), dssvel = c(NA, NA, NA, NA, NA, NA,
NA, NA, NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, NA, NA,
NA, NA, NA, NA, NA)), .Names = c("per.Vehicle.ID2", "Time", "svel.mps_mean",
"dssvel"), row.names = c(NA, -30L), class = c("tbl_df", "tbl",
"data.frame")), `3.553.545` = structure(list(per.Vehicle.ID2 = c("3.553.545",
"3.553.545", "3.553.545", "3.553.545", "3.553.545", "3.553.545",
"3.553.545", "3.553.545", "3.553.545", "3.553.545", "3.553.545",
"3.553.545", "3.553.545", "3.553.545", "3.553.545", "3.553.545",
"3.553.545", "3.553.545", "3.553.545", "3.553.545", "3.553.545",
"3.553.545", "3.553.545", "3.553.545", "3.553.545", "3.553.545",
"3.553.545", "3.553.545", "3.553.545", "3.553.545", "3.553.545",
"3.553.545", "3.553.545", "3.553.545", "3.553.545", "3.553.545",
"3.553.545", "3.553.545", "3.553.545", "3.553.545", "3.553.545",
"3.553.545", "3.553.545", "3.553.545", "3.553.545", "3.553.545",
"3.553.545", "3.553.545", "3.553.545", "3.553.545", "3.553.545",
"3.553.545", "3.553.545", "3.553.545", "3.553.545", "3.553.545",
"3.553.545", "3.553.545", "3.553.545", "3.553.545", "3.553.545",
"3.553.545", "3.553.545", "3.553.545", "3.553.545", "3.553.545",
"3.553.545", "3.553.545", "3.553.545", "3.553.545", "3.553.545",
"3.553.545"), Time = c(262, 262.6, 263.2, 263.8, 264.4, 265,
265.6, 266.2, 266.8, 267.4, 268, 268.6, 269.2, 269.8, 270.4,
271, 271.6, 272.2, 272.8, 273.4, 274, 274.6, 275.2, 275.8, 276.4,
277, 277.6, 278.2, 278.8, 279.4, 280, 280.6, 281.2, 281.8, 282.4,
283, 283.6, 284.2, 284.8, 285.4, 286, 286.6, 287.2, 287.8, 288.4,
289, 289.6, 290.2, 290.8, 291.4, 292, 292.6, 293.2, 293.8, 294.4,
295, 295.6, 296.2, 296.8, 297.4, 298, 298.6, 299.2, 299.8, 300.4,
301, 301.6, 302.2, 302.8, 303.4, 304, 304.6), svel.mps_mean = c(NA,
NA, NA, NA, NA, NA, NA, 5.41298285821819, 5.48497881688925, 5.55898102091842,
5.63821570373546, 5.73023228642822, 5.84505407541773, 5.98954476445736,
6.1455976413909, 6.29775534569644, 6.4475118875263, 6.59939228553705,
6.75929997962276, 6.92825864041472, 7.10600376881863, 7.29418216320438,
7.48845217271764, 7.68381738580354, 7.87513283133227, 8.05995699864641,
8.21465371209303, 8.31097200556874, 8.3417386030748, 8.32304537754036,
8.26198297864187, 8.15886518084024, 8.02894718462323, 7.87911840872659,
7.71538338260088, 7.54358017038221, 7.36910128510413, 7.1920560779047,
7.00992171675244, 6.81783765068062, 6.61630770462671, 6.42117981828069,
6.24687579703188, 6.09559517163776, 5.96909261287346, 5.87826537515735,
5.83640038089119, 5.84922602270984, 5.9161965850754, 6.02778813388058,
6.18611611187481, 6.38709031522456, 6.61991746112876, 6.88184116355984,
7.1817199521547, 7.51057503223919, 7.8581088613562, 8.22211301486075,
8.60478211935657, 9.01154624501708, 9.42860178480699, 9.83720909606077,
10.2152191362441, 10.5568662978488, 10.8733070569773, NA, NA,
NA, NA, NA, NA, NA), dssvel = c(NA, NA, NA, NA, NA, NA, NA, NA,
NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, -2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, NA, NA, NA,
NA, NA, NA, NA)), .Names = c("per.Vehicle.ID2", "Time", "svel.mps_mean",
"dssvel"), row.names = c(NA, -72L), class = c("tbl_df", "tbl",
"data.frame"))), .Names = c("3.544.534", "3.553.545"))
函数
library(segmented)
segf2_1 <- function(df){
try(segmented(lm(svel.mps_mean ~ Time, data=df), seg.Z = ~Time,
psi = list(Time = df$Time[which(df$dssvel != 0)]),
control = seg.control(seed=1, n.boot = 50)
),
silent=TRUE)
}
在 data
上应用函数:
library(purrr)
model_subject = data %>% map(segf2_1)
它产生以下内容:
> str(model_subject, max.level = 1)
List of 2
$ 3.544.534:Class 'try-error' atomic [1:1] Error in Z <= PSI : non-conformable arrays
.. ..- attr(*, "condition")=List of 2
.. .. ..- attr(*, "class")= chr [1:3] "simpleError" "error" "condition"
$ 3.553.545:List of 23
..- attr(*, "class")= chr [1:2] "segmented" "lm"
You can see that the model does not fit for the 1st element of data
list. I want to remove it from both the model_subject
and data
lists.
正在删除有错误的元素:
我可以从 model_subject
中删除产生错误的元素,如下所示:
# Removing the vehicles which have error:
model_subject[grep("Error", model_subject)] <- NULL
问题:如何在从 model_subject
列表中删除相应元素后从 data
中删除它们? data
列表不包含 Error
,但由于模型不适合 data
的第一个元素,我想删除它
手动,我可以data[c(1)] <- NULL
。但我想自动执行此操作(尤其是在有多个 Error
元素的情况下)。请帮忙。
我们可以使用Filter
Filter(function(x) length(x) > 1, model_subject)
或 keep
来自 purrr
library(purrr)
keep(model_subject, ~ length(.x) > 1)
或者另一种选择是使用 class
创建一个逻辑向量
model_subject[sapply(model_subject, function(x) !"try-error" %in% class(x))]
或
model_subject[sapply(model_subject, function(x) !inherits(x, "try-error"))]
更新
如果 OP 已将这些错误元素分配给 NULL
model_subject[grep("Error", model_subject)] <- list(NULL)
然后,我们可以在 is.null
上使用 Filter
和 Negate
Filter(Negate(is.null), model_subject)
或
keep(model_subject, Negate(is.null))
如果我们需要得到一个逻辑索引
i1 <- !sapply(model_subject, is.null)
这可用于对 'data'
进行子集化
data[i1]
更新2
使用 OP 的可重现示例
data[i1]
#$`3.553.545`
# A tibble: 72 x 4
# per.Vehicle.ID2 Time svel.mps_mean dssvel
# <chr> <dbl> <dbl> <dbl>
# 1 3.553.545 262 NA NA
# 2 3.553.545 263 NA NA
# 3 3.553.545 263 NA NA
# 4 3.553.545 264 NA NA
# 5 3.553.545 264 NA NA
# 6 3.553.545 265 NA NA
# 7 3.553.545 266 NA NA
# 8 3.553.545 266 5.41 NA
# 9 3.553.545 267 5.48 NA
#10 3.553.545 267 5.56 0
# ... with 62 more rows
考虑使用 purrr
包中的 possibly()
。
在拟合分段回归时,很多事情都可能出错。如果您要执行以下操作,则不必使用 try()
并以尝试错误 类 结束。我没有测试过你的功能,但我正在对我自己的一些代码做同样的事情。
library(segmented)
# your function but without try(), and better formatted for readibility
segf2_1 <- function(df){
segmented(lm(svel.mps_mean ~ Time, data = df),
seg.Z = ~ Time,
psi = list(Time = df$Time[which(df$dssvel != 0)]),
control = seg.control(seed=1, n.boot = 50),
silent=TRUE)
}
library(purrr)
model_subject = map(data, possibly(segf2_1, otherwise = NULL))
我想做什么
我有一组车辆运动数据。因此,每个车辆ID重复多次。我正在使用 purrr::map()
函数分别为每辆车拟合一个 segmented::segmented()
回归模型。因为,该模型可能不适合每辆车(可能是由于给定车辆的数据点很少),它可能会产生错误。因此,我将 segmented()
函数包装在 try()
中。
数据
以下是我的data
的结构:
data = my_df %>% split(., .$per.Vehicle.ID2)
> str(data, max.level = 1)
List of 2
$ 3.544.534:Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 30 obs. of 4 variables:
$ 3.553.545:Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 72 obs. of 4 variables:
这是 data
再现性列表:
> dput(data)
structure(list(`3.544.534` = structure(list(per.Vehicle.ID2 = c("3.544.534",
"3.544.534", "3.544.534", "3.544.534", "3.544.534", "3.544.534",
"3.544.534", "3.544.534", "3.544.534", "3.544.534", "3.544.534",
"3.544.534", "3.544.534", "3.544.534", "3.544.534", "3.544.534",
"3.544.534", "3.544.534", "3.544.534", "3.544.534", "3.544.534",
"3.544.534", "3.544.534", "3.544.534", "3.544.534", "3.544.534",
"3.544.534", "3.544.534", "3.544.534", "3.544.534"), Time = c(307.1,
307.7, 308.3, 308.9, 309.5, 310.1, 310.7, 311.3, 311.9, 312.5,
313.1, 313.7, 314.3, 314.9, 315.5, 316.1, 316.7, 317.3, 317.9,
318.5, 319.1, 319.7, 320.3, 320.9, 321.5, 322.1, 322.7, 323.3,
323.9, 324.5), svel.mps_mean = c(NA, NA, NA, NA, NA, NA, NA,
12.7755159281222, 12.5036616661267, 12.2395719427147, 11.9923745340627,
11.7738694424139, 11.558525429244, 11.3191973673818, 11.0522994308264,
10.7788324802049, 10.5051145516082, 10.2349319889114, 9.97501528086885,
9.73263129457631, 9.49870470544252, 9.26388495185967, 9.03860711857004,
NA, NA, NA, NA, NA, NA, NA), dssvel = c(NA, NA, NA, NA, NA, NA,
NA, NA, NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, NA, NA,
NA, NA, NA, NA, NA)), .Names = c("per.Vehicle.ID2", "Time", "svel.mps_mean",
"dssvel"), row.names = c(NA, -30L), class = c("tbl_df", "tbl",
"data.frame")), `3.553.545` = structure(list(per.Vehicle.ID2 = c("3.553.545",
"3.553.545", "3.553.545", "3.553.545", "3.553.545", "3.553.545",
"3.553.545", "3.553.545", "3.553.545", "3.553.545", "3.553.545",
"3.553.545", "3.553.545", "3.553.545", "3.553.545", "3.553.545",
"3.553.545", "3.553.545", "3.553.545", "3.553.545", "3.553.545",
"3.553.545", "3.553.545", "3.553.545", "3.553.545", "3.553.545",
"3.553.545", "3.553.545", "3.553.545", "3.553.545", "3.553.545",
"3.553.545", "3.553.545", "3.553.545", "3.553.545", "3.553.545",
"3.553.545", "3.553.545", "3.553.545", "3.553.545", "3.553.545",
"3.553.545", "3.553.545", "3.553.545", "3.553.545", "3.553.545",
"3.553.545", "3.553.545", "3.553.545", "3.553.545", "3.553.545",
"3.553.545", "3.553.545", "3.553.545", "3.553.545", "3.553.545",
"3.553.545", "3.553.545", "3.553.545", "3.553.545", "3.553.545",
"3.553.545", "3.553.545", "3.553.545", "3.553.545", "3.553.545",
"3.553.545", "3.553.545", "3.553.545", "3.553.545", "3.553.545",
"3.553.545"), Time = c(262, 262.6, 263.2, 263.8, 264.4, 265,
265.6, 266.2, 266.8, 267.4, 268, 268.6, 269.2, 269.8, 270.4,
271, 271.6, 272.2, 272.8, 273.4, 274, 274.6, 275.2, 275.8, 276.4,
277, 277.6, 278.2, 278.8, 279.4, 280, 280.6, 281.2, 281.8, 282.4,
283, 283.6, 284.2, 284.8, 285.4, 286, 286.6, 287.2, 287.8, 288.4,
289, 289.6, 290.2, 290.8, 291.4, 292, 292.6, 293.2, 293.8, 294.4,
295, 295.6, 296.2, 296.8, 297.4, 298, 298.6, 299.2, 299.8, 300.4,
301, 301.6, 302.2, 302.8, 303.4, 304, 304.6), svel.mps_mean = c(NA,
NA, NA, NA, NA, NA, NA, 5.41298285821819, 5.48497881688925, 5.55898102091842,
5.63821570373546, 5.73023228642822, 5.84505407541773, 5.98954476445736,
6.1455976413909, 6.29775534569644, 6.4475118875263, 6.59939228553705,
6.75929997962276, 6.92825864041472, 7.10600376881863, 7.29418216320438,
7.48845217271764, 7.68381738580354, 7.87513283133227, 8.05995699864641,
8.21465371209303, 8.31097200556874, 8.3417386030748, 8.32304537754036,
8.26198297864187, 8.15886518084024, 8.02894718462323, 7.87911840872659,
7.71538338260088, 7.54358017038221, 7.36910128510413, 7.1920560779047,
7.00992171675244, 6.81783765068062, 6.61630770462671, 6.42117981828069,
6.24687579703188, 6.09559517163776, 5.96909261287346, 5.87826537515735,
5.83640038089119, 5.84922602270984, 5.9161965850754, 6.02778813388058,
6.18611611187481, 6.38709031522456, 6.61991746112876, 6.88184116355984,
7.1817199521547, 7.51057503223919, 7.8581088613562, 8.22211301486075,
8.60478211935657, 9.01154624501708, 9.42860178480699, 9.83720909606077,
10.2152191362441, 10.5568662978488, 10.8733070569773, NA, NA,
NA, NA, NA, NA, NA), dssvel = c(NA, NA, NA, NA, NA, NA, NA, NA,
NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, -2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, NA, NA, NA,
NA, NA, NA, NA)), .Names = c("per.Vehicle.ID2", "Time", "svel.mps_mean",
"dssvel"), row.names = c(NA, -72L), class = c("tbl_df", "tbl",
"data.frame"))), .Names = c("3.544.534", "3.553.545"))
函数
library(segmented)
segf2_1 <- function(df){
try(segmented(lm(svel.mps_mean ~ Time, data=df), seg.Z = ~Time,
psi = list(Time = df$Time[which(df$dssvel != 0)]),
control = seg.control(seed=1, n.boot = 50)
),
silent=TRUE)
}
在 data
上应用函数:
library(purrr)
model_subject = data %>% map(segf2_1)
它产生以下内容:
> str(model_subject, max.level = 1)
List of 2
$ 3.544.534:Class 'try-error' atomic [1:1] Error in Z <= PSI : non-conformable arrays
.. ..- attr(*, "condition")=List of 2
.. .. ..- attr(*, "class")= chr [1:3] "simpleError" "error" "condition"
$ 3.553.545:List of 23
..- attr(*, "class")= chr [1:2] "segmented" "lm"
You can see that the model does not fit for the 1st element of
data
list. I want to remove it from both themodel_subject
anddata
lists.
正在删除有错误的元素:
我可以从 model_subject
中删除产生错误的元素,如下所示:
# Removing the vehicles which have error:
model_subject[grep("Error", model_subject)] <- NULL
问题:如何在从 model_subject
列表中删除相应元素后从 data
中删除它们? data
列表不包含 Error
,但由于模型不适合 data
的第一个元素,我想删除它
手动,我可以data[c(1)] <- NULL
。但我想自动执行此操作(尤其是在有多个 Error
元素的情况下)。请帮忙。
我们可以使用Filter
Filter(function(x) length(x) > 1, model_subject)
或 keep
来自 purrr
library(purrr)
keep(model_subject, ~ length(.x) > 1)
或者另一种选择是使用 class
model_subject[sapply(model_subject, function(x) !"try-error" %in% class(x))]
或
model_subject[sapply(model_subject, function(x) !inherits(x, "try-error"))]
更新
如果 OP 已将这些错误元素分配给 NULL
model_subject[grep("Error", model_subject)] <- list(NULL)
然后,我们可以在 is.null
Filter
和 Negate
Filter(Negate(is.null), model_subject)
或
keep(model_subject, Negate(is.null))
如果我们需要得到一个逻辑索引
i1 <- !sapply(model_subject, is.null)
这可用于对 'data'
进行子集化data[i1]
更新2
使用 OP 的可重现示例
data[i1]
#$`3.553.545`
# A tibble: 72 x 4
# per.Vehicle.ID2 Time svel.mps_mean dssvel
# <chr> <dbl> <dbl> <dbl>
# 1 3.553.545 262 NA NA
# 2 3.553.545 263 NA NA
# 3 3.553.545 263 NA NA
# 4 3.553.545 264 NA NA
# 5 3.553.545 264 NA NA
# 6 3.553.545 265 NA NA
# 7 3.553.545 266 NA NA
# 8 3.553.545 266 5.41 NA
# 9 3.553.545 267 5.48 NA
#10 3.553.545 267 5.56 0
# ... with 62 more rows
考虑使用 purrr
包中的 possibly()
。
在拟合分段回归时,很多事情都可能出错。如果您要执行以下操作,则不必使用 try()
并以尝试错误 类 结束。我没有测试过你的功能,但我正在对我自己的一些代码做同样的事情。
library(segmented)
# your function but without try(), and better formatted for readibility
segf2_1 <- function(df){
segmented(lm(svel.mps_mean ~ Time, data = df),
seg.Z = ~ Time,
psi = list(Time = df$Time[which(df$dssvel != 0)]),
control = seg.control(seed=1, n.boot = 50),
silent=TRUE)
}
library(purrr)
model_subject = map(data, possibly(segf2_1, otherwise = NULL))