Find all columns in a data.frame with non-leading NA values
              AA    AFGE
2015-09-30    NA 22.9170
2015-12-31    NA 23.1427
2016-03-31    NA 23.9825
2016-06-30    NA 24.6085
2016-09-30    NA 25.0717
2016-12-31 28.08 23.5920
2017-03-31 34.40 25.0819
2017-06-30 32.65 26.1776
2017-09-30 46.62 25.8541
2017-12-31 53.87 26.2200
2018-03-31 44.96 25.8608
2018-06-30 46.88 25.8300
2018-09-30 40.40 25.2347
2018-12-31 26.58 25.3200
2019-03-31 28.16 25.5000
2019-06-30 23.41 25.7900
2019-09-30 20.07 25.3400
2019-12-31 21.51 25.3600
2020-03-31  6.16      NA
2020-06-30 11.24      NA
2020-09-30 11.63 27.2300
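I have a data frame with 6000 columns, most of which have leading NA values, like AA in the example. However, I want to find all columns that have non-leading NA values (whether a column also has leading NAs does not matter); see AFGE, which has NA values in rows 2020-03-31 and 2020-06-30.
It would be great if I could get the names of all columns with non-leading NA values, but a data frame of TRUE/FALSE flags marking the non-leading NA values would be even better.
In this case, the data frame would be FALSE for every value except the two NA values of AFGE in rows 2020-03-31 and 2020-06-30.
Do you need something like this?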
sapply(df, function(x) {
  # rle() splits is.na(x) into runs; an NA run is non-leading unless it is the first run
  with(rle(is.na(x)), rep(values & seq_along(values) != 1, lengths))
})
# AA AFGE
# [1,] FALSE FALSE
# [2,] FALSE FALSE
# [3,] FALSE FALSE
# [4,] FALSE FALSE
# [5,] FALSE FALSE
# [6,] FALSE FALSE
# [7,] FALSE FALSE
# [8,] FALSE FALSE
# [9,] FALSE FALSE
#[10,] FALSE FALSE
#[11,] FALSE FALSE
#[12,] FALSE FALSE
#[13,] FALSE FALSE
#[14,] FALSE FALSE
#[15,] FALSE FALSE
#[16,] FALSE FALSE
#[17,] FALSE FALSE
#[18,] FALSE FALSE
#[19,] FALSE TRUE
#[20,] FALSE TRUE
#[21,] FALSE FALSE
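The question also asks for just the names of the affected columns. A minimal sketch of one way to get them, reusing the same run-length idea: the sample data are rebuilt here so the snippet runs on its own, and the names `non_leading_na` and `cols` are illustrative, not part of the original answer.

```r
# Same sample values as in the post, rebuilt so the snippet is self-contained.
df <- data.frame(
  AA = c(rep(NA, 5), 28.08, 34.4, 32.65, 46.62, 53.87, 44.96, 46.88, 40.4,
         26.58, 28.16, 23.41, 20.07, 21.51, 6.16, 11.24, 11.63),
  AFGE = c(22.917, 23.1427, 23.9825, 24.6085, 25.0717, 23.592, 25.0819,
           26.1776, 25.8541, 26.22, 25.8608, 25.83, 25.2347, 25.32, 25.5,
           25.79, 25.34, 25.36, NA, NA, 27.23)
)

# Hypothetical helper: TRUE for every NA that is not in the first run of x.
non_leading_na <- function(x) {
  with(rle(is.na(x)), rep(values & seq_along(values) != 1, lengths))
}

# One logical per column: does it contain any non-leading NA?
has_gap <- vapply(df, function(x) any(non_leading_na(x)), logical(1))

cols <- names(df)[has_gap]
print(cols)
# [1] "AFGE"
```

Using `vapply()` with `any()` keeps only one logical per column, which may matter with 6000 columns; the full TRUE/FALSE matrix is only needed if the positions of the NAs are of interest.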
Data
df <- structure(list(AA = c(NA, NA, NA, NA, NA, 28.08, 34.4, 32.65,
46.62, 53.87, 44.96, 46.88, 40.4, 26.58, 28.16, 23.41, 20.07,
21.51, 6.16, 11.24, 11.63), AFGE = c(22.917, 23.1427, 23.9825,
24.6085, 25.0717, 23.592, 25.0819, 26.1776, 25.8541, 26.22, 25.8608,
25.83, 25.2347, 25.32, 25.5, 25.79, 25.34, 25.36, NA, NA, 27.23
)), class = "data.frame", row.names = c("2015-09-30", "2015-12-31",
"2016-03-31", "2016-06-30", "2016-09-30", "2016-12-31", "2017-03-31",
"2017-06-30", "2017-09-30", "2017-12-31", "2018-03-31", "2018-06-30",
"2018-09-30", "2018-12-31", "2019-03-31", "2019-06-30", "2019-09-30",
"2019-12-31", "2020-03-31", "2020-06-30", "2020-09-30"))
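If a data frame that keeps the original row names is preferred over the matrix that `sapply()` returns, one possible alternative (not from the original answer) is to treat an NA as non-leading when at least one non-NA value comes before it, which `cumsum()` can express directly. The sketch below uses a small toy data frame, `df_small`, with made-up row labels in the same shape as the post's data; all names in it are assumptions for illustration.

```r
# Toy data in the same shape as the post: AA has leading NAs, AFGE has a gap.
df_small <- data.frame(
  AA   = c(NA, NA, 28.08, 34.40, 32.65),
  AFGE = c(25.36, NA, NA, 27.23, 25.08),
  row.names = c("r1", "r2", "r3", "r4", "r5")  # hypothetical row labels
)

# cumsum(!is.na(x)) counts the non-NA values seen so far; an NA is
# non-leading once that count is above zero.
flags <- df_small
flags[] <- lapply(df_small, function(x) is.na(x) & cumsum(!is.na(x)) > 0)

print(flags)

# Names of the columns that contain at least one non-leading NA.
print(names(flags)[vapply(flags, any, logical(1))])
```

Assigning into `flags[]` replaces the column contents while leaving the data frame's row and column names in place.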