Randomly pick a value from a set of rows and add it to a new row below
My R skills aren't sufficient to solve this problem, so I'm hoping someone can help.
My data looks like this:
head(human.players,25)
Season | Episode | Round | Player | Player_type | Crowd_size | q1_a | q2_a | q3_a | q4_a | q5_a |
---|---|---|---|---|---|---|---|---|---|---|
2020 | 1 | 1 | 1 | 1 | 3 | 0 | 1 | 0 | 0 | NA |
2020 | 1 | 1 | 2 | 1 | 3 | 0 | 1 | 1 | 1 | NA |
2020 | 1 | 1 | 3 | 1 | 3 | 0 | 0 | 0 | 1 | NA |
2020 | 1 | 2 | 1 | 1 | 3 | 1 | 1 | 0 | 1 | NA |
2020 | 1 | 2 | 2 | 1 | 3 | 1 | 0 | 1 | 0 | NA |
2020 | 1 | 2 | 3 | 1 | 3 | 1 | 1 | 1 | 0 | NA |
2020 | 1 | 3 | 1 | 1 | 3 | 0 | 1 | 0 | 0 | NA |
2020 | 1 | 3 | 2 | 1 | 3 | 0 | 1 | 1 | 1 | NA |
2020 | 1 | 3 | 3 | 1 | 3 | 0 | 0 | 1 | 1 | NA |
2020 | 1 | 4 | 1 | 1 | 3 | 0 | 0 | 1 | 1 | NA |
2020 | 1 | 4 | 2 | 1 | 3 | 0 | 0 | 1 | 1 | NA |
2020 | 1 | 4 | 3 | 1 | 3 | 0 | 0 | 1 | 1 | NA |
2020 | 1 | 5 | 1 | 1 | 2 | 1 | 1 | 0 | 0 | NA |
2020 | 1 | 5 | 2 | 1 | 2 | 1 | 1 | 1 | 0 | NA |
2020 | 1 | 5 | 3 | 1 | 2 | NA | NA | NA | NA | NA |
2020 | 1 | 6 | 1 | 1 | 2 | 0 | 0 | 0 | 0 | NA |
2020 | 1 | 6 | 2 | 1 | 2 | 0 | 0 | 0 | 0 | NA |
2020 | 1 | 6 | 3 | 1 | 2 | NA | NA | NA | NA | NA |
2020 | 1 | 7 | 1 | 1 | 2 | 0 | 1 | 1 | 1 | NA |
2020 | 1 | 7 | 2 | 1 | 2 | 1 | 0 | 0 | 1 | NA |
2020 | 1 | 7 | 3 | 1 | 2 | NA | NA | NA | NA | NA |
2020 | 2 | 1 | 1 | 1 | 3 | 1 | 1 | 0 | 0 | NA |
2020 | 2 | 1 | 2 | 1 | 3 | 0 | 0 | 0 | 1 | NA |
2020 | 2 | 1 | 3 | 1 | 3 | 0 | 1 | 1 | 0 | NA |
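The variables q1_a:q5_a indicate whether a player answered a question incorrectly (0) or correctly (1). Each player plays in specific rounds (every episode has 7 rounds). In the first 4 rounds there are 3 players. In rounds 5-7, however, there are only 2 players (the eliminated player has NA; in episode 1, for example, this is player 3, see the table above).
I need to create a random player. That means that in the first 4 rounds I need to randomly select one answer from the three players in that round (for each of the 5 questions) and add the values as a "random player" row. For rounds 5 to 7, I need to select from the two remaining players' answers (ignoring the NAs) and add the values as a "random player" row.
The algorithm must look at round 1 (only those rows), draw one value from the three rows, paste it into round 1 (i.e., create a fourth row in this case), and do this for each of the 5 questions. Then move on to round 2, and so on.
This is what it would look like with player 4 (the random player) added: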
Season | Episode | Round | Player | Player_type | Crowd_size | q1_a | q2_a | q3_a | q4_a | q5_a |
---|---|---|---|---|---|---|---|---|---|---|
2020 | 1 | 1 | 1 | 1 | 3 | 0 | 1 | 0 | 0 | NA |
2020 | 1 | 1 | 2 | 1 | 3 | 0 | 1 | 1 | 1 | NA |
2020 | 1 | 1 | 3 | 1 | 3 | 0 | 0 | 0 | 1 | NA |
2020 | 1 | 1 | 4 | 1 | 3 | 0 | 0 | 1 | 1 | NA |
2020 | 1 | 2 | 1 | 1 | 3 | 1 | 1 | 0 | 1 | NA |
2020 | 1 | 2 | 2 | 1 | 3 | 1 | 0 | 1 | 0 | NA |
2020 | 1 | 2 | 3 | 1 | 3 | 1 | 1 | 1 | 0 | NA |
2020 | 1 | 2 | 4 | 1 | 3 | 1 | 1 | 1 | 0 | NA |
2020 | 1 | 3 | 1 | 1 | 3 | 0 | 1 | 0 | 0 | NA |
2020 | 1 | 3 | 2 | 1 | 3 | 0 | 1 | 1 | 1 | NA |
2020 | 1 | 3 | 3 | 1 | 3 | 0 | 0 | 1 | 1 | NA |
2020 | 1 | 3 | 4 | 1 | 3 | 0 | 0 | 0 | 1 | NA |
2020 | 1 | 4 | 1 | 1 | 3 | 0 | 0 | 1 | 1 | NA |
2020 | 1 | 4 | 2 | 1 | 3 | 0 | 0 | 1 | 1 | NA |
2020 | 1 | 4 | 3 | 1 | 3 | 0 | 0 | 1 | 1 | NA |
2020 | 1 | 4 | 4 | 1 | 3 | 0 | 0 | 1 | 1 | NA |
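While writing this, I realized it might be impossible, or at least very hard to do, so this question is more of a "Hail Mary". I assume some combination of sample() and apply(), plus a custom function, is necessary, but I'm stumped.
Here is a pipe that samples the new players and their scores into a separate frame, which you can then bind_rows back onto the original data.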
library(dplyr)
set.seed(2021)
newplayers <- dat %>%
filter(!is.na(q1_a)) %>%
group_by(Season, Episode, Round) %>%
summarize(across(everything(), ~ sample(., size=1)), .groups = "drop") %>%
mutate(Player = NA_integer_, Player_type = NA_integer_)
newplayers
# # A tibble: 8 x 11
# Season Episode Round Player Player_type Crowd_size q1_a q2_a q3_a q4_a q5_a
# <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <lgl>
# 1 2020 1 1 NA NA 3 0 0 1 1 NA
# 2 2020 1 2 NA NA 3 1 1 0 0 NA
# 3 2020 1 3 NA NA 3 0 1 1 0 NA
# 4 2020 1 4 NA NA 3 0 0 1 1 NA
# 5 2020 1 5 NA NA 2 1 1 1 0 NA
# 6 2020 1 6 NA NA 2 0 0 0 0 NA
# 7 2020 1 7 NA NA 2 0 0 1 1 NA
# 8 2020 2 1 NA NA 3 0 1 0 0 NA
bind_rows(dat, newplayers) %>%
arrange(Season, Episode, Round, is.na(Player), Player) %>%
head(.)
# Season Episode Round Player Player_type Crowd_size q1_a q2_a q3_a q4_a q5_a
# 1 2020 1 1 1 1 3 0 1 0 0 NA
# 2 2020 1 1 2 1 3 0 1 1 1 NA
# 3 2020 1 1 3 1 3 0 0 0 1 NA
# 4 2020 1 1 NA NA 3 0 0 1 1 NA
# 5 2020 1 2 1 1 3 1 1 0 1 NA
# 6 2020 1 2 2 1 3 1 0 1 0 NA
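One thing worth knowing about `sample(., size=1)` in the pipe above: when `sample()` is handed a single number `n >= 1`, base R draws from `1:n` rather than returning `n`. With the data shown every group keeps two or three rows after the filter, so this does not bite here, but it could if a round were ever left with a single non-NA row. A minimal sketch of the behaviour and an index-based workaround follows; `pick_one` is a hypothetical helper name, not something from the original answer.

```r
# Hypothetical helper: draw one element of x by position, so a
# length-one vector is returned unchanged.
pick_one <- function(x) x[sample.int(length(x), 1)]

set.seed(1)

# sample() on the single number 3 draws from 1:3, not from c(3).
draws_sample <- replicate(200, sample(3L, size = 1))

# pick_one() on the single number 3 can only ever return 3.
draws_pick <- replicate(200, pick_one(3L))

table(draws_sample)
table(draws_pick)
```

If that edge case matters for the full data set, `~ pick_one(.)` could stand in for `~ sample(., size=1)` inside `across()`.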
I didn't know what values to assign to the Player* columns, so I chose NA.
Data
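If the goal is to match the desired table in the question, where the random player appears as Player 4 with the same Player_type as the others, the same pipe can fill those columns directly. The sketch below is only one way to do it and rests on assumptions: `toy` is a made-up stand-in for `dat` with two questions, the random player is always numbered 4, and Player_type and Crowd_size are copied from the round rather than sampled.

```r
library(dplyr)

# Hypothetical stand-in for `dat`: round 1 has three players,
# round 5 has an eliminated player 3 (all-NA answers).
toy <- data.frame(
  Season = 2020L, Episode = 1L,
  Round = c(1L, 1L, 1L, 5L, 5L, 5L),
  Player = c(1L, 2L, 3L, 1L, 2L, 3L),
  Player_type = 1L,
  Crowd_size = c(3L, 3L, 3L, 2L, 2L, 2L),
  q1_a = c(0L, 0L, 0L, 1L, 1L, NA),
  q2_a = c(1L, 1L, 0L, 1L, 1L, NA)
)

set.seed(2021)
random_rows <- toy %>%
  filter(!is.na(q1_a)) %>%                      # drop eliminated players
  group_by(Season, Episode, Round) %>%
  summarize(
    # one independent draw per question column, by position
    across(starts_with("q"), ~ .x[sample.int(length(.x), 1)]),
    Player = 4L,                                # assumption: random player is always no. 4
    Player_type = first(Player_type),           # assumption: copy, as in the desired table
    Crowd_size = first(Crowd_size),
    .groups = "drop"
  )

result <- bind_rows(toy, random_rows) %>%
  arrange(Season, Episode, Round, Player)

result
```

Because Player is no longer NA, a plain `arrange(Season, Episode, Round, Player)` is enough to place the random player last within each round.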
# dput(dat)
dat <- structure(list(Season = c(2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L), Episode = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L), Round = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 6L, 7L, 7L, 7L, 1L, 1L, 1L), Player = c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L), Player_type = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), Crowd_size = c(3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L), q1_a = c(0L, 0L, 0L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, NA, 0L, 0L, NA, 0L, 1L, NA, 1L, 0L, 0L), q2_a = c(1L, 1L, 0L, 1L, 0L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 1L, 1L, NA, 0L, 0L, NA, 1L, 0L, NA, 1L, 0L, 1L), q3_a = c(0L, 1L, 0L, 0L, 1L, 1L, 0L, 1L, 1L, 1L, 1L, 1L, 0L, 1L, NA, 0L, 0L, NA, 1L, 0L, NA, 0L, 0L, 1L), q4_a = c(0L, 1L, 1L, 1L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 0L, 0L, NA, 0L, 0L, NA, 1L, 1L, NA, 0L, 1L, 0L), q5_a = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA)), class = "data.frame", row.names = c(NA, -24L))