How to reference another value that meets certain conditions, which can then be used in a calculation
I have two data frames:
Test
Group.1 x
1 25.5
2 51
3 51.5
4 50
5 51.5
6 60
...
53 35.5
Calendar
Week Hours HourSpent
1 8.5
1 8.5
1 0
2 8.5
2 8.5
2 8.5
2 8.5
2 8.5
2 6.5
2 8.5
3 7.0
3 7.0
3 8.2
...
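What I am trying to do is to populate the 'HourSpent' column in the Calendar df by doing the following calculation: (('Hours' / 'HourSpent') * 0.79)
I want to be able to loop through each row in the Calendar df and divide the row's 'Hours' value by the matching 'HourSpent' value. The 'HourSpent' value can be determined from the 'Test' df: if the value in the 'Week' column of the Calendar df matches a value in the 'Group.1' column of the 'Test' df, then I want the corresponding value in the 'x' column of the 'Test' df to be the 'HourSpent' value.
For example:
Row 1 in the Calendar df would be 8.5 / 25.5 * 0.79
...this would apply to the first 3 rows because the week number is 1. Then when we reach row 4, the calculation changes to 8.5 / 51 * 0.79
and so on.
Desired output - Calendar df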
Week Hours HourSpent
1 8.5 0.2633
1 8.5 0.2633
1 0 0
2 8.5 0.1317
2 8.5 0.1317
2 8.5 0.1317
2 8.5 0.1317
2 8.5 0.1317
2 6.5 0.1007
2 8.5 0.1317
3 7.0 0.1074
...
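As a quick check, the desired values above follow from Hours / x * 0.79 with the x values that Test lists for weeks 1 to 3. A small R sketch (the vector names are only illustrative and are not part of the original data frames):

```r
# x values for weeks 1, 2 and 3, copied from the Test table above
x_by_week <- c(25.5, 51, 51.5)

# A few (Week, Hours) pairs taken from the Calendar table
week  <- c(1, 1, 2, 2, 3)
hours <- c(8.5, 0, 8.5, 6.5, 7.0)

# The weeks here are 1, 2, 3, so they can index x_by_week directly
hour_spent <- hours / x_by_week[week] * 0.79
round(hour_spent, 4)
# [1] 0.2633 0.0000 0.1317 0.1007 0.1074
```

Direct indexing only works because the week numbers happen to equal the row positions; a general lookup has to match on the week value.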
Code attempted
for (i in 1:nrow(Calendar)){
Calendar$'HourSpent' <- ifelse(Calendar$Week == Test$Group.1,
(Calendar$Hours/Test$x)*0.79,
0)
}
The problem is that this only seems to work for one row and then everything else is 0. Is there a better solution to this problem?
Many thanks
library(dplyr)  # provides inner_join(), mutate() and the %>% pipe

Test <- data.frame(`Group.1` = c(1, 2, 3, 4), x = c(25.5, 51, 51.5, 50))
Calendar <- data.frame(Week = c(1, 1, 1, 2, 2, 2, 3, 3, 3), Hours = c(8.5, 8.5, 0, 8.5, 8.5, 8.5, 8.5, 8.5, 8.5))
# Attach the matching x to every Calendar row, then compute the ratio
Calendar <- dplyr::inner_join(Calendar, Test, by = c("Week" = "Group.1")) %>%
  dplyr::mutate(Hours_spent = (Hours/x)*0.79)
Output
Calendar
Week Hours x Hours_spent
1 1 8.5 25.5 0.2633333
2 1 8.5 25.5 0.2633333
3 1 0.0 25.5 0.0000000
4 2 8.5 51.0 0.1316667
5 2 8.5 51.0 0.1316667
6 2 8.5 51.0 0.1316667
7 3 8.5 51.5 0.1303883
8 3 8.5 51.5 0.1303883
9 3 8.5 51.5 0.1303883
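For context on why the ifelse() attempt in the question fills only a few rows: `Calendar$Week == Test$Group.1` compares the two columns position by position, recycling the shorter one, rather than looking each week up in Test. A minimal base R sketch on the small example data (the names `same_position` and `attempt` are illustrative; `suppressWarnings()` is used because the lengths 9 and 4 are not multiples of each other):

```r
Test <- data.frame(Group.1 = c(1, 2, 3, 4), x = c(25.5, 51, 51.5, 50))
Calendar <- data.frame(Week = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
                       Hours = c(8.5, 8.5, 0, 8.5, 8.5, 8.5, 8.5, 8.5, 8.5))

# Row i of Calendar is compared with Group.1 recycled as
# 1 2 3 4 1 2 3 4 1, not with the Test row for its own week
same_position <- suppressWarnings(Calendar$Week == Test$Group.1)
which(same_position)
# [1] 1 6 7

# The same recycling applies to Test$x inside ifelse(), so only the
# rows that happen to line up get a value; the rest fall back to 0
attempt <- suppressWarnings(
  ifelse(same_position, (Calendar$Hours / Test$x) * 0.79, 0)
)
which(attempt != 0)
# [1] 1 6 7
```

A join sidesteps this because it pairs rows by key value instead of by position.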
Base R solution:
Test <- data.frame(Group.1 = 1:4, x = runif(4)*100, stringsAsFactors = FALSE)
Calendar <- data.frame(Week = sort(sample(1:4, 10, replace = TRUE)), Hours = runif(10)*100, HourSpent = NA, stringsAsFactors = FALSE)
head(Test)
# Group.1 x
# 1 1 7.163006
# 2 2 55.743758
# 3 3 48.983705
# 4 4 49.429236
head(Calendar)
# Week Hours HourSpent
# 1 1 41.22831 NA
# 2 1 68.30103 NA
# 3 1 65.34278 NA
# 4 2 91.59863 NA
# 5 2 81.31131 NA
# 6 2 67.58900 NA
names(Test)[which(names(Test) == "Group.1")] <- "Week"
Calendar <- merge(Calendar, Test, by = "Week", all.x = TRUE)
Calendar$HourSpent <- ((Calendar$Hours/Calendar$x) * 0.79)
head(Calendar)
# Week Hours HourSpent x
# 1 1 41.22831 4.5470251 7.163006
# 2 1 68.30103 7.5328452 7.163006
# 3 1 65.34278 7.2065835 7.163006
# 4 2 91.59863 1.2981349 55.743758
# 5 2 81.31131 1.1523431 55.743758
# 6 2 67.58900 0.9578707 55.743758
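One point worth noting about `all.x = TRUE`: it keeps Calendar rows whose week has no entry in Test and leaves their x (and so HourSpent) as NA, whereas the default merge() drops those rows, like an inner join. A small sketch with made-up data (`test_df` and `calendar_df` are hypothetical, and week 9 is deliberately missing from `test_df`):

```r
# Hypothetical data: week 9 has no row in test_df
test_df <- data.frame(Week = c(1, 2), x = c(25.5, 51))
calendar_df <- data.frame(Week = c(1, 2, 9), Hours = c(8.5, 6.5, 8.5))

# Left join: every calendar_df row is kept
left <- merge(calendar_df, test_df, by = "Week", all.x = TRUE)
left$HourSpent <- left$Hours / left$x * 0.79

# Default (inner) merge: the unmatched week 9 row is dropped
inner <- merge(calendar_df, test_df, by = "Week")

nrow(left)             # 3
nrow(inner)            # 2
is.na(left$HourSpent)  # FALSE FALSE TRUE
```

Which behaviour is preferable depends on whether a week without a Test entry should show up as a missing value or disappear from the result.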
I think there is a typo in
What I am trying to do is to populate the 'HourSpent' column in the Calendar df by doing the following calculation: (('Hours' / 'HourSpent') * 0.79)
because, taken literally, that would require solving a problem of the form 0.79 * Hours - HourSpent^2 = 0.
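Edit:
Also, there is nothing wrong with using a for loop (especially if you are a beginner, although it can be slow on large datasets). If we flesh out its logic properly, this is what your for loop would look like: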
# Assumes Calendar already has a HourSpent column filled with NA
for(i in 1:nrow(Calendar)){
  for(j in 1:nrow(Test)){
    if(Calendar$Week[i] == Test$Group.1[j] && is.na(Calendar$HourSpent[i])){
      Calendar$HourSpent[i] <- ((Calendar$Hours[i]/Test$x[j]) * 0.79)
    }
  }
}
(Basic idea: if the Week value and the Group.1 value are equal, and the corresponding HourSpent entry has not been filled in yet, then compute HourSpent.)
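The nested loop can also be written as a single vectorized lookup with match(), which avoids the merge and keeps Calendar's row order. A sketch on the small example data, assuming each week appears at most once in Test; it runs the loop as well (on a copy named `loop_result`, an illustrative name) and checks that both give the same numbers:

```r
Test <- data.frame(Group.1 = c(1, 2, 3, 4), x = c(25.5, 51, 51.5, 50))
Calendar <- data.frame(Week = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
                       Hours = c(8.5, 8.5, 0, 8.5, 8.5, 8.5, 8.5, 8.5, 8.5),
                       HourSpent = NA_real_)

# Loop version, following the logic described above
loop_result <- Calendar
for (i in seq_len(nrow(loop_result))) {
  for (j in seq_len(nrow(Test))) {
    if (loop_result$Week[i] == Test$Group.1[j] && is.na(loop_result$HourSpent[i])) {
      loop_result$HourSpent[i] <- (loop_result$Hours[i] / Test$x[j]) * 0.79
    }
  }
}

# Vectorized version: match() returns, for each Week, the row of Test
# with the same Group.1 (NA when the week is absent from Test)
test_row <- match(Calendar$Week, Test$Group.1)
Calendar$HourSpent <- (Calendar$Hours / Test$x[test_row]) * 0.79

all.equal(loop_result$HourSpent, Calendar$HourSpent)
# [1] TRUE
```

The loop does nrow(Calendar) * nrow(Test) comparisons, while match() does one lookup per Calendar row, which is where the speed difference on large datasets comes from.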