Update column values by looking up values from another table
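I want to be able to find the ScoreLU value based on the df table. For example, the value 1.3730682 in DSCRpd should return the ScoreLU value of 60, because it is larger than 1.35 but less than the next value, 1.65.
On the other hand, the Leverage column needs to be looked up in descending order, i.e. the first value, 2.01, should return 60, because it is less than 2.5 but greater than the next value, 2.0.
df: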
DSCRpd Leverage TCB
1 1.3730682 2.010122 -1590099.11
2 1.0449597 2.680051 493370.85
3 1.0311141 4.790531 21594.63
4 1.3923007 3.279903 -499326.76
5 1.6443938 3.853003 988780.79
6 0.6265976 1.814359 1003736.73
7 2.1025253 4.412528 1245305.83
8 1.2872873 2.074424 -688305.83
9 0.5088294 2.504510 1406986.68
10 1.7794307 3.724905 1132513.33
ScoreLU:
Score DSCRpd Leverage TCB
1: 0 0.65 5.0 0
2: 10 0.80 4.5 100000
3: 20 0.95 4.0 250000
4: 30 1.10 3.5 500000
5: 40 1.20 3.0 850000
6: 50 1.26 2.5 1250000
7: 60 1.35 2.0 1700000
8: 70 1.65 1.5 2300000
9: 80 2.00 1.0 2900000
10: 90 2.30 0.5 3600000
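Yes, just like the VLOOKUP function in Excel, but able to handle both ascending and descending sort order. Please help.
I have a function that gets the values correctly... but how do I apply it to each column so that the results land in the right column, i.e. the DSCRpd scores should go into a column named DSCRpdScore?
This function looks at the data frame 'df' in column number CN and returns the appropriate value based on x.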
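# Requires dplyr for %>%, select() and filter().
# Note: the body reads the global dtScoreLU; the df argument is not used.
myFUN = function(df, x, CN){
  if (dtScoreLU[1,CN] <= median(dtScoreLU[,CN])){
    # Ascending column: take the largest break that is <= x
    myMax = max(dtScoreLU[(dtScoreLU[,CN] <= x),CN])
    return(dtScoreLU %>% select(Score) %>%
             filter(dtScoreLU[,CN] == myMax))
  } else {
    # Descending column: take the smallest break that is >= x
    myMin = min(dtScoreLU[as.vector(dtScoreLU[,CN] >= x),CN])
    return(dtScoreLU %>% select(Score) %>%
             filter(dtScoreLU[,CN] == myMin))
  }
}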
As I understand it, this looks like a good candidate for data.table's rolling join feature.
So I want to be able to find the ScoreLU value based on the df table.
For example the value of 1.3730682 in DSCRpd should return the ScoreLU
value of 60 because it is larger than 1.35 but less than the next
value of 1.65.
library(data.table)
ScoreLU[, .(DSCRpd, Score)][df, ,on = 'DSCRpd', roll = TRUE]
DSCRpd Score Leverage TCB
1: 1.3730682 60 2.010122 -1590099.11
2: 1.0449597 20 2.680051 493370.85
3: 1.0311141 20 4.790531 21594.63
4: 1.3923007 60 3.279903 -499326.76
5: 1.6443938 60 3.853003 988780.79
6: 0.6265976 NA 1.814359 1003736.73
7: 2.1025253 80 4.412528 1245305.83
8: 1.2872873 50 2.074424 -688305.83
9: 0.5088294 NA 2.504510 1406986.68
10: 1.7794307 70 3.724905 1132513.33
On the other hand for the Leverage column it needs to be in Desc order
i.e. the first value of 2.01 should return the value of 60 as it is
less than 2.5 but greater than the next value of 2.0.
ScoreLU[, .(Leverage, Score)][df, , on = 'Leverage', roll = TRUE]
Leverage Score DSCRpd TCB
1: 2.010122 60 1.3730682 -1590099.11
2: 2.680051 50 1.0449597 493370.85
3: 4.790531 10 1.0311141 21594.63
4: 3.279903 40 1.3923007 -499326.76
5: 3.853003 30 1.6443938 988780.79
6: 1.814359 70 0.6265976 1003736.73
7: 4.412528 20 2.1025253 1245305.83
8: 2.074424 60 1.2872873 -688305.83
9: 2.504510 50 0.5088294 1406986.68
10: 3.724905 30 1.7794307 1132513.33
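As a cross-check on what the rolling join is doing, here is a minimal base R sketch of the same "largest break not exceeding x" lookup. It is not part of the answer above: it swaps in `findInterval()` for the rolling join, and `lookup_score`, `dscr_brk` and `lev_brk` are names made up for illustration.

```r
# Scores and break points copied from ScoreLU
score    <- c(0, 10, 20, 30, 40, 50, 60, 70, 80, 90)
dscr_brk <- c(0.65, 0.8, 0.95, 1.1, 1.2, 1.26, 1.35, 1.65, 2, 2.3)  # ascending
lev_brk  <- c(5, 4.5, 4, 3.5, 3, 2.5, 2, 1.5, 1, 0.5)               # descending

# Hypothetical helper: score of the largest break that is <= x, NA if none
lookup_score <- function(x, breaks, score) {
  ord <- order(breaks)                 # findInterval() needs ascending breaks
  idx <- findInterval(x, breaks[ord])  # 0 when x is below every break
  c(NA, score[ord])[idx + 1L]          # shift by one so index 0 maps to NA
}

dscr_res <- lookup_score(c(1.3730682, 0.6265976), dscr_brk, score)
lev_res  <- lookup_score(c(2.010122, 4.790531), lev_brk, score)
```

Because the breaks are reordered inside the helper, the same function serves both the ascending DSCRpd column and the descending Leverage column.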
If you like, you can combine them:
ScoreLU[, .(Leverage, Score)][
ScoreLU[, .(DSCRpd, Score)][
df, ,on = 'DSCRpd', roll = TRUE
], , on = 'Leverage', roll = TRUE]
Leverage Score DSCRpd i.Score TCB
1: 2.010122 60 1.3730682 60 -1590099.11
2: 2.680051 50 1.0449597 20 493370.85
3: 4.790531 10 1.0311141 20 21594.63
4: 3.279903 40 1.3923007 60 -499326.76
5: 3.853003 30 1.6443938 60 988780.79
6: 1.814359 70 0.6265976 NA 1003736.73
7: 4.412528 20 2.1025253 80 1245305.83
8: 2.074424 60 1.2872873 50 -688305.83
9: 2.504510 50 0.5088294 NA 1406986.68
10: 3.724905 30 1.7794307 70 1132513.33
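The question asked for the scores to land in named columns such as DSCRpdScore. One possible way to do that, sketched here under the assumption that pulling only `Score` out of each rolling join and assigning it with `:=` is acceptable, is shown below. The column name `LeverageScore` is an assumption made by analogy with `DSCRpdScore`, and the tables are rebuilt with `data.table()` (TCB omitted) so the block runs on its own.

```r
library(data.table)

# Lookup table and data, rebuilt with data.table() so := can update by reference
ScoreLU <- data.table(
  Score    = c(0L, 10L, 20L, 30L, 40L, 50L, 60L, 70L, 80L, 90L),
  DSCRpd   = c(0.65, 0.8, 0.95, 1.1, 1.2, 1.26, 1.35, 1.65, 2, 2.3),
  Leverage = c(5, 4.5, 4, 3.5, 3, 2.5, 2, 1.5, 1, 0.5)
)
df <- data.table(
  DSCRpd   = c(1.3730682, 1.0449597, 1.0311141, 1.3923007, 1.6443938,
               0.6265976, 2.1025253, 1.2872873, 0.5088294, 1.7794307),
  Leverage = c(2.010122, 2.680051, 4.790531, 3.279903, 3.853003,
               1.814359, 4.412528, 2.074424, 2.50451, 3.724905)
)

# Each rolling join returns one Score per row of df, in df's row order
dscr_score <- ScoreLU[df, Score, on = "DSCRpd",   roll = TRUE]
lev_score  <- ScoreLU[df, Score, on = "Leverage", roll = TRUE]

# Add the scores to df by reference under explicit names
df[, DSCRpdScore   := dscr_score]
df[, LeverageScore := lev_score]
```

This avoids the `Score` / `i.Score` pair that the nested join produces, at the cost of running one join per scored column.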
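For values that fall beyond either end of the two Score lookups, you can specify the rollends argument as needed. If you have the time, I would also give ?data.table a read-through. It helps with getting started, because the syntax can be a bit opaque at times.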
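To make the rollends remark concrete, here is a small sketch using three DSCRpd values from df, two of which sit below the first break of 0.65. It assumes you would want such values to receive the lowest score rather than NA; whether that is the right business rule is for the asker to decide.

```r
library(data.table)

ScoreLU <- data.table(
  Score  = c(0L, 10L, 20L, 30L, 40L, 50L, 60L, 70L, 80L, 90L),
  DSCRpd = c(0.65, 0.8, 0.95, 1.1, 1.2, 1.26, 1.35, 1.65, 2, 2.3)
)
df <- data.table(DSCRpd = c(0.6265976, 0.5088294, 1.3730682))

# roll = TRUE alone: values below the first break are not matched (NA)
default_ends <- ScoreLU[df, Score, on = "DSCRpd", roll = TRUE]

# rollends = c(TRUE, TRUE) also rolls the first break backward,
# so values below 0.65 pick up the lowest score (0)
both_ends <- ScoreLU[df, Score, on = "DSCRpd", roll = TRUE,
                     rollends = c(TRUE, TRUE)]
```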
I am new to data.table myself, so anyone with more expertise is welcome to chime in.
Data
ScoreLU <- structure(list(Score = c(0L, 10L, 20L, 30L, 40L, 50L, 60L, 70L, 80L, 90L),
DSCRpd = c(0.65, 0.8, 0.95, 1.1, 1.2, 1.26, 1.35, 1.65, 2, 2.3),
Leverage = c(5, 4.5, 4, 3.5, 3, 2.5, 2, 1.5, 1, 0.5),
TCB = c(0L, 100000L, 250000L, 500000L, 850000L, 1250000L, 1700000L, 2300000L, 2900000L, 3600000L)),
.Names = c("Score", "DSCRpd", "Leverage", "TCB"), row.names = c(NA, -10L), class = c("data.table", "data.frame"))
df <- structure(list(DSCRpd = c(1.3730682, 1.0449597, 1.0311141, 1.3923007, 1.6443938, 0.6265976, 2.1025253, 1.2872873, 0.5088294, 1.7794307),
Leverage = c(2.010122, 2.680051, 4.790531, 3.279903, 3.853003, 1.814359, 4.412528, 2.074424, 2.50451, 3.724905),
TCB = c(-1590099.11, 493370.85, 21594.63, -499326.76, 988780.79, 1003736.73, 1245305.83, -688305.83, 1406986.68, 1132513.33)),
.Names = c("DSCRpd", "Leverage", "TCB"),
row.names = c(NA, -10L), class = c("data.table", "data.frame" ))