Is there a neater/simpler way to write this data.table R code?
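The STRATUM names in the OECD data are very long. For simplicity I use shortened names here, and I want to recode them into shorter, more precise labels, as in the code below.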
pisaMas[, SchoolType := ifelse(STRATUM == "National Secondary School", "Public",
                        ifelse(STRATUM == "Religious School", "Religious",
                        ifelse(STRATUM == "MOE Technical School", "Technical", 0)))]
pisaMas[,table(SchoolType)]
I would like to know whether there is a simpler way to do this with the data.table package.
Here is what I came up with after some thought.
#' First, create a function (rname.SchType) that maps each old name to its new name with an if / else if chain.
#' Backslashes inside R string literals must be escaped as \\.
rname.SchType <- function(x){
  if (is.na(x)) NA_character_
  else if (x == "MYS - stratum 01: MOE National Secondary School\\Other States") "Public"
  else if (x == "MYS - stratum 02: MOE Religious School\\Other States") "Religious"
  else if (x == "MYS - stratum 03: MOE Technical School\\Other States") "Technical"
  else if (x == "MYS - stratum 04: MOE Fully Residential School") "SBP"
  else if (x == "MYS - stratum 05: non-MOE MARA Junior Science College\\Other States") "MARA"
  else if (x == "MYS - stratum 06: non-MOE Other Schools\\Other States") "Private"
  else if (x == "MYS - stratum 07: Perlis non-“MOE Fully Residential Schools”") "Perlis Fully Residential"
  else if (x == "MYS - stratum 08: Wilayah Persekutuan Putrajaya non-“MOE Fully Residential Schools”") "Putrajaya Fully Residential"
  else if (x == "MYS - stratum 09: Wilayah Persekutuan Labuan non-“MOE Fully Residential Schools”") "Labuan Fully Residential"
  else NA_character_  # any other value: return NA rather than NULL
}
With the function I just created, I apply it inside data.table using base R's sapply. It takes a single line, which avoids cluttered code and looks simpler:
pisaMalaysia[,`:=`(jenisSekolah = sapply(STRATUM,rname.SchType))]
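As a side note on the sapply call above, here is a minimal sketch of the same per-element approach using vapply, which checks that every call returns exactly one character value. The function recode_one and the table toy are hypothetical stand-ins, shortened to two strata, and are not the real PISA data.

```r
library(data.table)

# Hypothetical scalar recoder, shortened to two strata for illustration.
recode_one <- function(x) {
  if (is.na(x)) NA_character_
  else if (x == "MYS - stratum 04: MOE Fully Residential School") "SBP"
  else if (x == "MYS - stratum 02: MOE Religious School\\Other States") "Religious"
  else NA_character_  # unmatched values fall through to NA
}

# Toy stand-in for pisaMalaysia (not the real data).
toy <- data.table(STRATUM = c(
  "MYS - stratum 04: MOE Fully Residential School",
  "MYS - stratum 02: MOE Religious School\\Other States",
  NA,
  "something else"
))

# vapply enforces one character value per element; USE.NAMES = FALSE
# drops the names that sapply would attach to a character input.
toy[, jenisSekolah := vapply(STRATUM, recode_one, character(1), USE.NAMES = FALSE)]
print(toy)
```

This still calls the function once per row, so it is not vectorized; it only makes the return type predictable.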
The current development version of data.table has a new function for exactly this situation, fcase (modeled on SQL's CASE WHEN):
pisaMas[ , SchoolType := fcase(
STRATUM == "National Secondary School", "Public",
STRATUM == "Religious School", "Religious",
STRATUM == "MOE Technical School", "Technical",
default = ''
)]
pisaMas[ , table(SchoolType)]
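To try fcase without the PISA data, here is a small self-contained sketch. It assumes an installed data.table that ships fcase (to my knowledge, version 1.13.0 or later); the table toy and the default label "Other" are made up for illustration.

```r
library(data.table)

# Toy stand-in for pisaMas (not the real data).
toy <- data.table(STRATUM = c(
  "National Secondary School",
  "Religious School",
  "MOE Technical School",
  "Some Other School"
))

# Conditions are checked in order; the first TRUE one decides the value.
# Rows matching no condition receive the default.
toy[, SchoolType := fcase(
  STRATUM == "National Secondary School", "Public",
  STRATUM == "Religious School",          "Religious",
  STRATUM == "MOE Technical School",      "Technical",
  default = "Other"
)]

print(toy[, table(SchoolType)])
```

All return values and the default must share one type, here character, which is stricter than the nested ifelse version where a numeric 0 is silently turned into the string "0".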
To get the development version, try
install.packages(
'data.table', type = 'source',repos = 'http://Rdatatable.github.io/data.table'
)
If the simple install does not work, see the Installation wiki for more details:
https://github.com/Rdatatable/data.table/wiki/Installation
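You can also solve this with a lookup table; see this Q&A for details:
I think I have finally found the answer to my question above! It overcomes the 'not vectorized' problem that @Roland mentioned, thank you sir! In my experience it is much faster, although it took me a few weeks to understand the concept and to find the right question online!
First, I created a new data.table with two columns: the original stratum name and the desired school type.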
# Backslashes inside R string literals must be escaped as \\.
lookUpStratum <- data.table(
  STRATUM = c("MYS - stratum 01: MOE National Secondary School\\Other States",
              "MYS - stratum 02: MOE Religious School\\Other States",
              "MYS - stratum 03: MOE Technical School\\Other States",
              "MYS - stratum 04: MOE Fully Residential School",
              "MYS - stratum 05: non-MOE MARA Junior Science College\\Other States",
              "MYS - stratum 06: non-MOE Other Schools\\Other States",
              "MYS - stratum 07: Perlis non-“MOE Fully Residential Schools”",
              "MYS - stratum 08: Wilayah Persekutuan Putrajaya non-“MOE Fully Residential Schools”",
              "MYS - stratum 09: Wilayah Persekutuan Labuan non-“MOE Fully Residential Schools”"),
  SCH.TYPE = c("Public",
               "Religious",
               "Technical",
               "SBP",
               "MARA",
               "Private",
               "Perlis Fully Residential",
               "Putrajaya Fully Residential",
               "Labuan Fully Residential"))
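The answer lies in setDT (which coerces lists and data.frames to data.table by reference).
The line below, which I found while reading, looks a bit long, but it solved my problem! Honestly, I understood this one first, before I understood the shorter code further down.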
setDT(pisaMalaysia)[,SCH.TYPE := lookUpStratum$SCH.TYPE[match(pisaMalaysia$STRATUM,lookUpStratum$STRATUM)]]
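To see what match is doing in that line, here is a small sketch on made-up data. The tables lookup and toy are hypothetical two-stratum stand-ins for lookUpStratum and pisaMalaysia.

```r
library(data.table)

# Hypothetical two-row lookup table.
lookup <- data.table(
  STRATUM  = c("MYS - stratum 04: MOE Fully Residential School",
               "MYS - stratum 02: MOE Religious School\\Other States"),
  SCH.TYPE = c("SBP", "Religious")
)

# Toy stand-in for the student data, with one value missing from the lookup.
toy <- data.table(STRATUM = c(
  "MYS - stratum 02: MOE Religious School\\Other States",
  "MYS - stratum 04: MOE Fully Residential School",
  "MYS - stratum 02: MOE Religious School\\Other States",
  "not in the lookup"
))

# match() returns, for each row of toy, the row number in lookup
# holding the same STRATUM (NA when there is no such row).
idx <- match(toy$STRATUM, lookup$STRATUM)
print(idx)

# Indexing the labels by those positions recodes the whole column at once.
toy[, SCH.TYPE := lookup$SCH.TYPE[idx]]
print(toy)
```

Because match and the indexing both work on whole vectors, no per-row function call is involved, which is where the speedup over sapply comes from.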
A few minutes later I finally managed to understand that code, and then wrote this shorter version:
setDT(pisaMalaysia)[lookUpStratum,SCH.TYPE1 := i.SCH.TYPE, on = c(STRATUM = "STRATUM")]
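The same idea as an update join, sketched on made-up data. The tables lookup and toy are hypothetical stand-ins; on = "STRATUM" is simply the short spelling of on = c(STRATUM = "STRATUM") when both tables use the same column name.

```r
library(data.table)

# Hypothetical two-row lookup table.
lookup <- data.table(
  STRATUM  = c("MYS - stratum 04: MOE Fully Residential School",
               "MYS - stratum 02: MOE Religious School\\Other States"),
  SCH.TYPE = c("SBP", "Religious")
)

# Toy stand-in for the student data, with one value missing from the lookup.
toy <- data.table(STRATUM = c(
  "MYS - stratum 02: MOE Religious School\\Other States",
  "MYS - stratum 04: MOE Fully Residential School",
  "MYS - stratum 02: MOE Religious School\\Other States",
  "not in the lookup"
))

# For every row of lookup, find the matching rows of toy and write the
# lookup's SCH.TYPE (referred to as i.SCH.TYPE) into them by reference.
# Rows of toy with no match in lookup are left as NA.
toy[lookup, SCH.TYPE1 := i.SCH.TYPE, on = "STRATUM"]
print(toy)
```

The i. prefix marks a column taken from the table in the i position (lookup here), so it cannot be confused with a column of toy that has the same name.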
I got both of these answers from the same post.
Check that everything is the same:
table(pisaMalaysia$SCH.TYPE)
table(pisaMalaysia$SCH.TYPE1)
#' original data
pisaMalaysia[,table(STRATUM)]
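Comparing tables by eye works, but the check can also be made programmatic. A minimal sketch on made-up data (lookup and toy are hypothetical stand-ins), comparing the two recoded columns element by element and counting rows the lookup failed to cover:

```r
library(data.table)

# Hypothetical two-row lookup table.
lookup <- data.table(
  STRATUM  = c("MYS - stratum 04: MOE Fully Residential School",
               "MYS - stratum 02: MOE Religious School\\Other States"),
  SCH.TYPE = c("SBP", "Religious")
)

# Toy stand-in for the student data, with one value missing from the lookup.
toy <- data.table(STRATUM = c(
  "MYS - stratum 02: MOE Religious School\\Other States",
  "MYS - stratum 04: MOE Fully Residential School",
  "not in the lookup"
))

# Recode twice: once with match(), once with an update join.
toy[, SCH.TYPE := lookup$SCH.TYPE[match(STRATUM, lookup$STRATUM)]]
toy[lookup, SCH.TYPE1 := i.SCH.TYPE, on = "STRATUM"]

# TRUE only if both columns agree in every row, including NAs.
same <- identical(toy$SCH.TYPE, toy$SCH.TYPE1)

# Number of rows whose STRATUM was not found in the lookup.
n_unmatched <- toy[is.na(SCH.TYPE), .N]

print(same)
print(n_unmatched)
```

A nonzero n_unmatched usually points to a typo or an unescaped character in one of the lookup strings.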
Result:
> table(pisaMalaysia$SCH.TYPE)
   Labuan Fully Residential                        MARA    Perlis Fully Residential
                         54                         122                          78
                    Private                      Public Putrajaya Fully Residential
                        385                        4929                          78
                  Religious                         SBP                   Technical
                        273                        2661                         281
> table(pisaMalaysia$SCH.TYPE1)
   Labuan Fully Residential                        MARA    Perlis Fully Residential
                         54                         122                          78
                    Private                      Public Putrajaya Fully Residential
                        385                        4929                          78
                  Religious                         SBP                   Technical
                        273                        2661                         281
> pisaMalaysia[,table(STRATUM)]
STRATUM
MYS - stratum 01: MOE National Secondary School\Other States
4929
MYS - stratum 02: MOE Religious School\Other States
273
MYS - stratum 03: MOE Technical School\Other States
281
MYS - stratum 04: MOE Fully Residential School
2661
MYS - stratum 05: non-MOE MARA Junior Science College\Other States
122
MYS - stratum 06: non-MOE Other Schools\Other States
385
MYS - stratum 07: Perlis non-“MOE Fully Residential Schools”
78
MYS - stratum 08: Wilayah Persekutuan Putrajaya non-“MOE Fully Residential Schools”
78
MYS - stratum 09: Wilayah Persekutuan Labuan non-“MOE Fully Residential Schools”
54
Thanks! I hope this helps others too.