R: Converting structure of a dataframe into the same structure of another dataframe
I currently have 2 large data frames with over 300,000 observations and more than 100 variables, but for simplicity let's assume I have df1:
> str(df1)
'data.frame': 3000 obs. of 3 variables:
$ Name : chr "AAA" "BBB" "CCC" "DDD" ...
$ DateTime : POSIXct, format: "2014-01-01 00:00:00" "2014-01-01 00:10:00" "2014-01-01 00:20:00" ...
$ Age : num 27 25 27 30 ...
df2:
> str(df2)
'data.frame': 3000 obs. of 3 variables:
$ HEX : Factor w/ 500 levels "AAA","BBB",..: 100 100 100 100 ...
$ DateTime : Factor w/ 3000 levels "2014-01-01 00:00:00",..: 1 2 3 4 5 6 7 8 9 10 ...
$ Age : Factor w/ 500 levels "27","25",..: 100 100 100 100 ...
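Both data frames hold the same values and have the same number of columns and rows; they differ only in structure, because every column in df2 is a factor.
I would like to convert the structure of df2 to match that of df1. Please advise, and thanks in advance.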
Assuming the columns of both data frames are in exactly the same order, as described, you can use the class function inside a Map call.
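df2[] <- Map(function(x, y) {
  x <- as.character(x)    # use the factor labels, not the integer codes
  if ("POSIXct" %in% y)
    as.POSIXct(x)         # keeps the time of day; parsed in the session time zone unless tz is given
  else if ("Date" %in% y)
    as.Date(x)
  else
    `class<-`(x, y)       # basic types such as character or numeric
}, df2, lapply(df1, class))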
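The `as.character()` step in the code above is doing real work. A minimal sketch of why, using a made-up factor `f` (a hypothetical name, not part of the question's data): converting a factor directly to numeric returns its internal level codes rather than the values it displays.

```r
# Hypothetical factor whose labels look like numbers
f <- factor(c("27", "25", "27", "30"))

# Direct conversion yields the integer level codes
# (levels are sorted as "25", "27", "30")
as.numeric(f)
# [1] 2 1 2 3

# Going through character first recovers the labelled values
as.numeric(as.character(f))
# [1] 27 25 27 30

# `class<-` on a character vector coerces the same way for basic types
`class<-`(as.character(f), "numeric")
# [1] 27 25 27 30
```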
Demonstration
Before
lapply(df1, class)
# $name
# [1] "character"
#
# $date
# [1] "POSIXct" "POSIXt"
#
# $age
# [1] "numeric"
#
# $date2
# [1] "Date"
lapply(df2, class)
# $HEX
# [1] "factor"
#
# $date
# [1] "factor"
#
# $age
# [1] "factor"
#
# $date2
# [1] "factor"
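With 100+ columns, reading two full class listings side by side gets unwieldy. One possible shortcut is to compare the class vectors pairwise and list only the positions that differ. The sketch below uses small stand-in data frames `a` and `b` (hypothetical names, not the question's data) and assumes, as above, that columns correspond by position.

```r
# Small stand-in data frames (hypothetical, not the question's data)
a <- data.frame(name = c("A", "B"), age = c(30, 27), stringsAsFactors = FALSE)
b <- data.frame(HEX = factor(c("A", "B")), age = c(30, 27))

# Compare class vectors column by column, by position
same_class <- unname(mapply(identical, lapply(a, class), lapply(b, class)))

# Positions of the columns that still need converting
which(!same_class)
# [1] 1
```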
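Conversion
df2[] <- Map(function(x, y) {
  x <- as.character(x)    # use the factor labels, not the integer codes
  if ("POSIXct" %in% y)
    as.POSIXct(x)         # keeps the time of day; parsed in the session time zone unless tz is given
  else if ("Date" %in% y)
    as.Date(x)
  else
    `class<-`(x, y)       # basic types such as character or numeric
}, df2, lapply(df1, class))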
After
lapply(df2, class)
# $HEX
# [1] "character"
#
# $date
# [1] "POSIXct" "POSIXt"
#
# $age
# [1] "numeric"
#
# $date2
# [1] "Date"
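Matching classes alone does not show that the values survived the conversion. A minimal sketch of such a check follows, using made-up miniature data frames `target` and `src` (hypothetical names) and assuming the timestamps are in UTC. It parses date-times with `as.POSIXct()` so that the time of day is kept.

```r
# Hypothetical miniature versions of df1 (target) and df2 (all factors)
target <- data.frame(
  name = c("A", "B"),
  when = as.POSIXct(c("2014-01-01 00:00:00", "2014-01-01 00:10:00"), tz = "UTC"),
  age  = c(27, 25),
  stringsAsFactors = FALSE
)
src <- data.frame(
  HEX  = factor(c("A", "B")),
  when = factor(c("2014-01-01 00:00:00", "2014-01-01 00:10:00")),
  age  = factor(c("27", "25"))
)

# Convert each column of src to the class of the matching column of target
src[] <- Map(function(x, cls) {
  x <- as.character(x)
  if ("POSIXct" %in% cls) {
    as.POSIXct(x, tz = "UTC")  # the time zone is an assumption
  } else if ("Date" %in% cls) {
    as.Date(x)
  } else {
    `class<-`(x, cls)
  }
}, src, lapply(target, class))

# Column names differ (HEX vs name), so compare the values without names
values_match <- isTRUE(all.equal(unname(as.list(src)), unname(as.list(target))))
values_match
# [1] TRUE
```

If the check returns FALSE on real data, calling `all.equal()` without the `isTRUE()` wrapper reports which components differ.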
Data
df1 <- structure(list(name = c("A", "B", "C", "D", "E"), date = structure(c(1577836800,
1580515200, 1583020800, 1585699200, 1588291200), class = c("POSIXct",
"POSIXt")), age = c(30, 27, 25, 28, 23), date2 = structure(c(18262,
18293, 18322, 18353, 18383), class = "Date")), row.names = c(NA,
-5L), class = "data.frame")
df2 <- structure(list(HEX = structure(1:5, .Label = c("A", "B", "C",
"D", "E"), class = "factor"), date = structure(1:5, .Label = c("2020-01-01 01:00:00",
"2020-02-01 01:00:00", "2020-03-01 01:00:00", "2020-04-01 02:00:00",
"2020-05-01 02:00:00"), class = "factor"), age = structure(c(5L,
3L, 2L, 4L, 1L), .Label = c("23", "25", "27", "28", "30"), class = "factor"),
date2 = structure(1:5, .Label = c("2020-01-01", "2020-02-01",
"2020-03-01", "2020-04-01", "2020-05-01"), class = "factor")), row.names = c(NA,
-5L), class = "data.frame")