Shift cells to left in R data.frame
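I have data in Excel that comes from an application which summarizes it into different tables. The data looks fine in Excel, but when I import it into R, some columns are skipped and the values end up misaligned. I need to tidy the data so that I can plot it.
Below is a reproducible sample.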
df <- data.frame(` ` = c("cars","buses","","under 1yr","1-2 yrs","2-5 yrs",">5 yrs"),
                 fcltA = c("1","5","","","","",""),
                 ` ` = c("","","fcltA","5","","","1"),
                 fcltB = c("6","","","","","",""),
                 ` ` = c("","","fcltB","3","","2","1"),
                 fcltC = c("2","2","","","","",""),
                 ` ` = c("","","fcltC","1","2","","1"),
                 check.names = FALSE, fix.empty.names = FALSE)
Below is what I want:
dfClnd <- data.frame(` ` = c("cars","buses","","under 1yr","1-2 yrs","2-5 yrs",">5 yrs"),
                     fcltA = c("1","5","fcltA","5","","","1"),
                     fcltB = c("6","","fcltB","3","","2","1"),
                     fcltC = c("2","2","fcltC","1","2","","1"),
                     check.names = FALSE, fix.empty.names = FALSE)
I found this question, but it does not solve my problem because it shifts some values into the wrong columns.
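A solution using dplyr and purrr. Note that I set stringsAsFactors = FALSE when creating df to avoid factor columns, because coalesce does not work on factors with different levels (since R 4.0.0, stringsAsFactors = FALSE is the default). df3 is the final output.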
library(dplyr)
library(purrr)

# Replace "" with NA so that coalesce() can skip the empty cells
df[df == ""] <- NA

# Names of the columns to keep (everything except the " " columns)
NewCol <- names(df)
NewCol <- NewCol[!NewCol %in% " "]

# Merge each named column with the unnamed column to its right
df2 <- map_dfc(NewCol, function(x) {
  df_temp <- df[which(names(df) %in% x) + c(0, 1)]
  # as_data_frame() is deprecated; tibble() builds the one-column result
  tibble(value = coalesce(df_temp[[1]], df_temp[[2]])) %>%
    setNames(x)
})

# Bind the first column to the merged columns
df3 <- bind_cols(df[, 1, drop = FALSE], df2)
# Replace NA with ""
df3[is.na(df3)] <- ""
df3
#             fcltA fcltB fcltC
# 1      cars     1     6     2
# 2     buses     5           2
# 3           fcltA fcltB fcltC
# 4 under 1yr     5     3     1
# 5   1-2 yrs                 2
# 6   2-5 yrs           2      
# 7    >5 yrs     1     1     1
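The merge step above depends on how coalesce treats NA. As a minimal sketch of that behavior, the vectors `left` and `right` below are made-up stand-ins for one named column and the unnamed column next to it; they are not taken from the data.

```r
library(dplyr)

# Hypothetical vectors: a named column and the unnamed column to its right,
# with empty cells already converted to NA
left  <- c("1", "5", NA, NA)
right <- c(NA, NA, "fcltA", "5")

# coalesce() returns, position by position, the first value that is not NA
merged <- coalesce(left, right)
merged
# [1] "1"     "5"     "fcltA" "5"
```

If both vectors held a value at the same position, the one from `left` would be kept, so this approach assumes that at most one cell of each pair is filled.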
Data
df <- data.frame(` ` = c("cars","buses","","under 1yr","1-2 yrs","2-5 yrs",">5 yrs"),
                 fcltA = c("1","5","","","","",""),
                 ` ` = c("","","fcltA","5","","","1"),
                 fcltB = c("6","","","","","",""),
                 ` ` = c("","","fcltB","3","","2","1"),
                 fcltC = c("2","2","","","","",""),
                 ` ` = c("","","fcltC","1","2","","1"),
                 check.names = FALSE, fix.empty.names = FALSE,
                 stringsAsFactors = FALSE)
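For comparison, the same pairing can probably be done in base R without converting "" to NA. This is only a sketch, not part of the answer above: the names `named`, `merged` and `out` are illustrative, and it assumes every named column is immediately followed by exactly one unnamed (" ") column, as in the sample data.

```r
# Same sample data as above
df <- data.frame(` ` = c("cars","buses","","under 1yr","1-2 yrs","2-5 yrs",">5 yrs"),
                 fcltA = c("1","5","","","","",""),
                 ` ` = c("","","fcltA","5","","","1"),
                 fcltB = c("6","","","","","",""),
                 ` ` = c("","","fcltB","3","","2","1"),
                 fcltC = c("2","2","","","","",""),
                 ` ` = c("","","fcltC","1","2","","1"),
                 check.names = FALSE, fix.empty.names = FALSE,
                 stringsAsFactors = FALSE)

# Positions of the named columns (assumption: each has one " " column to its right)
named <- which(names(df) != " ")

# For each pair, keep the named column's value unless it is empty,
# in which case take the value from the column to its right
merged <- lapply(named, function(i) {
  ifelse(df[[i]] == "", df[[i + 1]], df[[i]])
})
names(merged) <- names(df)[named]

# Start from the label column and add the merged columns by name
out <- df[1]
out[names(merged)] <- merged
out
```

Indexing by position (`df[[i]]`) rather than by name matters here, because several columns share the name " ".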