pivot_wider in tidyr creates list columns even when there is no duplicate or missing data
Here is my code:
# reading input file
library(readxl)
df_testing <- read_excel("Testing_Data.xlsx")
# Renaming the 1st column name for ease of use
colnames(df_testing)[1] = "Tag_No"
View(df_testing)
# creating a new data frame with columns from the row values
library(tidyr)
df_output = pivot_wider(df_testing, names_from = Tag_No, values_from = Reading)
# the output below has the expected shape, but the values come back as list columns
View(df_output)
# the code below is an attempted fix, but it replaces the last row's values with NA
# df_output = lapply(df_output, unlist)
# df_output = data.frame(lapply(df_output, `length<-`, max(lengths(df_output))))
# level count should be equal to no of columns created
length(levels(df_testing$Tag_No)) == ncol(df_output) - 3
# save output to a file; because the columns are list columns, the data cannot be written
write.csv(df_output, file = "Output File.csv")
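Here is the input data: file link 1
Here is a sample of the expected output: file link 2
Any changes to the code that make it work without losing data, or a complete solution, are welcome. Thanks in advance. If I have misunderstood how pivot_wider is meant to be used, please give me some pointers.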
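The problem is caused by NA values. There are 59 rows in which Tag No. is NA. Those rows all share the same NA key, so pivot_wider sees them as duplicates of one another.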
library(readxl)
library(tidyr)
library(dplyr)  # provides filter() and %>%
df_testing <- read_excel("Testing_Data.xlsx")
df_testing %>% filter(is.na(`Tag No.`))
# A tibble: 59 x 4
# `Tag No.` Reading Date Time
# <chr> <dbl> <dttm> <dttm>
# 1 NA NA NA NA
# 2 NA NA NA NA
# 3 NA NA NA NA
# 4 NA NA NA NA
# 5 NA NA NA NA
# 6 NA NA NA NA
# 7 NA NA NA NA
# 8 NA NA NA NA
# 9 NA NA NA NA
#10 NA NA NA NA
# … with 49 more rows
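To see which key combinations pivot_wider would treat as duplicates, one option is to count rows per combination of id columns and the names_from column. The sketch below uses a small made-up tibble named toy (hypothetical, not the asker's file) with a single Date id column; with the real data, Date and Time would presumably both go into count().

```r
library(dplyr)
library(tibble)

# Hypothetical toy data; column names mirror the question, values are made up.
toy <- tibble(
  `Tag No.` = c("A", "B", "A", NA, NA, NA),
  Reading   = c(1.5, 2.5, 3.5, NA, NA, NA),
  Date      = as.Date(c("2020-01-01", "2020-01-01", "2020-01-02", NA, NA, NA))
)

# Count rows per (id, name) combination and keep only the combinations
# that occur more than once; these are what force a list column.
dupes <- toy %>%
  count(Date, `Tag No.`, name = "n") %>%
  filter(n > 1)

dupes
```

Here the only repeated combination is the all-NA one, which matches the diagnosis above.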
Removing the NA rows avoids the list columns.
df_output <- pivot_wider(na.omit(df_testing), names_from = `Tag No.`, values_from = Reading)
df_output
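The effect can be reproduced without the Excel file. The following is a minimal sketch on a made-up tibble named toy (hypothetical data, assumed to resemble the input only in its column names): two rows that are entirely NA collide on the same id and name, and dropping them restores ordinary columns.

```r
library(tidyr)
library(tibble)

# Hypothetical toy data standing in for Testing_Data.xlsx.
toy <- tibble(
  `Tag No.` = c("A", "B", "A", "B", NA, NA),
  Reading   = c(1.5, 2.5, 3.5, 4.5, NA, NA),
  Date      = as.Date(c("2020-01-01", "2020-01-01",
                        "2020-01-02", "2020-01-02", NA, NA))
)

# The two all-NA rows share the same id (Date = NA) and the same name
# (`Tag No.` = NA), so their values cannot fit into a single cell.
# pivot_wider() warns and returns list columns for the pivoted values.
with_na <- suppressWarnings(
  pivot_wider(toy, names_from = `Tag No.`, values_from = Reading)
)
sapply(with_na, is.list)

# Dropping incomplete rows first leaves exactly one value per cell.
without_na <- pivot_wider(na.omit(toy), names_from = `Tag No.`, values_from = Reading)
sapply(without_na, is.list)
```

Note that na.omit() drops every row containing any NA, so it is only a safe choice when the incomplete rows carry no readings worth keeping, as appears to be the case here.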