Using reshape to change a data frame from long to wide in R: undefined columns error
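I am trying to make a long table wide while creating unique variables that keep the granular detail, i.e. combining each variable with the sequence variable (var1.seq1, var1.seq2, and so on).
reshape looked like my saviour, but I keep running into an "undefined columns selected" error.
N.B. For simplicity the sample data does not cover the full range of sequence numbers, but they do go up to 180.
A sample of the data is available on GitHub.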
reshape(df, idvar = "MergeEncounterRecno", timevar = "Sequenceno", direction = "wide")
Error in `[.data.frame`(data, , timevar) : undefined columns selected
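It looks like a typo: the column is named SequenceNo, not Sequenceno, and R column names are case-sensitive. Try this: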
#Code
dfres <- reshape(df, idvar = "MergeEncounterRecno", timevar = "SequenceNo", direction = "wide")
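To illustrate why the spelling matters, here is a minimal sketch on a made-up three-row data frame. The values are hypothetical; only the column names mirror the question. It checks the name before reshaping and shows the var.seq style column names that reshape produces.

```r
# Toy data (hypothetical values; column names mirror the question)
toy <- data.frame(
  MergeEncounterRecno = c(1, 1, 2),
  SequenceNo          = c(1, 2, 1),
  DiagnosisCode       = c(10, 20, 30)
)

# Column names are case-sensitive, so check the exact spelling first
"Sequenceno" %in% names(toy)   # FALSE
"SequenceNo" %in% names(toy)   # TRUE

# One row per encounter, one DiagnosisCode column per sequence number
wide <- reshape(toy, idvar = "MergeEncounterRecno",
                timevar = "SequenceNo", direction = "wide")
names(wide)
```

Encounters that lack a given sequence number get NA in the corresponding column.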
Here is an example using pivot_longer().
If you also want to pivot the NationalDiagnosis column, use the second pivot call, which converts the values to character (rather than numeric).
library(tidyverse)
df <- read_csv("https://raw.githubusercontent.com/Chazzer90/Whosebughelp2/main/SEQ_anom.csv")
#> Parsed with column specification:
#> cols(
#> `<ef>..MergeRecno` = col_double(),
#> MergeEncounterRecno = col_double(),
#> SequenceNo = col_double(),
#> DiagnosticSchemeCode = col_double(),
#> DiagnosisCode = col_double(),
#> DiagnosisSiteCode = col_character(),
#> NationalDiagnosisCode = col_double(),
#> NationalDiagnosis = col_character()
#> )
df %>%
  # DiagnosisSiteCode stores missing values as the string "NULL"; turn those into NA
  mutate(DiagnosisSiteCode = as.integer(ifelse(DiagnosisSiteCode == "NULL", NA, DiagnosisSiteCode))) %>%
  # Stack the numeric code columns into variables / Values pairs
  pivot_longer(cols = DiagnosticSchemeCode:NationalDiagnosisCode,
               names_to = 'variables', values_to = 'Values',
               values_drop_na = TRUE)
#> # A tibble: 134 x 6
#> `\xef..MergeRec~ MergeEncounterR~ SequenceNo NationalDiagnos~ variables
#> <dbl> <dbl> <dbl> <chr> <chr>
#> 1 402 545353 1 Muscle/tendon i~ Diagnost~
#> 2 402 545353 1 Muscle/tendon i~ Diagnosi~
#> 3 402 545353 1 Muscle/tendon i~ Diagnosi~
#> 4 402 545353 1 Muscle/tendon i~ National~
#> 5 758 261891 1 Cardiac conditi~ Diagnost~
#> 6 758 261891 1 Cardiac conditi~ Diagnosi~
#> 7 758 261891 1 Cardiac conditi~ National~
#> 8 894 941852 1 Respiratory con~ Diagnost~
#> 9 894 941852 1 Respiratory con~ Diagnosi~
#> 10 894 941852 1 Respiratory con~ Diagnosi~
#> # ... with 124 more rows, and 1 more variable: Values <dbl>
## To pivot the NationalDiagnosis column as well, convert every value to character
df %>%
  mutate(DiagnosisSiteCode = as.integer(ifelse(DiagnosisSiteCode == "NULL", NA, DiagnosisSiteCode))) %>%
  pivot_longer(cols = DiagnosticSchemeCode:NationalDiagnosis,
               names_to = 'variables', values_to = 'Values',
               values_drop_na = TRUE,
               values_transform = list(Values = as.character))
Created on 2020-10-21 by the reprex package (v0.3.0)
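Since the question asks for a long-to-wide reshape, a tidyr counterpart may also be useful. The following is only a sketch, not part of the original answer: it uses pivot_wider() on made-up data (the values are hypothetical, the column names follow the question) and sets names_sep = "." so the new columns follow the var.seq pattern.

```r
library(tidyr)

# Toy data (hypothetical values; column names mirror the question)
toy <- data.frame(
  MergeEncounterRecno = c(1, 1, 2),
  SequenceNo          = c(1, 2, 1),
  DiagnosisCode       = c(10, 20, 30),
  NationalDiagnosis   = c("a", "b", "c")
)

# One row per encounter; each value column is spread across sequence numbers
wide <- pivot_wider(
  toy,
  id_cols     = MergeEncounterRecno,
  names_from  = SequenceNo,
  values_from = c(DiagnosisCode, NationalDiagnosis),
  names_sep   = "."
)
names(wide)
```

Unlike the pivot_longer() calls above, this keeps each column's own type, so numeric codes and character descriptions can be widened in one step.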