Using reshape to change a data frame from long to wide in R: undefined columns error

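I am trying to turn a long table into a wide one while creating unique variables that preserve the granular detail, i.e. combining each variable with the sequence variable: var1.seq1, var1.seq2, and so on.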

reshape looked like my saviour, but I keep running into an "undefined columns selected" error.

N.B. For simplicity, the sample data does not cover the whole range of sequence numbers, but they do go up to 180.

A sample of the data is available on GitHub.


reshape(df, idvar = "MergeEncounterRecno", timevar = "Sequenceno", direction = "wide")

Error in [.data.frame(data, , timevar) : undefined columns selected

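This looks like a typo: the column is named SequenceNo, not Sequenceno, and column names in R are case-sensitive. Try this: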

#Code
dfres <- reshape(df, idvar = "MergeEncounterRecno", timevar = "SequenceNo", direction = "wide")
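
To make the cause of the error concrete, here is a minimal sketch on a made-up data frame. The values are invented for illustration; only the column names MergeEncounterRecno, SequenceNo and DiagnosisCode come from the question. It checks whether the name passed to timevar exists and then shows the column names that reshape produces in the wide result.

```r
# Toy data; the values are invented for illustration only.
toy <- data.frame(
  MergeEncounterRecno = c(1, 1, 2),
  SequenceNo          = c(1, 2, 1),
  DiagnosisCode       = c(10, 20, 30)
)

# Column names are case-sensitive, so the misspelt name is not found.
"Sequenceno" %in% names(toy)   # FALSE
"SequenceNo" %in% names(toy)   # TRUE

# With the correct name, every non-id column gets one copy per SequenceNo value.
wide <- reshape(toy,
                idvar     = "MergeEncounterRecno",
                timevar   = "SequenceNo",
                direction = "wide")

names(wide)
# "MergeEncounterRecno" "DiagnosisCode.1" "DiagnosisCode.2"
```

Encounters without a given sequence number get NA in the corresponding column, so with sequence numbers up to 180 the real result will be wide and mostly sparse.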

Here is an example using pivot_longer:

If you also want to pivot the NationalDiagnosis column, use the second pivot call, which converts the values to character (rather than numeric).

library(tidyverse)

df <- read_csv("https://raw.githubusercontent.com/Chazzer90/Whosebughelp2/main/SEQ_anom.csv")
#> Parsed with column specification:
#> cols(
#>   `<ef>..MergeRecno` = col_double(),
#>   MergeEncounterRecno = col_double(),
#>   SequenceNo = col_double(),
#>   DiagnosticSchemeCode = col_double(),
#>   DiagnosisCode = col_double(),
#>   DiagnosisSiteCode = col_character(),
#>   NationalDiagnosisCode = col_double(),
#>   NationalDiagnosis = col_character()
#> )

df %>% 
  mutate(DiagnosisSiteCode = as.integer(ifelse(DiagnosisSiteCode == "NULL", NA, DiagnosisSiteCode))) %>% 
  pivot_longer(cols = DiagnosticSchemeCode:NationalDiagnosisCode,
               names_to = 'variables', values_to = 'Values',
               values_drop_na = TRUE,
               names_ptypes = list(Values = integer()))
#> # A tibble: 134 x 6
#>    `\xef..MergeRec~ MergeEncounterR~ SequenceNo NationalDiagnos~ variables
#>               <dbl>            <dbl>      <dbl> <chr>            <chr>    
#>  1              402           545353          1 Muscle/tendon i~ Diagnost~
#>  2              402           545353          1 Muscle/tendon i~ Diagnosi~
#>  3              402           545353          1 Muscle/tendon i~ Diagnosi~
#>  4              402           545353          1 Muscle/tendon i~ National~
#>  5              758           261891          1 Cardiac conditi~ Diagnost~
#>  6              758           261891          1 Cardiac conditi~ Diagnosi~
#>  7              758           261891          1 Cardiac conditi~ National~
#>  8              894           941852          1 Respiratory con~ Diagnost~
#>  9              894           941852          1 Respiratory con~ Diagnosi~
#> 10              894           941852          1 Respiratory con~ Diagnosi~
#> # ... with 124 more rows, and 1 more variable: Values <dbl>

## if you also want to pivot the NationalDiagnosis column
df %>% 
  mutate(DiagnosisSiteCode = as.integer(ifelse(DiagnosisSiteCode == "NULL", NA, DiagnosisSiteCode))) %>% 
  pivot_longer(cols = DiagnosticSchemeCode:NationalDiagnosis,
               names_to = 'variables', values_to = 'Values',
               values_drop_na = TRUE,
               values_transform = list(Values = as.character))

Created on 2020-10-21 by the reprex package (v0.3.0)
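
Since the question asks for a wide table, the tidyr counterpart of reshape(direction = "wide") would be pivot_wider. The following is only a sketch on invented data, assuming tidyr 1.0.0 or later; the choice of names_sep = "." to mimic names such as DiagnosisCode.1 is an assumption, not something taken from the original answer.

```r
library(tidyr)

# Toy data; the values are invented for illustration only.
toy <- data.frame(
  MergeEncounterRecno = c(1, 1, 2),
  SequenceNo          = c(1, 2, 1),
  DiagnosisCode       = c(10, 20, 30),
  NationalDiagnosis   = c("a", "b", "c")
)

# One row per encounter; each value column is spread across SequenceNo.
wide <- pivot_wider(toy,
                    id_cols     = MergeEncounterRecno,
                    names_from  = SequenceNo,
                    values_from = c(DiagnosisCode, NationalDiagnosis),
                    names_sep   = ".")

names(wide)
```

Unlike the pivot_longer calls above, this keeps each column's own type, so numeric codes and character descriptions do not need to be converted to a common type first.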