Trying to maintain index when using gather

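I am trying to convert my data from wide to long, but for some reason the ID column does not show up after the conversion. This is what my data looks like: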

> head(dca)

# A tibble: 6 x 11
  ResponseId  Q9        Q10       Q11       Q12       Q13       Q14       Q15      Q16      Q17      Q18     
  <chr>       <chr>     <chr>     <chr>     <chr>     <chr>     <chr>     <chr>    <chr>    <chr>    <chr>   
1 "Response … "Regardi… "Regardi… "Regardi… "Regardi… "Regardi… "Regardi… "Regard… "Regard… "Regard… "Regard…
2 "{\"Import… "{\"Impo… "{\"Impo… "{\"Impo… "{\"Impo… "{\"Impo… "{\"Impo… "{\"Imp… "{\"Imp… "{\"Imp… "{\"Imp…
3 "R_2V7lrA7… "Using F… "Using T… "Using Y… "Using T… "Using T… "Using I… "Using … "Using … "Using … "Using …
4 "R_3nozPOT… "Using F… "Using T… "Using T… "Using T… "Using T… "Using I… "Using … "Using … "Using … "Using …
5 "R_2TB0Wwy… "Using Y… "Using T… "Using T… "Using T… "Using T… "Using F… "Using … "Using … "Using … "Using …
6 "R_2woFtS9… "Using Y… "Using T… "Using Y… "Using T… "Using I… "Using I… "Using … "Using … "Using … "Using …

After applying the following transformation:

library(tidyr)
keycol <- "ResponseId"
valuecol <- "Response"
gathercols <- c("Q9","Q10","Q11","Q12","Q13","Q14","Q15","Q16","Q17","Q18" )
dca_long<- gather_(dca,dca$ResponseId, keycol, valuecol, gathercols)

This is what I get:

> head(dca_long)
# A tibble: 6 x 2
  ResponseId 'Response`                                                               
  <chr>      <chr>                                                                                
1 Q9         "Regarding the use of social media, which of the following options would you prefer?"
2 Q9         "{\"ImportId\":\"QID12\"}"                                                           
3 Q9         "Using Facebook on PC for utility"                                                   
4 Q9         "Using Facebook on PC for utility"                                                   
5 Q9         "Using Youtube on mobile for entertainment"                                          
6 Q9         "Using Youtube on mobile for entertainment"      

Essentially, I want dca_long to contain a column whose values match the ResponseId values from dca. I am doing this so that I can then prepare dca for mlogit().

Someone in the comments asked for this output to better understand the code:

> dput(head(dca))
structure(list(ResponseId = c("Response ID", "{\"ImportId\":\"_recordId\"}", 
"R_2V7lrA7n29xU0i6", "R_3nozPOTbJBE1OBa", "R_2TB0WwyWCugTyEg", 
"R_2woFtS93jHyiv8F"), Q9 = c("Regarding the use of social media, which of the following options would you prefer?", 
"{\"ImportId\":\"QID12\"}", "Using Facebook on PC for utility", 
"Using Facebook on PC for utility", "Using Youtube on mobile for entertainment", 
"Using Youtube on mobile for entertainment"), Q10 = c("Regarding the use of social media, which of the following options would you prefer?", 
"{\"ImportId\":\"QID13\"}", "Using TikTok on PC for utility", 
"Using Twitter on mobile for entertainment", "Using Twitter on mobile for entertainment", 
"Using Twitter on mobile for entertainment"), Q11 = c("Regarding the use of social media, which of the following options would you prefer?", 
"{\"ImportId\":\"QID14\"}", "Using Youtube on PC for utility", 
"Using Twitter on mobile for entertainment", "Using Twitter on mobile for entertainment", 
"Using Youtube on PC for utility"), Q12 = c("Regarding the use of social media, which of the following options would you prefer?", 
"{\"ImportId\":\"QID15\"}", "Using Twitter on mobile for utility", 
"Using Twitter on mobile for utility", "Using Twitter on mobile for utility", 
"Using Twitter on mobile for utility"), Q13 = c("Regarding the use of social media, which of the following options would you prefer?", 
"{\"ImportId\":\"QID16\"}", "Using TikTok on Mobile for utility", 
"Using TikTok on Mobile for utility", "Using TikTok on Mobile for utility", 
"Using Instagram on PC for entertainment"), Q14 = c("Regarding the use of social media, which of the following options would you prefer?", 
"{\"ImportId\":\"QID17\"}", "Using Instagram on PC for utility", 
"Using Instagram on PC for utility", "Using Facebook on Mobile for entertainment", 
"Using Instagram on PC for utility"), Q15 = c("Regarding the use of social media, which of the following options would you prefer?", 
"{\"ImportId\":\"QID18\"}", "Using Twitter on mobile for entertainment", 
"Using Twitter on mobile for entertainment", "Using Twitter on mobile for entertainment", 
"Using Facebook on mobile for utility"), Q16 = c("Regarding the use of social media, which of the following options would you prefer?", 
"{\"ImportId\":\"QID19\"}", "Using Facebook on PC for entertainment", 
"Using Facebook on PC for entertainment", "Using Tiktok on mobile for utility", 
"Using Facebook on PC for entertainment"), Q17 = c("Regarding the use of social media, which of the following options would you prefer?", 
"{\"ImportId\":\"QID20\"}", "Using Youtube on PC for entertainment", 
"Using Youtube on PC for entertainment", "Using Instagram on mobile for utility", 
"Using Instagram on mobile for utility"), Q18 = c("Regarding the use of social media, which of the following options would you prefer?", 
"{\"ImportId\":\"QID21\"}", "Using Instagram on mobile for entertainment", 
"Using Youtube on PC for utility", "Using Instagram on mobile for entertainment", 
"Using Instagram on mobile for entertainment")), row.names = c(NA, 
-6L), class = c("tbl_df", "tbl", "data.frame"))

This example may help you solve the problem.

#simulated wide data for reproducibility
wide_data <- read.table(header=TRUE, text='
 subject sex time_1  time_2  time_3   
       1   M     15   16    23 
       2   F     25   20    48 
       3   F     30   25    55 
       4   M     35   32    60 
')

Using gather you should get something like the following.

library(tidyr)

# gather the three time_* columns; subject and sex are kept as id columns
gather(data = wide_data,
       key = alternative,
       value = time,
       c(time_1, time_2, time_3),
       factor_key = TRUE)


   subject sex alternative time
1        1   M      time_1   15
2        2   F      time_1   25
3        3   F      time_1   30
4        4   M      time_1   35
5        1   M      time_2   16
6        2   F      time_2   20
7        3   F      time_2   25
8        4   M      time_2   32
9        1   M      time_3   23
10       2   F      time_3   48
11       3   F      time_3   55
12       4   M      time_3   60
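
In current tidyr releases, gather() is documented as superseded by pivot_longer(). As a minimal sketch of the same reshape, assuming tidyr 1.0.0 or later is installed (the name long_data is only illustrative):

```r
library(tidyr)

# same simulated wide data as above
wide_data <- read.table(header = TRUE, text = '
 subject sex time_1 time_2 time_3
       1   M     15     16     23
       2   F     25     20     48
       3   F     30     25     55
       4   M     35     32     60
')

# columns not listed in `cols` (subject, sex) are kept as id columns
long_data <- pivot_longer(
  wide_data,
  cols = c(time_1, time_2, time_3),
  names_to = "alternative",
  values_to = "time"
)

print(long_data)
```

Note that pivot_longer() orders the result by original row (all three time points for subject 1 first), whereas gather() stacks one gathered column after another, so the row order differs from the output shown above even though the content is the same.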

If this does not help, please post a snippet of your data (dca) so the problem can be tracked down. Best!

[Edited]

Using the data you posted:

df <- structure(list(ResponseId = c("Response ID", "{\"ImportId\":\"_recordId\"}", 
"R_2V7lrA7n29xU0i6", "R_3nozPOTbJBE1OBa", "R_2TB0WwyWCugTyEg", 
"R_2woFtS93jHyiv8F"), Q9 = c("Regarding the use of social media, which of the following options would you prefer?", 
"{\"ImportId\":\"QID12\"}", "Using Facebook on PC for utility", 
"Using Facebook on PC for utility", "Using Youtube on mobile for entertainment", 
"Using Youtube on mobile for entertainment"), Q10 = c("Regarding the use of social media, which of the following options would you prefer?", 
"{\"ImportId\":\"QID13\"}", "Using TikTok on PC for utility", 
"Using Twitter on mobile for entertainment", "Using Twitter on mobile for entertainment", 
"Using Twitter on mobile for entertainment"), Q11 = c("Regarding the use of social media, which of the following options would you prefer?", 
"{\"ImportId\":\"QID14\"}", "Using Youtube on PC for utility", 
"Using Twitter on mobile for entertainment", "Using Twitter on mobile for entertainment", 
"Using Youtube on PC for utility"), Q12 = c("Regarding the use of social media, which of the following options would you prefer?", 
"{\"ImportId\":\"QID15\"}", "Using Twitter on mobile for utility", 
"Using Twitter on mobile for utility", "Using Twitter on mobile for utility", 
"Using Twitter on mobile for utility"), Q13 = c("Regarding the use of social media, which of the following options would you prefer?", 
"{\"ImportId\":\"QID16\"}", "Using TikTok on Mobile for utility", 
"Using TikTok on Mobile for utility", "Using TikTok on Mobile for utility", 
"Using Instagram on PC for entertainment"), Q14 = c("Regarding the use of social media, which of the following options would you prefer?", 
"{\"ImportId\":\"QID17\"}", "Using Instagram on PC for utility", 
"Using Instagram on PC for utility", "Using Facebook on Mobile for entertainment", 
"Using Instagram on PC for utility"), Q15 = c("Regarding the use of social media, which of the following options would you prefer?", 
"{\"ImportId\":\"QID18\"}", "Using Twitter on mobile for entertainment", 
"Using Twitter on mobile for entertainment", "Using Twitter on mobile for entertainment", 
"Using Facebook on mobile for utility"), Q16 = c("Regarding the use of social media, which of the following options would you prefer?", 
"{\"ImportId\":\"QID19\"}", "Using Facebook on PC for entertainment", 
"Using Facebook on PC for entertainment", "Using Tiktok on mobile for utility", 
"Using Facebook on PC for entertainment"), Q17 = c("Regarding the use of social media, which of the following options would you prefer?", 
"{\"ImportId\":\"QID20\"}", "Using Youtube on PC for entertainment", 
"Using Youtube on PC for entertainment", "Using Instagram on mobile for utility", 
"Using Instagram on mobile for utility"), Q18 = c("Regarding the use of social media, which of the following options would you prefer?", 
"{\"ImportId\":\"QID21\"}", "Using Instagram on mobile for entertainment", 
"Using Youtube on PC for utility", "Using Instagram on mobile for entertainment", 
"Using Instagram on mobile for entertainment")), row.names = c(NA, 
-6L), class = c("tbl_df", "tbl", "data.frame"))

And using part of the snippet I posted earlier:

df_long<- gather(data = df, 
       key = alternative,
       value = value_answer, 
       Q9:Q18, 
       factor_key=TRUE) 

You should be able to get something like the following, which keeps the ResponseId variable:

  ResponseId             alternative value_answer                                          
  <chr>                  <fct>       <chr>                                                 
1 "Response ID"          Q9          "Regarding the use of social media, which of the foll~
2 "{\"ImportId\":\"_rec~ Q9          "{\"ImportId\":\"QID12\"}"                            
3 "R_2V7lrA7n29xU0i6"    Q9          "Using Facebook on PC for utility"                    
4 "R_3nozPOTbJBE1OBa"    Q9          "Using Facebook on PC for utility"                    
5 "R_2TB0WwyWCugTyEg"    Q9          "Using Youtube on mobile for entertainment"           
6 "R_2woFtS93jHyiv8F"    Q9          "Using Youtube on mobile for entertainment"     

Hope this helps. Best!

Is this close to what you are trying to accomplish?

library(tidyr)
library(dplyr)

dca_long <- dca %>% 
  gather(key = "question",
         value = "response",
         2:ncol(.))

# # A tibble: 60 x 3
# ResponseId                     question response                                                                             
# <chr>                          <chr>    <chr>                                                                                
#  1 "Response ID"                  Q9       "Regarding the use of social media, which of the following options would you prefer?"
#  2 "{\"ImportId\":\"_recordId\"}" Q9       "{\"ImportId\":\"QID12\"}"                                                           
#  3 "R_2V7lrA7n29xU0i6"            Q9       "Using Facebook on PC for utility"                                                   
#  4 "R_3nozPOTbJBE1OBa"            Q9       "Using Facebook on PC for utility"                                                   
#  5 "R_2TB0WwyWCugTyEg"            Q9       "Using Youtube on mobile for entertainment"                                          
#  6 "R_2woFtS93jHyiv8F"            Q9       "Using Youtube on mobile for entertainment"                                          
#  7 "Response ID"                  Q10      "Regarding the use of social media, which of the following options would you prefer?"
#  8 "{\"ImportId\":\"_recordId\"}" Q10      "{\"ImportId\":\"QID13\"}"                                                           
#  9 "R_2V7lrA7n29xU0i6"            Q10      "Using TikTok on PC for utility"                                                     
# 10 "R_3nozPOTbJBE1OBa"            Q10      "Using Twitter on mobile for entertainment"                                          
# # … with 50 more rows

It looks like your data still needs cleaning.
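
As a rough sketch of what that cleaning could look like: in the posted data the first two rows appear to hold export metadata (the question wording and an ImportId entry) rather than actual responses. Assuming that is always the case for this file, they can be dropped by position before reshaping. The names dca_small and dca_clean_long are hypothetical, and dca_small is just a cut-down stand-in for dca:

```r
library(tidyr)
library(dplyr)

# cut-down stand-in for dca: two metadata rows followed by two real responses
dca_small <- data.frame(
  ResponseId = c("Response ID", "{\"ImportId\":\"_recordId\"}",
                 "R_2V7lrA7n29xU0i6", "R_3nozPOTbJBE1OBa"),
  Q9 = c("Regarding the use of social media, which of the following options would you prefer?",
         "{\"ImportId\":\"QID12\"}",
         "Using Facebook on PC for utility",
         "Using Facebook on PC for utility"),
  Q10 = c("Regarding the use of social media, which of the following options would you prefer?",
          "{\"ImportId\":\"QID13\"}",
          "Using TikTok on PC for utility",
          "Using Twitter on mobile for entertainment"),
  stringsAsFactors = FALSE
)

dca_clean_long <- dca_small %>%
  slice(-(1:2)) %>%                 # assumption: rows 1 and 2 are metadata, not responses
  gather(key = "question",
         value = "response",
         -ResponseId)               # gather everything except the id column

print(dca_clean_long)
```

If the metadata rows are not guaranteed to come first, filtering on a pattern in ResponseId would be a safer choice than dropping rows by position.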

Essentially, I want there to be a column in dca_long to have a column where the values of ResponseId from dca are matched.

all(unique(dca$ResponseId) == unique(dca_long$ResponseId))

# [1] TRUE
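
The comparison above relies on both unique() vectors having the same length and order. A slightly more defensive check, sketched here on a small hypothetical stand-in for dca (the names dca_small, ids_match, and one_row_per_question are illustrative), compares the IDs as sets and also confirms that every ID appears once per gathered question:

```r
library(tidyr)
library(dplyr)

# small hypothetical stand-in for dca
dca_small <- data.frame(
  ResponseId = c("R_2V7lrA7n29xU0i6", "R_3nozPOTbJBE1OBa"),
  Q9  = c("Using Facebook on PC for utility", "Using Facebook on PC for utility"),
  Q10 = c("Using TikTok on PC for utility", "Using Twitter on mobile for entertainment"),
  stringsAsFactors = FALSE
)

dca_long <- gather(dca_small, key = "question", value = "response", -ResponseId)

# same set of IDs before and after, regardless of order
ids_match <- setequal(dca_small$ResponseId, dca_long$ResponseId)

# each ID should occur once for every question column that was gathered
n_per_id <- count(dca_long, ResponseId)
one_row_per_question <- all(n_per_id$n == ncol(dca_small) - 1)

print(c(ids_match = ids_match, one_row_per_question = one_row_per_question))
```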