Trying to maintain index when using gather
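I am trying to convert my data from wide to long format, but for some reason the ID column does not show up after the conversion. This is what my data looks like: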
> head(dca)
# A tibble: 6 x 11
ResponseId Q9 Q10 Q11 Q12 Q13 Q14 Q15 Q16 Q17 Q18
<chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 "Response … "Regardi… "Regardi… "Regardi… "Regardi… "Regardi… "Regardi… "Regard… "Regard… "Regard… "Regard…
2 "{\"Import… "{\"Impo… "{\"Impo… "{\"Impo… "{\"Impo… "{\"Impo… "{\"Impo… "{\"Imp… "{\"Imp… "{\"Imp… "{\"Imp…
3 "R_2V7lrA7… "Using F… "Using T… "Using Y… "Using T… "Using T… "Using I… "Using … "Using … "Using … "Using …
4 "R_3nozPOT… "Using F… "Using T… "Using T… "Using T… "Using T… "Using I… "Using … "Using … "Using … "Using …
5 "R_2TB0Wwy… "Using Y… "Using T… "Using T… "Using T… "Using T… "Using F… "Using … "Using … "Using … "Using …
6 "R_2woFtS9… "Using Y… "Using T… "Using Y… "Using T… "Using I… "Using I… "Using … "Using … "Using … "Using …
After applying the following transformation:
library(tidyr)
keycol <- "ResponseId"
valuecol <- "Response"
gathercols <- c("Q9","Q10","Q11","Q12","Q13","Q14","Q15","Q16","Q17","Q18" )
dca_long<- gather_(dca,dca$ResponseId, keycol, valuecol, gathercols)
This is what I get:
> head(dca_long)
# A tibble: 6 x 2
ResponseId Response
<chr> <chr>
1 Q9 "Regarding the use of social media, which of the following options would you prefer?"
2 Q9 "{\"ImportId\":\"QID12\"}"
3 Q9 "Using Facebook on PC for utility"
4 Q9 "Using Facebook on PC for utility"
5 Q9 "Using Youtube on mobile for entertainment"
6 Q9 "Using Youtube on mobile for entertainment"
Essentially, I want dca_long to have a column whose values match the ResponseId values from dca. I am doing this so that I can then fit dca with mlogit().
Someone in the comments requested this output to better understand the code:
> dput(head(dca))
structure(list(ResponseId = c("Response ID", "{\"ImportId\":\"_recordId\"}",
"R_2V7lrA7n29xU0i6", "R_3nozPOTbJBE1OBa", "R_2TB0WwyWCugTyEg",
"R_2woFtS93jHyiv8F"), Q9 = c("Regarding the use of social media, which of the following options would you prefer?",
"{\"ImportId\":\"QID12\"}", "Using Facebook on PC for utility",
"Using Facebook on PC for utility", "Using Youtube on mobile for entertainment",
"Using Youtube on mobile for entertainment"), Q10 = c("Regarding the use of social media, which of the following options would you prefer?",
"{\"ImportId\":\"QID13\"}", "Using TikTok on PC for utility",
"Using Twitter on mobile for entertainment", "Using Twitter on mobile for entertainment",
"Using Twitter on mobile for entertainment"), Q11 = c("Regarding the use of social media, which of the following options would you prefer?",
"{\"ImportId\":\"QID14\"}", "Using Youtube on PC for utility",
"Using Twitter on mobile for entertainment", "Using Twitter on mobile for entertainment",
"Using Youtube on PC for utility"), Q12 = c("Regarding the use of social media, which of the following options would you prefer?",
"{\"ImportId\":\"QID15\"}", "Using Twitter on mobile for utility",
"Using Twitter on mobile for utility", "Using Twitter on mobile for utility",
"Using Twitter on mobile for utility"), Q13 = c("Regarding the use of social media, which of the following options would you prefer?",
"{\"ImportId\":\"QID16\"}", "Using TikTok on Mobile for utility",
"Using TikTok on Mobile for utility", "Using TikTok on Mobile for utility",
"Using Instagram on PC for entertainment"), Q14 = c("Regarding the use of social media, which of the following options would you prefer?",
"{\"ImportId\":\"QID17\"}", "Using Instagram on PC for utility",
"Using Instagram on PC for utility", "Using Facebook on Mobile for entertainment",
"Using Instagram on PC for utility"), Q15 = c("Regarding the use of social media, which of the following options would you prefer?",
"{\"ImportId\":\"QID18\"}", "Using Twitter on mobile for entertainment",
"Using Twitter on mobile for entertainment", "Using Twitter on mobile for entertainment",
"Using Facebook on mobile for utility"), Q16 = c("Regarding the use of social media, which of the following options would you prefer?",
"{\"ImportId\":\"QID19\"}", "Using Facebook on PC for entertainment",
"Using Facebook on PC for entertainment", "Using Tiktok on mobile for utility",
"Using Facebook on PC for entertainment"), Q17 = c("Regarding the use of social media, which of the following options would you prefer?",
"{\"ImportId\":\"QID20\"}", "Using Youtube on PC for entertainment",
"Using Youtube on PC for entertainment", "Using Instagram on mobile for utility",
"Using Instagram on mobile for utility"), Q18 = c("Regarding the use of social media, which of the following options would you prefer?",
"{\"ImportId\":\"QID21\"}", "Using Instagram on mobile for entertainment",
"Using Youtube on PC for utility", "Using Instagram on mobile for entertainment",
"Using Instagram on mobile for entertainment")), row.names = c(NA,
-6L), class = c("tbl_df", "tbl", "data.frame"))
This example might help you solve the problem.
#simulated wide data for reproducibility
wide_data <- read.table(header=TRUE, text='
subject sex time_1 time_2 time_3
1 M 15 16 23
2 F 25 20 48
3 F 30 25 55
4 M 35 32 60
')
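Using gather, you should get something like the following.
library(tidyr)
# gather only the time_* columns; subject and sex are kept as identifier columns
gather(data = wide_data,
       key = alternative,
       value = time,
       c(time_1, time_2, time_3),
       factor_key = TRUE)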
subject sex alternative time
1 1 M time_1 15
2 2 F time_1 25
3 3 F time_1 30
4 4 M time_1 35
5 1 M time_2 16
6 2 F time_2 20
7 3 F time_2 25
8 4 M time_2 32
9 1 M time_3 23
10 2 F time_3 48
11 3 F time_3 55
12 4 M time_3 60
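The same result can also be reached by naming the columns that should not be gathered. This is only a minimal sketch on the simulated data above; `long_data` is a name I made up for illustration. Any column left out of the selection stays in the output as an identifier column, which is the behavior needed to keep an ID such as ResponseId.

```r
library(tidyr)

# Same simulated wide data as above
wide_data <- read.table(header = TRUE, text = "
subject sex time_1 time_2 time_3
1 M 15 16 23
2 F 25 20 48
3 F 30 25 55
4 M 35 32 60
")

# Gather every column except the identifiers (subject, sex);
# the excluded columns are repeated alongside each gathered value
long_data <- gather(wide_data, key = alternative, value = time, -subject, -sex)

head(long_data)
```

Excluding the ID columns this way can be convenient when there are many measurement columns and only a few identifiers.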
If this does not help, please post a snippet of your data (dca) so the problem can be worked out. Best!
[Edited]
Using the data you posted:
df<- structure(list(ResponseId = c("Response ID", "{\"ImportId\":\"_recordId\"}",
"R_2V7lrA7n29xU0i6", "R_3nozPOTbJBE1OBa", "R_2TB0WwyWCugTyEg",
"R_2woFtS93jHyiv8F"), Q9 = c("Regarding the use of social media, which of the following options would you prefer?",
"{\"ImportId\":\"QID12\"}", "Using Facebook on PC for utility",
"Using Facebook on PC for utility", "Using Youtube on mobile for entertainment",
"Using Youtube on mobile for entertainment"), Q10 = c("Regarding the use of social media, which of the following options would you prefer?",
"{\"ImportId\":\"QID13\"}", "Using TikTok on PC for utility",
"Using Twitter on mobile for entertainment", "Using Twitter on mobile for entertainment",
"Using Twitter on mobile for entertainment"), Q11 = c("Regarding the use of social media, which of the following options would you prefer?",
"{\"ImportId\":\"QID14\"}", "Using Youtube on PC for utility",
"Using Twitter on mobile for entertainment", "Using Twitter on mobile for entertainment",
"Using Youtube on PC for utility"), Q12 = c("Regarding the use of social media, which of the following options would you prefer?",
"{\"ImportId\":\"QID15\"}", "Using Twitter on mobile for utility",
"Using Twitter on mobile for utility", "Using Twitter on mobile for utility",
"Using Twitter on mobile for utility"), Q13 = c("Regarding the use of social media, which of the following options would you prefer?",
"{\"ImportId\":\"QID16\"}", "Using TikTok on Mobile for utility",
"Using TikTok on Mobile for utility", "Using TikTok on Mobile for utility",
"Using Instagram on PC for entertainment"), Q14 = c("Regarding the use of social media, which of the following options would you prefer?",
"{\"ImportId\":\"QID17\"}", "Using Instagram on PC for utility",
"Using Instagram on PC for utility", "Using Facebook on Mobile for entertainment",
"Using Instagram on PC for utility"), Q15 = c("Regarding the use of social media, which of the following options would you prefer?",
"{\"ImportId\":\"QID18\"}", "Using Twitter on mobile for entertainment",
"Using Twitter on mobile for entertainment", "Using Twitter on mobile for entertainment",
"Using Facebook on mobile for utility"), Q16 = c("Regarding the use of social media, which of the following options would you prefer?",
"{\"ImportId\":\"QID19\"}", "Using Facebook on PC for entertainment",
"Using Facebook on PC for entertainment", "Using Tiktok on mobile for utility",
"Using Facebook on PC for entertainment"), Q17 = c("Regarding the use of social media, which of the following options would you prefer?",
"{\"ImportId\":\"QID20\"}", "Using Youtube on PC for entertainment",
"Using Youtube on PC for entertainment", "Using Instagram on mobile for utility",
"Using Instagram on mobile for utility"), Q18 = c("Regarding the use of social media, which of the following options would you prefer?",
"{\"ImportId\":\"QID21\"}", "Using Instagram on mobile for entertainment",
"Using Youtube on PC for utility", "Using Instagram on mobile for entertainment",
"Using Instagram on mobile for entertainment")), row.names = c(NA,
-6L), class = c("tbl_df", "tbl", "data.frame"))
And using part of the snippet I posted earlier:
df_long<- gather(data = df,
key = alternative,
value = value_answer,
Q9:Q18,
factor_key=TRUE)
You should be able to get something like the following, which keeps the ResponseId variable:
ResponseId alternative value_answer
<chr> <fct> <chr>
1 "Response ID" Q9 "Regarding the use of social media, which of the foll~
2 "{\"ImportId\":\"_rec~ Q9 "{\"ImportId\":\"QID12\"}"
3 "R_2V7lrA7n29xU0i6" Q9 "Using Facebook on PC for utility"
4 "R_3nozPOTbJBE1OBa" Q9 "Using Facebook on PC for utility"
5 "R_2TB0WwyWCugTyEg" Q9 "Using Youtube on mobile for entertainment"
6 "R_2woFtS93jHyiv8F" Q9 "Using Youtube on mobile for entertainment"
Hope this helps. Best!
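Since the question stores the column names in character variables (keycol, valuecol, gathercols), here is a hedged sketch of one way to keep that style using pivot_longer(), the newer tidyr function for the same wide-to-long reshaping. `dca_small` is a hypothetical two-question stand-in for dca, and the key name is changed to "question" here so that it does not collide with the existing ResponseId column. Note that pivot_longer() orders the rows by respondent rather than by question, so the row order differs from the gather() output above.

```r
library(tidyr)

# Hypothetical stand-in for dca with only two question columns
dca_small <- data.frame(
  ResponseId = c("R_2V7lrA7n29xU0i6", "R_3nozPOTbJBE1OBa"),
  Q9  = c("Using Facebook on PC for utility",
          "Using Facebook on PC for utility"),
  Q10 = c("Using TikTok on PC for utility",
          "Using Twitter on mobile for entertainment"),
  stringsAsFactors = FALSE
)

# Column names held as strings, as in the question
keycol     <- "question"   # assumption: renamed so it does not clash with ResponseId
valuecol   <- "response"
gathercols <- c("Q9", "Q10")

# all_of() selects columns from a character vector;
# ResponseId is not selected, so it is kept as an identifier column
dca_small_long <- pivot_longer(
  dca_small,
  cols      = tidyselect::all_of(gathercols),
  names_to  = keycol,
  values_to = valuecol
)

dca_small_long
```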
Is this close to what you are trying to accomplish?
library(tidyr)
library(dplyr)
dca_long <- dca %>%
  gather(key = "question",
         value = "response",
         2:ncol(.))
# # A tibble: 60 x 3
# ResponseId question response
# <chr> <chr> <chr>
# 1 "Response ID" Q9 "Regarding the use of social media, which of the following options would you prefer?"
# 2 "{\"ImportId\":\"_recordId\"}" Q9 "{\"ImportId\":\"QID12\"}"
# 3 "R_2V7lrA7n29xU0i6" Q9 "Using Facebook on PC for utility"
# 4 "R_3nozPOTbJBE1OBa" Q9 "Using Facebook on PC for utility"
# 5 "R_2TB0WwyWCugTyEg" Q9 "Using Youtube on mobile for entertainment"
# 6 "R_2woFtS93jHyiv8F" Q9 "Using Youtube on mobile for entertainment"
# 7 "Response ID" Q10 "Regarding the use of social media, which of the following options would you prefer?"
# 8 "{\"ImportId\":\"_recordId\"}" Q10 "{\"ImportId\":\"QID13\"}"
# 9 "R_2V7lrA7n29xU0i6" Q10 "Using TikTok on PC for utility"
# 10 "R_3nozPOTbJBE1OBa" Q10 "Using Twitter on mobile for entertainment"
# # … with 50 more rows
It looks like your data still needs cleaning.
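As a rough sketch of what that cleaning could look like: in the posted dput, the first two rows hold the question text and the ImportId entries rather than actual responses. Assuming that is true for the full data as well (I cannot verify that from here), those rows could be dropped before gathering. `dca_small`, `dca_clean`, and `dca_clean_long` are hypothetical names, and the data is a cut-down copy of the posted sample.

```r
library(tidyr)

# Cut-down copy of the posted sample: two metadata rows, two response rows
dca_small <- data.frame(
  ResponseId = c("Response ID", "{\"ImportId\":\"_recordId\"}",
                 "R_2V7lrA7n29xU0i6", "R_3nozPOTbJBE1OBa"),
  Q9  = c("Regarding the use of social media, which of the following options would you prefer?",
          "{\"ImportId\":\"QID12\"}",
          "Using Facebook on PC for utility",
          "Using Facebook on PC for utility"),
  Q10 = c("Regarding the use of social media, which of the following options would you prefer?",
          "{\"ImportId\":\"QID13\"}",
          "Using TikTok on PC for utility",
          "Using Twitter on mobile for entertainment"),
  stringsAsFactors = FALSE
)

# Assumption: rows 1 and 2 are export metadata, not responses
dca_clean <- dca_small[-(1:2), ]

# Reshape the remaining rows; ResponseId is kept because it is not gathered
dca_clean_long <- gather(dca_clean, key = "question", value = "response", Q9:Q10)

dca_clean_long
```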
Essentially, I want there to be a column in dca_long to have a column where the values of ResponseId from dca are matched.
all(unique(dca$ResponseId) == unique(dca_long$ResponseId))
# [1] TRUE
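The comparison above depends on the IDs appearing in the same order in both objects. A slightly more order-independent check, sketched here on a hypothetical two-question stand-in (`dca_small`), is to compare the sets of IDs and to count how many long rows each ID has; every ID should appear once per gathered question.

```r
library(tidyr)

# Hypothetical stand-in for dca
dca_small <- data.frame(
  ResponseId = c("R_2V7lrA7n29xU0i6", "R_3nozPOTbJBE1OBa"),
  Q9  = c("Using Facebook on PC for utility",
          "Using Facebook on PC for utility"),
  Q10 = c("Using TikTok on PC for utility",
          "Using Twitter on mobile for entertainment"),
  stringsAsFactors = FALSE
)

dca_small_long <- gather(dca_small, key = "question", value = "response", Q9:Q10)

# Same set of IDs in wide and long data, regardless of order
ids_match <- setequal(dca_small$ResponseId, dca_small_long$ResponseId)

# Number of long rows per ID; here each ID should have 2 (one per question)
rows_per_id <- table(dca_small_long$ResponseId)

ids_match
rows_per_id
```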