Formatting data for mlogit
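I am struggling to get my dataset into shape for a multinomial logit analysis with mlogit. My dataset is available from the URL in the code below.
I get the following error: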
Error in `row.names<-.data.frame`(`*tmp*`, value = c("1.Accessible", "1.Accessible", : duplicate 'row.names' are not allowed
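I have looked elsewhere, and this problem seems to come up for other people too. I tried using the alt.levels argument instead of alt.var, but that does not work.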
# Load packages
library(RCurl)
library(mlogit)
library(tidyr)
library(dplyr)
# URL where data is stored
dat.url <- 'https://raw.githubusercontent.com/sjkiss/Survey/master/mlogit.out.csv'
# Get data
dat <- read.csv(dat.url)
# Complete cases only, as it seems mlogit cannot handle missing values or tied data, which in this case you might get because of median imputation
dat <- dat[complete.cases(dat), ]
# Tidy data to get it into long format
dat.out <- dat %>%
  gather(Open, Rank, -c(1, 9:12))
# Try to replicate code on pp. 26-27 of http://cran.r-project.org/web/packages/mlogit/vignettes/mlogit.pdf
mlogit.out <- mlogit.data(dat.out, shape = 'long', alt.var = 'Open', choice = 'Rank', id.var = 'X', ranked = TRUE)
# Try this option as per a discussion on stackexchange
mlogit.out <- mlogit.data(dat.out, shape = 'long', alt.levels = 'Open', choice = 'Rank', id.var = 'X', ranked = TRUE)
My suggestion is to try the multinom() function from the nnet package. It does not require the special data format that mlogit does.
library(RCurl)
library(nnet)
# Download the CSV as text, then parse it
Data <- getURL("https://raw.githubusercontent.com/sjkiss/Survey/master/mlogit.out.csv")
Data <- read.csv(text = Data, header = TRUE)
Data <- na.omit(Data) # Get rid of NAs
Data <- as.data.frame(Data)
# Relevel the dependent variable (must be a factor)
Data$Job <- factor(Data$Job)
# Using "Online blogger" as the reference; substitute with your choice
Data$Job <- relevel(Data$Job, ref = "Online blogger")
# Run the multinomial logistic regression
# (seems like an awful lot of variables, by the way)
# Store the model in its own object so the data frame is not overwritten
fit <- multinom(formula = Job ~ Accessible + Information + Responsive + Debate + Officials + Social + Trade.Offs + economic + gender + age, data = Data)
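# Uses dat from the question's code
# Sort the long data so that each respondent's rows sit together
dat.out <- dat %>%
  gather(Open, Rank, -c(1, 9:12)) %>%
  arrange(X, Open, Rank)
# Pass X as chid.var (the choice-situation identifier) rather than id.var
mlogit.out <- mlogit.data(dat.out, shape = 'long', alt.var = 'Open', choice = 'Rank', ranked = TRUE, chid.var = 'X')
head(mlogit.out)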
              X economic gender  age                     Job        Open  Rank
1.Accessible  1        5   Male 1970 Professional journalist  Accessible FALSE
1.Information 1        5   Male 1970 Professional journalist Information FALSE
1.Responsive  1        5   Male 1970 Professional journalist  Responsive  TRUE
1.Debate      1        5   Male 1970 Professional journalist      Debate FALSE
1.Officials   1        5   Male 1970 Professional journalist   Officials FALSE
1.Social      1        5   Male 1970 Professional journalist      Social FALSE
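
For anyone wondering where the duplicate "1.Accessible" names in the original error might come from, here is a minimal base-R sketch. It rests on an assumption that is not taken from the mlogit source: that when no choice-situation identifier is supplied, rows are labelled by position, so every block of J consecutive rows is treated as one respondent. The toy data frame and the names positional_id, positional_names and explicit_names are made up for illustration; this is not the survey data.

```r
# Toy long-format data stacked by alternative, which is the row order
# gather() produces (hypothetical values, not the survey data)
toy <- data.frame(
  X    = rep(1:3, times = 2),                       # respondent id
  Open = rep(c("Accessible", "Debate"), each = 3),  # alternative
  Rank = c(1, 2, 1, 2, 1, 2)
)

# ASSUMPTION: positional labelling, where each block of J consecutive
# rows is taken to be one choice situation
J <- length(unique(toy$Open))
positional_id    <- rep(seq_len(nrow(toy) / J), each = J)
positional_names <- paste(positional_id, toy$Open, sep = ".")
anyDuplicated(positional_names) > 0   # TRUE: "1.Accessible" appears twice

# Labelling each row by the real respondent id gives unique names
explicit_names <- paste(toy$X, toy$Open, sep = ".")
anyDuplicated(explicit_names) > 0     # FALSE
```

If that assumption holds, it would explain why sorting by X and supplying X as the identifier makes the error go away: each id and alternative pair then occurs exactly once.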