Tune SVM in R - Dependent variable has wrong type
I am using svm from the e1071 package on a dataset like this:
sdewey <- svm(x = as.matrix(trainS),
              y = trainingSmall$DEWEY,
              type = "C-classification")
That works fine, but when I try to tune the cost and gamma parameters like this:
svm_tune <- tune(svm, train.x=as.matrix(trainS), train.y=trainingSmall$DEWEY, type="C-classification", ranges=list(cost=10^(-1:6), gamma=1^(-1:1)))
I get this error:
Error in tune(svm, train.x = as.matrix(trainS), train.y =
trainingSmall$DEWEY, : Dependent variable has wrong type!
The structure of my training data looks like this, only with many more lines:
'data.frame': 1000 obs. of 1542 variables:
$ women.prisoners : int 1 0 0 0 0 0 0 0 0 0 ...
$ reformatories.for.women : int 1 0 0 0 0 0 0 0 0 0 ...
$ women : int 1 0 0 0 0 0 0 0 0 0 ...
$ criminal.justice : int 1 0 0 0 0 0 0 0 0 0 ...
$ soccer : int 0 1 0 0 0 0 0 0 0 0 ...
$ coal.mines.and.mining : int 0 0 1 0 0 0 0 0 0 0 ...
$ coal : int 0 0 1 0 0 0 0 0 0 0 ...
$ engineering.geology : int 0 0 1 0 0 0 0 0 0 0 ...
$ family.violence : int 0 0 0 1 0 0 0 0 0 0 ...
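This is a multiclass problem.
I am not sure how to solve this, or whether there are other ways to find the best values for the cost and gamma parameters.
Here is an example of my data; trainS is the data without the first 4 columns (DEWEY, D1, D2 and D3).
Thanks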
library(e1071)
trainingSmall <- read.csv("trainingSmallExtra.csv")
# The features start at column 5; columns 1-4 are DEWEY, D1, D2 and D3.
sdewey <- svm(x = as.matrix(trainingSmall[, 5:ncol(trainingSmall)]),
              y = trainingSmall$DEWEY,
              type = "C-classification",
              kernel = "linear"  # linear kernel, i.e. no kernel trick
)
This works because svm automatically converts DEWEY to a factor.
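To make the integer versus factor distinction concrete, here is a minimal base R sketch. The label values are made up for illustration and are not taken from the asker's file.

```r
# Hypothetical Dewey class labels stored as integers, which is how
# read.csv loads a column of whole numbers.
dewey <- c(365L, 796L, 622L, 365L, 622L)

class(dewey)       # "integer"
is.factor(dewey)   # FALSE

# Explicit conversion: each distinct value becomes one class level.
dewey_factor <- as.factor(dewey)

class(dewey_factor)    # "factor"
levels(dewey_factor)   # "365" "622" "796"
nlevels(dewey_factor)  # 3
```

Running class() or str() on the label column before modelling is a quick way to see which of the two types you actually have.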
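The tune call fails because tune is built for user customization and relies on you supplying the correct data type. Since DEWEY is an integer and not a factor, it fails. We can fix that: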
trainingSmall$DEWEY <- as.factor(trainingSmall$DEWEY)
svm_tune <- tune(svm, train.x = as.matrix(trainingSmall[, 5:ncol(trainingSmall)]),
                 train.y = trainingSmall$DEWEY,  # the way I'm formatting your
                 kernel = "linear",              # code is Google's R style
                 type = "C-classification",
                 ranges = list(
                   cost = 10^(-1:6),
                   gamma = 1^(-1:1)  # evaluates to c(1, 1, 1); a linear kernel ignores gamma
                 )
)
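The same behaviour can be tried without the original CSV. The sketch below is only an illustration under assumptions: it substitutes the built-in iris data for trainingSmall, turns the species into integer codes to stand in for DEWEY, and uses small grids of my own choosing. It keeps svm's default radial kernel so that gamma has an effect.

```r
library(e1071)

set.seed(42)  # tune() assigns cross-validation folds at random

x <- as.matrix(iris[, 1:4])
y_int <- as.integer(iris$Species)  # integer class labels, standing in for DEWEY
y_fac <- as.factor(y_int)          # the same labels as a factor

# svm() accepts the integer labels when type is a classification type.
fit <- svm(x = x, y = y_int, type = "C-classification")

# tune() with the integer labels is expected to stop with an error;
# capture the message instead of aborting the script.
msg <- tryCatch({
  tune(svm, train.x = x, train.y = y_int, type = "C-classification",
       ranges = list(cost = 10^(0:1)))
  NA_character_
}, error = function(e) conditionMessage(e))
print(msg)

# With factor labels the grid search runs to completion.
tuned <- tune(svm, train.x = x, train.y = y_fac, type = "C-classification",
              ranges = list(cost = 10^(-1:1), gamma = 10^(-1:1)))
print(tuned$best.parameters)   # best cost/gamma combination found
print(tuned$best.performance)  # its cross-validated classification error
```

Note that this sketch uses 10^(-1:1) for gamma. In R, 1^(-1:1) evaluates to c(1, 1, 1), so a grid built from it tries the same gamma value three times.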