Support Vector Machine - R code - Predict residual error of a time series

I am trying to predict the residuals of a time series using R. My dataset has the following two columns (the first 10 rows are shown as a sample):

Observation Residuals
1   -0,087527458
2   -0,06907199
3   -0,066604145
4   -0,07796713
5   -0,081723932
6   -0,094046868
7   -0,101535816
8   -0,101884203
9   -0,11131246
10  -0,092548176

To make the prediction, I am building a support vector machine in R:

# Load the data from the csv file
dataDirectory <- "C://"  
data <- read.csv(paste(dataDirectory, "Data_SVM_Test.csv", sep=""),sep=";", header = TRUE)
head(data)
# Plot the data 
plot(data, pch=16)

# Create a linear regression model
model <- lm(Residuals ~ Observation, data)

# Add the fitted line
abline(model)

predictedY <- predict(model, data)

# display the predictions
points(data$Observation, predictedY, col = "blue", pch=4) 

# This function will compute the RMSE
rmse <- function(error)
{
  sqrt(mean(error^2))
}

error <- model$residuals  # same as data$Y - predictedY
predictionRMSE <- rmse(error)   # 5.70377

plot(data, pch=16)

plot.new()
# svr model ==============================================
if(require(e1071)){ 
  model <- svm(Residuals ~ Observation , data)

  predictedY <- predict(model, data)

  points(data$Observation, predictedY, col = "red", pch=4)

  # /!\ this time  svrModel$residuals  is not the same as data$Y - predictedY
  # so we compute the error like this
  error <- data$Residuals - predictedY  
  svrPredictionRMSE <- rmse(error)  # 3.157061 
} 

When I run the code above, I get the following message and no output:

Warning message:
In Ops.factor(data$Residuals, predictedY) : ‘-’ not meaningful for factors

Does anyone know how to fix this?

Thanks a lot!

When svm is used for classification, the output is of type factor. This is from the documentation:

Output of svm: A vector of predicted values (for classification: a vector of labels, for density estimation: a logical vector).

This can be seen in the following example:

> library(e1071)
> model <- svm(Species ~ ., data = iris)
> str( predict(model, iris))
 Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
 - attr(*, "names")= chr [1:150] "1" "2" "3" "4" ...
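For contrast, here is a minimal sketch of how the class of the response appears to drive the type of the prediction: a factor response gives factor predictions, while a numeric response gives numeric ones. The choice of Sepal.Length as a numeric response and the names clf, reg, pred_class and pred_num are illustrative assumptions, not part of the question. The code is guarded so it only runs when e1071 is installed.

```r
# Sketch: compare prediction types for a factor vs. a numeric response.
# Variable names here are illustrative, not from the question.
if (requireNamespace("e1071", quietly = TRUE)) {
  # Factor response (Species): svm fits a classifier
  clf <- e1071::svm(Species ~ ., data = iris)
  pred_class <- predict(clf, iris)

  # Numeric response (Sepal.Length): svm fits a regression
  reg <- e1071::svm(Sepal.Length ~ Sepal.Width + Petal.Length, data = iris)
  pred_num <- predict(reg, iris)

  print(class(pred_class))
  print(class(pred_num))
}
```

If Residuals is numeric when svm is called, the subtraction data$Residuals - predictedY should work without any conversion.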

The same happens with your data. The levels show that predictedY is a factor:

> predictedY <- predict(model, df)
> predictedY
           1            2            3            4            5            6            7            8            9           10 
-0,087527458  -0,06907199 -0,066604145  -0,07796713 -0,081723932 -0,094046868 -0,101535816 -0,101884203  -0,11131246 -0,092548176 
Levels: -0,066604145 -0,06907199 -0,07796713 -0,081723932 -0,087527458 -0,092548176 -0,094046868 -0,101535816 -0,101884203 -0,11131246
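A likely reason the Residuals column becomes a factor in the first place is the decimal comma in the file: read.csv expects "." as the decimal mark by default, so values like -0,087527458 are not parsed as numbers. The sketch below reproduces this with a few rows from the question, embedded as inline text (csv_text is a stand-in for the real file), and shows that passing dec = "," yields a numeric column. stringsAsFactors = TRUE is set explicitly because recent R versions would otherwise give a character column instead of a factor.

```r
# A few rows from the question, embedded as text instead of a file
csv_text <- "Observation;Residuals
1;-0,087527458
2;-0,06907199
3;-0,066604145"

# Default decimal mark ".": the values cannot be parsed as numbers
raw <- read.csv(text = csv_text, sep = ";", header = TRUE,
                stringsAsFactors = TRUE)
print(class(raw$Residuals))

# Telling R that "," is the decimal mark gives a numeric column
fixed <- read.csv(text = csv_text, sep = ";", dec = ",", header = TRUE)
print(class(fixed$Residuals))
```

read.csv2 uses sep = ";" and dec = "," as its defaults, so it is another option for files in this format.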

In your line predictedY <- predict(model, data), predictedY is of type factor. If you try to subtract a factor from a number (or vice versa), you get NA values and this warning:

> 1:10 - as.factor(1:10)
 [1] NA NA NA NA NA NA NA NA NA NA
Warning message:
In Ops.factor(1:10, as.factor(1:10)) : ‘-’ not meaningful for factors

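If you want this to work, you need to convert the factor to numeric with as.numeric: 1:10 - as.numeric(as.factor(1:10)). Note that as.numeric applied to a factor returns the integer level codes, which only happen to match the values in this toy example; to recover the actual values, convert through character first, as in as.numeric(as.character(f)).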
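A small sketch of that conversion, using three values from the question. The names f, codes and values are illustrative. Because the labels use a decimal comma, the comma is replaced with a point before converting; that replacement step is an assumption about how the data was stored, not something stated in the question.

```r
# Factor whose labels are numbers written with a decimal comma
f <- factor(c("-0,087527458", "-0,06907199", "-0,066604145"))

# as.numeric() on a factor returns the internal level codes, not the values
codes <- as.numeric(f)

# Convert the labels to character, swap "," for ".", then parse as numbers
values <- as.numeric(sub(",", ".", as.character(f), fixed = TRUE))

print(codes)
print(values)
```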

I don't know what your data looks like, but judging from the title of the question, svm may not be a good idea for a time series.