R Error: Expecting a single string value: [type=character; extent=5]


I am working with the R programming language and using the "corels" library. Here is an example (CORELS is a statistical model similar to a decision tree):

library(corels)

logdir <- tempdir()
rules_file <- system.file("sample_data", "compas_train.out", package="corels")
labels_file <- system.file("sample_data", "compas_train.label", package="corels")
meta_file <- system.file("sample_data", "compas_train.minor", package="corels")

stopifnot(file.exists(rules_file),
          file.exists(labels_file),
          file.exists(meta_file),
          dir.exists(logdir))

corels(rules_file, labels_file, logdir, meta_file,
       verbosity_policy = "silent",
       regularization = 0.015,
       curiosity_policy = 2,   # by lower bound
       map_type = 1)       # permutation map
cat("See ", logdir, " for result file.")

The output looks like this:

OPTIMAL RULE LIST
if ({sex:Male,juvenile-crimes:>0}) then ({recidivate-within-two-years:Yes})
else if ({priors:>3}) then ({recidivate-within-two-years:Yes})
else ({recidivate-within-two-years:No})

My question: I am still somewhat confused about how the syntax of the above function works. For example, I am trying to use it on the "iris" dataset:

data(iris)
head(iris)

  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa

Now I try to apply the corels() function to the iris dataset:

logdir <- tempdir()
rules_file <- iris
labels_file <- colnames(iris)

corels(rules_file, labels_file, logdir, 
       verbosity_policy = "silent",
       regularization = 0.015,
       curiosity_policy = 2,   # by lower bound
       map_type = 1)       # permutation map
cat("See ", logdir, " for result file.")

But this produces the following error:

Error in corels(rules_file, labels_file, logdir, verbosity_policy = "silent",  : 
  Expecting a single string value: [type=character; extent=5].

Can someone tell me how to resolve this error?

Thanks


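The problem is that you are passing the dataset itself to corels(), but the function expects file names. In your call, labels_file is colnames(iris), a character vector of length 5, which is where the "type=character; extent=5" in the error message comes from. The Data Format section of the CORELS GitHub repository describes how the data must be structured. The data should be saved to files in that format, and the file names then passed to corels().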
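To see why the original arguments are rejected, here is a small base R check. `is_single_string` is a hypothetical helper written for this illustration, not part of corels; it only mimics the "single string" requirement named in the error message.

```r
# Hypothetical helper: TRUE only for a character vector of length 1,
# which is what the error message says corels() expects for each file argument.
is_single_string <- function(x) is.character(x) && length(x) == 1L

checks <- c(
  data_frame  = is_single_string(iris),            # a data frame, not a string
  col_names   = is_single_string(colnames(iris)),  # character, but length 5
  a_file_name = is_single_string(tempfile())       # one path string
)
print(checks)
```

Only the last value is TRUE, so only a path like that is a valid thing to pass as rules_file or labels_file.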

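Each line in a data file should contain a rule name surrounded by braces {}, followed by a sequence of 0s and 1s indicating whether the rule applies to each row of the data.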
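As a minimal sketch of that layout, the snippet below builds one such line from a made-up logical vector. The rule name `toy_rule` and the four values are assumptions for illustration only.

```r
# Toy example: whether a hypothetical rule holds for each of 4 data rows
applies <- c(TRUE, FALSE, TRUE, TRUE)

# Rule name in braces, then one 0/1 value per data row, separated by spaces
line <- paste0("{toy_rule} ", paste(as.integer(applies), collapse = " "))
print(line)
# [1] "{toy_rule} 1 0 1 1"
```

A real rules file has one such line per rule, and every line must have as many 0/1 values as the dataset has rows.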

Here is a simple script that builds such files for iris:

library(corels)

# Write a function to take a dataset & expression and format it correctly
make_rule <- function(data,expr){
  rule_name <- deparse1(substitute(expr))
  rule_name <- gsub(" ","",rule_name)
  out <- eval(substitute(expr),data)
  paste0("{",rule_name,"} ",paste0(1*out,collapse=" "))
}

# Create names for our rule/label files

rules_file <- "rule_data"
labels_file <- "label_data"

# Create some example rules (each expression must evaluate to TRUE/FALSE for every row)
iris_rules <- c(
  make_rule(iris,Sepal.Length < 5.84),
  make_rule(iris,Sepal.Width < 3.05),
  make_rule(iris,Petal.Length < 3.76),
  make_rule(iris,Petal.Width < 1.2)
)

# Label the data. The label file must contain a pair of rules
# where the first is the negative outcome and the second is the
# positive outcome. Here we want to identify when
# the flower is a setosa
iris_labels <- c(
  make_rule(iris,Species != "setosa"),
  make_rule(iris,Species == "setosa")
)

# Save the data in the files
writeLines(iris_rules,rules_file)
writeLines(iris_labels,labels_file)


# reference the files and set verbosity to high so
# we get the full output
corels(rules_file, labels_file, ".",
       verbosity_policy = "loud")
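Before calling corels(), it can help to read a file back and confirm it has the expected shape. The following is a sketch in base R only; `check_corels_file` is a hypothetical helper, and it assumes rule names contain no spaces (which make_rule above guarantees by stripping them). It writes to a temporary file so it can be run on its own.

```r
# Hypothetical helper: for each line, check that it starts with a {name}
# token and is followed by exactly n_rows values that are all "0" or "1".
check_corels_file <- function(path, n_rows) {
  parts <- strsplit(readLines(path), " ", fixed = TRUE)
  vapply(parts, function(p) {
    grepl("^\\{.*\\}$", p[1]) &&
      length(p) - 1L == n_rows &&
      all(p[-1] %in% c("0", "1"))
  }, logical(1))
}

# Write one rule line to a temporary file, in the same layout as above
rule_line <- paste0("{Petal.Length<3.76} ",
                    paste(as.integer(iris$Petal.Length < 3.76), collapse = " "))
rules_path <- tempfile("rule_data")
writeLines(rule_line, rules_path)

# One TRUE/FALSE per line in the file
ok <- check_corels_file(rules_path, nrow(iris))
print(ok)
```

The same check can be run on the label file; any FALSE points to a line with a malformed name or the wrong number of values.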