R package `penalized`: error when using predict() with matrix input for "penalized" and "unpenalized"

Question

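I am trying to use the penalized package to fit a penalized linear regression in which the coefficients of a subset of variables are constrained to be positive. I managed to fit the model, but I could not use it to make new predictions.

Here is a toy example: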

require(dplyr)
require(penalized)
require(ggplot2)

head(diamonds)
# A tibble: 6 x 10
#  carat       cut color clarity depth table price     x     y     z
#  <dbl>     <ord> <ord>   <ord> <dbl> <dbl> <int> <dbl> <dbl> <dbl>
#1  0.23     Ideal     E     SI2  61.5    55   326  3.95  3.98  2.43
#2  0.21   Premium     E     SI1  59.8    61   326  3.89  3.84  2.31
#3  0.23      Good     E     VS1  56.9    65   327  4.05  4.07  2.31
#4  0.29   Premium     I     VS2  62.4    58   334  4.20  4.23  2.63
#5  0.31      Good     J     SI2  63.3    58   335  4.34  4.35  2.75
#6  0.24 Very Good     J    VVS2  62.8    57   336  3.94  3.96  2.48

response = diamonds$price
penalized_vars = "x"
unpenalized_vars = "depth"
fit_penalized = penalized(response=as.matrix(response),
                          penalized = model.matrix(~., select_(diamonds, .dots = penalized_vars)), 
                          unpenalized = model.matrix(~., select_(diamonds, .dots = unpenalized_vars)),
                          model="linear", 
                          positive=TRUE,
                          maxiter=25)
# nonzero coefficients: 3

show(fit_penalized)
#Penalized linear regression object
#4 regression coefficients of which 3 are non-zero

#Loglikelihood =     -482648.2 

head(fitted(fit_penalized))
#         1          2          3          4          5          6 
#-1679.6983 -1924.0012 -1515.2681  -863.6912  -393.7956 -1668.7105

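So far so good. How do I actually use the fit to predict values for new data? I tried the following, where vars is a data frame of new observations: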

predict(fit_penalized,
        penalized = model.matrix(~., select_(vars, .dots = penalized_vars)),
        unpenalized = model.matrix(~., select_(vars, .dots = unpenalized_vars)) )
# Error in terms.default(object@formula$unpenalized) : 
#   no terms component nor attribute

Answer 1

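This refers to the latest version, penalized_0.9-47, updated on 2016-05-27.

As my tests below show, predict only works when unpenalized is specified as a "formula" rather than as a "matrix". This looks like a bug in the package. The problem is not in penalized but in predict. On the one hand, ?penalized::predict says: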

In particular, if penalized and/or unpenalized
was specified in matrix form, a matrix must be given with the new
subjects' data. The columns of these matrices must be exactly the
same as in the matrices supplied in the original call that
produced the ‘penfit’ object. If either penalized or unpenalized
was given as a ‘formula’ in the original call, the user of
‘predict’ must supply a new ‘data’ argument.

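So passing a matrix to unpenalized should be legitimate, yet my tests show that it fails. I suggest contacting the package author.

As a side note, your current specification has a penalized intercept as well as a free (unpenalized) intercept. You should drop one of them so that the model is identifiable. In the code for my tests, I drop the free intercept.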
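
To illustrate the identifiability point, here is a minimal base-R sketch on made-up data (the data frame `toy` and its values are hypothetical, not taken from diamonds). Both `model.matrix(~ x, ...)` and `model.matrix(~ depth, ...)` add an "(Intercept)" column by default, so the combined design matrix contains two identical columns and is rank deficient. Dropping one intercept with `- 1` restores full column rank.

```r
# Hypothetical toy data standing in for two columns of diamonds
toy <- data.frame(x     = c(3.9, 4.1, 4.3, 4.0, 4.4),
                  depth = c(61.5, 59.8, 56.9, 62.4, 63.3))

# Each formula adds its own "(Intercept)" column by default
X_pen   <- model.matrix(~ x, toy)
X_unpen <- model.matrix(~ depth, toy)
both    <- cbind(X_unpen, X_pen)
colnames(both)                 # "(Intercept)" appears twice
rank_both <- qr(both)$rank     # smaller than ncol(both): not identifiable

# Dropping the free intercept removes the duplicate column
X_unpen_noint <- model.matrix(~ depth - 1, toy)
fixed         <- cbind(X_unpen_noint, X_pen)
rank_fixed    <- qr(fixed)$rank  # equals ncol(fixed)
```

The same reasoning applies to the matrices built from diamonds in the test code.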

library(penalized)
library(ggplot2)
X1 <- model.matrix (~x, diamonds)    ## model matrix for penalized terms
X2 <- model.matrix(~ depth - 1, diamonds)    ## model matrix for free terms
vars <- diamonds[1:5, ]    ## prediction dataset
Xp1 <- model.matrix(~x, vars)    ## prediction matrix for penalized terms
Xp2 <- model.matrix(~ depth - 1, vars)    ## prediction matrix for free terms

## use "formula" for both
fit <- penalized (price, ~ x, ~ depth - 1, data = diamonds, model = "linear", positive = TRUE)
predict(fit, ~ x, ~ depth - 1, vars)
#          mu  sigma2
#1 -1523.5643 3600667
#2 -1328.9947 3600667
#3  -183.9204 3600667
#4  -950.1526 3600667
#5  -717.6854 3600667

## "matrix" for `penalized` and "formula" for `unpenalized`
fit <- penalized (price, X1, ~ depth - 1, data = diamonds, model = "linear", positive = TRUE)
predict(fit, Xp1, ~ depth - 1, vars)
#          mu  sigma2
#1 -1523.5643 3600667
#2 -1328.9947 3600667
#3  -183.9204 3600667
#4  -950.1526 3600667
#5  -717.6854 3600667

## "formula" for `penalized` and "matrix" for `unpenalized`
fit <- penalized (price, ~ x, X2, data = diamonds, model = "linear", positive = TRUE)
predict(fit, ~ x, Xp2, vars)
# Error in terms.default(object@formula$unpenalized) : 
#  no terms component nor attribute

## "matrix" for both
fit <- penalized (price, X1, X2, data = diamonds, model = "linear", positive = TRUE)
predict(fit, Xp1, Xp2, vars)
# Error in terms.default(object@formula$unpenalized) : 
#  no terms component nor attribute
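
Until predict handles a matrix `unpenalized` argument, one possible workaround for the linear model is to compute the predicted mean by hand, as the new design matrices times the estimated coefficients. The sketch below is an illustration under assumptions, not part of the package: `predict_linear_by_hand` is a hypothetical helper, and it assumes that the coefficient vectors carry the matrix column names (as I would expect from `coefficients(fit, "unpenalized")` and `coefficients(fit, "penalized")`) and that the new matrices have the same columns as those used for fitting. It returns only the mean, not the `sigma2` column that predict reports. The check at the end uses made-up coefficients, so it runs without the package.

```r
# Hypothetical helper (not part of the penalized package):
# mean prediction for a linear model, X_unpen %*% b_unpen + X_pen %*% b_pen.
predict_linear_by_hand <- function(b_unpen, b_pen, X_unpen, X_pen) {
  # Select columns by coefficient name so a different column order cannot
  # silently pair a coefficient with the wrong variable
  stopifnot(all(names(b_unpen) %in% colnames(X_unpen)),
            all(names(b_pen) %in% colnames(X_pen)))
  mu <- X_unpen[, names(b_unpen), drop = FALSE] %*% b_unpen +
        X_pen[, names(b_pen), drop = FALSE] %*% b_pen
  drop(mu)  # return a plain vector instead of a one-column matrix
}

# Self-contained check with made-up data and made-up coefficient values
new_data <- data.frame(x = c(4.0, 5.0), depth = c(60, 62))
Xp1 <- model.matrix(~ x, new_data)          # penalized part (with intercept)
Xp2 <- model.matrix(~ depth - 1, new_data)  # unpenalized part (no intercept)
b_pen   <- c("(Intercept)" = 1, x = 2)      # assumed values, illustration only
b_unpen <- c(depth = 0.5)                   # assumed values, illustration only
mu <- predict_linear_by_hand(b_unpen, b_pen, Xp2, Xp1)
```

With a real fit, the call would presumably be `predict_linear_by_hand(coefficients(fit, "unpenalized"), coefficients(fit, "penalized"), Xp2, Xp1)`. This has not been verified against the package here, so compare the result with the `mu` column from the formula-based predict call before relying on it.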

Tags: regression, r, linear-regression