Get column name that holds max value if higher than x

Question

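I'm trying to assign classifications, but I'm running into some problems. The usual classification method takes the majority vote, but I want to be a bit stricter. Say I have the following matrix: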

     c1    c2    c3
x1   0.09  0.7   0.21
x2   0.34  0.33  0.33

If I take the majority vote, the classifications are as follows:

     class
x1   c2
x2   c1

But I would like to set a threshold of, for example, 0.40 of the votes, so that I get these classifications:

     class
x1   c2
x2   unassigned

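I know how to get the maximum value in a row and how to get the name of the column that holds the maximum value in that row (from this question, but it doesn't solve my problem), but for some reason I can't seem to add the condition that the maximum value must be at least 0.40. Any help would be appreciated :)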

Answer 1

You can use max.col to get the index of the column that holds the maximum value in each row.

cols <- names(df)[max.col(df) * NA^!rowSums(df > 0.4) > 0]
cols[is.na(cols)] <- 'unassigned'
cols
#[1] "c2"         "unassigned"

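The NA^!rowSums(df > 0.4) > 0 part evaluates to NA for rows that have no value > 0.4 and to 1 for all other rows, so multiplying it by the max.col index either leaves the index unchanged or turns it into NA.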

Data

df <- structure(list(c1 = c(0.09, 0.34), c2 = c(0.7, 0.33), c3 = c(0.21, 
0.33)), class = "data.frame", row.names = c("x1", "x2"))
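
To make the one-liner above easier to follow, here is a sketch that unpacks it into intermediate steps. The variable names (best, has_vote, mask) are illustrative and not part of the original answer. The ties.method = "first" argument is an addition made here so that ties resolve deterministically; max.col breaks ties at random by default.

```r
# Example data, same values as in the question
df <- data.frame(c1 = c(0.09, 0.34), c2 = c(0.7, 0.33), c3 = c(0.21, 0.33),
                 row.names = c("x1", "x2"))

# Column index of the maximum in each row
best <- max.col(df, ties.method = "first")

# TRUE for rows where at least one value exceeds the threshold
has_vote <- rowSums(df > 0.4) > 0

# NA^TRUE is NA and NA^FALSE is 1, so mask is NA for rows without a qualifying value
mask <- NA^!has_vote

# Multiplying keeps the index where mask is 1 and turns it into NA otherwise
cols <- names(df)[best * mask]
cols[is.na(cols)] <- "unassigned"
cols
```

For the example data this should give "c2" for x1 and "unassigned" for x2, matching the compact version.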

Answer 2

I would suggest this approach using apply():

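#Function: name of the column holding the row maximum, if that maximum exceeds 0.4
myfun <- function(x)
{
  # which.max() returns the first position of the maximum, so ties go to the first column
  y <- names(x)[which.max(x)]
  # Fall back to a label when no value in the row exceeds the threshold
  # (testing max(x) directly avoids the warning that max() gives on an empty subset)
  if(max(x) <= 0.4)
  {
    y <- 'not assigned'
  }

  return(as.character(y))
}
#Apply row-wise
df$Class <- apply(df,1,myfun)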

Output:

     c1   c2   c3        Class
x1 0.09 0.70 0.21           c2
x2 0.34 0.33 0.33 not assigned
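
Since the question asks for a maximum of "at least" 0.40, a small wrapper with a configurable threshold may be convenient. The following is only a sketch: the function name classify_rows and its arguments threshold and label are hypothetical, and it compares with >= rather than the > used in the answers above.

```r
# Hypothetical helper: label each row with the column holding its maximum,
# or with `label` when that maximum is below `threshold`
classify_rows <- function(m, threshold = 0.4, label = "unassigned") {
  m <- as.matrix(m)
  # Row-wise maximum
  row_max <- apply(m, 1, max)
  # First column holding the maximum (deterministic on ties)
  winner <- colnames(m)[max.col(m, ties.method = "first")]
  # Keep the winner only where the maximum reaches the threshold
  out <- ifelse(row_max >= threshold, winner, label)
  names(out) <- rownames(m)
  out
}

# Example data, same values as in the question (numeric columns only)
df <- data.frame(c1 = c(0.09, 0.34), c2 = c(0.7, 0.33), c3 = c(0.21, 0.33),
                 row.names = c("x1", "x2"))

classify_rows(df)                   # default threshold of 0.4
classify_rows(df, threshold = 0.3)  # a looser threshold
```

With the default threshold, x2 should come out as "unassigned"; with threshold = 0.3 it should be assigned to c1. The helper expects only the numeric vote columns, so a previously added Class column would need to be dropped first.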


r

classification