Why does ordering by factor destroy the decimal places?

I'm trying to make a simple ggplot from my data frame:

St <- structure(list(CLevel = c(3, 4, 5, 6, 7, 8, 9, 10, 11), 
 Sensitivity = structure(c(1L, 2L, 3L, 3L, 5L, 5L, 7L, 8L, 9L), 
.Label = c("56.6666666666667","53.125", "52.9411764705882", "52.9411764705882", 
"54.2857142857143", "54.2857142857143", "55.5555555555556", "56.7567567567568",
 "57.1428571428571"), class = "factor"), 
Specificity = c(76.4705882352941, 76.4705882352941, 76.4705882352941, 76.4705882352941,
 76.4705882352941, 76.4705882352941, 76.4705882352941,
 76.4705882352941, 76.4705882352941)), 
 .Names = c("CLevel", "Sensitivity", "Specificity"), row.names = c(NA, -9L),
 class ="data.frame")

When I make the plot

library(ggplot2)
ggplot() + 
  geom_point(aes(CLevel, Sensitivity, color = "red",size=12), St) +  
  geom_point(shape=5)

I get an x axis that is not ordered the way I want.

So I tried

St$Sensitivity <- factor(St$Sensitivity, levels = St$Sensitivity[order(St$CLevel)])

but I get the error

In `levels<-`(`*tmp*`, value = if (nl == nL) as.character(labels) else paste0(labels,  : duplicated levels in factors are deprecated

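So I looked at my data frame again, and it seems there are duplicates in the Sensitivity column: the decimal places have been dropped, so some of the numbers are exactly the same. All I want to do is order the x axis, so this seems unnecessarily complicated. How should I do it?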

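Edited:

I see that the levels are characters; avoid wrapping the values in "" when you are working with numbers.

This should be suitable sample data: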

St =  structure(list(CLevel = c(3, 4, 5, 6, 7, 8, 9, 10, 11), Sensitivity = structure(c(1L, 2L, 3L, 3L, 5L, 5L, 7L, 8L, 9L), .Label = c(56.6666666666667,53.125, 52.9411764705882, 52.9411764705882, 54.2857142857143, 54.2857142857143, 55.5555555555556, 56.7567567567568, 57.1428571428571), class = "factor"), Specificity = c(76.4705882352941, 76.4705882352941, 76.4705882352941, 76.4705882352941, 76.4705882352941, 76.4705882352941, 76.4705882352941, 76.4705882352941, 76.4705882352941)), .Names = c("CLevel", "Sensitivity", "Specificity"), row.names = c(NA, -9L), class ="data.frame")

You can try ordering like this:

ggplot() + 
geom_point(aes(CLevel, reorder(Sensitivity, -as.vector(Sensitivity)), color = "red",size=12), St) + 
geom_point(shape=5)

# or use this: reorder(Sensitivity, as.vector(Sensitivity)) based on your requirement
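For a factor whose levels are ordinary character strings, a minimal sketch of the same reorder() idea converts the labels to numbers first. The toy vector `f` and the names `vals`, `f_inc` and `f_dec` are made up for illustration and are not part of the question's data.

```r
# Hypothetical toy factor with character labels, as factor() creates by default.
f <- factor(c("9.5", "10.2", "100", "53.125"))

# Convert via the labels (not the integer codes) to recover the numeric values.
vals <- as.numeric(as.character(f))

# reorder() sorts the levels by a per-level summary (mean by default)
# of its second argument.
f_inc <- reorder(f, vals)    # levels in increasing numeric order
f_dec <- reorder(f, -vals)   # levels in decreasing numeric order

levels(f_inc)
levels(f_dec)
```

Either result can be used in aes() in place of the original factor.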


If you change Sensitivity to a double, the resulting plot looks fine:

St <- structure(list(CLevel = c(3, 4, 5, 6, 7, 8, 9, 10, 11),

               Sensitivity = c(56.6666666666667, 53.125, 52.9411764705882,
                 52.9411764705882, 54.2857142857143, 54.2857142857143,
                 55.5555555555556, 56.7567567567568, 57.1428571428571 ),

               Specificity = c(76.4705882352941, 76.4705882352941,
                 76.4705882352941, 76.4705882352941, 76.4705882352941,
                 76.4705882352941, 76.4705882352941, 76.4705882352941,
                 76.4705882352941)),

          .Names = c("CLevel", "Sensitivity", "Specificity"),
          row.names = c(NA, -9L), class = "data.frame")  

library(ggplot2)
ggplot() + 
  geom_point(aes(CLevel, Sensitivity, color = "red",size=12), St) +  
  geom_point(shape=5)
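If the column has already been read in as a factor, retyping the data is not the only option. A minimal sketch of the usual conversion follows; the vector `sens` is a hypothetical stand-in for such a column.

```r
# Hypothetical factor column whose labels are numbers stored as text.
sens <- factor(c("56.6666666666667", "53.125", "52.9411764705882"))

# as.numeric() applied directly to a factor returns the integer codes,
# which is rarely what is wanted.
codes <- as.numeric(sens)

# Going through the character labels recovers the actual values.
values <- as.numeric(as.character(sens))

codes
values
```

Applied to a data frame column, the same conversion would look like `df$col <- as.numeric(as.character(df$col))`, where `df` and `col` are placeholder names.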

Is this what you are looking for?

 ## to avoid typing "Sensitivity" so many times:
 s <- levels(St$Sensitivity) 
 St2 <- transform(St,Sensitivity=factor(Sensitivity,
                         levels=s[order(as.numeric(s))]))
 library("ggplot2")
 ggplot(St2,aes(CLevel,Sensitivity))+
     geom_point(color = "red",size=12, shape=5)

Note that I put the colour, size, and shape specifications outside of the mapping (aes()) specification; I'm guessing that's what you really want...
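The difference between mapping and setting can be sketched with a small example. The data frame `d` and the plot names below are hypothetical and only serve to contrast the two forms.

```r
library(ggplot2)

# Hypothetical toy data, not the question's data frame.
d <- data.frame(x = 1:3, y = c(2, 5, 3))

# Inside aes(): "red" is treated as a data value, mapped through the
# colour scale and shown in a legend; the points are not necessarily red.
p_mapped <- ggplot(d, aes(x, y)) +
  geom_point(aes(color = "red"))

# Outside aes(): colour, size and shape are fixed properties of the layer.
p_set <- ggplot(d, aes(x, y)) +
  geom_point(color = "red", size = 12, shape = 5)
```

Printing `p_mapped` and `p_set` side by side shows the extra legend that the first form produces.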

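As @VeerendraGadekar says, the warning occurs because the values you gave us really do have duplicates in the factor levels. In particular, quoting directly from the structure you gave us (with the spacing rearranged slightly for clarity):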

.Label = c("56.6666666666667",
           "53.125", 
           "52.9411764705882",
           "52.9411764705882", ## duplicate
           "54.2857142857143", 
           "54.2857142857143", ## duplicate
           "55.5555555555556", 
           "56.7567567567568", 
           "57.1428571428571")

Maybe you lost precision somewhere upstream?
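One quick way to confirm which labels repeat is duplicated(). The label vector below is copied from the structure in the question; the names `labels` and `dup` are just for illustration.

```r
# Labels copied from the .Label attribute in the question.
labels <- c("56.6666666666667", "53.125",
            "52.9411764705882", "52.9411764705882",
            "54.2857142857143", "54.2857142857143",
            "55.5555555555556", "56.7567567567568",
            "57.1428571428571")

# duplicated() flags every repeat of an earlier element.
dup <- labels[duplicated(labels)]
dup

# unique() gives a label set that is safe to pass as factor levels.
uniq <- unique(labels)
length(uniq)
```

Running the same check on the values before they are turned into a factor may help locate where the repeats first appear.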