How can I calculate a Spearman correlation coefficient with Spark? I am unable to reproduce an example from a statistics book

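To train myself in Spark and classical statistical analysis, I am trying to run some of the examples given in a book (a general statistics book, not one dedicated to computing or to Spark).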

One example in the book computes the Spearman correlation coefficient between the scores that two judges gave to ten athletes:

| Judge 1 | 8.3 | 7.6 | 9.1 | 9.5 | 8.4 | 6.9 | 9.2 | 7.8 | 8.6 | 8.2
| Judge 2 | 7.9 | 7.4 | 9.1 | 9.3 | 8.4 | 7.5 | 9.0 | 7.2 | 8.2 | 8.1

It first builds an intermediate matrix of ranks:

| Judge 1 | 5 | 2 | 8 | 10 | 6 | 1 | 9 | 3 | 7 | 4
| Judge 2 | 4 | 2 | 9 | 10 | 7 | 3 | 8 | 1 | 6 | 5

The book's example ends with the following result:

r = 0.915
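
For reference, the book's figure can be checked by hand without Spark. The sketch below is plain Java and assumes the classic no-ties formula r = 1 - 6 Σd² / (n (n² - 1)), where d is the difference between the two ranks of each athlete. The class and method names (`SpearmanByHand`, `ranks`, `spearman`) are hypothetical, made up for this illustration, and the formula only applies because neither judge gave the same score twice.

```java
import java.util.Arrays;

// Hypothetical helper class for illustration; not part of Spark or of the book.
class SpearmanByHand {

    // Rank of each value, 1 = smallest. Assumes there are no ties,
    // which holds for both judges' scores in this example.
    static int[] ranks(double[] values) {
        int[] ranks = new int[values.length];
        for (int i = 0; i < values.length; i++) {
            int rank = 1;
            for (double other : values) {
                if (other < values[i]) {
                    rank++;
                }
            }
            ranks[i] = rank;
        }
        return ranks;
    }

    // Spearman coefficient via r = 1 - 6 * sum(d^2) / (n * (n^2 - 1)).
    // Only valid when neither series contains ties.
    static double spearman(double[] x, double[] y) {
        int[] rx = ranks(x);
        int[] ry = ranks(y);
        int n = x.length;
        long sumD2 = 0;
        for (int i = 0; i < n; i++) {
            long d = rx[i] - ry[i];
            sumD2 += d * d;
        }
        return 1.0 - 6.0 * sumD2 / (n * ((double) n * n - 1));
    }

    public static void main(String[] args) {
        double[] judge1 = {8.3, 7.6, 9.1, 9.5, 8.4, 6.9, 9.2, 7.8, 8.6, 8.2};
        double[] judge2 = {7.9, 7.4, 9.1, 9.3, 8.4, 7.5, 9.0, 7.2, 8.2, 8.1};
        System.out.println("Ranks judge 1: " + Arrays.toString(ranks(judge1)));
        System.out.println("Ranks judge 2: " + Arrays.toString(ranks(judge2)));
        System.out.println("Spearman r:    " + spearman(judge1, judge2));
    }
}
```

With the scores above, the computed ranks are the same as in the book's rank table, the squared rank differences sum to 14, and r = 1 - 84/990, which is approximately 0.915.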

I tried to implement it with Spark as follows, according to the API documentation of Correlation:

List<Row> data = Arrays.asList(
   RowFactory.create(Vectors.dense(8.3, 7.6, 9.1, 9.5, 8.4, 6.9, 9.2, 7.8, 8.6, 8.2)),
   RowFactory.create(Vectors.dense(7.9, 7.4, 9.1, 9.3, 8.4, 7.5, 9.0, 7.2, 8.2, 8.1))
);

StructType schema = new StructType(new StructField[]{
   new StructField("features", new VectorUDT(), false, Metadata.empty()),
});

Dataset<Row> df = this.session.createDataFrame(data, schema);

Row r2 = Correlation.corr(df, "features", "spearman").head();
System.out.println("Spearman correlation matrix:\n" + r2.get(0).toString());

But it does not return a single coefficient. Instead, I get a matrix that looks strange to me:

Spearman correlation matrix:
1.0                  0.9999999999999998   NaN  ... (10 total)
0.9999999999999998   1.0                  NaN  ...
NaN                  NaN                  1.0  ...
0.9999999999999998   0.9999999999999998   NaN  ...
NaN                  NaN                  NaN  ...
-0.9999999999999998  -0.9999999999999998  NaN  ...
0.9999999999999998   0.9999999999999998   NaN  ...
0.9999999999999998   0.9999999999999998   NaN  ...
0.9999999999999998   0.9999999999999998   NaN  ...
0.9999999999999998   0.9999999999999998   NaN  ...

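I am new to MLlib and my statistics skills are not very strong. Clearly, I am doing something wrong.

What am I seeing here instead of what I expected, and
how can I get the result I want?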

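The solution to part of the problem is embarrassing...
I simply had the vectors oriented the wrong way. Correlation.corr treats each row as one observation and each vector component as one variable, so my two rows of ten values were read as ten variables with only two observations each, hence the 10 x 10 matrix. Here is the corrected data, with one row per athlete and one component per judge: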

List<Row> data = Arrays.asList(
   RowFactory.create(Vectors.dense(8.3, 7.9)),
   RowFactory.create(Vectors.dense(7.6, 7.4)),
   RowFactory.create(Vectors.dense(9.1, 9.1)),
   RowFactory.create(Vectors.dense(9.5, 9.3)),
   RowFactory.create(Vectors.dense(8.4, 8.4)),
   RowFactory.create(Vectors.dense(6.9, 7.5)),
   RowFactory.create(Vectors.dense(9.2, 9.0)),
   RowFactory.create(Vectors.dense(7.8, 7.2)),
   RowFactory.create(Vectors.dense(8.6, 8.2)),
   RowFactory.create(Vectors.dense(8.2, 8.1))
);

With this layout, the off-diagonal value matches the book's r = 0.915:

Correlation between the two judges' scores for the athletes:
1.0                  0.9151515151515153
0.9151515151515153   1.0
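
As for what the first, strange matrix was showing: the following plain-Java sketch tries to reproduce its pattern without Spark. It assumes that a Spearman correlation is a Pearson correlation computed on ranks, with tied values sharing an average rank. The names `TwoRowCorrelation`, `rankPair` and `pearson` are hypothetical, and this is only an illustration of the arithmetic, not Spark's actual implementation. With just two observations per variable, the ranks can only be (1, 2), (2, 1) or a tie, so every correlation is 1, -1 or undefined.

```java
// Hypothetical helper class for illustration; not Spark's implementation.
class TwoRowCorrelation {

    // Ranks of a two-observation variable; equal values share the average rank 1.5.
    static double[] rankPair(double first, double second) {
        if (first < second) {
            return new double[]{1.0, 2.0};
        }
        if (first > second) {
            return new double[]{2.0, 1.0};
        }
        return new double[]{1.5, 1.5};
    }

    // Pearson correlation of two equally long series. Evaluates to NaN when
    // either series has zero variance, because the division becomes 0.0 / 0.0.
    static double pearson(double[] a, double[] b) {
        int n = a.length;
        double meanA = 0.0;
        double meanB = 0.0;
        for (int i = 0; i < n; i++) {
            meanA += a[i];
            meanB += b[i];
        }
        meanA /= n;
        meanB /= n;
        double cov = 0.0;
        double varA = 0.0;
        double varB = 0.0;
        for (int i = 0; i < n; i++) {
            cov += (a[i] - meanA) * (b[i] - meanB);
            varA += (a[i] - meanA) * (a[i] - meanA);
            varB += (b[i] - meanB) * (b[i] - meanB);
        }
        return cov / Math.sqrt(varA * varB);
    }

    public static void main(String[] args) {
        // Columns of the first attempt: (judge 1 score, judge 2 score) per athlete.
        double[] athlete1 = rankPair(8.3, 7.9); // judge 2 scored lower
        double[] athlete2 = rankPair(7.6, 7.4); // judge 2 scored lower
        double[] athlete3 = rankPair(9.1, 9.1); // identical scores: a tie
        double[] athlete6 = rankPair(6.9, 7.5); // judge 2 scored higher

        System.out.println("athlete 1 vs 2: " + pearson(athlete1, athlete2));
        System.out.println("athlete 1 vs 3: " + pearson(athlete1, athlete3));
        System.out.println("athlete 1 vs 6: " + pearson(athlete1, athlete6));
    }
}
```

This lines up with the first output: athletes 3 and 5 received identical scores from both judges, so their rows are NaN away from the diagonal, athlete 6 is the only one the second judge scored higher, so that row is negative, and every other pair comes out at 1 up to rounding.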