如何在 spark rdd 中通过自定义显示前 N 个结果?

How to show top N number of results with customization in spark rdd?

val sorting = sc.parallelize(List(1,1,1,2,2,2,2,3,3,3,4,4,4,4,5,5,5,6,6,7,8,8,8,8,8))
sorting.map(x=>(x,1)).reduceByKey((a,b)=>a+b).map(x=>(x._1,"==>",x._2)).sortBy(s=>s._2,false).collect.foreach(println)    
output:
(8,==>,5)
(1,==>,3)
(2,==>,4)
(3,==>,3)
(4,==>,4)
(5,==>,3)
(6,==>,2)
(7,==>,1)

我只想显示前 3 个结果并从结果中删除 ,(逗号)。

使用take(3)代替collect得到前3个结果,然后手动清理输出:

sorting.map(x=>(x,1)).reduceByKey((a,b)=>a+b).sortBy(s=>s._2,false).map(x=>s"${x._1} ${x._2}").take(3).foreach(println)

8 5
2 4
4 4