Rounding a Double value without decimal points in a Spark DataFrame

Question

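I am trying to round a double value to no decimal places in a Spark DataFrame, but I get the same value in the output.

Below are the DataFrame column values.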

+-----+-----+
| SIG1| SIG2|
+-----+-----+
| 46.0| 46.0|
| 94.0| 46.0|

The schema of the DataFrame columns is as follows.

scala> df.printSchema
root
 |-- SIG1: double (nullable = true)
 |-- SIG2: double (nullable = true)

The expected output is as follows.

+-----+-----+
| SIG1| SIG2|
+-----+-----+
| 46  |   46|
| 94  |   46|

I have tried rounding the columns as described in the documentation:

+------------------------------------------------------------------+
|ReturnType| Signature     |                            Description|
+------------------------------------------------------------------+
|DOUBLE    |round(DOUBLE a)| Returns the rounded BIGINT value of a.|

The code I used is:

val df1 = df.withColumn("SIG1", round(col("SIG1"))).withColumn("SIG2", round(col("SIG2")))

Do we need to cast the columns to int/bigint, or can we use the round function by itself?

Thanks in advance!

Answer 1

You don't need to cast the column. If you want to drop the digits after the decimal point, you can use round(colName, 0).

Answer 2

The round function also returns a double value, so if you want an int type, cast the result.

scala> Seq(1.9999,2.1234,3.6523).toDF().select(round('value,2)).show()
+---------------+
|round(value, 2)|
+---------------+
|            2.0|
|           2.12|
|           3.65|
+---------------+


scala> Seq(1.9999,2.1234,3.6523).toDF().select(round('value,0)).show()
+---------------+
|round(value, 0)|
+---------------+
|            2.0|
|            2.0|
|            4.0|
+---------------+


scala> Seq(1.9999,2.1234,3.6523).toDF().select(round('value)).show()
+---------------+
|round(value, 0)|
+---------------+
|            2.0|
|            2.0|
|            4.0|
+---------------+

scala> Seq(1.9999,2.1234,3.6523).toDF().select('value.cast("int")).show()
+-----+
|value|
+-----+
|    1|
|    2|
|    3|
+-----+
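Note that in the transcripts above, cast("int") alone truncates (3.6523 becomes 3), while round gives 4.0. To get rounded whole numbers without the trailing ".0", the two can be combined. The following is a minimal sketch, not a definitive solution: the sample rows mirror the question's SIG1/SIG2 data, and the helper name roundToInt and the local SparkSession setup are assumptions made only for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, round}

// Assumption: a local session, created only so the sketch is self-contained.
// In spark-shell, getOrCreate() returns the session that already exists.
val spark = SparkSession.builder().master("local[*]").appName("round-and-cast").getOrCreate()
import spark.implicits._

// Sample data mirroring the SIG1/SIG2 columns from the question.
val df = Seq((46.0, 46.0), (94.0, 46.0)).toDF("SIG1", "SIG2")

// Hypothetical helper: round each named column, then cast it to int.
// round alone keeps the double type; the cast is what changes the type.
def roundToInt(input: DataFrame, names: Seq[String]): DataFrame =
  names.foldLeft(input) { (acc, name) =>
    acc.withColumn(name, round(col(name)).cast("int"))
  }

val df1 = roundToInt(df, Seq("SIG1", "SIG2"))

df1.printSchema()
df1.show()
```

Rounding before the cast matters only when the values have a fractional part; for values such as 46.0 the cast alone would give the same result.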

