Pivoting a DataFrame column on User ID in Spark
I have a DataFrame that looks like this:
+------+------------+------------------+
|UserID|Attribute | Value |
+------+------------+------------------+
|123 | City | San Francisco |
|123 | Lang | English |
|111 | Lang | French |
|111 | Age | 23 |
|111 | Gender | Female |
+------+------------+------------------+
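So there are several distinct attributes, and any of them can be missing for a given user (the set of attributes is limited, say 20 at most).
I want to convert this DataFrame to: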
+-----+--------------+---------+-----+--------+
|User |City | Lang | Age | Gender |
+-----+--------------+---------+-----+--------+
|123 |San Francisco | English | NULL| NULL |
|111 | NULL| French | 23 | Female |
+-----+--------------+---------+-----+--------+
I am new to Spark and Scala.
You can use pivot to get the desired output:
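import org.apache.spark.sql.functions._
import sparkSession.sqlContext.implicits._  // sparkSession is your SparkSession instance

df.groupBy("UserID")      // one output row per user
  .pivot("Attribute")     // one output column per distinct Attribute value
  .agg(first("Value"))    // takes the first Value found for each (user, attribute) pair
  .show()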
This gives the following result:
+------+----+-------------+------+-------+
|UserID| Age| City|Gender| Lang|
+------+----+-------------+------+-------+
| 111| 23| null|Female| French|
| 123|null|San Francisco| null|English|
+------+----+-------------+------+-------+
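Since the question says the set of attributes is small and known, here is a minimal sketch of a variant that passes the attribute names to pivot explicitly. It assumes the four names from the sample data; the object name PivotExample, the helper pivotAttributes, and the local SparkSession setup are made up for illustration. Listing the values fixes the column order to match the requested layout and saves Spark the extra pass it otherwise needs to discover the distinct values.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.first

object PivotExample {
  // Assumption: the attribute names are known up front (taken from the sample data).
  val attributes: Seq[String] = Seq("City", "Lang", "Age", "Gender")

  // Hypothetical helper: pivots Attribute/Value pairs into one column per attribute.
  def pivotAttributes(df: DataFrame): DataFrame =
    df.groupBy("UserID")
      .pivot("Attribute", attributes)       // explicit values fix the column order
      .agg(first("Value"))                  // missing attributes become null
      .withColumnRenamed("UserID", "User")  // match the header asked for in the question

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; in a real job you would reuse your own.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("pivot-example")
      .getOrCreate()
    import spark.implicits._

    // The sample rows from the question.
    val df = Seq(
      ("123", "City", "San Francisco"),
      ("123", "Lang", "English"),
      ("111", "Lang", "French"),
      ("111", "Age", "23"),
      ("111", "Gender", "Female")
    ).toDF("UserID", "Attribute", "Value")

    pivotAttributes(df).show()
    spark.stop()
  }
}
```

The resulting columns are User, City, Lang, Age, Gender, in that order. The row order is not guaranteed after groupBy, so add an orderBy if it matters.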