Write a PySpark DataFrame to text without changing its structure
I have a PySpark DataFrame that looks like this:
+--------------------+
|               speed|
+--------------------+
|[5.59239, 2.51329...|
|[0.0191166, 0.169...|
|[0.561913, 0.4098...|
|[0.393343, 0.3580...|
|[0.118315, 0.1183...|
|[0.831407, 0.4470...|
|[1.49012e-08, 0.1...|
|[0.0411047, 0.152...|
|[0.620069, 0.8262...|
|[0.20373, 0.20373...|
+--------------------+
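How can I write this DataFrame to CSV so that it is saved the way it is shown above? Currently I tried coalescing it into one partition, but each row is saved like this: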
"[5.59239, 2.51329, 0.141536, 1.27485, 2.35138, 12.9668, 12.9668, 2.52421, 0.330804, 0.459188, 0.459188, 0.651573, 3.15373, 6.11923, 8.8445, 8.0871, 0.855173, 1.43534, 1.43534, 1.05988, 1.05988, 0.778344, 1.20522, 1.70414, 1.70414, 0.0795492, 1.10385, 1.4759, 1.64844, 0.82941, 1.11321, 1.37977, 0.849902, 1.24436, 1.24436, 0.698651, 0.791467, 0.636781, 0.666729, 0.666729, 0.45688, 0.45688, 0.158829, 2.12693, 29.8682, 29.8682, 9.62536, 3.40384, 2.51002, 1.55077, 1.01774, 0.922753, 0.922753, 0.0438924, 0.530669, 0.879573, 0.627267, 0.0532846, 0.0890066, 0.0884833, 0.140008, 0.147534, 0.0180038, 0.0132851, 0.112785, 0.112785, 0.22997, 0.22997, 0.0524423, 0.141886, 0.328422,............]"
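But I want to save it in a proper format as an Excel file, with speed as the column name and its values as a list of lists.
I don't want to use toPandas(), because it consumes a lot of memory.
If I have over- or under-emphasised something, please let me know in the comments.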
df.coalesce(1).write.option("header","true").csv("file:///s/tesing")
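The quotes in the saved output are probably not a change to the data itself. In CSV, a field that contains the delimiter (here a comma) has to be quoted so that it still reads back as one field. Below is a minimal sketch of that rule using Python's standard csv module rather than Spark, assuming the speed column holds a string; the sample value and the helper names to_csv_line and from_csv_line are hypothetical.

```python
import csv
import io

# Hypothetical shortened sample; the real rows are much longer.
speed = "[5.59239, 2.51329, 0.141536]"


def to_csv_line(value):
    """Write a single-field CSV record and return it without the newline."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerow([value])
    return buf.getvalue().rstrip("\n")


def from_csv_line(line):
    """Read a single-field CSV record back into the original string."""
    return next(csv.reader([line]))[0]


quoted = to_csv_line(speed)      # the embedded commas force quoting
restored = from_csv_line(quoted)  # a CSV reader strips the quotes again
print(quoted)
print(restored)
```

So any CSV reader should give back the original bracketed string; the quotes only exist in the file.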
I solved it!
df_Welding_amp.rdd.coalesce(1).saveAsTextFile('home/ram/file.csv')
Although I didn't get exactly a list of lists, I was able to get the row format shown below:
Row(speed='[5.59239, 2.51329, 0.141536, 1.27485, 2.35138, 12.9668, 12.9668, 2.52421, 0.330804, 0.459188, 0.459188, 0.651573, 3.15373, 6.11923, 8.8445, 8.0871, 0.855173, 1.43534, 1.43534, 1.05988, 1.05988, 0.778344, 1.20522, 1.70414, 1.70414, 0.0795492, 1.10385, 1.4759, 1.64844, 0.82941........
.....]
Row(speed='[0.0191166, 0.169978, 0.226254, 0.149923, 0.149923, 0.505102, 0.505102, 0.369975, 0.305384, 0.154693, 0.224818, 0.875909, 0.875909, 2.5506, 6.06761, 5.0829, 4.46667, 2.16333, 3.74257, 3.74257, 2.33873, 1.39336, 1.56772, 0.889895, 0.249284, 0.249284, 0.132409, 0.177825, 0.270215, 0.398466, 2.3726, 4.87186, 4.05198, 2.23753, 0.266356, 0.513157, 0.78962, 0.523164, 0.138469, 0.315834, 0.315834]
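To get from that row format to an actual list of lists, the saved lines can be parsed after the fact. This is only a sketch in plain Python, not part of the Spark job: it assumes each saved line contains exactly one bracketed list of numbers, and the sample lines and the helper name parse_speed_line are hypothetical.

```python
import json


def parse_speed_line(line):
    """Extract the bracketed list from one saved line and parse it as floats."""
    start = line.index("[")       # first opening bracket
    end = line.rindex("]") + 1    # last closing bracket, inclusive
    return json.loads(line[start:end])


# Hypothetical shortened lines in the same shape as the saved output.
sample_lines = [
    "Row(speed='[5.59239, 2.51329, 0.141536]')",
    "Row(speed='[1.49012e-08, 0.169978]')",
]

speeds = [parse_speed_line(line) for line in sample_lines]
print(speeds)
```

Reading the file line by line like this keeps memory use bounded by one row at a time, which avoids the toPandas() concern.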