Write DataFrame to HDFS, connection refused
I am trying to run the examples from the book Mastering Apache Spark 2.x.
scala> val df = sc.parallelize(Array(1,2,3)).toDF
df: org.apache.spark.sql.DataFrame = [value: int]
I am new to the Spark world, but I assume the DataFrame should be saved to HDFS:
scala> df.write.json("hdfs://localhost:9000/tmp/account.json")
java.net.ConnectException: Call From miki/127.0.1.1 to localhost:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
I checked with dfsadmin:
hadoop dfsadmin -safemode enter
WARNING: Use of this script to execute dfsadmin is deprecated.
WARNING: Attempting to execute replacement "hdfs dfsadmin" instead.
safemode: FileSystem file:/// is not an HDFS file system
jps output:
miki@miki:~$ jps
13798 Jps
10906 SparkSubmit
How can I fix this?
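Based on your jps output, you are not running the Hadoop daemons needed to read from and write to HDFS: jps lists only SparkSubmit, with no NameNode or DataNode (and no ResourceManager, which you would also need if you run on YARN). Nothing is listening on localhost:9000, which is why the connection is refused. The dfsadmin message "FileSystem file:/// is not an HDFS file system" also suggests that fs.defaultFS is not set to an hdfs:// URI in the core-site.xml your Hadoop client reads. Make sure you run start-dfs.sh (and start-yarn.sh if you need YARN) so that HDFS is up and running on your machine.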
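As a rough way to verify this, the sketch below checks jps-style output for the two HDFS daemons. The function name missing_hdfs_daemons is hypothetical (it is not part of Hadoop), and the commented commands assume a single-node install with the Hadoop bin and sbin directories on PATH. A brand-new install would also need a one-time hdfs namenode -format before the first start; that command erases existing HDFS metadata, so it is left out here.

```shell
# Sketch: report which HDFS daemons are absent from `jps` output.
# Reads jps-style lines ("<pid> <ClassName>") on stdin and prints one
# missing daemon name per line; prints nothing when both are present.
# The function name is hypothetical, not a Hadoop tool.
missing_hdfs_daemons() (
  jps_output="$(cat)"
  for daemon in NameNode DataNode; do
    # Compare against the whole second field, so SecondaryNameNode
    # is not mistaken for NameNode.
    if ! printf '%s\n' "$jps_output" |
        awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
      printf '%s\n' "$daemon"
    fi
  done
)

# Typical use on a single-node setup (assumed layout, adjust as needed):
#   start-dfs.sh                         # start the HDFS daemons
#   jps | missing_hdfs_daemons           # empty output means both are up
#   hdfs getconf -confKey fs.defaultFS   # compare with the hdfs://localhost:9000 URI used in the question
```

Run against the jps output from the question, the function would list both NameNode and DataNode as missing, which matches the connection-refused error. Once it prints nothing and fs.defaultFS points at the same host and port as the URI passed to df.write.json, the write should be able to reach the NameNode.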