Spark exception on Windows: java.io.IOException: fail to rename shuffle file
I'm having trouble running a piece of Spark Streaming code. It reads data from a Kafka topic and pushes the processed data to Elasticsearch. I'm running the code from Eclipse on Windows, with Kafka, Spark, ZooKeeper, and Elasticsearch all configured. I get the following error:
18/02/20 14:52:11 ERROR Executor: Exception in task 0.0 in stage 6.0 (TID 5)
java.io.IOException: fail to rename file C:\Users\shash\AppData\Local\Temp\blockmgr-cb45497b-7f85-4158-815b-852edecbb2c5[=12=]f\shuffle_1_0_0.index.ca3b55d2-6c26-4798-a17b-21a42f099126 to C:\Users\shash\AppData\Local\Temp\blockmgr-cb45497b-7f85-4158-815b-852edecbb2c5[=12=]f\shuffle_1_0_0.index
at org.apache.spark.shuffle.IndexShuffleBlockResolver.writeIndexFileAndCommit(IndexShuffleBlockResolver.scala:178)
at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:72)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:79)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:47)
at org.apache.spark.scheduler.Task.run(Task.scala:85)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
18/02/20 14:52:11 WARN TaskSetManager: Lost task 0.0 in stage 6.0 (TID 5, localhost): java.io.IOException: fail to rename file C:\Users\shash\AppData\Local\Temp\blockmgr-cb45497b-7f85-4158-815b-852edecbb2c5[=12=]f\shuffle_1_0_0.index.ca3b55d2-6c26-4798-a17b-21a42f099126 to C:\Users\shash\AppData\Local\Temp\blockmgr-cb45497b-7f85-4158-815b-852edecbb2c5[=12=]f\shuffle_1_0_0.index
at org.apache.spark.shuffle.IndexShuffleBlockResolver.writeIndexFileAndCommit(IndexShuffleBlockResolver.scala:178)
at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:72)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:79)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:47)
at org.apache.spark.scheduler.Task.run(Task.scala:85)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Can someone give me some input on how to resolve this issue?
I solved this by setting spark.local.dir to a different path on which the user running Spark has permission to rename files.
SparkConf conf = new SparkConf().set("spark.local.dir", "another path"); // any local directory you can write to and rename in
I'm new to Spark; I hope this works for you.
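As a slightly fuller sketch of the fix above: Spark writes shuffle index files into its scratch space and then renames them, which is what fails here, so `spark.local.dir` must point at a folder the current user fully controls. The directory name `D:\spark-temp` below is a hypothetical placeholder, not anything from the original post:

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class LocalDirExample {
    public static void main(String[] args) {
        // Redirect Spark's scratch space away from %TEMP% to a directory
        // the current Windows user can write to and rename files in.
        SparkConf conf = new SparkConf()
                .setAppName("shuffle-rename-fix")
                .setMaster("local[*]")
                .set("spark.local.dir", "D:\\spark-temp"); // hypothetical path

        JavaSparkContext sc = new JavaSparkContext(conf);
        // ... build the Kafka -> Elasticsearch streaming job here ...
        sc.close();
    }
}
```

The same setting can also be passed at submit time instead of being hard-coded, e.g. `spark-submit --conf spark.local.dir=D:\spark-temp ...`.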