Apache Spark: JAR file not shipped on spark-submit
Is it normal that Spark does not automatically ship the JAR file (containing the Spark application) from the master to the slaves? It worked in earlier versions (and when used on Amazon Web Services)! Did this behavior change since version 1.2.2, or is the problem caused by a cluster without public DNS addresses? Or does this "copy the jar automatically" feature only work on AWS clusters?
This is my submit call:
./spark-submit --class prototype.Test --master spark://192.168.178.128:7077 --deploy-mode cluster ~/test.jar
Note: files listed with the --jars parameter are "copied" to the workers.
That was my own fault! -> Do not use the --deploy-mode parameter on a standard cluster where the driver process is supposed to run on the master node.
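In other words, with a standalone cluster submitted from the master machine, leaving out --deploy-mode (so the default, client mode, applies) lets spark-submit ship the application JAR to the executors. A corrected invocation, reusing the class, master URL, and JAR path from above, might look like:

```shell
# Client mode (the default): the driver runs inside the spark-submit
# process, and the application JAR is distributed to the executors.
./spark-submit \
  --class prototype.Test \
  --master spark://192.168.178.128:7077 \
  ~/test.jar

# If cluster mode really is needed, the JAR must be reachable from every
# worker node, e.g. via a shared filesystem or an hdfs:// URI, since the
# driver is then launched on a worker rather than on this machine.
```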
See the Spark documentation: https://spark.apache.org/docs/latest/submitting-applications.html
--deploy-mode: Whether to deploy your driver on the worker nodes (cluster) or locally as an external client (client) (default: client) [...]
A common deployment strategy is to submit your application from a gateway machine that is physically co-located with your worker machines (e.g. Master node in a standalone EC2 cluster). In this setup, client mode is appropriate. In client mode, the driver is launched directly within the spark-submit process which acts as a client to the cluster. The input and output of the application is attached to the console. Thus, this mode is especially suitable for applications that involve the REPL (e.g. Spark shell).
[...]