Invalid JSON while submitting a spark-submit job via NiFi
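I am trying to submit a Spark job in which I set a date parameter as a conf property, and I run it through a script in NiFi. However, I get an error when I run the script.
The spark-submit code in the script: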
aws emr add-steps --cluster-id "" --steps '[{"Args":["spark-submit","--deploy-mode","cluster","--jars","s3://tvsc-lumiq-edl/jars/ojdbc7.jar","--executor-memory","10g","--driver-memory","10g","--conf","spark.hadoop.yarn.timeline-service.enabled=false","--conf","currDate='\"\"'","--class",'\"\"','\"\"','\"\"'],"Type":"CUSTOM_JAR","ActionOnFailure":"CONTINUE","Jar":"command-runner.jar","Properties":"","Name":"Spark application"}]' --region ""
After I run it, I get the following error:
ExecuteStreamCommand[id=5b08df5a-1f24-3958-30ca-2e27a6c4becf] Transferring flow file StandardFlowFileRecord[uuid=00f844ee-dbea-42a3-aba3-0edcabfc50a2,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1607082757752-507103, container=default, section=223], offset=29, length=-1],offset=0,name=6414901712887990,size=0] to nonzero status. Executable command /bin/bash ended in an error:
Error parsing parameter '--steps': Invalid JSON:
[{"Args":["spark-submit","--deploy-mode","cluster","--jars","s3://tvsc-lumiq-edl/jars/ojdbc7.jar","--executor-memory","10g","--driver-memory","10g","--conf","spark.hadoop.yarn.timeline-service.enabled=false","--conf","currDate="Fri
Where am I going wrong?
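You can use JSONLint to validate your JSON, which makes it much easier to find the cause of the error.
In your case, you wrapped the last 3 values in single quotes (') instead of double quotes ("). JSON only accepts double-quoted strings.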
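If pasting the payload into JSONLint is inconvenient, the same check can be done locally. The following is a minimal sketch, assuming Python 3 is available on the machine; the helper name `first_json_error` and the shortened `Args` lists are made up for illustration and are not part of the original script.

```python
import json


def first_json_error(text):
    """Return None if text is valid JSON, else a description of the first error."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # msg, lineno and colno are attributes of json.JSONDecodeError.
        return f"{err.msg} at line {err.lineno}, column {err.colno}"
    return None


# Hypothetical, shortened Args lists; the class name is a placeholder.
single_quoted = """["--class", 'com.example.Main']"""
double_quoted = """["--class", "com.example.Main"]"""

if __name__ == "__main__":
    print("single quotes:", first_json_error(single_quoted))
    print("double quotes:", first_json_error(double_quoted))
```

The reported column points at the first character the parser could not accept, which is usually enough to spot a stray single quote.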
Your steps JSON should look like this:
[{
    "Args": [
        "spark-submit",
        "--deploy-mode",
        "cluster",
        "--jars",
        "s3://tvsc-lumiq-edl/jars/ojdbc7.jar",
        "--executor-memory",
        "10g",
        "--driver-memory",
        "10g",
        "--conf",
        "spark.hadoop.yarn.timeline-service.enabled=false",
        "--conf",
        "currDate='\"\"'",
        "--class",
        "\"\"",
        "\"\"",
        "\"\""
    ],
    "Type": "CUSTOM_JAR",
    "ActionOnFailure": "CONTINUE",
    "Jar": "command-runner.jar",
    "Properties": "",
    "Name": "Spark application"
}]
Specifically, these 3 lines:
"\"\"",
"\"\"",
"\"\""
instead of the original:
'\"\"',
'\"\"',
'\"\"'