Hadoop post-hook and job completion notification

I want to load the output of my Hadoop job into a Hive table. How can I implement a post-hook in a map-reduce job/flow? Or is there any other way to automate this?

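In addition, I would like to send a notification once the job completes, for example an email to the user. I found this: https://issues.apache.org/jira/browse/HADOOP-1111, but I don't quite understand how to use it, since I am new to map-reduce.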

Thanks.

conf.set("mapreduce.job.end-notification.url","url")

will do. The url should be an HTTP URL at which you will receive the callback.

From the javadocs:

Set the uri to be invoked in-order to send a notification after the job has completed (success/failure).

The uri can contain 2 special parameters: $jobId and $jobStatus. Those, if present, are replaced by the job's identifier and completion-status respectively.

This is typically used by application-writers to implement chaining of Map-Reduce jobs in an asynchronous manner.

Note that older Hadoop versions use job.end.notification.url.
It is deprecated in newer versions in favor of mapreduce.job.end-notification.url.

Reference: mapred-default.xml#mapreduce.job.end-notification.url.
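
To make the callback concrete, here is a minimal sketch of the receiving side, using only the HTTP server bundled with the JDK. The class name JobEndCallback, the /jobdone path and the query parameter names jobId and status are assumptions made for illustration; only the $jobId and $jobStatus placeholders come from the javadoc quoted above. The Hive load and the email are left as a comment, since how you trigger them depends on your setup. The main method simulates a single callback with example values so that the sketch can be run without a cluster.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

class JobEndCallback {

    // Splits a raw query string such as "jobId=job_0001&status=SUCCEEDED" into a map.
    public static Map<String, String> parseQuery(String rawQuery) {
        Map<String, String> params = new HashMap<>();
        if (rawQuery == null || rawQuery.isEmpty()) {
            return params;
        }
        for (String pair : rawQuery.split("&")) {
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                continue; // skip fragments without a key
            }
            String key = URLDecoder.decode(pair.substring(0, eq), StandardCharsets.UTF_8);
            String value = URLDecoder.decode(pair.substring(eq + 1), StandardCharsets.UTF_8);
            params.put(key, value);
        }
        return params;
    }

    // Builds the URL template to pass to
    // conf.set("mapreduce.job.end-notification.url", ...).
    // The path and parameter names are hypothetical; the $ placeholders are Hadoop's.
    public static String notificationUrl(String host, int port) {
        return "http://" + host + ":" + port + "/jobdone?jobId=$jobId&status=$jobStatus";
    }

    // Starts an HTTP server that logs each job-end callback it receives.
    public static HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/jobdone", exchange -> {
            Map<String, String> p = parseQuery(exchange.getRequestURI().getRawQuery());
            System.out.println("Job " + p.get("jobId") + " finished with status " + p.get("status"));
            // Assumption: this is where you would start the Hive load or send the email.
            exchange.sendResponseHeaders(200, -1); // 200 with no response body
            exchange.close();
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws Exception {
        HttpServer server = start(0); // port 0 lets the OS pick a free port
        int port = server.getAddress().getPort();
        String template = notificationUrl("localhost", port);
        System.out.println("Notification URL template: " + template);

        // Simulate the placeholder substitution with example values and call ourselves once.
        String url = template.replace("$jobId", "job_0001").replace("$jobStatus", "SUCCEEDED");
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        System.out.println("Callback answered with HTTP " + conn.getResponseCode());
        server.stop(0);
    }
}
```

In a real deployment you would keep the server running on a host and port the cluster can reach, and pass the string returned by notificationUrl(...) as the value in the conf.set call shown earlier.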