Error: Java heap space in reducer phase
I am getting a Java heap space error in the reducer phase. My application uses 41 reducers and a custom Partitioner class.
The job log below shows the error, followed by the reducer code that throws it.
17/02/12 05:26:45 INFO mapreduce.Job: map 98% reduce 0%
17/02/12 05:28:02 INFO mapreduce.Job: map 100% reduce 0%
17/02/12 05:28:09 INFO mapreduce.Job: map 100% reduce 17%
17/02/12 05:28:10 INFO mapreduce.Job: map 100% reduce 39%
17/02/12 05:28:11 INFO mapreduce.Job: map 100% reduce 46%
17/02/12 05:28:12 INFO mapreduce.Job: map 100% reduce 51%
17/02/12 05:28:13 INFO mapreduce.Job: map 100% reduce 54%
17/02/12 05:28:14 INFO mapreduce.Job: map 100% reduce 56%
17/02/12 05:28:15 INFO mapreduce.Job: map 100% reduce 88%
17/02/12 05:28:16 INFO mapreduce.Job: map 100% reduce 90%
17/02/12 05:28:18 INFO mapreduce.Job: map 100% reduce 93%
17/02/12 05:28:18 INFO mapreduce.Job: Task Id : attempt_1486663266028_2653_r_000020_0, Status : FAILED
Error: Java heap space
17/02/12 05:28:19 INFO mapreduce.Job: map 100% reduce 91%
17/02/12 05:28:20 INFO mapreduce.Job: Task Id : attempt_1486663266028_2653_r_000021_0, Status : FAILED
Error: Java heap space
17/02/12 05:28:22 INFO mapreduce.Job: Task Id : attempt_1486663266028_2653_r_000027_0, Status : FAILED
Error: Java heap space
17/02/12 05:28:23 INFO mapreduce.Job: map 100% reduce 89%
17/02/12 05:28:24 INFO mapreduce.Job: map 100% reduce 90%
17/02/12 05:28:24 INFO mapreduce.Job: Task Id : attempt_1486663266028_2653_r_000029_0, Status : FAILED
Error: Java heap space
Here is my reducer code:
public class MyReducer extends Reducer<NullWritable, Text, NullWritable, Text> {
    private Logger logger = Logger.getLogger(MyReducer.class);
    StringBuilder sb = new StringBuilder();
    private MultipleOutputs<NullWritable, Text> multipleOutputs;

    public void setup(Context context) {
        logger.info("Inside Reducer.");
        multipleOutputs = new MultipleOutputs<NullWritable, Text>(context);
    }

    @Override
    public void reduce(NullWritable Key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        for (Text value : values) {
            final String valueStr = value.toString();
            if (valueStr.contains("Japan")) {
                sb.append(valueStr.substring(0, valueStr.length() - 20));
            } else if (valueStr.contains("SelfSourcedPrivate")) {
                sb.append(valueStr.substring(0, valueStr.length() - 29));
            } else if (valueStr.contains("SelfSourcedPublic")) {
                sb.append(value.toString().substring(0, valueStr.length() - 29));
            } else if (valueStr.contains("ThirdPartyPrivate")) {
                sb.append(valueStr.substring(0, valueStr.length() - 25));
            }
        }
        multipleOutputs.write(NullWritable.get(), new Text(sb.toString()), "MyFileName");
    }

    public void cleanup(Context context) throws IOException, InterruptedException {
        multipleOutputs.close();
    }
}
Can you suggest any changes that would solve my problem?
Would using a combiner class improve things?
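I finally managed to resolve it.
I simply moved the call multipleOutputs.write(NullWritable.get(), new Text(sb.toString()), strName); inside the for loop, and that solved my problem. I tested it with a very large data set, a 19 GB file, and it worked fine for me.
This is my final solution. Initially I thought it might create too many objects, but it works fine for me, and the MapReduce job also completes quickly. Each value is now written as soon as it is processed, so the reducer no longer holds the whole output in a single StringBuilder.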
@Override
public void reduce(NullWritable Key, Iterable<Text> values, Context context)
        throws IOException, InterruptedException {
    for (Text value : values) {
        final String valueStr = value.toString();
        StringBuilder sb = new StringBuilder();
        if (valueStr.contains("Japan")) {
            sb.append(valueStr.substring(0, valueStr.length() - 20));
        } else if (valueStr.contains("SelfSourcedPrivate")) {
            sb.append(valueStr.substring(0, valueStr.length() - 24));
        } else if (valueStr.contains("SelfSourcedPublic")) {
            sb.append(value.toString().substring(0, valueStr.length() - 25));
        } else if (valueStr.contains("ThirdPartyPrivate")) {
            sb.append(valueStr.substring(0, valueStr.length() - 25));
        }
        multipleOutputs.write(NullWritable.get(), new Text(sb.toString()),
                strName);
    }
}
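For anyone who wants to try the pattern outside a cluster, here is a minimal sketch in plain Java with no Hadoop dependency. It is not the reducer itself: the class name StreamingTrimSketch, the helpers trim and writeEach, and the use of a java.io.Writer in place of MultipleOutputs are all assumptions made for illustration. The suffix lengths are copied from the final reduce() above. One deliberate difference: the sketch clamps the cut position at 0, whereas the original substring call would throw on a value shorter than the suffix.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the reducer logic; a Writer plays the role of MultipleOutputs.
class StreamingTrimSketch {
    // Marker -> number of trailing characters to drop (values taken from the final reduce()).
    private static final Map<String, Integer> SUFFIX_LENGTHS = new LinkedHashMap<>();
    static {
        SUFFIX_LENGTHS.put("Japan", 20);
        SUFFIX_LENGTHS.put("SelfSourcedPrivate", 24);
        SUFFIX_LENGTHS.put("SelfSourcedPublic", 25);
        SUFFIX_LENGTHS.put("ThirdPartyPrivate", 25);
    }

    // Returns the value with its trailing suffix removed, or "" if no marker matches.
    static String trim(String value) {
        for (Map.Entry<String, Integer> e : SUFFIX_LENGTHS.entrySet()) {
            if (value.contains(e.getKey())) {
                // Assumption: clamp at 0 so short values do not throw.
                int end = Math.max(0, value.length() - e.getValue());
                return value.substring(0, end);
            }
        }
        return "";
    }

    // Writes each trimmed record immediately, so memory use is bounded by one
    // record instead of growing with the whole input.
    static void writeEach(Iterable<String> values, Writer out) throws IOException {
        for (String value : values) {
            out.write(trim(value));
            out.write('\n');
        }
    }

    public static void main(String[] args) throws IOException {
        StringWriter out = new StringWriter();
        writeEach(Arrays.asList(
                "A|Japan_0123456789abcd",        // 20-character suffix
                "B|ThirdPartyPrivate_1234567"),  // 25-character suffix
                out);
        System.out.print(out); // prints "A|" and "B|" on separate lines
    }
}
```

The point of the design is that nothing outlives a loop iteration. In the original reducer the StringBuilder was an instance field that was never cleared, so it kept growing across every value and every reduce() call until the task ran out of heap.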