Why is my reducer output the same as my mapper input?

Question

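I ran into an interesting situation: my reducer output is identical to my mapper input (the reducer code does not seem to run). This is my first dataset, as I am new to Hadoop. Thanks in advance.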

Problem statement: find the maximum temperature for each year.

Consider my dataset (the year and temp columns are separated by a tab).

Mapper code:

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class MapperCode extends Mapper<LongWritable, Text, Text, IntWritable> {
    @Override
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Each input line is "<year>\t<temperature>".
        String line = value.toString();
        String[] keyValPair = line.split("\t");
        // Emit (year, temperature).
        context.write(new Text(keyValPair[0].trim()),
                new IntWritable(Integer.parseInt(keyValPair[1].trim())));
    }
}

Reducer code:

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class ReducerCode extends Reducer<Text,IntWritable,Text,IntWritable>           {
public void reducer(Text key,Iterable<IntWritable> value,Context context)throws IOException,InterruptedException
{
    int max=0;
    for (IntWritable values:value)
    {
        max=Math.max(max, values.get());
    }
    context.write(key,new IntWritable(max));    
}   
}

Driver code:

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class MaxTemp extends Configuration {
    public static void main(String[] args) throws IOException,InterruptedException,Exception {
Job job=new Job();
job.setJobName("MaxTemp");
job.setJarByClass(MaxTemp.class);
FileInputFormat.addInputPath(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));
job.setMapperClass(MapperCode.class);
job.setReducerClass(ReducerCode.class);
job.setInputFormatClass(TextInputFormat.class);
job.setOutputFormatClass(TextOutputFormat.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);
job.waitForCompletion(true);

    }

}

Please let me know where I made a mistake. Why is my output the same as the input dataset?

Answer 1

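A Reducer implementation must override the reduce() method. Your implementation has a method named reducer(), which is never called. The framework falls back to the default reduce(), which is an identity function: it writes every (key, value) pair it receives unchanged, so the job output looks exactly like the mapper input.

Change it to: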

public class ReducerCode extends Reducer<Text,IntWritable,Text,IntWritable> {
    public void reduce(Text key,Iterable<IntWritable> value,Context context)throws IOException,InterruptedException {
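
To illustrate why the misnamed method is silently ignored, here is a minimal sketch in plain Java with no Hadoop dependency. SimpleReducer is a hypothetical stand-in for Hadoop's Reducer, not the real class, and the sample temperatures are made up. It only assumes what the answer describes: the framework calls reduce(), and the default reduce() passes its input through.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified stand-in for Hadoop's Reducer (not the real class).
// Its default reduce() is an identity pass-through.
class SimpleReducer {
    protected final List<String> output = new ArrayList<>();

    // Default behaviour: emit every (key, value) pair unchanged.
    protected void reduce(String key, List<Integer> values) {
        for (int v : values) {
            output.add(key + "\t" + v);
        }
    }

    // The "framework" only ever calls reduce(), never another method name.
    public List<String> run(String key, List<Integer> values) {
        reduce(key, values);
        return output;
    }
}

// Mirrors the question: the method is named reducer(), so it overrides nothing.
class MisnamedReducer extends SimpleReducer {
    public void reducer(String key, List<Integer> values) {
        int max = Integer.MIN_VALUE;
        for (int v : values) {
            max = Math.max(max, v);
        }
        output.add(key + "\t" + max);
    }
}

// Mirrors the fix: reduce() replaces the identity default.
class FixedReducer extends SimpleReducer {
    @Override
    protected void reduce(String key, List<Integer> values) {
        int max = Integer.MIN_VALUE;
        for (int v : values) {
            max = Math.max(max, v);
        }
        output.add(key + "\t" + max);
    }
}

class OverrideDemo {
    public static void main(String[] args) {
        List<Integer> temps = Arrays.asList(30, 45, 28); // made-up sample values
        // Emits all three pairs: reducer() is never invoked.
        System.out.println(new MisnamedReducer().run("2001", temps));
        // Emits a single pair holding the maximum.
        System.out.println(new FixedReducer().run("2001", temps));
    }
}
```

Adding @Override to the misnamed reducer() method would turn the mistake into a compile-time error, because the method does not override anything in the superclass.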

hadoop

mapreduce

hadoop2