Error - Hadoop Word Count Program in MapReduce
I am new to Hadoop, so please excuse me if this question seems silly.
I am running my MapReduce program and getting the following error:
java.lang.Exception: java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.Text, received org.apache.hadoop.io.LongWritable
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:354)
Caused by: java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.Text, received org.apache.hadoop.io.LongWritable
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1019)
Any help is appreciated.
public class WordCount {

    // Mapper Class
    public static class MapperClass extends Mapper<Object, Text, Text, IntWritable> {

        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        // Mapper method defined
        public void mapperMethod(Object key, Text lineContent, Context context) {
            try {
                StringTokenizer strToken = new StringTokenizer(lineContent.toString());
                // Iterating through the line
                while (strToken.hasMoreTokens()) {
                    word.set(strToken.nextToken());
                    try {
                        context.write(word, one);
                    }
                    catch (Exception e) {
                        System.err.println(new Date() + " ---> Cannot write data to hadoop in Mapper.");
                        e.printStackTrace();
                    }
                }
            }
            catch (Exception ex) {
                ex.printStackTrace();
            }
        }
    }

    // Reducer Class
    public static class ReducerClass extends Reducer<Text, IntWritable, Text, IntWritable> {

        private IntWritable result = new IntWritable();

        // Reducer method
        public void reduce(Text key, Iterable<IntWritable> values, Context context) {
            try {
                int sum = 0;
                for (IntWritable itr : values) {
                    sum += itr.get();
                }
                result.set(sum);
                try {
                    context.write(key, result);
                } catch (Exception e) {
                    System.err.println(new Date() + " ---> Error while sending data to Hadoop in Reducer");
                    e.printStackTrace();
                }
            }
            catch (Exception err) {
                err.printStackTrace();
            }
        }
    }

    public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {
        try {
            Configuration conf = new Configuration();
            String[] arguments = new GenericOptionsParser(conf, args).getRemainingArgs();
            if (arguments.length != 2) {
                System.err.println("Enter both an input and an output location.");
                System.exit(1);
            }
            Job job = new Job(conf, "Simple Word Count");
            job.setJarByClass(WordCount.class);
            job.setMapperClass(MapperClass.class);
            job.setReducerClass(ReducerClass.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(arguments[0]));
            FileOutputFormat.setOutputPath(job, new Path(arguments[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
        catch (Exception e) {
        }
    }
}
You need to override the map method of the Mapper class; instead, you have defined a new method.
The error occurs because you did not override map, so your program falls back to the default (identity) mapper and effectively boils down to a reduce-only job. It is therefore fed LongWritable, Text pairs, while you have declared Text and IntWritable as the map output types, hence the type-mismatch exception.
Hope this clears it up.
public static class Map extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
        String line = value.toString();
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            word.set(tokenizer.nextToken());
            output.collect(word, one);
        }
    }
}
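Note that the snippet above uses the older org.apache.hadoop.mapred API (MapReduceBase, OutputCollector, Reporter). Since the question uses the new org.apache.hadoop.mapreduce API, the equivalent fix there is simply to rename mapperMethod to map with the matching signature so it actually overrides Mapper.map. A minimal sketch of that version, reusing the fields and tokenizing logic from the question, could look like this:

public static class MapperClass extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    // Overriding map (rather than adding a new method) makes the framework call this
    // instead of the default identity mapper, which was emitting LongWritable keys.
    @Override
    public void map(Object key, Text lineContent, Context context) throws IOException, InterruptedException {
        StringTokenizer strToken = new StringTokenizer(lineContent.toString());
        while (strToken.hasMoreTokens()) {
            word.set(strToken.nextToken());
            context.write(word, one);   // emits (word, 1) with the declared Text/IntWritable types
        }
    }
}

With that override in place, the map output key/value types match the Text and IntWritable classes declared in the driver, so the type-mismatch exception goes away.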