I am trying to implement a MapReduce job where each mapper takes 150 lines of the input text file and all the mappers run simultaneously; also, the job should not fail, no matter how many map tasks fail.
Here's the configuration part:
    JobConf conf = new JobConf(Main.class);
    conf.setJobName("My mapreduce");
    conf.set("mapreduce.input.lineinputformat.linespermap", "150");
    conf.set("mapred.max.map.failures.percent", "100");
    conf.setInputFormat(NLineInputFormat.class);
    FileInputFormat.addInputPath(conf, new Path(args[0]));
    FileOutputFormat.setOutputPath(conf, new Path(args[1]));
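One thing I'm unsure about (this is a guess on my part, not something I've verified): since `JobConf` belongs to the old `mapred` API, the old-API `org.apache.hadoop.mapred.lib.NLineInputFormat` may read the old-style property name rather than the new `mapreduce.*` key, e.g.:

```java
// Assumption: the old mapred API may ignore the new-style key
// "mapreduce.input.lineinputformat.linespermap", so the old-style
// property name might be the one that actually takes effect here.
conf.setInt("mapred.line.input.format.linespermap", 150);
```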
The problem is that Hadoop creates a mapper for every single line of text, the mappers seem to run sequentially, and if a single one fails, the whole job fails.
From this I deduce that the settings I've applied have no effect.
What did I do wrong?