Sunday 20 September 2015

Multiple Output Formats in Hadoop

The MultipleOutputs class simplifies writing data to multiple outputs.

  • Configure each named output with a name and an OutputFormat.
  • When writing out a <key, value> pair, specify the named output to send it to.


Each output is given a name through the static addNamedOutput method of
MultipleOutputs, together with its OutputFormat and its key and value types.
For example, add the following to the driver class:


MultipleOutputs.addNamedOutput(job, "QuantityData", TextOutputFormat.class, NullWritable.class, Text.class);
MultipleOutputs.addNamedOutput(job, "loadProfileChannelData", TextOutputFormat.class, NullWritable.class, Text.class);
MultipleOutputs.addNamedOutput(job, "registerValueData", TextOutputFormat.class, NullWritable.class, Text.class);
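One thing to watch when picking names: MultipleOutputs accepts only ASCII letters and digits in a named output's name, so something like "quantity_data" is rejected with an IllegalArgumentException. The sketch below is a plain-Java pre-check that mirrors that rule so a bad name can be caught before the job is submitted. The class NamedOutputNames and its isValid method are hypothetical helpers written for this illustration, not part of Hadoop.

```java
// Sketch of a pre-flight check for named-output names (plain Java, no Hadoop needed).
// NamedOutputNames is a hypothetical helper, not a Hadoop class.
final class NamedOutputNames {
    private NamedOutputNames() {}

    /** Returns true if the name is non-empty and made only of ASCII letters and digits. */
    static boolean isValid(String name) {
        if (name == null || name.isEmpty()) {
            return false;
        }
        for (char ch : name.toCharArray()) {
            boolean letter = (ch >= 'A' && ch <= 'Z') || (ch >= 'a' && ch <= 'z');
            boolean digit = ch >= '0' && ch <= '9';
            if (!letter && !digit) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // The three names registered in the driver pass the check.
        String[] names = {"QuantityData", "loadProfileChannelData", "registerValueData"};
        for (String name : names) {
            System.out.println(name + " -> " + isValid(name));
        }
        // An underscore is not a letter or digit, so this one fails.
        System.out.println("quantity_data -> " + isValid("quantity_data"));
    }
}
```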

In the mapper class, add the following:

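// Needs java.io.IOException, org.apache.hadoop.io.{LongWritable, NullWritable, Text},
// org.apache.hadoop.mapreduce.Mapper and org.apache.hadoop.mapreduce.lib.output.MultipleOutputs.
public class MyParserMapper extends Mapper<LongWritable, Text, NullWritable, Text> {
    private MultipleOutputs<NullWritable, Text> outs;
    @Override
    public void setup(Context context) throws IOException, InterruptedException {
        outs = new MultipleOutputs<NullWritable, Text>(context);
    }

    @Override
    public void cleanup(Context context) throws IOException, InterruptedException {
        outs.close(); // flush and close every named output
    }
}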
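The remaining step is the write itself: for each record, the mapper has to decide which named output receives it. The plain-Java sketch below shows one way that decision could look. It assumes, purely for illustration, that the first comma-separated field of each line is a record-type code (Q, L or R); the post does not describe the real input layout, and the class NamedOutputRouter and its method are hypothetical.

```java
// Plain-Java sketch: choose a named output for each input line.
// ASSUMPTION: the first comma-separated field is a record-type code (Q, L or R).
// This layout is hypothetical; the post does not describe the input format.
final class NamedOutputRouter {
    private NamedOutputRouter() {}

    /** Returns the named output for a line, or null if the record type is unknown. */
    static String namedOutputFor(String line) {
        if (line == null || line.isEmpty()) {
            return null;
        }
        int comma = line.indexOf(',');
        String type = (comma < 0) ? line : line.substring(0, comma);
        switch (type.trim()) {
            case "Q": return "QuantityData";
            case "L": return "loadProfileChannelData";
            case "R": return "registerValueData";
            default:  return null; // unknown type: caller decides whether to skip or log
        }
    }

    public static void main(String[] args) {
        System.out.println(namedOutputFor("Q,meter-1,42.0")); // prints QuantityData
        System.out.println(namedOutputFor("X,unknown"));      // prints null
    }
}
```

Inside the mapper's map(LongWritable key, Text value, Context context) method, the result would then be passed to MultipleOutputs, for example: String name = NamedOutputRouter.namedOutputFor(value.toString()); if (name != null) { outs.write(name, NullWritable.get(), value); }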
