org.htuple
Class ShuffleUtils

java.lang.Object
  extended by org.htuple.ShuffleUtils

public class ShuffleUtils
extends Object

Utilities to help with custom sorting, grouping and partitioning of Tuples.

The following example shows how to configure a job for secondary sort with two-element tuples, where both elements are used for sorting but only the first element is used for partitioning and grouping.


 ShuffleUtils.configBuilder()
   .setPartitionerIndices(0)
   .setSortIndices(0, 1)
   .setGroupIndices(0)
   .configure(conf);
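
To make the interaction of the three index sets concrete, here is a minimal plain-Java sketch of the effect the configuration above is meant to have. It is not htuple's implementation and does not use Hadoop; the class `SecondarySortSketch`, its helper methods, and the sample data are all hypothetical, and tuples are modelled as `List<String>` purely for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: simulates partition/sort/group by element index.
class SecondarySortSketch {

    // Compare two tuples on the given element indices only, in the order given.
    static Comparator<List<String>> byIndices(int... indices) {
        return (a, b) -> {
            for (int i : indices) {
                int c = a.get(i).compareTo(b.get(i));
                if (c != 0) {
                    return c;
                }
            }
            return 0;
        };
    }

    // Keep only the elements at the given indices.
    static List<String> project(List<String> tuple, int... indices) {
        List<String> key = new ArrayList<>();
        for (int i : indices) {
            key.add(tuple.get(i));
        }
        return key;
    }

    // Choose a partition from the projected elements only, so tuples that
    // agree on those elements always land in the same partition.
    static int partition(List<String> tuple, int numPartitions, int... indices) {
        return Math.floorMod(project(tuple, indices).hashCode(), numPartitions);
    }

    // Sort on sortIndices, then collect tuples that share the same
    // projection on groupIndices, preserving the sorted order.
    static Map<List<String>, List<List<String>>> sortAndGroup(
            List<List<String>> tuples, int[] sortIndices, int[] groupIndices) {
        List<List<String>> sorted = new ArrayList<>(tuples);
        sorted.sort(byIndices(sortIndices));
        Map<List<String>, List<List<String>>> groups = new LinkedHashMap<>();
        for (List<String> t : sorted) {
            groups.computeIfAbsent(project(t, groupIndices), k -> new ArrayList<>()).add(t);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<List<String>> tuples = Arrays.asList(
                Arrays.asList("smith", "john"),
                Arrays.asList("jones", "zoe"),
                Arrays.asList("smith", "adam"),
                Arrays.asList("jones", "amy"));

        // Mirrors the indices in the example: sort on (0, 1), group on (0).
        Map<List<String>, List<List<String>>> groups =
                sortAndGroup(tuples, new int[] {0, 1}, new int[] {0});
        groups.forEach((key, values) -> System.out.println(key + " -> " + values));
    }
}
```

Each group receives its tuples already ordered by the second element, which is the point of secondary sort: the grouping key stays coarse while the sort key is finer.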
 

This class also supports using enums to improve the readability of your code (just like with Tuples).


 enum MyTupleFields { ID, NAME }
 ...
 ShuffleUtils.configBuilder()
   .setPartitionerIndices(MyTupleFields.values())
   .setSortIndices(MyTupleFields.values())
   .setGroupIndices(MyTupleFields.ID)
   .configure(conf);
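
One plausible way to read the enum form is that each constant stands for the tuple position given by its declaration order. The sketch below illustrates that reading only; the class `EnumIndexSketch` and the helper `toIndices` are hypothetical, and the assumption that positions come from `ordinal()` is not confirmed by this page.

```java
import java.util.Arrays;

// Hypothetical illustration: maps enum constants to element indices.
class EnumIndexSketch {

    enum MyTupleFields { ID, NAME }

    // Assumption: a field's index is its declaration position (ordinal).
    static int[] toIndices(Enum<?>... fields) {
        int[] indices = new int[fields.length];
        for (int i = 0; i < fields.length; i++) {
            indices[i] = fields[i].ordinal();
        }
        return indices;
    }

    public static void main(String[] args) {
        // All fields, as passed to the sort indices in the example.
        System.out.println(Arrays.toString(toIndices(MyTupleFields.values())));
        // A single field, as passed to the group indices in the example.
        System.out.println(Arrays.toString(toIndices(MyTupleFields.ID)));
    }
}
```

Under this reading, reordering the enum constants changes which tuple element each name refers to, so the declaration order has to match the order in which elements are added to the tuple.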
 


Nested Class Summary
static class ShuffleUtils.ConfigBuilder
          A builder that allows you to tune how partitioning, sorting and grouping should work for a given MapReduce job using your Tuple instances.
 
Field Summary
static String BASE_CONFIG_NAME
           
static String GROUPING_INDEXES_CONFIG_NAME
           
static String PARTITIONER_INDEXES_CONFIG_NAME
           
static String SORTING_INDEXES_CONFIG_NAME
           
 
Constructor Summary
ShuffleUtils()
           
 
Method Summary
static ShuffleUtils.ConfigBuilder configBuilder()
           
static int[] indexesFromConfig(org.apache.hadoop.conf.Configuration conf, String confName)
           
static String[] indexesToStrings(int[] indexes)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

BASE_CONFIG_NAME

public static final String BASE_CONFIG_NAME
See Also:
Constant Field Values

PARTITIONER_INDEXES_CONFIG_NAME

public static final String PARTITIONER_INDEXES_CONFIG_NAME
See Also:
Constant Field Values

SORTING_INDEXES_CONFIG_NAME

public static final String SORTING_INDEXES_CONFIG_NAME
See Also:
Constant Field Values

GROUPING_INDEXES_CONFIG_NAME

public static final String GROUPING_INDEXES_CONFIG_NAME
See Also:
Constant Field Values
Constructor Detail

ShuffleUtils

public ShuffleUtils()
Method Detail

configBuilder

public static ShuffleUtils.ConfigBuilder configBuilder()

indexesFromConfig

public static int[] indexesFromConfig(org.apache.hadoop.conf.Configuration conf,
                                      String confName)

indexesToStrings

public static String[] indexesToStrings(int[] indexes)
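
This page gives no description for indexesToStrings or indexesFromConfig. Judging only from the signatures, they look like two halves of a round trip: indices are turned into strings to be stored under a configuration name, then parsed back into an int[]. The sketch below illustrates that idea in plain Java under stated assumptions: a `Map<String, String>` stands in for the Hadoop Configuration, values are stored comma-separated, and the class `IndexConfigSketch`, its methods, and the property name `example.sort.indexes` are hypothetical rather than htuple's actual behaviour.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of an index round trip through string config values.
class IndexConfigSketch {

    // Convert each index to its decimal string form.
    static String[] toStrings(int[] indexes) {
        String[] out = new String[indexes.length];
        for (int i = 0; i < indexes.length; i++) {
            out[i] = Integer.toString(indexes[i]);
        }
        return out;
    }

    // Assumption: indices are stored as one comma-separated value.
    static void store(Map<String, String> conf, String confName, int[] indexes) {
        conf.put(confName, String.join(",", toStrings(indexes)));
    }

    // Parse the stored value back into indices; a missing or empty
    // value yields an empty array in this sketch.
    static int[] fromConfig(Map<String, String> conf, String confName) {
        String value = conf.get(confName);
        if (value == null || value.isEmpty()) {
            return new int[0];
        }
        String[] parts = value.split(",");
        int[] out = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            out[i] = Integer.parseInt(parts[i].trim());
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        store(conf, "example.sort.indexes", new int[] {0, 1});
        System.out.println(Arrays.toString(fromConfig(conf, "example.sort.indexes")));
    }
}
```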


Copyright © 2013. All Rights Reserved.