K - key typeV - data (value) typepublic abstract class BatchTupleTSetImpl<K,V> extends BaseTSetWithSchema<V> implements BatchTupleTSet<K,V>
BaseTSet.StateType| Modifier | Constructor and Description |
|---|---|
protected |
BatchTupleTSetImpl() |
| Modifier and Type | Method and Description |
|---|---|
BatchTupleTSetImpl<K,V> |
addInput(java.lang.String key,
StorableTBase<?> input)
Adds inputs to
BatchTupleTSets |
KeyedCachedTSet<K,V> |
cache()
Runs this TSet and caches the data to an in-memory
DataPartition and exposes the data as another TSet. |
TupleSchema |
getOutputSchema()
Output schema of a BatchTupleTSet is definitely a
KeyedSchema. |
BatchEnvironment |
getTSetEnv()
tset env
|
<VR> JoinTLink<K,V,VR> |
join(BatchTupleTSet<K,VR> rightTSet,
CommunicationContext.JoinType type,
java.util.Comparator<K> keyComparator)
Joins with another
BatchTupleTSet. |
<VR> JoinTLink<K,V,VR> |
join(BatchTupleTSet<K,VR> rightTSet,
CommunicationContext.JoinType type,
java.util.Comparator<K> keyComparator,
TaskPartitioner<K> partitioner)
Joins with another
BatchTupleTSet. |
KeyedDirectTLink<K,V> |
keyedDirect()
Direct/pipe communication
|
KeyedGatherTLink<K,V> |
keyedGather()
Gathers data by key for
BatchTupleTSets |
KeyedGatherTLink<K,V> |
keyedGather(PartitionFunc<K> partitionFn)
Gathers data by key for
BatchTupleTSets |
KeyedGatherTLink<K,V> |
keyedGather(PartitionFunc<K> partitionFn,
java.util.Comparator<K> comparator)
Gathers data by key for
BatchTupleTSets |
KeyedGatherUngroupedTLink<K,V> |
keyedGatherUngrouped()
Gathers data by key for
BatchTupleTSets without grouping data by keys |
KeyedGatherUngroupedTLink<K,V> |
keyedGatherUngrouped(PartitionFunc<K> partitionFn)
Gathers data by key for
BatchTupleTSets without grouping data by keys |
KeyedGatherUngroupedTLink<K,V> |
keyedGatherUngrouped(PartitionFunc<K> partitionFn,
java.util.Comparator<K> comparator)
Gathers data by key for
BatchTupleTSets without grouping data by keys |
KeyedPartitionTLink<K,V> |
keyedPartition(PartitionFunc<K> partitionFn)
Partitions data using a
PartitionFunc based on keys |
KeyedPipeTLink<K,V> |
keyedPipe()
Pipe implementation
|
KeyedReduceTLink<K,V> |
keyedReduce(ReduceFunc<V> reduceFn)
Reduces data by key for
BatchTupleTSets |
KeyedCachedTSet<K,V> |
lazyCache()
Performs caching lazily.
|
KeyedPersistedTSet<K,V> |
lazyPersist()
Performs persisting lazily.
|
KeyedPersistedTSet<K,V> |
persist()
Similar to cache, but the data is stored in a disk based
DataPartition. |
BatchTupleTSetImpl<K,V> |
setName(java.lang.String n)
Name of the tset
|
BatchTupleTSetImpl<K,V> |
withSchema(TupleSchema schema)
Sets the data type of the
TupleTSet output. |
getInputSchema, setOutputSchemaaddChildToGraph, addChildToGraph, equals, getId, getInputs, getName, getParallelism, getStateType, hashCode, isMutable, rename, setMutable, setStateType, setTSetEnv, toStringclone, finalize, getClass, notify, notifyAll, wait, wait, waitbuild, getINodegenerateID, getTBaseGraphpublic BatchEnvironment getTSetEnv()
BuildablegetTSetEnv in interface BuildablegetTSetEnv in class BaseTSet<V>public KeyedDirectTLink<K,V> keyedDirect()
BatchTupleTSetkeyedDirect in interface BatchTupleTSet<K,V>keyedDirect in interface TupleTSet<K,V>public KeyedPipeTLink<K,V> keyedPipe()
BatchTupleTSetkeyedPipe in interface BatchTupleTSet<K,V>public KeyedReduceTLink<K,V> keyedReduce(ReduceFunc<V> reduceFn)
BatchTupleTSetBatchTupleTSetskeyedReduce in interface BatchTupleTSet<K,V>reduceFn - the reduce functionpublic KeyedPartitionTLink<K,V> keyedPartition(PartitionFunc<K> partitionFn)
BatchTupleTSetPartitionFunc based on keyskeyedPartition in interface BatchTupleTSet<K,V>keyedPartition in interface TupleTSet<K,V>partitionFn - partition functionpublic KeyedGatherTLink<K,V> keyedGather()
BatchTupleTSetBatchTupleTSetskeyedGather in interface BatchTupleTSet<K,V>public KeyedGatherTLink<K,V> keyedGather(PartitionFunc<K> partitionFn)
BatchTupleTSetBatchTupleTSetskeyedGather in interface BatchTupleTSet<K,V>partitionFn - partition function to partition data based on keypublic KeyedGatherTLink<K,V> keyedGather(PartitionFunc<K> partitionFn, java.util.Comparator<K> comparator)
BatchTupleTSetBatchTupleTSetskeyedGather in interface BatchTupleTSet<K,V>partitionFn - partition function to partition data based on keycomparator - custom key comparatorpublic KeyedGatherUngroupedTLink<K,V> keyedGatherUngrouped()
BatchTupleTSetBatchTupleTSets without grouping data by keyskeyedGatherUngrouped in interface BatchTupleTSet<K,V>public KeyedGatherUngroupedTLink<K,V> keyedGatherUngrouped(PartitionFunc<K> partitionFn)
BatchTupleTSetBatchTupleTSets without grouping data by keyskeyedGatherUngrouped in interface BatchTupleTSet<K,V>partitionFn - partition function to partition data based on keypublic KeyedGatherUngroupedTLink<K,V> keyedGatherUngrouped(PartitionFunc<K> partitionFn, java.util.Comparator<K> comparator)
BatchTupleTSetBatchTupleTSets without grouping data by keyskeyedGatherUngrouped in interface BatchTupleTSet<K,V>partitionFn - partition function to partition data based on keycomparator - custom key comparatorpublic <VR> JoinTLink<K,V,VR> join(BatchTupleTSet<K,VR> rightTSet, CommunicationContext.JoinType type, java.util.Comparator<K> keyComparator, TaskPartitioner<K> partitioner)
BatchTupleTSetBatchTupleTSet. Note that this TSet will be considered the left
TSetjoin in interface BatchTupleTSet<K,V>VR - value type of the right tsetrightTSet - right tsettype - CommunicationContext.JoinTypekeyComparator - key comparatorpartitioner - partitioner for keyspublic <VR> JoinTLink<K,V,VR> join(BatchTupleTSet<K,VR> rightTSet, CommunicationContext.JoinType type, java.util.Comparator<K> keyComparator)
BatchTupleTSetBatchTupleTSet. Note that this TSet will be considered the left
TSetjoin in interface BatchTupleTSet<K,V>VR - value type of the right tsetrightTSet - right tsettype - CommunicationContext.JoinTypekeyComparator - key comparatorpublic BatchTupleTSetImpl<K,V> setName(java.lang.String n)
BatchTupleTSetpublic KeyedCachedTSet<K,V> cache()
StoringDataDataPartition and exposes the data as another TSet.cache in interface StoringData<Tuple<K,V>>public KeyedCachedTSet<K,V> lazyCache()
StoringDataTSet
is evaluated explicitly.lazyCache in interface StoringData<Tuple<K,V>>public KeyedPersistedTSet<K,V> persist()
StoringDataDataPartition. This method would also expose the
checkpointing ability to TSets.persist in interface StoringData<Tuple<K,V>>public KeyedPersistedTSet<K,V> lazyPersist()
StoringDatalazyPersist in interface StoringData<Tuple<K,V>>public BatchTupleTSetImpl<K,V> addInput(java.lang.String key, StorableTBase<?> input)
BatchTupleTSetBatchTupleTSetsaddInput in interface AcceptingData<Tuple<K,V>>addInput in interface BatchTupleTSet<K,V>key - identifier for the inputinput - input tsetpublic BatchTupleTSetImpl<K,V> withSchema(TupleSchema schema)
TupleTSetTupleTSet output. This will be used in the packers for efficient
SER-DE operations in the following TLinkswithSchema in interface TupleTSet<K,V>schema - data type as a MessageTypeTupleTSetpublic TupleSchema getOutputSchema()
KeyedSchema.getOutputSchema in class BaseTSetWithSchema<V>