K
- key typeV
- data (value) typepublic abstract class BatchTupleTSetImpl<K,V> extends BaseTSetWithSchema<V> implements BatchTupleTSet<K,V>
BaseTSet.StateType
Modifier | Constructor and Description |
---|---|
protected |
BatchTupleTSetImpl() |
Modifier and Type | Method and Description |
---|---|
BatchTupleTSetImpl<K,V> |
addInput(java.lang.String key,
StorableTBase<?> input)
Adds inputs to
BatchTupleTSet s |
KeyedCachedTSet<K,V> |
cache()
Runs this TSet and caches the data to an in-memory
DataPartition and exposes the data as another TSet. |
TupleSchema |
getOutputSchema()
Output schema of a BatchTupleTSet is definitely a
KeyedSchema . |
BatchEnvironment |
getTSetEnv()
tset env
|
<VR> JoinTLink<K,V,VR> |
join(BatchTupleTSet<K,VR> rightTSet,
CommunicationContext.JoinType type,
java.util.Comparator<K> keyComparator)
Joins with another
BatchTupleTSet . |
<VR> JoinTLink<K,V,VR> |
join(BatchTupleTSet<K,VR> rightTSet,
CommunicationContext.JoinType type,
java.util.Comparator<K> keyComparator,
TaskPartitioner<K> partitioner)
Joins with another
BatchTupleTSet . |
KeyedDirectTLink<K,V> |
keyedDirect()
Direct/pipe communication
|
KeyedGatherTLink<K,V> |
keyedGather()
Gathers data by key for
BatchTupleTSet s |
KeyedGatherTLink<K,V> |
keyedGather(PartitionFunc<K> partitionFn)
Gathers data by key for
BatchTupleTSet s |
KeyedGatherTLink<K,V> |
keyedGather(PartitionFunc<K> partitionFn,
java.util.Comparator<K> comparator)
Gathers data by key for
BatchTupleTSet s |
KeyedGatherUngroupedTLink<K,V> |
keyedGatherUngrouped()
Gathers data by key for
BatchTupleTSet s without grouping data by keys |
KeyedGatherUngroupedTLink<K,V> |
keyedGatherUngrouped(PartitionFunc<K> partitionFn)
Gathers data by key for
BatchTupleTSet s without grouping data by keys |
KeyedGatherUngroupedTLink<K,V> |
keyedGatherUngrouped(PartitionFunc<K> partitionFn,
java.util.Comparator<K> comparator)
Gathers data by key for
BatchTupleTSet s without grouping data by keys |
KeyedPartitionTLink<K,V> |
keyedPartition(PartitionFunc<K> partitionFn)
Partitions data using a
PartitionFunc based on keys |
KeyedPipeTLink<K,V> |
keyedPipe()
Pipe implementation
|
KeyedReduceTLink<K,V> |
keyedReduce(ReduceFunc<V> reduceFn)
Reduces data by key for
BatchTupleTSet s |
KeyedCachedTSet<K,V> |
lazyCache()
Performs caching lazily.
|
KeyedPersistedTSet<K,V> |
lazyPersist()
Performs persisting lazily.
|
KeyedPersistedTSet<K,V> |
persist()
Similar to cache, but the data is stored in a disk based
DataPartition . |
BatchTupleTSetImpl<K,V> |
setName(java.lang.String n)
Name of the tset
|
BatchTupleTSetImpl<K,V> |
withSchema(TupleSchema schema)
Sets the data type of the
TupleTSet output. |
getInputSchema, setOutputSchema
addChildToGraph, addChildToGraph, equals, getId, getInputs, getName, getParallelism, getStateType, hashCode, isMutable, rename, setMutable, setStateType, setTSetEnv, toString
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
build, getINode
generateID, getTBaseGraph
public BatchEnvironment getTSetEnv()
Buildable
getTSetEnv
in interface Buildable
getTSetEnv
in class BaseTSet<V>
public KeyedDirectTLink<K,V> keyedDirect()
BatchTupleTSet
keyedDirect
in interface BatchTupleTSet<K,V>
keyedDirect
in interface TupleTSet<K,V>
public KeyedPipeTLink<K,V> keyedPipe()
BatchTupleTSet
keyedPipe
in interface BatchTupleTSet<K,V>
public KeyedReduceTLink<K,V> keyedReduce(ReduceFunc<V> reduceFn)
BatchTupleTSet
BatchTupleTSet
skeyedReduce
in interface BatchTupleTSet<K,V>
reduceFn
- the reduce functionpublic KeyedPartitionTLink<K,V> keyedPartition(PartitionFunc<K> partitionFn)
BatchTupleTSet
PartitionFunc
based on keyskeyedPartition
in interface BatchTupleTSet<K,V>
keyedPartition
in interface TupleTSet<K,V>
partitionFn
- partition functionpublic KeyedGatherTLink<K,V> keyedGather()
BatchTupleTSet
BatchTupleTSet
skeyedGather
in interface BatchTupleTSet<K,V>
public KeyedGatherTLink<K,V> keyedGather(PartitionFunc<K> partitionFn)
BatchTupleTSet
BatchTupleTSet
skeyedGather
in interface BatchTupleTSet<K,V>
partitionFn
- partition function to partition data based on keypublic KeyedGatherTLink<K,V> keyedGather(PartitionFunc<K> partitionFn, java.util.Comparator<K> comparator)
BatchTupleTSet
BatchTupleTSet
skeyedGather
in interface BatchTupleTSet<K,V>
partitionFn
- partition function to partition data based on keycomparator
- custom key comparatorpublic KeyedGatherUngroupedTLink<K,V> keyedGatherUngrouped()
BatchTupleTSet
BatchTupleTSet
s without grouping data by keyskeyedGatherUngrouped
in interface BatchTupleTSet<K,V>
public KeyedGatherUngroupedTLink<K,V> keyedGatherUngrouped(PartitionFunc<K> partitionFn)
BatchTupleTSet
BatchTupleTSet
s without grouping data by keyskeyedGatherUngrouped
in interface BatchTupleTSet<K,V>
partitionFn
- partition function to partition data based on keypublic KeyedGatherUngroupedTLink<K,V> keyedGatherUngrouped(PartitionFunc<K> partitionFn, java.util.Comparator<K> comparator)
BatchTupleTSet
BatchTupleTSet
s without grouping data by keyskeyedGatherUngrouped
in interface BatchTupleTSet<K,V>
partitionFn
- partition function to partition data based on keycomparator
- custom key comparatorpublic <VR> JoinTLink<K,V,VR> join(BatchTupleTSet<K,VR> rightTSet, CommunicationContext.JoinType type, java.util.Comparator<K> keyComparator, TaskPartitioner<K> partitioner)
BatchTupleTSet
BatchTupleTSet
. Note that this TSet will be considered the left
TSetjoin
in interface BatchTupleTSet<K,V>
VR
- value type of the right tsetrightTSet
- right tsettype
- CommunicationContext.JoinType
keyComparator
- key comparatorpartitioner
- partitioner for keyspublic <VR> JoinTLink<K,V,VR> join(BatchTupleTSet<K,VR> rightTSet, CommunicationContext.JoinType type, java.util.Comparator<K> keyComparator)
BatchTupleTSet
BatchTupleTSet
. Note that this TSet will be considered the left
TSetjoin
in interface BatchTupleTSet<K,V>
VR
- value type of the right tsetrightTSet
- right tsettype
- CommunicationContext.JoinType
keyComparator
- key comparatorpublic BatchTupleTSetImpl<K,V> setName(java.lang.String n)
BatchTupleTSet
public KeyedCachedTSet<K,V> cache()
StoringData
DataPartition
and exposes the data as another TSet.cache
in interface StoringData<Tuple<K,V>>
public KeyedCachedTSet<K,V> lazyCache()
StoringData
TSet
is evaluated explicitly.lazyCache
in interface StoringData<Tuple<K,V>>
public KeyedPersistedTSet<K,V> persist()
StoringData
DataPartition
. This method would also expose the
checkpointing ability to TSet
s.persist
in interface StoringData<Tuple<K,V>>
public KeyedPersistedTSet<K,V> lazyPersist()
StoringData
lazyPersist
in interface StoringData<Tuple<K,V>>
public BatchTupleTSetImpl<K,V> addInput(java.lang.String key, StorableTBase<?> input)
BatchTupleTSet
BatchTupleTSet
saddInput
in interface AcceptingData<Tuple<K,V>>
addInput
in interface BatchTupleTSet<K,V>
key
- identifier for the inputinput
- input tsetpublic BatchTupleTSetImpl<K,V> withSchema(TupleSchema schema)
TupleTSet
TupleTSet
output. This will be used in the packers for efficient
SER-DE operations in the following TLink
swithSchema
in interface TupleTSet<K,V>
schema
- data type as a MessageType
TupleTSet
public TupleSchema getOutputSchema()
KeyedSchema
.getOutputSchema
in class BaseTSetWithSchema<V>