public abstract class BatchTSetImpl<T> extends BaseTSetWithSchema<T> implements BatchTSet<T>

Nested classes/interfaces inherited from class BaseTSet: BaseTSet.StateType

| Constructor and Description |
|---|
| BatchTSetImpl() |


| Modifier and Type | Method and Description |
|---|---|
| BatchTSetImpl<T> | addInput(java.lang.String key, StorableTBase<?> input): Adds data to this TSet. |
| AllGatherTLink<T> | allGather(): Same as gather, but all the target TSet instances receive the gathered result in the runtime. |
| AllReduceTLink<T> | allReduce(ReduceFunc<T> reduceFn): Similar to reduce, but all instances of the target TSet receive the reduced result. |
| CachedTSet<T> | cache(): Runs this TSet, caches the data in an in-memory DataPartition, and exposes the data as another TSet. |
| DirectTLink<T> | direct(): Returns a Direct TLink that corresponds to the communication operation where the data is transferred to another TSet directly. |
| GatherTLink<T> | gather(): Returns a Gather TLink that gathers data to the target TSet instance with index 0 (in the runtime). |
| BatchEnvironment | getTSetEnv(): Returns the TSet environment. |
| CachedTSet<T> | lazyCache(): Performs caching lazily. |
| PersistedTSet<T> | lazyPersist(): Performs persisting lazily. |
| <K,V> KeyedTSet<K,V> | mapToTuple(MapFunc<Tuple<K,V>,T> mapToTupleFn) |
| PartitionTLink<T> | partition(PartitionFunc<T> partitionFn): Same as partition(PartitionFunc<T>, int), but the parallelism is preserved in the target TSet. |
| PartitionTLink<T> | partition(PartitionFunc<T> partitionFn, int targetParallelism): Returns a Partition TLink that partitions data based on the function provided. |
| PersistedTSet<T> | persist(): Similar to cache, but the data is stored in a disk-based DataPartition. |
| PipeTLink<T> | pipe() |
| ReduceTLink<T> | reduce(ReduceFunc<T> reduceFn): Returns a Reduce TLink that reduces data onto the target TSet instance with index 0 (in the runtime). |
| ReplicateTLink<T> | replicate(int replications) |
| ComputeTSet<T,java.util.Iterator<T>> | union(java.util.Collection<TSet<T>> tSets): Same as union(TSet<T>), but accepts a Collection of TSets. |
| ComputeTSet<T,java.util.Iterator<T>> | union(TSet<T> other) |
| BatchTSetImpl<T> | withSchema(Schema schema): Sets the data type of the TSet output. |
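
The difference between reduce/gather and their "all" variants is only in which target instances receive the result. The following is a minimal plain-Java sketch of that delivery rule, not Twister2 code: the class `TLinkSemanticsSketch`, its method signatures, and the use of `null` or an empty list for "received nothing" are assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.BinaryOperator;

/** Toy model of which target instances receive a result. Hypothetical, not the Twister2 API. */
class TLinkSemanticsSketch {

  /** reduce: only target index 0 receives the reduced value; other targets get null. */
  static <T> List<T> reduce(List<T> sourceValues, BinaryOperator<T> reduceFn, int targets) {
    T reduced = sourceValues.stream().reduce(reduceFn).orElse(null);
    List<T> out = new ArrayList<>();
    for (int i = 0; i < targets; i++) {
      out.add(i == 0 ? reduced : null);
    }
    return out;
  }

  /** allReduce: every target index receives the same reduced value. */
  static <T> List<T> allReduce(List<T> sourceValues, BinaryOperator<T> reduceFn, int targets) {
    T reduced = sourceValues.stream().reduce(reduceFn).orElse(null);
    return new ArrayList<>(Collections.nCopies(targets, reduced));
  }

  /** gather: only target index 0 receives all source values; other targets get an empty list. */
  static <T> List<List<T>> gather(List<T> sourceValues, int targets) {
    List<List<T>> out = new ArrayList<>();
    for (int i = 0; i < targets; i++) {
      out.add(i == 0 ? new ArrayList<>(sourceValues) : new ArrayList<>());
    }
    return out;
  }

  /** allGather: every target index receives its own copy of all source values. */
  static <T> List<List<T>> allGather(List<T> sourceValues, int targets) {
    List<List<T>> out = new ArrayList<>();
    for (int i = 0; i < targets; i++) {
      out.add(new ArrayList<>(sourceValues));
    }
    return out;
  }

  public static void main(String[] args) {
    List<Integer> src = List.of(1, 2, 3, 4);
    System.out.println(reduce(src, Integer::sum, 3));
    System.out.println(allReduce(src, Integer::sum, 3));
    System.out.println(gather(src, 2));
    System.out.println(allGather(src, 2));
  }
}
```

In this model, `reduce` and `gather` leave every target other than index 0 without data, while `allReduce` and `allGather` hand the same result to each target.
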
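
Methods inherited from class BaseTSetWithSchema: getInputSchema, getOutputSchema, setOutputSchema

Methods inherited from class BaseTSet: addChildToGraph, addChildToGraph, equals, getId, getInputs, getName, getParallelism, getStateType, hashCode, isMutable, rename, setMutable, setStateType, setTSetEnv, toString

Methods inherited from class java.lang.Object: clone, finalize, getClass, notify, notifyAll, wait, wait, wait

Methods inherited from implemented interfaces: build, getINode; generateID, getTBaseGraph

public BatchEnvironment getTSetEnv()
Description copied from interface: Buildable
Returns the TSet environment.
Specified by: getTSetEnv in interface Buildable
Overrides: getTSetEnv in class BaseTSet<T>

public DirectTLink<T> direct()
Description copied from interface: TSet
Returns a Direct TLink that corresponds to the communication operation where the data is transferred to another TSet directly.

public ReduceTLink<T> reduce(ReduceFunc<T> reduceFn)
Description copied from interface: TSet
Returns a Reduce TLink that reduces data onto the target TSet instance (in the runtime) with index 0.

public PartitionTLink<T> partition(PartitionFunc<T> partitionFn, int targetParallelism)
Description copied from interface: TSet
Returns a Partition TLink that partitions data based on the function provided.

public PartitionTLink<T> partition(PartitionFunc<T> partitionFn)
Description copied from interface: TSet
Same as above, but the parallelism is preserved in the target TSet.

public GatherTLink<T> gather()
Description copied from interface: TSet
Returns a Gather TLink that gathers data to the target TSet instance with index 0 (in the runtime).

public AllReduceTLink<T> allReduce(ReduceFunc<T> reduceFn)
Description copied from interface: TSet
Similar to reduce, but all instances of the target TSet receive the reduced result.

public AllGatherTLink<T> allGather()
Description copied from interface: TSet
Same as gather, but all the target TSet instances receive the gathered result in the runtime.

public <K,V> KeyedTSet<K,V> mapToTuple(MapFunc<Tuple<K,V>,T> mapToTupleFn)
Description copied from interface: TSet
Maps the data to a TupleTSet based on the MapFunc provided. This is the entry point to keyed communication operations from a non-keyed TSet.
Specified by: mapToTuple in interface BatchTSet<T>
Specified by: mapToTuple in interface TSet<T>
Type Parameters: K - type of key; V - type of value
Parameters: mapToTupleFn - Map function

public ComputeTSet<T,java.util.Iterator<T>> union(TSet<T> other)
Description copied from interface: TSet

public ComputeTSet<T,java.util.Iterator<T>> union(java.util.Collection<TSet<T>> tSets)
Description copied from interface: TSet
Same as above, but accepts a Collection of TSets.

public ReplicateTLink<T> replicate(int replications)
Description copied from interface: TSet

public CachedTSet<T> cache()
Description copied from interface: StoringData
Runs this TSet, caches the data in an in-memory DataPartition, and exposes the data as another TSet.
Specified by: cache in interface StoringData<T>

public CachedTSet<T> lazyCache()
Description copied from interface: StoringData
Performs caching lazily: the data is cached only when the TSet is evaluated explicitly.
Specified by: lazyCache in interface StoringData<T>

public PersistedTSet<T> persist()
Description copied from interface: StoringData
Similar to cache, but the data is stored in a disk-based DataPartition. This method also exposes the checkpointing ability to TSets.
Specified by: persist in interface StoringData<T>

public PersistedTSet<T> lazyPersist()
Description copied from interface: StoringData
Performs persisting lazily.
Specified by: lazyPersist in interface StoringData<T>

public BatchTSetImpl<T> addInput(java.lang.String key, StorableTBase<?> input)
Description copied from interface: BatchTSet
Adds data to this TSet.
Specified by: addInput in interface AcceptingData<T>
Specified by: addInput in interface BatchTSet<T>
Parameters: key - the key used to store the given TSet; input - a StorableTBase TSet to be added as an input

public BatchTSetImpl<T> withSchema(Schema schema)
Description copied from interface: TSet
Sets the data type of the TSet output. This is used in the packers for efficient SER-DE operations in the following TLinks.
Specified by: withSchema in interface TSet<T>
Parameters: schema - data type as a MessageType
Returns: this TSet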
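
The two partition overloads described above differ only in how the target parallelism is chosen. The following plain-Java sketch illustrates that idea under stated assumptions: `ToyPartitionFunc`, `PartitionSketch`, and the modelling of each source instance as a list are hypothetical and do not reproduce the real PartitionFunc or PartitionTLink contracts.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of function-based partitioning. Hypothetical, not the Twister2 API. */
class PartitionSketch {

  /** Hypothetical stand-in for a partition function: picks a target index for a value. */
  interface ToyPartitionFunc<T> {
    int partition(T value, int targetParallelism);
  }

  /** Routes every value from every source instance to the target index chosen by fn. */
  static <T> List<List<T>> partition(List<List<T>> sources, ToyPartitionFunc<T> fn,
                                     int targetParallelism) {
    List<List<T>> targets = new ArrayList<>();
    for (int i = 0; i < targetParallelism; i++) {
      targets.add(new ArrayList<>());
    }
    for (List<T> source : sources) {
      for (T value : source) {
        targets.get(fn.partition(value, targetParallelism)).add(value);
      }
    }
    return targets;
  }

  /** Same as above, but the number of targets equals the number of sources. */
  static <T> List<List<T>> partition(List<List<T>> sources, ToyPartitionFunc<T> fn) {
    return partition(sources, fn, sources.size());
  }

  public static void main(String[] args) {
    // Two source instances, three values each.
    List<List<Integer>> sources = List.of(List.of(1, 2, 3), List.of(4, 5, 6));
    ToyPartitionFunc<Integer> byModulo = (v, n) -> Math.floorMod(v, n);

    System.out.println(partition(sources, byModulo, 3)); // [[3, 6], [1, 4], [2, 5]]
    System.out.println(partition(sources, byModulo));    // [[2, 4, 6], [1, 3, 5]]
  }
}
```

With an explicit target parallelism of 3 the six values are spread over three targets; the single-argument form keeps the source parallelism of 2.
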