public class KeyedPersistedTSet<K,V> extends KeyedStoredTSet<K,V>
BaseTSet.StateType
storedSource
Constructor and Description |
---|
KeyedPersistedTSet(BatchEnvironment tSetEnv, SinkFunc<java.util.Iterator<Tuple<K,V>>> storingSinkFn, int parallelism, KeyedSchema inputSchema) |
Modifier and Type | Method and Description |
---|---|
KeyedPersistedTSet<K,V> | addInput(java.lang.String key, StorableTBase<?> input): Adds inputs to BatchTupleTSets. |
KeyedCachedTSet<K,V> | cache(): Runs this TSet and caches the data to an in-memory DataPartition and exposes the data as another TSet. |
KeyedCachedTSet<K,V> | lazyCache(): Performs caching lazily. |
KeyedPersistedTSet<K,V> | lazyPersist(): Performs persisting lazily. |
KeyedPersistedTSet<K,V> | persist(): Similar to cache, but the data is stored in a disk-based DataPartition. |
KeyedPersistedTSet<K,V> | setName(java.lang.String n): Name of the tset. |
KeyedPersistedTSet<K,V> | withSchema(TupleSchema schema): Sets the data type of the TupleTSet output. |
getData, getINode, getInputSchema, getStoredSourceTSet, join, join, keyedDirect, keyedGather, keyedGather, keyedGather, keyedGatherUngrouped, keyedGatherUngrouped, keyedGatherUngrouped, keyedPartition, keyedPipe, keyedReduce
getOutputSchema, getTSetEnv
setOutputSchema
addChildToGraph, addChildToGraph, equals, getId, getInputs, getName, getParallelism, getStateType, hashCode, isMutable, rename, setMutable, setStateType, setTSetEnv, toString
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
getDataObject
build
generateID, getTBaseGraph
public KeyedPersistedTSet(BatchEnvironment tSetEnv, SinkFunc<java.util.Iterator<Tuple<K,V>>> storingSinkFn, int parallelism, KeyedSchema inputSchema)
public KeyedPersistedTSet<K,V> setName(java.lang.String n)
Description copied from interface: BatchTupleTSet
Name of the tset.
public KeyedPersistedTSet<K,V> addInput(java.lang.String key, StorableTBase<?> input)
Description copied from interface: BatchTupleTSet
Adds inputs to BatchTupleTSets.
Specified by: addInput in interface AcceptingData<Tuple<K,V>>
Specified by: addInput in interface BatchTupleTSet<K,V>
Overrides: addInput in class BatchTupleTSetImpl<K,V>
Parameters:
key - identifier for the input
input - input tset
public KeyedPersistedTSet<K,V> withSchema(TupleSchema schema)
Description copied from interface: TupleTSet
Sets the data type of the TupleTSet output. This will be used in the packers for efficient SER-DE operations in the following TLinks.
Specified by: withSchema in interface TupleTSet<K,V>
Overrides: withSchema in class BatchTupleTSetImpl<K,V>
Parameters:
schema - data type as a MessageType
Returns: this TupleTSet
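As a rough illustration of why a declared schema can make SER-DE cheaper, the sketch below packs (int key, int value) tuples once with a fixed layout and once with a type tag per field. It is a toy model and not Twister2's packer or wire format: the class `SchemaPackingSketch`, its methods, and the one-byte tag are all assumptions made for this example.

```java
import java.nio.ByteBuffer;

// Illustration only: hypothetical code, not Twister2's packer or wire format.
class SchemaPackingSketch {

  // With a declared (int key, int value) schema every tuple is exactly
  // 8 bytes, so no per-field type information has to be written.
  static byte[] packWithSchema(int[][] tuples) {
    ByteBuffer buf = ByteBuffer.allocate(tuples.length * 2 * Integer.BYTES);
    for (int[] t : tuples) {
      buf.putInt(t[0]).putInt(t[1]);
    }
    return buf.array();
  }

  // Reads the fixed layout back; the schema tells us how to slice the bytes.
  static int[][] unpackWithSchema(byte[] bytes) {
    ByteBuffer buf = ByteBuffer.wrap(bytes);
    int[][] out = new int[bytes.length / (2 * Integer.BYTES)][2];
    for (int[] t : out) {
      t[0] = buf.getInt();
      t[1] = buf.getInt();
    }
    return out;
  }

  // Without a schema a generic encoder must describe each field; a one-byte
  // type tag per field stands in for that overhead here (an assumption).
  static byte[] packWithoutSchema(int[][] tuples) {
    ByteBuffer buf = ByteBuffer.allocate(tuples.length * 2 * (1 + Integer.BYTES));
    for (int[] t : tuples) {
      buf.put((byte) 'I').putInt(t[0]).put((byte) 'I').putInt(t[1]);
    }
    return buf.array();
  }

  public static void main(String[] args) {
    int[][] tuples = {{1, 10}, {2, 20}, {3, 30}};
    System.out.println("with schema:    " + packWithSchema(tuples).length + " bytes");
    System.out.println("without schema: " + packWithoutSchema(tuples).length + " bytes");
  }
}
```

For the three tuples in `main`, the fixed layout takes 24 bytes and the tagged layout 30. The difference in a real system depends on the actual packers and data types.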
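public KeyedCachedTSet<K,V> cache()
Description copied from interface: StoringData
Runs this TSet and caches the data to an in-memory DataPartition and exposes the data as another TSet.
Specified by: cache in interface StoringData<Tuple<K,V>>
Overrides: cache in class BatchTupleTSetImpl<K,V>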
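public KeyedCachedTSet<K,V> lazyCache()
Description copied from interface: StoringData
Performs caching lazily, i.e. the data is cached only when the TSet is evaluated explicitly.
Specified by: lazyCache in interface StoringData<Tuple<K,V>>
Overrides: lazyCache in class BatchTupleTSetImpl<K,V>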
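The difference between cache() and lazyCache() is when the upstream computation runs. The following is a minimal sketch of that distinction using plain Java. `CachingSketch`, `LazyCached`, and `evaluate()` are hypothetical names invented for this example and are not Twister2 types or methods.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Toy model only: these classes are hypothetical and are NOT Twister2 types.
class CachingSketch {
  // Counts how many times the upstream computation actually ran.
  static int runs = 0;

  // Stand-in for an upstream keyed computation producing key=value entries.
  static List<String> compute() {
    runs++;
    List<String> out = new ArrayList<>();
    out.add("a=1");
    out.add("b=2");
    return out;
  }

  // Eager variant: the computation runs as part of the call, like cache().
  static List<String> cache(Supplier<List<String>> upstream) {
    return upstream.get();
  }

  // Lazy variant: nothing runs until evaluate() is called, like lazyCache().
  static final class LazyCached {
    private final Supplier<List<String>> upstream;
    private List<String> data; // stays null until the first evaluate()

    LazyCached(Supplier<List<String>> upstream) {
      this.upstream = upstream;
    }

    // Runs the upstream computation once and reuses the result afterwards.
    List<String> evaluate() {
      if (data == null) {
        data = upstream.get();
      }
      return data;
    }
  }

  static LazyCached lazyCache(Supplier<List<String>> upstream) {
    return new LazyCached(upstream);
  }

  public static void main(String[] args) {
    cache(CachingSketch::compute);
    System.out.println("runs after cache(): " + runs);
    LazyCached lazy = lazyCache(CachingSketch::compute);
    System.out.println("runs after lazyCache(): " + runs);
    lazy.evaluate();
    lazy.evaluate();
    System.out.println("runs after evaluate(): " + runs);
  }
}
```

In this model the run counter reaches 1 after the eager call, stays at 1 after the lazy call, and reaches 2 only once `evaluate()` is invoked. How Twister2 triggers the explicit evaluation is not covered by this sketch.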
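public KeyedPersistedTSet<K,V> persist()
Description copied from interface: StoringData
Similar to cache, but the data is stored in a disk-based DataPartition. This method also exposes the checkpointing ability to TSets.
Specified by: persist in interface StoringData<Tuple<K,V>>
Overrides: persist in class BatchTupleTSetImpl<K,V>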
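To make the "disk-based" part concrete, here is a minimal sketch that writes keyed tuples to a temporary file and reads them back. It assumes String keys without tab characters and Integer values. `PersistSketch`, `persist`, and `load` are hypothetical helpers for this example, and the tab-separated file format is not how Twister2 stores a DataPartition.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: hypothetical helpers, NOT Twister2's storage format.
class PersistSketch {

  // Writes each tuple as a "key<TAB>value" line and returns the file path.
  // Assumes keys contain no tab characters.
  static Path persist(Map<String, Integer> tuples) {
    try {
      Path file = Files.createTempFile("keyed-persist-sketch", ".tsv");
      List<String> lines = new ArrayList<>();
      for (Map.Entry<String, Integer> e : tuples.entrySet()) {
        lines.add(e.getKey() + "\t" + e.getValue());
      }
      Files.write(file, lines, StandardCharsets.UTF_8);
      return file;
    } catch (IOException ex) {
      throw new UncheckedIOException(ex);
    }
  }

  // Reads the tuples back from disk in the order they were written.
  static Map<String, Integer> load(Path file) {
    try {
      Map<String, Integer> out = new LinkedHashMap<>();
      for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
        int tab = line.indexOf('\t');
        out.put(line.substring(0, tab), Integer.parseInt(line.substring(tab + 1)));
      }
      return out;
    } catch (IOException ex) {
      throw new UncheckedIOException(ex);
    }
  }

  public static void main(String[] args) {
    Map<String, Integer> tuples = new LinkedHashMap<>();
    tuples.put("a", 1);
    tuples.put("b", 2);
    Path file = persist(tuples);
    System.out.println("stored at " + file);
    System.out.println("reloaded  " + load(file));
    file.toFile().delete();
  }
}
```

Because the data lives in a file rather than in memory, it can be reloaded after the in-memory copy is gone, which is the property that checkpointing builds on.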
public KeyedPersistedTSet<K,V> lazyPersist()
Description copied from interface: StoringData
Performs persisting lazily.
Specified by: lazyPersist in interface StoringData<Tuple<K,V>>
Overrides: lazyPersist in class BatchTupleTSetImpl<K,V>