public class DataLocalityBatchTaskScheduler
The data locality batch task scheduler generates the task schedule plan based on the distance
calculated between the worker nodes and the data nodes where the input data resides. Once the
allocation is done, it calculates the RAM, disk, and CPU values of each task instance and
sizes each container to provide the required RAM, disk, and CPU.
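
The container sizing step could be sketched roughly as follows. This is only an illustration: the class ContainerSizeSketch, the Resource type, and the rule that a container's size is the plain sum of its task instances' requirements are assumptions made for this example, not the scheduler's actual code.

```java
import java.util.List;

public class ContainerSizeSketch {

  // Hypothetical value type holding the resources one task instance needs.
  public static final class Resource {
    public final double ramMb;
    public final double diskGb;
    public final double cpu;

    public Resource(double ramMb, double diskGb, double cpu) {
      this.ramMb = ramMb;
      this.diskGb = diskGb;
      this.cpu = cpu;
    }
  }

  // Assumption: the container must hold the sum of what its task instances require.
  public static Resource containerSize(List<Resource> instances) {
    double ram = 0.0;
    double disk = 0.0;
    double cpu = 0.0;
    for (Resource r : instances) {
      ram += r.ramMb;
      disk += r.diskGb;
      cpu += r.cpu;
    }
    return new Resource(ram, disk, cpu);
  }
}
```

A real scheduler may also add padding or apply per-container limits on top of the summed values; the sketch leaves that out.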
This is the base method for data locality aware scheduling of batch task instances. It
retrieves the task vertex set of the task graph and sends the set to the data locality aware
scheduling algorithm, which allocates the batch task instances to the workers closest to the
data nodes.
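
The idea of placing each task on the worker closest to its input data can be illustrated with a minimal greedy sketch. Everything named here is hypothetical: the class LocalitySketch, the schedule method, the precomputed distance table, and the slotsPerWorker limit are assumptions for this example and do not reflect the scheduler's real signatures or algorithm.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class LocalitySketch {

  // Hypothetical helper: assigns each task to the nearest worker that still has a free slot.
  // taskToDataNode maps a task name to the data node holding its input.
  // distance.get(workerId).get(dataNode) is an assumed, precomputed distance.
  public static Map<Integer, List<String>> schedule(
      Map<String, String> taskToDataNode,
      Map<Integer, Map<String, Double>> distance,
      int slotsPerWorker) {

    Map<Integer, List<String>> plan = new LinkedHashMap<>();
    for (Integer worker : distance.keySet()) {
      plan.put(worker, new ArrayList<>());
    }

    for (Map.Entry<String, String> task : taskToDataNode.entrySet()) {
      Integer best = null;
      double bestDistance = Double.MAX_VALUE;
      for (Map.Entry<Integer, Map<String, Double>> worker : distance.entrySet()) {
        if (plan.get(worker.getKey()).size() >= slotsPerWorker) {
          continue; // this worker is already full
        }
        Double d = worker.getValue().get(task.getValue());
        if (d != null && d < bestDistance) {
          best = worker.getKey();
          bestDistance = d;
        }
      }
      if (best == null) {
        throw new IllegalStateException("No free worker for task " + task.getKey());
      }
      plan.get(best).add(task.getKey());
    }
    return plan;
  }
}
```

With two workers, two slots each, and tasks t0 and t1 reading from a data node near worker 0, both land on worker 0, while a task t2 whose data is nearer worker 1 lands on worker 1. The greedy choice is made task by task, so the result depends on iteration order; it is a sketch of the locality principle, not a globally optimal placement.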