public class DataLocalityStreamingTaskScheduler
The data locality streaming task scheduler generates the task schedule plan based on the distance
calculated between each worker node and the data nodes where the input data resides. Once the
allocation is done, it calculates the RAM, disk, and CPU values of each task instance and sizes
each container with the required RAM, disk, and CPU values.
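The container sizing step described above can be pictured with a minimal sketch. It assumes that a container's size is simply the sum of the RAM, disk, and CPU values of the task instances allocated to it; the scheduler itself may apply padding or defaults that are not shown here. The names ContainerSizingSketch, Resource, and containerSize are hypothetical and are not part of the scheduler's API.

```java
import java.util.List;

final class ContainerSizingSketch {

  /** Hypothetical resource triple; not the scheduler's actual type. */
  static final class Resource {
    final double ramMb;
    final double diskGb;
    final double cpu;

    Resource(double ramMb, double diskGb, double cpu) {
      this.ramMb = ramMb;
      this.diskGb = diskGb;
      this.cpu = cpu;
    }
  }

  /**
   * Assumption: the container must hold every task instance allocated to it,
   * so its size is the component-wise sum of the instance requirements.
   */
  static Resource containerSize(List<Resource> instances) {
    double ram = 0;
    double disk = 0;
    double cpu = 0;
    for (Resource r : instances) {
      ram += r.ramMb;
      disk += r.diskGb;
      cpu += r.cpu;
    }
    return new Resource(ram, disk, cpu);
  }
}
```

Under this assumption, two instances requiring (512 MB, 1 GB, 1 CPU) and (256 MB, 2 GB, 0.5 CPU) would need a container of at least 768 MB RAM, 3 GB disk, and 1.5 CPU.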
This is the base method of the data-locality-aware task scheduler for scheduling streaming
task instances. It retrieves the task vertex set of the task graph and sends the set to the
data-locality-aware scheduling algorithm, which schedules the streaming task instances on the
workers that are closer to the data nodes.
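The idea of placing task instances on the workers closest to the data can be sketched as follows. This is an illustration only, not the scheduler's actual algorithm: it assumes that a single distance value per worker has already been calculated, that all tasks read the same input data, and that each worker accepts a fixed number of task instances. The names LocalityAssignmentSketch, assign, and maxTasksPerWorker are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class LocalityAssignmentSketch {

  /**
   * Fills workers in order of increasing distance to the data nodes.
   *
   * @param tasks             task instance names taken from the task vertex set
   * @param distanceByWorker  precomputed distance from each worker id to the data nodes
   * @param maxTasksPerWorker assumed fixed capacity of every worker
   * @return worker id mapped to the task instances placed on it, nearest worker first
   */
  static Map<Integer, List<String>> assign(
      List<String> tasks, Map<Integer, Double> distanceByWorker, int maxTasksPerWorker) {
    // Sort the workers so that the one closest to the data comes first.
    List<Map.Entry<Integer, Double>> workers = new ArrayList<>(distanceByWorker.entrySet());
    workers.sort(Map.Entry.comparingByValue());

    Map<Integer, List<String>> plan = new LinkedHashMap<>();
    int next = 0;
    for (Map.Entry<Integer, Double> worker : workers) {
      List<String> assigned = new ArrayList<>();
      while (next < tasks.size() && assigned.size() < maxTasksPerWorker) {
        assigned.add(tasks.get(next++));
      }
      if (!assigned.isEmpty()) {
        plan.put(worker.getKey(), assigned);
      }
    }
    if (next < tasks.size()) {
      throw new IllegalStateException("Not enough worker capacity for all task instances");
    }
    return plan;
  }
}
```

With this greedy fill, the farthest workers receive task instances only after the nearer ones are full, which matches the stated goal of keeping streaming task instances close to the data nodes.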