public class KMeansComputeJob
This is the main class for K-Means clustering. It consists of four main tasks:
generating the data points and centroids, partitioning and reading the partitioned data points,
reading the centroids, and finally performing the distance calculation.
First, the execute method invokes the generateDataPoints method to generate the data points file
and the centroid file on the filesystem specified by the user. Next, it invokes
the DataObjectSource and DataObjectSink to partition the data points and read the partitioned
data points, respectively, through the data points task graph. Then, it calls the DataFileReader
to read the centroid values from the filesystem through the centroid task graph. Next, the data
points are stored in the DataSet (0th object) and the centroids are stored in the DataSet
(1st object). Finally, it constructs the kmeans task graph to perform the clustering process,
which computes the distance between the centroids and the data points.
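
The distance calculation in the final step can be illustrated with a minimal sketch in plain Java.
This is not the KMeansComputeJob implementation and it does not use the task graph API. The class
name KMeansDistanceSketch and its methods are hypothetical, and the sketch assumes squared
Euclidean distance with all points held in memory as double arrays.

```java
import java.util.Arrays;

// Hypothetical illustration only; not part of KMeansComputeJob.
public class KMeansDistanceSketch {

  // Squared Euclidean distance between two points of equal dimension.
  public static double squaredDistance(double[] a, double[] b) {
    double sum = 0.0;
    for (int i = 0; i < a.length; i++) {
      double diff = a[i] - b[i];
      sum += diff * diff;
    }
    return sum;
  }

  // Index of the centroid closest to the given point.
  public static int nearestCentroid(double[] point, double[][] centroids) {
    int best = 0;
    double bestDistance = Double.MAX_VALUE;
    for (int c = 0; c < centroids.length; c++) {
      double distance = squaredDistance(point, centroids[c]);
      if (distance < bestDistance) {
        bestDistance = distance;
        best = c;
      }
    }
    return best;
  }

  // One iteration: assign every point to its nearest centroid, then
  // recompute each centroid as the mean of its assigned points.
  // A centroid with no assigned points keeps its previous value.
  public static double[][] updateCentroids(double[][] points, double[][] centroids) {
    int k = centroids.length;
    int dim = centroids[0].length;
    double[][] sums = new double[k][dim];
    int[] counts = new int[k];
    for (double[] point : points) {
      int c = nearestCentroid(point, centroids);
      counts[c]++;
      for (int i = 0; i < dim; i++) {
        sums[c][i] += point[i];
      }
    }
    double[][] updated = new double[k][];
    for (int c = 0; c < k; c++) {
      if (counts[c] == 0) {
        updated[c] = centroids[c].clone();
        continue;
      }
      updated[c] = new double[dim];
      for (int i = 0; i < dim; i++) {
        updated[c][i] = sums[c][i] / counts[c];
      }
    }
    return updated;
  }

  public static void main(String[] args) {
    double[][] points = {{1.0, 1.0}, {1.0, 3.0}, {9.0, 9.0}, {11.0, 9.0}};
    double[][] centroids = {{0.0, 0.0}, {10.0, 10.0}};
    // prints [[1.0, 2.0], [10.0, 9.0]]
    System.out.println(Arrays.deepToString(updateCentroids(points, centroids)));
  }
}
```

In the sketch, the points and centroids arrays play the roles of the two stored inputs (data
points and centroids). In a distributed run, each worker would typically apply the assignment step
to its own partition of the data points, and the per-partition sums and counts would then be
combined before the centroids are updated.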