Parallel K-Means for Big Data: On Enhancing Its Cluster Metrics and Patterns

Moertini, Veronica Sri; Venica, Liptia

dc.contributor.author	Moertini, Veronica Sri
dc.contributor.author	Venica, Liptia
dc.date.accessioned	2017-11-17T03:54:22Z
dc.date.available	2017-11-17T03:54:22Z
dc.date.issued	2017
dc.identifier.issn	1844-1856
dc.identifier.other	artsc228
dc.identifier.uri	http://hdl.handle.net/123456789/4044
dc.description	JOURNAL OF THEORETICAL AND APPLIED INFORMATION TECHNOLOGY; Vol.95 No.8, 30 April 2017
dc.description.abstract	The k-Means clustering algorithm has been enhanced using MapReduce so that it runs on a distributed Hadoop cluster for clustering big data. We found that the existing algorithms do not include techniques for computing the cluster metrics needed to evaluate the quality of clusters and to find interesting patterns. This research adds this capability. A few metrics are computed in every iteration of k-Means in Hadoop’s Reduce function, so that when the algorithm converges, the metrics are ready to be evaluated. We have implemented the proposed parallel k-Means, and the experimental results show that the proposed metrics are useful for selecting clusters and finding interesting patterns.	en_US
dc.description.uri	http://www.jatit.org/volumes/n
dc.language.iso	en	en_US
dc.publisher	Little Lion Scientific
dc.relation.ispartofseries	JOURNAL OF THEORETICAL AND APPLIED INFORMATION TECHNOLOGY; Vol.95 No.8, 30 April 2017
dc.subject	CLUSTERING BIG DATA	en_US
dc.subject	PARALLEL K-MEANS	en_US
dc.subject	HADOOP MAPREDUCE	en_US
dc.title	Parallel K-Means for Big Data: On Enhancing Its Cluster Metrics and Patterns	en_US
dc.type	Journal Articles	en_US