%0 Conference Proceedings %T A New Approach to Determine the Optimal Number of Clusters Based on the Gap Statistic %+ Chonbuk National University %+ Howon University (HU) %A Yang, Jaekyung %A Lee, Jong-Yeong %A Choi, Myoungjin %A Joo, Yeongin %< peer-reviewed %( Lecture Notes in Computer Science %B 2nd International Conference on Machine Learning for Networking (MLN) %C Paris, France %Y Selma Boumerdassi %Y Éric Renault %Y Paul Mühlethaler %I Springer International Publishing %3 Machine Learning for Networking %V LNCS-12081 %P 227-239 %8 2019-12-03 %D 2019 %R 10.1007/978-3-030-45778-5_15 %K Clustering %K Number of clusters %K Data mining %Z Computer Science [cs] %Z Computer Science [cs]/Networking and Internet Architecture [cs.NI]Conference papers %X Data clustering is one of the most important unsupervised classification methods. It aims to organize objects into groups (or clusters) such that members of the same cluster are similar in some way and members of different clusters are distinct. Among general clustering methods, k-means is arguably the most popular. However, it still has some inherent weaknesses. One of the biggest challenges when using k-means is determining the optimal number of clusters, k. Although many approaches have been suggested in the literature, this is still considered an unsolved problem. In this study, we propose a new technique to improve the gap statistic approach for selecting k. It has been tested on different datasets, on which it yields superior results compared to the original gap statistic. We expect our new method to also work well with other clustering algorithms that require the number k, because our approach, like the gap statistic, can work with any clustering method. 
%G English %Z TC 6 %2 https://inria.hal.science/hal-03266454/document %2 https://inria.hal.science/hal-03266454/file/487577_1_En_15_Chapter.pdf %L hal-03266454 %U https://inria.hal.science/hal-03266454 %~ IFIP-LNCS %~ IFIP %~ IFIP-TC %~ IFIP-TC6 %~ IFIP-LNCS-12081 %~ IFIP-MLN