Grouping Regencies/Cities in Kalimantan Based on Education Indicators Using the K-Means Method with Principal Component Analysis Optimization
Abstract
Cluster analysis groups objects so that members of the same group are similar to one another. Of the many available methods, k-means is a widely used non-hierarchical one. Cluster analysis assumes that the research variables are not strongly correlated; when they are, Principal Component Analysis (PCA) can be applied first to replace them with uncorrelated components. This research aims to group the regencies/cities in Kalimantan based on 2022 education indicators using k-means with PCA optimization, and to determine the optimal number of clusters based on the smallest Davies-Bouldin Index (DBI) value. In the analysis, the 11 research variables were reduced to two principal components. The data transformed onto these two components were then used to group the regencies/cities with the k-means method. The optimal k-means grouping was found to be 5 clusters, with a DBI value of 0.835. Cluster 1 has 8 regencies/cities, cluster 2 has 16 regencies/cities, cluster 3 has 5 regencies/cities, cluster 4 has 21 regencies/cities, and cluster 5 has 5 regencies/cities.
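The workflow described in the abstract (standardize the variables, project them onto two principal components, run k-means for several values of k, and keep the k with the smallest DBI) can be sketched roughly as follows. This is a minimal illustration in Python with NumPy, not the authors' implementation: the function names `pca_scores`, `kmeans`, and `davies_bouldin`, the random placeholder data, the candidate range of k, and the random seeds are all assumptions made for the example.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Standardize each column, then project onto the leading principal components."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    corr = np.cov(Z, rowvar=False)            # correlation matrix of the original variables
    eigvals, eigvecs = np.linalg.eigh(corr)   # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    return Z @ eigvecs[:, order]

def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm with random initial centroids (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Keep the old centroid if a cluster happens to become empty.
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    # Final assignment so that labels always match the returned centroids.
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1), centroids

def davies_bouldin(X, labels):
    """Davies-Bouldin Index: mean over clusters of the worst (S_i + S_j) / M_ij ratio."""
    ids = np.unique(labels)
    cents = np.array([X[labels == j].mean(axis=0) for j in ids])
    scatter = np.array([np.linalg.norm(X[labels == j] - c, axis=1).mean()
                        for j, c in zip(ids, cents)])
    sep = np.linalg.norm(cents[:, None, :] - cents[None, :, :], axis=2)
    np.fill_diagonal(sep, np.inf)             # exclude i == j from the maximum
    ratios = (scatter[:, None] + scatter[None, :]) / sep
    return ratios.max(axis=1).mean()

# Placeholder data only: random numbers standing in for 11 education indicators.
rng = np.random.default_rng(42)
X = rng.normal(size=(55, 11))

scores = pca_scores(X, n_components=2)
dbi_by_k = {}
for k in range(2, 7):                         # assumed candidate range for k
    labels, _ = kmeans(scores, k)
    dbi_by_k[k] = davies_bouldin(scores, labels)

best_k = min(dbi_by_k, key=dbi_by_k.get)      # smallest DBI is taken as optimal
print(best_k, round(dbi_by_k[best_k], 3))
```

Because the input here is random, the selected k and its DBI carry no meaning; with the real indicator matrix in place of `X`, the same loop would yield the DBI comparison that the abstract uses to choose the number of clusters. Running k-means from several seeds per k would make that comparison less sensitive to the initial centroids.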