Computer Science & Electrical

An Enhanced K-means Firefly for Health Care Cluster Analysis of Philippines COVID-19 Datasets

Volume: 101, Issue: 1, May
Published Date: 01 June 2022
Publisher Name: IJRP
Pages: 494-501
DOI: 10.47119/IJRP1001011520223179

Authors

# Author Name
1 Jerico Contreras
2 Vince Andrei Isip
3 Raymund Dioses
4 Dan Michael Cortez

Abstract

The development of digital health technologies during the COVID-19 outbreak laid the foundation for various research initiatives, including cluster analysis to better present and interpret COVID-19 datasets. K-means is one of the most frequently used cluster analysis algorithms, and researchers have applied it in many settings. These applications include, but are not limited to, analyzing tourist attractions and restaurants, developing a machine learning model to track the virus's progression, and categorizing a country's cities or provinces based on their COVID-19 records. This paper extends the cluster analysis of prior research by enhancing K-means Firefly, a novel clustering algorithm that itself improves on conventional K-means. The proponents addressed the algorithm's limitations by improving how its data and clusters are initialized: Principal Component Analysis reduces the dimensionality of the data, the Calinski-Harabasz Index automatically determines the number of clusters, and K-means++ obtains the initial cluster centroids. The enhanced algorithm clustered cities in the Philippines based on their COVID-19 datasets, which contain 32 healthcare-related features. As a result, the algorithm can handle real-world datasets with many features by reducing their dimensionality, and it can automatically determine the optimal number of clusters and the initial locations of the centroids. On internal validity metrics, the enhanced algorithm also outperformed the previous implementation, with percentage differences of 90.16% for the Silhouette Coefficient, 72.99% for the Davies-Bouldin Index, and 68.98% for the Calinski-Harabasz Index. Because of its extended ability to handle datasets dynamically and produce substantial clusters, the proposed algorithm may be used in applications such as data dashboards and real-time tracking websites.
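Two of the initialization steps named in the abstract, K-means++ seeding and selection of the number of clusters by the Calinski-Harabasz Index, can be illustrated with a minimal standard-library sketch. This is not the authors' implementation: the PCA step and the Firefly optimization are omitted, plain Lloyd iterations stand in for the clustering step, and every function name and the synthetic data (`demo_points`) are hypothetical. The sketch also assumes the data contain at least as many distinct points as clusters.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def kmeanspp_init(points, k, rng):
    """K-means++ seeding: each new centroid is drawn with probability
    proportional to its squared distance from the nearest chosen one."""
    centroids = [list(rng.choice(points))]
    while len(centroids) < k:
        d2 = [min(sq_dist(p, c) for c in centroids) for p in points]
        centroids.append(list(rng.choices(points, weights=d2, k=1)[0]))
    return centroids


def kmeans(points, k, rng, iters=100):
    """Plain Lloyd iterations started from K-means++ seeds."""
    centroids = kmeanspp_init(points, k, rng)
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # an emptied cluster keeps its old centroid
                centroids[j] = mean(members)
    return labels, centroids


def calinski_harabasz(points, labels):
    """Ratio of between-cluster to within-cluster dispersion,
    each divided by its degrees of freedom (higher is better)."""
    n = len(points)
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    k = len(clusters)
    overall = mean(points)
    between = within = 0.0
    for members in clusters.values():
        c = mean(members)
        between += len(members) * sq_dist(c, overall)
        within += sum(sq_dist(p, c) for p in members)
    if k < 2 or within == 0.0:
        return 0.0  # index is undefined here; treat as worst score
    return (between / (k - 1)) / (within / (n - k))


def choose_k(points, k_range, seed=0):
    """Run K-means for each candidate k and keep the highest index."""
    rng = random.Random(seed)
    best = None
    for k in k_range:
        labels, centroids = kmeans(points, k, rng)
        score = calinski_harabasz(points, labels)
        if best is None or score > best[0]:
            best = (score, k, labels, centroids)
    return best


def demo_points(seed=42):
    """Hypothetical toy data: three well-separated 2-D blobs of 20 points."""
    rng = random.Random(seed)
    centers = [(0.0, 0.0), (10.0, 10.0), (20.0, 0.0)]
    return [[rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)]
            for cx, cy in centers for _ in range(20)]


points = demo_points()
score, k, labels, centroids = choose_k(points, range(2, 7))
print("selected k:", k)
```

For a dataset with 32 features, as in the paper, a dimensionality reduction step such as PCA would be applied to `points` before calling `choose_k`; the sketch skips it to stay within the standard library.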

Keywords

  • COVID-19
  • K-means Firefly
  • Cluster Analysis
  • Health care
  • Algorithm Enhancement
  • Philippines