Journal of Decision Making and Healthcare

Electronic ISSN: 3008-1572

DOI: 10.69829/jdmh

Scalable clustering of lines

Journal of Decision Making and Healthcare, Volume 1, Issue 1, June 2024, Pages: 45–56

AKSHAY SHARMA

Department of Mathematical Sciences, Indian Institute of Technology (BHU), Varanasi, Uttar Pradesh 221005, India

DEBDAS GHOSH

Department of Mathematical Sciences, Indian Institute of Technology (BHU), Varanasi, Uttar Pradesh 221005, India

WITOLD PEDRYCZ

Department of Electrical and Computer Engineering, University of Alberta, Alberta T6G 1H9, Canada


Abstract

We introduce a scalable algorithm for clustering lines and affine subspaces of fixed dimensions via data reduction using a coreset approximating scheme. A key characteristic of the proposed algorithm is exploiting a scatter-then-aggregate scheme for data partitioning, which allows to maintain a constant coreset approximation size. This key feature enables a novel algorithm which streams as well as distributes workload. We study how the proposed algorithm scales in a practical setting and compare it with other clustering methods. Results of experiments executed on the PARAM supercomputer confirm that the time complexity of algorithm is log-linear in problem size and decreases linearly as the number of processors increase. Source code for the algorithm, the experiments, and other potential applications is provided.


Cite this Article as

Akshay Sharma, Debdas Ghosh and Witold Pedrycz, Scalable clustering of lines, Journal of Decision Making and Healthcare, 1 (2024), 45–56