Abstract
Many practical applications pose the challenge of exploiting instance-level background knowledge (e.g., subsets of similar or dissimilar data points) in their input data to improve clustering results. In this work, we build on the widely adopted k-center clustering, model the instance-level background knowledge as must-link (ML) and cannot-link (CL) constraint sets, and formulate the constrained k-center problem. Given the long-standing challenge of developing efficient algorithms for constrained clustering problems, we first derive an efficient approximation algorithm for constrained k-center that achieves the best possible approximation ratio of 2 using a linear programming (LP)-rounding technique. Recognizing the limitations of LP-rounding algorithms, including high runtime complexity and difficulty of parallelization, we then develop a greedy algorithm that does not rely on the LP and can be efficiently parallelized. This algorithm achieves the same approximation ratio of 2 with lower runtime complexity. Lastly, we empirically evaluate our approximation algorithm against baselines on various real datasets, validating our theoretical findings and demonstrating significant advantages of our algorithm in terms of clustering cost, clustering quality, and runtime.
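For readers unfamiliar with the setting, the sketch below illustrates the two ingredients the abstract refers to: a greedy 2-approximation for k-center and ML/CL constraint sets. It is not the algorithm of this paper. It shows the classic farthest-first greedy for the unconstrained problem, plus a helper that reports which ML/CL pairs a given assignment violates. The function names (`farthest_first_centers`, `nearest_center_labels`, `violated_constraints`) and the toy data are assumptions made for illustration only.

```python
import math


def farthest_first_centers(points, k):
    """Classic farthest-first greedy for UNCONSTRAINED k-center.

    Illustrative only: this is not the constrained algorithm from the paper.
    Returns the indices of the chosen centers and the resulting radius
    (the largest distance from any point to its nearest center).
    """
    centers = [0]  # assumption: start from the first point (any point works)
    # dist[i] = distance from point i to its nearest chosen center so far
    dist = [math.dist(p, points[0]) for p in points]
    while len(centers) < k:
        # Pick the point that is currently farthest from all chosen centers.
        nxt = max(range(len(points)), key=dist.__getitem__)
        centers.append(nxt)
        dist = [min(d, math.dist(p, points[nxt])) for d, p in zip(dist, points)]
    return centers, max(dist)


def nearest_center_labels(points, centers):
    """Label each point with the index of its nearest center."""
    return [min(centers, key=lambda c: math.dist(p, points[c])) for p in points]


def violated_constraints(labels, must_link, cannot_link):
    """Return the ML pairs that were split and the CL pairs that were merged."""
    bad_ml = [(a, b) for a, b in must_link if labels[a] != labels[b]]
    bad_cl = [(a, b) for a, b in cannot_link if labels[a] == labels[b]]
    return bad_ml, bad_cl


# Hypothetical toy data: two well-separated groups on a line.
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
centers, radius = farthest_first_centers(points, k=2)
labels = nearest_center_labels(points, centers)
bad_ml, bad_cl = violated_constraints(labels, must_link=[(1, 2)], cannot_link=[(0, 3)])
```

In this toy run the unconstrained assignment splits the must-link pair `(1, 2)`, which is the kind of violation a constrained k-center algorithm has to avoid while keeping the radius small.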
| Original language | English |
|---|---|
| Pages (from-to) | 18844-18858 |
| Number of pages | 15 |
| Journal | IEEE Transactions on Neural Networks and Learning Systems |
| Volume | 36 |
| Issue number | 10 |
| Publication status | Published - 2025 |
Bibliographical note
Publisher Copyright: © 2025 IEEE.
Keywords
- Approximation algorithm
- constrained clustering
- greedy algorithm
- k-center
- linear programming (LP)-rounding
Fingerprint
Dive into the research topics of 'Near-optimal algorithms for instance-level constrained k-center clustering'. Together they form a unique fingerprint.