Abstract
Dimensionality reduction is usually an essential step in data mining and classical machine learning on high-dimensional data. Uniform Manifold Approximation and Projection (UMAP) is a recently developed nonlinear dimensionality reduction method that is widely applied in biomedical informatics. However, the existing UMAP implementation is still not efficient enough to process the large omics datasets now produced in biomedicine. This paper proposes and implements a method that reduces UMAP runtime using GPU acceleration on the GPU-RAPIDS platform. Our experiments showed that the parallel UMAP implementation ran a hundred times faster than the original UMAP implementation on a cluster computer, while maintaining its effectiveness in identifying leukemic cells from clinical flow cytometry data.
Original language | English |
---|---|
Title of host publication | Data Mining: 19th Australasian Conference on Data Mining, AusDM, Brisbane, QLD, Australia, December 14-15, 2021, Proceedings |
Editors | Yue Xu, Rosalind Wang, Anton Lord, Yeeling Boo, Richi Nayak, Yanchang Zhao, Graham Williams |
Place of Publication | Singapore |
Publisher | Springer Nature |
Pages | 3-15 |
Number of pages | 13 |
ISBN (Print) | 9789811685309 |
Publication status | Published - 2021 |