
Enhancing understandability of Omics data with SHAP, embedding projections and interactive visualisations

Research output: Chapter in Book / Conference Paper › Chapter › peer-review

3 Citations (Scopus)

Abstract

Uniform Manifold Approximation and Projection (UMAP) is a new and effective non-linear dimensionality reduction (DR) method that has recently been applied in biomedical informatics analysis, but its data transformation process is complicated and lacks transparency. Principal component analysis (PCA) is a conventional and essential DR method for analysing single-cell datasets; its projection is linear and easy to interpret. UMAP is more scalable and accurate, but its complex algorithm makes it difficult to earn users' trust. Another challenge is that some single-cell data have too many dimensions, which makes computation inefficient and less accurate. This paper uses linked, interactive visualisations to help users understand UMAP results by comparing them with PCA results. An explainable machine learning method, SHapley Additive exPlanations (SHAP) run on a Random Forest (RF) model, is used to optimise the input single-cell data so that the UMAP and PCA processes become more efficient. We demonstrate that this approach can be applied to high-dimensional omics data exploration to visually validate informative molecule markers and cell populations identified from the UMAP-reduced dimensionality space.
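The workflow outlined in the abstract (rank features with an explainable model, keep the informative ones, then project) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes scikit-learn is available, substitutes permutation importance (listed among the keywords) for SHAP, uses a synthetic matrix in place of real single-cell data, and shows only the PCA projection. The function name `select_and_project` and the parameter `n_keep` are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split


def select_and_project(X, y, n_keep=10, random_state=0):
    """Rank features with a Random Forest, keep the top n_keep, project with PCA."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=random_state, stratify=y
    )
    rf = RandomForestClassifier(n_estimators=100, random_state=random_state)
    rf.fit(X_train, y_train)

    # Assumption: permutation importance on held-out data stands in for SHAP.
    result = permutation_importance(
        rf, X_test, y_test, n_repeats=5, random_state=random_state
    )
    # Indices of the n_keep features with the highest mean importance.
    top = np.argsort(result.importances_mean)[::-1][:n_keep]

    # Linear projection of the reduced feature set onto two components.
    pca = PCA(n_components=2, random_state=random_state)
    embedding = pca.fit_transform(X[:, top])
    return top, embedding, pca


# Synthetic stand-in for a cells x features matrix (hypothetical data).
X, y = make_classification(
    n_samples=300, n_features=60, n_informative=10, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)
top, embedding, pca = select_and_project(X, y, n_keep=10)
print(embedding.shape)        # two coordinates per sample
print(pca.components_.shape)  # one loading per retained feature, per component
```

Each row of `pca.components_` gives the weights of the retained features in a component, which is what makes the PCA projection easy to interpret. A UMAP embedding of the same reduced matrix could be obtained with `umap.UMAP(n_components=2).fit_transform(X[:, top])` from the umap-learn package, if installed, and compared against the PCA coordinates.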
Original language: English
Title of host publication: Data Mining: 20th Australasian Conference, AusDM 2022, Western Sydney, Australia, December 12-15, 2022, Proceedings
Editors: Laurence A. F. Park, Heitor Murilo Gomes, Maryam Doborjeh, Yee Ling Boo, Yun Sing Koh, Yanchang Zhao, Graham Williams, Simeon Simoff
Place of Publication: Singapore
Publisher: Springer
Pages: 58-72
Number of pages: 15
ISBN (Electronic): 9789811987465
ISBN (Print): 9789811987458
Publication status: Published - 2022

Publication series

Name: Communications in Computer and Information Science
Volume: 1741 CCIS
ISSN (Print): 1865-0929
ISSN (Electronic): 1865-0937

Keywords

  • Dimensionality reduction
  • UMAP
  • Permutation importance
  • SHAP
  • Explainable AI
  • Machine learning
  • Visualisation
  • Random forest
  • PCA
