
Chameleon: a Python workflow toolkit for feature selection

  • Western Sydney University
  • University of Technology Sydney
  • CSIRO, Australia

Research output: Chapter in Book / Conference Paper; Chapter; peer-review

2 Citations (Scopus)

Abstract

When considering classification problems on high-dimensional data sets, such as biological data sets, the need for effective methods of dimensionality reduction by feature selection becomes apparent. Feature selection has been shown to significantly decrease computational cost and to allow for classification models that are more easily interpretable. We present Chameleon, a Python-based toolkit that integrates all steps in a feature selection evaluation pipeline, from splitting data for cross-validation to visualising classification results using various metrics. We implemented six existing feature selection methods and six common classification methods in Chameleon, and evaluated the classification results using two different metrics. We also implemented an ensemble method that selects only the features common to the different methods evaluated. Experimental results on four different data sets suggest that the common features method achieves improved or similar classification performance compared to the individual feature selection algorithms, while using smaller and thus more computationally efficient subsets of features.
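
The common features idea described in the abstract can be illustrated with a minimal sketch. It assumes each feature selection method returns a list of selected feature names and that "common" means selected by every method; the function name `common_features`, the method labels, and the feature names are hypothetical and are not taken from Chameleon itself.

```python
def common_features(selections):
    """Return the features chosen by every selection method, sorted by name.

    `selections` maps a method label to the list of feature names it selected.
    This is an illustrative assumption about the data layout, not Chameleon's API.
    """
    if not selections:
        return []
    feature_sets = [set(features) for features in selections.values()]
    # Keep only the features that appear in every method's selection.
    return sorted(set.intersection(*feature_sets))


# Hypothetical selections from three feature selection methods on one training fold.
selections = {
    "method_a": ["gene_1", "gene_4", "gene_7", "gene_9"],
    "method_b": ["gene_4", "gene_7", "gene_8"],
    "method_c": ["gene_2", "gene_4", "gene_7"],
}

print(common_features(selections))  # prints ['gene_4', 'gene_7']
```

Because an intersection can never be larger than the smallest individual selection, the resulting subset is at most as large as any single method's output, which is consistent with the smaller feature subsets reported above.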

Original language: English
Title of host publication: Data Mining: 19th Australasian Conference on Data Mining, AusDM, Brisbane, QLD, Australia, December 14-15, 2021, Proceedings
Editors: Yue Xu, Rosalind Wang, Anton Lord, Yee Ling Boo, Richi Nayak, Yanchang Zhao, Graham Williams
Place of Publication: U.S.
Publisher: Springer
Pages: 121-135
Number of pages: 15
ISBN (Electronic): 9789811685316
ISBN (Print): 9789811685309
Publication status: Published - Dec 2021

Publication series

Name: Communications in Computer and Information Science
Volume: 1504
ISSN (Print): 1865-0929
ISSN (Electronic): 1865-0937

Keywords

  • Biological data
  • Classification
  • Feature selection
