Microarray data mining : selecting trustworthy genes with gene feature ranking

Franco Ubaudi, Paul J. Kennedy, Daniel R. Catchpoole, Dachuan Guo, Simeon J. Simoff, Longbing Cao, Philip S. Yu, Chengqi Zhang, Huaifeng Zhang

    Research output: Chapter in Book / Conference Paper (Chapter)

    2 Citations (Scopus)

    Abstract

    Gene expression datasets used in biomedical data mining frequently have two characteristics: they have many thousands of attributes but relatively few sample points, and the measurements are noisy. In other words, individual expression measurements may be untrustworthy. Gene Feature Ranking (GFR) is a feature selection methodology that addresses these domain-specific characteristics by selecting features (i.e., genes) based on two criteria: (i) how well the gene can discriminate between classes of patient and (ii) the trustworthiness of the microarray data associated with the gene. An example from the pediatric cancer domain demonstrates the use of GFR and compares its performance with a feature selection method that does not explicitly address the trustworthiness of the underlying data.
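
The abstract does not say how GFR measures or combines its two criteria, so the following is only a minimal sketch of the general idea and not the authors' algorithm. It assumes a per-measurement trust flag, a simple signal-to-noise discrimination score, and a multiplicative combination; the names `discrimination`, `rank_genes`, and `min_trust` are hypothetical.

```python
from statistics import mean, pstdev


def discrimination(values_a, values_b):
    """Simple signal-to-noise score: gap between class means over summed spread.

    Assumption: the abstract does not name the discrimination measure.
    """
    spread = pstdev(values_a) + pstdev(values_b)
    return abs(mean(values_a) - mean(values_b)) / (spread + 1e-9)


def rank_genes(expr, trusted, labels, min_trust=0.5):
    """Rank genes by discrimination weighted by data trustworthiness.

    expr:    gene name -> list of expression values, one per sample
    trusted: gene name -> list of bools, True where the measurement is trusted
    labels:  list of class labels, one per sample (exactly two classes)
    min_trust: hypothetical cutoff on the fraction of trusted measurements
    """
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("this sketch handles exactly two classes")

    scored = []
    for gene, values in expr.items():
        flags = trusted[gene]
        trust = sum(flags) / len(flags)  # fraction of trusted measurements
        if trust < min_trust:
            continue  # too little trustworthy data for this gene

        # Score the gene on its trusted measurements only.
        a = [v for v, f, y in zip(values, flags, labels) if f and y == classes[0]]
        b = [v for v, f, y in zip(values, flags, labels) if f and y == classes[1]]
        if len(a) < 2 or len(b) < 2:
            continue  # not enough trusted samples in one of the classes

        scored.append((gene, discrimination(a, b) * trust))

    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [gene for gene, _ in scored]
```

In this sketch a gene that separates the classes well but is backed mostly by untrusted measurements is either dropped or ranked lower, which is the behaviour the two criteria are meant to capture. A method that ignores trustworthiness would correspond to treating every flag as `True`.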
    Original language: English
    Title of host publication: Data mining for business applications
    Place of Publication: U.S.
    Publisher: Springer
    Pages: 159-168
    Number of pages: 10
    ISBN (Print): 9780387794204
    Publication status: Published - 2009

    Keywords

    • data mining
    • gene expression
    • medical informatics
    • microarray analysis
