TY - GEN
T1 - Critical review of data mining techniques for gene expression analysis
AU - Aouf, Mazin
AU - Liyanage, Liwan
AU - Hansen, Stephen
PY - 2008
Y1 - 2008
AB - Research on the classification of gene expression data has grown rapidly in recent years. Such classification can aid the development of efficient bioinformatics methodologies for tumour diagnosis and treatment. Data mining is an effective technique used in this field. One of the main difficulties facing this technology is that existing classification methods are poorly suited to the complex structure of gene expression data. In this paper, we give a brief introduction to gene expression data and experiments, and we critically review the major data mining techniques applied to gene expression data. Researchers have developed various techniques for gene data classification. These techniques differ from one another, yet results still show a need for improvement in this field. Some of these techniques are discussed in this paper in terms of their advantages and disadvantages. Deoxyribonucleic acid (DNA) is considered the maestro of tumour-derived factors. Analyzing changes in gene expression may improve the diagnosis of affected tissues at an early stage. For that reason, ongoing research is addressing subspace clustering methodologies suitable for high-dimensional datasets, and the verification of new methodologies using appropriate datasets, particularly those suited to the analysis of gene expression data. In this context, researchers have identified various limitations of these methods, particularly in the areas of information integration systems, text mining and bioinformatics. This paper also aims to provide an overview of the published literature, with a particular focus on the current status of subspace clustering for knowledge discovery toward tumour diagnosis. This is considered an essential step toward overcoming these limitations and providing an effective statistical model for genetic knowledge discovery.
UR - https://www.scopus.com/pages/publications/64349122886
U2 - 10.1109/ICIAFS.2008.4783954
DO - 10.1109/ICIAFS.2008.4783954
M3 - Conference Paper
AN - SCOPUS:64349122886
SN - 9781424429004
T3 - Proceedings of the 2008 4th International Conference on Information and Automation for Sustainability, ICIAFS 2008
SP - 367
EP - 371
BT - Proceedings of the 2008 4th International Conference on Information and Automation for Sustainability, ICIAFS 2008
T2 - 2008 4th International Conference on Information and Automation for Sustainability, ICIAFS 2008
Y2 - 12 December 2008 through 14 December 2008
ER -