TY - JOUR
T1 - AI-based ICD coding and classification approaches using discharge summaries : a systematic literature review
AU - Kaur, Rajvir
AU - Ginige, Jeewani Anupama
AU - Obst, Oliver
PY - 2023/3/1
Y1 - 2023/3/1
N2 - The assignment of codes to free-text clinical narratives have long been recognised to be beneficial for secondary uses such as funding, insurance claim processing and research. The current scenario of assigning clinical codes is a manual process which is very expensive, time-consuming and error prone. In recent years, many researchers have studied the use of Natural Language Processing (NLP), related machine learning and deep learning methods and techniques to resolve the problem of manual coding of clinical narratives and to assist human coders to assign clinical codes more accurately and efficiently. The main objective of this systematic literature review is to provide a comprehensive overview of automated clinical coding systems that utilise appropriate NLP, machine learning and deep learning methods and techniques to assign the International Classification of Diseases (ICD) codes to discharge summaries. We have followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines and conducted a comprehensive search of publications from January, 2010 to December 2021 in four high quality academic databases: PubMed, ScienceDirect, Association for Computing Machinery (ACM) Digital Library, and the Association for Computational Linguistics (ACL) Anthology. We reviewed 6128 publications; 42 met the inclusion criteria. This review identified: 6 datasets having discharge summaries (2 publicly available, 4 acquired from hospitals); 14 NLP techniques along with some other data extraction processes, different feature extraction and embedding techniques. The review also shows that there is a significant increase in the use of deep learning models compared to machine learning. To measure the performance of classification methods, different evaluation metrics are used. Efforts are still required to improve ICD code prediction accuracy, availability of large-scale de-identified clinical corpora with the latest version of the classification system. This can be a platform to guide and share knowledge with the less experienced coders and researchers.
AB - The assignment of codes to free-text clinical narratives have long been recognised to be beneficial for secondary uses such as funding, insurance claim processing and research. The current scenario of assigning clinical codes is a manual process which is very expensive, time-consuming and error prone. In recent years, many researchers have studied the use of Natural Language Processing (NLP), related machine learning and deep learning methods and techniques to resolve the problem of manual coding of clinical narratives and to assist human coders to assign clinical codes more accurately and efficiently. The main objective of this systematic literature review is to provide a comprehensive overview of automated clinical coding systems that utilise appropriate NLP, machine learning and deep learning methods and techniques to assign the International Classification of Diseases (ICD) codes to discharge summaries. We have followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines and conducted a comprehensive search of publications from January, 2010 to December 2021 in four high quality academic databases: PubMed, ScienceDirect, Association for Computing Machinery (ACM) Digital Library, and the Association for Computational Linguistics (ACL) Anthology. We reviewed 6128 publications; 42 met the inclusion criteria. This review identified: 6 datasets having discharge summaries (2 publicly available, 4 acquired from hospitals); 14 NLP techniques along with some other data extraction processes, different feature extraction and embedding techniques. The review also shows that there is a significant increase in the use of deep learning models compared to machine learning. To measure the performance of classification methods, different evaluation metrics are used. Efforts are still required to improve ICD code prediction accuracy, availability of large-scale de-identified clinical corpora with the latest version of the classification system. This can be a platform to guide and share knowledge with the less experienced coders and researchers.
UR - https://hdl.handle.net/1959.7/uws:69354
U2 - 10.1016/j.eswa.2022.118997
DO - 10.1016/j.eswa.2022.118997
M3 - Article
SN - 0957-4174
VL - 213
JO - Expert Systems with Applications
JF - Expert Systems with Applications
IS - Pt. B
M1 - 118997
ER -