TY - JOUR
T1 - Text classification: Naïve Bayes classifier with sentiment Lexicon
AU - Le, Cong-Cuong
AU - Prasad, P. W. C.
AU - Alsadoon, Abeer
AU - Pham, L.
AU - Elchouemi, A.
PY - 2019
Y1 - 2019
AB - This paper proposes a method of linguistic classification based on the analysis of positive, negative and neutral sentiments expressed in text written in Vietnamese and English. It includes a document preparation process and builds its training data using Naïve Bayes classification in conjunction with a sentiment lexicon dictionary, thus reducing both the size of the training corpus and the limitations of the bag-of-words approach. Naïve Bayes, a machine learning and information mining algorithm, was chosen for its proven viability and its central role in data retrieval in general. The effectiveness of Naïve Bayes is further enhanced by using the dictionary as the input source, which reduces the size of the training corpus and consequently the training time. In addition, the document preparation process significantly improves accuracy to 98.2%, compared with 96.1% for traditional Naïve Bayes and 87.3% for the lexical method.
UR - https://hdl.handle.net/1959.7/uws:64601
UR - http://www.iaeng.org/IJCS/issues_v46/issue_2/IJCS_46_2_01.pdf
M3 - Article
SN - 1819-656X
VL - 46
SP - 141
EP - 148
JO - IAENG International Journal of Computer Science
JF - IAENG International Journal of Computer Science
IS - 2
ER -