Text classification: Naïve bayes classifier with sentiment Lexicon

Cong Cuong Le, P. W.C. Prasad, Abeer Alsadoon, L. Pham, A. Elchouemi

Research output: Contribution to journal, Article, peer-review

12 Citations (Scopus)
434 Downloads (Pure)


This paper proposes a method of linguistic classification based on the analysis of positive, negative and neutral sentiments expressed in text written in Vietnamese and English. The method includes a document preparation process and builds its training data using Naïve Bayes classification in conjunction with a sentiment lexicon dictionary, which reduces the size of the training corpus and avoids the limitations of the bag-of-words approach. Naïve Bayes, a machine learning and information mining algorithm, was chosen for its proven viability and its central role in data retrieval in general. Its effectiveness is further enhanced by using the dictionary as the input source, which reduces the size of the training corpus and consequently the training time. In addition, the document preparation process raises accuracy to 98.2%, compared with 96.1% for traditional Naïve Bayes and 87.3% for the lexical method.
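The idea described in the abstract, a Naïve Bayes classifier whose vocabulary is restricted to a sentiment lexicon, can be illustrated with a minimal sketch. This is not the authors' implementation: the toy `LEXICON`, the `tokenize` stand-in for the document preparation step, the Laplace smoothing, and the sample training sentences are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy lexicon; the paper's actual dictionary is not reproduced here.
LEXICON = {"good", "great", "love", "bad", "terrible", "hate", "okay", "average"}


def tokenize(text):
    """Assumed stand-in for document preparation: lowercase, strip
    surrounding punctuation, and keep only words found in the lexicon."""
    words = (w.strip(".,!?;:\"'()") for w in text.lower().split())
    return [w for w in words if w in LEXICON]


def train(samples):
    """Count documents per class and lexicon-word occurrences per class."""
    class_docs = Counter()
    word_counts = defaultdict(Counter)
    for text, label in samples:
        class_docs[label] += 1
        word_counts[label].update(tokenize(text))
    return class_docs, word_counts


def predict(text, model):
    """Return the class with the highest log posterior (multinomial
    Naive Bayes with add-one smoothing over the lexicon vocabulary)."""
    class_docs, word_counts = model
    total_docs = sum(class_docs.values())
    tokens = tokenize(text)
    best_label, best_score = None, -math.inf
    for label in class_docs:
        score = math.log(class_docs[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(LEXICON)
        for tok in tokens:
            score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up training sentences, for illustration only.
TRAIN = [
    ("I love this phone, it is great", "positive"),
    ("Good value and great battery", "positive"),
    ("Terrible screen, I hate it", "negative"),
    ("Bad support and terrible packaging", "negative"),
    ("It is okay, average at best", "neutral"),
    ("An average product, okay for the price", "neutral"),
]
model = train(TRAIN)
```

Because only lexicon words are counted, the feature space stays as small as the dictionary regardless of how large the raw vocabulary of the corpus is, which is the size reduction the abstract refers to.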

Original language: English
Pages (from-to): 141-148
Number of pages: 8
Journal: IAENG International Journal of Computer Science
Issue number: 2
Publication status: Published - 27 May 2019

