TY - JOUR
T1 - Text mining and sentiment analysis of newspaper headlines
AU - Hossain, Arafat
AU - Karimuzzaman, Md.
AU - Hossain, Md. Moyazzem
AU - Rahman, Azizur
PY - 2021/10/9
Y1 - 2021/10/9
N2 -
Text analytics are well-known in the modern era for extracting
information and patterns from text. However, no study has attempted to
illustrate the pattern and priorities of newspaper headlines in
Bangladesh using a combination of text analytics techniques. The purpose
of this paper is to examine the pattern of words that appeared on the
front page of a well-known daily English newspaper in Bangladesh, The Daily Star,
in 2018 and 2019. The elucidation of that era’s possible social and
political context was also attempted using word patterns. The study
employs three widely used and contemporary text mining techniques: word
clouds, sentiment analysis, and cluster analysis. The word cloud reveals
that election, kill, cricket, and Rohingya-related terms appeared more
than 60 times in 2018, whereas BNP, poll, kill, AL, and Khaleda appeared
more than 80 times in 2019. These indicated the country’s passion for
cricket, political turmoil, and Rohingya-related issues. Furthermore,
sentiment analysis reveals that words of fear and negative emotions
appeared more than 600 times, whereas anger, anticipation, sadness,
trust, and positive-type emotions came up more than 400 times in both
years. Finally, the clustering method demonstrates that election,
politics, deaths, digital security act, Rohingya, and cricket-related
words exhibit similarity and belong to a similar group in 2019, whereas
rape, deaths, road, and fire-related words clustered in 2018 alongside a
similar-appearing group. In general, this analysis demonstrates how
vividly the text mining approach depicts Bangladesh’s social, political,
and law-and-order situation, particularly during election season and
the country’s cricket craze, and also validates the significance of the
text mining approach to understanding the overall view of a country
during a particular time in an efficient manner.
AB -
Text analytics are well-known in the modern era for extracting
information and patterns from text. However, no study has attempted to
illustrate the pattern and priorities of newspaper headlines in
Bangladesh using a combination of text analytics techniques. The purpose
of this paper is to examine the pattern of words that appeared on the
front page of a well-known daily English newspaper in Bangladesh, The Daily Star,
in 2018 and 2019. The elucidation of that era’s possible social and
political context was also attempted using word patterns. The study
employs three widely used and contemporary text mining techniques: word
clouds, sentiment analysis, and cluster analysis. The word cloud reveals
that election, kill, cricket, and Rohingya-related terms appeared more
than 60 times in 2018, whereas BNP, poll, kill, AL, and Khaleda appeared
more than 80 times in 2019. These indicated the country’s passion for
cricket, political turmoil, and Rohingya-related issues. Furthermore,
sentiment analysis reveals that words of fear and negative emotions
appeared more than 600 times, whereas anger, anticipation, sadness,
trust, and positive-type emotions came up more than 400 times in both
years. Finally, the clustering method demonstrates that election,
politics, deaths, digital security act, Rohingya, and cricket-related
words exhibit similarity and belong to a similar group in 2019, whereas
rape, deaths, road, and fire-related words clustered in 2018 alongside a
similar-appearing group. In general, this analysis demonstrates how
vividly the text mining approach depicts Bangladesh’s social, political,
and law-and-order situation, particularly during election season and
the country’s cricket craze, and also validates the significance of the
text mining approach to understanding the overall view of a country
during a particular time in an efficient manner.
KW - Newspaper
KW - Bangladesh
KW - Headlines pattern and context
KW - Word cloud
KW - Cluster analysis
KW - Sentiment analysis
U2 - 10.3390/info12100414
DO - 10.3390/info12100414
M3 - Article
SN - 2078-2489
VL - 12
SP - 1
EP - 15
JO - Information
JF - Information
IS - 10
M1 - 414
ER -