Abstract
The spread of ‘fake’ health news is a big problem with even bigger consequences. In this study, we examine a collection of health-related news articles published by reliable and unreliable media outlets. Our analysis reveals structural, topical, and semantic patterns that differ between content from reliable and unreliable media outlets. Using machine learning, we leverage these patterns to build classification models that identify the source (reliable or unreliable) of a health-related news article. Our model predicts the source of an article with an F-measure of 96%. We argue that the findings from this study will be useful for combating the health disinformation problem.
Original language | English |
---|---|
Title of host publication | The Web Conference 2019 |
Subtitle of host publication | Companion of The World Wide Web Conference WWW 2019 |
Editors | Sihem Amer-Yahia, Mohammad Mahdian, Ashish Goel, Geert-Jan Houben, Kristina Lerman, Julian McAuley, Ricardo Baeza-Yates, Leila Zia |
Place of Publication | USA |
Publisher | Association for Computing Machinery (ACM) |
Pages | 981-987 |
Number of pages | 7 |
ISBN (Print) | 9781450366755 |
Publication status | Published - May 2019 |
Event | The Web Conference 2019, The Hyatt Regency, San Francisco, United States. Duration: 13 May 2019 → 17 May 2019. https://www2019.thewebconf.org/ |
Conference
Conference | The Web Conference 2019 |
---|---|
Country | United States |
City | San Francisco |
Period | 13/05/19 → 17/05/19 |