Abstract
The spread of ‘fake’ health news is a big problem with even bigger consequences. In this study, we examine a collection of health-related news articles published by reliable and unreliable media outlets. Our analysis shows that content from reliable outlets differs from content from unreliable outlets in its structural, topical, and semantic patterns. Using machine learning, we leverage these patterns to build classification models that identify the source (reliable or unreliable) of a health-related news article. Our model predicts the source of an article with an F-measure of 96%. We argue that the findings from this study will be useful for combating the health disinformation problem.
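For readers unfamiliar with the metric reported in the abstract, the following is a minimal sketch of how an F-measure is computed from classification counts. The function name `f_measure` and the counts used below are illustrative assumptions; they are not taken from the paper and do not reproduce its data or its 96% result.

```python
def f_measure(tp, fp, fn, beta=1.0):
    """Return the F-beta score for one class from confusion counts.

    tp: articles correctly labelled as the class (true positives)
    fp: articles wrongly labelled as the class (false positives)
    fn: articles of the class that were missed (false negatives)
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    # Weighted harmonic mean of precision and recall (beta=1 gives F1).
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts for an "unreliable source" class (not from the paper).
score = f_measure(tp=90, fp=10, fn=30)
print(round(score, 3))  # 0.818
```

With these made-up counts, precision is 0.90 and recall is 0.75, so the balanced F-measure falls between the two and sits closer to the lower value.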
Original language | English |
---|---|
Title of host publication | The Web Conference 2019 |
Subtitle of host publication | Companion of the World Wide Web Conference, WWW 2019 |
Editors | Sihem Amer-Yahia, Mohammad Mahdian, Ashish Goel, Geert-Jan Houben, Kristina Lerman, Julian McAuley, Ricardo Baeza-Yates, Leila Zia |
Place of Publication | USA |
Publisher | Association for Computing Machinery (ACM) |
Pages | 981-987 |
Number of pages | 7 |
ISBN (Print) | 9781450366755 |
DOIs | |
Publication status | Published - May 2019 |
Event | WWW '19: The Web Conference 2019, The Hyatt Regency, San Francisco, United States. Duration: 13 May 2019 → 17 May 2019. https://www2019.thewebconf.org/ |
Conference
Conference | WWW '19 |
---|---|
Abbreviated title | 30 years of the web |
Country/Territory | United States |
City | San Francisco |
Period | 13/05/19 → 17/05/19 |
Other | It is our great pleasure to welcome you to The Web Conference 2019. The Web Conference is the premier venue focused on understanding the current state and the evolution of the Web through the lens of computer science, computational social science, economics, policy, and many other disciplines. The 2019 edition of the conference is a reflection point as we celebrate the 30th anniversary of the Web. |
Internet address |