AI tools could improve fake news detection by analyzing users’ interactions and comments

In a paper published on the preprint server Arxiv.org, researchers affiliated with Microsoft and Arizona State University propose an approach to detecting fake news that leverages a technique called weak social supervision. They say that by enabling the training of fake news-detecting AI even in scenarios where labeled examples aren’t available, weak social supervision opens the door to exploring how aspects of user interactions indicate news might be misleading.

According to the Pew Research Center, approximately 68% of U.S. adults got their news from social media in 2018, which is worrisome considering that misinformation about the pandemic, for instance, continues to go viral. Companies including Facebook, Twitter, and Google are pursuing automated detection solutions, but fake news remains a moving target owing to its topical and stylistic diversity.

Building on a study published in April, the coauthors of this latest work suggest that weak supervision — where noisy or imprecise sources provide data labeling signals — could improve fake news detection accuracy without requiring fine-tuning. To this end, they built a framework dubbed Tri-relationship for Fake News (TiFN) that models social media users and their connections as an “interaction network” to detect fake news.

Interaction networks describe the relationships among entities like publishers, news pieces, and users; given an interaction network, TiFN’s goal is to embed different types of entities, following from the observation that people tend to interact with like-minded friends. In making its predictions, the framework also accounts for the fact that connected users are more likely to share similar interests in news pieces; that publishers with a high degree of political bias are more likely to publish fake news; and that users with low credibility are more likely to spread fake news.
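To make the idea of weak social supervision concrete, here is a minimal Python sketch of how two of the signals described above, publisher bias and user credibility, might be turned into a noisy label for a news piece. It is an illustration only and not the researchers' model, which learns embeddings from the interaction network. The function name `weak_label`, the 0-to-1 score scales, and both thresholds are assumptions made for this example.

```python
from statistics import mean


def weak_label(publisher_bias, sharer_credibility,
               bias_threshold=0.7, credibility_threshold=0.4):
    """Return 1 (likely fake), 0 (likely real), or None (abstain).

    All inputs are hypothetical scores in [0, 1]; the thresholds are
    illustrative assumptions, not values from the paper.
    """
    votes = []

    # Signal 1: strongly biased publishers are assumed more likely
    # to publish fake news; weakly biased ones vote "real".
    if publisher_bias >= bias_threshold:
        votes.append(1)
    elif publisher_bias <= 1 - bias_threshold:
        votes.append(0)

    # Signal 2: low average credibility among the users who shared
    # the piece is assumed to indicate fake news.
    if sharer_credibility:
        avg = mean(sharer_credibility)
        if avg <= credibility_threshold:
            votes.append(1)
        elif avg >= 1 - credibility_threshold:
            votes.append(0)

    # Abstain when no signal fired or the signals disagree evenly.
    fake_votes = sum(votes)
    if not votes or 2 * fake_votes == len(votes):
        return None
    return 1 if 2 * fake_votes > len(votes) else 0


if __name__ == "__main__":
    print(weak_label(0.9, [0.1, 0.2]))  # both signals point to fake
    print(weak_label(0.1, [0.9, 0.8]))  # both signals point to real
    print(weak_label(0.9, [0.9, 0.9]))  # signals conflict
```

Labels produced this way are noisy by design. The point of weak supervision is that a detector can still be trained on many such imperfect labels when hand-verified examples are scarce.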

To test whether TiFN’s weak social supervision could help to detect fake news effectively, the team validated it against a Politifact data set containing 120 true and 120 verifiably fake news pieces shared among 23,865 users. Versus baseline detectors that consider only news content and some social interactions, they report that TiFN achieved between 75% and 87% accuracy even with a limited amount of weak social supervision (within 12 hours after the news was published).

In another experiment involving a separate custom framework called Defend, the researchers sought to use news sentences and user comments explaining why a piece of news is fake as a weak supervision signal. Tested on a second Politifact data set consisting of 145 true and 270 fake news pieces with 89,999 comments from 68,523 users on Twitter, they say that Defend achieved 90% accuracy.

“[W]ith the help of weak social supervision from publisher-bias and user-credibility, the detection performance is better than those without utilizing weak social supervision. We [also] observe that when we eliminate news content component, user comment component, or the co-attention for news contents and user comments, the performances are reduced. [This] indicates capturing the semantic relations between the weak social supervision from user comments and news contents is important,” wrote the researchers. “[W]e can see within a certain range, more weak social supervision leads to a larger performance increase, which shows the benefit of using weak social supervision.”