FakeNewsChallenge | Behavioral Informatics Labs

We have extracted linguistic features for the “Fake” or “BS” news articles provided by the SBP-BRIMS Data Grand Challenge organizers:

Dataset 0: LINK
News articles shared on the Kaggle website.
(N~12K).

We have created two new “Valid” news article datasets:

Dataset 1: LINK
We sampled 115 articles each from three well-known and largely respected news agencies: National Public Radio, New York Times, and Public Broadcasting Service. The extracted LIWC features are also shared.
(N=445).

Dataset 2: LINK
We downloaded 23k+ articles from the PBS (Public Broadcasting Service) website. The extracted LIWC features are also shared.
(N=23,635).

Note: In each category, some of the collected articles were removed because they are no longer accessible, did not pass our manual validation steps, or were flagged by the LIWC analysis. Features were computed only for the remaining articles.
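
For readers who want to compare the “Fake” and “Valid” sets, a minimal sketch of one way to combine the shared feature tables into a single labeled collection follows. It assumes the LIWC features are distributed as CSV files with one row per article. The column names (`article_id`, `WC`, `posemo`), the helper `load_labeled_rows`, and the tiny inline tables are illustrative placeholders, not the actual contents of the datasets linked above.

```python
import csv
import io


def load_labeled_rows(csv_text, label):
    """Parse a feature table given as CSV text and tag each row with a class label.

    Hypothetical helper: the real files may use a different layout.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = []
    for row in reader:
        row = dict(row)
        row["label"] = label  # e.g. "fake" for Dataset 0, "valid" for Datasets 1 and 2
        rows.append(row)
    return rows


# Made-up miniature tables standing in for the downloaded feature files.
fake_csv = "article_id,WC,posemo\nf1,120,2.5\nf2,300,1.0\n"
valid_csv = "article_id,WC,posemo\nv1,450,3.1\n"

combined = load_labeled_rows(fake_csv, "fake") + load_labeled_rows(valid_csv, "valid")

# Count articles per class to confirm the merge kept every row.
counts = {}
for row in combined:
    counts[row["label"]] = counts.get(row["label"], 0) + 1
print(counts)
```

With the real files, the inline strings would be replaced by the contents of the downloaded tables, and the per-class counts can be checked against the N values listed for each dataset.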