
Abstract
We introduce the Reddit Politosphere, a large-scale resource of online political discourse covering more than 600 political discussion groups over a period of 12 years. To the best of our knowledge, it is the largest and ideologically most comprehensive dataset of its type currently available. One key feature of the Reddit Politosphere is that it consists of both text and network data, allowing for methodologically diverse analyses. We describe in detail how we create the Reddit Politosphere, present descriptive statistics, and sketch potential directions for future research based on the resource.
Item Type: | Conference or Workshop Item (Paper) |
---|---|
EU Funded Grant Agreement Number: | 740516 |
EU Projects: | Horizon 2020 > ERC Grants > ERC Advanced Grant > ERC Grant 740516: NonSequeToR - Non-sequence models for tokenization replacement |
Research Centers: | Center for Information and Language Processing (CIS) |
Subjects: | 400 Language > 400 Language; 400 Language > 410 Linguistics |
URN: | urn:nbn:de:bvb:19-epub-107434-0 |
Annotation: | ISBN 978-1-57735-875-6 |
Language: | English |
Item ID: | 107434 |
Date Deposited: | 20. Oct 2023, 06:16 |
Last Modified: | 20. Oct 2023, 06:16 |