Abstract
We introduce the Reddit Politosphere, a large-scale resource of online political discourse covering more than 600 political discussion groups over a period of 12 years. To the best of our knowledge, it is the largest and ideologically most comprehensive dataset of its type currently available. One key feature of the Reddit Politosphere is that it consists of both text and network data, allowing for methodologically diverse analyses. We describe in detail how we create the Reddit Politosphere, present descriptive statistics, and sketch potential directions for future research based on the resource.
Document type: | Conference contribution (Paper) |
---|---|
EU Funded Grant Agreement Number: | 740516 |
EU projects: | Horizon 2020 > ERC Grants > ERC Advanced Grant > ERC Grant 740516: NonSequeToR - Non-sequence models for tokenization replacement |
Cross-faculty institutions: | Centrum für Informations- und Sprachverarbeitung (CIS) |
Subjects: | 400 Language > 400 Language; 400 Language > 410 Linguistics |
URN: | urn:nbn:de:bvb:19-epub-107434-0 |
Note: | ISBN 978-1-57735-875-6 |
Language: | English |
Document ID: | 107434 |
Date of publication on Open Access LMU: | 20 Oct 2023, 06:16 |
Last modified: | 20 Oct 2023, 06:16 |