Abstract
Our goal is to evaluate the usefulness of unsupervised representation learning techniques for detecting stances of Fake News. To this end, we examine several pre-trained language models with respect to their performance on two Fake News-related data sets, both consisting of instances with a headline, an associated news article and the stance of the article towards the respective headline. Specifically, the aim is to understand how much hyperparameter tuning is necessary when fine-tuning the pre-trained architectures, how well transfer learning works in this specific case of stance detection, and how sensitive the models are to changes in hyperparameters such as batch size, learning rate (schedule) and sequence length, as well as to the freezing technique. The results indicate that the computationally more expensive autoregression approach of XLNet (Yang et al., 2019) is outperformed by BERT-based models, notably by RoBERTa (Liu et al., 2019). While the learning rate seems to be the most important hyperparameter, experiments with different freezing techniques indicate that all evaluated architectures had already learned powerful language representations that provide a good starting point for fine-tuning.
Item Type: | Conference or Workshop Item (Poster) |
---|---|
Faculties: | Mathematics, Computer Science and Statistics > Statistics; Mathematics, Computer Science and Statistics > Statistics > Chairs/Working Groups > Methods for missing Data, Model selection and Model averaging |
Subjects: | 500 Science > 510 Mathematics |
URN: | urn:nbn:de:bvb:19-epub-75926-9 |
Place of Publication: | Barcelona, Spain (Online) |
Item ID: | 75926 |
Date Deposited: | 12. May 2021, 05:56 |
Last Modified: | 12. May 2021, 05:56 |