Galli, Leonardo (ORCID: https://orcid.org/0000-0002-8045-7101); Rauhut, Holger (ORCID: https://orcid.org/0000-0003-4750-5092) and Schmidt, Mark (2023):
Don't be so Monotone: Relaxing Stochastic Line Search in Over-Parameterized Models.
Advances in Neural Information Processing Systems, New Orleans, 10.–16. December 2023.
Abstract
Recent works have shown that line search methods can speed up Stochastic Gradient Descent (SGD) and Adam in modern over-parameterized settings. However, existing line searches may take steps that are smaller than necessary since they require a monotone decrease of the (mini-)batch objective function. We explore nonmonotone line search methods to relax this condition and possibly accept larger step sizes. Despite the lack of a monotonic decrease, we prove the same fast rates of convergence as in the monotone case. Our experiments show that nonmonotone methods improve the speed of convergence and generalization properties of SGD/Adam even beyond the previous monotone line searches. We propose a POlyak NOnmonotone Stochastic (PoNoS) method, obtained by combining a nonmonotone line search with a Polyak initial step size. Furthermore, we develop a new resetting technique that in the majority of the iterations reduces the amount of backtracks to zero while still maintaining a large initial step size. To the best of our knowledge, a first runtime comparison shows that the epoch-wise advantage of line-search-based methods gets reflected in the overall computational time.
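To make the idea in the abstract concrete, below is a minimal full-batch Python sketch of a nonmonotone backtracking line search with a Polyak-type initial step size. It is not the authors' PoNoS implementation: the function names, the window size, and the constants `c`, `beta`, and `f_star = 0` (the interpolation assumption in over-parameterized models) are illustrative assumptions. The Armijo condition is checked against the maximum loss over a window of past iterates rather than the current loss, which is what allows larger steps to be accepted.

```python
import numpy as np

def nonmonotone_polyak_step(f, grad_f, x, history, window=10,
                            c=0.5, beta=0.9, f_star=0.0):
    """One gradient step with a nonmonotone backtracking line search.

    Illustrative sketch only: the Armijo test compares against the max
    loss over the last `window` iterates (nonmonotone relaxation), and
    the initial step size is the Polyak guess (f(x) - f_star) / (c*||g||^2),
    with f_star = 0 as in the interpolation regime. Constants are not
    the paper's tuned values.
    """
    g = grad_f(x)
    fx = f(x)
    history.append(fx)
    ref = max(history[-window:])                 # nonmonotone reference value
    gnorm2 = np.dot(g, g)
    eta = (fx - f_star) / (c * gnorm2 + 1e-12)   # Polyak initial step size
    # Backtrack until the relaxed (nonmonotone) Armijo condition holds:
    # f(x - eta*g) <= ref - c * eta * ||g||^2
    while f(x - eta * g) > ref - c * eta * gnorm2:
        eta *= beta
    return x - eta * g, history

# Toy usage on a quadratic whose minimum value is 0 (interpolation holds).
f = lambda x: 0.5 * np.dot(x, x)
grad_f = lambda x: x
x, hist = np.ones(5), []
for _ in range(20):
    x, hist = nonmonotone_polyak_step(f, grad_f, x, hist)
print(f(x))  # close to 0
```

In a stochastic setting, `f` and `grad_f` would be evaluated on the same mini-batch within one step; the paper's resetting technique for the initial step size is omitted here.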
Document type: | Conference contribution (Paper) |
---|---|
Faculty: | Mathematics, Computer Science and Statistics > Mathematics > Chair of Mathematics of Information Processing |
Subject areas: | 000 Computer science, information and general works > 000 Computer science, knowledge, systems |
Language: | English |
Document ID: | 126718 |
Date deposited on Open Access LMU: | 12 Jun 2025 06:23 |
Last modified: | 12 Jun 2025 12:28 |