
Galli, Leonardo (ORCID: https://orcid.org/0000-0002-8045-7101); Rauhut, Holger (ORCID: https://orcid.org/0000-0003-4750-5092); and Schmidt, Mark (2023): Don't be so Monotone: Relaxing Stochastic Line Search in Over-Parameterized Models. Advances in Neural Information Processing Systems, New Orleans, 10–16 December 2023.

Full text not available on 'Open Access LMU'.

Abstract

Recent works have shown that line search methods can speed up Stochastic Gradient Descent (SGD) and Adam in modern over-parameterized settings. However, existing line searches may take steps that are smaller than necessary since they require a monotone decrease of the (mini-)batch objective function. We explore nonmonotone line search methods to relax this condition and possibly accept larger step sizes. Despite the lack of a monotonic decrease, we prove the same fast rates of convergence as in the monotone case. Our experiments show that nonmonotone methods improve the speed of convergence and generalization properties of SGD/Adam even beyond the previous monotone line searches. We propose a POlyak NOnmonotone Stochastic (PoNoS) method, obtained by combining a nonmonotone line search with a Polyak initial step size. Furthermore, we develop a new resetting technique that in the majority of the iterations reduces the amount of backtracks to zero while still maintaining a large initial step size. To the best of our knowledge, a first runtime comparison shows that the epoch-wise advantage of line-search-based methods gets reflected in the overall computational time.
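To make the two ingredients named in the abstract concrete, here is a minimal NumPy sketch of one update of a nonmonotone backtracking line search seeded with a Polyak-style initial step size. This is an illustration under stated assumptions, not the authors' reference PoNoS implementation: the function names, the max-type nonmonotone reference value over a sliding window, the window size, and the interpolation assumption (optimal mini-batch loss taken as 0 in the Polyak step) are all choices made here for the sketch.

```python
import numpy as np

def nonmonotone_polyak_step(f, grad, w, history,
                            c=0.5, beta=0.9, max_backtracks=50):
    """One SGD-style update with a nonmonotone line search and a
    Polyak initial step size (illustrative sketch only).

    f, grad  -- mini-batch loss and gradient callables (hypothetical API)
    history  -- recent mini-batch loss values; the nonmonotone condition
                compares against max(history) instead of the current loss
    """
    g = grad(w)
    g_norm_sq = float(np.dot(g, g))
    fw = f(w)

    # Polyak initial step size; under interpolation the optimal
    # mini-batch loss is assumed to be ~0, so no f* is subtracted.
    eta = fw / (c * g_norm_sq + 1e-12)

    # Nonmonotone Armijo condition: sufficient decrease is measured
    # against the max of the last few losses, which can accept larger
    # steps than the standard monotone condition.
    ref = max(history + [fw])
    for _ in range(max_backtracks):
        if f(w - eta * g) <= ref - c * eta * g_norm_sq:
            break
        eta *= beta  # backtrack; falls through with the last eta tried

    w_new = w - eta * g
    history.append(fw)
    if len(history) > 10:  # sliding window of 10, an arbitrary choice here
        history.pop(0)
    return w_new, eta

# Usage on a toy quadratic (hypothetical data): the Polyak step solves
# 0.5*||w||^2 exactly in one iteration, so no backtracking is needed.
f = lambda w: 0.5 * float(np.dot(w, w))
grad = lambda w: w
w, hist = np.ones(3), []
for _ in range(5):
    w, eta = nonmonotone_polyak_step(f, grad, w, hist)
```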
