Abstract
Despite the widespread success of deep learning in various applications, neural network theory has been lagging behind. The choice of activation function plays a critical role in the expressivity of a neural network, but for reasons that are not yet fully understood. While the rectified linear unit (ReLU) is currently one of the most popular activation functions, ReLU squared has only recently been shown empirically to be pivotal in producing consistently superior results for state-of-the-art deep learning tasks (So et al., 2021). To analyze the expressivity of neural networks with ReLU powers, we employ the novel framework of Gribonval et al. (2022), which is based on the classical concept of approximation spaces. We consider the class of functions for which the approximation error decays at a sufficiently fast rate as network complexity, measured by the number of weights, increases. We show that when approximating sufficiently smooth functions that cannot be represented by sufficiently low-degree polynomials, networks with ReLU powers need less depth than those with ReLU. Moreover, at equal depth, networks with ReLU powers can achieve potentially faster approximation rates. Lastly, our computational experiments on approximating the Rastrigin and Ackley functions with deep neural networks show that ReLU squared and ReLU cubed networks consistently outperform ReLU networks.
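
The following is a minimal notation sketch of the objects the abstract refers to, written in the form commonly used in the approximation-space literature that Gribonval et al. (2022) build on; the paper's exact definitions and normalizations may differ.

```latex
% Sketch of assumed notation; the paper's exact normalizations may differ.
% ReLU power activation: p = 1 is ReLU, p = 2 is ReLU squared.
\[
  \rho_p(x) = \max(0, x)^p .
\]
% Approximation error of f by the set \Sigma_n of functions realized by
% networks with at most n nonzero weights:
\[
  E_n(f)_X = \inf_{g \in \Sigma_n} \lVert f - g \rVert_X .
\]
% Approximation class: functions whose error decays at rate n^{-\alpha}
% (for q = \infty the condition becomes \sup_n n^{\alpha} E_n(f)_X < \infty):
\[
  A^{\alpha}_{q}(X) = \Bigl\{ f \in X \;:\;
    \sum_{n \ge 1} \frac{1}{n} \bigl( n^{\alpha} E_{n-1}(f)_X \bigr)^{q} < \infty \Bigr\}.
\]
```

As a purely illustrative companion to the experiments mentioned in the abstract, the sketch below trains a small ReLU-power network to approximate the Rastrigin function in PyTorch. The architecture, domain, and hyperparameters here are assumptions for illustration, not the authors' experimental setup.

```python
import torch
import torch.nn as nn

def rastrigin(x, A=10.0):
    # Standard Rastrigin test function: A*n + sum_i (x_i^2 - A*cos(2*pi*x_i)).
    return A * x.shape[-1] + (x**2 - A * torch.cos(2 * torch.pi * x)).sum(dim=-1, keepdim=True)

class ReLUPower(nn.Module):
    # Activation max(0, x)^p; p = 1 is ReLU, p = 2 is ReLU squared, p = 3 ReLU cubed.
    def __init__(self, p=2):
        super().__init__()
        self.p = p

    def forward(self, x):
        return torch.relu(x) ** self.p

def make_net(dim=2, width=64, depth=3, p=2):
    # Fully connected network with ReLU-power activations (illustrative sizes).
    layers, d_in = [], dim
    for _ in range(depth):
        layers += [nn.Linear(d_in, width), ReLUPower(p)]
        d_in = width
    layers.append(nn.Linear(d_in, 1))
    return nn.Sequential(*layers)

net = make_net(p=2)                       # swap p=1 / p=3 to compare activations
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for step in range(2000):
    x = torch.empty(256, 2).uniform_(-5.12, 5.12)   # usual Rastrigin domain
    loss = nn.functional.mse_loss(net(x), rastrigin(x))
    opt.zero_grad()
    loss.backward()
    opt.step()
```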
| Document type: | Journal article |
|---|---|
| Keywords: | ReLU powers; Deep neural networks; Approximation spaces |
| Faculty: | Mathematik, Informatik und Statistik > Mathematik > Lehrstuhl für Mathematik der Informationsverarbeitung |
| Subject areas: | 000 Computer science, information and general works > 000 Computer science, knowledge, systems; 500 Natural sciences and mathematics > 510 Mathematics |
| ISSN: | 0893-6080 |
| Document ID: | 127300 |
| Date published on Open Access LMU: | 07 Aug 2025 07:07 |
| Last modified: | 07 Aug 2025 07:07 |