
Optimal Function Approximation Using ReLU Neural Networks

Posted on: 2022-05-22
Degree: Master
Type: Thesis
Country: China
Candidate: Y Liang
Full Text: PDF
GTID: 2480306764494694
Subject: Automation Technology
Abstract/Summary:
In recent years, deep learning has achieved state-of-the-art performance in a range of fields such as computer vision, recommender systems, and natural language processing. Despite these promising applied results, important open problems remain in neural network theory. The expressive power of neural networks, a central part of this theory, plays a vital role in understanding them. From the viewpoint of function approximation, expressive power describes a network's ability to approximate arbitrary functions. By the universal approximation theorem, a single-hidden-layer network that is wide enough can approximate the objective function to arbitrary accuracy. In practice, ever larger networks are deployed to reach higher accuracy, which raises fundamental questions about expressive power: What is the accuracy limit one can achieve with a network of a given size, and how fast does accuracy improve as the network grows? In mathematical language: what is the minimal approximation error a network can achieve, and how fast does that error decrease with network size? On the other hand, it is unclear whether current training techniques, e.g. stochastic gradient descent (SGD), can fully exploit the expressive power of neural networks; if not, how large is the gap between a network's training error and the minimal approximation error?

In view of the above problems, the main contributions of this thesis are as follows:

1. We introduce necessary and sufficient conditions for the optimal approximation of a convex function by a piecewise linear (PWL) function with n segments. From these conditions, upper and lower bounds on the optimal approximation error and the optimal approximation rate are obtained (a standard bound of this type is sketched after the abstract). Because neural networks are nonlinear, a ReLU network structure of fixed depth and fixed width is presented that generates the optimal approximating linear segments (see the second sketch below), and upper bounds on the optimal approximation error attainable with this structure are given.

2. Building on the optimal function approximation theory, we propose an algorithm for computing optimal approximations and analyze its convergence. Experiments validate its effectiveness and compare it with a classic optimal approximation algorithm. We also demonstrate that ReLU networks trained with SGD do not attain the theoretical limit of the approximation error (see the third sketch below), indicating that the expressive power of ReLU networks is not exploited to its full potential.

3. A method for dividing linear regions is proposed that ensures all samples are divided correctly. It is used to compute the average fitting error within each linear region of a network (see the fourth sketch below), which explains the difference in expressive power between different network structures.

4. For high-dimensional functions, the approximation error is measured experimentally. Results on different network structures with the same number of neurons demonstrate that deep networks possess stronger function-approximation ability than shallow networks.
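The abstract does not state the thesis's exact error bounds, but the flavor of contribution 1 can be illustrated with a standard uniform-knot result (a hedged illustration only, not the thesis's theorem; free-knot optimal approximation improves the constant but not the O(1/n^2) rate for convex C^2 functions):

```latex
% Piecewise linear interpolation error for f \in C^2[a,b] with
% n uniform segments of width h = (b-a)/n:
\left\| f - p_n \right\|_{\infty}
  \le \frac{h^{2}}{8}\,\max_{x\in[a,b]}\lvert f''(x)\rvert
  = \frac{(b-a)^{2}}{8\,n^{2}}\,\max_{x\in[a,b]}\lvert f''(x)\rvert
```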
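The network construction in contribution 1 is also easy to illustrate: any continuous PWL function with n breakpoints is exactly a one-hidden-layer ReLU network with n hidden units plus an affine skip term. A minimal numpy sketch (the function and parameter names here are ours; the thesis's fixed-depth, fixed-width construction may differ):

```python
import numpy as np

def pwl_as_relu_net(x0, y0, slopes, breakpoints):
    """Represent a continuous piecewise linear function as a
    one-hidden-layer ReLU network:
        f(x) = y0 + s0*(x - x0) + sum_i (s_{i+1} - s_i) * relu(x - b_i),
    where slopes[i] is the slope of the i-th piece and breakpoints
    are the sorted knots. Returns a callable evaluating the network.
    (Illustrative sketch, not the thesis's exact construction.)"""
    slopes = np.asarray(slopes, dtype=float)
    b = np.asarray(breakpoints, dtype=float)
    coeff = np.diff(slopes)                       # slope change at each knot

    def f(x):
        x = np.asarray(x, dtype=float)
        hidden = np.maximum(x[..., None] - b, 0.0)  # one ReLU unit per knot
        return y0 + slopes[0] * (x - x0) + hidden @ coeff

    return f

# Usage: the uniform-knot 3-segment PWL interpolant of x**2 on [0, 1]
# is realized exactly with just 2 hidden ReLU units plus the skip term.
f = pwl_as_relu_net(x0=0.0, y0=0.0,
                    slopes=[1/3, 1.0, 5/3],
                    breakpoints=[1/3, 2/3])
print(f(np.linspace(0.0, 1.0, 5)))
```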
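For the SGD gap in contribution 2, a small self-contained PyTorch experiment conveys the idea (all sizes, learning rates, and step counts here are illustrative assumptions, not the thesis's setup): with 8 hidden ReLU units the network could in principle realize an 8-breakpoint PWL approximant of f(x) = x^2, whose uniform-knot interpolation bound is 2/(8*64) ≈ 3.9e-3, yet plain SGD training typically stalls above that limit.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Tiny one-hidden-layer ReLU net: 8 hidden units can express a PWL
# function with up to 8 breakpoints on [0, 1].
net = nn.Sequential(nn.Linear(1, 8), nn.ReLU(), nn.Linear(8, 1))

x = torch.rand(1024, 1)
y = x ** 2
opt = torch.optim.SGD(net.parameters(), lr=0.1)

for step in range(20000):
    opt.zero_grad()
    loss = ((net(x) - y) ** 2).mean()
    loss.backward()
    opt.step()

# Compare the trained sup-norm error on a dense grid with the
# theoretical PWL interpolation bound for n = 8 uniform segments:
# (b-a)^2 * max|f''| / (8 n^2) = 2 / 512 ≈ 3.9e-3.
grid = torch.linspace(0, 1, 2001).unsqueeze(1)
with torch.no_grad():
    err = (net(grid) - grid ** 2).abs().max().item()
print(f"trained sup-norm error: {err:.2e}  (PWL bound ~3.9e-03)")
```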
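Contribution 3's region division can be realized, in the standard way, via activation patterns: inside one linear region every ReLU unit is fixed to be active or inactive, so samples sharing a pattern share a region. A sketch reusing the `net` and `grid` objects from the previous example (the thesis's exact method is not given in the abstract):

```python
import numpy as np
import torch

def region_errors(net, x, y):
    """Group samples by the ReLU activation pattern they induce
    (each distinct pattern = one linear region of the network) and
    return the mean absolute fitting error per non-empty region.
    Assumes `net` is an nn.Sequential of Linear/ReLU layers."""
    patterns = []
    h = x
    with torch.no_grad():
        for layer in net:
            h = layer(h)
            if isinstance(layer, torch.nn.ReLU):
                patterns.append((h > 0).numpy())
        preds = h
    sig = np.concatenate(patterns, axis=1)   # one signature per sample
    err = (preds - y).abs().numpy().ravel()
    out = {}
    for key in np.unique(sig, axis=0):       # unique activation patterns
        mask = (sig == key).all(axis=1)
        out[tuple(key.astype(int))] = err[mask].mean()
    return out

# Average fitting error per linear region that the grid actually hits:
errs = region_errors(net, grid, grid ** 2)
print(len(errs), "non-empty linear regions;",
      "worst region error:", max(errs.values()))
```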
Keywords/Search Tags: deep learning theory, ReLU networks, expressive power, optimal approximation, linear region