
Investigation Into Efficient Learning Algorithms For Spiking Neural Networks

Posted on: 2024-12-04
Degree: Doctor
Type: Dissertation
Country: China
Candidate: S K Deng
Full Text: PDF
GTID: 1528307373470084
Subject: Computer Science and Technology
Abstract/Summary:
Spiking neural networks (SNNs) are a class of neural networks inspired by biological neural systems. They transmit information between layers as sequences of binary spikes. Because the inputs to each layer are binary, neurons in an SNN can replace the power-hungry matrix multiplication used to compute synaptic currents with low-energy addition. This property gives SNNs extremely low power consumption and high efficiency, making them attractive for energy-constrained intelligent terminals. However, the binary nature of spike signals also creates significant training challenges, because the spike firing function (a step function) is non-differentiable.

Two mainstream approaches currently exist for building deep SNNs: network conversion algorithms and direct training algorithms. Network conversion algorithms convert artificial neural networks (ANNs) with ReLU activation functions into structurally equivalent SNNs. However, existing conversion algorithms rely solely on the proportional relationship between ReLU activation values and neuronal firing rates, without analyzing the overall effect of conversion; as a result, converted SNNs require many simulation time steps to match the performance of the source ANN. Direct training algorithms, in contrast, adapt better to different network structures and datasets, though at higher training difficulty and cost. The most common direct training approach replaces the step function with a surrogate gradient during backpropagation. While surrogate gradients make it possible to train complex SNNs, they also introduce problems such as gradient error accumulation and reduced network generalizability.

This dissertation focuses on the training challenges of SNNs, investigating the error associated with conversion algorithms and the gradient error accumulation caused by surrogate gradients. It explores methods to improve training effectiveness and network generalizability, and proposes novel architectures tailored to SNNs. The main contributions are:

1. Defining the conversion error of network conversion algorithms and deriving its upper bound. Existing conversion algorithms rely on the proportionality between ReLU activations and spike frequencies, but do not analyze the overall difference in network outputs before and after conversion. Using a layer-wise decomposition, this dissertation defines the conversion error as the difference between the outputs of the pre- and post-conversion networks under the same input, and derives a theoretical upper bound on this error, filling a theoretical gap in network conversion and providing a foundation for future algorithm improvements. In addition, two algorithms are proposed that reduce the simulation time steps required by the converted SNN to one tenth of the original.

2. Analyzing why SNNs generalize poorly and proposing an efficient temporal training algorithm to improve generalization. On small-scale datasets, both SNNs and traditional neural networks often fit the training set well, but SNNs perform markedly worse on the test set. This dissertation analyzes the causes of this gap and proposes a novel, efficient temporal training algorithm whose effectiveness is demonstrated both theoretically and experimentally. Temporal training significantly improves SNN generalization across all datasets, particularly on neuromorphic datasets: on DVS-CIFAR10, the proposed algorithm enables an SNN to exceed 80% test accuracy for the first time, an improvement of more than 10% over existing training methods.

3. Analyzing the gradient error accumulation caused by surrogate gradients and proposing a surrogate module learning algorithm to alleviate it. This dissertation analyzes the gradient error that surrogate gradients introduce into the chain rule and its accumulation across layers, which limits both the performance and the optimal depth of SNNs. To address this, the proposed surrogate module learning algorithm builds auxiliary pathways that propagate more accurate gradients to earlier layers, mitigating the impact of accumulated gradient errors. The algorithm achieves state-of-the-art performance on multiple datasets, and its applicability to other network models that use surrogate gradients is also demonstrated. With surrogate module learning, an SNN with the ResNet-34 architecture reaches 69.35% test accuracy on ImageNet, a 4.56% improvement over existing approaches.

4. Proposing a novel network architecture for SNNs that significantly improves performance across diverse datasets. This dissertation analyzes the performance of various token mixers and uses a network architecture search to discover encoder structures that adapt well to different datasets. Based on these experiments, it introduces STMixer, a structure that achieves state-of-the-art performance on datasets such as ImageNet at a comparable parameter count. With the proposed structure and the surrogate module learning algorithm, the SNN reaches 75.52% ImageNet accuracy, 0.73% higher than previous structure-design work while using about 72% of the parameters.
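The energy argument in the abstract, that binary spikes turn matrix multiplication into addition, can be illustrated with a minimal sketch. Since every input is 0 or 1, the synaptic current of a neuron reduces to summing the weights at active inputs; no multiplications are required. All names below are illustrative, not from the dissertation.

```python
def synaptic_current(weights, spikes):
    """Accumulate each row's weights only where the input spiked (spikes are 0/1)."""
    return [
        sum(w for w, s in zip(row, spikes) if s == 1)
        for row in weights
    ]

weights = [[0.5, -0.25, 0.125],
           [0.25, 0.75, -0.5]]
spikes = [1, 0, 1]  # binary spike vector from the previous layer

print(synaptic_current(weights, spikes))  # [0.625, -0.25]
```

A dense matrix-vector product would cost one multiply-accumulate per weight; here the multiply disappears entirely, which is the source of the energy advantage on neuromorphic hardware.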
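The rate-coding relationship that conversion algorithms exploit (contribution 1) can also be sketched: an integrate-and-fire neuron driven by a constant input x fires at a rate that tracks ReLU(x), which is why an ANN with ReLU activations can be mapped onto an SNN. This is a hypothetical illustration of the general principle, not the dissertation's conversion algorithm.

```python
def if_firing_rate(x, threshold=1.0, steps=1000):
    """Firing rate of an integrate-and-fire neuron with reset-by-subtraction
    under a constant input current x, simulated for a fixed number of steps."""
    v, n_spikes = 0.0, 0
    for _ in range(steps):
        v += x                 # integrate the input current
        if v >= threshold:
            v -= threshold     # soft reset keeps the residual charge
            n_spikes += 1
    return n_spikes / steps

for x in (-0.5, 0.0, 0.3, 0.7):
    print(x, if_firing_rate(x))  # rate ~ min(max(0, x), 1), i.e. a clipped ReLU
```

Note that the rate only approximates ReLU(x) as the number of steps grows, and it saturates at one spike per step; these residual and clipping effects are exactly the kind of per-layer discrepancy that a conversion-error analysis has to account for.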
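Finally, the surrogate-gradient idea behind direct training (and the gradient error discussed in contribution 3) can be sketched as follows: the forward pass keeps the non-differentiable step function, while the backward pass substitutes a smooth surrogate derivative so that a non-zero gradient still reaches the membrane potential. The sigmoid-shaped surrogate and its sharpness parameter `alpha` are common illustrative choices, not the dissertation's specific formulation.

```python
import math

def spike_forward(v, threshold=1.0):
    """Heaviside step: the neuron fires iff the membrane potential reaches threshold."""
    return 1.0 if v >= threshold else 0.0

def surrogate_grad(v, threshold=1.0, alpha=4.0):
    """Derivative of a sigmoid centred on the threshold, used in backprop in
    place of the step function's true derivative (zero almost everywhere)."""
    s = 1.0 / (1.0 + math.exp(-alpha * (v - threshold)))
    return alpha * s * (1.0 - s)

v = 0.9
upstream = 2.0                       # dL/d(spike) arriving from later layers
print(spike_forward(v))              # 0.0: no spike, v is below threshold
print(upstream * surrogate_grad(v))  # yet a non-zero gradient reaches v
```

Because the surrogate derivative differs from the true one at every layer, the mismatch compounds through the chain rule as networks deepen; this accumulation is precisely what the surrogate module learning algorithm's auxiliary gradient pathways are designed to counteract.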
Keywords/Search Tags: Spiking neural network, Surrogate gradient, Training algorithm, Brain-inspired computing