
Research On Feature Representation And Learning Methods For Neuromorphic Event Stream

Posted on: 2024-02-23    Degree: Doctor    Type: Dissertation
Country: China    Candidate: J F Dong    Full Text: PDF
GTID: 1528307166999199    Subject: Computer Science and Technology

Abstract/Summary:
The neuromorphic vision sensor emulates the biological visual perception process at the device level and represents the visual information of dynamic scenes by encoding relative light-intensity changes into asynchronous event streams. It not only offers high temporal resolution and high-dynamic-range perception, but also greatly reduces information redundancy and supports low-latency, low-power computing. However, the inherent noise, the time-varying action semantics, and the asynchronous, discrete representation of this novel event-based data mean that effective event stream processing and learning methods are still lacking. This paper therefore presents a systematic study of event stream denoising, sample segmentation, feature representation, and learning, and proposes an event-based processing and learning scheme comprising cooperative noise filtering, adaptive sample segmentation, and spatio-temporal and motion feature encoding and learning algorithms. Finally, real-time sign language recognition in dynamic scenes is used as a typical application of the proposed scheme, carrying the work from theoretical methods to practical application. The main contributions of this paper include:

· To address the inherent noise in event streams, three typical noise types are defined and several cooperative noise-filtering algorithms based on different spatio-temporal correlation strategies are proposed. First, two event-based noise filters are introduced: the spatio-temporal noise filter combines spatio-temporal surfaces with time-window constraints to cooperatively filter background and redundant noise, while the event-density noise filter uses a novel time- and index-based event-density representation to aggregate the spatio-temporal correlation of every point in an event's spatio-temporal neighborhood and further removes interference noise (a minimal sketch of the spatio-temporal correlation test appears after this list). Second, two network-based denoising methods achieve adaptive noise filtering: a Hierarchy Of Time-Surfaces (HOTS) model and a Spiking Neural Network (SNN) capture the spatio-temporal correlation between events adaptively through online spatio-temporal prototype clustering and spiking dynamics, respectively. The proposed denoising methods effectively remove the three typical noise types, namely background, redundancy, and interference, and retain the key feature events while enhancing event sparsity.

· For the sample segmentation problem of event streams, a membrane-potential bipolar detection algorithm is proposed for adaptive segmentation, exploiting the correlation between the membrane potential dynamics of LIF neurons and the real motion state. The method uses LIF neurons to integrate the input spatio-temporal information, monitors the local polarization points (peaks and troughs) of the membrane potential, and treats the event stream between two troughs enclosing a peak as one action segment (see the segmentation sketch after this list). In addition, two threshold mechanisms remove interfering motion events at action onset and during action transitions. The algorithm adapts to different motion speeds and supports the partitioning of both cyclic repetitive actions and action sequences.
· For the spatio-temporal feature representation and learning problem of event streams, an event stream classification model based on event-driven spatio-temporal representation and multi-spike learning is proposed. The model consists of a Spatio-Temporal Event Surface (STES) feature descriptor and a multi-spike Tempotron learning algorithm based on local search and gradient clipping (LS-MST). The STES descriptor captures fine-grained spatio-temporal feature differences by fusing the spatial and temporal correlation representations of events, so the event stream can be characterized and encoded more accurately (an illustrative time-surface computation appears after this list). The LS-MST learning algorithm reduces threshold search time and error through a local search strategy, while gradient clipping ensures that the SNN learns the spatio-temporal features of the encoded events efficiently and stably. The proposed model adapts better to complex objects and actions with rich spatio-temporal dynamics, and significantly outperforms other event stream recognition models in model size, convergence speed, and test accuracy. Moreover, the proposed event-based feature representation and learning method retains the asynchronous nature of events, can exploit the intervals between events for computation, and is therefore well suited to online, real-time processing of event streams at fine-grained temporal resolution.

· To address the representation and classification of motion information in action event streams, a plug-and-play event-based motion feature representation module is proposed for SNN-based event stream classification models to improve their motion classification accuracy and speed. The module introduces motion history information and gradient-direction computation into event stream processing, efficiently extracting the motion information of actions and quickly and accurately discriminating motion differences (a sketch of the motion-history and gradient-direction features follows this list). Compared with traditional event feature representations, it offers stronger feature discriminability and faster processing, especially for distinguishing clockwise from counterclockwise actions, which depends on motion spatio-temporal correlation. Three SNN-based classifiers are then designed to make full use of the extracted gradient-direction features and improve action classification accuracy. First, the classic single-layer SNN classifier, the Tempotron, is combined with the proposed features to achieve fast and accurate classification; its accuracy on multiple datasets exceeds that of other features paired with single-layer SNN classifiers and is comparable to that of strong deep SNN models. Second, a deep SNN classifier with event-index-coded input is proposed, which further enlarges the spatial and temporal differences between input event streams, classifies single-action event streams accurately, and obtains the best results among the compared methods. Finally, a Tempotron classifier based on spike-cluster causal sets is proposed, which can detect and recognize actions from gesture-sequence event streams with background noise.

· For sign language recognition in real-world scenarios, an event-based action recognition demonstration system is built, and a new sign language event stream dataset, Sign Language10-DAVIS, is constructed for the demonstration. The system supports neuromorphic dataset recording as well as single and continuous sign language action recognition. Tests in real-world scenarios verify the feasibility and effectiveness of the system for both single and continuous action recognition.
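The spatio-temporal correlation test underlying the event-based filters in the first contribution can be illustrated with a minimal sketch: an event is kept only if some pixel in its spatial neighborhood fired within a recent time window, so isolated background noise is discarded while correlated edge events survive. The function name spatiotemporal_filter, the neighborhood radius, and the window tau are illustrative assumptions, not the dissertation's exact formulation.

    import numpy as np

    def spatiotemporal_filter(events, width, height, radius=1, tau=5000):
        """Keep an event only if a neighboring pixel fired within `tau` microseconds.

        `events` is an iterable of (x, y, t, p) tuples with t in microseconds
        (illustrative layout). Isolated events lack recent spatial support and
        are dropped; correlated events from real edges are retained.
        """
        last_ts = np.full((height, width), -np.inf)  # most recent timestamp per pixel
        kept = []
        for x, y, t, p in events:
            y0, y1 = max(0, y - radius), min(height, y + radius + 1)
            x0, x1 = max(0, x - radius), min(width, x + radius + 1)
            neighborhood = last_ts[y0:y1, x0:x1]
            # spatial support: at least one neighbor fired within the time window
            if np.any(t - neighborhood <= tau):
                kept.append((x, y, t, p))
            last_ts[y, x] = t  # update the time map regardless of the decision
        return kept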
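The membrane-potential bipolar detection in the second contribution can be sketched as a leaky (LIF-like) accumulation of per-window event counts followed by peak/trough detection, with each action segment taken between the two troughs that enclose a peak. The helpers lif_trace and segment_actions, the leak factor, and the use of SciPy's find_peaks are assumptions of this sketch; the two thresholding mechanisms described in the dissertation are omitted.

    import numpy as np
    from scipy.signal import find_peaks

    def lif_trace(event_counts, leak=0.9, gain=1.0):
        """Integrate per-window event counts with a leaky (LIF-like) accumulator."""
        v, trace = 0.0, []
        for c in event_counts:
            v = leak * v + gain * c          # leaky integration of input activity
            trace.append(v)
        return np.asarray(trace)

    def segment_actions(event_counts, prominence=None):
        """Return (start, end) window indices: each segment spans the two troughs
        that enclose a membrane-potential peak, i.e. one candidate action."""
        v = lif_trace(event_counts)
        peaks, _ = find_peaks(v, prominence=prominence)      # local maxima
        troughs, _ = find_peaks(-v, prominence=prominence)   # local minima
        segments = []
        for p in peaks:
            left = troughs[troughs < p]
            right = troughs[troughs > p]
            if left.size and right.size:
                segments.append((int(left[-1]), int(right[0])))
        return segments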
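In the spirit of the STES descriptor in the third contribution (whose exact definition is not reproduced here), a local time surface scores each pixel in an event's neighborhood by how recently it fired, with an exponential decay; local_time_surface and its parameters are illustrative assumptions. A driving loop would update last_ts[y, x] = t after each incoming event.

    import numpy as np

    def local_time_surface(last_ts, x, y, t, radius=3, tau=50000.0):
        """Exponentially decayed time surface in the (2*radius+1)^2 patch around
        event (x, y, t); `last_ts` holds the latest timestamp per pixel (init -1)."""
        h, w = last_ts.shape
        patch = np.zeros((2 * radius + 1, 2 * radius + 1))
        for dy in range(-radius, radius + 1):
            for dx in range(-radius, radius + 1):
                yy, xx = y + dy, x + dx
                if 0 <= yy < h and 0 <= xx < w and last_ts[yy, xx] >= 0:
                    # recently active neighbors score near 1, stale ones near 0
                    patch[dy + radius, dx + radius] = np.exp(-(t - last_ts[yy, xx]) / tau)
        return patch.ravel()  # flattened descriptor for downstream spike encoding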
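The motion feature in the fourth contribution can be approximated by a motion-history map whose gradient orientations summarize where activity has moved most recently; clockwise and counterclockwise gestures then yield distinct orientation histograms. The functions motion_history and gradient_direction_histogram, the decay factor, and the eight-bin histogram are assumptions of this sketch, not the dissertation's module.

    import numpy as np

    def motion_history(event_slices, decay=0.9):
        """Accumulate a motion-history map from binary event frames: recently
        active pixels stay bright, older activity fades with `decay`."""
        mhi = np.zeros_like(event_slices[0], dtype=float)
        for frame in event_slices:
            mhi = np.maximum(decay * mhi, frame.astype(float))
        return mhi

    def gradient_direction_histogram(mhi, bins=8):
        """Histogram of motion-gradient orientations; the gradient direction of
        the motion-history map encodes how the motion front advanced."""
        gy, gx = np.gradient(mhi)
        mag = np.hypot(gx, gy)
        ang = np.arctan2(gy, gx)                 # orientation in (-pi, pi]
        mask = mag > 1e-6                        # ignore static regions
        hist, _ = np.histogram(ang[mask], bins=bins, range=(-np.pi, np.pi),
                               weights=mag[mask])
        return hist / (hist.sum() + 1e-12)       # normalized direction feature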
Keywords/Search Tags:neuromorphic computing, neuromorphic visual sensors, feature representation, spiking neural networks, action recognition