
The Application Of Neural Network In Baseband Processing And Its Efficient Implementation

Posted on: 2021-05-01  Degree: Master  Type: Thesis
Country: China  Candidate: W H Xu  Full Text: PDF
GTID: 2518306473999989  Subject: Communication and Information System
Abstract/Summary:
The ultra-high speed of future networks poses a huge challenge to the hardware implementation of baseband processors. A baseband chip not only needs to handle a variety of communication algorithms, but also needs strong computing power. The trend toward miniaturization and low power consumption means that future baseband chips must be both small and power-efficient. With Moore's Law slowing down, these challenges make baseband chips difficult to design and manufacture. Moreover, existing advanced baseband algorithms face problems such as intractable optimal detection and estimation, models that cannot be solved exactly, and a lack of joint optimization of algorithms and hardware.

First, this paper presents deep learning (DL) methods to optimize polar belief propagation (BP) decoding and concatenated polar-LDPC codes. Two-dimensional offset Min-Sum (2-D OMS) decoding is proposed to improve the error-correction performance of existing normalized Min-Sum (NMS) decoding. Two optimization methods used in DL, namely back-propagation and stochastic gradient descent, are exploited to derive the parameters of the proposed algorithms. Numerical results demonstrate that there is no performance gap between 2-D OMS and exact BP at various code lengths. Then low-complexity concatenated OMS algorithms are presented for concatenated polar-LDPC codes. The optimized concatenated OMS decoding yields error-correction performance comparable to that of a CRC-aided successive cancellation list (CA-SCL) decoder with list size 2 on length-1024 polar codes.

The aforementioned optimization methods are extended to massive multiple-input multiple-output (MIMO) systems. Deep neural networks (DNNs) are utilized to improve message passing detectors (MPDs). A general framework for constructing DNN architectures for MIMO detection is first introduced by unfolding iterative MPDs. DNN MIMO detectors are then proposed based on modified MPDs, including damped belief
propagation (dBP), max-sum (MS) BP, and simplified channel-hardening-exploiting message passing (CHEMP). The correction factors are optimized via DL methods for better performance. Numerical results demonstrate that, compared with state-of-the-art (SOA) detectors including minimum mean-squared error (MMSE), BP, and CHEMP, the proposed DNN detectors achieve a better bit-error rate (BER) and improved robustness against various antenna and channel conditions at similar complexity. The DNN needs to be trained only once and can be reused for multiple detections, which ensures its high efficiency.

We also study the equalization problem over nonlinear channels using neural networks. A joint equalizer and decoder based on neural networks is proposed to realize blind equalization and decoding without knowledge of the channel state information (CSI). Different from previous methods, we use two neural networks instead of one. First, a convolutional neural network (CNN) is used to adaptively recover the transmitted signal from channel impairments and nonlinear distortions. Then a deep neural network decoder (NND) decodes the signal detected by the CNN equalizer. Under various channel conditions, experimental results demonstrate that the proposed CNN equalizer achieves better performance than other machine-learning-based methods. The proposed model reduces the number of parameters by about 2/3 compared to state-of-the-art counterparts. Besides, our model can easily be applied to long sequences with O(n) complexity.

Aiming at low-complexity hardware implementation, this paper develops quantization schemes and optimization strategies for various tasks, including polar decoding and neural network inference. The contributions lie in three aspects: (a) We present a method to determine the fixed-point quantization scheme and the optimal LLR scaling factor for the polar BP decoder, providing guidelines for the corresponding hardware design. (b) A low-bit and retraining-free quantization
method, which enables CNN inference with only shift and add operations, is proposed. Experimental results show that our method achieves higher accuracy on ImageNet than other low-precision networks that skip retraining. Compression of 5× to 8× is obtained on popular models compared to full-precision counterparts, while hardware implementation shows a good reduction in slices with maintained throughput. (c) To design and optimize efficient NNs in communication systems, we propose an iterative optimization framework with retraining to find the quantization scheme for different NNs. Moreover, an efficient design of convolutional neural networks is presented to reduce the required parameters and computational complexity. On modulation classification, channel decoding, and equalization tasks, the quantized NN models achieve performance comparable to full-precision models with only 4 to 5 weight bits and 8-bit activations. The size of the optimized models is significantly compressed, and the hardware complexity of NN inference is also reduced.

Beyond algorithm-level optimization, this paper also focuses on developing energy-efficient and reconfigurable hardware architectures for polar decoders and neural networks. Efficient hardware architectures for a scalable polar OMS decoder are described. The proposed decoder is reconfigurable to support three code lengths (N = 256, 512, 1024) and two decoding algorithms (2-D OMS and concatenated OMS). The polar OMS decoder implemented in 65 nm CMOS technology achieves a maximum coded throughput of 5.4 Gb/s for code length 1024 and 7.5 Gb/s for code length 256, which is comparable to state-of-the-art polar BP decoders. Moreover, a 5.1 Gb/s throughput is achieved in concatenated OMS decoding mode for code length 1024 with a latency of 200 ns, which is superior to existing CA-SCL decoders of similar error-correction performance.

Prior works exploit fast algorithms, such as Winograd and the fast Fourier transform (FFT), to
reduce the complexity of spatial convolution in CNNs. This paper proposes a reconfigurable, low-complexity ASIC accelerator for both CNNs and GANs to further accelerate the convolution (CONV) in CNNs as well as the transposed convolution (TCONV) in generative adversarial networks (GANs). First, by exploiting the Fermat number transform (FNT), we propose two FNT-based fast algorithms to reduce the complexity of CONV and TCONV computations, respectively. Then the architectures of the FNT-based accelerator are presented to implement the proposed fast algorithms. The methodology for determining the design parameters and optimizing the dataflow is also described, so as to obtain maximum performance and optimal efficiency. Moreover, we implement the proposed accelerator in 65 nm 1P9M technology and evaluate it on various CNN and GAN models. The post-layout results show that our design achieves a throughput of 288.0 GOP/s on VGG-16 with an area efficiency of 25.11 GOP/s/mm², which is superior to state-of-the-art CNN accelerators. Furthermore, at least 1.7× speed-up over existing accelerators is obtained on GANs. The resulting energy efficiency is 275.3× and 12.5× that of CPU and GPU implementations, respectively.
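The idea behind the 2-D OMS optimization, learning Min-Sum correction offsets with back-propagation and stochastic gradient descent, can be illustrated on a single check node. The following is a minimal sketch, not the thesis's actual training setup: it fits one scalar offset `beta` so that the offset Min-Sum update approximates the exact BP box-plus operation. The Gaussian LLR distribution, batch size, and learning rate are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def boxplus(a, b):
    # Exact check-node update used by belief propagation.
    return 2.0 * np.arctanh(np.clip(np.tanh(a / 2.0) * np.tanh(b / 2.0),
                                    -0.999999, 0.999999))

def oms(a, b, beta):
    # Offset Min-Sum approximation with a learnable offset beta.
    s = np.sign(a) * np.sign(b)
    return s * np.maximum(np.minimum(np.abs(a), np.abs(b)) - beta, 0.0)

beta, lr = 0.0, 0.01
for _ in range(2000):
    a, b = rng.normal(0, 2, 64), rng.normal(0, 2, 64)  # random LLR mini-batch
    t = boxplus(a, b)                                  # training target
    y = oms(a, b, beta)
    # Hand-derived gradient of the MSE loss w.r.t. beta,
    # with the max(., 0) gate handled by an "active" mask.
    active = (np.minimum(np.abs(a), np.abs(b)) - beta) > 0
    grad = np.mean(2.0 * (y - t) * (-np.sign(a) * np.sign(b)) * active)
    beta -= lr * grad                                  # SGD step

print(beta)  # learned offset; positive, since Min-Sum overestimates magnitudes
```

In the thesis the offsets are two-dimensional (per iteration and per stage of the polar factor graph); this sketch collapses them to a single shared scalar to keep the mechanics visible.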
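The "unfolding" construction for DNN MIMO detectors turns each iteration of an iterative detector into one network layer with tunable correction factors. As a stand-in for message passing, this sketch uses a damped Richardson iteration on the MMSE normal equations; the antenna counts, noise level, and fixed damping factor are illustrative assumptions, and in a trained detector the per-layer damping factors would be learned by back-propagation rather than fixed.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy MIMO setup (real-valued for simplicity): y = H x + n.
nt, nr, sigma2 = 4, 8, 0.1
H = rng.normal(0, 1, (nr, nt)) / np.sqrt(nr)
x_true = rng.choice([-1.0, 1.0], nt)               # BPSK symbols
y = H @ x_true + rng.normal(0, np.sqrt(sigma2), nr)

# MMSE normal equations: (H^T H + sigma^2 I) x = H^T y.
A = H.T @ H + sigma2 * np.eye(nt)
b = H.T @ y

def unfolded_detect(damping, iters=200):
    # Each "layer" applies one damped update; damping stabilizes the
    # iteration exactly as dBP damps messages between BP iterations.
    x = np.zeros(nt)
    for _ in range(iters):
        x_new = x + (b - A @ x)                    # undamped update
        x = (1 - damping) * x + damping * x_new    # damped combination
    return x

x_hat = unfolded_detect(damping=0.5)
print(np.sign(x_hat))  # hard decisions on the detected symbols
```

The appeal of unfolding, as the abstract notes, is that the handful of learned scalars transfer across channel realizations: training happens once, and detection reuses the same factors.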
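One common way to obtain shift-and-add-only inference, in the spirit of the low-bit retraining-free scheme described above, is to round each weight to the nearest signed power of two, so that every multiplication becomes a bit shift plus sign handling. This is a sketch of the general technique, not the thesis's exact quantizer; the function name and bit budget are assumptions.

```python
import numpy as np

def quantize_pow2(w, bits=4):
    # Round each weight to the nearest signed power of two.
    # The exponent is clipped to the range representable in `bits` bits,
    # so a hardware multiplier reduces to a barrel shifter.
    sign = np.sign(w)
    mag = np.where(np.abs(w) == 0, 1e-12, np.abs(w))   # guard log2(0)
    exp = np.clip(np.round(np.log2(mag)),
                  -(2 ** (bits - 1)), 2 ** (bits - 1) - 1)
    return sign * (2.0 ** exp)

w = np.array([0.8, -0.3, 0.05, -1.7])
print(quantize_pow2(w))  # -> [ 1.     -0.25    0.0625 -2.    ]
```

Rounding in the log domain keeps the relative quantization error roughly uniform across weight magnitudes, which is one reason such schemes can work without retraining.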
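The FNT-based fast convolution can be sketched with the Fermat prime F4 = 2^16 + 1 = 65537: all arithmetic is exact modular integer arithmetic, so unlike the FFT there is no rounding error, and on hardware the reductions modulo 2^16 + 1 map to shifts and adds. The naive O(n²) transform below is for clarity only; an accelerator would use a radix-2 butterfly structure.

```python
P = 65537  # Fermat prime F4 = 2^16 + 1

def ntt(a, root):
    # Naive number-theoretic transform mod P (O(n^2), for illustration).
    n = len(a)
    return [sum(a[j] * pow(root, i * j, P) for j in range(n)) % P
            for i in range(n)]

def fnt_circular_conv(a, b):
    # Circular convolution via the Fermat number transform:
    # transform, multiply pointwise, inverse-transform, rescale by 1/n.
    n = len(a)
    g = pow(3, (P - 1) // n, P)     # n-th root of unity (3 generates Z_P*)
    A, B = ntt(a, g), ntt(b, g)
    C = [(x * y) % P for x, y in zip(A, B)]
    g_inv = pow(g, P - 2, P)        # modular inverses via Fermat's little theorem
    n_inv = pow(n, P - 2, P)
    return [(v * n_inv) % P for v in ntt(C, g_inv)]

print(fnt_circular_conv([1, 2, 3, 4], [5, 6, 7, 8]))  # -> [66, 68, 66, 60]
```

Linear convolution, as needed for CONV and TCONV layers, follows by zero-padding both sequences to the combined length before the circular convolution, with intermediate values kept below P to avoid wrap-around.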
Keywords/Search Tags: Deep learning, neural network, polar codes, massive MIMO, channel equalization, hardware architecture, pipelining, fast convolution