As a key technology for the fifth-generation (5G) wireless communication system, massive multiple-input multiple-output (M-MIMO) can achieve a higher data transmission rate, higher spectrum utilization, and stronger data link reliability. Unfortunately, these benefits come at the expense of a large number of antennas, which makes signal detection computationally very complex. In addition, as the only channel coding technology that can be proved to reach the Shannon limit in binary-input discrete memoryless channels (B-DMCs), polar codes were selected into the 5G standard in 2016 as the coding method for the control channel in the enhanced mobile broadband (eMBB) scenario. However, the decoding performance of the successive cancellation list (S-CL) algorithm, one of the most important decoding algorithms for polar codes, improves as the list size L increases. Therefore, to obtain satisfactory decoding performance, traditional designs usually use a large list size, which results in unaffordable hardware consumption. Meanwhile, the trend toward miniaturized, low-power wearable and mobile devices means that future baseband signal processing chips should feature both smaller area and lower power consumption. In view of these problems, this thesis focuses on how to use approximate computing to reduce the computational complexity and hardware consumption of signal detection in M-MIMO systems and of S-CL decoders.

First, this thesis explores low-complexity, high-performance linear detection for M-MIMO. It establishes a MIMO with multiple-antenna user equipment (MIMO-MAUE) system model, which is better suited to practical communication scenarios, and then exploits the sparsity of the channel matrix in this scenario to propose a low-complexity two-level and block diagonal improved Neumann iterative
approximation (TL-BD-INSA) algorithm. First, by exploiting matrix partitioning, an approximate matrix inversion algorithm based on a two-level Neumann iteration is proposed. Its performance is almost equivalent to that of conventional exact inversion, at lower computational complexity. Next, to accelerate the convergence of the proposed two-level Neumann iteration, an improved normalization factor derived from mathematical expectation is introduced, which both ensures and accelerates convergence. The improved normalization factor can be calculated offline, so it adds no computational burden to the system at run time. Numerical simulation results show that for the 128×32 MIMO-MAUE system in a non-ideal propagation environment, the proposed TL-BD-INSA algorithm is only 0.25 dB away from minimum mean square error (MMSE) detection at a bit error rate (BER) of 10^-3. Further, FPGA implementation results suggest that the TL-BD-INSA detector achieves a hardware efficiency of 1731 bps/slice, which is 1.21× higher than that of the conventional MMSE detector. The proposed TL-BD-INSA detector is therefore suitable for both ideal uncorrelated channels and challenging MIMO-MAUE systems with correlated channels, demonstrating good robustness and low complexity.

Secondly, this thesis studies low-complexity, high-performance message passing detection for M-MIMO and proposes a block-diagonal Neumann-series-based expectation propagation approximation (BD-NS-EPA) algorithm, which is suitable for both ideal uncorrelated channels and correlated channels in the MIMO-MAUE system. First, by transforming the Neumann series from matrix-based iterations to vector-based iterations, a single-level Neumann iteration algorithm based on matrix partitioning is proposed, which decreases the computational complexity and latency. Then, an adjustable sorting message
updating (ASMU) strategy is proposed to reduce the redundant calculations of converged nodes in each iteration. Meanwhile, a normalization factor is also introduced to accelerate the convergence of the iterations. In addition, a simplification strategy based on hard decisions is adopted to simplify the exponential operations during the iterations. Numerical results show that, for the 128×32 MIMO-MAUE system in a non-ideal propagation environment, the proposed BD-NS-EPA algorithm is about 0.3 dB away from EP detection at BER = 10^-3, at a normalized complexity of only 3%. Implementation results based on SMIC 65 nm CMOS technology suggest that the proposed BD-NS-EPA detector achieves hardware efficiencies of 1.483 Gbps/W and 0.326 Mbps/kGE, further demonstrating a good trade-off between error-rate performance and hardware efficiency.

Next, this thesis studies a low-complexity S-CL decoder for polar codes based on stochastic computing. A two-level decoding strategy is proposed, which effectively reduces the latency of stochastic-computing-based decoding. Further, a low-complexity adaptive distributed sorting (ADS) algorithm for the two-level decoding strategy is proposed. By exploiting the properties of the nodes in the polar code decoding tree, the selection of the L best paths from 2L or 4L candidate paths is executed with low complexity. Numerical results show that the stochastic-computing-based S-CL decoder with list size 2L achieves slightly better performance than the binary-computing-based S-CL decoder with list size L. FPGA implementation results show that the ALMs and registers consumed by the stochastic-computing-based S-CL decoder are only 5.6% and 14.2%, respectively, of those consumed by the binary implementation.

Finally, as part of the research on neural-network-assisted MIMO detector design, this thesis explores the accelerated design
of the convolution kernel in convolutional neural networks and proposes a low-complexity convolution architecture based on the fast FIR algorithm (FFA) and stochastic computing, named FFA-PSB. First, by exploiting the inherent parallelism of the FFA, an efficient two-dimensional (2D) convolution architecture based on hybrid stochastic and binary computing is proposed. The hybrid computing scheme ensures high computing accuracy while reducing the hardware implementation complexity as much as possible. Next, by proposing a parallel input mode and combining the Sobol low-discrepancy sequence with two-line stochastic computing, the hardware consumption of the delay module and the number of computation cycles of the proposed architecture are effectively reduced. Experimental results from applying the FFA-PSB convolution architecture to the LeNet-5 convolutional neural network show that the FFA-PSB-based implementation achieves accuracy close to that of the traditional binary fixed-point convolution scheme. Implementation results based on the SMIC 65 nm CMOS process further show that the proposed FFA-PSB convolution architecture achieves 1.5× the area efficiency and 1.3× the energy efficiency of the latest stochastic-computing-based convolution accelerator.
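
As background for the stochastic-computing designs summarized above, the following is a minimal Python sketch of unipolar stochastic multiplication: a value in [0, 1] is encoded as the fraction of ones in a bitstream, and a bitwise AND of two independent streams estimates the product. It is a generic illustration, not the S-CL decoder or the FFA-PSB architecture of this thesis. The function names, the stream length, and the use of a seeded pseudo-random generator (rather than a Sobol sequence) are assumptions made for this example.

```python
import random


def to_stream(p, n_bits, rng):
    """Encode a value p in [0, 1] as a unipolar stochastic bitstream.

    Each bit is 1 with probability p, so the fraction of ones approximates p.
    """
    return [1 if rng.random() < p else 0 for _ in range(n_bits)]


def from_stream(bits):
    """Decode a unipolar bitstream back to a value: the fraction of ones."""
    return sum(bits) / len(bits)


def sc_multiply(a, b, n_bits=4096, seed=0):
    """Estimate a * b with a bitwise AND of two independent bitstreams.

    n_bits and seed are illustrative defaults (assumptions), not values
    taken from the thesis.
    """
    rng = random.Random(seed)
    stream_a = to_stream(a, n_bits, rng)
    stream_b = to_stream(b, n_bits, rng)
    # For independent streams, P(x AND y = 1) = P(x = 1) * P(y = 1).
    product_stream = [x & y for x, y in zip(stream_a, stream_b)]
    return from_stream(product_stream)


if __name__ == "__main__":
    print(sc_multiply(0.5, 0.6))  # an estimate of 0.3, not an exact value
```

The multiplier itself is a single AND gate per bit, which is where the hardware savings of stochastic computing come from; the price is that accuracy depends on the stream length, so longer streams trade latency for precision.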
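
For reference, the Neumann-series matrix inversion that the detection algorithms above build on can be written as follows. This is the standard textbook form under assumed notation, not the specific two-level TL-BD-INSA or vector-based BD-NS-EPA iteration, whose partitioning and normalization factor are not given in this abstract.

```latex
% Assumed notation (not taken from the thesis): A is the square matrix to be
% inverted, split as A = D + E, where D is an easily invertible part
% (for example diagonal or block-diagonal) and E is the remainder.
% \rho(\cdot) denotes the spectral radius.
A^{-1} \;=\; \sum_{k=0}^{\infty} \left(-D^{-1}E\right)^{k} D^{-1}
\quad \text{if } \rho\!\left(D^{-1}E\right) < 1,
\qquad
A^{-1} \;\approx\; \sum_{k=0}^{K-1} \left(-D^{-1}E\right)^{k} D^{-1}.
% Example with K = 2 terms:
A^{-1} \;\approx\; D^{-1} - D^{-1} E\, D^{-1}.
```

Truncating after K terms replaces an exact inversion with a few matrix products, and the approximation error shrinks faster the smaller the spectral radius of D^{-1}E is, which is why a well-chosen D or scaling factor matters for convergence.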