Research On Key Technology Of Accelerating Floating-Point Matrix Multiplication Based On FPGA In Embedded Environment

Posted on: 2014-12-26  Degree: Master  Type: Thesis
Country: China  Candidate: T Zhang  Full Text: PDF
GTID: 2298330425984180  Subject: Information and Communication Engineering
Abstract/Summary:
As a basic digital signal processing algorithm, floating-point matrix multiplication is widely applied in communication, networking, industrial, medical and other areas. The implementation of these applications mainly depends on embedded systems. However, floating-point matrix multiplication has become a bottleneck for computing performance in embedded systems because of its high computational complexity and low efficiency. An FPGA (Field Programmable Gate Array) co-processor is parallel, fast, programmable and flexible, and has become an effective way to enhance the computing performance of embedded systems. Therefore, research on accelerating floating-point matrix multiplication with an FPGA in an embedded environment is very important.

Floating-point matrix multiplication is the key computing part of the mathematical separation algorithm in three-dimensional fluorescence. Based on an analysis of the floating-point matrix multiplication algorithm and the structure of FPGA hardware, this paper proposes a novel pipelined parallel architecture to implement floating-point matrix multiplication. To accelerate this floating-point matrix multiplication in an embedded environment, this paper also studies the communication mechanism between heterogeneous processors. The main content of this paper is as follows:

The multiply accumulator is the core computing unit of matrix multiplication. By analyzing the calculation performed by the multiply accumulator in each clock cycle, this paper proposes a pipelined floating-point multiply-accumulate structure based on IP (Intellectual Property) cores for the floating-point multiplier and adder. In this structure, once the data have passed through the floating-point multiplier and adder, the only remaining step is to add together the results of the last N stages of the floating-point adder. Moreover, the number of pipeline stages in this structure can be adjusted to adapt to the needs of different applications.

Based on the above multiply accumulator, this paper designs a novel floating-point matrix multiplier with a parallel architecture, which reduces computational complexity and improves computing speed. In this matrix multiplier, the row and column dimensions of the matrices can be configured, and the number of PEs (Processing Elements) can also be set according to the resources of the FPGA. There is no communication between neighboring PEs, so the matrix multiplier scales well.

To address communication between the FPGA co-processor and the embedded system processor, this paper designs communication structures based on UART and the PCI-E bus. In the PCI-E communication structure, an FPGA design based on the SOPC (System-on-a-Programmable-Chip) structure is combined with the embedded processor driver to achieve cooperative operation of hardware and software.

This paper implements the floating-point multiply accumulator and the matrix multiplier in the Verilog hardware description language, and analyzes their performance using the results of simulation and synthesis. To evaluate the acceleration of floating-point matrix multiplication in an embedded environment, this paper designs communication systems based on the UART and PCI-E protocols on the Intel E6x5C platform. Experimental results show that adopting the PCI-E bus to accelerate floating-point matrix multiplication improves computing speed by 10 times and 200 times on the Cortex A9 and ARM9 embedded platforms, respectively. Therefore, this acceleration method can effectively improve floating-point computing performance in embedded systems.
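
The multiply-accumulate structure and the independent PEs described in the abstract can be illustrated with a small software model. This is only a sketch of one plausible reading, not the thesis's Verilog design: it assumes that an adder with N pipeline stages keeps N interleaved partial sums that are added together once at the end, and that rows of the result are handed to PEs in round-robin order. The names `pipelined_mac`, `matmul_with_pes`, `n_stages` and `num_pe` are hypothetical and do not come from the thesis.

```python
from typing import List

Matrix = List[List[float]]


def pipelined_mac(a: List[float], b: List[float], n_stages: int) -> float:
    """Software model of a multiply accumulator with an n_stages-deep adder.

    Assumption: because a sum needs n_stages cycles to leave the adder,
    product k is accumulated into partial sum k % n_stages. After the last
    product, the n_stages partial sums are added together once.
    """
    if len(a) != len(b):
        raise ValueError("input vectors must have the same length")
    if n_stages < 1:
        raise ValueError("n_stages must be at least 1")
    partial = [0.0] * n_stages
    for k, (x, y) in enumerate(zip(a, b)):
        partial[k % n_stages] += x * y
    # Final reduction of the partial sums still held in the adder pipeline.
    return sum(partial)


def matmul_with_pes(A: Matrix, B: Matrix, num_pe: int, n_stages: int) -> Matrix:
    """Compute C = A * B with rows of C split across num_pe independent PEs.

    Assumption: PE number pe handles rows pe, pe + num_pe, pe + 2 * num_pe, ...
    No PE reads another PE's results, mirroring the claim that neighboring
    PEs do not communicate.
    """
    if not A or not B or num_pe < 1:
        raise ValueError("matrices must be non-empty and num_pe at least 1")
    inner, cols = len(B), len(B[0])
    if any(len(row) != inner for row in A):
        raise ValueError("inner dimensions of A and B do not match")
    # Pre-extract the columns of B so each PE streams plain vectors.
    b_columns = [[B[k][j] for k in range(inner)] for j in range(cols)]
    C = [[0.0] * cols for _ in range(len(A))]
    for pe in range(num_pe):  # in hardware these loops would run in parallel
        for i in range(pe, len(A), num_pe):
            for j in range(cols):
                C[i][j] = pipelined_mac(A[i], b_columns[j], n_stages)
    return C


if __name__ == "__main__":
    A = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
    B = [[7.0, 8.0], [9.0, 10.0], [11.0, 12.0]]
    print(matmul_with_pes(A, B, num_pe=2, n_stages=3))
```

In this model, changing `num_pe` only changes which PE computes which rows, and changing `n_stages` only changes how the products are grouped before the final reduction. With real floating-point data the grouping can alter rounding slightly, which is a property of reordering floating-point sums in general.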
Keywords/Search Tags: Floating-Point Matrix Multiplication, Multiply Accumulator, FPGA Acceleration, Embedded System, PCI-E Bus