| In recent years, with the wide adoption of intelligent portable devices, increasingly complex application algorithms have placed ever higher demands on the performance of embedded microprocessors. Traditional single-core general-purpose processors have long execution cycles and low efficiency when running deep learning and other computation-intensive, highly parallel algorithms, and can no longer meet application requirements. It is therefore of great significance to study processors that can be configured by instructions according to the characteristics of different algorithms, so as to meet the performance requirements of different application scenarios. Based on an analysis of the RISC-V instruction set architecture and of existing mainstream artificial intelligence algorithms, this thesis proposes a processor architecture with configurable computing granularity based on RISC-V extended instructions. On top of the existing RISC-V instruction set, the processor adds five 128-bit custom extended instructions, designed around the data characteristics and operation types of algorithms containing highly parallel operations: three computation instructions (mac, parallel and compare) and two memory access instructions (vload and vstore). The computation instructions configure the same operation unit according to the algorithm's requirements and complete operations of different types and different degrees of parallelism in a single instruction. The memory access instructions access multiple data items at once. Using the custom extended instructions reduces the number of instructions in an algorithm and improves the processor's execution efficiency. The processor in this thesis is an improved and optimized version of Zion, a general-purpose processor with a four-stage pipeline micro-architecture designed by our research group. Pipeline functional components that support different highly parallel algorithms are designed, including the instruction fetch module, the decoder module and the execute module. The whole processor is implemented in the Verilog hardware description language, and functional simulation of its internal modules is carried out with VCS. A 28×28 white blood cell image is used as the test sample, and image algorithms implemented with the custom extended instructions are used as test programs for an overall test of the computing-granularity-configurable processor, which verifies that the processor designed in this thesis executes the algorithms correctly. Logic synthesis of the processor is performed with the DC tool. Under the UMC 40 nm process, the processor operates at 250 MHz with an area of 50753.0235 μm². When executing various image algorithms, the speed of the Zion processor is increased by a factor of 5.95 to 22.21, and the number of instructions is reduced by a factor of 1.16 to 7.49. |
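
The idea of configurable computing granularity can be illustrated with a small behavioral model. The abstract does not give the encoding or exact semantics of the mac instruction, so everything below is an assumption made for illustration: the lane widths of 8, 16 and 32 bits, the unsigned wraparound arithmetic, and the helper names `split_lanes`, `pack_lanes` and `mac` are all hypothetical. Only the 128-bit operand width comes from the text.

```python
REG_BITS = 128  # operand width stated in the abstract


def split_lanes(value, lane_bits):
    """Split a 128-bit integer into unsigned lanes, lowest lane first."""
    mask = (1 << lane_bits) - 1
    return [(value >> (i * lane_bits)) & mask
            for i in range(REG_BITS // lane_bits)]


def pack_lanes(lanes, lane_bits):
    """Pack a list of lane values into one integer, lowest lane first."""
    mask = (1 << lane_bits) - 1
    value = 0
    for i, lane in enumerate(lanes):
        value |= (lane & mask) << (i * lane_bits)
    return value


def mac(acc, a, b, lane_bits):
    """Lane-wise acc + a * b, wrapping at the lane width.

    Hypothetical semantics: the same routine serves every granularity,
    and lane_bits plays the role of the configuration field.
    """
    if lane_bits not in (8, 16, 32):  # assumed set of granularities
        raise ValueError("unsupported lane width")
    lanes = [c + x * y
             for c, x, y in zip(split_lanes(acc, lane_bits),
                                split_lanes(a, lane_bits),
                                split_lanes(b, lane_bits))]
    return pack_lanes(lanes, lane_bits)  # pack_lanes masks each lane


# Four 32-bit multiply-accumulates issued as one operation.
a = pack_lanes([1, 2, 3, 4], 32)
b = pack_lanes([10, 20, 30, 40], 32)
acc = pack_lanes([100, 100, 100, 100], 32)
print(split_lanes(mac(acc, a, b, 32), 32))  # [110, 140, 190, 260]
```

Under these assumptions, selecting 8-bit lanes would process sixteen elements per operation where a scalar loop would issue sixteen separate multiply-accumulates, which is the kind of instruction-count reduction the abstract describes.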