
Research and Implementation of a Floating-Point Division and Square Root Unit Based on a Unified Structure

Posted on: 2016-12-18
Degree: Master
Type: Thesis
Country: China
Candidate: T T He
GTID: 2348330536967644
Subject: Software engineering
Abstract/Summary:
Applications such as image processing, scientific computing, and signal processing require a large number of floating-point operations, so hardware support for floating-point arithmetic has become a critical feature of high-performance computer systems, embedded systems, and mobile devices. Hardware implementations of addition and multiplication have become increasingly efficient, but implementations of division and square root still lag behind. This thesis analyzes several hardware algorithms for division and square root. Because the two operations are similar, it implements unified division and square root units for the X-DSP architecture based on two algorithms, SRT and Goldschmidt, and selects one of the two designs for use in X-DSP.

First, a unified division and square root unit based on the Goldschmidt algorithm was designed and implemented. After an analysis of look-up table algorithms, the direct table method and the bipartite table method were adopted to obtain the initial iteration values for division and square root, respectively. Additional iteration control logic keeps the multipliers in the iteration unit fully pipelined, which raises throughput. Reusing the existing multipliers in X-DSP sharply reduces resource cost. Experimental results show that this implementation is fast and has acceptable area overhead. However, it affects multiplication performance and makes it difficult to support the four rounding modes.

Therefore, a unified division and square root unit based on the SRT-8 algorithm was also designed and implemented. It uses separate mantissa computation and normalization units, and special instructions were designed to split the iteration so that long-latency instructions do not have adverse effects. A parallel structure in the mantissa computation unit reduces iteration latency, and the “on-the-fly” technique was improved to reduce logic complexity. Finally, the performance and cost of this implementation were analyzed based on experimental results.

Comparing the two implementations, the Goldschmidt-based design is suited to high-speed operation, with lower area cost and higher throughput. The SRT-8-based design, however, does not need to reuse other functional units and produces the remainder directly, which makes it easy to support the four rounding modes. X-DSP must support four rounding modes and handles a large number of multiplications in practical applications, so the impact of the Goldschmidt-based design on the multipliers cannot be ignored. The SRT-8-based design was therefore implemented in X-DSP.
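To make the Goldschmidt approach described in the abstract more concrete, the following is a minimal software sketch of Goldschmidt iteration for division and square root, seeded from a small look-up table. It is not the thesis's hardware design: the 6-bit table index, the use of a simple direct table for both operations (the thesis uses a bipartite table for one of them), the iteration count, and all function and constant names are assumptions chosen for illustration, and double-precision Python floats stand in for fixed-width mantissa datapaths.

```python
import math

TABLE_BITS = 6                 # assumed index width, not taken from the thesis
TABLE_SIZE = 1 << TABLE_BITS

# Direct tables: one entry per interval, evaluated at the interval midpoint.
# RECIP_TABLE approximates 1/x for x in [1, 2).
RECIP_TABLE = [1.0 / (1.0 + (i + 0.5) / TABLE_SIZE) for i in range(TABLE_SIZE)]
# RSQRT_TABLE approximates 1/sqrt(x) for x in [1, 4).
RSQRT_TABLE = [1.0 / math.sqrt(1.0 + 3.0 * (i + 0.5) / TABLE_SIZE)
               for i in range(TABLE_SIZE)]


def goldschmidt_divide(n, d, iterations=4):
    """Approximate n / d for finite n and finite d > 0."""
    if not (math.isfinite(n) and math.isfinite(d)) or d <= 0.0:
        raise ValueError("sketch handles finite n and finite d > 0 only")
    m, e = math.frexp(d)       # d = m * 2**e with m in [0.5, 1)
    dm, e = m * 2.0, e - 1     # renormalize the divisor mantissa to [1, 2)
    k = RECIP_TABLE[int((dm - 1.0) * TABLE_SIZE)]  # initial estimate of 1/dm
    q, r = n * k, dm * k       # r is now close to 1
    for _ in range(iterations):
        f = 2.0 - r            # correction factor
        q *= f                 # numerator converges to n / dm
        r *= f                 # denominator converges to 1
    return math.ldexp(q, -e)   # undo the exponent scaling


def goldschmidt_sqrt(s, iterations=4):
    """Approximate sqrt(s) for finite s > 0."""
    if not math.isfinite(s) or s <= 0.0:
        raise ValueError("sketch handles finite s > 0 only")
    m, e = math.frexp(s)
    sm, e = m * 2.0, e - 1     # mantissa in [1, 2)
    if e % 2:                  # force an even exponent, mantissa in [1, 4)
        sm, e = sm * 2.0, e - 1
    index = min(int((sm - 1.0) * TABLE_SIZE / 3.0), TABLE_SIZE - 1)
    y = RSQRT_TABLE[index]     # initial estimate of 1/sqrt(sm)
    g, h = sm * y, 0.5 * y     # g -> sqrt(sm), h -> 1 / (2 * sqrt(sm))
    for _ in range(iterations):
        r = 0.5 - g * h        # residual, shrinks quadratically
        g += g * r
        h += h * r
    return math.ldexp(g, e // 2)
```

In both loops the two updates per iteration do not depend on each other, only on the shared correction term, which is one reason this style of iteration maps well onto pipelined multipliers. Note also that the sketch never forms a remainder, which is consistent with the abstract's remark that supporting all rounding modes is harder for the Goldschmidt-based design.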
Keywords/Search Tags:division, square root, DSP, floating-point operation, SIMD