| This thesis presents the design of low-power bit-serial and digit-serial DSP systems. Digit-serial systems process a fixed number of bits (the digit size) in each clock cycle, rather than an entire word as the more common bit-parallel systems do. They promise to be more area- and power-efficient than their bit-parallel counterparts, and are well suited to applications with moderate sample rates. A digit-serial system reduces to a bit-serial system when the digit size equals one. This thesis explores digit-serial implementation styles from three aspects. First, it addresses the design of high-performance digit-serial arithmetic units, including adders, multipliers and complex-number multipliers, which form the backbone of DSP systems. The proposed arithmetic architectures remove the limitation of digit-level pipelining found in conventional architectures and permit fine-grain pipelining, which can lead to a high throughput rate or very low power consumption. Second, for the same target throughput rate, it presents a general qualitative and quantitative comparison of the bit-parallel and digit-serial approaches for programmable and dedicated architectures. It is shown that the digit-serial approach favors the design of dedicated architectures, and the reasons why programmable processors are not suited to digit-serial implementation are identified. Finally, several important DSP algorithms, such as FIR/IIR filters, the FFT and Viterbi decoders, are implemented with bit-serial and digit-serial architectures to demonstrate practical applications of the approach. In addition, this thesis presents a heuristic algorithm for high-level synthesis of simple DSP filtering applications using heterogeneous (bit-parallel, digit-serial and bit-serial) functional units.
The algorithm not only generates a near-optimum synthesis schedule in significantly less time, but also illustrates that bit-serial and digit-serial arithmetic units can be integrated with bit-parallel ones to reduce overall cost without degrading throughput. |
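
To make the digit-serial idea concrete, the following is a minimal behavioral sketch in Python of an unsigned digit-serial addition. It is not taken from the thesis and does not model the proposed architectures or their pipelining; the function name `digit_serial_add` and the parameters `word_length` and `digit_size` are assumptions introduced for illustration. Each loop iteration stands for one clock cycle in which one digit of each operand is consumed, least significant digit first.

```python
def digit_serial_add(a, b, word_length=8, digit_size=2):
    """Add two unsigned words one digit per 'clock cycle' (illustrative model only).

    Returns (sum modulo 2**word_length, carry_out, cycles_used).
    """
    if word_length % digit_size != 0:
        raise ValueError("word_length must be a multiple of digit_size")

    mask = (1 << digit_size) - 1          # selects one digit
    cycles = word_length // digit_size    # digits per word = cycles per word
    carry = 0
    result = 0

    for cycle in range(cycles):
        shift = cycle * digit_size
        digit_a = (a >> shift) & mask     # digit of a arriving in this cycle
        digit_b = (b >> shift) & mask     # digit of b arriving in this cycle
        total = digit_a + digit_b + carry
        result |= (total & mask) << shift # digit of the sum produced this cycle
        carry = total >> digit_size       # carry held for the next cycle

    return result, carry, cycles


if __name__ == "__main__":
    # digit_size=1 is the bit-serial case; digit_size=word_length is bit-parallel.
    for d in (1, 2, 4, 8):
        print(d, digit_serial_add(100, 57, word_length=8, digit_size=d))
```

In this model the digit size only changes how many cycles a word takes, not the result: a digit size of one corresponds to the bit-serial case, and a digit size equal to the word length corresponds to the bit-parallel case.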