In this thesis, we present a cache-based mediaprocessor memory system that requires minimal programmer input but can transfer data between the processor and main memory as efficiently as a DMA controller. We have tested this new memory system with a set of multimedia functions. The data transfer is efficient for three reasons. First, it uses a hardware prefetcher to tolerate main memory latency. Second, it transfers both input and output data in blocks to minimize page misses in the off-chip memory and sustain high main memory throughput. Third, it uses a no-write-allocate write-miss policy to make efficient use of main memory bandwidth. The simulation results show that our cache-based memory system reduces execution time by 53.9% on average compared to a baseline cache-based architecture. In comparison, a DMA controller reduces execution time by 56.0% on average over the same baseline.

The thesis is that cache-based mediaprocessors are desirable for their simple programming paradigm despite their lower performance, and that the efficient main memory data transfer of a DMA-based mediaprocessor can be effectively modeled and incorporated into cache-based mediaprocessors. This efficient data transfer can be achieved in a cache-based memory system by complementing it with a prefetching cache, a blocking write buffer, and a no-write-allocate write-miss policy, which together tolerate memory latency and make better use of memory bandwidth. The hardware cost of such a cache-based memory system is expected to be comparable to that of a DMA-based memory system.
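To illustrate why the write-miss policy matters for bandwidth, the following is a minimal sketch, not the simulator used in this work. It assumes a streaming kernel that writes each output byte exactly once, that every output line misses in the cache, and that a write buffer gathers stores into full cache lines. The function name `output_traffic_bytes` and its parameters are hypothetical and chosen only for this illustration.

```python
def output_traffic_bytes(n_bytes: int, line_size: int, write_allocate: bool) -> int:
    """Main-memory bytes moved when streaming n_bytes of output once.

    Assumptions (illustrative only): every output line misses in the cache,
    the kernel overwrites each line completely, and stores are gathered
    into full lines before they go to main memory.
    """
    lines = -(-n_bytes // line_size)  # ceiling division: lines touched
    # Each output line is eventually written to main memory once.
    written = lines * line_size
    # Write-allocate first fetches each missed line into the cache,
    # even though the kernel then overwrites all of it.
    fetched = lines * line_size if write_allocate else 0
    return fetched + written


if __name__ == "__main__":
    size, line = 64 * 1024, 32  # hypothetical output size and line size
    print("write-allocate:   ", output_traffic_bytes(size, line, True))
    print("no-write-allocate:", output_traffic_bytes(size, line, False))
```

Under these assumptions, write-allocate moves twice as many bytes for the output stream as no-write-allocate, because every fetched line is overwritten before it is ever read. Real workloads that partially update or re-read output lines would see a smaller difference.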