
An Adaptive Speech Processing Algorithm For Chip Based On Binary Neural Networks

Posted on: 2019-12-14  Degree: Master  Type: Thesis
Country: China  Candidate: D D Song  Full Text: PDF
GTID: 2428330590451645  Subject: Integrated circuit engineering
Abstract/Summary:
With the rapid development of smartphones and tablets, voice technology and voice interaction are becoming increasingly common and important. For example, Google provides voice search on Android, Apple's iOS devices are equipped with a conversational assistant called Siri, and Amazon provides the Echo smart speaker. These products support simple wake-up words and command queries. A more important application is the driving scenario, where it is particularly important that the device can be woken up, recognize continuous speech, and be controlled by voice commands. Such a system must be highly accurate, low-latency, small and low-power, and it must operate in a constrained environment such as a mobile device, so that it can run on offline devices and avoid added delay and power consumption. This scenario is the starting point of the research.

This thesis focuses on the Thinker-II end-to-end speech chip, which is built on an architecture designed in the author's laboratory. The architecture mainly targets voice interaction on mobile devices, including voice activity detection, keyword spotting, command word control and speaker recognition. The goal is an algorithm that is highly accurate, low-latency, small and low-power.

First, this thesis designs an algorithm framework suited to multitask operation. To implement these tasks on a chip, it designs a specific confidence judgment for each task, which reduces latency and computational complexity and makes the design hardware-friendly. Next, it studies various compression and quantization methods, quantizes the features directly to 16 bits, and quantizes the network to binary. To achieve better quantization results and prevent overfitting, it applies sparse processing to the binary network. To simplify the hardware design, it also adopts a softmax classification selector and a batch normalization layer. Lastly, because low-bit networks lose accuracy, it uses stochastic gradient descent combined with SADMM to update the weights, and introduces adaptive techniques to improve recognition of a specific speaker's command words.

The main innovations of this thesis are:
· An algorithm framework for a speech processing system aimed at chip design and development;
· Low-precision quantization of speech features and confidence criteria, and binarized neural networks, to achieve end-to-end quantization;
· Sparse neural networks, approximate softmax classifiers and batch normalization to reduce computational cost;
· High-efficiency, low-delay confidence criteria designed for the different tasks;
· Training with SADMM and adaptive techniques to increase recognition accuracy for a specific speaker.
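The two quantization steps named in the abstract (16-bit features, binary weights) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not give the thesis's actual scheme, so the symmetric per-vector scale, the mean-magnitude scale `alpha`, and the function names `quantize_int16`, `binarize` and `binary_dot` are hypothetical choices, not the author's implementation.

```python
from typing import List, Tuple


def quantize_int16(features: List[float]) -> Tuple[List[int], float]:
    """Symmetric linear quantization of a feature vector to signed 16-bit integers."""
    max_abs = max((abs(x) for x in features), default=0.0)
    # Assumed scheme: one scale per vector, so the largest magnitude maps to 32767.
    scale = max_abs / 32767.0 if max_abs > 0 else 1.0
    q = [max(-32768, min(32767, round(x / scale))) for x in features]
    return q, scale


def binarize(weights: List[float]) -> Tuple[List[int], float]:
    """Map each weight to +1 or -1 and keep the mean magnitude as a scale factor."""
    signs = [1 if w >= 0 else -1 for w in weights]
    # Assumed scale: mean absolute weight, one common choice for binary networks.
    alpha = sum(abs(w) for w in weights) / len(weights) if weights else 0.0
    return signs, alpha


def binary_dot(q_features: List[int], signs: List[int]) -> int:
    """With +1/-1 weights, a dot product needs only additions and subtractions."""
    return sum(x if s > 0 else -x for x, s in zip(q_features, signs))


if __name__ == "__main__":
    q, scale = quantize_int16([0.5, -1.0, 0.25])
    signs, alpha = binarize([0.3, -0.7, 0.0])
    # Approximate real-valued dot product recovered from the integer result.
    print(binary_dot(q, signs) * scale * alpha)
```

The point of the sketch is the hardware argument made in the abstract: once the weights are restricted to +1 and -1, the multiply-accumulate in each layer reduces to integer addition and subtraction, with a single rescale at the end.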
Keywords/Search Tags: speech processing, binary network, sparse, confidence judgment, adaptation