Design And Implementation Of Deep Learning-based Open Speech System For Innovative Enterprises

Posted on:2021-05-04

Degree:Master

Type:Thesis

Country:China

Candidate:Y Liu

Full Text:PDF

GTID:2428330614972387

Subject:Software engineering

Abstract/Summary:

After decades of development, and following the breakthrough of deep learning, automatic speech recognition (ASR) technology is maturing: recognition accuracy has improved greatly, and the technology is ready for large-scale commercial use. ASR has been widely adopted in industries such as finance, medical treatment, court trials, vehicles, and tourism, where it plays an important role. However, deep learning-based ASR still faces several problems in the process of industrialization. Among them, the "high threshold" of cost and technology has become an important factor restricting the rapid development of speech recognition technology. To let more start-up companies obtain speech technology for free, and thereby reduce the friction cost of deploying it in the voice industry, an open system that supports enterprise-level private engines is urgently needed.

Following software engineering methodology and the specific business needs of the company where the internship took place, the author built an open speech recognition system with modules for the online speech recognition experience, voice service resource download, language model customization, and model performance testing. The system is built on the Django web development framework together with the lightweight Nginx and uWSGI services. After extensive research on speech recognition, the author selected the TDNN-f algorithm with the chain training criterion, which allows efficient modeling, for acoustic model training, and used an n-gram model with added hot words for language model training and optimization. Word error rate (WER) is used as the metric for recognition accuracy, to ensure the recognition quality and working efficiency of the ASR engine.

The author participated in and completed the design of the system architecture and the implementation of the online speech recognition module, the resource download module, the language model adaptive training module, and the model performance test module, and assisted in the construction of the speech recognition engine. Finally, testing verified that the functions of each module meet the requirements. At present, the system is online and running stably, and has been adopted by many manufacturers to provide convenient voice services for many startups.
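
The abstract names WER (word error rate) as the recognition metric but does not show how it is computed. As a minimal sketch, assuming the standard definition (substitutions, deletions, and insertions divided by the number of reference words) and assuming whitespace-separated tokens, it could look like the following. The function name `word_error_rate` is hypothetical and is not taken from the thesis.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length.

    Assumption: tokens are separated by whitespace. For Chinese text, the
    transcripts would need to be segmented into words or characters first.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one token")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis tokens (Levenshtein distance).
    prev = list(range(len(hyp) + 1))
    for i, ref_token in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_token in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_token == hyp_token else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)
```

For example, `word_error_rate("turn on the light", "turn off the light")` counts one substitution against four reference words. Note that WER can exceed 1.0 when the hypothesis contains many insertions.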

Keywords/Search Tags:

automatic speech recognition, open source system, language model, Django, TDNN-f

Related items

1	Yi Language Speech Recognition Using Deep Learning Methods
2	Application Research On Statistical Language Model Of Large Vocabulary Continuous Speech Recognition System
3	Research On Evolution Technology Of Open Source Software Oriented To Speech Recognition Application
4	Research On The Mongolian Language Model Based On Speech Recognition
5	Research On Chinese Speech Recognition Based On Kaldi
6	Design And Implementation Of Speech Recognition System Based On DNN-LSTM
7	Chinese Speech Recognition System Based On CLDNN Hybrid Model
8	Researching And Building Of The Mongolian Large Vocabulary Independent Continuous Speech Recognition System
9	Research On Statistical Language Model Of Large-Vocabulary Continuous Speech Recognition System
10	Mongolian Language Model Based On Recurrent Neural Network