Design And Implementation Of Voice User Interface Script Editor

Posted on: 2016-11-07
Degree: Master
Type: Thesis
Country: China
Candidate: W Xiong
Full Text: PDF
GTID: 2308330473952512
Subject: Software engineering
Abstract/Summary:
Automatic speech recognition (ASR) achieved a major breakthrough in 2012, when Deep Neural Networks (DNNs) replaced Gaussian distributions in voice feature abstraction. This breakthrough improved recognition accuracy by at least 30% and made speech-enabled self-service popular in China, especially in IVR applications. Enterprises including ICBC (Industrial and Commercial Bank of China), China CITIC Bank, Ping An Bank, Bank of Communications, and Shanghai Pudong Development Bank have all started to deploy voice-enabled IVR services. However, many of these services have failed to meet expectations compared with the successful experience in the Japanese and US markets. The main reason is that Chinese enterprises tend to focus on recognition accuracy but ignore the importance of voice user interface (VUI) design. In a speech recognition application, the VUI is as important as the graphical user interface (GUI) is in a web self-service application. A poor VUI design causes users to quickly abandon self-service and request an operator directly, no matter how rich the service content is or how quickly the service can be completed.

In a traditional IVR system, each application is designed around specific business logic and a specific hardware platform, which leads to a lack of portability, a lack of flexibility, and limited access to Internet information. Interface design becomes a challenging and tedious job that requires a large development effort and a long lead time. To shorten this cycle, improve application portability, and make business logic flow charts easier to create, this thesis focuses on the design of a Voice-XML-based script editor for improving the VUI in speech recognition applications.

Architecturally, the script editor is divided into three layers: the user interface layer, the business function layer, and the basic function layer.
In the user interface layer, the editor provides a menu bar, toolbar, drawing bar, project bar, log bar, feature bar, and node bar for the user to choose from. In the business function layer, it provides the project management module, the toolbox module, the file management module, the attribute editing module, and the window function module. In the basic function layer, it provides functions including third-party databases, data storage, the node model, interface development, and event monitoring. The toolkit is based on Swing components and is divided into three levels (top, middle, and basic) for different purposes. For business flow design, the editor uses graphical components (nodes) to represent basic functions that are used repeatedly in a flow, and lets designers link nodes logically with graphical tools. This reduces the designer's coding effort, lets the designer focus on the logic itself, and makes the user interface friendlier. Once the proper information has been entered, the script editor automatically generates the target Voice-XML document, which the IVR can use directly, shortening the development cycle. The script editor can also be used as a testing tool to verify the feasibility of different functional modules.
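To make the node-to-document step concrete, the following is a minimal Java sketch of how a linked chain of prompt nodes might be serialized into a Voice-XML document. It is an illustration only, not the thesis's implementation: the class `VxmlSketch`, the `PromptNode` type, and its fields are hypothetical names, and the node model is assumed to be a simple list in which each node plays one prompt and optionally jumps to another node. Only the standard VoiceXML elements `vxml`, `form`, `block`, `prompt`, and `goto` are used.

```java
import java.util.ArrayList;
import java.util.List;

public class VxmlSketch {

    /** Hypothetical node type: plays a prompt, then optionally jumps to another node. */
    static final class PromptNode {
        final String id;      // becomes the id of a <form>
        final String prompt;  // text spoken to the caller
        final String nextId;  // id of the next node, or null when the flow ends here

        PromptNode(String id, String prompt, String nextId) {
            this.id = id;
            this.prompt = prompt;
            this.nextId = nextId;
        }
    }

    /** Escapes characters that are special in XML text and attribute values. */
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    /** Emits one <form> per node and links nodes with <goto>. */
    static String generate(List<PromptNode> nodes) {
        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        sb.append("<vxml version=\"2.1\" xmlns=\"http://www.w3.org/2001/vxml\">\n");
        for (PromptNode n : nodes) {
            sb.append("  <form id=\"").append(escape(n.id)).append("\">\n");
            sb.append("    <block>\n");
            sb.append("      <prompt>").append(escape(n.prompt)).append("</prompt>\n");
            if (n.nextId != null) {
                // A fragment reference jumps to another form in the same document.
                sb.append("      <goto next=\"#").append(escape(n.nextId)).append("\"/>\n");
            }
            sb.append("    </block>\n");
            sb.append("  </form>\n");
        }
        sb.append("</vxml>\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Two linked nodes, as a designer might draw them in the editor.
        List<PromptNode> flow = new ArrayList<>();
        flow.add(new PromptNode("welcome", "Welcome to the bank.", "menu"));
        flow.add(new PromptNode("menu", "Say balance or transfer.", null));
        System.out.print(generate(flow));
    }
}
```

A real editor would also need node types for speech input (fields and grammars), branching, and error handling; the sketch covers only the simplest case of chained prompts.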
Keywords/Search Tags: IVR (Interactive Voice Response), Script Editor, Voice-XML, ASR (Automatic Speech Recognition)