Research And Implementation Of Analysis Tool For Big Data In Power Grid

Posted on: 2017-08-28
Degree: Master
Type: Thesis
Country: China
Candidate: H M Zheng
Full Text: PDF
GTID: 2348330518995519
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of the digital information era, the volume of global information is growing explosively. The smart grid has become one of the major sources of big data, and it underpins the energy supply system of the world's second-largest economy. As information about an important piece of infrastructure, smart grid big data can help in understanding the economy. The management strategy of the electric power industry is currently shifting toward a service-centric model, so the volume of speech data in call centers will keep growing, and current methods are hard to adapt to this growing scale. To uncover the potential value of the data, improve the company's strategic decision-making, and support service and management innovation, the smart grid should provide a variety of statistical and analytical tools.

This thesis studies and implements an analysis tool for the power grid, aimed in particular at the massive speech data in call centers. Based on big data analysis techniques (such as Hadoop and MapReduce), the study implements multiple algorithms for speech data mining, including speech recognition, speech analysis, and text analysis. The speech analysis procedure includes pre-processing, feature extraction, voice activity detection (VAD), and speech emotion recognition. Speech-to-text (STT) is introduced in the speech recognition part, and its output becomes the input of the text analysis algorithm. A text analysis solution is then presented, including word segmentation, text data cleaning, text clustering, and emotion analysis. To handle the data scale, a distributed text analytics solution can be used.

The thesis is organized as follows. First, the background is introduced, including the significance of an analysis tool for big data in the power grid. Then the related technologies are reviewed, including the basic principles of machine learning algorithms, speech emotion recognition technology, and common text mining algorithms. After that, the functional and non-functional requirements of the system are analyzed based on user scenarios. Based on the requirement specification, four major problems are addressed in detail. First, a method for retrieving speech characteristics is described, which helps to accurately extract volume, zero-crossing rate (ZCR), pitch, formants, MFCC, LPCC, and other speech parameters. Second, a compound silence detection algorithm is presented, which uses both volume and ZCR to overcome the defects of VAD based on volume alone. Third, considering performance on massive text data, a distributed text analysis architecture is given based on R and a distributed file system. Fourth, to improve the accuracy of speech emotion recognition, the study proposes an improved algorithm that uses an emotion context reasoning rule for speech, based on the Hidden Markov Model (HMM) and Dynamic Time Warping (DTW). Based on these models and algorithms, an overall structural design is proposed, covering layer interaction design and detailed implementation. The tool is then deployed in an experimental environment and tested. Finally, a conclusion summarizes what has been done and what remains for future work.
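
The compound silence detection mentioned in the abstract (volume combined with ZCR) can be illustrated with a minimal sketch. This is not the thesis's algorithm, whose details the abstract does not give. It is a common textbook heuristic: a frame counts as speech if it is loud, or if it is quieter but has a high zero-crossing rate, as unvoiced consonants typically do. The function names, frame sizes, and all three thresholds are assumptions chosen for illustration.

```python
import math


def frame_signal(samples, frame_len, hop):
    """Split samples into frames of frame_len, advancing by hop (ragged tail dropped)."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def volume(frame):
    """Mean absolute amplitude of one frame."""
    return sum(abs(x) for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def detect_speech(samples, frame_len=200, hop=200,
                  vol_thresh=0.1, low_vol_thresh=0.02, zcr_thresh=0.3):
    """Return one boolean per frame: True for speech, False for silence.

    All thresholds are illustrative assumptions, not values from the thesis.
    """
    flags = []
    for frame in frame_signal(samples, frame_len, hop):
        v = volume(frame)
        z = zero_crossing_rate(frame)
        loud = v >= vol_thresh                                 # volume-only rule
        unvoiced = v >= low_vol_thresh and z >= zcr_thresh     # quiet but "hissy"
        flags.append(loud or unvoiced)
    return flags


if __name__ == "__main__":
    silence = [0.0] * 400
    # Voiced-like segment: a loud, slowly varying tone.
    voiced = [0.5 * math.sin(2 * math.pi * n / 40) for n in range(400)]
    # Unvoiced-like segment: quiet but changing sign on every sample.
    unvoiced = [0.05 if n % 2 == 0 else -0.05 for n in range(400)]
    signal = silence + voiced + unvoiced + silence
    print(detect_speech(signal))
```

On the synthetic signal in the demo, a volume-only rule with the same `vol_thresh` would label the quiet, high-ZCR segment as silence. Adding the ZCR check recovers it, which is the kind of defect the compound approach is meant to address.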
Keywords/Search Tags: big data in power grid, call center, text mining, speech analysis, emotion recognition