Design and Implementation of a Large-Scale Short Text Classification System

Posted on: 2020-12-03
Degree: Master
Type: Thesis
Country: China
Candidate: J J Peng
GTID: 2428330575457047
Subject: Computer technology
Abstract/Summary:
The popularity of the Internet has allowed customers' service requests to be handled quickly by online customer service. However, as enterprise product lines expand and the number of Internet users grows, traditional customer service systems can no longer easily meet the needs of modern enterprises and users. Intelligent customer service systems, powered by the rapid development of Artificial Intelligence (AI) technology, can help solve this problem. A large class of intelligent customer service systems is based on FAQs (Frequently Asked Questions). The main task of such a system is to accurately match user questions with the standard questions in the FAQ, and this task can be handled with text classification technology. However, the number of FAQs provided by enterprises keeps increasing and often exceeds tens of thousands. In addition, most user questions are short texts, which carry little information, are often noisy, and lack a clear theme, all of which greatly increases the difficulty of classification. Solving this problem therefore means facing both the large-scale challenge and the short-text challenge. For short text classification, there are currently solutions based on traditional methods and on deep learning methods. Traditional methods usually rely on highly correlated external data. Deep learning methods have become more prominent in recent years, but they have not been applied to FAQ-matching corpora as a classification task. For large-scale text categorization, hierarchical classification is usually considered, but current hierarchical classification structures are mostly built from existing relationships between categories, and the representation information of the categories is not fully utilized.

To address the short-text classification requirements of large-scale category sets in intelligent customer service, this thesis designs and implements a large-scale short text classification system that meets this need by providing corresponding services to administrators and users. The system mitigates the data sparsity problem of short texts by designing a suitable short text representation for the intelligent customer service scenario. It optimizes several text representation methods, based on either traditional methods or deep learning methods, and finally selects a scheme based on Convolutional Neural Networks (CNNs) as the short text representation. The short text representations are then used to derive category representations, and a hierarchical classification structure is constructed from the inter-class separability to alleviate the problem of large-scale categories. In the system design stage, this thesis proposes five hierarchical classification structures and, through comparative experiments, selects the two that perform best on balanced and imbalanced corpora respectively. In addition to improving the performance of large-scale short text classification, these structures allow the system to automatically select the better of the two according to how balanced the corpus is. Test results on the system show that the proposed scheme provides intelligent customer service with better performance than the traditional scheme.
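The pipeline described above (short text representations, then category representations, then a hierarchy built from how separable the categories are) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the thesis's actual method: the embeddings are toy 2-D vectors standing in for CNN outputs, a category is represented by the mean of its text embeddings, categories whose centroids lie within a hypothetical `threshold` are placed under the same parent node, and the helper names (`category_centroids`, `group_categories`, `classify`) are invented here.

```python
from collections import defaultdict
from math import sqrt


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def distance(a, b):
    """Euclidean distance between two vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def category_centroids(embeddings, labels):
    """Assumed category representation: mean embedding of each category."""
    grouped = defaultdict(list)
    for vector, label in zip(embeddings, labels):
        grouped[label].append(vector)
    return {label: mean_vector(vs) for label, vs in grouped.items()}


def group_categories(centroids, threshold):
    """Greedy grouping (an assumption): a category joins the first group
    whose seed centroid is within `threshold`, otherwise it starts a new
    group. Hard-to-separate categories end up under one parent node."""
    seeds, groups = [], []
    for label in sorted(centroids):
        for seed, members in zip(seeds, groups):
            if distance(centroids[label], seed) <= threshold:
                members.append(label)
                break
        else:
            seeds.append(centroids[label])
            groups.append([label])
    return groups


def classify(vector, groups, centroids):
    """Two-level decision: nearest group first, then nearest category."""
    def group_center(members):
        return mean_vector([centroids[m] for m in members])

    best_group = min(groups, key=lambda g: distance(vector, group_center(g)))
    return min(best_group, key=lambda m: distance(vector, centroids[m]))


# Toy data: 2-D vectors standing in for CNN-based short text embeddings.
embeddings = [[0.0, 0.0], [0.2, 0.0], [1.0, 0.0],
              [1.2, 0.0], [10.0, 10.0], [10.2, 10.0]]
labels = ["refund", "refund", "return", "return", "login", "login"]

centroids = category_centroids(embeddings, labels)
groups = group_categories(centroids, threshold=2.0)
prediction = classify([0.9, 0.1], groups, centroids)
print(groups)      # [['login'], ['refund', 'return']]
print(prediction)  # return
```

In this toy run the two nearby categories share a parent node, so the first-level decision only has to separate well-separated groups, and the finer distinction is deferred to the second level. A real system would replace the nearest-centroid decisions with trained classifiers at each node.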
Keywords/Search Tags: text classification, text representation, large-scale, short text, hierarchical classification