Research And Application Of Customer Negative News Automatic Retrieval Method

Posted on: 2016-12-11
Degree: Master
Type: Thesis
Country: China
Candidate: B D Zhou
Full Text: PDF
GTID: 2298330470457884
Subject: Control Science and Engineering
Abstract/Summary:
The risk assessment department of a financial institution typically uses search engines to look for negative news on the Internet about the customers it monitors, so that risks are found as early as possible and sound decisions can be made. Doing this work manually, however, is inefficient and time-consuming. Because negative news is a form of emotional text, research on the automatic retrieval and identification of emotional text on the Internet has considerable significance and practical value.

Building on an in-depth study of current emotional tendentiousness recognition technology, this thesis proposes two algorithms: 1) the Emotional Tendentiousness Recognition Algorithm based on Tendency Word Collocation (ETRTWC), and 2) the Negative News Extraction Algorithm based on the Context Framework (NNECF). The ETRTWC algorithm judges the emotional tendency of customer news. Based on its results, the news is divided into three categories (positive, neutral, and negative), and a comprehensive score is derived for each customer. The NNECF algorithm extracts negative news from the news set; its output is then intersected with the news that the ETRTWC algorithm identified as negative, so that the two algorithms jointly produce the final negative news set.

The main work of this thesis is as follows:

1. Two algorithms for recognizing emotional and negative news are proposed, namely the ETRTWC algorithm and the NNECF algorithm. The ETRTWC algorithm assigns four attributes to each word, obtains the emotional value of an entire sentence through dependency grammar and scoring rules, and finally derives the emotional value of the whole news item. The NNECF algorithm defines a context framework for each individual context from that context's set of negative news. By building a context framework library and a framework vocabulary-level library, and by combining Chinese natural language processing techniques with log-linear model theory, the NNECF algorithm judges whether a sentence belongs to a particular framework and, from that, whether the sentence is negative news.

2. An automatic customer negative news retrieval system (CNNRSA) is designed and implemented. The system uses a B/S (browser/server) architecture. CNNRSA takes the negative news recognition algorithms presented in this thesis as its core and uses the Fudan University Natural Language Processing system (FNLP) for Chinese sentence segmentation, POS tagging, and dependency analysis. Its major functional modules are Internet news crawling, preliminary emotional classification of news, negative news extraction, and news warehousing, query, and retrieval.

3. Experiments show that the ETRTWC and NNECF algorithms both perform well. The feasibility and effectiveness of the system are tested and verified using "诺基亚" (Nokia) as the customer keyword. Development of the main CNNRSA modules is essentially complete, and the system runs normally.
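
The way the two algorithms are combined can be illustrated with a minimal Python sketch. It is not the thesis implementation. The per-sentence emotional values and the framework verdict are taken as given inputs, and summing the sentence values, the `threshold` of 0.5, and the names `NewsItem`, `classify`, and `joint_negative_news` are all assumptions made only for illustration. The abstract does not specify the scoring rules, the four word attributes, or the log-linear model.

```python
from dataclasses import dataclass


@dataclass
class NewsItem:
    news_id: str
    # Hypothetical per-sentence emotional values, standing in for ETRTWC output.
    sentence_scores: list
    # Hypothetical verdict standing in for the NNECF framework check.
    matches_negative_framework: bool


def news_score(item):
    """Aggregate sentence values into one value for the news item.

    Summation is an assumption; the abstract does not state the rule.
    """
    return sum(item.sentence_scores)


def classify(score, threshold=0.5):
    """Map a news-level value to positive, neutral, or negative."""
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


def joint_negative_news(items, threshold=0.5):
    """Intersect the two negative sets, as the abstract describes."""
    by_emotion = {
        item.news_id
        for item in items
        if classify(news_score(item), threshold) == "negative"
    }
    by_framework = {
        item.news_id for item in items if item.matches_negative_framework
    }
    return by_emotion & by_framework


items = [
    NewsItem("a", [-1.0, -0.5], True),   # negative by both checks
    NewsItem("b", [-2.0], False),        # negative emotion only
    NewsItem("c", [1.0, 0.25], True),    # framework match only
    NewsItem("d", [0.25, -0.25], False), # neutral
]
print(joint_negative_news(items))
```

Taking the intersection keeps only the items that both checks flag, which trades some recall for fewer false alarms. The values in `items` are invented test data, not results from the thesis.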
Keywords/Search Tags: Internet news, emotional tendentiousness recognition, context framework, tendentiousness words collocation, dependency syntax