
Research On Web Entity Event Fusion For Market Intelligence Analysis

Posted on: 2015-03-23
Degree: Doctor
Type: Dissertation
Country: China
Candidate: T Sun
GTID: 1268330431455375
Subject: Computer software and theory
Abstract/Summary:
With the rapid development of the Internet, the Web has become an open, global information center. Companies want to extract valuable market intelligence through big data analysis and thereby gain an advantage in fierce market competition. On the Web, companies are concerned with the events of entities related to them (including companies, products, people, etc.). These events describe the entities' activities and latest status, and they provide first-hand information for mining market intelligence. A large amount of event information exists on the Web in the form of news, reviews and messages. It is highly redundant, often inaccurate and scattered, which greatly hinders market intelligence analysis. How to eliminate redundancy, discriminate and associate events, and integrate event information has therefore become a precondition for obtaining accurate market intelligence. As an important step in market intelligence analysis, Web entity event fusion can provide high-quality, comprehensive, truthful and reliable data for that analysis, and it has therefore attracted increasing attention from researchers. However, event information that appears on the Web in forms such as news is freely expressed, varied in form and freely published. Web entity event fusion has to solve the following problems: (1) The same event is described very differently on different websites, so the first problem to be solved is event coreference resolution. (2) Because events progress over time, different sites provide different event mentions, websites have their own preferences and editors make errors, the information on the Web can be incomplete, outdated, erroneous or false. Therefore, to ensure that market intelligence analysis has accurate data, Web entity event fusion needs to solve event conflict resolution. (3) It is difficult to see the whole picture of an event from a single event, so its ins and outs cannot be known.
Therefore, to provide a panorama of entities and events, Web entity event fusion needs to find the correlations between entities and events. Research on Web entity event fusion is a prerequisite for high-quality data and for market intelligence analysis. The main work and contributions of this thesis are summarized as follows: (1) To identify the many different mentions of the same event on the Web, we present a method of Web entity event coreference resolution based on a heterogeneous information network, which effectively improves the accuracy of event coreference resolution. The method adopts a hierarchical clustering algorithm and uses the interaction between decisions to carry out event coreference resolution iteratively. For event similarity measurement, the method uses the relations among entities, events, documents and data sources and measures event similarity from different angles to obtain a reasonable similarity between event mentions. Experiments on an enterprise event data set, a person event data set and a product event data set show that the proposed algorithm accomplishes the task of event coreference resolution with better recall and precision. (2) Because different event mentions provide incomplete, outdated and contradictory data, we put forward an approach to event conflict resolution based on D-S evidence theory, which can effectively solve the event conflict resolution problem. The method adopts a resolution strategy according to the type of event conflict and uses the combination rule of D-S evidence theory, which can effectively improve the accuracy of event conflict resolution. To calculate the credibility of event attributes, the method uses the frequency of event attributes, their location in the document, the quality of the data source and other factors, and adopts a semi-supervised machine learning method to calculate the credibility of each event attribute fact.
Because the combination rule of traditional D-S evidence theory suffers from paradox problems, we extend the theory to increase the accuracy of event conflict resolution. The extension also allows new features to be added, so the method is highly adaptable. (3) Because the cause and progress of an event cannot be described from a single event mention, we present a method to construct a panorama based on entities and events. The method uses five basic event relations together with entity relations to describe the complex relations among entities and events, and it lays the foundation for mining the implicit relationships that exist among events. For event relations, we put forward a method to construct an event relation graph according to the event relation types; we then use entity relationships to link the event relation graphs into a panorama. According to the experimental results, the proposed method can effectively establish correlations between entities and events with high accuracy. The research on Web entity event fusion solves the data quality problems of market intelligence and lays the foundation for large-scale information analysis, so it is of great significance. In addition, event detection, event pattern discovery and new event representation mechanisms are directions for future research.
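
As an illustration of the D-S evidence theory combination rule referred to in contribution (2), the following is a minimal sketch of the classical Dempster rule only. It is not the extended rule proposed in the thesis, whose details the abstract does not give. The function name `combine_dempster`, the attribute values ("1.2B", "1.5B") and all belief masses are hypothetical and chosen purely for illustration. The second example reproduces the well-known Zadeh case, in which two highly conflicting sources lead the classical rule to put nearly all belief on a value both sources considered unlikely; this is one form of the combination rule paradox.

```python
from itertools import product


def combine_dempster(m1, m2):
    """Combine two mass functions with the classical Dempster rule.

    Each mass function maps a frozenset of candidate values (a hypothesis)
    to its belief mass; the masses of one function are assumed to sum to 1.
    Returns (combined_mass_function, conflict_K).
    """
    combined = {}
    conflict = 0.0
    for (b, mass_b), (c, mass_c) in product(m1.items(), m2.items()):
        common = b & c
        if common:
            # Agreeing evidence supports the intersection of the hypotheses.
            combined[common] = combined.get(common, 0.0) + mass_b * mass_c
        else:
            # Disjoint hypotheses contribute to the conflict mass K.
            conflict += mass_b * mass_c
    if 1.0 - conflict < 1e-12:
        raise ValueError("sources are in total conflict; the rule is undefined")
    norm = 1.0 - conflict
    return {h: v / norm for h, v in combined.items()}, conflict


# Hypothetical example: two sites report different values for one event attribute.
P1, P2 = frozenset({"1.2B"}), frozenset({"1.5B"})
EITHER = P1 | P2  # "cannot tell which value is right"
site_a = {P1: 0.6, EITHER: 0.4}
site_b = {P1: 0.5, P2: 0.3, EITHER: 0.2}
fused, k = combine_dempster(site_a, site_b)
best = max(fused, key=fused.get)  # hypothesis with the highest fused mass
print(sorted(best), round(fused[best], 3), round(k, 3))

# Zadeh's example: both sources give value B only 1% belief,
# yet the classical rule assigns B almost all of the fused belief.
A, B, C = (frozenset({x}) for x in "ABC")
paradox, k_paradox = combine_dempster({A: 0.99, B: 0.01}, {C: 0.99, B: 0.01})
print(round(paradox[B], 6), round(k_paradox, 4))
```

In the first example the fused belief favors "1.2B" (about 0.76) with a conflict mass of 0.18. In the second, the conflict mass is 0.9999 and the normalization step inflates the tiny shared belief in B to nearly 1, which is the kind of behavior an extended rule would need to avoid.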
Keywords/Search Tags: Market Intelligence Analysis, Event Fusion, Event Coreference Resolution, Data Conflict Resolution, Event Correlation