Research And Implementation Of Domain Entity Disambiguation And Event Filling System

Posted on: 2022-12-01
Degree: Master
Type: Thesis
Country: China
Candidate: X A Qi
GTID: 2518306773475204
Subject: Automation Technology
Abstract/Summary:
With the rapid development of information technology and the widespread adoption of the Internet, network data is growing at an exponential rate. The network has become one of the largest data warehouses, and much of its data is presented in natural language. The quality of such data is uneven, however, and good data governance technology can effectively improve it. Presenting semi-structured and unstructured natural-language texts that contain event information in a structured form, that is, identifying specific types of events and extracting event subjects and related arguments, provides strong support for applications such as automatic summarization, question answering, information retrieval, and decision support. However, the extracted event subjects and some arguments are highly ambiguous and diverse. Ambiguity means that the same entity reference can refer to different entities in different contexts; diversity means that the same entity can appear under different references in text. Eliminating this ambiguity and linking event information to existing data resources would supplement those resources, better support decision-making, and make network data more useful for data analysis.

Data in specific fields often has field-specific characteristics. Taking the financial field as an example, this thesis studies entity disambiguation and event filling technology in data governance, and designs and implements a domain entity disambiguation and event filling system that integrates a multi-feature graph and entity influence, so as to present high-quality data to users in a structured form. The system consists of three parts: a candidate-entity-based multi-feature graph construction module, a domain entity disambiguation module, and a domain event filling module.

In the multi-feature graph construction module, the triples related to financial-category keywords in CN-DBpedia are first extracted to build a financial domain knowledge base. Then, for texts describing financial activities, the entity references to be disambiguated are extracted, and candidate entities are selected by fusing string similarity and semantic similarity features. The 2-hop relationships between candidate entities are obtained from the knowledge base triples, and the similarity between candidate entities is computed as the edge weight. Finally, the multi-feature information is integrated into the graph model to complete the construction of the multi-feature graph.

The domain entity disambiguation module addresses several problems in existing methods: they disambiguate only a single entity reference at a time, they ignore the effect of entity influence and of similarity between candidate entities on the disambiguation result, and redundant graph nodes increase their computational complexity. To address these problems, a domain entity disambiguation method integrating the multi-feature graph and entity influence is proposed. It adopts a dynamic decision strategy and an improved PageRank algorithm, combined with entity influence, to compute a comprehensive score for each candidate entity in the multi-feature graph and thereby obtain highly reliable disambiguation results. Test results verify the accuracy and efficiency of the proposed method for domain-specific entity disambiguation.

In the event filling module, for 12 types of financial events such as equity increase and bankruptcy liquidation, two storage strategies are designed, one for event filling with single-event-element disambiguation and one for event filling with multi-event-element disambiguation, to fill in new events and update the related attributes of the association tables.

To realize this system, the thesis first presents the research background and the current state of research, and then carries out the overall analysis and design of the system according to the requirements and related technologies. For the key problems of domain entity disambiguation, it proposes solutions and evaluates the effectiveness of the proposed algorithm experimentally against the evaluation criteria. Finally, the system is tested to demonstrate its availability and effectiveness.
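
The graph-scoring idea in the abstract can be illustrated with a small sketch. This is not the thesis's improved PageRank or its dynamic decision strategy, whose details the abstract does not give. It is a standard personalized PageRank over a weighted candidate graph, under the assumption that entity influence is used as the restart (prior) distribution. All function names, entity names, weights, and influence values below are hypothetical.

```python
def influence_pagerank(edges, influence, damping=0.85, iters=200, tol=1e-12):
    """Personalized PageRank on an undirected weighted graph.

    edges:     {(u, v): similarity weight} between candidate entities
    influence: {node: non-negative influence score}; assumed here to act
               as the restart distribution (an illustrative choice)
    """
    nodes = sorted(influence)
    adj = {n: {} for n in nodes}
    for (u, v), w in edges.items():
        adj[u][v] = adj[u].get(v, 0.0) + w
        adj[v][u] = adj[v].get(u, 0.0) + w

    total = sum(influence.values())
    prior = {n: influence[n] / total for n in nodes}
    rank = dict(prior)

    for _ in range(iters):
        new = {n: (1.0 - damping) * prior[n] for n in nodes}
        for u in nodes:
            out = sum(adj[u].values())
            if out == 0.0:
                # Isolated node: spread its mass according to the prior.
                for n in nodes:
                    new[n] += damping * rank[u] * prior[n]
            else:
                for v, w in adj[u].items():
                    new[v] += damping * rank[u] * w / out
        delta = sum(abs(new[n] - rank[n]) for n in nodes)
        rank = new
        if delta < tol:
            break
    return rank


def disambiguate(candidates, rank):
    """Pick the highest-scoring candidate for each entity reference."""
    return {ref: max(cands, key=lambda c: rank[c])
            for ref, cands in candidates.items()}


# Hypothetical example: two references, one of them ambiguous.
candidates = {
    "Acme": ["Acme Holdings (company)", "Acme (cartoon brand)"],
    "Acme Securities": ["Acme Securities (subsidiary)"],
}
edges = {("Acme Holdings (company)", "Acme Securities (subsidiary)"): 0.9}
influence = {
    "Acme Holdings (company)": 1.0,
    "Acme (cartoon brand)": 1.0,
    "Acme Securities (subsidiary)": 1.0,
}

scores = influence_pagerank(edges, influence)
result = disambiguate(candidates, scores)
print(result)
```

With equal influence values, the candidate that is connected to another reference's candidate in the graph accumulates more score than the isolated one, which is the collective effect the graph is meant to capture. Raising a node's influence value shifts the restart distribution toward it.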
Keywords/Search Tags: domain entity disambiguation, multi-feature graph, entity influence, knowledge base, domain event filling
Related items