| With rapid economic development and continuing progress in science and technology, the demand for information and knowledge is growing across all industries, and it is especially prominent in the financial sector. However, the rapid growth of financial information and the large volume of unstructured financial announcement texts make it difficult for financial research teams to process information and study announcements. Financial knowledge bases have emerged to address this problem: they extract entities, relations, and attributes from massive text collections to help people better understand and track market dynamics and trends. This thesis designs and implements a knowledge base system for financial news, using news in the financial field as the data source. The main research work is as follows: (1) The technologies needed to build a financial news knowledge base are reviewed, and the system is designed according to the requirements of this thesis. It consists of four main modules: data acquisition, knowledge extraction, event extraction, and knowledge storage. (2) This thesis proposes the SAH (Skip-gram-Attention-HowNet) word vector representation model, which integrates the attention mechanism and HowNet semantic information into the Skip-gram framework in order to capture word meaning better and make vocabulary learning more efficient. The SAH-BiLSTM-CRF-HowNet model is then proposed by adding HowNet sememes as semantic features to the BiLSTM-CRF model. The model uses HowNet to mine word semantics in depth and to find correlations between named entities. In comparative experiments against other methods, the accuracy of this method increased by 3.2%, the recall increased by 2.43%, and the F1 value
increased by 1.7%. (3) Based on the SAH-BiLSTM-CRF-HowNet model proposed in this thesis, an entity relation library for the financial field is constructed. The library first uses knowledge extraction to obtain entity information, and then uses the hypernym-hyponym relations and hierarchical structure in HowNet to calculate the distance between the meanings of entities, from which the similarity between entities is obtained. This achieves entity disambiguation. (4) News texts are first classified, and event templates are defined for the different types. Event extraction is then used to extract the key information of news events and build structured event descriptions. Next, based on the entity relation library constructed in (3), event subjects are linked through entity relations to build the relations between events. Finally, a complete knowledge base of financial news is realized. Overall, this thesis proposes a HowNet-based scheme for constructing a financial news knowledge base, and implements the integration, display, and search functions of the knowledge base in an application platform developed in Python, so that users can quickly obtain the information they need. The construction of the financial news knowledge base is completed after the system functions are implemented and tested. |
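
The distance-based similarity in item (3) can be illustrated with a minimal sketch. The abstract does not give the formula, so the form `alpha / (distance + alpha)` used here is an assumption (one common way to turn a hierarchy distance into a similarity score), as are the constant `ALPHA`, the toy hierarchy `TOY_HYPERNYMS`, and all function names. The hierarchy entries are illustrative placeholders, not actual HowNet data.

```python
# Toy hypernym hierarchy mapping each concept to its parent.
# Illustrative placeholders only; a real system would read HowNet instead.
TOY_HYPERNYMS = {
    "entity": None,
    "institution": "entity",
    "company": "institution",
    "bank": "institution",
    "commercial_bank": "bank",
    "person": "entity",
}

ALPHA = 1.6  # assumed smoothing constant, not taken from the thesis


def path_to_root(node, parents):
    """Return the nodes from `node` up to the root, inclusive."""
    path = []
    while node is not None:
        path.append(node)
        node = parents[node]
    return path


def hierarchy_distance(a, b, parents):
    """Count the edges on the path between a and b through their lowest common ancestor."""
    steps_from_b = {node: i for i, node in enumerate(path_to_root(b, parents))}
    for steps_from_a, node in enumerate(path_to_root(a, parents)):
        if node in steps_from_b:  # first shared ancestor on the way up from a
            return steps_from_a + steps_from_b[node]
    raise ValueError("nodes do not share a root")


def similarity(a, b, parents, alpha=ALPHA):
    """Map distance to a score in (0, 1]; identical concepts score 1.0."""
    return alpha / (hierarchy_distance(a, b, parents) + alpha)


def disambiguate(mention_concept, candidates, parents):
    """Pick the candidate concept most similar to the mention's concept."""
    return max(candidates, key=lambda c: similarity(mention_concept, c, parents))
```

For example, `disambiguate("commercial_bank", ["person", "bank"], TOY_HYPERNYMS)` prefers `"bank"`, because it lies one edge away in the toy hierarchy while `"person"` lies four edges away. In a real pipeline the candidates would come from the knowledge extraction step rather than a hand-written list.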