Research And Application Of Joint Methods For Chinese Information Extraction

Posted on: 2024-05-16 | Degree: Master | Type: Thesis
Country: China | Candidate: W D Mu
GTID: 2558307079472354 | Subject: Electronic information
Abstract/Summary:
Data is a scarce resource, and both data analysis and deep learning rely heavily on it. Network data, however, is often unstructured, which makes it hard to use and analyze. Information extraction has therefore become one of the main ways to deal with massive amounts of unstructured data. Extracted information that is not put to use has little value, though. As a downstream task of information extraction, a recommendation system can provide personalized recommendations based on user interests and thereby filter information effectively. This thesis therefore studies joint methods for entity relation extraction and event extraction in Chinese information extraction, together with sequence recommendation algorithms. The main content is as follows:

(1) A model based on gated convolutional networks and LSTM is proposed for entity relation extraction in Chinese text. The method combines the ability of the gated convolutional network to extract word-level information with the ability of the LSTM to capture long-term dependencies. The gated convolutional network is also optimized for better encoding and generalization capabilities. To address the boundary division problem in Chinese text, a multi-feature fusion embedding layer is proposed; it uses characters as the basic unit and strengthens the vector representation by combining them with features such as entities and parts of speech. Experimental results show that this method outperforms CasRel by 0.4% on the DuIE dataset, which verifies its effectiveness and generalization ability.

(2) A weight-gated convolutional network is proposed for event extraction in Chinese text, and the Transformer is optimized on this basis. Inspired by scaled dot-product attention, the weight gate mechanism is designed so that the network retains important features and forgets unimportant ones. The modified Transformer further improves the ability to capture important features and to perform nonlinear transformations. On two datasets, the optimized method improves on all compared methods by more than 3%.

(3) A graph-based long- and short-term interest network is proposed to better capture the long-term and short-term interests of users. Weight information is added to the network's path sampling algorithm so that the sampled paths reflect how user interests evolve, which yields graph embedding vectors that better represent user preferences. The network also models long-term and short-term interests separately, so the final recommendations are more consistent with actual user behavior. Experimental results confirm the effectiveness of this method, with a 5% improvement in NDCG compared to SRSRec.

(4) A news recommendation system is designed and implemented. The system uses the information extraction methods proposed in this thesis to extract entities and events from news data, and then applies a graph embedding-based recommendation algorithm to make personalized recommendations for users. It follows the principle of high cohesion and low coupling and is implemented with separate front end and back end. Finally, the system is validated with test cases to ensure that it meets all the requirements identified in the requirements analysis.

In summary, this thesis studies entity relation extraction, event extraction, and sequence recommendation, and proposes corresponding optimization methods. Based on this research, a personalized news recommendation system is designed and implemented.
Keywords/Search Tags: Entity-Relationship Extraction, Event Extraction, Sequential Recommendation, Self-Attention
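
The abstract's item (3) mentions adding weight information to a path sampling algorithm so that sampled paths follow user interests. The thesis's actual algorithm is not given here, so the following is only a minimal sketch of the general idea, assuming a weighted random walk in which the next node is drawn in proportion to edge weight. The function name `weighted_walk`, the graph layout, and all item names and weights are hypothetical and chosen for illustration.

```python
import random


def weighted_walk(graph, start, length, rng):
    """Sample one path of up to `length` nodes.

    The next node is drawn with probability proportional to the weight
    of the edge leading to it (an assumed scheme, not the thesis's own).
    """
    path = [start]
    while len(path) < length:
        neighbors = graph.get(path[-1])
        if not neighbors:  # dead end: stop the walk early
            break
        nodes = list(neighbors)
        weights = [neighbors[n] for n in nodes]
        path.append(rng.choices(nodes, weights=weights, k=1)[0])
    return path


# Hypothetical item-transition graph: graph[a][b] is a made-up weight for
# how strongly item a was followed by item b in user histories.
graph = {
    "news_a": {"news_b": 3.0, "news_c": 1.0},
    "news_b": {"news_c": 2.0},
    "news_c": {"news_a": 1.0},
}

rng = random.Random(0)  # fixed seed so repeated runs sample the same paths
walks = [weighted_walk(graph, "news_a", 4, rng) for _ in range(5)]
for walk in walks:
    print(" -> ".join(walk))
```

Paths sampled this way would typically be passed to a skip-gram style model to learn graph embeddings; with uniform weights the sketch reduces to an ordinary unweighted random walk.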