Chinese Open Relation Extraction For Complex Text

Posted on:2022-12-06

Degree:Master

Type:Thesis

Country:China

Candidate:J H Xiong

GTID:2518306758974579

Subject:Electronics and Communications Engineering

Abstract/Summary:

Open relation extraction can extract relations from a corpus flexibly without a predefined relation vocabulary, and can organize knowledge quickly and effectively. However, the corpora targeted by open relation extraction usually contain a large number of complex texts, on which existing open relation extraction methods perform poorly. The main problems are as follows. First, the sentence structure of such text is complex, so it is difficult to obtain accurate syntactic analysis results to support open relation extraction. Second, entity words in complex texts are usually noun phrases composed of multiple words, which are difficult to identify. Finally, relations in complex texts overlap, which makes it difficult to extract all relational data completely. To address these problems, this thesis proposes two methods, open relation extraction based on long sentence simplification and joint relation extraction based on multi-task learning, to improve the performance of open relation extraction on complex texts. The main research contents are as follows:

(1) An open relation extraction method based on long sentence simplification is proposed. The method first simplifies complex long sentences with a sequence-to-sequence model, and then extracts relations from the simplified sentences using rule templates. During relation extraction, entities are identified by heuristic rules over part-of-speech information, and dedicated extraction rules are then designed for the simplified clauses based on the results of syntactic analysis.

(2) A joint open relation extraction method based on multi-task learning is proposed. The method applies a sequence-to-sequence model directly to complex text. Multi-relational data is first serialized with a special relation sequence representation, and multi-task learning of entity label prediction and relation extraction is then realized through sequence labeling and a special mask mechanism. Finally, the model is guided to generate the entities in the relational data according to the predicted labels.

(3) A knowledge base of the development process of temperature-indicating paint is constructed. First, following the advice of domain experts, a corpus of knowledge about the development process of temperature-indicating paint was collected from reference books and domestic periodicals. Then the open relation extraction methods proposed in this thesis were used to extract relational data about the development of temperature-indicating paint. Finally, the domain knowledge base was constructed and visualized from the extracted relational data and the sorted entry data.
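
The abstract mentions identifying entities with heuristic rules over part-of-speech information, because entity words in complex texts are often multi-word noun phrases. The following is a minimal sketch of one such rule, not the thesis's actual rule set: it merges each maximal run of noun and modifier tokens into a single entity and trims trailing modifiers. The tag names (`n`, `nz`, `a`, `b`, and so on), the function name `merge_noun_phrases`, and the example tokens are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical tag sets, loosely in the style of common Chinese POS taggers.
NOUN_TAGS = {"n", "nr", "ns", "nt", "nz"}   # tags treated as nouns
MODIFIER_TAGS = {"a", "b"}                  # tags allowed before a noun


def merge_noun_phrases(tagged: List[Tuple[str, str]]) -> List[str]:
    """Merge runs of modifier/noun tokens into multi-word entity strings.

    `tagged` is a list of (word, pos_tag) pairs from any segmenter/tagger.
    A run is kept only if, after trimming trailing modifiers, it still
    contains at least one token (which is then guaranteed to be a noun).
    """
    entities: List[str] = []
    run: List[Tuple[str, str]] = []

    def flush() -> None:
        # Drop trailing modifiers so the phrase ends with a noun.
        while run and run[-1][1] not in NOUN_TAGS:
            run.pop()
        if run:
            # Chinese words are joined without spaces.
            entities.append("".join(word for word, _ in run))
        run.clear()

    for word, tag in tagged:
        if tag in NOUN_TAGS or tag in MODIFIER_TAGS:
            run.append((word, tag))
        else:
            flush()  # any other tag ends the current candidate phrase
    flush()
    return entities


if __name__ == "__main__":
    # "temperature-indicating paint measures turbine blade temperature"
    sentence = [("示温", "n"), ("涂料", "n"), ("测量", "v"),
                ("涡轮", "n"), ("叶片", "n"), ("温度", "n")]
    print(merge_noun_phrases(sentence))
```

A rule of this kind only groups adjacent tokens; deciding which pairs of entities are related, and by which relation word, would still depend on the syntactic analysis and extraction templates that the abstract describes.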

Keywords/Search Tags:

Open Relation Extraction, Long Sentence Simplification, Deep Learning, Sequence-to-Sequence Model, Multi-Task Learning

Related items

1	Improving Sentence Simplification Models Based On Sequence To Sequence Model
2	Research On Deep Learning Algorithm For Sequence Data
3	Joint Entity Recognition And Relationship Extraction Model Based On Deep Learning
4	Research On Emotional Dialogue Generation Model Based On Deep Learning
5	Deep Sequence Learning Based Open-Domain Dialogue Generation
6	Research On Deep Keyword Generation Method Integrating Auxiliary Information
7	Research On Entity Relationship Extraction Algorithm Based On Deep Learning
8	Research On Chinese Lip Reading Recognition Based On Deep Learning
9	Research On Personalized Recommendation Method Of Learning Resources Based On Behavior Sequence
10	Study On Online Learning Text Feature Extraction Based On Sequence Model And Graph Convolution Model