Research On Text Cross-language Information Retrieval Technology Based On Conceptual Graph

Posted on: 2020-07-11
Degree: Master
Type: Thesis
Country: China
Candidate: Y L Hu
Full Text: PDF
GTID: 2428330575961961
Subject: Computer Science and Technology
Abstract/Summary:
Traditional cross-language information retrieval (CLIR) methods rely mainly on translation: the source text is translated, and retrieval is then carried out in the other language. In recent years, semantic-based text processing methods have performed well in many areas of natural language processing. This thesis therefore studies a semantic approach to cross-language text retrieval based on text conceptual graphs. The work has two parts: the construction of bilingual conceptual graphs, and the vectorized representation and retrieval of those graphs.

Conceptual graph construction formalizes the full text of a document, preserving its important information while greatly reducing its size. First, an abstractive (generative) summarization model, built on an LSTM network with an attention mechanism, automatically summarizes long texts. This model acts as a preliminary filter for the important concepts and relations in the full text. The concepts and relations in the summary are then briefly annotated, and edges between concepts are established from the relations. Less important relations are eliminated through edge expansion and fusion, while indirect relations are introduced and important relations are retained, which yields the topological structure among the concepts.

The representation and retrieval part embeds each conceptual graph in a vector space to generate a graph-level label that supports similarity retrieval. Because both the structure and the content of the graphs are embedded, similar cross-lingual conceptual graphs remain similar after embedding. The thesis proposes CG-CLIR, a cross-language information retrieval framework for conceptual graphs that integrates the relationship information of context nodes with the structural information of the graph. With Skip-gram and CBOW providing semantic support, a random walk based on the Gumbel distribution is combined with an LSTM network to produce the semantic representation of bilingual conceptual graphs. A fully connected layer then extracts a higher-order semantic representation, and the model outputs a similarity score between conceptual graphs to serve the retrieval request.

Separate experiments on conceptual graph generation and on cross-language retrieval of conceptual graphs are used to verify the feasibility and advantages of the approach. The experiments show that conceptual graph construction based on relation fusion and CG-CLIR are both effective on text. The cross-language retrieval results reported in the thesis are better than those of traditional CLIR and ontology-based information retrieval.
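The abstract mentions a random walk based on the Gumbel distribution but does not spell out the procedure. One plausible reading is the Gumbel-max trick, in which the next node is chosen by adding Gumbel noise to the log edge weights and taking the argmax. The minimal sketch below illustrates that idea only; the toy graph, its edge weights, and the function names are hypothetical and are not taken from the thesis.

```python
import math
import random

# Hypothetical toy conceptual graph: concept -> {neighbor concept: edge weight}.
# Every node has at least one neighbor, so the walk never reaches a dead end.
GRAPH = {
    "retrieval": {"query": 2.0, "document": 1.0},
    "query": {"retrieval": 2.0, "translation": 1.0},
    "document": {"retrieval": 1.0, "summary": 1.0},
    "translation": {"query": 1.0},
    "summary": {"document": 1.0},
}


def gumbel_noise(rng):
    """Draw one sample from the standard Gumbel(0, 1) distribution."""
    u = rng.random()
    # Clamp away from 0 and 1 so both logarithms stay finite.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    return -math.log(-math.log(u))


def gumbel_max_step(graph, node, rng):
    """Pick the next node via the Gumbel-max trick.

    Taking the argmax of log(weight) + Gumbel noise samples a neighbor
    with probability proportional to its edge weight.
    """
    neighbors = graph[node]
    return max(neighbors, key=lambda n: math.log(neighbors[n]) + gumbel_noise(rng))


def random_walk(graph, start, length, seed=0):
    """Return a walk of `length` nodes starting at `start` (assumed design)."""
    rng = random.Random(seed)
    walk = [start]
    while len(walk) < length:
        walk.append(gumbel_max_step(graph, walk[-1], rng))
    return walk


if __name__ == "__main__":
    print(random_walk(GRAPH, "retrieval", 6, seed=1))
```

In a pipeline like the one described, node sequences of this kind could be mapped to word vectors (for example from Skip-gram or CBOW) and fed to a sequence model. That wiring is an assumption here and is left out of the sketch.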
Keywords/Search Tags: Conceptual graph, Cross-language Information Retrieval, Semantic representation, Similarity, Semantic search