
Cross-language Information Retrieval Method Based On Multi-task Learning

Posted on: 2023-09-19
Degree: Master
Type: Thesis
Country: China
Candidate: J Y Dai
Full Text: PDF
GTID: 2568307079488314
Subject: Software engineering

Abstract/Summary:
Cross-language information retrieval technology, which enables users to retrieve information written in a foreign language directly in their native language without learning that language, is of great relevance in this era of globalization and has therefore received considerable attention from researchers. Thanks to the development of neural network technology in recent years, neural retrieval models have achieved promising results in cross-language information retrieval. However, differences in grammar and vocabulary among languages in cross-lingual environments impair the feature extraction ability of neural retrieval models, limiting the prospects of cross-lingual neural retrieval methods. How to improve the feature-capturing ability of neural retrieval models in cross-language environments has therefore become a research hotspot. Existing cross-lingual neural retrieval methods usually unify the languages of queries and documents through translation techniques or semantic models before inputting them into neural retrieval models to calculate relevance scores. Most of these approaches use single-task learning, i.e., the neural networks are trained only for the retrieval objective, so the models focus on capturing the interaction information and matching relationships between queries and documents while ignoring their semantic features at other levels. Studies have shown that these semantic features are also important for the retrieval task, and that ignoring them degrades the performance of the retrieval model. To address these issues, this thesis makes the following two contributions:

1. We propose a cross-language information retrieval method based on soft parameter sharing multi-task learning. The method uses an interaction-based neural retrieval model as the primary model and a semantic-based text classification model as the auxiliary model, and exchanges hidden vectors between the feature extraction layers of the two models for soft parameter sharing multi-task learning. This enables the neural retrieval model to capture interaction information while learning to extract semantic information through the text classification task, thus improving its representation capability and ultimately the quality of retrieval. Experiments on four language pairs show that our approach significantly improves the performance of the cross-lingual neural retrieval model.

2. To mitigate the extraneous semantics introduced by the external corpus of the auxiliary task, we propose a cross-lingual information retrieval method based on hard parameter sharing multi-task learning. The method first transforms the retrieval-task corpus so that it can be used for both the classification and retrieval tasks, eliminating the irrelevant information introduced by a separate classification corpus; the two tasks then share the same word feature extraction layer for multi-task learning; this shared layer extracts text representations that are fed into the semantic-based text classification model and the interaction-based neural retrieval model to 
perform the two tasks, respectively. Experimental results show that our approach outperforms most current cross-language retrieval methods.
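As an illustrative sketch only (not the thesis's actual architecture), the soft parameter sharing idea of contribution 1 — each task keeps its own feature extraction layer, but the layers exchange hidden vectors during the forward pass — can be expressed with NumPy. All dimensions, weights, and the mixing matrix `alpha` here are made up for demonstration:

```python
import numpy as np

rng = np.random.default_rng(1)
HID = 8  # hypothetical hidden size

# Each model keeps its OWN feature-extraction parameters...
W_ret = rng.normal(size=(HID, HID))  # retrieval model's layer
W_cls = rng.normal(size=(HID, HID))  # classification model's layer

# ...while a (normally learnable) mixing matrix exchanges hidden
# vectors between the two layers, in the spirit of soft sharing.
alpha = np.array([[0.9, 0.1],   # how much each task keeps of its own
                  [0.1, 0.9]])  # output vs. borrows from the other task

def soft_shared_layer(h_ret, h_cls):
    """One feature-extraction step with hidden-vector exchange."""
    z_ret = np.tanh(h_ret @ W_ret)
    z_cls = np.tanh(h_cls @ W_cls)
    # Exchange: each task's next hidden state mixes both tasks' outputs.
    mixed_ret = alpha[0, 0] * z_ret + alpha[0, 1] * z_cls
    mixed_cls = alpha[1, 0] * z_ret + alpha[1, 1] * z_cls
    return mixed_ret, mixed_cls

h_ret = rng.normal(size=HID)  # retrieval-side hidden vector
h_cls = rng.normal(size=HID)  # classification-side hidden vector
out_ret, out_cls = soft_shared_layer(h_ret, h_cls)
print(out_ret.shape, out_cls.shape)  # both remain (HID,)
```

In a trained system the mixing weights would be learned jointly with both models, letting the retrieval branch absorb semantic features from the classification branch.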
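By contrast, hard parameter sharing as in contribution 2 gives both tasks literally the same word feature extraction layer, with separate task heads on top. The following NumPy sketch is again hypothetical — vocabulary size, pooling choice, and head shapes are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, EMB, HID, N_CLASSES = 100, 16, 8, 4  # made-up dimensions

# Hard parameter sharing: ONE word-feature extraction layer
# (embedding + projection) used by BOTH tasks.
shared_emb = rng.normal(size=(VOCAB, EMB))
shared_proj = rng.normal(size=(EMB, HID))

def shared_features(token_ids):
    """Shared layer: embed tokens, project, mean-pool to one vector."""
    vecs = np.tanh(shared_emb[token_ids] @ shared_proj)  # (len, HID)
    return vecs.mean(axis=0)                             # (HID,)

# Task-specific heads on top of the shared representation.
cls_head = rng.normal(size=(HID, N_CLASSES))  # text classification
ret_head = rng.normal(size=2 * HID)           # query-document relevance

def classify(doc_ids):
    """Classification task: predict a class for a document."""
    return int(np.argmax(shared_features(doc_ids) @ cls_head))

def relevance(query_ids, doc_ids):
    """Retrieval task: score a query-document pair."""
    pair = np.concatenate([shared_features(query_ids),
                           shared_features(doc_ids)])
    return float(pair @ ret_head)

query = rng.integers(0, VOCAB, size=5)
doc = rng.integers(0, VOCAB, size=20)
print(classify(doc), relevance(query, doc))
```

Because both heads backpropagate through the same shared layer during joint training, the word features are shaped by both objectives at once, without importing an external classification corpus.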
Keywords/Search Tags: Information retrieval, multi-task learning, cross-language information retrieval, neural retrieval model, external corpus