Artificial intelligence is commonly divided into three levels: computational intelligence, perceptual intelligence, and cognitive intelligence. Natural language processing is one of the key research areas of cognitive intelligence. In recent years, large-scale pre-trained language models, driven by massive amounts of data, have achieved surprisingly strong semantic representation performance and gradually overturned established perceptions of what language models can do. Among these developments, multilingual semantic representation learning and its applications have received particular attention because of their linguistic and geographical universality. There are more than 5,000 languages in the world, yet only about a dozen are spoken by more than 100 million people each. Breaking the language barrier through cross-lingual information transfer or shared multilingual semantic representations has therefore become a popular research topic.

At present, when large-scale multilingual pre-trained models are applied to cross-lingual downstream tasks, they often suffer from under-utilization of the pre-trained model, a lack of supervision for cross-lingual alignment, and difficulty in incorporating specific external knowledge bases. These problems degrade performance in specific scenarios and limit cross-lingual transfer and multilingual representation quality. This thesis focuses on multilingual systems built on pre-trained models and carries out research on model structure exploration, improved pre-training methods, and the introduction of external knowledge bases. Specifically:

First, we study double-layer feature aggregation based on a multilingual pre-trained model. The mainstream approach to cross-lingual downstream tasks uses only the output of the last Transformer layer of the multilingual pre-trained model mBERT as the representation of linguistic information. We explore the complementarity between the lower layers and the last layer of mBERT and propose an attention-based feature aggregation module that fuses the information contained in different layers. The approach improves performance on four zero-shot cross-lingual transfer benchmark tasks and empirically explores the interpretability of mBERT.

Second, we study pre-training methods for multilingual models based on multi-level contrastive learning. Multilingual models are currently pre-trained on massive multilingual data with basic contextual word modeling objectives, which only encourage implicit cross-lingual alignment in the semantic vector space. We propose a cross-lingual pre-training approach based on multi-level contrastive learning, using parallel corpora and bilingual dictionaries to construct explicit sentence-level and word-level alignment supervision. The method significantly improves the model's cross-lingual representations and yields performance gains on six zero-shot cross-lingual transfer benchmark tasks.
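To make the sentence-level part of this objective concrete, the following PyTorch sketch shows one plausible instantiation of explicit cross-lingual alignment over a batch of parallel sentence pairs: a symmetric InfoNCE loss that pulls translations together and pushes non-parallel sentences apart. The mean pooling, temperature, and function name are illustrative assumptions rather than the thesis's actual implementation; the word-level objective over bilingual dictionaries would add an analogous term over aligned token embeddings.

    import torch
    import torch.nn.functional as F

    def sentence_contrastive_loss(src_hidden, tgt_hidden, src_mask, tgt_mask,
                                  temperature=0.05):
        """src_hidden/tgt_hidden: (batch, seq_len, dim) encoder outputs of a
        parallel batch; translation pairs share the same batch index."""
        # Mean-pool over non-padding tokens to obtain sentence embeddings.
        def pool(hidden, mask):
            mask = mask.unsqueeze(-1).float()
            return (hidden * mask).sum(1) / mask.sum(1).clamp(min=1e-9)

        src = F.normalize(pool(src_hidden, src_mask), dim=-1)   # (batch, dim)
        tgt = F.normalize(pool(tgt_hidden, tgt_mask), dim=-1)   # (batch, dim)

        # Similarity of every source sentence to every target sentence;
        # the diagonal holds the true translation pairs.
        logits = src @ tgt.t() / temperature                    # (batch, batch)
        labels = torch.arange(src.size(0), device=src.device)

        # Symmetric InfoNCE: align parallel sentences, separate the rest.
        return 0.5 * (F.cross_entropy(logits, labels) +
                      F.cross_entropy(logits.t(), labels))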
Finally, we study multilingual complex named entity recognition based on a multilingual gazetteer. Most multilingual named entity recognition systems have difficulty distinguishing long and difficult entities. We construct a multilingual gazetteer containing seven million named entities across eleven languages from Wikipedia and fuse it with a multilingual pre-trained model at the feature level. The resulting model can effectively discriminate the boundaries of long and difficult named entities in low-context environments and achieves a significant improvement in recognition performance.
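The abstract states only that the gazetteer is fused with the pre-trained model at the feature level; the sketch below shows one common way such fusion can be realized, concatenating per-token gazetteer-match embeddings with the encoder's hidden states before a token classification layer. The class name, the HuggingFace-style encoder interface (last_hidden_state), and all dimensions are assumptions for illustration, not the thesis's implementation.

    import torch
    import torch.nn as nn

    class GazetteerFusionTagger(nn.Module):
        def __init__(self, encoder, hidden_dim, num_gaz_types,
                     gaz_dim=64, num_labels=9):
            super().__init__()
            self.encoder = encoder                    # multilingual Transformer encoder
            # One embedding per gazetteer match type (e.g. PER/LOC/ORG or "no match").
            self.gaz_embed = nn.Embedding(num_gaz_types, gaz_dim)
            self.classifier = nn.Linear(hidden_dim + gaz_dim, num_labels)

        def forward(self, input_ids, attention_mask, gaz_type_ids):
            """gaz_type_ids: (batch, seq_len) type of the gazetteer entry each
            token was matched to, or 0 if the token matched nothing."""
            hidden = self.encoder(input_ids=input_ids,
                                  attention_mask=attention_mask).last_hidden_state
            fused = torch.cat([hidden, self.gaz_embed(gaz_type_ids)], dim=-1)
            return self.classifier(fused)             # per-token label logits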