| With the rapid development of the Semantic Web, its scale and complexity are growing sharply. Semantic Web data composed of tens of billions of Resource Description Framework (RDF) triples is continuously published on the World Wide Web, and these data can be represented as knowledge graphs. Meanwhile, as international communication deepens, English, the common language of globalization, has become the basic medium of international information dissemination. However, non-native English speakers are less familiar with English, which can lead them to obtain inaccurate information from English sources. Cross-language retrieval research establishes a connection between different languages, making it convenient for people to retrieve foreign-language information in their mother tongue. With the rise of green energy, new energy vehicles have been vigorously developed. However, foreign automobile industries are far ahead of domestic ones in the concepts and technology of new energy vehicles, and domestic manufacturers urgently need to analyze the problems in order to narrow the gap. As most foreign information is in English, domestic employees who have not mastered English well enough find it difficult to collect the latest and most accurate foreign information on new energy vehicles. The resulting information asymmetry between home and abroad makes it difficult for them to formulate development strategies that lead other vehicle enterprises. To enable employees to obtain the latest news at home and abroad quickly and accurately, research on cross-language pattern knowledge matching between Chinese and English is of fundamental importance. It is of great significance to use a cross-language concept matching method to study cross-language retrieval and apply it to the field of new energy vehicles. In the era of big data, most information is presented in the form of network
data. First, starting from web page data, this paper adopts a feature extraction method common to both Chinese and English and conducts pattern knowledge mining to obtain useful information from web page knowledge; it then completes cross-language concept matching, laying the foundation for cross-language retrieval. Second, the data is processed according to the data characteristics of new energy vehicles, and the algorithm is adjusted so that it can be applied to the field of new energy vehicles. Finally, a cross-language retrieval system for new energy vehicles is implemented, which makes it convenient for the managers of automobile enterprises to steer the company's future development in the field of new energy vehicles and to formulate the most appropriate development strategy. The main research work of this paper is as follows: (1) Research on pattern mining methods. Pattern knowledge mining extracts and classifies the existing data in a data set or database, and at the same time classifies and organizes the newly obtained knowledge. This paper proposes an unsupervised classification algorithm based on Self-Training, which divides the relationships between concepts into three types: Equal, Belonging and Irrelevant. The algorithm, named SA-KNN-SVM, combines an improved adaptive K-Nearest Neighbor (KNN) classification algorithm with a Support Vector Machine (SVM) classification algorithm. Comparison with other three-class classification algorithms on two public data sets shows that this algorithm achieves higher classification accuracy. (2) Research on cross-language concept matching methods. Cross-language concept matching is mainly based on machine translation and the construction of bilingual topic models, which match texts in different languages to achieve cross-language retrieval. This paper proposes an unsupervised cross-language concept model based on neural machine translation. Language
understanding ability and cross-language alignment accuracy of the model are improved through pre-training and a copy mechanism, respectively. Experimental results and comparisons with other models on public data sets show that this method can effectively improve the accuracy of cross-language concept matching. (3) Construction of a cross-language retrieval system for new energy vehicles. The corresponding parameters of the algorithm are adjusted according to the data characteristics of new energy vehicles so that the algorithm can be better applied to new energy vehicle data, and an information display and cross-language retrieval system for the field of new energy vehicles is designed and developed. The system mainly includes a basic information display module, a detailed configuration comparison module, a Chinese and English news display module, and a domestic and foreign cross-language retrieval module, all for new energy vehicles. Through this system, users can directly obtain domestic and foreign information on new energy vehicles, which alleviates the information-island problem and provides a more intuitive reference for the direction and pricing of new models developed by enterprises. |
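The self-training idea behind the classification step in item (1) can be illustrated with a minimal sketch. This is not the SA-KNN-SVM algorithm itself: the adaptive choice of K and the SVM stage are omitted, and a plain k-nearest-neighbor vote stands in as the base classifier. The function names (`knn_predict`, `self_train`), the confidence threshold, and the two-dimensional feature vectors are all hypothetical and chosen only for illustration; only the three relation labels come from the abstract.

```python
import math
from collections import Counter


def knn_predict(labeled, x, k=3):
    """Majority vote over the k nearest labeled points; returns (label, confidence)."""
    nearest = sorted(labeled, key=lambda item: math.dist(item[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    label, count = votes.most_common(1)[0]
    return label, count / len(nearest)


def self_train(labeled, unlabeled, k=3, threshold=1.0, max_rounds=10):
    """Self-training loop: adopt pseudo-labels whose vote share reaches the threshold.

    Assumption: confidence is the share of neighbors that agree, a simplification
    of whatever confidence measure the actual algorithm uses.
    """
    labeled = list(labeled)
    pending = list(unlabeled)
    for _ in range(max_rounds):
        accepted, remaining = [], []
        for x in pending:
            label, conf = knn_predict(labeled, x, k)
            (accepted if conf >= threshold else remaining).append((x, label))
        if not accepted:
            break  # no confident pseudo-labels left, stop early
        labeled.extend(accepted)
        pending = [x for x, _ in remaining]
    return labeled, pending


# Hypothetical seed set: feature vectors for concept pairs with known relations.
seed = [
    ((0.0, 0.0), "Equal"), ((0.1, 0.2), "Equal"), ((0.2, 0.1), "Equal"),
    ((5.0, 5.0), "Belonging"), ((5.1, 4.9), "Belonging"), ((4.9, 5.2), "Belonging"),
    ((10.0, 0.0), "Irrelevant"), ((9.8, 0.2), "Irrelevant"), ((10.1, 0.3), "Irrelevant"),
]
unlabeled = [(0.3, 0.3), (5.2, 5.1), (9.9, 0.1)]

labeled, pending = self_train(seed, unlabeled)
```

In this toy setting each unlabeled point sits close to one seed cluster, so all three receive a pseudo-label in the first round. With real concept-pair features the threshold would control how cautiously the labeled set grows.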