| Global climate change and outbreaks of plant diseases are major threats to rice cultivation. The lack of scientific and technological information services for farmers hampers rice planting, pest control, and related work. Solving rice production problems in a timely, accurate, and efficient manner is key to raising rice yield and quality. In the Q&A community of the China Agricultural Technology Promotion Information Platform, more than 1,000 rice-related Chinese questions are added every day. Because text data are high-dimensional and sparse and the Chinese language is itself complex, many questions appear that have the same semantics but different wording. Having experts answer semantically identical questions repeatedly consumes considerable manpower and material resources, so it is important to detect such questions quickly. To address the technical problem of matching semantically identical questions in the agricultural question answering community, this study uses deep learning to build a matching model for rice knowledge questions with the same semantics, so as to improve the overall performance of semantic matching. In this study, rice question-and-answer data were collected and organized; Word2Vec was used for text vectorization and compared with One-hot and TF-IDF; and a BiLSTM-CNN model based on the similarity calculation of the Siamese network was constructed and compared with other text matching models, to achieve efficient matching of rice knowledge questions. The specific research content and results are as follows: (1) First, the jieba word segmentation tool was applied in accurate mode to segment the rice question data. The text was then vectorized with One-hot, TF-IDF, and Word2Vec and analyzed at both the word-vector and character-vector level, and it was concluded that the Word2Vec vectorized
representation results are the best among the three word vector models. In addition, the word vectors trained by the CBOW model in Word2Vec perform better, increasing the accuracy rate by 3%. (2) Using the Siamese neural network as the basic model framework, three text similarity models, LSTM, BiLSTM, and BiLSTM-Attention, were built and analyzed. On the rice question pair dataset, the BiLSTM-Attention model converged the fastest of the three, reaching convergence at epoch 12. Its F1 value on the dataset is 16.28% and 10.78% higher than those of the LSTM model and the BiLSTM model, respectively. Taking the training and validation results on the datasets together, the BiLSTM-Attention model has the highest accuracy of the three. This study proposes a similarity matching model for rice questions based on Sim-BiLSTM-CNN and compares it, on the dataset constructed in this study, with the LSTM, BiLSTM, and BiLSTM-Attention models built on the Siamese network framework. The accuracy and F1 value of the BiLSTM-CNN model are higher than those of the other text matching models, reaching 98.2% and 96.68%, respectively. Compared with the long short-term memory network (LSTM) model, the bidirectional long short-term memory network (BiLSTM) model, and the BiLSTM-Attention model, its accuracy is higher by 21.16%, 11.25%, and 2.181%, respectively. This addresses the problem of low efficiency in detecting semantically identical rice-related questions in the question answering community. |
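The TF-IDF baseline mentioned in the abstract can be illustrated with a minimal sketch. This is not the study's Sim-BiLSTM-CNN model or its code: it assumes the questions have already been segmented into tokens (for example by jieba), uses made-up English tokens in place of the real Chinese data, and the helper names `tfidf_vectors` and `cosine` are hypothetical. It only shows how two segmented questions can be turned into TF-IDF vectors and scored with cosine similarity.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: token -> weight) per tokenized question."""
    n_docs = len(docs)
    # Document frequency: in how many questions each token occurs.
    df = Counter(tok for doc in docs for tok in set(doc))
    # Smoothed IDF, so tokens that occur in every question keep a non-zero weight.
    idf = {tok: math.log((1 + n_docs) / (1 + d)) + 1.0 for tok, d in df.items()}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({tok: (cnt / len(doc)) * idf[tok] for tok, cnt in tf.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(tok, 0.0) for tok, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical, already segmented questions (placeholders for the real data).
questions = [
    ["rice", "leaf", "yellow", "how", "treat"],
    ["how", "treat", "yellow", "rice", "leaf"],   # same tokens, different order
    ["rice", "blast", "which", "fungicide"],      # different question
]

vecs = tfidf_vectors(questions)
sim_same = cosine(vecs[0], vecs[1])
sim_diff = cosine(vecs[0], vecs[2])
print(f"same-meaning pair: {sim_same:.3f}, different pair: {sim_diff:.3f}")
```

Because this representation is a bag of words, it scores the first pair as identical only when the tokens coincide; questions that share a meaning but use different words would score low, which is the gap a learned representation such as Word2Vec with a Siamese network is meant to close.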