As big data technology penetrates agricultural production, agricultural data has grown explosively. The agricultural science and technology information service platform is a professional, comprehensive service platform that provides agricultural technology question answering, expert guidance, online learning, delivery of results, and technical exchange. However, because agricultural text data are high-dimensional, sparse, and highly specialized, managing them with manual feature screening and shallow learning models is not ideal, and it is difficult to mine the deep semantic features of agricultural texts to extract high-quality question-answer pairs. Building a question answering system on top of high-quality question-answer pairs generally involves three parts: semantic analysis of the user's question, answer extraction, and answer generation. At the same time, rice is one of the essential food crops and is widely planted in China, and pests and diseases have long been among the main factors affecting its quality and yield. How to quickly and accurately provide methods for managing rice diseases and insect pests during production is a key question.

Therefore, taking the rice-related question-and-answer community as an example, this paper addresses four critical technical problems faced by agricultural question-and-answer communities: (1) it is challenging to classify agricultural questions accurately and automatically; (2) it is difficult for the Q&A community to accurately mine questions with the same semantics, which leads to redundant platform knowledge in some circumstances; (3) it is difficult for the Q&A community to automatically and accurately identify the correct answer among the candidate answers; and (4) the accuracy of existing agricultural question answering models is low, making it hard to meet users' need for real-time access to answers during production. Deep learning and natural
language processing techniques were used to construct semantic models for four tasks: question classification, question semantic similarity, answer selection, and answer generation. To improve the overall performance of the rice-related question answering system, we mined high-quality rice-related question-answer pairs from the question answering community.

Firstly, to achieve fast, automatic classification of rice-related question data in the rice-related question answering community, a rice-related question text classification method based on an attention mechanism and a densely connected convolutional neural network was proposed. According to the characteristics of rice text, the Word2vec method, combined with an agricultural word segmentation dictionary, was used to process, analyze, and vectorize the text data. Word2vec can effectively address the high dimensionality and sparsity of the text. Dense connections between the upstream and downstream convolution blocks of the convolutional neural network strengthened the transmission and flow of text features between blocks, so that the model could automatically extract and learn text features. Combined with the attention mechanism, the keyword features in the text were fully reflected, giving the model more accurate text feature extraction and thus higher classification accuracy. The test results showed that the attention-based Dense CNN rice-related question classification model can improve the utilization of text features, reduce feature loss, and classify rice question texts automatically, quickly, and accurately. The classification precision and F1 score were 95.6% and 94.9%, respectively. Compared with seven other neural network question classification methods, the classification performance was significantly improved.

Secondly, to allow fast and automatic detection of the
semantically equivalent agriculture-related questions, we proposed a new method based on BERT-Coattention-Dense GRU (gated recurrent unit). According to the characteristics of agricultural questions, we applied the 12-layer Chinese BERT model to process and analyze the text data and compared it with the Word2vec, GloVe, and TF-IDF methods; BERT effectively addressed the high dimensionality and sparsity of agriculture-related text. Each network layer used the concatenation of the features and the hidden features of all previous recurrent layers. To limit the growth in feature vector size caused by dense concatenation, an autoencoder was applied after the dense concatenation. The experimental results show that agriculture-related question similarity matching based on BERT-Coattention-Dense BiGRU can improve the utilization of text features, reduce feature loss, and achieve fast and accurate similarity matching on the agriculture-related question dataset. The precision and F1 score of the proposed model were 97.2% and 97.6%, respectively. Compared with six other question similarity matching models, the proposed method achieved state-of-the-art performance on our agriculture-related question dataset.

Thirdly, to allow the intelligent detection of correct answers in the rice-related question-and-answer communities of the "China Agricultural Technology Extension Information Platform," we propose an answer selection model with dynamic attention and multi-perspective matching (DAMM). According to the characteristics of the rice-related dataset, the 12-layer Chinese BERT pre-trained model was employed to vectorize the text data. It was concluded that BERT could effectively address the high dimensionality and sparsity of agricultural text, as well as polysemy, in which a word has different meanings in different contexts. Dynamic attention with filtering strategies was used in the attention layer to effectively remove noise from the sentences. The sentence representation of
question-and-answer sentences was obtained. Next, two matching strategies (full matching and attentive matching) were introduced in the matching layer to complete the interaction between sentence vectors. Then, a bidirectional gated recurrent unit (BiGRU) network spliced the sentence vectors obtained from the matching layer. Lastly, a classifier was employed to calculate the similarity of the spliced vectors, yielding the semantic correlation between question and answer sentences. The experimental results show that, compared with six other answer selection models, DAMM performed best on the rice-related answer selection dataset, reaching a MAP (mean average precision) of 85.7% and an MRR (mean reciprocal rank) of 88.9%, a new state of the art on this dataset.

Finally, a network based on Attention-ResLSTM-seq2seq was used to construct the rice question answering model. First, the text representation of rice question-answer pairs was obtained using a GPT pre-trained model based on a 12-layer Transformer. Then, ResLSTM (residual long short-term memory) was used to extract text features in the encoder and decoder, and the output projection matrix and output gate of the LSTM were used to control the spatial information flow. When the network reaches its optimal state, it retains only the identity mapping of the input vector, which effectively reduces the network parameters and improves network performance. Next, an attention mechanism was connected between the encoder and the decoder to effectively strengthen the weight of the keyword feature information in the question. Lastly, the softmax function was used to calculate the final probability distribution during decoding. The results showed that the Attention-ResLSTM-seq2seq model achieved the highest BLEU and ROUGE scores, 35.3% and 37.8%, respectively, compared with
the other six rice generative question answering models.
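
For readers unfamiliar with the two ranking metrics reported for answer selection, the following is a minimal sketch of how MAP and MRR are conventionally computed from ranked candidate answers. It is illustrative only and is not the evaluation code used in this work; the function names and the toy relevance lists are assumptions introduced here.

```python
from typing import List


def average_precision(relevance: List[int]) -> float:
    """Average precision for one question.

    relevance[i] is 1 if the answer ranked at position i + 1 is correct, else 0.
    """
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank  # precision at this correct answer
    return precision_sum / hits if hits else 0.0


def reciprocal_rank(relevance: List[int]) -> float:
    """Reciprocal of the rank of the first correct answer (0 if none)."""
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            return 1.0 / rank
    return 0.0


def mean_average_precision(rankings: List[List[int]]) -> float:
    """MAP: average precision averaged over all questions."""
    return sum(average_precision(r) for r in rankings) / len(rankings)


def mean_reciprocal_rank(rankings: List[List[int]]) -> float:
    """MRR: reciprocal rank averaged over all questions."""
    return sum(reciprocal_rank(r) for r in rankings) / len(rankings)


# Hypothetical toy data: two questions with three ranked candidates each.
toy_rankings = [[1, 0, 1], [0, 1, 0]]
map_score = mean_average_precision(toy_rankings)
mrr_score = mean_reciprocal_rank(toy_rankings)
```

MAP rewards ranking every correct answer highly, whereas MRR depends only on the position of the first correct answer, which is why the two scores can differ on the same model output.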