The spread of computer networks, multimedia, computer-aided teaching, online courses and related technologies has led to the emergence of various intelligent computer marking techniques. At present, most research on subjective questions starts from text similarity and analyzes syntax, semantics, structure and so on. However, sentence structure, semantic direction [1] and non-meaningful adverbs are often ignored, so the results do not meet practical demand. To address this, this paper proposes a scoring algorithm that fuses topic similarity and sentence similarity. It first judges whether the two texts are broadly consistent in topic, and then analyzes the similarity of each sentence in the text in detail. Sentence similarity is computed by one of two algorithms, chosen according to the main components of the sentence: one based on TF-IDF and one based on triples.

This paper also focuses on sentence-based subjective scoring. For this part, the following work is carried out:
(1) Analyze the semantic information of words, judge their semantic direction, and analyze the words modified by words that carry semantic direction.
(2) A complex sentence contains multiple subject-predicate components. If a complex sentence is compared directly with a simple sentence, the resulting similarity is low. Therefore, the complex sentence is split into multiple simple sentences according to its subject-predicate components, and similarity is compared between simple sentences.
(3) Sentences are composed of words, so word similarity must be calculated first. This paper trains a word2vec model and obtains word similarity from context.
(4) The subject and predicate are the main components of a sentence. If two sentences are highly similar in these components, they are similar in topic, and only the words that modify the main components differ. However, when two sentences are compared, their components are very likely to be inconsistent. This paper therefore distinguishes two cases: when the main components are consistent, the subject-predicate similarity is calculated first, and then the constituents that modify the subject and predicate are built into triples whose similarity is calculated; when the main components are inconsistent, a TF-IDF algorithm that incorporates semantic information is used.

Finally, this paper combines the topic model and the sentence similarity model to optimize the scoring algorithm, and applies it to an online subjective examination system. The scores are compared with those of the word2vec algorithm, the improved TF-IDF algorithm and manual scoring in terms of accuracy, error rate and F value. The accuracy and F value of the proposed algorithm are higher than those of the traditional algorithms, and its error rate is low. This indicates that the algorithm has research value and can provide a basis for follow-up research.
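
The two-branch sentence comparison and the topic check described above can be illustrated with a minimal Python sketch. It is not the paper's implementation, and everything in it is an assumption made for illustration: the dictionary layout of a parsed sentence (`subject`, `predicate`, `triples`, `tokens`), the weight `alpha`, the `topic_threshold`, the smoothed IDF formula, and the `exact_match` function that stands in for word2vec similarity. "Consistent main components" is read here as both sentences having a subject and a predicate.

```python
import math
from collections import Counter


def tfidf_cosine(tokens_a, tokens_b, corpus):
    """Cosine similarity of two token lists under smoothed TF-IDF weights."""
    n = len(corpus)
    df = Counter(t for doc in corpus for t in set(doc))

    def vec(tokens):
        tf = Counter(tokens)
        # Smoothed IDF (an assumption) keeps every weight strictly positive.
        return {t: (c / len(tokens)) * (math.log((1 + n) / (1 + df[t])) + 1)
                for t, c in tf.items()}

    u, v = vec(tokens_a), vec(tokens_b)
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = (math.sqrt(sum(w * w for w in u.values()))
            * math.sqrt(sum(w * w for w in v.values())))
    return dot / norm if norm else 0.0


def exact_match(a, b):
    """Placeholder word similarity; a word2vec cosine would replace this."""
    return 1.0 if a == b else 0.0


def triple_similarity(ref_triples, ans_triples, word_sim):
    """Average, over reference triples, of the best-matching answer triple."""
    if not ref_triples:
        return 1.0  # the reference has no modifiers to match
    if not ans_triples:
        return 0.0

    def pair(r, a):
        # Each triple is assumed to hold exactly three words.
        return sum(word_sim(x, y) for x, y in zip(r, a)) / 3

    return sum(max(pair(r, a) for a in ans_triples)
               for r in ref_triples) / len(ref_triples)


def sentence_similarity(ref, ans, corpus, word_sim=exact_match, alpha=0.5):
    """Use the triple branch when both sentences have a subject and predicate,
    otherwise fall back to TF-IDF cosine similarity."""
    has_core = all(s.get("subject") and s.get("predicate") for s in (ref, ans))
    if has_core:
        core = (word_sim(ref["subject"], ans["subject"])
                + word_sim(ref["predicate"], ans["predicate"])) / 2
        mods = triple_similarity(ref.get("triples", []),
                                 ans.get("triples", []), word_sim)
        return alpha * core + (1 - alpha) * mods
    return tfidf_cosine(ref["tokens"], ans["tokens"], corpus)


def score_answer(ref_sents, ans_sents, topic_sim, full_marks, topic_threshold=0.5):
    """Gate on topic similarity, then average the best match per reference sentence."""
    if topic_sim < topic_threshold or not ref_sents:
        return 0.0
    corpus = [s["tokens"] for s in ref_sents + ans_sents]
    best = [max((sentence_similarity(r, a, corpus) for a in ans_sents), default=0.0)
            for r in ref_sents]
    return full_marks * sum(best) / len(best)
```

In this sketch the topic similarity is passed in as a number, since the abstract does not specify how the topic model produces it. A reference sentence that finds no counterpart in the answer contributes zero, which is one simple way to penalize missing points.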