| With the growing popularity of computers and networks, a wide range of highly shared resources now supports both work and daily life. People obtain and process large amounts of information every day, and how to extract valuable information from massive data has become a hot research issue. Massive data also raises a complementary problem: how can we identify similarities between texts? Text comparison based on sentence similarity measures how similar a target text is to a standard text, using an algorithm built on analysis of sentence form and sentence meaning. The resulting score can be used to judge the degree of similarity in later text comparison and identification tasks. First, this thesis discusses the key issues in commonly used word, sentence, and text similarity calculations, and then analyzes the TF-IDF method based on the vector space model, the text similarity calculation method based on Hamming distance, latent semantic indexing, the text similarity algorithm based on property view, similarity calculation based on semantic understanding, and the similarity algorithm based on HowNet. Second, this thesis improves several of these similarity algorithms, with the aim of obtaining better similarity results. It also presents a computer implementation of the text comparison algorithm based on sentence similarity and validates it on a set of relevant texts. |
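
To make the idea of sentence-level comparison concrete, the following is a minimal sketch of one of the baseline methods named in the abstract: TF-IDF weighting in a vector space model, compared by cosine similarity. It is not the thesis's improved algorithm. The function names, the regex tokenizer, the smoothed IDF formula, and the rule for aggregating sentence scores into a text score (average of each target sentence's best match) are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Split text on sentence-ending punctuation (a simplifying assumption)."""
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase a sentence and extract alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def tfidf_vectors(sentences):
    """Build one sparse TF-IDF vector (term -> weight dict) per sentence."""
    token_lists = [tokenize(s) for s in sentences]
    n = len(token_lists)
    # Document frequency: number of sentences that contain each term.
    df = Counter(term for tokens in token_lists for term in set(tokens))
    vectors = []
    for tokens in token_lists:
        tf = Counter(tokens)
        # Smoothed IDF (assumed variant): log((1 + n) / (1 + df)) + 1 stays positive.
        vectors.append({
            term: (count / len(tokens)) * (math.log((1 + n) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def text_similarity(target, standard):
    """Average, over target sentences, of the best cosine match in the standard text."""
    t_sents = split_sentences(target)
    s_sents = split_sentences(standard)
    if not t_sents or not s_sents:
        return 0.0
    # Weight all sentences together so both texts share the same IDF statistics.
    vectors = tfidf_vectors(t_sents + s_sents)
    t_vecs, s_vecs = vectors[:len(t_sents)], vectors[len(t_sents):]
    best = [max(cosine(tv, sv) for sv in s_vecs) for tv in t_vecs]
    return sum(best) / len(best)
```

Under these assumptions, `text_similarity(target, standard)` returns a score between 0 and 1, where texts with no shared terms score 0 and identical texts score approximately 1. Because this baseline only matches surface terms, it cannot recognize synonyms or paraphrases, which is the gap that the semantic and HowNet-based methods listed in the abstract are meant to address.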