Research On Text Similarity Detection Algorithm Based On Human-Computer Interaction Semantics

Posted on:2024-09-16

Degree:Master

Type:Thesis

Country:China

Candidate:J B Lin

GTID:2568307172981799

Subject:Control Science and Engineering

Abstract/Summary:

Most traditional text similarity methods compare only the surface form of the text. They do not consider the semantic information that words carry in different contexts, so the calculation lacks important semantic information. Later methods use an external knowledge dictionary or an external corpus together with deep learning, which solves these problems to some extent. However, such methods rarely consider the part-of-speech or position information of words in a sentence or text, and they rely on external resources that must be built in advance and are highly complex. In addition, most earlier methods were validated on English datasets but not on Chinese datasets. Given the differences in grammar and sentence structure between English and Chinese, algorithms that perform well on English datasets may not perform as well on Chinese datasets. To address these problems, the main work of this thesis is as follows. First, we propose a keyword similarity calculation method that improves YAKE and integrates deep learning. Keyword extraction is integrated into the text similarity calculation so that the important information in a text is extracted in advance. By adding word span features, the YAKE algorithm can be better applied to keyword extraction in Chinese. Two sets of experiments comparing the proposed method with traditional algorithms demonstrate its feasibility and effectiveness. Second, we propose a text similarity calculation method that integrates multi-level features. According to the different levels of the document, the weight of each word is calculated through syntactic dependency analysis, semantic role identification and keyword score, so that the weight contains semantic and position information. The weights of sentences and paragraphs are then calculated from their position, logical structure and keyword overlap. The vector representation of a document is built up level by level from these weights, and the similarity is then calculated.
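
The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The abstract does not give the word span formula, so `word_span` uses one common definition (the share of the text between a word's first and last occurrence) as an assumption. The function names, the `embeddings` dictionary, and the weight values are likewise hypothetical: in the thesis the word weights come from syntactic dependency analysis, semantic role identification and keyword scores, whereas here they are simply passed in as numbers.

```python
import math

def word_span(tokens, word):
    """Assumed definition: fraction of the text lying between the first
    and last occurrence of `word` (0.0 if the word does not occur)."""
    positions = [i for i, t in enumerate(tokens) if t == word]
    if not positions:
        return 0.0
    return (positions[-1] - positions[0] + 1) / len(tokens)

def weighted_average(vectors, weights, dim):
    """Weighted mean of vectors of length `dim`; zero vector if no weight."""
    total = sum(weights)
    if not vectors or total == 0:
        return [0.0] * dim
    return [sum(w * v[k] for v, w in zip(vectors, weights)) / total
            for k in range(dim)]

def document_vector(sentences, embeddings, word_weight, sentence_weights, dim):
    """Combine word vectors into sentence vectors, then sentence vectors
    into one document vector, using the weights given for each level."""
    sentence_vectors = []
    for tokens in sentences:
        known = [t for t in tokens if t in embeddings]
        vectors = [embeddings[t] for t in known]
        # Words without an explicit weight default to 1.0 (an assumption).
        weights = [word_weight.get(t, 1.0) for t in known]
        sentence_vectors.append(weighted_average(vectors, weights, dim))
    return weighted_average(sentence_vectors, sentence_weights, dim)

def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

if __name__ == "__main__":
    # Toy two-dimensional embeddings, purely for illustration.
    embeddings = {"cat": [1.0, 0.0], "dog": [0.0, 1.0]}
    weights = {"cat": 3.0, "dog": 1.0}
    doc_a = document_vector([["cat", "dog"], ["dog"]], embeddings, weights, [1.0, 1.0], 2)
    doc_b = document_vector([["dog", "cat"]], embeddings, weights, [1.0], 2)
    print(word_span(["cat", "dog", "cat", "bird"], "cat"))
    print(cosine(doc_a, doc_b))
```

A paragraph level could be added in the same way by applying `weighted_average` once more to groups of sentence vectors.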

Related items

1	Research On Text Similarity Calculation Method And Its Application In Financial Field
2	A Study On The Method Of Constructing Bilingual Corpus In Chinese And
3	Research On Graph-based Text Keyword Extraction Integrating Deep Learning
4	Research On Text Keyword Extraction Based On Eye-tracking Data
5	Automatic Extraction Of Keywords And Text Summarization In Text Mining
6	Research Of Answer Ranking Method Based On Weighted Keywords
7	Research On Semantics Independence Deep Learning Methods For Scene Text Recognition
8	Research And Implementation Of Text Mining Technology Based On Public Security Information
9	Research On Similarity Detection Method Of Science And Technology Project Application Text Based On Deep Learning
10	Research And Application Of Text Similarity Algorithm Based On Deep Learning