Short Text Semantic Filtering Technology

Posted on: 2009-02-17
Degree: Master
Type: Thesis
Country: China
Candidate: Z H Tan
GTID: 2208360242989117
Subject: Signal and Information Processing
Abstract/Summary:
With the rapid development of the Internet, people depend on it more and more to search for information. Because most information is stored as text, the demands placed on text information filtering technology keep rising. Traditional text filtering algorithms, however, cannot recognize the semantics of a text: they make their judgement only at the level of structural matching and cannot take context into account, so their filtering performance struggles to meet the requirements of intelligent filtering. Drawing on Chinese information processing, this thesis proposes and implements a semantic filtering algorithm for short text. Its key steps are Chinese word segmentation, part-of-speech tagging, sentence pattern analysis, construction of a semantic frame, and calculation of the similarity between two semantic frames. Word segmentation, part-of-speech tagging, and word sense disambiguation are based on a hierarchical hidden Markov model (HHMM) and support the PKU standard, the 973 standard, and XML output.
Sentence pattern analysis is applied to the short text. Using a library of syntax rules together with the sentence pattern information, key elements such as subject, predicate, object, area, time, and space are identified in the sentence and used to fill the semantic frame. A long-distance matching function and a similarity formula then yield a value representing the similarity of two semantic frames, and this value decides whether the text is filtered. The algorithm performs accurate Chinese word segmentation and part-of-speech tagging, analyzes sentence patterns with due attention to special cases such as sentences with omitted constituents, and then extracts the semantic frame. It reduces the comparison of two semantic frames to a numerical calculation. It also improves the similarity formula by adding an adjustment coefficient and filters according to the maximum similarity value. Experimental results show that, at the level of semantic matching, the algorithm filters more effectively than traditional algorithms. A proxy server with semantic-based content filtering (SemanticFR) is implemented. SemanticFR provides functions such as network traffic monitoring, packet filtering at the network layer, semantic-based filtering at the application layer, and content reproduction.
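The frame comparison described above can be illustrated with a minimal sketch. It is not the thesis's actual formula, which the abstract does not give. The slot names come from the abstract; everything else is an assumption made for illustration: the class name SemanticFrame, the slot weights, exact string equality as the per-slot match, the form of the adjustment coefficient, and the threshold value.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# Slot names taken from the abstract.
SLOTS = ("subject", "predicate", "object", "area", "time", "space")

# Hypothetical slot weights; the abstract does not state any values.
WEIGHTS = {
    "subject": 0.25, "predicate": 0.30, "object": 0.25,
    "area": 0.10, "time": 0.05, "space": 0.05,
}


@dataclass
class SemanticFrame:
    """A semantic frame whose slots are filled by sentence pattern analysis."""
    subject: Optional[str] = None
    predicate: Optional[str] = None
    object: Optional[str] = None
    area: Optional[str] = None
    time: Optional[str] = None
    space: Optional[str] = None


def slot_similarity(a: Optional[str], b: Optional[str]) -> float:
    """Assumed per-slot match: 1.0 for identical fillers, otherwise 0.0."""
    if a is None or b is None:
        return 0.0
    return 1.0 if a == b else 0.0


def frame_similarity(f: SemanticFrame, g: SemanticFrame, adjust: float = 1.0) -> float:
    """Weighted similarity over slots filled in at least one of the two frames.

    `adjust` stands in for the adjustment coefficient mentioned in the
    abstract; its real definition is not given there.
    """
    total = 0.0
    weight_sum = 0.0
    for slot in SLOTS:
        a, b = getattr(f, slot), getattr(g, slot)
        if a is None and b is None:
            continue  # a slot empty in both frames carries no evidence
        weight_sum += WEIGHTS[slot]
        total += WEIGHTS[slot] * slot_similarity(a, b)
    if weight_sum == 0.0:
        return 0.0
    return min(1.0, adjust * total / weight_sum)


def should_filter(frame: SemanticFrame,
                  blocked: Iterable[SemanticFrame],
                  threshold: float = 0.6) -> bool:
    """Filter when the maximum similarity to any blocked frame reaches the threshold."""
    best = max((frame_similarity(frame, b) for b in blocked), default=0.0)
    return best >= threshold
```

In this sketch a text is filtered when its frame is close enough to any frame in a blocked set, which mirrors the abstract's "filter according to the maximum similarity value". A real system would replace the exact-match slot comparison with a word-sense-aware measure.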
Keywords/Search Tags:Short Text Semantic Filtering, Semantic Frame, Similarity Calculate, Polarity Text Filtering