
Research On Quality Evaluation Of Social Short Text Based On AMR

Posted on: 2021-03-27
Degree: Master
Type: Thesis
Country: China
Candidate: Y Zhang
GTID: 2428330629982565
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of the Internet and the spread of its applications, social platforms such as Weibo, WeChat Moments, and QQ News have attracted a growing number of users thanks to their simple, accessible content, convenient and timely publishing, and easy communication between users. They have become an important venue for expressing emotions and ideas. However, the lower publishing threshold has also led to a flood of information. Many users write casually, so word choice, syntactic structure, and content expression vary widely in quality, which makes information acquisition difficult. Manual screening and labeling are unrealistic, so an automated method for assessing and filtering the quality of social short texts is needed.

The popularity of social short texts has produced many grammatically confused and semantically ambiguous sentences on the Internet. To address this, this thesis proposes a social short text quality evaluation algorithm that combines syntactic structure and modified semantics. To support the analysis, the PENMAN tree form of Abstract Meaning Representation (AMR) is used to study the syntactic structural integrity of the text content and the closeness of its modified semantics.

Existing Chinese AMR parsing algorithms have low accuracy and do not consider how the connections between concept nodes affect the parsing result. If all node relations are analyzed in detail, a node may be visited multiple times, which makes it impossible to determine the final parsing operation. Based on the importance of predicates in Chinese syntax, this thesis proposes an improved parser, the Predicate Relation-transition based Chinese AMR parser (PR-CAMR), which analyzes the relations between predicates. After studying three forms of consecutive predicates, it uses Chinese dependency tree features to find more accurate parsing operations.

The quality assessment method proposed in this thesis handles social short text in two modes. The single-sentence mode first parses the sentence into its abstract meaning representation, then analyzes the completeness of the syntactic structure around the predicate, and then calculates the closeness of the sentence sequence according to the different modification relations. Combining the structural integrity and the closeness of the sentence yields the quality evaluation value of a single-sentence short text. The multi-sentence mode first selects the keywords of each sentence, then calculates each sentence's similarity to the keywords of the other sentences and its total similarity. The sentence with the highest total similarity is taken as the core sentence, and the quality evaluation value of the core sentence is used as the quality evaluation value of the multi-sentence short text.

The Chinese AMR corpus is denoted dataset A, and a corpus built from manually selected Weibo texts is denoted dataset B. The improved Chinese AMR parser was first validated on dataset A. Five sets of comparative experiments showed that using the relations between predicates effectively improves the accuracy of AMR parsing. The effectiveness of the social short text quality assessment was then verified on datasets A and B. The experimental results show that the short text quality evaluation algorithm combining syntactic structure and modified semantics can accurately analyze the quality of social short texts. Compared with other quality evaluation methods, the AMR graph structure expresses the syntax of the text content and the amount of information it conveys accurately and effectively, which makes it a more reasonable basis for studying the quality of social short text.
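The core-sentence step of the multi-sentence mode described above can be illustrated with a minimal sketch. The abstract does not say which similarity measure or keyword extractor the thesis uses, so the Jaccard overlap below is an assumption chosen only for illustration, and the function names `jaccard`, `select_core_sentence`, and `multi_sentence_quality` are hypothetical. The per-sentence keyword sets and single-sentence quality values are treated as given inputs.

```python
from typing import List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap of two keyword sets (assumed measure, not from the thesis)."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def select_core_sentence(keywords: List[Set[str]]) -> int:
    """Return the index of the sentence with the highest total similarity.

    The total similarity of a sentence is the sum of its keyword similarity
    to every other sentence in the same short text.
    """
    if not keywords:
        raise ValueError("at least one sentence is required")
    totals = []
    for i, kw_i in enumerate(keywords):
        total = sum(
            jaccard(kw_i, kw_j) for j, kw_j in enumerate(keywords) if j != i
        )
        totals.append(total)
    # On ties, max() keeps the earliest sentence.
    return max(range(len(totals)), key=totals.__getitem__)


def multi_sentence_quality(
    keywords: List[Set[str]], single_scores: List[float]
) -> float:
    """Use the core sentence's quality value as the value of the whole text."""
    if len(keywords) != len(single_scores):
        raise ValueError("one quality value is required per sentence")
    return single_scores[select_core_sentence(keywords)]


if __name__ == "__main__":
    # Hypothetical keyword sets and single-sentence quality values.
    kws = [
        {"rain", "weather"},
        {"rain", "umbrella"},
        {"rain", "umbrella", "weather"},
    ]
    scores = [0.4, 0.7, 0.9]
    print(select_core_sentence(kws))
    print(multi_sentence_quality(kws, scores))
```

In this example the third sentence shares keywords with both of the others, so it has the highest total similarity and its quality value stands in for the whole text. Any other set or vector similarity could replace `jaccard` without changing the selection logic.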
Keywords/Search Tags: Short Text, Quality Evaluation, Syntactic Structure, Modified Semantics, AMR