Research On The Concept Extraction Method Of Public Opinion Ontology Based On Stable Word Recognition

Posted on: 2018-12-22  Degree: Master  Type: Thesis
Country: China  Candidate: X Z Fu  Full Text: PDF
GTID: 2348330533456158  Subject: Engineering, software engineering
Abstract/Summary:
With the rapid development of Internet technology, microblogs, forums, and other social media produce a large volume of public opinion information every day. This information is short in content, flexible in expression, large in volume, and fast-growing. Most of it centers on a single theme event, but it may also be associated with other theme events or with domain knowledge. These knowledge relationships support research on discovering hot events online and analyzing public opinion. When a hot event is discovered, its core concepts should be extracted, and the relationships between these core concepts stored in a public opinion ontology, which aids the analysis of public opinion events. However, most public opinion information appears as short text, which is characterized by sparse data association, noise, redundant information, and poor semantic coherence. This poses a great challenge for ontology concept extraction.

Public opinion information often contains many stable new words that arise from hot events. Identifying the stable new words that are representative of events is of great significance for public opinion analysis and monitoring. To address the problem that these stable new words cannot be recognized correctly, this thesis proposes a new method based on the rule of survival that identifies them as candidate concepts of public opinion. First, a conditional random field segmentation model is used to segment the corpus and tag parts of speech. Short strings are then recombined into candidate strings, the candidate words are filtered with word and part-of-speech rules, and new words are detected among the candidates according to word integrity and competitiveness. The detected new words are added to the concepts of the public opinion ontology.

To address the data sparseness and dynamic character of public opinion information, this thesis calculates the relationships between candidate concepts using string correlation and the semantic similarity of words during concept extraction, which compensates for the poor semantic correlation caused by data sparsity. Core concepts of events are added or deleted based on event membership and event impact, and hot events are discovered in public opinion information through event influence power. Finally, the core concepts of all events are merged into the concepts of the public opinion ontology. The experimental results show that the proposed method effectively improves the accuracy and recall of concept extraction. It also contributes to the construction of the public opinion ontology and to later knowledge sharing and reuse.
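
The new-word detection steps described in the abstract (recombine adjacent segments into candidate strings, filter by part-of-speech rules, then keep candidates by integrity and competitiveness) can be illustrated with a minimal Python sketch. The abstract does not define the integrity and competitiveness measures, so the ones below are stand-in assumptions: integrity is taken as the candidate's frequency divided by the frequency of its rarest component, and a candidate loses the competition when a longer surviving candidate contains it and occurs at least as often. The input is assumed to be already segmented and tagged (the conditional random field step is not reproduced), and the names `STOP_POS`, `candidate_strings`, and `detect_new_words`, the tag set, and the thresholds are all hypothetical.

```python
from collections import Counter
from typing import List, Tuple

Token = Tuple[str, str]  # (segment, part-of-speech tag)

# Hypothetical rule: candidates may not start or end with these tags
# (particle, preposition, conjunction, punctuation).
STOP_POS = {"u", "p", "c", "w"}


def candidate_strings(sentences: List[List[Token]], max_len: int = 3) -> Counter:
    """Count each run of 2..max_len adjacent segments that passes the POS rule."""
    counts: Counter = Counter()
    for sent in sentences:
        for i in range(len(sent)):
            for n in range(2, max_len + 1):
                window = sent[i:i + n]
                if len(window) < n:
                    break  # ran past the end of the sentence
                if window[0][1] in STOP_POS or window[-1][1] in STOP_POS:
                    continue  # rejected by the part-of-speech rule
                counts[tuple(word for word, _ in window)] += 1
    return counts


def contains(longer: tuple, shorter: tuple) -> bool:
    """True if `shorter` occurs as a contiguous run inside `longer`."""
    n = len(shorter)
    return any(longer[i:i + n] == shorter for i in range(len(longer) - n + 1))


def detect_new_words(sentences: List[List[Token]], min_freq: int = 2,
                     min_integrity: float = 0.8, max_len: int = 3) -> List[tuple]:
    """Keep frequent candidates that are integral and win the competition."""
    unigram = Counter(word for sent in sentences for word, _ in sent)
    kept = {}
    for cand, freq in candidate_strings(sentences, max_len).items():
        if freq < min_freq:
            continue
        # Assumed integrity: share of the rarest component's occurrences
        # that fall inside this candidate.
        integrity = freq / min(unigram[word] for word in cand)
        if integrity >= min_integrity:
            kept[cand] = freq
    # Assumed competition: a fragment loses to a longer candidate that
    # contains it and is at least as frequent.
    return sorted(
        cand for cand in kept
        if not any(len(other) > len(cand) and kept[other] >= kept[cand]
                   and contains(other, cand) for other in kept)
    )


if __name__ == "__main__":
    corpus = [
        [("blue", "a"), ("whale", "n"), ("game", "n"), ("is", "v"), ("banned", "v")],
        [("the", "u"), ("blue", "a"), ("whale", "n"), ("game", "n"), ("spreads", "v")],
        [("police", "n"), ("warn", "v"), ("of", "p"), ("blue", "a"), ("whale", "n"), ("game", "n")],
    ]
    print(detect_new_words(corpus))
```

In this toy corpus the three-segment string survives while its two-segment fragments are eliminated, because they never occur outside the longer string. Candidates are kept as tuples of segments so the sketch is language-neutral; for Chinese text the segments would simply be concatenated.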
Keywords/Search Tags: Stable new word, New words recognition, Public opinion ontology, Concept extraction, Event membership