Research On Public Opinion Ontology Concept Extraction Based On Short Text

Posted on:2019-12-01

Degree:Master

Type:Thesis

Country:China

Candidate:C Zha

Full Text:PDF

GTID:2428330566467004

Subject:Computer application technology

Abstract/Summary:

With the rapid development of information technology, the Internet has produced mass data in a variety of formats (text, pictures, sound, video, etc.). This data contains public opinion information, making Internet information and knowledge an important source of public opinion. How to extract a public opinion ontology from language materials faster and more accurately has become a hot spot in public opinion research. Past public opinion corpora were often collected from news, which follows a standard format covering the specific people, the time, the place, the course of events, the result and so on, and news items are usually long texts. With the introduction of various social networking tools, massive amounts of short text data have been generated. Short text differs from long text in two processing characteristics: real-time updating and sparsity. Short text on the Internet is updated in real time, refreshes quickly and is difficult to collect, so classifying it requires higher efficiency. A short text is within 200 words, usually only a few sentences, so it carries very little effective information; the sample features are very sparse while the dimension of the feature set is very high, which makes it difficult to extract accurate sample features. To address the long-tail phenomenon in word frequency statistics, data smoothing is used to adjust word frequencies. Document feature words are extracted based on word frequency characteristics combined with feature words. To improve computational efficiency, this paper uses set intersection to calculate text correlation by comparing the number of correlated set elements. After recognition, noun words or phrases are extracted as the candidate concept set for the subject text; the similarity between candidate concepts is evaluated with a semantic similarity method and the concepts are sorted by weight; the core concepts related to the subject are thus extracted. The experimental results show that the precision of public opinion ontology concept extraction for short text is 0.62% higher than that of the TF-IDF method, the recall rate is 0.4%, and the average time consumed is 30% less than that of the TF-IDF method. The work is a useful exploration of ontology concept extraction from short texts.
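
The abstract does not say which smoothing scheme or which set measure the thesis uses. As a rough illustration only, the Python sketch below assumes additive (add-one) smoothing for the long tail of word frequencies and the Jaccard coefficient as the set-intersection correlation between two short texts. The function names `smoothed_frequencies` and `set_overlap` and the sample posts are hypothetical and are not taken from the thesis.

```python
from collections import Counter


def smoothed_frequencies(tokens, vocabulary, alpha=1.0):
    """Additive smoothing (an assumed stand-in for the thesis's method).

    Every vocabulary word receives a nonzero relative frequency, which
    softens the long tail of rare and unseen words. Assumes every token
    is contained in `vocabulary`.
    """
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocabulary)
    return {word: (counts[word] + alpha) / total for word in vocabulary}


def set_overlap(words_a, words_b):
    """Jaccard coefficient: shared words divided by all distinct words.

    Only set operations are needed, so no high-dimensional sparse
    vectors have to be built for the two texts.
    """
    a, b = set(words_a), set(words_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Two hypothetical, already tokenized short posts.
post_1 = ["flood", "warning", "city", "flood"]
post_2 = ["city", "flood", "rescue"]
vocabulary = sorted(set(post_1) | set(post_2))

freqs = smoothed_frequencies(post_1, vocabulary)
overlap = set_overlap(post_1, post_2)

print(freqs)    # "rescue" never occurs in post_1 but still gets a small weight
print(overlap)  # share of distinct words common to both posts
```

Comparing word sets this way avoids constructing feature vectors for very short texts, which is one plausible reading of the efficiency gain the abstract reports; the actual procedure in the thesis may differ.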

Keywords/Search Tags:

Public opinion ontology, Concept extraction, Short Text, Word similarity, Word frequency statistics, Set

Related items

1	Research On Subject-oriented Extraction Of Public Opinion Ontology Concepts And Relations
2	Research On The Concept Extraction Method Of Public Opinion Ontology Based On Stable Word Recognition
3	Short Text Processing Method Based On Wikipedia
4	Research On Key Problems In WEB Text Mining
5	Research On Opinion Mining For Customer Service Conversation Text
6	Design And Implementation Of The Public Opinion Monitoring System Based On Semantic Web
7	Research On Word Similarity Computation Method Based On Non-IID Learning
8	Design And Implementation Of The Uighur Word Frequency's Statistics System
9	Research On The Concept And Relationship Extraction Of Public Opinion Ontology For Dynamic Topic
10	Research On Text Clustering Of Micro-blog Public Opinion: Word Sense Cluster And Collocation-Based Method