
Research On SAO-Based Science And Technology Text Mining And Its Application

Posted on: 2017-07-25
Degree: Doctor
Type: Dissertation
Country: China
Candidate: C Yang
GTID: 1368330596964366
Subject: Management Science and Engineering
Abstract/Summary:
Science and technology text mining has become a key method for decision-making in technology development and plays an important role in promoting knowledge sharing and the innovation process. At present, however, science and technology text mining relies mainly on topic words and phrases. It cannot clearly identify the relationships between those words and phrases, and it suffers from ambiguous interpretations caused by homonyms and synonyms. This dissertation studies Subject-Action-Object (SAO)-based text mining methods, which focus on semantic structure extraction, topic identification and classification, and technology trend analysis. The approach comprises four models: an SAO extraction model, an SAO-based topic model, an SAO-based model for identifying core technological components, and an SAO network-based technology trend analysis model. The work is interdisciplinary, spanning technology management and text mining. Its contributions are as follows:

(1) A hierarchical, parse tree-based SAO identification method
A hierarchical, parse tree-based SAO identification method is proposed, building on earlier rule-based methods for identifying relationships between named entities. The method has three parts: 1) to ensure that the SAO structures are strongly correlated with the topic, an SAO component identification model is proposed based on term clumping and co-word analysis; 2) a parse tree-based hierarchical SAO extraction model is proposed to ensure the recall and precision of SAO extraction; and 3) a Term Frequency-Inverse Document Frequency (TF-IDF)-based SAO weighting model is proposed to rank SAO structures so that key SAOs can be selected. A case study verifies the accuracy of SAO identification. The proposed method ensures that the SAO structures fall within the scope of the target topic and supports weighting of the SAO structures.

(2) An SAO-based LDA model
An SAO-based LDA model is proposed, which involves: 1) identifying and exploring the problem-and-solution patterns embodied in SAO structures; 2) proposing a "bag-of-SAO" assumption; and 3) building an SAO-based Latent Dirichlet Allocation (LDA) model on the "bag-of-SAO" assumption. The proposed topic model can effectively identify topic structure and improves substantially on the traditional LDA model in topic recognition and semantic disambiguation.

(3) A requirement-oriented model for identifying core technological components based on SAO structures
To understand and monitor the core technological components of a technology (e.g., technology process, operation method, function and material preparation), this dissertation proposes a requirement-oriented model for identifying core technological components based on SAO structures, in which: 1) a syntax-based approach is constructed to identify the SAO structures that describe function, relationship and operation in specified topics; 2) an "importance indicator" and an "innovation indicator" are built from frequency statistics, correlations among technological components, and technological component life cycle analysis, to judge the importance and innovativeness of technological components and thereby screen them; and 3) a "relevance indicator" is proposed to calculate the relevance of technological components to requirements, and core technological components are finally identified based on this indicator. The proposed method can describe complete technical details accurately and determine the technical requirements that correspond to the core technological components.

(4) An SAO network-based technology trend model
An SAO network-based technology trend model is proposed, drawing on actor-network theory. An SAO network is built from "Subject (node)-Action (edge)-Object (node)" links, after which the relationship strength between actors is calculated. The development trend of an emerging technology is then analyzed with five indicators: in- and out-degree, key action, "Burt constraint", evolution of the node "degree distribution", and network center deviation. The model can identify core technologies and requirements, identify the details and strength of relationships between actors, and ultimately support analysis of technological competitive advantage. An empirical study is performed to demonstrate the proposed methods.
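
The "Subject (node)-Action (edge)-Object (node)" link and the in- and out-degree indicator mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how relationship strength is computed, so the sketch assumes it is simply the number of SAO occurrences linking two actors. The triples, function names and that strength definition are all hypothetical and are not taken from the dissertation.

```python
from collections import Counter, defaultdict

# Hypothetical SAO triples (subject, action, object); illustrative data only.
sao_triples = [
    ("electrode", "contain", "graphene"),
    ("graphene", "improve", "conductivity"),
    ("electrode", "contain", "graphene"),
    ("dye", "absorb", "light"),
    ("graphene", "improve", "stability"),
]


def build_sao_network(triples):
    """Map each directed (subject, object) pair to a Counter of its actions."""
    edges = defaultdict(Counter)
    for subject, action, obj in triples:
        edges[(subject, obj)][action] += 1
    return edges


def relationship_strength(edges):
    """Assumed strength: total SAO occurrences linking subject to object."""
    return {pair: sum(actions.values()) for pair, actions in edges.items()}


def in_out_degree(edges):
    """Count distinct incoming and outgoing neighbours of every node."""
    in_deg, out_deg = Counter(), Counter()
    for subject, obj in edges:
        out_deg[subject] += 1
        in_deg[obj] += 1
    return in_deg, out_deg


edges = build_sao_network(sao_triples)
strength = relationship_strength(edges)
in_deg, out_deg = in_out_degree(edges)

for (subject, obj), weight in sorted(strength.items()):
    actions = ", ".join(sorted(edges[(subject, obj)]))
    print(f"{subject} -[{actions}]-> {obj} (strength {weight})")
```

In this toy network "graphene" has one incoming and two outgoing links, the kind of node-level signal the in- and out-degree indicator would track over time. The other four indicators would need definitions that the abstract does not give.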
Keywords/Search Tags: Subject-Action-Object (SAO), Semantic Analysis, Bibliometrics, Text Mining, Technology Trend Analysis