| In recent years, the volume of scientific and technological literature has grown exponentially, making it increasingly difficult for users to extract useful information from it. In the era of big data, although people have diverse ways of searching for and reading literature, after retrieving documents from professional literature retrieval systems and academic search engines they still have to read each article in full to find the knowledge they need. This way of acquiring knowledge costs researchers considerable time and effort and demands a strong command of domain knowledge, yet it still cannot guarantee that knowledge is acquired effectively. The fundamental reason is that users obtain full-text documents rather than the fine-grained knowledge units inside them. To meet users' need for fine-grained knowledge, this study proposes a scientific and technological literature ontology model from the perspective of literature components, with the aim of providing accurate, intuitive, and convenient knowledge services. The study first reviews domestic and international research on literature components and on ontology models of scientific and technological literature, and identifies the problems in knowledge representation for Chinese scientific and technological literature. Second, it explains the relevant theories of ontology, scientific and technological literature, and literature components, and introduces the rule-based extraction method, the Word2Vec semantic model, and the AGNES and K-means clustering methods, laying the theoretical foundation for the research. Then, taking scientific papers and patent literature as examples, the main text component is defined as the key subject content, which covers the main research problems, core research techniques, and innovative research results, and the secondary text component is defined as the basic bibliographic information that
includes the literature title, author information, affiliation, journal, invention number, legal status, and so on. On this basis, the study designs a construction plan for the ontology model using the main and secondary text components. The literature components of a single scientific paper and a single patent document were identified manually, added to the Protégé tool, and then visualized and structurally parsed. The main text component instances were extracted using regular expressions, and the secondary text component instances were exported from the collection database. The AGNES and K-means clustering algorithms were used to cluster the main and secondary text component instances separately. Based on the clustering results, the scientific journal ontology model and the domain patent ontology model were analyzed and evaluated. Finally, the study discusses the application of the scientific and technological literature ontology model from two aspects, semantic organization and reading recommendation, and analyzes the utility of literature components. The ontology model replaces the traditional coarse-grained representation of scientific and technological literature with a fine-grained description and organization of bibliographic information and core content from the perspective of literature components, so that the text structure and knowledge context of the literature are presented clearly. The model refines literature components precisely and makes the relationships between pieces of knowledge explicit. It improves the relevance, openness, and shareability of scientific and technological literature, and addresses problems such as isolated knowledge, weak association, and a low degree of semantization, thereby improving the efficiency with which researchers acquire literature. |
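
The rule-based extraction of main text component instances with regular expressions, as mentioned in the abstract, could look roughly like the following minimal Python sketch. The cue phrases, the component names (`research_problem`, `research_technique`, `research_result`), and the sample text are hypothetical illustrations. The abstract does not list the rules actually used in the study, and the original corpus is Chinese, so real patterns would differ.

```python
import re

# Hypothetical cue-phrase rules, one per main text component type.
# These are illustrative assumptions, not the patterns used in the study.
RULES = {
    "research_problem": re.compile(
        r"\b(?:to address|the problem of|aims? to)\b", re.I),
    "research_technique": re.compile(
        r"\b(?:we propose|this study proposes|using|based on)\b", re.I),
    "research_result": re.compile(
        r"\b(?:results? (?:show|indicate)|we find|achieves?)\b", re.I),
}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extract_components(text):
    """Map each component type to the sentences that match its cue phrases."""
    found = {name: [] for name in RULES}
    for sentence in split_sentences(text):
        for name, pattern in RULES.items():
            if pattern.search(sentence):
                found[name].append(sentence)
    return found


# Made-up sample text, used only to exercise the rules above.
sample = (
    "This paper aims to reduce reading effort for researchers. "
    "We propose an ontology model built from literature components. "
    "Results show that the model groups related papers together."
)
components = extract_components(sample)
for name, sentences in components.items():
    print(name, "->", sentences)
```

In a pipeline like the one the abstract outlines, the extracted sentences would then be vectorized (for example with Word2Vec) and clustered with AGNES or K-means. Those steps are left out here to keep the sketch free of third-party dependencies.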