Research On Key Technologies Of Intelligent Processing Of Data Resources For Digital Publishing

Posted on: 2021-04-01
Degree: Doctor
Type: Dissertation
Country: China
Candidate: K L Chen
GTID: 1368330632961655
Subject: Electronic Science and Technology
Abstract/Summary:
As a "content is king" industry, the news publishing industry continuously provides the public with high-quality content in the form of books, magazines, and other publications, and promotes the steady progress of social culture. In recent years, with the rapid development of computer and Internet technologies, and especially the surge of interest in artificial intelligence, all walks of life have begun to use artificial intelligence to improve production efficiency and to provide more intelligent and user-friendly services or products. In an era of rapid information turnover led by artificial intelligence, the content that the news publishing industry is proud of has become somewhat "out of date": rather than reading a full text and summarizing its central idea, users prefer to acquire knowledge intelligently and efficiently from a knowledge resource library. At the same time, content production in the news publishing industry is not very intelligent. Most publishers still use the old "offline writing, email submission, manual proofreading" mode, so content creation and editing are inefficient, content quality depends excessively on individual skill, and manuscript versions are difficult to trace.

This dissertation focuses on these problems and designs solutions based on natural language processing and other intelligent processing methods. At the level of content resource organization, it designs a scheme for building a knowledge system efficiently and with high quality, and proposes a new word discovery algorithm based on tag weight for the news publishing industry to support thesaurus construction. At the content creation level, it proposes a complete online writing scheme and designs an intelligent text polishing algorithm based on natural language processing for the creation stage. At the level of manuscript editing and review, building on an online review system, it designs an intelligent typo proofreading algorithm and a revision trace matching algorithm that improve the efficiency and quality of editing. At the content publishing level, it designs an intelligent reading platform that provides several intelligent reading solutions, including the collection and analysis of user reading behavior, cross-platform encryption and decryption, and bilingual side-by-side reading based on a bilingual corpus.

The specific research content and main innovations include the following aspects:

(1) A knowledge system construction scheme based on a new word discovery algorithm

The knowledge system construction scheme discussed in this study is designed to move the news publishing industry from content services to knowledge services. Its core is the new word discovery algorithm based on tag weight. The algorithm discovers new words intelligently, automatically cleans and extends the basic lexicon, and builds a relationship system among knowledge-point terms, so that related knowledge points can be extracted by association and the output grows from individual knowledge points into the related knowledge system. To improve scalability and generality, the design includes an extensible architecture covering the thesaurus, algorithms, computing power, tags, and exception vocabulary. A "machine-automated plus manually assisted" implementation is adopted, which improves system usability and also provides an experimental basis for optimizing the algorithm. The results of this research are intended to bring a further innovation after the digitization of the publishing industry by transforming content services into intelligent services based on the knowledge system.

(2) An intelligent creation algorithm and media convergence scheme based on natural language processing

In this part of the research, I designed a converged media platform that integrates intelligent creation and editing. The purpose is to split the high-quality content resources of the news publishing industry into fragments, analyze and process their semantics, and finally output a corpus. The core idea is to turn high-quality content into a polishing corpus through resource preprocessing, Chinese word segmentation, metadata completion, and semantic processing. A content evaluation model then measures how well the text entered by the user matches the content in the polishing corpus, and the system outputs the corpus content that is similar to the user's expression, thereby giving the content creator polishing suggestions. A system implemented on this scheme has been piloted in cooperation with news publishing media organizations. The operating results and user feedback indicate that the platform can provide news media workers with intelligent assistance for content creation, improving both work efficiency and the quality of the content produced. This research provides an intelligent working platform for the news publishing industry and is valuable for promoting comprehensive media convergence.

(3) Intelligent proofreading and revision trace matching algorithms for manuscript editing

The main contents of this research point are a typo checking algorithm based on a confusion set and N-Gram, and a revision trace comparison algorithm based on the longest common subsequence (LCS) problem. The typo checking algorithm detects typos and similar errors in the manuscript and gives the user suggested corrections. Its main idea is as follows: first, the input text sequence is segmented; then the confusion set is used to substitute candidates for the segmented words; finally, the results are scored to determine the correct usage for the input text sequence. The revision trace comparison algorithm compares the differences between Chinese texts and is designed to record modifications to the manuscript intelligently, which helps editors trace content back and track problems. Its main idea is to use dynamic programming to compute the difference between the text sequences before and after modification and to store the result in a two-dimensional array. The sequence can be extracted from the array whenever the modification information is needed.

(4) A digital reading platform based on big data and artificial intelligence technology

This research point is an innovative study of the reader application, the content carrier of the publishing industry. It studies not only how to design the parsing and presentation of content in the reader so that the reading experience is attractive and convenient, but also how to intelligently collect users' reading behavior and analyze their reading time, frequency, and content preferences, in order to suggest how to improve the attractiveness of published content and product sales. This study takes an Android application as the entry point and designs a reader system with layered content parsing, multi-platform data encryption, intelligent collection of user behavior, and cloud analysis. In addition, since the system will serve bilingual teaching in universities, the reading behavior analysis algorithm uses big data technology to compile statistics on the reading data of readers (students) and outputs reading habit reports that support teaching quality assessment and thereby guide the teaching process.
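
The typo checking idea in point (3) (segment the text, substitute candidates from a confusion set, score the results) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and not the dissertation's implementation: the toy `CORPUS`, the hypothetical `CONFUSION` table, whitespace tokenization in place of a Chinese word segmenter, an add-one smoothed bigram model as the N-Gram scorer, and substitution of only one token at a time.

```python
from __future__ import annotations

import math
from collections import Counter

# Hypothetical toy data: a real system would train on a large corpus and
# use a curated confusion set of easily mistaken words or characters.
CORPUS = [
    "the editor will review the manuscript",
    "the author will revise the manuscript",
    "the editor will accept the paper",
]
CONFUSION = {"well": ["will"], "will": ["well"], "revue": ["review"]}


def train_bigrams(sentences: list[str]) -> tuple[Counter, Counter]:
    """Count unigrams and bigrams, padding each sentence with <s> and </s>."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def log_score(tokens: list[str], unigrams: Counter, bigrams: Counter) -> float:
    """Add-one smoothed bigram log-probability of a token sequence."""
    vocab_size = len(unigrams)
    padded = ["<s>"] + tokens + ["</s>"]
    total = 0.0
    for prev, cur in zip(padded, padded[1:]):
        total += math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size))
    return total


def suggest(sentence: str, unigrams: Counter, bigrams: Counter,
            confusion: dict[str, list[str]] = CONFUSION) -> str:
    """Return the best-scoring variant among single-token substitutions."""
    tokens = sentence.split()  # assumption: whitespace stands in for word segmentation
    best, best_score = tokens, log_score(tokens, unigrams, bigrams)
    for i, token in enumerate(tokens):
        for alternative in confusion.get(token, []):
            candidate = tokens[:i] + [alternative] + tokens[i + 1:]
            score = log_score(candidate, unigrams, bigrams)
            if score > best_score:
                best, best_score = candidate, score
    return " ".join(best)


if __name__ == "__main__":
    uni, bi = train_bigrams(CORPUS)
    print(suggest("the editor well review the manuscript", uni, bi))
```

Keeping the original sentence as the baseline candidate means a substitution is suggested only when it scores strictly higher, so text that the model already prefers is left unchanged.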
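
The revision trace comparison in point (3) rests on the longest common subsequence, computed by dynamic programming and stored in a two-dimensional array. The sketch below shows the textbook version of that technique at character level, which also works for Chinese text because Python strings are sequences of Unicode characters. The function names `lcs_table` and `revision_trace` and the `keep`/`insert`/`delete` labels are illustrative assumptions, not names from the dissertation.

```python
from __future__ import annotations


def lcs_table(before: str, after: str) -> list[list[int]]:
    """Fill the LCS length table: table[i][j] covers before[:i] and after[:j]."""
    rows, cols = len(before) + 1, len(after) + 1
    table = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        for j in range(1, cols):
            if before[i - 1] == after[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table


def revision_trace(before: str, after: str) -> list[tuple[str, str]]:
    """Walk the table backwards and emit (operation, character) pairs."""
    table = lcs_table(before, after)
    i, j = len(before), len(after)
    trace: list[tuple[str, str]] = []
    while i > 0 or j > 0:
        if i > 0 and j > 0 and before[i - 1] == after[j - 1]:
            trace.append(("keep", before[i - 1]))
            i, j = i - 1, j - 1
        elif j > 0 and (i == 0 or table[i][j - 1] >= table[i - 1][j]):
            trace.append(("insert", after[j - 1]))
            j -= 1
        else:
            trace.append(("delete", before[i - 1]))
            i -= 1
    trace.reverse()  # the walk runs from the end, so restore reading order
    return trace


if __name__ == "__main__":
    for operation, char in revision_trace("abcd", "abxd"):
        print(operation, char)
```

Because the table is kept rather than discarded, the trace can be extracted on demand, which matches the abstract's description of reading the sequence from the array when the modification information is needed. The table needs memory proportional to the product of the two text lengths, so long manuscripts would likely be compared paragraph by paragraph.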
Keywords/Search Tags:Intelligent publishing, Knowledge services, Intelligent review, Natural language processing, Digital reading