The Design And Realization Of Document Processing System Based On Text Analysis And Collaborative Filtering

Posted on: 2016-06-24
Degree: Master
Type: Thesis
Country: China
Candidate: N Liu
GTID: 2298330467491768
Subject: Computer technology
Abstract/Summary:
With the spread of the Internet, the content and technology of office automation continue to develop. However, existing document processing systems have problems with in-depth analysis of documents and with the processing of structured data, and they cannot effectively mine existing document processing experience. How to recommend documents effectively and provide intelligent assistance for daily office work has become an urgent problem.

This thesis analyzes existing document processing systems and, through comparative study and by drawing on their strengths, designs and implements a document processing system based on text analysis and collaborative filtering. The system also uses cloud computing technology to recommend related documents to users in order to support decision-making.

The main work of this thesis is as follows:

1. By analyzing the deficiencies of existing document processing systems, the thesis introduces text analysis and collaborative filtering to address them. Based on an analysis of the system's roles and application scenarios, it applies suitable text analysis and collaborative filtering algorithms to the document processing system.

2. By analyzing user behavior, the system improves the assignment strategy for the original data matrix used in decision support and addresses the cold start problem in the collaborative filtering algorithm.

3. Module division and detailed design are completed on the basis of the application scenarios and the requirement analysis. The database is split into two parts: basic system information is stored in MySQL, and preference information is stored in MongoDB. A star model is used to model the basic system information so that the system can be extended.

4. A distributed environment is built, and the modules of the system are implemented according to the detailed design. The correctness and reliability of the system are verified with functional and performance tests.

Experimental results show that the designed document processing system has advantages in decision support, security, reliability, and scalability. For decision support, the system provides users with recommendations, intelligent document classification, and support for processing documents. For security, it introduces access control to achieve confidentiality. For robustness, it uses a master-slave database so that document data is not lost. For scalability, it uses Hadoop, which is inexpensive and easy to expand: computing nodes can be added through configuration without affecting the stability of the system.
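
The idea of filling the preference matrix from user behavior and handling cold start, as mentioned in the abstract, can be illustrated with a minimal Python sketch. The abstract does not give the thesis's actual weights or algorithm, so everything here is an assumption for illustration: the behavior names and weights in `BEHAVIOR_WEIGHTS`, the use of user-based cosine similarity, and the popularity fallback for users with no history.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical behavior weights; the thesis's real assignment strategy is not given.
BEHAVIOR_WEIGHTS = {"view": 1.0, "download": 2.0, "approve": 3.0}


def build_preferences(events):
    """Aggregate (user, document, behavior) events into user -> {document: score}."""
    prefs = defaultdict(lambda: defaultdict(float))
    for user, doc, behavior in events:
        prefs[user][doc] += BEHAVIOR_WEIGHTS.get(behavior, 0.0)
    return {user: dict(docs) for user, docs in prefs.items()}


def cosine(a, b):
    """Cosine similarity between two sparse score dictionaries."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[d] * b[d] for d in shared)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def recommend(prefs, user, k=3):
    """Recommend up to k unseen documents for a user."""
    seen = prefs.get(user, {})
    scores = defaultdict(float)
    if seen:
        # User-based collaborative filtering: weight other users' scores by similarity.
        for other, other_prefs in prefs.items():
            if other == user:
                continue
            sim = cosine(seen, other_prefs)
            if sim <= 0:
                continue
            for doc, score in other_prefs.items():
                if doc not in seen:
                    scores[doc] += sim * score
    if not scores:
        # Assumed cold-start fallback: rank unseen documents by overall popularity.
        for other_prefs in prefs.values():
            for doc, score in other_prefs.items():
                if doc not in seen:
                    scores[doc] += score
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [doc for doc, _ in ranked[:k]]
```

In this sketch a user with history gets recommendations from similar users, while a new user with no recorded behavior falls back to the most heavily used documents. A production system would likely persist the preference matrix (the abstract mentions MongoDB for preference data) rather than rebuilding it in memory.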
Keywords: Document processing system, Text analysis, Collaborative filtering, Distributed computation