Design And Implementation Of Knowledge Base Quality Control Platform

Posted on: 2017-01-02
Degree: Master
Type: Thesis
Country: China
Candidate: S Xiong
GTID: 2308330485960401
Subject: Software engineering
Abstract/Summary:
The Internet is rich in resources, but most of them can be understood only by humans, not by machines. A large-scale knowledge graph can help machines understand Chinese text and move search engines toward the next generation. With it, a search engine can analyze the user's search intention accurately and make the search both wider and deeper, so that the results satisfy the user better. All of this depends on knowledge data, which is the output of the knowledge graph department. This data is strongly structured and more complex than unstructured data such as web page data, so its quality is a major concern. The quality control platform described here was developed to ensure the quality of the knowledge graph output data and to provide comprehensive, systematic monitoring for all types of data.

Guided by software engineering principles, the thesis independently completes the data calculation, monitoring, and routine assessment parts of the platform.

First, the thesis analyzes the system's mission, its target users, and its key functional modules. After dividing each module into function points, it specifies the functional and non-functional requirements of the system, then completes the high-level design in terms of the logical layered architecture, the data interfaces, and the database outline.

Second, the thesis implements the system on the Django framework, together with jQuery, Vue, and Ajax, to realize calculation, alarm configuration, data display, and other functions.

Finally, the thesis builds the indicator system and the computing system to effectively measure the knowledge base data and its effect, and develops the data monitoring system to ensure the stability of the application data and the service. It then completes the routine assessment system, which samples data regularly and provides accurate analysis and evaluation.

The platform now has a relatively complete indicator system and supports 7 types of data sources, such as kgbase, scan, userbase, streaming_ds, and hdfs. It can check four data formats, and 40 indicators in six categories are used to monitor the data. Since September 2015, the platform has found 47 online problems and has driven a fix for every one of them. It makes monitoring easier, continuously improves alarm effectiveness, and provides strong protection for the quality of the knowledge graph output data.
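As a rough illustration of how an indicator with an alarm threshold might work, the following is a minimal Python sketch. It is not taken from the thesis: the `Indicator` class, the `missing_rate` metric, the threshold values, and the sample records are all hypothetical and chosen only to show the idea of computing an indicator over data and raising an alarm when it exceeds a configured limit.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, str]


@dataclass
class Indicator:
    """A hypothetical quality indicator with an alarm threshold."""
    name: str
    compute: Callable[[List[Record]], float]
    max_value: float  # an alarm fires when the computed value exceeds this


def missing_rate(field: str) -> Callable[[List[Record]], float]:
    """Build a metric: the fraction of records where `field` is absent or empty."""
    def metric(records: List[Record]) -> float:
        if not records:
            return 0.0
        missing = sum(1 for r in records if not r.get(field))
        return missing / len(records)
    return metric


def run_checks(records: List[Record], indicators: List[Indicator]) -> List[str]:
    """Return one alarm message per indicator whose value exceeds its threshold."""
    alarms = []
    for ind in indicators:
        value = ind.compute(records)
        if value > ind.max_value:
            alarms.append(f"{ind.name}: {value:.2f} exceeds {ind.max_value:.2f}")
    return alarms


# Hypothetical sample of entity records; two of the four lack a usable name.
sample = [
    {"id": "e1", "name": "Beijing"},
    {"id": "e2", "name": ""},
    {"id": "e3"},
    {"id": "e4", "name": "Shanghai"},
]
indicators = [
    Indicator("name_missing_rate", missing_rate("name"), max_value=0.10),
    Indicator("id_missing_rate", missing_rate("id"), max_value=0.10),
]
alarms = run_checks(sample, indicators)
print(alarms)
```

Keeping each indicator as a named metric plus a threshold is one simple way to let alarm limits be configured separately from the calculation itself; the actual platform may organize this differently.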
Keywords/Search Tags:Knowledge Graph, Knowledge Data, Data Quality, Data Monitoring, Django