Research Of Automatic Knowledge Base Construction Based On Hierarchical Multi-labels

Posted on: 2019-07-03
Degree: Master
Type: Thesis
Country: China
Candidate: L L Sheng
Full Text: PDF
GTID: 2428330596460875
Subject: Computer application technology
Abstract/Summary:
In recent years, as the “people-oriented” concept has taken root, the digital campus has been gradually transitioning to the smart campus, and campus information systems are shifting from traditional management to service. Building on research into and applications of the smart campus, this thesis addresses the key technical problems in automatically constructing the knowledge base of WisQA, a campus intelligent question answering system. The main research work is as follows.
(1) A knowledge base structure combining a domain knowledge graph with a hierarchical multi-label knowledge base is proposed. Targeting the vocabulary characteristics of campus rules, regulations, and service guide documents, a scheme for constructing the domain knowledge graph based on the general-purpose semantic ontology HowNet is proposed, which solves the problem of identifying domain proper nouns and their semantic relations.
(2) Considering the vocabulary distribution, phrase structure, and sentence features of the rules, regulations, and service guide documents, a document classification algorithm based on a convolutional neural network is proposed. It classifies documents as fact documents or process documents and lays the foundation for extracting knowledge from both types.
(3) For fact documents, a header hierarchy analysis scheme based on table mapping rules is proposed, which solves the problem of extracting knowledge units and constructing tag lists from nested tables. For process documents, an XML document tagging algorithm based on document structure is proposed to segment and mark up documents. On this basis, an information extraction algorithm based on XML documents and a semantic annotation algorithm that combines the document title structure with the LDA topic model are proposed to solve the problem of constructing knowledge units and tag lists from plain text.
(4) Description and registration specifications are defined for the service APIs required for campus Q&A, which solves the problem of automatically translating service API description documents into service semantic tags, service API parameter information nodes, and service API access nodes in the knowledge base. The service API calls required for campus Q&A are thereby unified as knowledge items in the hierarchical multi-label knowledge base.
(5) Based on the above research, an example hierarchical multi-label knowledge base is built. Several Q&A examples verify its advantages over a keyword-retrieval knowledge base in campus intelligent question answering. The performance of the knowledge base in a real Q&A application scenario is analyzed quantitatively using the completeness of the knowledge units and the degree of matching between tags and knowledge units, as defined in this thesis.
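As a rough illustration of the idea of knowledge units indexed by hierarchical tag lists, the following minimal sketch stores each knowledge item with a coarse-to-fine tag list and retrieves the item whose tags overlap most with the query terms. This is not the thesis's implementation: the class `KnowledgeItem`, the functions `score` and `retrieve`, the tag values, and the overlap-count scoring are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set, Tuple


@dataclass
class KnowledgeItem:
    # Tag list ordered from coarse to fine (hypothetical layout).
    tags: Tuple[str, ...]
    # Knowledge unit text, or a placeholder for a service API access node.
    content: str


def score(item: KnowledgeItem, terms: Set[str]) -> int:
    """Count how many of the item's tags appear among the query terms."""
    return sum(1 for tag in item.tags if tag in terms)


def retrieve(items: List[KnowledgeItem],
             query_terms: Iterable[str]) -> Optional[KnowledgeItem]:
    """Return the item with the highest tag overlap, or None if nothing matches."""
    terms = set(query_terms)
    best = max(items, key=lambda it: score(it, terms), default=None)
    if best is None or score(best, terms) == 0:
        return None
    return best


# Made-up items; the contents are placeholders, not data from the thesis.
ITEMS = [
    KnowledgeItem(("library", "borrowing", "loan period"), "example unit: loan period"),
    KnowledgeItem(("library", "borrowing", "renewal"), "example unit: renewal"),
    KnowledgeItem(("dormitory", "repair"), "example API node: repair request"),
]

if __name__ == "__main__":
    hit = retrieve(ITEMS, ["library", "loan period"])
    print(hit.content if hit else "no match")
```

In this sketch a service API call is represented the same way as a text knowledge unit, mirroring the abstract's statement that API calls are unified as knowledge items. A real system would presumably derive the query terms from the question using the domain knowledge graph rather than from exact string matches.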
Keywords/Search Tags:knowledge base, hierarchical multi-labels, text classification, convolutional neural network, knowledge extraction