The construction industry relies on a large number of complex technical specifications for construction management. Because these documents are paper-based and semi-structured, computers cannot use them directly, and information has to be searched for by hand, which is tedious and time-consuming. To solve this problem, it is necessary to establish a normative knowledge structure, combined with practical experience, so that building codes can be used efficiently.

This thesis addresses the problem of extracting knowledge from massive construction specifications and structuring that information. Solving it will raise the level of automation in construction monitoring and quality review. The study introduces an efficient method for information extraction that facilitates reference to construction specifications and construction quality acceptance specifications.

The thesis covers four aspects: 1) construction of the domain ontology, 2) processing of specification documents by NLP, 3) knowledge extraction, and 4) design of the extraction system. First, the ontology editing tool Protégé is used, together with construction quality acceptance knowledge, to build a domain ontology model of quality acceptance. Second, ICTCLAS, combined with a domain dictionary, is used for NLP, including Chinese word segmentation and part-of-speech tagging. Third, the quality acceptance specification is divided into a training set and a test set; knowledge extraction rules are written according to the patterns in the training set and are used to extract knowledge from the provisions of the various specifications. The extracted knowledge points are then stored as ontology instances. Finally, a knowledge extraction system for quality acceptance specifications is designed, the experimental results are analyzed, and the feasibility and accuracy of the approach are validated.
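As a rough illustration of the rule-based extraction step, the sketch below applies one hand-written rule to a clause that has already been segmented and part-of-speech tagged, and stores the result as a simple record standing in for an ontology instance. Everything in it is an assumption made for illustration: the tag names (`n`, `cmp`, `m`, `q`), the `COMPARATORS` lexicon, the `AcceptanceRequirement` class, and the English example clause are hypothetical and are not the thesis's actual rules, the ICTCLAS tag set, or its Protégé ontology.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# A token is a (word, tag) pair. The tags are a hypothetical simplified set:
# "n" = noun, "cmp" = comparator phrase, "m" = numeral, "q" = unit.
Token = Tuple[str, str]


@dataclass
class AcceptanceRequirement:
    """Hypothetical stand-in for an ontology instance of one requirement."""
    inspected_item: str
    comparator: str
    value: float
    unit: str


# Hypothetical lexicon mapping comparator phrases to symbols.
COMPARATORS = {
    "shall not exceed": "<=",
    "shall not be less than": ">=",
}


def extract_requirement(tokens: List[Token]) -> Optional[AcceptanceRequirement]:
    """Apply one extraction rule: NOUN+ COMPARATOR NUMBER UNIT."""
    # Collect the leading run of nouns as the inspected item.
    i = 0
    nouns = []
    while i < len(tokens) and tokens[i][1] == "n":
        nouns.append(tokens[i][0])
        i += 1

    # The rule needs at least one noun and three more tokens after the nouns.
    if not nouns or i + 2 >= len(tokens):
        return None

    (cmp_word, cmp_tag), (num_word, num_tag), (unit_word, unit_tag) = tokens[i:i + 3]
    if cmp_tag != "cmp" or num_tag != "m" or unit_tag != "q":
        return None
    if cmp_word not in COMPARATORS:
        return None

    return AcceptanceRequirement(
        inspected_item=" ".join(nouns),
        comparator=COMPARATORS[cmp_word],
        value=float(num_word),
        unit=unit_word,
    )


# Made-up example clause, already segmented and tagged.
clause = [
    ("wall", "n"), ("verticality", "n"), ("deviation", "n"),
    ("shall not exceed", "cmp"), ("5", "m"), ("mm", "q"),
]
requirement = extract_requirement(clause)
```

In a fuller pipeline, a set of such rules would presumably be written from the training portion of the specification and evaluated on the test portion, with each returned record mapped to an instance of the corresponding ontology class.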