Numerical Information Extraction And Application For Industrial Domains

Posted on: 2020-02-04
Degree: Master
Type: Thesis
Country: China
Candidate: J P Wang
GTID: 2428330578969614
Subject: Engineering
Abstract/Summary:
With the advent of the big data era, a large amount of data is generated on the network every day, and both enterprises and individuals depend heavily on network resources. In the industrial domain, numerical values are an intuitive form of expression that reflects relevant information about an industry, and they have always been in demand by enterprises and individuals. Because there is currently no effective method for extracting numerical information in the industrial domain, this thesis studies such a method.

First, this thesis improves the representation of numerical information. Numerical information in the industrial domain is defined as a seven-tuple (subject, attribute, attribute value, comparison word, comparison object, time, place), and its extraction is divided into two steps: identifying the numerical information elements, and identifying the relationships among those elements.

For element identification, a phased method is adopted: according to the characteristics of each element, different methods are used at different stages, and the recognition result of each stage is passed as input to the next. Attribute values have a fixed form of expression, so a template-based method is used. Comparison words are limited in number, so they are identified with a dictionary and rules. Subjects and attributes are identified with sequence labeling algorithms. Comparison objects are identified by rules that use the output of the preceding element identification phase. Experimental results show that the method combines the advantages of rules and the Bi-LSTM-CRF model and achieves satisfactory results in identifying numerical information elements.

For relationship identification, this thesis analyzes text features to develop a set of rules that identify the relationships between attribute values and the other numerical information elements, and then extracts the complete numerical information. Finally, in response to the needs of practical knowledge services, this thesis develops a numerical information extraction system for the industrial domain. The system can extract the relevant numerical information accurately.
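
To make the seven-tuple and the phased identification idea concrete, the following is a minimal Python sketch, not the thesis's implementation. The regular-expression template, the unit list, the comparison-word dictionary, the "nearest preceding comparison word" linking rule, and all function names are assumptions chosen for illustration. Subjects and attributes, which the abstract says are identified by sequence labeling (Bi-LSTM-CRF), are simply passed in as arguments here.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class NumericalInfo:
    """Seven-tuple from the abstract; fields not identified stay None."""
    subject: Optional[str] = None
    attribute: Optional[str] = None
    attribute_value: Optional[str] = None
    comparison_word: Optional[str] = None
    comparison_object: Optional[str] = None
    time: Optional[str] = None
    place: Optional[str] = None


# Hypothetical template: a number followed by one of a few units.
VALUE_TEMPLATE = re.compile(r"\d+(?:\.\d+)?\s?(?:%|tons|kg)")

# Hypothetical comparison-word dictionary.
COMPARISON_WORDS = ["increased by", "decreased by", "higher than", "lower than"]

Span = Tuple[int, int, str]  # (start offset, end offset, matched text)


def find_attribute_values(text: str) -> List[Span]:
    """Stage 1 (template-based): locate attribute values."""
    return [(m.start(), m.end(), m.group()) for m in VALUE_TEMPLATE.finditer(text)]


def find_comparison_words(text: str) -> List[Span]:
    """Stage 2 (dictionary-based): locate comparison words."""
    spans = []
    for word in COMPARISON_WORDS:
        for m in re.finditer(re.escape(word), text):
            spans.append((m.start(), m.end(), word))
    return sorted(spans)


def extract(text: str, subject: str, attribute: str) -> List[NumericalInfo]:
    """Relationship step (toy rule): attach to each attribute value the
    nearest comparison word that ends before the value starts.

    `subject` and `attribute` stand in for the sequence-labeling output.
    """
    values = find_attribute_values(text)
    comparisons = find_comparison_words(text)
    records = []
    for value_start, _, value_text in values:
        preceding = [c for c in comparisons if c[1] <= value_start]
        comparison = preceding[-1][2] if preceding else None
        records.append(
            NumericalInfo(
                subject=subject,
                attribute=attribute,
                attribute_value=value_text,
                comparison_word=comparison,
            )
        )
    return records


if __name__ == "__main__":
    sentence = "Steel output increased by 5.2% in 2019."
    for record in extract(sentence, subject="Steel", attribute="output"):
        print(record)
```

In this sketch each stage produces character spans that the later linking step consumes, which mirrors the abstract's point that the result of one stage feeds the next. The time, place, and comparison-object fields are left unfilled because the abstract does not describe the rules used for them.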
Keywords: Numerical Information, Bi-LSTM-CRF Model, Element Recognition, Relationship Recognition