In the era of big data, information extraction technology can efficiently extract important semantic information from massive text data and transform unstructured and semi-structured data into unified structured data. As the government's digital transformation and reform deepen, the growing volume of government affairs text data has become an important resource for social governance and decision-making. Knowledge organization and information mining of government affairs text with artificial intelligence technologies such as natural language processing, information extraction, and knowledge graphs can therefore improve the efficiency with which decision makers read and acquire information, which is important for building a good working environment for smart government. Research on information extraction methods for government proposal text is still in its infancy, for two reasons. On the one hand, existing named entity recognition and relation classification models do not adequately extract semantic feature information from text, so their accuracy is too low to meet the business needs of the government domain; on the other hand, the government domain lacks benchmark datasets. On this basis, this paper studies information extraction methods for government proposal text and their application to knowledge graph construction. (1) To address the low accuracy of existing named entity recognition models and their difficulty with parallel computation, a named entity recognition method based on a hierarchical Softmax computation strategy is proposed. By combining the structural characteristics of the Transformer with the hierarchical Softmax computation strategy, a high-performance named entity recognition model with parallelized computation is established, using the character-level, word-level, and position feature information
in the text to obtain context-dependent information. The model achieved the best F1 scores of 96.24% and 70.32% on the public Resume and Weibo datasets, respectively, significantly outperforming the compared models. (2) To address the low computational efficiency of traditional attention mechanisms, a relation classification method based on a target attention mechanism is proposed. By introducing the target attention mechanism to remove computational redundancy and making full use of word embedding and position embedding information to capture the important semantic information of the context, a relation classification model with a simple structure and high accuracy is realized. The model achieved the best F1 scores of 85.27% and 71.39% on the public SemEval-2010 Task 8 and CoNLL04 datasets, respectively, significantly outperforming the compared models. (3) To address the lack of benchmark datasets in the government affairs domain, a method for constructing a government proposal text dataset based on coarse- and fine-grained segmentation is proposed, yielding a benchmark dataset for government proposal text. In addition, a knowledge graph of government proposal text is constructed with the proposed information extraction methods using the Neo4j graph database, and its visualization is implemented. The resulting knowledge graph improves the collection, organization, and application of government text data and promotes the development of smart government.
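
As a rough illustration of the final step, the sketch below shows one way extracted (head entity, relation, tail entity) triples could be turned into Cypher statements for loading into Neo4j. It is a minimal sketch under stated assumptions, not the paper's implementation: the node label `Entity`, the property `name`, the helper names, and the sample triple are all hypothetical, and the code only builds statement strings rather than connecting to a database.

```python
import json


def cypher_string(value: str) -> str:
    """Quote a Python string as a double-quoted Cypher string literal.

    Assumption: JSON-style escaping is used as a simplification; it covers
    quotes and backslashes, which is enough for this sketch.
    """
    return json.dumps(value, ensure_ascii=False)


def relation_type(name: str) -> str:
    """Normalise a relation name into an upper-case relationship type.

    Any character that is not alphanumeric is replaced by an underscore.
    """
    cleaned = "".join(ch if ch.isalnum() else "_" for ch in name.strip())
    return cleaned.upper()


def triples_to_cypher(triples):
    """Build one MERGE statement per (head, relation, tail) triple.

    MERGE creates a node or relationship only if it does not already exist,
    so re-running the statements does not duplicate graph elements.
    The label `Entity` and property `name` are hypothetical choices.
    """
    statements = []
    for head, relation, tail in triples:
        statements.append(
            f"MERGE (h:Entity {{name: {cypher_string(head)}}}) "
            f"MERGE (t:Entity {{name: {cypher_string(tail)}}}) "
            # Backticks allow relationship types that start with a digit
            # or contain non-ASCII characters.
            f"MERGE (h)-[:`{relation_type(relation)}`]->(t)"
        )
    return statements


if __name__ == "__main__":
    # Made-up triple for illustration only; not taken from the paper's dataset.
    sample = [("Proposal 12", "submitted by", "Committee A")]
    for statement in triples_to_cypher(sample):
        print(statement)
```

In practice the generated statements would be sent to the database through a Neo4j client with query parameters rather than string interpolation; building strings here keeps the sketch self-contained.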