With the rapid development of information technology, smart transportation driven by big data and artificial intelligence has become an important trend in China’s effort to become a "transportation power". Bridge engineering has already undergone substantial informatization, and several information management systems, such as bridge inspection and health monitoring systems, have been built. However, the upgrade from "informatization" to "knowledge-based" management has not yet been achieved: decision support for maintenance is insufficient, and the degree of intelligence needs further improvement. Bridge inspection text is an important data resource in the bridge management and maintenance business system and contains a large amount of information about the basic attributes of bridges, structural parameters, and detected defects (diseases). Fully identifying the fine-grained information in bridge inspection text is important for tasks such as bridge structural condition assessment and maintenance decision support. However, bridge inspection text is currently stored mostly as electronic documents in various information management systems. Follow-up business activities still rely mainly on manual review, and research on the automatic extraction of fine-grained text information in this field is insufficient. In recent years, research on text information extraction, with named entity recognition and entity relation extraction as its core tasks, has made great progress. These methods extract useful information from unstructured text and save it in structured form, which lays a solid foundation for text data analysis and for the construction of domain knowledge graphs. However, research on information extraction tailored to the characteristics of bridge inspection text is still in its infancy, and key technical solutions have not yet been proposed. In view of the
current state of research on text information extraction, this thesis carries out the following studies: (1) For the bridge inspection field, this thesis first determines the targets to be identified in the named entity recognition and entity relation extraction tasks, and analyzes the domain's text features in terms of content structure, domain terminology, and description style. This analysis provides clear task goals for the subsequent work on corpus construction, named entity recognition, and entity relation extraction in the bridge inspection field. (2) Because no open text corpus yet exists for the bridge inspection field, this thesis constructs one, which lays a solid data foundation for the follow-up research. In this process, under the guidance of professional inspectors, detailed annotation specifications for named entities and entity relations in the bridge inspection field were formulated, and a complete annotation scheme for the field was proposed through the analysis of a large number of bridge inspection reports. (3) Based on the characteristics of bridge inspection text in China, a named entity recognition method for the bridge inspection field based on the Transformer-BiLSTM-CRF model is proposed. This method uses a Transformer encoder to extract long-distance dependency features from character contexts, uses a BiLSTM to extract direction-sensitive character features, and finally uses a CRF to label the sequence of domain named entities. The experimental results show that this method can effectively identify entities such as bridge names, structural components, and structural defects (diseases). Compared with existing methods it performs better, with precision, recall, and F1 values of 91.96%, 89.54%, and 90.73%, respectively. (4) For
the text characteristics of the bridge inspection field, this thesis proposes a Lattice-LSTM-Softmax model for entity relation extraction. The model integrates word-level features into character features, so that the LSTM network obtains explicit word and word-order features alongside character features, which solves the problem of inaccurate entity relation recognition caused by inaccurate word segmentation. A comprehensive evaluation on the data set shows that the model is significantly better than the other methods compared, with precision, recall, and F1 values of 73.08%, 74.95%, and 74.00%, respectively. To sum up, this thesis analyzes the characteristics of bridge inspection text, develops annotation specifications for bridge inspection reports, and constructs a bridge inspection corpus. On this basis, it studies named entity recognition in the bridge inspection field with the Transformer-BiLSTM-CRF model and entity relation extraction in the same field with the Lattice-LSTM-Softmax model.
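The F1 values quoted above can be checked against the accompanying figures, since F1 is the harmonic mean of precision and recall. The following is a minimal sketch, assuming the first figure in each reported triple is precision in the usual sense; the function name `f1_score` is illustrative and does not come from the thesis.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall.

    Both arguments must use the same scale (both percentages or both fractions).
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures quoted in the abstract, in percent.
ner_f1 = f1_score(91.96, 89.54)  # named entity recognition (Transformer-BiLSTM-CRF)
rel_f1 = f1_score(73.08, 74.95)  # entity relation extraction (Lattice-LSTM-Softmax)

print(f"NER F1: {ner_f1:.2f}%")
print(f"Relation extraction F1: {rel_f1:.2f}%")
```

Rounded to two decimal places, both recomputed values agree with the reported 90.73% and 74.00%.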