| This research studies the relation extraction and semantic matching problems posed by the new-generation network security evaluation platform, and establishes models suited to the platform so that they can be applied to its text-processing tasks. A new-generation network security evaluation platform should have the basic ability to intelligently analyze long-text security evaluation reports. In other words, the platform should be able to extract, in structured form, the noun entities in the text and the relationships between them, and to analyze whether statements in the context that describe relationships between the same entities are consistent. For entity-relationship extraction, a table-sequence joint extraction model is constructed to identify named entities and extract the relationships between entities in the security evaluation domain. This model effectively avoids the error accumulation of pipeline extraction and the task representation conflicts of single-model joint extraction. For semantic consistency analysis, an unsupervised semantic matching model is constructed, which not only performs semantic consistency judgment but, being unsupervised, also reduces manual effort and is capable of self-optimization. In addition, to address the lack of data in the network security evaluation domain, a data-processing pipeline for evaluation reports is built to obtain a data set that matches the entity distribution and context of real security assessment reports; this data set supports the training, construction, and verification of the domain-specific models. In this thesis, experiments on external data sets and on data sets from the security evaluation domain show that the entity-relationship joint extraction model outperforms conventional pipeline models and joint extraction models. The experiments also verify that the "combination" improves model performance by 2.4% and 1.2% on the NER and RE tasks, respectively, and that the semantic consistency matching model outperforms traditional unsupervised semantic matching models. The effect of model parameter selection on the results is explored experimentally to verify that the chosen parameters are reasonable. Finally, it is verified that the unsupervised model can build a more complete semantic space as more data are accumulated; in that setting, its AUC is about 3.06% higher than that of the conventional-data model. |