| With the wide application of big data and artificial intelligence technologies, natural language processing has developed rapidly. Corpora, as the foundation of natural language processing, have become a focus of research. A corpus is a large-scale electronic text library that has been scientifically sampled and processed, which researchers can use, with the help of computers, to study language theory and its applications. Within natural language processing, techniques for constructing English-Chinese bilingual parallel corpora are progressing rapidly, but research on constructing bilingual parallel corpora of NOTAMs is still in its infancy, and there is considerable room for development in both corpus scale and the related technologies. This work therefore studies and constructs a corpus for the NOTAM domain, and makes the following contributions: based on an analysis of common methods for constructing English-Chinese bilingual parallel corpora and of the characteristics of NOTAM text itself, a construction method for a NOTAM parallel corpus is proposed; the corpus is collected using web crawlers and other methods, and is preprocessed algorithmically; on the basis of the preprocessed corpus, paragraph alignment and sentence alignment algorithms are used to construct the corresponding paragraph-aligned and sentence-aligned corpora; once these corpora were constructed, a NOTAM corpus parsing system was developed using Python and the Flask framework to facilitate the continued accumulation of NOTAM corpus data; and a NOTAM machine translation model based on transfer learning and back-translation is proposed, filling the gap left by the lack of machine translation models in the NOTAM field. In summary, a method for collecting NOTAM corpus data and algorithms for paragraph alignment and sentence alignment are proposed, a corpus is constructed, and the algorithms are integrated into the system to facilitate the subsequent collection of parallel corpora; in addition, a NOTAM machine translation model based on back-translation and transfer learning is proposed. Experiments show that the paragraph alignment algorithm accurately extracts paragraph-aligned corpus data, the sentence alignment algorithm extracts sentence-aligned corpus data, the parallel corpus parsing system completes the corpus accumulation task, and the machine translation model accurately translates domain-specific NOTAM text. |