| Very High Frequency (VHF) radio, one of the most important means of maritime mobile radio communication, is the primary mode of communication between vessels and Vessel Traffic Service (VTS) centers. However, VHF voice communication at sea suffers from high noise levels, unclear transmissions, and diverse accents, which make communication intentions hard to recognize and lower conversation efficiency, significantly affecting both the safety of vessel navigation and the efficiency of VTS management. To address this problem, this study investigates intelligent processing algorithms for VHF voice, using state-of-the-art deep learning methods to build models for VHF voice enhancement, recognition, and text analysis, thereby improving the quality, efficiency, and automation level of VHF communication. The main contributions of this study are as follows: (1) Based on an analysis of the major problems facing maritime VHF communication, and drawing on recent advances in speech and natural language processing, a comprehensive intelligent processing solution for VHF voice is proposed. (2) For VHF voice enhancement, a VHF voice dataset is constructed from collected real-world VHF voice samples, together with noise extracted from these samples and added to a clean public speech dataset. For VHF voice recognition and text analysis, a VHF text analysis dataset is constructed by annotating matched VHF voice and text and by annotating key entities in the text. (3) A Transformer-based VHF voice enhancement model is proposed that is tailored to the characteristics of maritime VHF voice and can reduce long-term and short-term noise simultaneously. On the VHF voice test set, its Perceptual Evaluation of Speech Quality (PESQ) score reaches 2.31 and its Short-Time Objective Intelligibility (STOI) reaches 0.78, outperforming other speech enhancement models such as SEGAN, Wave-U-Net, and TSTNN. (4) By leveraging the
end-to-end speech recognition framework WeNet and its Conformer speech recognition module, VHF voice is recognized to obtain VHF dialogue text. After training on the VHF dataset constructed in this study, the model’s error rate on the test set is 15.73%, lower than the Transformer model’s 15.96%. (5) A joint entity and relation extraction model is proposed that combines the pre-trained language model BERT with the bidirectional recurrent neural network BiLSTM, enabling analysis of VHF dialogue text and extraction of key information in triple form, thereby improving the accuracy and efficiency of VHF voice processing. Trained on the VHF text dataset constructed in this study, the model achieves an F1 score of 83.6%, an accuracy of 84.3%, and a recall of 82.3%, outperforming other mainstream joint entity and relation extraction models such as ETL-Span, TDEER, and BiRTE. By applying recent deep learning methods from speech and natural language processing, this study builds a series of models for intelligent VHF voice processing in the maritime domain and achieves promising results. It is a useful exploration of applying artificial intelligence in the maritime domain, with both theoretical significance and practical value for improving intelligent voice interaction between vessels and communication management at VTS centers. |
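
The recognition error rates quoted in contribution (4) are presumably edit-distance-based rates of the usual kind. The abstract does not say whether they are computed over words or characters, nor how the authors scored them. As a minimal sketch under that assumption, the rate can be computed over any token sequence as follows. The function names `edit_distance` and `error_rate` and the sample utterance are illustrative and do not come from the study.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    # prev[j] holds the distance between the reference prefix processed
    # so far and the first j tokens of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(ref, hyp):
    """Edit distance normalized by the reference length."""
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# Hypothetical transcript pair: one reference word is missing.
ref = "alter course to starboard".split()
hyp = "alter course starboard".split()
print(error_rate(ref, hyp))  # 0.25
```

Passing lists of words gives a word error rate; passing the raw strings gives a character error rate, since Python strings iterate character by character.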