Intent detection is a challenging and critical problem in dialogue systems. It aims to classify user utterances into categories within their respective domains for downstream tasks. With the rapid growth of data on the internet, user intents have become increasingly diverse, and many open intents never appear in the training set. However, current intent detection research focuses mostly on closed datasets, and existing methods face three challenges. First, it is difficult to obtain discriminative feature representations without prior knowledge of open intents. Second, effective methods for obtaining a specific, tight decision boundary are lacking. Third, annotations are scarce and costly, and unlabeled data is not fully utilized. These challenges make it difficult for existing methods to meet user needs. Against this background, enabling computers to detect open intent samples from a small amount of labeled data has become a key problem that urgently needs solving in both academia and industry. To address these challenges, this thesis studies open intent detection based on deep learning methods. The main research contents are as follows:

1) This thesis proposes a deep feature representation method based on contrastive learning. Existing research rarely considers distance information during the feature learning stage, so the final features may not genuinely reflect the distance relationships between samples of different intent categories. This thesis proposes a contrastive learning method that incorporates distance information into the feature representation stage to learn more discriminative feature representations, thereby improving the performance of subsequent open intent detection. First, an improved contrastive learning loss function is used to fine-tune the pre-trained language model so as to maximize inter-class distance and minimize intra-class distance. Second, trained classifiers are used to classify text features
according to intent. Finally, an outlier detection method based on local density is used to detect open intents. The method was evaluated on two publicly available benchmark datasets for open intent detection. Experimental results show that it detects open intents effectively and outperforms the baseline methods. In addition, it is a general feature representation learning method that can be combined with other detection methods.

2) This thesis proposes an open intent detection method based on decision boundaries. The lack of prior knowledge about unknown intent samples limits open intent detection methods. This thesis uses negative samples generated in a self-supervised manner to simulate unknown (open) intent samples and learns an appropriate decision boundary for each intent category. First, a feature extraction model is used to extract deep feature representations from user input. Second, various negative sample augmentation methods are used to simulate the real-world intent distribution. Finally, positive and negative samples are used to learn adaptive decision boundaries. In addition, both feature-level and character-level negative sample generation techniques are used to learn better decision boundaries and improve robustness. Experiments on three benchmark datasets show that the proposed method achieves an F1-score about 3% higher than current advanced methods and learns decision boundaries close to optimal.

3) This thesis proposes a semi-supervised learning method for open intent detection. Current open intent detection work relies mainly on labeled data in the training stage, while a large amount of unlabeled data is not fully utilized. Therefore, this thesis proposes a semi-supervised learning framework that uses the distance between samples and class centers for pseudo-labeling, together with a novel confidence score calculation method that alleviates errors in the pseudo-labeling process. First, the method pre-trains the language model
using labeled data so that it can serve as a feature extractor. Second, the method pseudo-labels the unlabeled samples and computes a confidence score for each. Finally, it trains the adaptive decision boundary using positive and negative samples and their corresponding confidence scores. Experimental results on three benchmark datasets show that the proposed method effectively utilizes the unlabeled data and improves the F1-score by up to 6%.

The experiments show that the methods proposed in this thesis achieve good performance on their respective research problems. They not only address key scientific problems in open domains but also facilitate the transfer of academic research results into practical applications. Furthermore, these methods are beneficial for constructing task-oriented dialogue systems and lay a foundation for further exploration of human-machine interaction. Finally, this thesis discusses the challenges currently facing this research and the prospects for future work.
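
To make the distance-based ideas in contributions 2) and 3) concrete, the following is a minimal sketch, not the thesis's implementation. It assumes sentence features are already extracted as NumPy arrays. The function names (`class_centers`, `pseudo_label`, `detect_open`) are hypothetical, the confidence score is a simple softmax over negative distances standing in for the thesis's own (unspecified here) confidence calculation, and the per-class radii are assumed to be given rather than learned.

```python
import numpy as np


def class_centers(features, labels, num_classes):
    """Mean feature vector of each known intent class (labels are 0..K-1)."""
    return np.stack(
        [features[labels == k].mean(axis=0) for k in range(num_classes)]
    )


def _distances(features, centers):
    """Euclidean distance from every sample to every class center, shape (n, K)."""
    diff = features[:, None, :] - centers[None, :, :]
    return np.linalg.norm(diff, axis=2)


def pseudo_label(features, centers):
    """Assign each unlabeled sample to its nearest class center.

    The confidence is an ASSUMED stand-in: a softmax over negative distances,
    so samples roughly equidistant from several centers get low confidence.
    """
    dists = _distances(features, centers)
    labels = dists.argmin(axis=1)
    logits = -dists
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    probs = probs / probs.sum(axis=1, keepdims=True)
    confidence = probs[np.arange(len(features)), labels]
    return labels, confidence


def detect_open(features, centers, radii, open_label=-1):
    """Spherical decision boundary per class.

    A sample keeps the label of its nearest center only if it lies within
    that class's radius; otherwise it is flagged as an open intent.
    """
    dists = _distances(features, centers)
    nearest = dists.argmin(axis=1)
    nearest_dist = dists[np.arange(len(features)), nearest]
    return np.where(nearest_dist <= radii[nearest], nearest, open_label)
```

In this sketch, a sample far from every center is rejected as open, while the confidence from `pseudo_label` could be used to down-weight uncertain pseudo-labels when fitting the radii.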