| Suicide is a major cause of unnatural death worldwide. It seriously harms individuals, families, and society, and has become a global social problem. There is therefore a pressing need to detect individuals with suicide ideation as early as possible and to intervene or provide assistance, so as to reduce the occurrence of suicidal behavior. At present, suicide ideation is mainly assessed by psychologists using questionnaires, scales, or interviews, but such invasive methods may lead patients to resist and conceal their true situation. These methods are also highly inefficient and are not suitable for screening large populations. With the development of the Internet, the public is increasingly inclined to express inner thoughts on social media, which opens a new avenue for suicide detection research. Some studies have detected suicide ideation from the text and other information that users post on social media; such methods are promising because they can efficiently identify suicidal users in large amounts of data. However, existing research still faces problems that urgently need to be solved, such as the lack of highly reliable public datasets, data imbalance, and poor model generalizability. These problems have seriously restricted the development of the field of suicide ideation detection. Building on existing research, this paper offers tentative solutions to these problems. Specifically, it covers the following research content: (1) Construction of a Chinese suicide ideation detection dataset based on Sina Weibo. Existing social media datasets for suicide detection are mainly in English, and highly credible Chinese datasets are lacking. In this paper, we construct a dataset for Chinese suicide ideation detection based on Sina Weibo and desensitize the private
content. This dataset contains a total of 563,336 posts from 1,606 users with suicide ideation and 2,915 control-group users. (2) A deep hierarchical ensemble model for suicide ideation detection on imbalanced social media data. To address data imbalance in suicide detection, this paper proposes a deep hierarchical ensemble model for suicide detection (DHE-SD) based on a hierarchical ensemble strategy, which partitions the imbalanced dataset and combines the predictions of multiple classifiers through an ensemble method. In addition, this paper proposes a sentence-level mask mechanism that masks users' posts with strong negative sentiment in a specific environment, so that the trained suicide detection model generalizes better. (3) A deep suicide ideation detection model based on multi-feature fusion and decision-level fusion. Based on the characteristics of the Sina Weibo platform, this paper extracts three types of representative features from users' posts and fuses them with text information. It also proposes a decision-level fusion mechanism that combines classifiers trained on multiple features, yielding a deep suicide ideation detection model called TCNN-MF-DL. To verify the effectiveness of the proposed DHE-SD and TCNN-MF-DL models, extensive comparison experiments and result analyses are conducted. The experiments show that the proposed models achieve excellent suicide ideation detection performance and can accurately and efficiently detect users with suicide ideation on social media. |
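
The idea of partitioning an imbalanced dataset and combining several classifiers can be illustrated with a minimal sketch. This is not the DHE-SD architecture, whose details the abstract does not give. It assumes one common scheme: the majority (control) class is split into chunks of roughly the minority size, one classifier is trained per balanced subset, and the predicted probabilities are averaged. The names `split_majority`, `CentroidClassifier`, `train_ensemble`, and `ensemble_proba` are hypothetical, and the one-dimensional centroid classifier is a toy stand-in for a deep text model.

```python
import random
from statistics import mean


def split_majority(majority, n_parts, seed=0):
    """Shuffle the majority class and deal it into n_parts roughly equal chunks."""
    items = list(majority)
    random.Random(seed).shuffle(items)
    return [items[i::n_parts] for i in range(n_parts)]


class CentroidClassifier:
    """Toy stand-in for a deep model (assumption): scores a 1-D feature
    by its relative distance to the two class means."""

    def fit(self, pos, neg):
        self.pos_mean = mean(pos)
        self.neg_mean = mean(neg)
        return self

    def predict_proba(self, x):
        # Probability of the positive class: closer to pos_mean -> higher.
        d_pos = abs(x - self.pos_mean)
        d_neg = abs(x - self.neg_mean)
        total = d_pos + d_neg
        return 0.5 if total == 0 else d_neg / total


def train_ensemble(minority, majority, seed=0):
    """Train one classifier per balanced subset (minority + one majority chunk)."""
    n_parts = max(1, round(len(majority) / len(minority)))
    chunks = split_majority(majority, n_parts, seed)
    return [CentroidClassifier().fit(minority, chunk) for chunk in chunks]


def ensemble_proba(models, x):
    """Combine the classifiers by averaging their probabilities (soft voting)."""
    return mean(m.predict_proba(x) for m in models)


# Made-up 1-D "risk scores"; real inputs would be learned text representations.
minority = [0.8, 0.9, 1.0]
majority = [0.0, 0.1, 0.2, 0.1, 0.3, 0.2]
models = train_ensemble(minority, majority)
high = ensemble_proba(models, 0.95)
low = ensemble_proba(models, 0.1)
```

With the class sizes reported above (1,606 versus 2,915 users), this particular rule would split the control group into two chunks. Averaging probabilities is only one possible combination rule; majority voting or a learned meta-classifier would fit the same outline.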