Research On Learner Behavior Classification Method Based On Time Sequence Action Detection

Posted on: 2022-08-25 | Degree: Master | Type: Thesis
Country: China | Candidate: C Y Song | Full Text: PDF
GTID: 2517306344451234 | Subject: Computer Software and Application of Computer
Abstract/Summary:
Understanding learners' behaviors in a timely manner is one of the active research topics in intelligent education. With temporal action detection from computer vision, students' learning states can be identified automatically, which provides strong support for evaluating learning outcomes. This thesis focuses on the actions students may perform while learning in indoor teaching videos, and discusses how to accurately locate the start and end times of actions and identify the corresponding action categories.

(1) Following the conventions of open video datasets such as THUMOS 2014 and ActivityNet, and with reference to benchmarks for building large-scale video data, the thesis addresses the problem that existing datasets cannot support learner detection tasks in indoor teaching scenes. It constructs a learner behavior detection video dataset for classroom and computer room scenarios, which provides data support for student behavior detection methods. The dataset contains 383 videos of student behavior, totaling 3.61 GB, at a resolution of 1280 × 720 and 30 frames per second. It covers 13 types of actions (such as listening to class, taking notes, standing up and answering questions, etc.), with an average of 55 samples per action and durations of about 6-9 seconds.

(2) Previous methods cannot make full use of the audio and visual information specific to videos of classroom and computer room scenes, and do not integrate the two kinds of information effectively. To address this, a Bi-Modal Transformer Porps temporal action proposal method that can read the audio features of a video is proposed. The method first preprocesses the video and extracts audio and visual features, then encodes them with a feature encoder, passes them to an action proposal decoder, and finally, after post-processing, outputs time segments with start and end times. Experiments on the open dataset THUMOS 2014 show that the proposal results reach an AR-AN@100 of 42.4%, which is 10.51% better than the existing TURN network and 13.4% better than the TAG network, and differs by 0.2% from the CTAP network, which is essentially on par. On the "Learner Behavior Detection Dataset" video dataset, AR-AN@100 is 65.8%.

(3) The two-stream features used in previous methods ignore the temporal information in learner behavior videos. To address this, an Enhanced-Decouple-SSAD temporal action detection method is proposed, which is suited to video features that carry temporal information. First, the video features containing temporal information are preprocessed and passed through the Enhanced-Decouple-SSAD network. Finally, the decoupled and prediction layers output action segments with start and end times together with the action categories in those segments. Experiments on the open dataset THUMOS 2014 show an mAP of 46.4%, which is better than the Decouple SSAD network by 2.7% and the BSN network by 9.5%; the figure reported on the "Learner Behavior Detection Dataset" video dataset is 13.37%.
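
The AR-AN@100 figures reported above rest on matching proposed time segments to annotated ones by temporal overlap. The following is a minimal Python sketch of that idea for a single video, not the thesis's evaluation code. The function names, the list of IoU thresholds, and the per-video simplification are all assumptions made for illustration; benchmark toolkits average over many videos and may use a different threshold range.

```python
from typing import Sequence, Tuple

Segment = Tuple[float, float]  # (start_seconds, end_seconds)


def temporal_iou(a: Segment, b: Segment) -> float:
    """Intersection over union of two time segments."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def recall_at_n(proposals: Sequence[Segment],
                ground_truth: Sequence[Segment],
                n: int,
                threshold: float) -> float:
    """Fraction of ground-truth segments matched by any of the top-n proposals.

    `proposals` is assumed to be sorted by descending confidence.
    """
    if not ground_truth:
        return 0.0
    top = proposals[:n]
    hits = sum(
        1 for gt in ground_truth
        if any(temporal_iou(gt, p) >= threshold for p in top)
    )
    return hits / len(ground_truth)


def average_recall(proposals: Sequence[Segment],
                   ground_truth: Sequence[Segment],
                   n: int = 100,
                   thresholds: Sequence[float] = (0.5, 0.6, 0.7, 0.8, 0.9)) -> float:
    """Mean of recall_at_n over several IoU thresholds (threshold list is an assumption)."""
    recalls = [recall_at_n(proposals, ground_truth, n, t) for t in thresholds]
    return sum(recalls) / len(recalls)


if __name__ == "__main__":
    # Hypothetical example: two annotated actions, three ranked proposals.
    gt = [(0.0, 10.0), (20.0, 30.0)]
    props = [(0.0, 10.0), (21.0, 29.0), (50.0, 60.0)]
    print(average_recall(props, gt, n=100))
```

Averaging recall over several thresholds rewards proposals whose boundaries are tight, not only those that overlap an action loosely, which matters for short actions such as the 6-9 second clips described in the dataset.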
Keywords/Search Tags: Temporal action localization, Temporal action detection, Indoor scene, Learner behavior