Video Course Portrait Research Based On Seq2Seq Structure

Posted on:2024-02-12

Degree:Master

Type:Thesis

Country:China

Candidate:X Wang

Full Text:PDF

GTID:2557306917465584

Subject:Computer Science and Technology

Abstract/Summary:

With the development of Internet technology, the volume of online content has grown explosively. While people enjoy increasingly convenient content services, the large amount of redundant information also burdens users. The same holds in education: online educational resources keep expanding and students draw on an ever wider range of sources, but as learning progresses the problem of redundant information becomes more serious and wastes a great deal of students' learning time. This thesis constructs a video course portrait system based on the Seq2Seq structure by integrating speech recognition, named entity recognition, and swarm intelligence optimization, so that students can quickly get an overview of a video's content, precisely locate the positions of the knowledge points they need to learn, and improve their learning efficiency. The specific research is as follows.

(1) To address the limited generalization ability of speech recognition models, a Conformer speech recognition model based on the R-Drop structure, Conformer-R, is proposed; it improves generalization through multiple Dropout passes. The model is first pre-trained on the Aishell1 and Wenetspeech datasets and then fine-tuned on computer-domain audio training data. Comparative tests on the test_meeting and test_net test sets provided by wenet and the test_ai test set provided by Aishell1 yield better results. The model is then fine-tuned on teaching course data and achieves the expected results.

(2) Combining the R-Drop structure with the XLNet pre-trained model, and using a Transformer encoder with relative position encoding to encode the data, the XLNet-Transformer-R model is proposed to improve accuracy on the named entity recognition task. Experiments show that on MSRA the F1 values of XLNet-Transformer-R are higher than those of the model before the improvement, and that it also performs well in comparison experiments with three other models.

(3) A multi-spatial cooperative game particle swarm algorithm is proposed. It uses the batch_bins values of the speech recognition model as particles and the model loss values as fitness values, recalculates the batch_bins size after each epoch, and thereby optimizes the model's batch_bins. Experimental results show that the optimized speech recognition model is more accurate: its character error rate on the Aishell1 test set decreases by 0.36%, which demonstrates the effectiveness of the method.
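
The abstract does not state the exact form of the R-Drop objective used in Conformer-R and XLNet-Transformer-R. As a rough illustration only, the sketch below assumes the standard R-Drop formulation: the same input is passed through the network twice with different Dropout masks, and the loss adds the two negative log-likelihood terms to a weighted symmetric KL divergence between the two output distributions. The function names, the `alpha` weight, and the reduction to a single classification step are assumptions made for this example, not details taken from the thesis.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def kl_divergence(log_p, log_q):
    """KL(p || q) for two distributions given as log-probabilities."""
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


def r_drop_loss(logits_a, logits_b, target, alpha=1.0):
    """R-Drop style loss for one example (assumed standard formulation).

    logits_a, logits_b: outputs of two forward passes of the same input,
                        each computed with a different Dropout mask.
    target:             index of the correct class.
    alpha:              weight of the consistency term (hypothetical default).
    """
    log_p = log_softmax(logits_a)
    log_q = log_softmax(logits_b)
    # Sum of the negative log-likelihoods of the two passes.
    nll = -(log_p[target] + log_q[target])
    # Symmetric KL divergence pulls the two predictions together.
    sym_kl = 0.5 * (kl_divergence(log_p, log_q) + kl_divergence(log_q, log_p))
    return nll + alpha * sym_kl


if __name__ == "__main__":
    pass_a = [2.0, 0.5, -1.0]   # made-up logits from the first pass
    pass_b = [1.0, 1.5, -0.5]   # made-up logits from the second pass
    print(r_drop_loss(pass_a, pass_b, target=0, alpha=1.0))
```

When the two passes agree exactly, the KL term vanishes and the loss reduces to twice the ordinary cross-entropy; the more the two Dropout passes disagree, the larger the penalty, which is the regularizing effect R-Drop relies on.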

Keywords/Search Tags:

speech recognition, named entity recognition, group intelligence, course profiling, deep learning

Related items

1	Research And Application Of High School Chemistry Test Question Retrieval Method Based On Chinese Named Entity Recognition
2	Design And Implementation Of Named Entity Recognition System Based On Network Text Of Winter Olympic Games
3	Research On Key Technologies Of AI-assisted Online Education
4	Research On Structured Extraction Of Recruitment Text Data Based On Deep Learning
5	Multi-channel Convolutional Classroom Speech Emotion Recognition Based On Attention Mechanism
6	Research On Named Entity Recognition Based On Deep Learning
7	Research On Automatic Labeling Algorithm Of Mathematical Knowledge Points Based On CRF And Deep Learning
8	Improved Research On Speech Emotion Recognition Based On Phonological Representation
9	Research On Speech Recognition Technology For Online Education Application
10	Research And Application Of Named Entity Recognition In Tourism Domain Based On Lexical Enhancement And Feature Fusion