
Research on Convolutional Neural Networks for Visual Recognition

Posted on: 2017-06-06 | Degree: Master | Type: Thesis
Country: China | Candidate: K N Xue | Full Text: PDF
GTID: 2348330509461705 | Subject: Mechanical and electrical engineering
Abstract/Summary:
In recent years, convolutional neural networks (CNNs) have made considerable progress in visual recognition tasks thanks to their powerful feature learning ability, and have attracted attention from both the academic and industrial communities. First, this work introduces two innovations in CNN architecture.

1) A hybrid model called BoCW-Net is proposed to address the problem that the fully connected layer in a CNN is sensitive to image transformations such as translation, rotation and scaling. It embeds the BoW model into the CNN architecture in place of the fully connected layer, and learns the features, dictionary and classifier end to end. To enable supervised learning of the whole BoCW-Net, a BoCW encoding based on direction similarity is proposed. In addition, to take full advantage of the discriminative power of both mid-level and high-level features, a mid-level auxiliary classifier is combined with the high-level classifier to form a main-auxiliary ensemble classifier. Experimental results show that the BoW model embedded in a CNN is more invariant to a variety of transformations than the fully connected layer, and that the main-auxiliary ensemble classifier effectively fuses mid-level and high-level features to improve the recognition performance of BoCW-Net. Compared with recently developed CNN models, BoCW-Net achieves improved recognition performance on the CIFAR-10, CIFAR-100 and MNIST datasets, with final test error rates of 4.88%, 22.48% and 0.21%, respectively.

2) Although a chain CNN can handle coarse classification using high-level features that represent global information, it does not use mid-level features, which are local, to handle fine classification. Therefore, another improved model, BoCW-FusionNet, is proposed in this work. The BoCW representations of the mid-level and high-level features are cascaded (concatenated) and then connected to the classifier. The model learns the features, dictionary and classifier end to end with supervision. Experimental results show that BoCW-FusionNet achieves a slight improvement over the chain CNN, with test error rates of 5.36% and 24.82% on the CIFAR-10 and CIFAR-100 datasets, respectively.

Second, the improved CNN models (BoCW-Net and BoCW-FusionNet) are applied to two practical problems: vehicle-pedestrian recognition and male-female gender recognition. The vehicle-pedestrian dataset contains six classes of vehicle images (bus, car, minibus, truck, cyclo and motorcycle) and one class of pedestrian images; classification accuracy is 98.06% (BoCW-Net) and 97.94% (BoCW-FusionNet). The male-female gender dataset contains two classes of facial or head images; classification accuracy is 96.20% (BoCW-Net) and 94.90% (BoCW-FusionNet). These practical applications show that the improved models achieve better recognition performance.

The recognition performance of BoCW-Net and BoCW-FusionNet indicates that, on both public datasets and application data, the main-auxiliary classifier ensemble fuses the mid-level and high-level BoCW representations more effectively than cascading them. Finally, BoCW-Net was entered in the CIFAR-10 object recognition competition and the FER2013 facial expression recognition competition on the Kaggle big data analysis platform, achieving classification accuracies of 95.10% and 70.10%, respectively, both of which ranked second on the leaderboard.
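The abstract does not give the encoding formula or say how the main and auxiliary classifiers are combined. The sketch below is therefore only an illustration under stated assumptions: "direction similarity" is taken to mean cosine similarity between a local feature vector and each dictionary codeword, assignments are softened with a softmax, the histogram is averaged over spatial positions, and the ensemble is a weighted average of class probabilities. The names `cosine`, `bow_encode`, `ensemble` and `aux_weight` are hypothetical and do not come from the thesis.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def bow_encode(local_features, codewords):
    """Encode a set of local feature vectors as one histogram over codewords.

    Assumption: each local feature casts a soft vote (softmax over its cosine
    similarities to the codewords), and votes are averaged over all positions.
    The result does not depend on the order of the positions.
    """
    hist = [0.0] * len(codewords)
    for feature in local_features:
        sims = [cosine(feature, c) for c in codewords]
        top = max(sims)
        exps = [math.exp(s - top) for s in sims]  # shifted for numerical stability
        total = sum(exps)
        for k, e in enumerate(exps):
            hist[k] += e / total
    return [h / len(local_features) for h in hist]


def ensemble(main_probs, aux_probs, aux_weight=0.5):
    """Assumed fusion rule: weighted average of main and auxiliary class probabilities."""
    return [(1.0 - aux_weight) * p + aux_weight * q
            for p, q in zip(main_probs, aux_probs)]


if __name__ == "__main__":
    feats = [[1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]  # toy local features at 3 positions
    codes = [[1.0, 0.0], [0.0, 1.0]]              # toy 2-word dictionary
    print(bow_encode(feats, codes))
    print(ensemble([0.8, 0.2], [0.6, 0.4]))
```

Because the histogram pools over positions without regard to their order, and cosine similarity ignores the magnitude of a feature vector, this kind of encoding is unaffected by reordering positions or by positive rescaling of the features. That is one plausible reading of why a BoW layer could be less sensitive to image transformations than a fully connected layer, whose weights are tied to fixed spatial positions.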
Keywords/Search Tags: convolutional neural networks, visual recognition, BoW model, BoW representation