METHODS FOR INCREASING THE CLASSIFICATION ACCURACY BASED ON MODIFICATIONS OF THE BASIC ARCHITECTURE OF CONVOLUTIONAL NEURAL NETWORKS
Object of research: basic architectures of deep learning neural networks.
Investigated problem: insufficient accuracy of solving the classification problem with the basic architectures of deep learning neural networks. Increasing accuracy usually requires a significant complication of the architecture, which in turn increases the required computing resources, video memory consumption, and training/inference time. The problem therefore arises of identifying modifications of the basic architectures that improve classification accuracy while requiring only insignificant additional computing resources.
Main scientific results: based on an analysis of existing methods for improving classification accuracy of convolutional networks with basic architectures, the most effective methods were identified: scaling the architecture (ScanNet), training an ensemble of models (TreeNet), and integrating several backbone networks (CBNet). For the computational experiments, these modifications of the basic architectures were implemented, as well as their combinations: ScanNet + TreeNet and ScanNet + CBNet.
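As an illustration of the ensemble approach (TreeNet-style), the predictions of several independently trained models are combined by averaging their class probabilities. The following is a minimal sketch under stated assumptions: the three "models" are hypothetical stand-ins that return per-class probability lists, not the actual trained networks from the experiments.

```python
# TreeNet-style ensembling (sketch): average the class probabilities
# produced by several independently trained ensemble members.

def ensemble_predict(models, x):
    """Average per-class probabilities across ensemble members."""
    probs = [model(x) for model in models]
    n = len(probs)
    return [sum(p[i] for p in probs) / n for i in range(len(probs[0]))]

# Three toy "models" (hypothetical stand-ins) on a two-class problem
model_a = lambda x: [0.9, 0.1]
model_b = lambda x: [0.6, 0.4]
model_c = lambda x: [0.75, 0.25]

avg = ensemble_predict([model_a, model_b, model_c], x=None)
# averaged probabilities: [0.75, 0.25]
```

Averaging reduces the variance of individual members' errors, which is the intuition behind the accuracy gain reported for ensemble modifications.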
The effectiveness of these methods compared with the basic architectures was demonstrated on the problem of recognizing malignant tumors in diagnostic images – the SIIM-ISIC Melanoma Classification task, whose train/test set is available on the Kaggle platform. Accuracy, measured as the area under the ROC curve, increased from 0.94489 (basic architecture network) to 0.96317 (network with ScanNet + CBNet modifications). At the same time, inference time compared to the basic architecture (EfficientNet-b5) increased from 440 to 490 seconds, and video memory consumption increased from 8 to 9.2 gigabytes, which is acceptable.
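The reported metric, area under the ROC curve (ROC AUC), can be interpreted as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one (ties counting one half). A minimal self-contained sketch of this pairwise definition, with toy labels and scores invented purely for illustration:

```python
# ROC AUC via the pairwise-comparison definition (sketch):
# fraction of positive/negative pairs ranked correctly, ties = 0.5.

def roc_auc(labels, scores):
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy data (illustrative only): 3 positives, 2 negatives
labels = [1, 1, 0, 0, 1]
scores = [0.9, 0.8, 0.7, 0.3, 0.6]
auc = roc_auc(labels, scores)
# 5 of the 6 positive/negative pairs are ranked correctly -> AUC = 5/6
```

In practice a library implementation (e.g. scikit-learn's `roc_auc_score`) would be used; the sketch only makes the metric's meaning concrete.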
Innovative technological product: methods for achieving high recognition accuracy on diagnostic data using deep learning neural networks with basic architectures.
Scope of application of the innovative technological product: automatic diagnostic systems in medicine, seismology, and astronomy (classification of images); onboard control systems and systems for monitoring transport flows, vehicle flows, or visitors (scene recognition from camera frames).
He, K., Zhang, X., Ren, S., Sun, J. (2016). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 770–778. doi: https://doi.org/10.1109/CVPR.2016.90
Xie, S., Girshick, R., Dollar, P., Tu, Z., He, K. (2017). Aggregated residual transformations for deep neural networks. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 1492–1500. doi: https://doi.org/10.1109/CVPR.2017.634
Hu, J., Shen, L., Sun, G. (2018). Squeeze-and-excitation networks. Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 7132–7141. doi: https://doi.org/10.1109/CVPR.2018.00745
Tan, M., Le, Q. (2019). EfficientNet: Rethinking model scaling for convolutional neural networks. International Conference on Machine Learning, 6105–6114.
Zhang, L., Tan, Z., Song, J., Chen, J., Bao, C., Ma, K. (2019). SCAN: A Scalable Neural Networks Framework Towards Compact and Efficient Models. Advances in Neural Information Processing Systems (NeurIPS), 4029–4038.
Lee, S., Purushwalkam, S., Cogswell, M., Crandall, D., Batra, D. (2015). Why M heads are better than one: Training a diverse ensemble of deep networks. Available at: https://arxiv.org/pdf/1511.06314
Liu, Y., Wang, Y., Wang, S., Liang, T., Zhao, Q., Tang, Z., Ling, H. (2020). CBNet: A Novel Composite Backbone Network Architecture for Object Detection. Proceedings of the AAAI Conference on Artificial Intelligence, 34 (7), 11653–11660. doi: https://doi.org/10.1609/aaai.v34i07.6834
SIIM-ISIC Melanoma Classification. Kaggle. Available at: https://www.kaggle.com/c/siim-isic-melanoma-classification
Merge External Data. Kaggle. Available at: https://www.kaggle.com/shonenkov/merge-external-data
Lin, T., Goyal, P., Girshick, R., He, K., Dollar, P. (2017). Focal Loss for Dense Object Detection. IEEE International Conference on Computer Vision, 2980–2988. doi: https://doi.org/10.1109/ICCV.2017.324
Copyright (c) 2020 Svitlana Shapovalova, Yurii Moskalenko
This work is licensed under a Creative Commons Attribution 4.0 International License.