ISSN 0021-3454 (print version)
ISSN 2500-0381 (online version)

vol 67 / January, 2024

DOI 10.17586/0021-3454-2022-65-3-194-203

UDC 004.896


I. D. Nenakhov
ITMO University, Faculty of Control Systems and Robotics, International Laboratory of Biomechatronics and Energy-Efficient Robotics;

K. Artemov
ITMO University, Faculty of Control Systems and Robotics, International Laboratory of Biomechatronics and Energy-Efficient Robotics;

S. H. Zabihifar
Sberbank, Robotics Laboratory;

A. N. Semochkin
Sberbank, Robotics Laboratory;

S. A. Kolyubin
ITMO University, Saint Petersburg, 197101, Russian Federation; Associate Professor


Abstract. Ways to expand the set of recognized object classes in a segmentation task, which requires both building an object mask and determining the object's class, are considered. For the first subtask (mask construction), methods that do not depend on object classes and are the most robust to shape changes are used; for the second subtask (classification), methods based on iterative learning and on metric learning are analyzed. Metric learning is chosen as the main approach, and various neural network architectures are tested for it. Objects are classified with the k-nearest neighbors algorithm. The COIL-100 data set is used to train the neural network, and the trained model is then tested on the authors' own data set. The experiments show that the method processes 7-8 images per second on a GTX 1050 Ti graphics card with 4 GB of video memory, with a classification accuracy of 99%.
Keywords: metric learning, iterative learning, segmentation, classification, convolutional neural networks, robotics, image recognition
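
The classification step described in the abstract, k-nearest-neighbors search over embeddings produced by a metric-learning network, can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `EmbeddingKNN`, the Euclidean distance, the majority vote, and the toy 2-D vectors (standing in for the output of a trained network) are all assumptions made for illustration. The sketch shows why this approach suits expanding the set of classes: a new class is registered by adding reference embeddings, without retraining.

```python
import math
from collections import Counter


class EmbeddingKNN:
    """k-nearest-neighbors classifier over precomputed embedding vectors.

    Hypothetical helper for illustration; not taken from the article.
    """

    def __init__(self, k=3):
        self.k = k
        self.gallery = []  # list of (embedding, label) pairs

    def add_class(self, label, embeddings):
        # Registering a new class only appends reference embeddings;
        # the embedding network itself is not retrained.
        for emb in embeddings:
            self.gallery.append((tuple(emb), label))

    def predict(self, query):
        # Take the k gallery entries closest to the query (Euclidean distance).
        nearest = sorted(
            self.gallery, key=lambda item: math.dist(item[0], query)
        )[: self.k]
        # Majority vote among the labels of the k nearest entries.
        votes = Counter(label for _, label in nearest)
        return votes.most_common(1)[0][0]


# Toy 2-D embeddings standing in for the output of a trained network (assumed).
clf = EmbeddingKNN(k=3)
clf.add_class("cup", [(0.0, 0.1), (0.1, 0.0), (0.1, 0.1)])
clf.add_class("box", [(1.0, 1.0), (0.9, 1.1), (1.1, 0.9)])
pred_before = clf.predict((0.05, 0.05))

# A new class is added later without touching the existing ones.
clf.add_class("toy", [(5.0, 5.0), (5.1, 4.9), (4.9, 5.1)])
pred_new = clf.predict((5.0, 5.05))
pred_after = clf.predict((0.05, 0.05))
```

In a real pipeline the query vector would come from passing the segmented object crop through the trained embedding network; the linear scan used here would typically be replaced by an indexed nearest-neighbor search once the gallery grows.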
