Abstract
Clothing detection, which localizes garments in an image and identifies their specific category (e.g., long-sleeved or short-sleeved), is an active research topic. Existing approaches fall into two categories. Top-down methods are anchor-based and compute the intersection over union between anchor boxes and ground-truth bounding boxes; they are constrained by the anchor settings and perform poorly when clothing scales vary widely. Bottom-up methods use a feature-extraction network to predict keypoints and infer the position and size of the clothing from them, but the keypoint predictions often contain small errors because they lack information about the interior of the garment. To address these issues, we propose a multi-keypoints matching network for clothing detection (MKMnet) based on the bottom-up paradigm. It detects three keypoints per object (the top-left corner, the bottom-right corner, and the center point) to achieve high detection accuracy. We first match corner keypoints by computing the distance between the embedding vectors of candidate corner pairs, yielding initial bounding boxes, and then obtain the final bounding boxes by matching each candidate against a center point. Grouping corners in this way allows the model to detect clothing of any scale and shape, while the additional center-point verification eliminates a large number of false-positive boxes. MKMnet thus obtains accurate bounding boxes through the linear combination of corner and center points, improving the accuracy of clothing recognition. Experimental results show that MKMnet achieves higher accuracy than existing methods.
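The grouping step described above can be sketched in code. The following is a minimal illustration of the general corner-grouping-with-center-verification idea (in the spirit of CornerNet/CenterNet-style detectors), not the authors' implementation: the function names, the embedding threshold, and the choice of the middle third of the box as the central region are all illustrative assumptions.

```python
def group_keypoints(tl_points, br_points, centers, embed_thresh=0.5):
    """Pair top-left and bottom-right corners whose 1-D embeddings are
    close, then keep only candidate boxes whose central region contains
    a predicted center keypoint (illustrative sketch, not MKMnet itself)."""
    candidates = []
    for tlx, tly, tl_emb in tl_points:
        for brx, bry, br_emb in br_points:
            # A valid pair must form a proper box (corner order is correct)
            # and the two embeddings must be close enough to belong together.
            if brx <= tlx or bry <= tly:
                continue
            if abs(tl_emb - br_emb) > embed_thresh:
                continue
            candidates.append((tlx, tly, brx, bry))

    verified = []
    for x1, y1, x2, y2 in candidates:
        # Central region: the middle third of the candidate box
        # (an assumed definition for this sketch).
        cx1, cy1 = x1 + (x2 - x1) / 3, y1 + (y2 - y1) / 3
        cx2, cy2 = x2 - (x2 - x1) / 3, y2 - (y2 - y1) / 3
        if any(cx1 <= cx <= cx2 and cy1 <= cy <= cy2 for cx, cy in centers):
            verified.append((x1, y1, x2, y2))
    return verified
```

For example, a corner pair with mismatched embeddings is rejected at the matching stage, and a matched pair whose box contains no center keypoint is rejected at the verification stage, which is how false positives are suppressed.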
Availability of data
The data that support the findings of this study are available online. These datasets were derived from the following public domain resources: [DeepFashion2].
References
Chen, H., Gallagher, A., Girod, B.: Describing clothing by semantic attributes. In: Computer Vision–ECCV 2012: 12th European Conference on Computer Vision, Florence, Italy, October 7-13, 2012, Proceedings, Part III 12, pp. 609–623 (2012). Springer
Yan, S., Liu, Z., Luo, P., Qiu, S., Wang, X., Tang, X.: Unconstrained fashion landmark detection via hierarchical recurrent transformer networks. In: Proceedings of the 25th ACM International Conference on Multimedia, pp. 172–180 (2017)
Wang, W., Xu, Y., Shen, J., Zhu, S.-C.: Attentive fashion grammar network for fashion landmark detection and clothing category classification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4271–4280 (2018)
Yamaguchi, K., Hadi Kiapour, M., Berg, T.L.: Paper doll parsing: Retrieving similar styles to parse clothing items. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 3519–3526 (2013)
Ji, X., Wang, W., Zhang, M., Yang, Y.: Cross-domain image retrieval with attention modeling. In: Proceedings of the 25th ACM International Conference on Multimedia, pp. 1654–1662 (2017)
Liao, L., He, X., Zhao, B., Ngo, C.-W., Chua, T.-S.: Interpretable multimodal retrieval for fashion products. In: Proceedings of the 26th ACM International Conference on Multimedia, pp. 1571–1579 (2018)
Law, H., Deng, J.: Cornernet: Detecting objects as paired keypoints. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 734–750 (2018)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
Lin, T.-Y., Dollár, P., Girshick, R., He, K., Hariharan, B., Belongie, S.: Feature pyramid networks for object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2117–2125 (2017)
Ge, Y., Zhang, R., Wang, X., Tang, X., Luo, P.: Deepfashion2: A versatile benchmark for detection, pose estimation, segmentation and re-identification of clothing images. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5337–5345 (2019)
Chen, M., Qin, Y., Qi, L., Sun, Y.: Improving fashion landmark detection by dual attention feature enhancement. In: Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops (2019)
Shajini, M., Ramanan, A.: An improved landmark-driven and spatial-channel attentive convolutional neural network for fashion clothes classification. Vis. Comput. 37(6), 1517–1526 (2021)
Lin, T.-H.: Aggregation and finetuning for clothes landmark detection. arXiv preprint arXiv:2005.00419 (2020)
Chen, K., Pang, J., Wang, J., Xiong, Y., Li, X., Sun, S., Feng, W., Liu, Z., Shi, J., Ouyang, W., et al.: Hybrid task cascade for instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4974–4983 (2019)
Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019)
Majuran, S., Ramanan, A.: A single-stage fashion clothing detection using multilevel visual attention. Vis. Comput., 1–15 (2022)
Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.-Y., Berg, A.C.: Ssd: Single shot multibox detector. In: Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11–14, 2016, Proceedings, Part I 14, pp. 21–37 (2016). Springer
Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You only look once: Unified, real-time object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 779–788 (2016)
Yang, W., Luo, P., Lin, L.: Clothing co-parsing by joint image segmentation and labeling. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3182–3189 (2014)
Hadi Kiapour, M., Han, X., Lazebnik, S., Berg, A.C., Berg, T.L.: Where to buy it: Matching street clothing photos in online shops. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 3343–3351 (2015)
Zheng, S., Yang, F., Kiapour, M.H., Piramuthu, R.: Modanet: A large-scale street fashion dataset with polygon annotations. In: Proceedings of the 26th ACM International Conference on Multimedia, pp. 1670–1678 (2018)
Sidnev, A., Krapivin, A., Trushkov, A., Krasikova, E., Kazakov, M.: Deepmark++: Centernet-based clothing detection (2020)
Kim, H.J., Lee, D.H., Niaz, A., Kim, C.Y., Memon, A.A., Choi, K.N.: Multiple-clothing detection and fashion landmark estimation using a single-stage detector. IEEE Access 9, 11694–11704 (2021)
Duan, K., Bai, S., Xie, L., Qi, H., Huang, Q., Tian, Q.: Centernet: Keypoint triplets for object detection. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 6569–6578 (2019)
Lin, T.-Y., Goyal, P., Girshick, R., He, K., Dollár, P.: Focal loss for dense object detection. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2980–2988 (2017)
Duan, K., Bai, S., Xie, L., Qi, H., Huang, Q., Tian, Q.: Centernet++ for object detection. arXiv preprint arXiv:2204.08394 (2022)
Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. In: Computer Vision–ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part V 13, pp. 740–755 (2014). Springer
Yang, Z., Liu, S., Hu, H., Wang, L., Lin, S.: Reppoints: Point set representation for object detection. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 9657–9666 (2019)
Tian, Q., Chanda, S., Gray, D.: Improving apparel detection with category grouping and multi-grained branches. Multimedia Tools and Applications 82(5), 7383–7400 (2023)
Funding
This work was supported in part by the Young People Fund of the Xinjiang Science and Technology Department (No. 2022D01B05).
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Li, Y., Zhang, W., Wu, M. et al. Multi-keypoints matching network for clothing detection. Vis Comput (2024). https://doi.org/10.1007/s00371-024-03337-y