Parametric Classification for Generalized Category Discovery:
A Baseline Study

Xin Wen1*
Bingchen Zhao2*
Xiaojuan Qi1
1The University of Hong Kong
2University of Edinburgh

Code [GitHub]

ICCV 2023 [Paper]

Slides [Link]

Poster [Link]


Left: building blocks for representation learning or classifier learning; Right: overall abstraction of current works, where ‘→’ separates different stages of the method. Our work builds on GCD, and jointly trains a parametric classifier.

Abstract

Generalized Category Discovery (GCD) aims to discover novel categories in unlabelled datasets using knowledge learned from labelled samples. Previous studies argued that parametric classifiers are prone to overfitting to seen categories, and endorsed using a non-parametric classifier formed with semi-supervised k-means. However, in this study, we investigate the failure of parametric classifiers, verify the effectiveness of previous design choices when high-quality supervision is available, and identify unreliable pseudo-labels as a key problem. We demonstrate that two prediction biases exist: the classifier tends to predict seen classes more often, and produces an imbalanced distribution across seen and novel categories. Based on these findings, we propose a simple yet effective parametric classification method that benefits from entropy regularisation, achieves state-of-the-art performance on multiple GCD benchmarks and shows strong robustness to unknown class numbers. We hope the investigation and proposed simple framework can serve as a strong baseline to facilitate future studies in this field. Our code is available at: https://github.com/CVMI-Lab/SimGCD.


Overview


We observe strong biases in the classifier's predictions, which are the main cause of the failure of parametric classifiers in GCD. We then propose a simple yet effective parametric classification method that benefits from entropy regularisation and achieves state-of-the-art performance on multiple GCD benchmarks.
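As a rough illustration of how an entropy regulariser can counteract biased predictions, the sketch below computes the entropy of the batch-averaged softmax prediction and returns its negative as a loss term. This specific form (entropy of the mean prediction) and the names `mean_entropy_regulariser`, `weight`, and `temperature` are assumptions made for illustration, not details stated on this page; the released code at the repository linked in the abstract is the authoritative reference.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over one list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def mean_prediction(batch_logits, temperature=1.0):
    """Average the per-sample class probabilities over the batch."""
    probs = [softmax(row, temperature) for row in batch_logits]
    num_samples = len(probs)
    num_classes = len(probs[0])
    return [sum(p[j] for p in probs) / num_samples for j in range(num_classes)]


def entropy(p, eps=1e-12):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(q * math.log(q + eps) for q in p)


def mean_entropy_regulariser(batch_logits, weight=1.0, temperature=1.0):
    """Hypothetical loss term: lower when the batch-averaged prediction
    is spread more evenly over all classifier outputs."""
    return -weight * entropy(mean_prediction(batch_logits, temperature))


# A batch that collapses onto one class versus one that uses all classes.
collapsed = [[10.0, 0.0, 0.0]] * 3
balanced = [[10.0, 0.0, 0.0], [0.0, 10.0, 0.0], [0.0, 0.0, 10.0]]
print(mean_entropy_regulariser(collapsed) > mean_entropy_regulariser(balanced))
```

Under these assumptions, minimising the term pushes the average prediction towards a more even use of all categories, which is one way to discourage a classifier from favouring seen classes. A larger `weight` corresponds to stronger regularisation.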


Main results

Results on the generic image recognition datasets, the Semantic Shift Benchmark, and Herbarium 19.


Robustness to unknown category number

Results with different numbers of categories. Stronger entropy regularisation makes the model more robust to an unknown number of categories, but over-regularisation may limit its ability to recognise 'New' classes when the ground-truth class number is given.
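One simple way to inspect behaviour when the classifier is given more outputs than there are true categories is to count how many outputs actually receive predictions. The helper names `prediction_histogram` and `active_categories`, and the `min_count` threshold, are hypothetical and used only for this sketch; it is a diagnostic idea under those assumptions, not the evaluation protocol behind the reported results.

```python
from collections import Counter


def prediction_histogram(predicted_labels, num_outputs):
    """Count how many samples are assigned to each classifier output."""
    counts = Counter(predicted_labels)
    return [counts.get(c, 0) for c in range(num_outputs)]


def active_categories(predicted_labels, num_outputs, min_count=1):
    """Number of classifier outputs that receive at least `min_count` predictions."""
    hist = prediction_histogram(predicted_labels, num_outputs)
    return sum(1 for c in hist if c >= min_count)


# Toy example: a classifier with 8 outputs whose predictions use only a few of them.
preds = [0, 0, 1, 2, 2, 2, 5]
print(prediction_histogram(preds, num_outputs=8))
print(active_categories(preds, num_outputs=8))
```

Comparing the number of active outputs with the number of outputs the classifier was given offers a quick check on whether an over-specified category count leaves some outputs unused.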


Citation

Xin Wen, Bingchen Zhao, and Xiaojuan Qi
Parametric Classification for Generalized Category Discovery: A Baseline Study
In ICCV, 2023.


@inproceedings{wen2023simgcd,
  title={Parametric Classification for Generalized Category Discovery: A Baseline Study},
  author={Wen, Xin and Zhao, Bingchen and Qi, Xiaojuan},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  year={2023},
  pages={16590-16600}
}


Acknowledgements

This work has been supported by the Hong Kong Research Grants Council - Early Career Scheme (Grant No. 27209621), General Research Fund Scheme (Grant No. 17202422), and RGC Matching Fund Scheme (RMGS). Part of the described research work was conducted in the JC STEM Lab of Robotics for Soft Materials funded by The Hong Kong Jockey Club Charities Trust. The authors acknowledge SmartMore and MEGVII for partial computing support, and Zhisheng Zhong for professional suggestions. The design of this project page was borrowed and modified from the template made by Phillip Isola and Richard Zhang.