Co-Segmentation without any Pixel-level Supervision with Application to
Large-Scale Sketch Classification
This work proposes a novel method for object co-segmentation, i.e., pixel-level localization of a common object in a set of images, that uses no pixel-level supervision for training. Two pre-trained Vision Transformer (ViT) models are exploited: a ViT trained for ImageNet classification, whose features are used to estimate rough object localization through intra-class …