Patch Matters: Training-free Fine-grained Image Caption Enhancement via Local Perception

Publication
Conference on Computer Vision and Pattern Recognition (CVPR) 2025