Datasets

Refer and Segment Objects in Audio-Visual Scenes (Ref-AVS) Dataset

Long Form Audio-Visual (LFAV) Dataset

Balanced Audiovisual Dataset

MUSIC-AVQA Dataset

ADVANCE: AuDio Visual Aerial sceNe reCognition datasEt

DISCO: auDIoviSual Crowd cOunting dataset

MUSIC-Synthetic & VGGSound-Synthetic Datasets

Realistic MUSIC & DailyLife Datasets

Shuttersong Dataset