Visual Acoustic Fields Dataset
This dataset includes ~2,000 visual-sound pairs collected across 15 scenes. For each hitting position, we include the rendered RGB image, CLIP features, and the corresponding hitting sound.
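To make the per-position contents concrete, here is a minimal sketch of how one visual-sound pair could be represented in memory and grouped by scene. The class name `HitSample`, its field names, the `sample_rate` field, and the helper `group_by_scene` are all hypothetical and chosen for illustration only; they do not describe the dataset's actual file layout or loading API.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class HitSample:
    """One visual-sound pair for a single hitting position.

    Hypothetical schema for illustration; not the dataset's real format.
    """
    scene_id: str                    # assumed identifier for one of the scenes
    rgb: List[List[List[int]]]       # rendered RGB image as nested H x W x 3 values
    clip_feature: List[float]        # CLIP features for the hitting position
    sound: List[float]               # hitting-sound waveform samples
    sample_rate: int                 # assumed audio sample rate in Hz


def group_by_scene(samples: Iterable[HitSample]) -> Dict[str, List[HitSample]]:
    """Collect samples into a dict keyed by scene_id, preserving input order."""
    grouped: Dict[str, List[HitSample]] = {}
    for sample in samples:
        grouped.setdefault(sample.scene_id, []).append(sample)
    return grouped
```

Grouping by scene is useful when a train/test split should hold out whole scenes rather than individual hitting positions.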