Description: VALUE (Video-And-Language Understanding Evaluation) is an evaluation benchmark built on 11 datasets across 3 tasks, with a focus on multi-channel videos (video+subtitle).
video captioning (5), video description (3), video+subtitle (2), video-and-language benchmark (1), multi-channel video (1), video retrieval (1), video question answering (1)
A Comprehensive Benchmark for Video-And-Language Understanding Evaluation.
With both Video Frames and Subtitle/ASR
Diverse video content from YouTube, TV Episodes and Movie Clips