To give new teams an easy entry, we provide content analysis results to all teams. The V3C1 dataset already comes with segmentation information, including shot boundaries and keyframes. In addition, we provide data from several content analysis steps (e.g., color, faces, text, and detected ImageNet classes). The analysis data is available here and described in this article. The ASR data has also been released here (many thanks to Luca Rossetto et al.)! Moreover, the SIRET team shared their shot detection network too (many thanks to Jakub Lokoc and his team)!

Shot Boundary Detection

The SIRET team provides a state-of-the-art shot boundary detection network TransNet V2 in (see paper here:
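For teams new to shot segmentation, the following minimal sketch shows how per-frame transition scores, such as those a shot boundary detection network might output, can be turned into shot ranges. It is not the TransNet V2 API: the function name `shots_from_scores`, the score list, and the 0.5 threshold are all assumptions made for illustration.

```python
from typing import List, Tuple


def shots_from_scores(scores: List[float], threshold: float = 0.5) -> List[Tuple[int, int]]:
    """Group frames into shots given per-frame transition scores.

    Hypothetical helper: frames scoring at or above `threshold` are treated
    as transition frames; every maximal run of the remaining frames becomes
    one shot, returned as an inclusive (first_frame, last_frame) pair.
    """
    shots: List[Tuple[int, int]] = []
    start = None  # index of the first frame of the shot currently open
    for i, score in enumerate(scores):
        if score < threshold:
            if start is None:
                start = i  # a new shot begins here
        elif start is not None:
            shots.append((start, i - 1))  # the transition closes the open shot
            start = None
    if start is not None:
        shots.append((start, len(scores) - 1))  # close the final shot
    return shots


# Made-up scores for eight frames, with transitions around frames 2 and 5-6.
example_scores = [0.1, 0.0, 0.9, 0.2, 0.1, 0.8, 0.95, 0.05]
example_shots = shots_from_scores(example_scores)
print(example_shots)
```

A real pipeline would typically also map frame indices to timestamps and might merge very short shots; both steps are omitted here.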

Existing Browsing Tool

If you want to join the VBS competition but do not have enough resources to build a new system from scratch, you can start with and extend a simple lightweight version of SOMHunter, the winning system at VBS 2020. The system is provided with all the necessary metadata for the V3C1 dataset.

Providing a solid basis for research and development in the area of multimedia management and retrieval, vitrivr is a modular open-source multimedia retrieval stack which has been participating in VBS for several years. Its flexible architecture allows it to serve as a platform for the development of new retrieval approaches. The entire stack is available from

V3C1 Dataset

VBS2021 will use the V3C1 dataset in collaboration with NIST, i.e. TRECVID 2020 (with the Ad-Hoc Video Search (AVS) Task). The dataset consists of 7475 video files, amounting to 1000 hours of video content (1082659 predefined segments) and 1.3 TB in size. In order to download the dataset (which is provided by NIST), please complete this data agreement form and send a scan to with CC to and You will be provided with a link for downloading the data.
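To get a feel for the scale of the dataset when planning storage and indexing, the figures above can be turned into rough per-video and per-segment averages. The sketch below uses only the numbers quoted in this section. It treats 1000 hours as exact and 1 TB as 10^6 MB, both of which are assumptions, so the results are estimates only.

```python
# Figures quoted for V3C1 in this section.
NUM_VIDEOS = 7475
TOTAL_HOURS = 1000        # quoted as a round number; assumed exact here
NUM_SEGMENTS = 1_082_659
TOTAL_TB = 1.3            # assumed decimal terabytes (1 TB = 10**6 MB)

# Rough averages derived from the quoted totals.
avg_minutes_per_video = TOTAL_HOURS * 60 / NUM_VIDEOS
avg_segments_per_video = NUM_SEGMENTS / NUM_VIDEOS
avg_seconds_per_segment = TOTAL_HOURS * 3600 / NUM_SEGMENTS
avg_mb_per_video = TOTAL_TB * 10**6 / NUM_VIDEOS

print(f"Average video length:   {avg_minutes_per_video:.1f} min")
print(f"Segments per video:     {avg_segments_per_video:.0f}")
print(f"Average segment length: {avg_seconds_per_segment:.1f} s")
print(f"Average file size:      {avg_mb_per_video:.0f} MB")
```

Under these assumptions, a typical video is about 8 minutes long and is split into roughly 145 segments of a little over 3 seconds each.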

