TheWebConf 2026 Study Repo
This repository provides supplementary material used in our experiments on video ad classification using Multimodal LLMs.
It includes the annotated dataset, metadata sources, transcription pipelines, experimental notebooks, and the annotation codebook.
Contains the final annotated dataset.
Each row includes:
- 🎬 Video ID
- 🏷️ Primary Label
- 🏷️ Secondary Label
- ✏️ English (Translated) Transcription
- ✏️ Native Transcription
- 🗂️ Metadata fields (title, tags, thumbnail, channelTitle, description)
- 🌍 Languages — indicates unavailable or region-restricted videos
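As an illustration, rows with the fields above can be read with a few lines of standard-library Python. This is a minimal sketch: the sample data and the exact column headers here are illustrative and may differ from those in the released CSV.

```python
import csv
import io

# Illustrative one-row sample mirroring the fields listed above;
# the real dataset ships as a CSV with one row per annotated video.
SAMPLE = """video_id,primary_label,secondary_label,english_transcription,native_transcription,title,language
abc123,Automotive,Luxury,"Drive the future today.","Fahre heute die Zukunft.",New Car Ad,de
"""

def load_rows(csv_text):
    """Parse the annotated dataset into a list of dicts, one per video."""
    return list(csv.DictReader(io.StringIO(csv_text)))

rows = load_rows(SAMPLE)
print(rows[0]["video_id"], rows[0]["primary_label"])
```

In the repository itself you would pass the contents of ground_truth.csv instead of the inline sample.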
Notebooks used for dataset construction, metadata enrichment, and experimental evaluation.
- 📥 YouTube Videos Download.ipynb — Raw video collection & filtering
- 🧾 Download Video Metadata.ipynb — Metadata retrieval & preprocessing
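The metadata notebook presumably queries the YouTube Data API v3 for the snippet fields listed earlier (title, tags, thumbnail, channelTitle, description). A hedged sketch of assembling such a request URL, with the function name and key placeholder being illustrative:

```python
from urllib.parse import urlencode

API_BASE = "https://www.googleapis.com/youtube/v3/videos"

def build_metadata_url(video_ids, api_key):
    """Build a YouTube Data API v3 videos.list URL requesting the
    'snippet' part, which carries title, tags, thumbnails,
    channelTitle, and description."""
    params = {
        "part": "snippet",
        "id": ",".join(video_ids),  # the API accepts comma-separated IDs
        "key": api_key,
    }
    return f"{API_BASE}?{urlencode(params)}"

url = build_metadata_url(["abc123", "def456"], "YOUR_API_KEY")
```

The actual notebook may batch IDs or use a client library; this only shows the shape of the request.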
Run, evaluate, and reproduce experiments on:
- 🗣️ Transcription-only models
- 🏷️ Metadata-only models
- 🔀 Multimodal fusion pipelines
- 🔮 Gemini-based baselines
- 🔬 Ablations & sampling strategies
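As an illustration of the fusion setup above, transcription and metadata can be flattened into a single text input before prompting a model. This is a minimal sketch under assumed conventions; the function name and formatting are hypothetical, not the repository's actual implementation.

```python
def build_fusion_input(transcription, metadata):
    """Concatenate a video's transcription and its metadata fields
    into one text block suitable for a single classification prompt."""
    meta_lines = [f"{key}: {value}" for key, value in metadata.items()]
    return (
        "TRANSCRIPTION:\n" + transcription
        + "\n\nMETADATA:\n" + "\n".join(meta_lines)
    )

example = build_fusion_input(
    "Drive the future today.",
    {"title": "New Car Ad", "tags": "car, ev", "channelTitle": "AutoChannel"},
)
```

Transcription-only and metadata-only runs would simply pass one of the two sources instead of both.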
Set the directories in the notebooks to point to the required CSV files. Note that the video IDs, transcriptions, and labels are combined into a single CSV.
(All notebook files are listed inside the folder.)
Final annotation guide used by human labelers.
Contains:
- 📚 Definitions for all primary & secondary labels
- 🖼️ Label examples
- ⚠️ Edge cases & annotation rules
Contains the base chain-of-thought (CoT) prompt used across all experiments.
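The released prompt file contains the actual wording; purely as an illustration of how a base CoT template can be filled per video, consider the following sketch (all prompt wording and names here are hypothetical):

```python
# Hypothetical base CoT template; the repository's real prompt wording
# lives in the released prompt file and will differ from this.
BASE_COT_PROMPT = (
    "You are classifying a video advertisement.\n"
    "Think step by step:\n"
    "1. Summarize what the ad is selling.\n"
    "2. Identify the most likely primary label.\n"
    "3. Identify a secondary label if one applies.\n"
    "Answer with: primary=<label>, secondary=<label>\n\n"
    "INPUT:\n{video_input}"
)

def render_prompt(video_input):
    """Fill the base CoT template with the per-video input text."""
    return BASE_COT_PROMPT.format(video_input=video_input)

prompt = render_prompt("title: New Car Ad")
```

Reusing one template across all experiments keeps the CoT instructions constant while only the per-video input varies.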
Additional supplementary files supporting the dataset and experiments.
- ✅ Multilingual transcriptions (native + translated)
- ✅ Unified metadata integration
- ✅ Pre-labeled dataset for replication and benchmarking
- ✅ Modular and customizable pipeline
- ✅ Transparent labeling methodology via codebook
- 📄 Load the dataset from ground_truth.csv
- 🧪 Use the Python notebooks to:
- Reproduce experiments
- Extend analyses
- Implement additional modeling pipelines
- 📘 Consult Updated Codebook.pdf for label semantics
For issues or questions, please reach out to any of the following email addresses: