A large-scale multilingual speech dataset covering 54 languages from 80 districts across India, aimed at developing AI-driven speech recognition (ASR), speech translation (SST), and natural language understanding (NLU) models.
VAANI is an India-representative, multi-modal, multi-lingual dataset. The current version (phase 1, 80 districts) contains ~16,000 hours of spontaneous, image-prompted speech (9.6 million utterances) from 84.6K speakers across 80 districts, talking about 130K images and covering 54 languages. Of this audio, 788.03 hours have been transcribed (text), spread almost evenly across the 80 districts. Project Vaani, by IISc, Bangalore and ARTPARK, is capturing the true diversity of… See the full description on the dataset page: https://huggingface.co/datasets/ARTPARK-IISc/Vaani.
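To put the headline figures in perspective, the short sketch below derives a few rough averages from the numbers quoted in the description. The constant names and the `derived_stats` helper are illustrative and not part of the dataset or any official tooling. The inputs are the rounded values stated above, so the results are approximations only.

```python
# Headline figures as quoted in the dataset description (rounded in the source).
TOTAL_HOURS = 16_000         # ~16,000 hours of audio
UTTERANCES = 9_600_000       # 9.6 million utterances
SPEAKERS = 84_600            # 84.6K speakers
DISTRICTS = 80               # phase 1 coverage
TRANSCRIBED_HOURS = 788.03   # hours with transcriptions


def derived_stats() -> dict[str, float]:
    """Return rough averages implied by the quoted figures (hypothetical helper)."""
    return {
        # Mean length of one utterance, in seconds.
        "avg_utterance_seconds": TOTAL_HOURS * 3600 / UTTERANCES,
        # Share of the audio that has been transcribed.
        "transcribed_fraction": TRANSCRIBED_HOURS / TOTAL_HOURS,
        # Mean number of speakers recorded per district.
        "avg_speakers_per_district": SPEAKERS / DISTRICTS,
        # Mean transcribed hours per district, assuming an even spread.
        "avg_transcribed_hours_per_district": TRANSCRIBED_HOURS / DISTRICTS,
    }


if __name__ == "__main__":
    for name, value in derived_stats().items():
        print(f"{name}: {value:.3f}")
```

On these rounded inputs, an utterance averages about 6 seconds and roughly 4.9% of the audio is transcribed, which works out to a little under 10 transcribed hours per district if the spread is even.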
CC-by-4.0