Text-to-speech models trained using FastPitch and a HiFi-GAN vocoder, separately for each language. Both female and male voices are supported.
This repository contains FastSpeech2 models for 16 Indian languages, with both male and female voices, built using Hybrid Segmentation (HS) for speech synthesis. The models generate mel-spectrograms from text input, which can then be used to synthesize speech. FastSpeech2 (FS2) is composed of 6 feed-forward Transformer blocks, each combining multi-head self-attention and 1D convolution, in both the phoneme encoder and the mel-spectrogram decoder.
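As a rough illustration of the feed-forward Transformer blocks described above, the following PyTorch sketch stacks 6 blocks, each applying multi-head self-attention followed by a 1D-convolutional feed-forward network. It is not the repository's code: the class names `FFTBlock` and `FFTStack` are hypothetical, and the hidden size, head count, convolution width, and kernel sizes are assumed values chosen only for the example.

```python
import torch
from torch import nn


class FFTBlock(nn.Module):
    """One feed-forward Transformer block: self-attention, then a 1D-conv network.

    All hyperparameter defaults are illustrative assumptions, not values
    taken from the released models.
    """

    def __init__(self, d_model=256, n_heads=2, d_conv=1024, kernel_size=9, dropout=0.1):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, dropout=dropout, batch_first=True)
        self.norm1 = nn.LayerNorm(d_model)
        # Conv1d expects (batch, channels, time); padding keeps the length unchanged.
        self.conv = nn.Sequential(
            nn.Conv1d(d_model, d_conv, kernel_size, padding=kernel_size // 2),
            nn.ReLU(),
            nn.Conv1d(d_conv, d_model, kernel_size=1),
        )
        self.norm2 = nn.LayerNorm(d_model)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x, padding_mask=None):
        # x: (batch, time, d_model); padding_mask: (batch, time), True at padded steps.
        attn_out, _ = self.attn(x, x, x, key_padding_mask=padding_mask, need_weights=False)
        x = self.norm1(x + self.dropout(attn_out))  # residual + layer norm
        conv_out = self.conv(x.transpose(1, 2)).transpose(1, 2)
        return self.norm2(x + self.dropout(conv_out))  # residual + layer norm


class FFTStack(nn.Module):
    """A stack of FFT blocks, usable as a phoneme encoder or a mel decoder body."""

    def __init__(self, n_blocks=6, **block_kwargs):
        super().__init__()
        self.blocks = nn.ModuleList([FFTBlock(**block_kwargs) for _ in range(n_blocks)])

    def forward(self, x, padding_mask=None):
        for block in self.blocks:
            x = block(x, padding_mask)
        return x
```

In this sketch the stack preserves the input shape, so a batch of phoneme embeddings of shape `(batch, time, 256)` comes out with the same shape. A full model would additionally need embeddings, positional encodings, a variance adaptor, and a projection to mel bins, none of which are shown here.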
MIT
IIT Madras
Text to Speech
open
AI4Bharat
Sector Agnostic
21/02/25 13:21:39
Nikhil Narasimhan