Speaker-Adaptive TTS

Meet the people working on it!

Wenbin Wang

Research Overview

Speaker-Adaptive Text-to-Speech (TTS) aims to synthesize natural-sounding speech that accurately reproduces the identity, timbre, and prosody of a specific target speaker, often from only a few seconds of reference audio (the zero-shot scenario).

My research addresses three critical challenges in this field: generalization, cross-lingual adaptation, and data scarcity.

Project Slides


Related Papers
