Cross-Lingual Adaptation for Text to Speech Synthesis (CLAS3)

Recent advances in statistical text-to-speech (TTS) synthesis have enabled voice personalisation via adaptation techniques normally associated with automatic speech recognition (ASR). Such techniques allow a synthesis voice to be matched to a target speaker using only a short sample of that speaker's voice. Cross-lingual adaptation has the potential to enable personalised translation services. However, whilst adaptation works within a given language, how to perform it across languages remains an open research issue. Research under the FP7 EMIME project at Idiap has demonstrated its feasibility. One current difficulty is how to separate speaker-specific characteristics from language-specific ones. In this project, we propose to use bilingual speakers to separate language and speaker characteristics.

Themes

Perceptive and Cognitive Systems

Leader Name

Funding

Start

Nov 01, 2011

Stop

Aug 31, 2012

Groups

Speech & Audio Processing
