Is it possible to train a Turkish text-to-speech model with English data?

ERGÜN, Engin; YILDIRIM, Tülay

doi:10.14744/rase.2022.0001

Engin ERGÜN ¹, Tülay YILDIRIM ²

¹TÜBİTAK BİLGEM, Kocaeli, Turkey
²Yıldız Technical University, Faculty of Electrical and Electronics Engineering, İstanbul, Turkey

Recent Advances in Science and Engineering 2022; 1(2): 1-5 DOI: 10.14744/rase.2022.0001

Abstract

Most natural language processing (NLP) studies require data in the target language. Some languages, such as Turkish, lack the data sources needed to train successful deep learning models. Tasks such as speech synthesis require dozens of hours of professionally recorded speech with accurate transcriptions. Creating or finding datasets for text-to-speech (TTS) studies can therefore be costly in both time and money. This study examines whether English acoustic data can be used to train a Turkish text-to-speech model and thereby overcome the data problem.

Keywords: Natural language processing, NLP, speech processing, speech synthesis, text-to-speech, TTS, artificial intelligence