Speech Synthesis
Speech Synthesis related modeling class
class pororo.tasks.speech_synthesis.PororoTtsFactory(task: str, lang: str = 'multi', model: Optional[str] = None)
Bases: pororo.tasks.utils.base.PororoFactoryBase
Synthesizes speech from text using a trained model.
The output audio's sample rate is 22050 Hz.
Multi (tacotron)
dataset: TBU
metric: TBU
- Parameters
- Returns
waveform of speech signal
- Return type
ndarray
Examples
>>> import IPython
>>> from IPython.display import Audio
>>> from pororo import Pororo
>>> model = Pororo(task="tts", lang="multi")
>>> # Typical TTS
>>> wave = model("how are you?", lang="en")
>>> IPython.display.display(IPython.display.Audio(data=wave, rate=22050))
>>> # Voice Style Transfer
>>> model = Pororo(task="tts", lang="multi")
>>> wave = model("저는 미국 사람이에요.", lang="ko", speaker="en")
>>> IPython.display.display(IPython.display.Audio(data=wave, rate=22050))
>>> # Code-Switching
>>> wave = model("저는 미국 사람이에요.", lang="ko", speaker="en-15,ko")
>>> IPython.display.Audio(data=wave, rate=22050)
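Outside a notebook, the returned waveform can be written to disk instead of played inline. The following is a minimal sketch using only the standard library, not part of Pororo itself: `save_waveform` is a hypothetical helper, and it assumes the waveform is a one-dimensional sequence of floats roughly within [-1.0, 1.0]. A synthetic sine tone stands in for the model output so the snippet runs without the library installed.

```python
import math
import os
import struct
import tempfile
import wave


def save_waveform(samples, path, sample_rate=22050):
    """Write mono float samples to a 16-bit PCM WAV file; return duration in seconds.

    Hypothetical helper (not a Pororo API). Assumes a 1-D sequence of floats
    roughly within [-1.0, 1.0]; values outside that range are clipped.
    """
    # Clip each sample, then scale to the signed 16-bit integer range.
    pcm = [int(max(-1.0, min(1.0, float(s))) * 32767) for s in samples]
    with wave.open(path, "wb") as f:
        f.setnchannels(1)            # mono
        f.setsampwidth(2)            # 2 bytes per sample = 16-bit PCM
        f.setframerate(sample_rate)  # 22050 Hz matches the documented output rate
        f.writeframes(struct.pack("<%dh" % len(pcm), *pcm))  # little-endian int16
    return len(pcm) / sample_rate


# Stand-in for the model output: one second of a 440 Hz tone at 22050 Hz.
tone = [0.5 * math.sin(2 * math.pi * 440 * n / 22050) for n in range(22050)]
demo_path = os.path.join(tempfile.gettempdir(), "tts_demo.wav")
duration = save_waveform(tone, demo_path)
```

With a real model, `tone` would be replaced by the array returned from `model(...)`, provided the amplitude assumption above holds for that output.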
Notes
Currently 11 languages are supported: English, Korean, Japanese, Chinese, Jejueo, Dutch, German, Spanish, French, Russian, and Finnish.
A speaker can be designated for this task, e.g. ko, en, or zh.