R: Text-to-Speech (Speech Synthesis)

tts {text2speech}

R Documentation

Text-to-Speech (Speech Synthesis)

Description

Convert text to speech using one of several engines: Amazon Polly, Coqui TTS, the Google Cloud Text-to-Speech API, or the Microsoft Cognitive Services Text to Speech REST API.

With the exception of Coqui TTS, all these engines are accessible as R packages:

aws.polly is a client for Amazon Polly.
googleLanguageR is a client for the Google Cloud Text-to-Speech API.
conrad is a client for the Microsoft Cognitive Services Text to Speech REST API.
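
Since each cloud engine depends on a separate client package, it can help to check which of them are installed before choosing a `service`. The following is a minimal sketch using only base R; the vector name `backends` and its labels are illustrative and not part of text2speech.

```r
# Map each cloud service to the client package named above.
backends <- c(
  amazon    = "aws.polly",
  google    = "googleLanguageR",
  microsoft = "conrad"
)

# requireNamespace() returns TRUE if the package can be loaded, FALSE otherwise.
installed <- vapply(backends, requireNamespace, logical(1), quietly = TRUE)

# Named logical vector: one entry per service.
print(installed)
```

Coqui TTS is not covered by this check because it is an external executable rather than an R package.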

Usage

tts(
  text,
  output_format = c("mp3", "wav"),
  service = c("amazon", "google", "microsoft", "coqui"),
  bind_audio = TRUE,
  ...
)

tts_amazon(
  text,
  output_format = c("mp3", "wav"),
  voice = "Joanna",
  bind_audio = TRUE,
  save_local = FALSE,
  save_local_dest = NULL,
  ...
)

tts_google(
  text,
  output_format = c("mp3", "wav"),
  voice = "en-US-Standard-C",
  bind_audio = TRUE,
  save_local = FALSE,
  save_local_dest = NULL,
  ...
)

tts_microsoft(
  text,
  output_format = c("mp3", "wav"),
  voice = NULL,
  bind_audio = TRUE,
  save_local = FALSE,
  save_local_dest = NULL,
  ...
)

tts_coqui(
  text,
  exec_path,
  output_format = c("wav", "mp3"),
  model_name = "tacotron2-DDC_ph",
  vocoder_name = "ljspeech/univnet",
  bind_audio = TRUE,
  save_local = FALSE,
  save_local_dest = NULL,
  ...
)

Arguments

`text`	A character vector of text to be spoken
`output_format`	Format of output files: "mp3" or "wav"
`service`	Service to use (Amazon, Google, Microsoft, or Coqui)
`bind_audio`	Should `tts_bind_wav()` be run after the audio has been created, to ensure that the length of `text` and the number of rows are consistent?
`...`	Additional arguments
`voice`	Full voice name
`save_local`	Should the audio file be saved locally?
`save_local_dest`	If the audio is saved locally, the destination where the output file will be saved
`exec_path`	System path to Coqui TTS executable
`model_name`	(Coqui TTS only) Deep learning model used for text-to-speech conversion
`vocoder_name`	(Coqui TTS only) Voice coder used for speech coding and transmission
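
In the Usage signatures, `output_format` and `service` list several choices as their default value. In R this conventionally means the argument is resolved with `match.arg()`, so omitting it selects the first choice. The sketch below shows that convention with a hypothetical stand-in function, `pick_format()`; that `tts()` resolves its arguments in exactly this way is an assumption, not something stated on this page.

```r
# Hypothetical stand-in that mirrors the default shown for tts().
pick_format <- function(output_format = c("mp3", "wav")) {
  # match.arg() returns the first choice when the argument is missing
  # and raises an error for values outside the listed choices.
  match.arg(output_format)
}

default_format  <- pick_format()       # argument omitted
explicit_format <- pick_format("wav")  # argument supplied

# An unsupported value fails instead of being passed through silently.
bad_format <- try(pick_format("ogg"), silent = TRUE)
```

Under the same assumption, `tts_coqui()` would default to "wav" rather than "mp3", because its choices are listed in the opposite order.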

Value

A standardized tibble featuring the following columns:

index : Sequential identifier number
original_text : The text input provided by the user
text : If original_text exceeds the character limit, the result of splitting original_text; otherwise identical to original_text.
wav : Wave object (S4 class)
file : File path to the audio file
audio_type : The audio format, either mp3 or wav
duration : The duration of the audio file
service : The text-to-speech engine used
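
To illustrate how `original_text` and `text` can differ, the sketch below splits a sentence into pieces that fit a character limit and arranges them in a data frame with those two columns. The helper `split_text()`, the limit of 20 characters, and the sample sentence are all hypothetical; this is not the package's own splitting routine, and the remaining columns of the returned tibble are omitted.

```r
# Hypothetical helper: greedily pack whole words into chunks of at most
# `limit` characters. A single word longer than `limit` is kept whole.
split_text <- function(x, limit) {
  words <- strsplit(x, " ", fixed = TRUE)[[1]]
  chunks <- character(0)
  current <- ""
  for (w in words) {
    candidate <- if (nzchar(current)) paste(current, w) else w
    if (nchar(candidate) <= limit || !nzchar(current)) {
      current <- candidate            # word still fits in the current chunk
    } else {
      chunks <- c(chunks, current)    # close the chunk and start a new one
      current <- w
    }
  }
  if (nzchar(current)) chunks <- c(chunks, current)
  chunks
}

original <- "The quick brown fox jumps over the lazy dog"
pieces <- split_text(original, limit = 20)

# One row per piece; original_text is repeated on every row.
result <- data.frame(
  original_text = original,
  text = pieces,
  stringsAsFactors = FALSE
)
print(result)
```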

Examples

## Not run: 
# Amazon Polly
tts("Hello world! This is Amazon Polly", service = "amazon")

# Coqui TTS
tts("Hello world! This is Coqui TTS", service = "coqui")

# Google Cloud Text-to-Speech
tts("Hello world! This is Google Cloud", service = "google")

# Microsoft Cognitive Services Text to Speech
tts("Hello world! This is Microsoft", service = "microsoft")

## End(Not run)

[Package text2speech version 1.0.0 Index]