tts {text2speech} | R Documentation |
Text-to-Speech (Speech Synthesis)
Description
Convert text to speech using various engines, including Amazon Polly, Coqui TTS, Google Cloud Text-to-Speech API, and Microsoft Cognitive Services Text to Speech REST API.
With the exception of Coqui TTS, all these engines are accessible as R packages:
- aws.polly is a client for Amazon Polly.
- googleLanguageR is a client for the Google Cloud Text-to-Speech API.
- conrad is a client for the Microsoft Cognitive Services Text to Speech REST API.
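Before choosing a service, it can help to check which of the client packages listed above are installed. The following is a minimal sketch using base R only; the names clients and available are illustrative and not part of text2speech.

```r
# Map each service to the client package named in the Description.
clients <- c(
  amazon    = "aws.polly",
  google    = "googleLanguageR",
  microsoft = "conrad"
)

# TRUE for each client package that is installed on this machine.
# requireNamespace() loads the namespace without attaching the package.
available <- vapply(clients, requireNamespace, logical(1), quietly = TRUE)

# Named logical vector, one entry per service.
print(available)
```

Coqui TTS is not covered by this check because it is reached through a system executable (see exec_path) rather than an R client package.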
Usage
tts(
  text,
  output_format = c("mp3", "wav"),
  service = c("amazon", "google", "microsoft", "coqui"),
  bind_audio = TRUE,
  ...
)
tts_amazon(
  text,
  output_format = c("mp3", "wav"),
  voice = "Joanna",
  bind_audio = TRUE,
  save_local = FALSE,
  save_local_dest = NULL,
  ...
)
tts_google(
  text,
  output_format = c("mp3", "wav"),
  voice = "en-US-Standard-C",
  bind_audio = TRUE,
  save_local = FALSE,
  save_local_dest = NULL,
  ...
)
tts_microsoft(
  text,
  output_format = c("mp3", "wav"),
  voice = NULL,
  bind_audio = TRUE,
  save_local = FALSE,
  save_local_dest = NULL,
  ...
)
tts_coqui(
  text,
  exec_path,
  output_format = c("wav", "mp3"),
  model_name = "tacotron2-DDC_ph",
  vocoder_name = "ljspeech/univnet",
  bind_audio = TRUE,
  save_local = FALSE,
  save_local_dest = NULL,
  ...
)
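Vector defaults such as output_format = c("mp3", "wav") follow a common R idiom in which the first element is used when the argument is omitted. The sketch below illustrates that idiom with base R's match.arg(). Whether tts() resolves its arguments in exactly this way is an assumption, and pick_format() is a hypothetical helper, not a text2speech function.

```r
# Hypothetical helper showing how a vector default is typically resolved.
pick_format <- function(output_format = c("mp3", "wav")) {
  # match.arg() returns the first choice when the argument is missing
  # and signals an error for values outside the allowed set.
  match.arg(output_format)
}

fmt_default  <- pick_format()        # argument omitted: first choice is used
fmt_explicit <- pick_format("wav")   # explicit, valid choice
```

Under this idiom, the signature of tts_coqui() would default to "wav" because its choices are listed in the opposite order.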
Arguments
text | A character vector of text to be spoken
output_format | Format of the output files: "mp3" or "wav"
service | Service to use (Amazon, Google, Microsoft, or Coqui)
bind_audio | Should the
... | Additional arguments
voice | Full voice name
save_local | Should the audio file be saved locally?
save_local_dest | If saving locally, the destination where the output file will be saved
exec_path | System path to the Coqui TTS executable
model_name | (Coqui TTS only) Deep learning model for text-to-speech conversion
vocoder_name | (Coqui TTS only) Voice coder used for speech coding and transmission
Value
A standardized tibble featuring the following columns:
- index: Sequential identifier number
- original_text: The text input provided by the user
- text: In case original_text exceeds the character limit, text represents the outcome of splitting original_text. Otherwise, text remains the same as original_text.
- wav: Wave object (S4 class)
- file: File path to the audio file
- audio_type: The audio format, either mp3 or wav
- duration: The duration of the audio file
- service: The text-to-speech engine used
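To make the column layout concrete, the sketch below builds a stand-in for the returned table with base R and runs two typical checks on it. All values, including the file names, are made up for illustration. A plain data.frame is used instead of a tibble, and the wav column is omitted because it holds S4 Wave objects.

```r
# Mock of the documented return structure (illustrative values only).
res <- data.frame(
  index         = 1:2,
  original_text = c("Hello world!", "Goodbye."),
  text          = c("Hello world!", "Goodbye."),
  file          = c("hello.mp3", "goodbye.mp3"),  # hypothetical paths
  audio_type    = "mp3",
  duration      = c(1.2, 0.8),
  service       = "amazon",
  stringsAsFactors = FALSE
)

# Total duration across all rows.
total_duration <- sum(res$duration)

# Rows where the input was split, i.e. text differs from original_text.
was_split <- res$text != res$original_text
```

Comparing text with original_text is one way to detect inputs that exceeded the character limit described above.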
Examples
## Not run:
# Amazon Polly
tts("Hello world! This is Amazon Polly", service = "amazon")
# Coqui TTS
tts("Hello world! This is Coqui TTS", service = "coqui")
# Google Cloud Text-to-Speech
tts("Hello world! This is Google Cloud", service = "google")
# Microsoft Cognitive Services Text to Speech
tts("Hello world! This is Microsoft", service = "microsoft")
## End(Not run)