audio_dataset_from_directory {keras3} | R Documentation |
Generates a tf.data.Dataset from audio files in a directory.
Description
If your directory structure is:
main_directory/
...class_a/
......a_audio_1.wav
......a_audio_2.wav
...class_b/
......b_audio_1.wav
......b_audio_2.wav
Then calling audio_dataset_from_directory(main_directory, labels = 'inferred')
will return a tf.data.Dataset that yields batches of audio files from
the subdirectories class_a and class_b, together with labels
0 and 1 (0 corresponding to class_a and 1 corresponding to class_b).
Only .wav files are supported at this time.
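The mapping from subdirectories to integer labels can be illustrated without TensorFlow. The following is a minimal base R sketch, not the function's actual implementation: it assumes class names are the immediate subdirectories taken in sorted order (consistent with class_a mapping to 0 and class_b to 1 above), and it uses empty placeholder files in a temporary directory rather than real audio.

```r
# Build a throwaway copy of the layout described above (empty placeholder files).
root <- file.path(tempfile("audio_demo_"), "main_directory")
for (cls in c("class_b", "class_a")) {
  dir.create(file.path(root, cls), recursive = TRUE)
  prefix <- sub("class_", "", cls)
  file.create(file.path(root, cls, paste0(prefix, "_audio_", 1:2, ".wav")))
}

# Assumption: class names are the immediate subdirectories, in sorted order.
class_names <- sort(list.dirs(root, full.names = FALSE, recursive = FALSE))

# Collect the .wav files and derive a zero-based label from the parent directory.
files <- list.files(root, pattern = "\\.wav$", recursive = TRUE)
labels <- match(dirname(files), class_names) - 1L

print(data.frame(file = files, label = labels))
```

Each file under class_a receives label 0 and each file under class_b receives label 1, regardless of the order in which the directories were created.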
Usage
audio_dataset_from_directory(
  directory,
  labels = "inferred",
  label_mode = "int",
  class_names = NULL,
  batch_size = 32L,
  sampling_rate = NULL,
  output_sequence_length = NULL,
  ragged = FALSE,
  shuffle = TRUE,
  seed = NULL,
  validation_split = NULL,
  subset = NULL,
  follow_links = FALSE,
  verbose = TRUE
)
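A call using this signature might look like the sketch below. It assumes keras3 (with a working TensorFlow backend) and reticulate are installed and that a folder named main_directory, laid out as in the Description, exists in the working directory. The batch_size, output_sequence_length, and seed values are illustrative choices, not recommendations. The call is guarded so the snippet does nothing when those assumptions do not hold.

```r
# Path from the Description; point it at a real folder of .wav files.
data_dir <- "main_directory"
ds <- NULL

# Only run when keras3 is installed and the folder exists.
if (requireNamespace("keras3", quietly = TRUE) && dir.exists(data_dir)) {
  ds <- keras3::audio_dataset_from_directory(
    data_dir,
    labels = "inferred",
    label_mode = "int",
    batch_size = 8L,                  # illustrative value
    output_sequence_length = 16000L,  # illustrative value
    seed = 1L                         # illustrative value
  )

  # Pull one batch: a list holding the audio tensor and the labels tensor.
  batch <- reticulate::iter_next(reticulate::as_iterator(ds))
  print(batch[[1]]$shape)  # (batch_size, sequence_length, num_channels)
  print(batch[[2]]$shape)  # (batch_size,)
}
```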
Arguments
directory |
Directory where the data is located.
If |
labels |
Either "inferred" (labels are generated from the directory
structure), |
label_mode |
String describing the encoding of
|
class_names |
Only valid if "labels" is |
batch_size |
Size of the batches of data. Default: 32. If |
sampling_rate |
Audio sampling rate (in samples per second). |
output_sequence_length |
Maximum length of an audio sequence. Audio files
longer than this will be truncated to |
ragged |
Whether to return a Ragged dataset (where each sequence has its
own length). Defaults to |
shuffle |
Whether to shuffle the data. Defaults to |
seed |
Optional random seed for shuffling and transformations. |
validation_split |
Optional float between 0 and 1, fraction of data to reserve for validation. |
subset |
Subset of the data to return. One of |
follow_links |
Whether to visit subdirectories pointed to by symlinks.
Defaults to |
verbose |
Whether to display information on the number of classes and the
number of files found. Defaults to |
Value
A tf.data.Dataset object.
If label_mode is NULL, it yields string tensors of shape (batch_size,), containing the contents of a batch of audio files.
Otherwise, it yields a tuple (audio, labels), where audio has shape (batch_size, sequence_length, num_channels) and labels follows the format described below.
Rules regarding labels format:
If label_mode is int, the labels are an int32 tensor of shape (batch_size,).
If label_mode is binary, the labels are a float32 tensor of 1s and 0s of shape (batch_size, 1).
If label_mode is categorical, the labels are a float32 tensor of shape (batch_size, num_classes), representing a one-hot encoding of the class index.
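The three label layouts can be mimicked with plain R objects. This is only an illustration of shapes and values under assumed inputs (the class_names and batch_classes vectors are made up for the example). The real function returns TensorFlow tensors with int32 or float32 dtypes, whereas the matrices below are ordinary R doubles.

```r
# Hypothetical batch of four files drawn from two classes.
class_names <- c("class_a", "class_b")
batch_classes <- c("class_b", "class_a", "class_a", "class_b")

# Zero-based class index for each file in the batch.
idx <- match(batch_classes, class_names) - 1L

# label_mode "int": one integer per file, shape (batch_size,).
labels_int <- idx

# label_mode "binary": a column of 1s and 0s, shape (batch_size, 1).
labels_binary <- matrix(as.numeric(idx), ncol = 1)

# label_mode "categorical": one-hot rows, shape (batch_size, num_classes).
labels_categorical <- diag(length(class_names))[idx + 1L, , drop = FALSE]

print(labels_int)
print(dim(labels_binary))
print(labels_categorical)
```

Note that the binary layout only makes sense with exactly two classes, while the categorical layout grows one column per class.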
See Also
Other dataset utils:
image_dataset_from_directory()
split_dataset()
text_dataset_from_directory()
timeseries_dataset_from_array()
Other utils:
clear_session()
config_disable_interactive_logging()
config_disable_traceback_filtering()
config_enable_interactive_logging()
config_enable_traceback_filtering()
config_is_interactive_logging_enabled()
config_is_traceback_filtering_enabled()
get_file()
get_source_inputs()
image_array_save()
image_dataset_from_directory()
image_from_array()
image_load()
image_smart_resize()
image_to_array()
layer_feature_space()
normalize()
pad_sequences()
set_random_seed()
split_dataset()
text_dataset_from_directory()
timeseries_dataset_from_array()
to_categorical()
zip_lists()