transform_inverse_mel_scale {torchaudio} | R Documentation |
Inverse Mel Scale
Description
Solve for a normal STFT from a mel frequency STFT, using a conversion matrix. This uses triangular filter banks.
Usage
transform_inverse_mel_scale(
n_stft,
n_mels = 128,
sample_rate = 16000,
f_min = 0,
f_max = NULL,
max_iter = 1e+05,
tolerance_loss = 1e-05,
tolerance_change = 1e-08,
...
)
Arguments
n_stft |
(int): Number of bins in STFT. See |
n_mels |
(int, optional): Number of mel filterbanks. (Default: 128) |
sample_rate |
(int, optional): Sample rate of audio signal. (Default: 16000) |
f_min |
(float, optional): Minimum frequency. (Default: 0) |
f_max |
(float or NULL, optional): Maximum frequency. (Default: NULL) |
max_iter |
(int, optional): Maximum number of optimization iterations. (Default: 1e+05) |
tolerance_loss |
(float, optional): Value of loss to stop optimization at. (Default: 1e-05) |
tolerance_change |
(float, optional): Difference in losses to stop optimization at. (Default: 1e-08) |
... |
(optional): Arguments passed to the SGD optimizer. The lr argument defaults to 0.1 if not specified. (Default: |
Details
forward param:
melspec (Tensor): A Mel frequency spectrogram of dimension (..., n_mels, time)
It minimizes the Euclidean norm between the input mel spectrogram and the product of the estimated spectrogram and the filter banks, using SGD.
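The optimization described above can be illustrated with a minimal base R sketch. It is not the torchaudio implementation: it assumes a small random non-negative matrix in place of the triangular mel filter banks, uses plain full-batch gradient descent instead of the SGD optimizer, clamps the estimate to non-negative values, and uses a smaller step size (0.01) chosen for this toy problem. The helper name invert_mel and all variable names are hypothetical.

```r
set.seed(1)
n_stft <- 6   # number of linear-frequency bins (toy size)
n_mels <- 4   # number of mel bins (toy size)
n_time <- 5   # number of time frames

# Assumption: a random non-negative matrix stands in for the
# triangular mel filter banks (shape n_stft x n_mels).
fb <- matrix(runif(n_stft * n_mels), n_stft, n_mels)

# Synthesize a mel spectrogram from a known linear spectrogram.
spec_true <- matrix(runif(n_stft * n_time), n_stft, n_time)
melspec <- t(fb) %*% spec_true          # n_mels x n_time

# Hypothetical helper: estimate a linear spectrogram from melspec by
# minimizing the squared Euclidean norm of (t(fb) %*% spec - melspec).
invert_mel <- function(melspec, fb, lr = 0.01, max_iter = 5000,
                       tolerance_loss = 1e-5, tolerance_change = 1e-8) {
  spec <- matrix(runif(nrow(fb) * ncol(melspec)), nrow(fb), ncol(melspec))
  prev_loss <- Inf
  first_loss <- NA_real_
  for (i in seq_len(max_iter)) {
    diff <- t(fb) %*% spec - melspec
    loss <- sum(diff^2)
    if (i == 1) first_loss <- loss
    # Stop when the loss is small enough or has stopped changing.
    if (loss < tolerance_loss || abs(prev_loss - loss) < tolerance_change) break
    grad <- 2 * fb %*% diff               # gradient of the loss w.r.t. spec
    spec <- pmax(spec - lr * grad, 0)     # descent step, clamped at zero
    prev_loss <- loss
  }
  final_loss <- sum((t(fb) %*% spec - melspec)^2)
  list(spec = spec, loss = final_loss, first_loss = first_loss, iterations = i)
}

result <- invert_mel(melspec, fb)
dim(result$spec)   # n_stft x n_time, matching the (..., freq, time) layout
```

Because n_mels is smaller than n_stft, the problem is underdetermined: the recovered spectrogram reproduces the mel spectrogram closely but need not equal spec_true.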
Value
Tensor: Linear scale spectrogram of size (..., freq, time)