invertSpectrogram {soundgen} | R Documentation |
Invert spectrogram
Description
Transforms a spectrogram into a time series with the inverse short-time Fourier transform (STFT). An ordinary spectrogram preserves only the magnitude (modulus) of the complex STFT; the phase is lost, and without it the original audio cannot be reconstructed exactly. Several algorithms therefore "guess" a phase that yields audio whose magnitude spectrogram is very similar to the target spectrogram. This is useful for filtering operations that modify the magnitude spectrogram and then apply the inverse STFT, such as filtering in the spectrotemporal modulation domain.
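The loss of phase can be illustrated on a single frame with base R alone. The following is a minimal sketch, not soundgen code: the toy signal and all variable names are made up for illustration. It shows that the magnitude spectrum together with the true phase recovers the frame, whereas the same magnitude with all phases set to zero produces a different waveform with an identical magnitude spectrum.

```r
# One toy "frame" of audio: two sinusoids with different phases
n = 64
x = sin(2 * pi * 5 * (0:(n - 1)) / n) +
  0.5 * cos(2 * pi * 12 * (0:(n - 1)) / n + 1)

z = fft(x)
mag = Mod(z)      # what a magnitude spectrogram keeps
phase = Arg(z)    # what it discards

# Inverse FFT with the true phase recovers the frame
x_full = Re(fft(mag * exp(1i * phase), inverse = TRUE)) / n

# Inverse FFT with all phases set to zero gives another waveform...
x_zero = Re(fft(mag, inverse = TRUE)) / n

max(abs(x - x_full))   # numerically zero
max(abs(x - x_zero))   # large: the waveform is different

# ...whose magnitude spectrum is nevertheless the same as the target
max(abs(Mod(fft(x_zero)) - mag))   # numerically zero
```

Many waveforms share one magnitude spectrum, which is why the phase has to be estimated rather than recovered.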
Usage
invertSpectrogram(
  spec,
  samplingRate,
  windowLength,
  overlap,
  step = NULL,
  wn = "hanning",
  specType = c("abs", "log", "dB")[1],
  initialPhase = c("zero", "random", "spsi")[3],
  nIter = 50,
  normalize = TRUE,
  play = TRUE,
  verbose = FALSE,
  plotError = TRUE
)
Arguments
spec |
the spectrogram to be transformed into a time series: numeric matrix with frequency bins in rows and time frames in columns |
samplingRate |
sampling rate of |
windowLength |
length of FFT window, ms |
overlap |
overlap between successive FFT frames, % |
step |
you can override |
wn |
window type accepted by |
specType |
the scale of the target spectrogram: 'abs' = absolute, 'log' = log-transformed, 'dB' = in decibels |
initialPhase |
initial phase estimate: "zero" = set all phases to zero; "random" = Gaussian noise; "spsi" (default) = single-pass spectrogram inversion (Beauregard et al., 2015) |
nIter |
the number of iterations of the GL algorithm (Griffin & Lim, 1984), 0 = don't run |
normalize |
if TRUE, normalizes the output to range from -1 to +1 |
play |
if TRUE, plays back the reconstructed audio |
verbose |
if TRUE, prints estimated time left every 10% of GL iterations |
plotError |
if TRUE, produces a scree plot of squared error over GL iterations (useful for choosing 'nIter') |
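As a worked example of how the time-related arguments fit together, the sketch below converts the window length (ms) and overlap (%) used in the Examples section into a step and into sample counts. It assumes the usual STFT convention that overlap is expressed as a percentage of the window length. The variable names ending in `_points` and the rounding are illustrative assumptions; the package may round differently.

```r
# Values taken from the Examples section of this page
samplingRate = 16000   # Hz
windowLength = 40      # ms
overlap = 75           # %

# Step implied by the overlap (assumed convention: overlap is a
# percentage of the window length)
step = windowLength * (1 - overlap / 100)   # 10 ms

# The same quantities in samples (rounding is illustrative)
windowLength_points = round(windowLength / 1000 * samplingRate)   # 640
step_points = round(step / 1000 * samplingRate)                   # 160

c(step_ms = step, window = windowLength_points, hop = step_points)
```

These numbers must match the settings used to create the spectrogram, otherwise the frames are overlapped and added at the wrong positions.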
Details
Algorithm: takes the spectrogram, makes an initial guess at the phase (zero, noise, or a more intelligent estimate by the SPSI algorithm), refines this guess over 'nIter' iterations of the GL algorithm, reconstructs the complex spectrogram from the best phase estimate, and performs inverse STFT. The single-pass spectrogram inversion (SPSI) algorithm is implemented as described in Beauregard et al. (2015), following the Python code at https://github.com/lonce/SPSI_Python. The Griffin-Lim (GL) algorithm is based on Griffin & Lim (1984).
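The Griffin-Lim iteration can be sketched in base R as follows. This is a simplified illustration, not the code used by soundgen. The helpers `stft`, `istft`, and `griffin_lim` are hypothetical; they work on the full two-sided FFT of each frame rather than on a one-sided spectrogram, start from zero phase instead of SPSI, and assume a signal length that fits a whole number of frames.

```r
# Hypothetical helpers for illustration only; not soundgen's internal code
stft = function(x, win, hop) {
  n = length(win)
  starts = seq(1, length(x) - n + 1, by = hop)
  # one column per frame, full (two-sided) complex spectrum in rows
  sapply(starts, function(s) fft(x[s:(s + n - 1)] * win))
}

istft = function(Z, win, hop, len) {
  # windowed overlap-add, normalized by the summed squared window
  n = length(win)
  out = numeric(len)
  wsum = numeric(len)
  for (j in seq_len(ncol(Z))) {
    idx = (j - 1) * hop + seq_len(n)
    frame = Re(fft(Z[, j], inverse = TRUE)) / n
    out[idx] = out[idx] + frame * win
    wsum[idx] = wsum[idx] + win^2
  }
  out / pmax(wsum, .Machine$double.eps)
}

griffin_lim = function(mag, win, hop, len, nIter = 50) {
  phase = matrix(0, nrow(mag), ncol(mag))   # initial guess: zero phase
  err = numeric(nIter)
  for (i in seq_len(nIter)) {
    x = istft(mag * exp(1i * phase), win, hop, len)  # impose target magnitude
    Z = stft(x, win, hop)                            # re-analyze the audio
    err[i] = sum((Mod(Z) - mag)^2)                   # squared magnitude error
    phase = Arg(Z)                                   # keep only the new phase
  }
  list(audio = istft(mag * exp(1i * phase), win, hop, len), error = err)
}

# Toy usage: two sinusoids, 256-point Hann window, hop of 64 points
samplingRate = 8000
time = seq(0, by = 1 / samplingRate, length.out = 2048)
x = sin(2 * pi * 440 * time) + 0.5 * sin(2 * pi * 1250 * time)
n = 256
hop = 64
win = 0.5 - 0.5 * cos(2 * pi * (0:(n - 1)) / n)
mag = Mod(stft(x, win, hop))
res = griffin_lim(mag, win, hop, len = length(x), nIter = 20)
# plot(res$error, type = 'b', xlab = 'Iteration', ylab = 'Squared error')
```

The `error` vector in this sketch plays the same role as the scree plot produced with 'plotError': once the curve flattens, further iterations bring little improvement.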
Value
Returns the reconstructed audio as a numeric vector.
References
Griffin, D., & Lim, J. (1984). Signal estimation from modified short-time Fourier transform. IEEE Transactions on Acoustics, Speech, and Signal Processing, 32(2), 236-243.
Beauregard, G. T., Harish, M., & Wyse, L. (2015, July). Single pass spectrogram inversion. In 2015 IEEE International Conference on Digital Signal Processing (DSP) (pp. 427-431). IEEE.
Examples
# Create a spectrogram
samplingRate = 16000
windowLength = 40
overlap = 75
wn = 'hanning'
s = soundgen(samplingRate = samplingRate, addSilence = 100)
spec = spectrogram(s, samplingRate = samplingRate,
  wn = wn, windowLength = windowLength, overlap = overlap,
  padWithSilence = FALSE, output = 'original')
# Invert the spectrogram, attempting to guess the phase
# Note that samplingRate, wn, windowLength, and overlap must be the same as
# in the original (i.e., you have to know how the spectrogram was created)
s_new = invertSpectrogram(spec, samplingRate = samplingRate,
  windowLength = windowLength, overlap = overlap, wn = wn,
  initialPhase = 'spsi', nIter = 10, specType = 'abs', play = FALSE)
## Not run:
# Verify the quality of audio reconstruction
# playme(s, samplingRate); playme(s_new, samplingRate)
spectrogram(s, samplingRate, osc = TRUE)
spectrogram(s_new, samplingRate, osc = TRUE)
## End(Not run)