R: Auditory spectrogram

audSpectrogram {soundgen}

R Documentation

Auditory spectrogram

Description

Produces an auditory spectrogram by passing the sound through a bank of bandpass filters (work in progress). While tuneR::audspec is based on FFT, here we convolve the sound with a bank of filters. The main difference is that the signal is not windowed, so temporal resolution de facto varies across frequency channels, as with a wavelet transform. The filters are currently third-order Butterworth bandpass filters implemented with `butter`.
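
The filter-bank idea can be sketched in a few lines with the signal package. This is a rough illustration and not soundgen's internal code: the test signal, the three center frequencies, the fixed half-bandwidth of 200 Hz, and the use of a rectified output as a crude envelope are all assumptions made for the example.

```r
library(signal)  # provides butter() and filter()

samplingRate <- 16000
t <- seq(0, 0.2, by = 1 / samplingRate)
# test signal: two sine waves at 1 kHz and 4 kHz
x <- sin(2 * pi * 1000 * t) + sin(2 * pi * 4000 * t)

centers <- c(1000, 2500, 4000)  # hypothetical channel centers, Hz
halfWidth <- 200                # hypothetical half-bandwidth, Hz
nyquist <- samplingRate / 2

# one column per channel: bandpass-filter the whole signal (no windowing),
# then rectify to get a crude amplitude envelope
envelopes <- sapply(centers, function(fc) {
  bf <- butter(3, c(fc - halfWidth, fc + halfWidth) / nyquist, type = "pass")
  abs(as.numeric(signal::filter(bf, x)))
})

# mean energy per channel: high at 1 and 4 kHz, low at 2.5 kHz
bandEnergy <- colMeans(envelopes ^ 2)
print(round(bandEnergy, 4))
```

Each channel keeps the full time axis of the input, which is why time resolution is set by the filters themselves and not by an analysis window.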

Usage

audSpectrogram(
  x,
  samplingRate = NULL,
  scale = NULL,
  from = NULL,
  to = NULL,
  step = 1,
  dynamicRange = 80,
  nFilters = 128,
  minFreq = 20,
  maxFreq = samplingRate/2,
  minBandwidth = 10,
  reportEvery = NULL,
  cores = 1,
  plot = TRUE,
  savePlots = NULL,
  osc = c("none", "linear", "dB")[2],
  heights = c(3, 1),
  ylim = NULL,
  yScale = c("bark", "mel", "ERB", "log")[1],
  contrast = 0.2,
  brightness = 0,
  maxPoints = c(1e+05, 5e+05),
  padWithSilence = TRUE,
  colorTheme = c("bw", "seewave", "heat.colors", "...")[1],
  col = NULL,
  extraContour = NULL,
  xlab = NULL,
  ylab = NULL,
  xaxp = NULL,
  mar = c(5.1, 4.1, 4.1, 2),
  main = NULL,
  grid = NULL,
  width = 900,
  height = 500,
  units = "px",
  res = NA,
  ...
)
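
Several defaults in the signature above are written as an indexed vector, which lists the allowed values and picks one of them. The short base R sketch below shows what these expressions evaluate to; the variable names are hypothetical, and the sampling rate of 16000 Hz is just an example value.

```r
# an indexed vector evaluates to a single element
oscDefault <- c("none", "linear", "dB")[2]
yScaleDefault <- c("bark", "mel", "ERB", "log")[1]
colorThemeDefault <- c("bw", "seewave", "heat.colors", "...")[1]

# maxFreq = samplingRate/2 is the Nyquist frequency of the input
samplingRate <- 16000  # example value
maxFreqDefault <- samplingRate / 2

print(c(oscDefault, yScaleDefault, colorThemeDefault))
print(maxFreqDefault)
```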

Arguments

`x`	path to a folder, one or more wav or mp3 files c('file1.wav', 'file2.mp3'), Wave object, numeric vector, or a list of Wave objects or numeric vectors
`samplingRate`	sampling rate of `x` (only needed if `x` is a numeric vector)
`scale`	maximum possible amplitude of input used for normalization of input vector (only needed if `x` is a numeric vector)
`from`, `to`	if NULL (default), analyzes the whole sound, otherwise from...to (s)
`step`	step, ms (determines time resolution). step = NULL means no downsampling at all (ncol of output = length of input audio)
`dynamicRange`	dynamic range, dB. Values more than dynamicRange dB below the maximum are treated as zero
`nFilters`	the number of filters (determines frequency resolution)
`minFreq`, `maxFreq`	the range of frequencies to analyze
`minBandwidth`	minimum filter bandwidth, Hz (otherwise filters may become too narrow when nFilters is high)
`reportEvery`	when processing multiple inputs, report estimated time left every ... iterations (NULL = default, NA = don't report)
`cores`	number of cores for parallel processing
`plot`	should a spectrogram be plotted? TRUE / FALSE
`savePlots`	full path to the folder in which to save the plots (NULL = don't save, '' = same folder as audio)
`osc`	"none" = no oscillogram; "linear" = on the original scale; "dB" = in decibels
`heights`	a vector of length two specifying the relative height of the spectrogram and the oscillogram (including time axes labels)
`ylim`	frequency range to plot, kHz (defaults to 0 to Nyquist frequency). NB: still in kHz, even if yScale = bark, mel, or ERB
`yScale`	scale of the frequency axis: 'linear' = linear, 'log' = logarithmic (musical), 'bark' = bark with `hz2bark`, 'mel' = mel with `hz2mel`, 'ERB' = Equivalent Rectangular Bandwidths with `HzToERB`
`contrast`	spectrum is exponentiated by contrast (any real number, recommended -1 to +1). Contrast >0 increases sharpness, <0 decreases sharpness
`brightness`	how much to "lighten" the image (>0 = lighter, <0 = darker)
`maxPoints`	the maximum number of "pixels" in the oscillogram (if any) and spectrogram; good for quickly plotting long audio files; defaults to c(1e5, 5e5)
`padWithSilence`	if TRUE, pads the sound with just enough silence to resolve the edges properly (only the original region is plotted, so the apparent duration doesn't change)
`colorTheme`	black and white ('bw'), as in seewave package ('seewave'), or any palette from `palette` such as 'heat.colors', 'cm.colors', etc
`col`	actual colors, eg rev(rainbow(100)) - see ?hcl.colors for colors in base R (overrides colorTheme)
`extraContour`	a vector of arbitrary length scaled in Hz (regardless of yScale!) that will be plotted over the spectrogram (eg pitch contour); can also be a list with extra graphical parameters such as lwd, col, etc. (see examples)
`xlab`, `ylab`, `main`, `mar`, `xaxp`	graphical parameters for plotting
`grid`	if numeric, adds n = `grid` dotted lines per kHz
`width`, `height`, `units`, `res`	graphical parameters for saving plots passed to `png`
`...`	other graphical parameters
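
The effect of `dynamicRange` can be illustrated on a toy matrix in base R. This is a minimal sketch of the thresholding described above, not the package's actual implementation; in particular, it assumes the values are amplitudes, so decibels are computed as 20 * log10 of the ratio to the maximum.

```r
dynamicRange <- 80  # dB

# toy "spectrogram": 2 frequency channels x 3 time frames (made-up amplitudes)
spec <- matrix(c(1, 0.1, 1e-3, 1e-5, 0.5, 1e-6), nrow = 2)

# level of each cell relative to the maximum, in dB (assumes amplitude units)
dB <- 20 * log10(spec / max(spec))

# cells more than dynamicRange dB below the maximum are set to zero
clipped <- spec
clipped[dB < -dynamicRange] <- 0
print(clipped)
```

Here the cells at -100 dB and -120 dB are zeroed, while the cell at -60 dB is kept. A smaller `dynamicRange` removes more of the quiet background.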

Examples

# synthesize a sound with gradually increasing hissing noise
sound = soundgen(sylLen = 200, temperature = 0.001,
  noise = list(time = c(0, 350), value = c(-40, 0)),
  formantsNoise = list(f1 = list(freq = 5000, width = 10000)),
  addSilence = 25)
# playme(sound, samplingRate = 16000)

# auditory spectrogram
as = audSpectrogram(sound, samplingRate = 16000, nFilters = 48)
dim(as$audSpec)

# compare to FFT-based spectrogram with similar time and frequency resolution
fs = spectrogram(sound, samplingRate = 16000, yScale = 'bark',
                 windowLength = 5, step = 1)
dim(fs)

## Not run: 
# add bells and whistles
audSpectrogram(sound, samplingRate = 16000,
  yScale = 'ERB',
  osc = 'dB',  # plot oscillogram in dB
  heights = c(2, 1),  # spectro/osc height ratio
  brightness = -.1,  # reduce brightness
  # colorTheme = 'heat.colors',  # pick color theme...
  col = hcl.colors(30, palette = 'Plasma'),  # ...or specify the colors
  cex.lab = .75, cex.axis = .75,  # text size and other base graphics pars
  grid = 5,  # to customize, add manually with graphics::grid()
  ylim = c(0.1, 5),  # always in kHz
  main = 'My auditory spectrogram' # title
  # + axis labels, etc
)

# change dynamic range
audSpectrogram(sound, samplingRate = 16000, dynamicRange = 40)
audSpectrogram(sound, samplingRate = 16000, dynamicRange = 120)

# remove the oscillogram
audSpectrogram(sound, samplingRate = 16000, osc = 'none')

# save auditory spectrograms of all audio files in a folder
audSpectrogram('~/Downloads/temp',
  savePlots = '~/Downloads/temp/audSpec', cores = 4)

## End(Not run)

[Package soundgen version 2.6.3 Index]