R: Shift pitch

shiftPitch {soundgen}

R Documentation

Shift pitch

Description

Raises or lowers pitch with or without also shifting the formants (resonance frequencies) and performing a time-stretch. The three operations (pitch shift, formant shift, and time stretch) are independent and can be performed in any combination, statically or dynamically. shiftPitch can also be used to shift formants without changing pitch or duration, but the dedicated shiftFormants is faster for that task.

Usage

shiftPitch(
  x,
  multPitch = 1,
  multFormants = multPitch,
  timeStretch = 1,
  samplingRate = NULL,
  freqWindow = NULL,
  dynamicRange = 80,
  windowLength = 40,
  step = 2,
  overlap = NULL,
  wn = "gaussian",
  interpol = c("approx", "spline")[1],
  propagation = c("time", "adaptive")[1],
  preserveEnv = NULL,
  transplantEnv_pars = list(windowLength = 10),
  normalize = c("max", "orig", "none")[2],
  play = FALSE,
  saveAudio = NULL,
  reportEvery = NULL,
  cores = 1
)

Arguments

`x`	path to a folder, one or more wav or mp3 files c('file1.wav', 'file2.mp3'), Wave object, numeric vector, or a list of Wave objects or numeric vectors
`multPitch`	1 = no change, >1 = raise pitch (eg 1.1 = 10% up, 2 = one octave up), <1 = lower pitch. Anchor format accepted for multPitch / multFormant / timeStretch (see `soundgen`)
`multFormants`	1 = no change, >1 = raise formants (eg 1.1 = 10% up, 2 = one octave up), <1 = lower formants
`timeStretch`	1 = no change, >1 = longer, <1 = shorter
`samplingRate`	sampling rate of `x` (only needed if `x` is a numeric vector)
`freqWindow`	the width of spectral smoothing window, Hz. Defaults to detected f0 prior to pitch shifting - see `shiftFormants` for discussion and examples
`dynamicRange`	dynamic range, dB. All values more than one dynamicRange under maximum are treated as zero
`windowLength`	length of FFT window, ms
`step`	you can override `overlap` by specifying FFT step, ms (NB: because digital audio is sampled at discrete time intervals of 1/samplingRate, the actual step and thus the time stamps of STFT frames may be slightly different, eg 24.98866 instead of 25.0 ms)
`overlap`	overlap between successive FFT frames, %
`wn`	window type accepted by `ftwindow`, currently gaussian, hanning, hamming, bartlett, rectangular, blackman, flattop
`interpol`	the method for interpolating scaled spectra and anchors
`propagation`	the method for propagating phase: "time" = horizontal propagation (default), "adaptive" = an experimental implementation of "vocoder done right" (Prusa & Holighaus 2017)
`preserveEnv`	if TRUE, transplants the amplitude envelope from the original to the modified sound with `transplantEnv`. Defaults to TRUE if no time stretching is performed and FALSE otherwise
`transplantEnv_pars`	a list of parameters passed on to `transplantEnv` if `preserveEnv = TRUE`
`normalize`	"orig" = same as input (default), "max" = maximum possible peak amplitude, "none" = no normalization
`play`	if TRUE, plays the synthesized sound using the default player on your system. If character, passed to `play` as the name of player to use, eg "aplay", "play", "vlc", etc. In case of errors, try setting another default player for `play`
`saveAudio`	full path to the folder in which to save audio files (one per detected syllable)
`reportEvery`	when processing multiple inputs, report estimated time left every ... iterations (NULL = default, NA = don't report)
`cores`	number of cores for parallel processing

Details

Algorithm: phase vocoder. Pitch shifting is accomplished by performing a time stretch (at present, with horizontal or adaptive phase propagation) followed by resampling. This shifts both pitch and formants; to preserve the original formant frequencies or modify them independently of pitch, a variant of link{transplantFormants} is performed to "transplant" the original or scaled formants onto the time-stretched new sound.

Examples

s = soundgen(sylLen = 200, ampl = c(0,-10),
             pitch = c(250, 350), rolloff = c(-9, -15),
             noise = -40,
             formants = 'aii', addSilence = 50)
# playme(s)
s1 = shiftPitch(s, samplingRate = 16000, freqWindow = 400,
                multPitch = 1.25, multFormants = .8)
# playme(s1)

## Not run: 
## Dynamic manipulations
# Add a chevron-shaped pitch contour
s2 = shiftPitch(s, samplingRate = 16000, multPitch = c(1.1, 1.3, .8))
playme(s2)

# Time-stretch only the middle
s3 = shiftPitch(s, samplingRate = 16000, timeStretch = list(
  time = c(0, .25, .31, .5, .55, 1),
  value = c(1, 1, 3, 3, 1, 1))
)
playme(s3)


## Various combinations of 3 manipulations
data(sheep, package = 'seewave')  # import a recording from seewave
playme(sheep)
spectrogram(sheep)

# Raise pitch and formants by 3 semitones, shorten by half
sheep1 = shiftPitch(sheep, multPitch = 2 ^ (3 / 12), timeStretch = 0.5)
playme(sheep1, sheep@samp.rate)
spectrogram(sheep1, sheep@samp.rate)

# Just shorten
shiftPitch(sheep, multPitch = 1, timeStretch = 0.25, play = TRUE)

# Raise pitch preserving formants
sheep2 = shiftPitch(sheep, multPitch = 1.2, multFormants = 1, freqWindow = 150)
playme(sheep2, sheep@samp.rate)
spectrogram(sheep2, sheep@samp.rate)

## End(Not run)