shiftPitch {soundgen}R Documentation

Shift pitch

Description

Raises or lowers pitch with or without also shifting the formants (resonance frequencies) and performing a time-stretch. The three operations (pitch shift, formant shift, and time stretch) are independent and can be performed in any combination, statically or dynamically. shiftPitch can also be used to shift formants without changing pitch or duration, but the dedicated shiftFormants is faster for that task.

Usage

shiftPitch(
  x,
  multPitch = 1,
  multFormants = multPitch,
  timeStretch = 1,
  samplingRate = NULL,
  freqWindow = NULL,
  dynamicRange = 80,
  windowLength = 40,
  step = 2,
  overlap = NULL,
  wn = "gaussian",
  interpol = c("approx", "spline")[1],
  propagation = c("time", "adaptive")[1],
  preserveEnv = NULL,
  transplantEnv_pars = list(windowLength = 10),
  normalize = c("max", "orig", "none")[2],
  play = FALSE,
  saveAudio = NULL,
  reportEvery = NULL,
  cores = 1
)

Arguments

x

path to a folder, one or more wav or mp3 files c('file1.wav', 'file2.mp3'), Wave object, numeric vector, or a list of Wave objects or numeric vectors

multPitch

1 = no change, >1 = raise pitch (eg 1.1 = 10% up, 2 = one octave up), <1 = lower pitch. Anchor format accepted for multPitch / multFormant / timeStretch (see soundgen)

multFormants

1 = no change, >1 = raise formants (eg 1.1 = 10% up, 2 = one octave up), <1 = lower formants

timeStretch

1 = no change, >1 = longer, <1 = shorter

samplingRate

sampling rate of x (only needed if x is a numeric vector)

freqWindow

the width of spectral smoothing window, Hz. Defaults to detected f0 prior to pitch shifting - see shiftFormants for discussion and examples

dynamicRange

dynamic range, dB. All values more than one dynamicRange under maximum are treated as zero

windowLength

length of FFT window, ms

step

you can override overlap by specifying FFT step, ms (NB: because digital audio is sampled at discrete time intervals of 1/samplingRate, the actual step and thus the time stamps of STFT frames may be slightly different, eg 24.98866 instead of 25.0 ms)

overlap

overlap between successive FFT frames, %

wn

window type accepted by ftwindow, currently gaussian, hanning, hamming, bartlett, rectangular, blackman, flattop

interpol

the method for interpolating scaled spectra and anchors

propagation

the method for propagating phase: "time" = horizontal propagation (default), "adaptive" = an experimental implementation of "vocoder done right" (Prusa & Holighaus 2017)

preserveEnv

if TRUE, transplants the amplitude envelope from the original to the modified sound with transplantEnv. Defaults to TRUE if no time stretching is performed and FALSE otherwise

transplantEnv_pars

a list of parameters passed on to transplantEnv if preserveEnv = TRUE

normalize

"orig" = same as input (default), "max" = maximum possible peak amplitude, "none" = no normalization

play

if TRUE, plays the synthesized sound using the default player on your system. If character, passed to play as the name of player to use, eg "aplay", "play", "vlc", etc. In case of errors, try setting another default player for play

saveAudio

full path to the folder in which to save audio files (one per detected syllable)

reportEvery

when processing multiple inputs, report estimated time left every ... iterations (NULL = default, NA = don't report)

cores

number of cores for parallel processing

Details

Algorithm: phase vocoder. Pitch shifting is accomplished by performing a time stretch (at present, with horizontal or adaptive phase propagation) followed by resampling. This shifts both pitch and formants; to preserve the original formant frequencies or modify them independently of pitch, a variant of link{transplantFormants} is performed to "transplant" the original or scaled formants onto the time-stretched new sound.

See Also

shiftFormants transplantFormants

Examples

s = soundgen(sylLen = 200, ampl = c(0,-10),
             pitch = c(250, 350), rolloff = c(-9, -15),
             noise = -40,
             formants = 'aii', addSilence = 50)
# playme(s)
s1 = shiftPitch(s, samplingRate = 16000, freqWindow = 400,
                multPitch = 1.25, multFormants = .8)
# playme(s1)

## Not run: 
## Dynamic manipulations
# Add a chevron-shaped pitch contour
s2 = shiftPitch(s, samplingRate = 16000, multPitch = c(1.1, 1.3, .8))
playme(s2)

# Time-stretch only the middle
s3 = shiftPitch(s, samplingRate = 16000, timeStretch = list(
  time = c(0, .25, .31, .5, .55, 1),
  value = c(1, 1, 3, 3, 1, 1))
)
playme(s3)


## Various combinations of 3 manipulations
data(sheep, package = 'seewave')  # import a recording from seewave
playme(sheep)
spectrogram(sheep)

# Raise pitch and formants by 3 semitones, shorten by half
sheep1 = shiftPitch(sheep, multPitch = 2 ^ (3 / 12), timeStretch = 0.5)
playme(sheep1, sheep@samp.rate)
spectrogram(sheep1, sheep@samp.rate)

# Just shorten
shiftPitch(sheep, multPitch = 1, timeStretch = 0.25, play = TRUE)

# Raise pitch preserving formants
sheep2 = shiftPitch(sheep, multPitch = 1.2, multFormants = 1, freqWindow = 150)
playme(sheep2, sheep@samp.rate)
spectrogram(sheep2, sheep@samp.rate)

## End(Not run)

[Package soundgen version 2.6.3 Index]