CorpusCoder {AcousticNDLCodeR}    R Documentation

Codes a corpus for use with NDL, given a vector of wave file names and a vector of TextGrid names

Description

Codes a corpus for use with NDL, given a vector of wave file names and a vector of TextGrid names.

Usage

CorpusCoder(Waves, Annotations, AnnotationType = c("TextGrid", "ESPS"),
  TierName = NULL, Dismiss = NULL, Encoding, Fast = F, Cores = 1,
  IntensitySteps, Smooth)

Arguments

Waves

Vector with the names of the wave files (including the full path if they are not in the working directory).

Annotations

Vector with the names of the annotation files, e.g. TextGrids (including the full path if they are not in the working directory).

AnnotationType

Type of annotation files. Supported formats are Praat TextGrids (set to "TextGrid") and ESPS/Wavesurfer files (set to "ESPS").

TierName

Name of the tier in the TextGrid to be used.

Dismiss

Regular expression matching Outcomes that should be removed. Uses grep. E.g. "<|>" would remove <noise>, <xxx>, etc. Default is NULL.

Encoding

Encoding of the annotation files. It is assumed that all annotation files have the same encoding.

Fast

Switches between a fast and a robust TextGrid parser. With the fast parser, the transcription must not contain "\n" or "\t". Default is FALSE.

Cores

Number of cores that the function may use. Default is 1.

IntensitySteps

Number of steps that the intensity gets compressed to. Default is 5.

Smooth

A parameter for the kernel smooth function provided by the package zoo.
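
As a rough illustration of the Dismiss argument described above, the following base R sketch shows which labels the pattern "<|>" matches. The outcome labels are hypothetical, and how CorpusCoder applies the pattern internally is not shown here; this only demonstrates the regular expression itself.

```r
# Hypothetical outcome labels, made up for illustration only
outcomes <- c("hello", "<noise>", "world", "<xxx>")

# The pattern from the Dismiss description: matches any label containing "<" or ">"
dismiss <- "<|>"

# grepl() returns TRUE for every label that matches the pattern;
# negating it keeps only the labels that would not be dismissed
kept <- outcomes[!grepl(dismiss, outcomes)]
print(kept)
```

Checking a pattern this way on a handful of labels from your own annotations can help avoid dismissing more than intended.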

Value

A data.frame with $Cues and $Outcomes for use with ndl or ndl2.

Author(s)

Denis Arnold

Examples

       ## Not run: 
       # Assuming the corpus contains wave files and Praat TextGrids
           
         setwd("~/Data/MyCorpus") # assuming everything is in one place
           
         # Assuming you have one wav for each annotation
           
         Waves=list.files(pattern="\\.wav$",recursive=TRUE)
         Annotations=list.files(pattern="\\.TextGrid$",recursive=TRUE) # see above
           
         # Let's assume the annotation is in UTF-8 and you want everything from a tier called words
         # Let's assume that you want to dismiss everything in <|>
         # Let's assume that you have 4 cores available
         # Let's assume that you want the default settings for the parameters
           
         Data=CorpusCoder(Waves, Annotations, AnnotationType = "TextGrid",
         TierName = "words", Dismiss = "<|>", Encoding = "UTF-8", Fast = FALSE, Cores = 4, 
         IntensitySteps = 5, Smooth = 800)
         
       
## End(Not run)
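
The example assumes one wave file for each annotation. A minimal sketch of a sanity check before calling CorpusCoder might look like the following. The file names are hypothetical, and the idea that the two vectors should line up by position and base name is an assumption drawn from the example, not something this page states.

```r
# Hypothetical file lists, shaped like the output of list.files()
Waves <- c("a/rec1.wav", "a/rec2.wav")
Annotations <- c("a/rec1.TextGrid", "a/rec2.TextGrid")

# Strip the directory and the file extension to get comparable stems
stems <- function(x) tools::file_path_sans_ext(basename(x))

# TRUE only if both vectors have the same length and matching stems in the same order
paired <- length(Waves) == length(Annotations) &&
  all(stems(Waves) == stems(Annotations))
print(paired)
```

If the check fails, sorting both vectors or inspecting the stems that differ is a reasonable first step.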

[Package AcousticNDLCodeR version 1.0.2 Index]