CreateALC {PPRL}    R Documentation

Anonymous Linkage Codes (ALCs)

Description

Creates ALCs from clear-text data by computing Soundex codes for first and last names and concatenating them with all other identifiers. The resulting string is then hashed using SHA-2, keyed with the supplied password (HMAC). The user decides which columns Soundex is applied to.
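
The construction described above can be sketched in base R. This is only an illustration under stated assumptions: `soundex_sketch` and `build_alc_input` are hypothetical helpers, not part of PPRL; the sketch uses standard American Soundex and plain concatenation in column order, and the package's internal Soundex variant, separator, and column ordering may differ. The final keyed SHA-2 step is left out because base R has no HMAC function.

```r
# Hypothetical helper: standard American Soundex (not PPRL's internal code).
soundex_sketch <- function(x) {
  one <- function(s) {
    s <- gsub("[^A-Z]", "", toupper(s))           # keep letters A-Z only
    if (is.na(s) || nchar(s) == 0) return(NA_character_)
    # Map letters to digits; vowels and Y -> "0", H and W -> "9"
    codes <- strsplit(chartr("ABCDEFGHIJKLMNOPQRSTUVWXYZ",
                             "01230129022455012623019202", s), "")[[1]]
    keep <- codes != "9"                          # H/W do not separate duplicates
    keep[1] <- TRUE                               # but always keep the first letter
    codes <- rle(codes[keep])$values              # collapse adjacent equal codes
    digits <- codes[-1]                           # first letter is kept as a letter
    digits <- digits[digits != "0"]               # drop vowels
    padded <- paste0(paste(digits, collapse = ""), "000")
    paste0(substr(s, 1, 1), substr(padded, 1, 3))
  }
  vapply(as.character(x), one, character(1), USE.NAMES = FALSE)
}

# Hypothetical helper: build the string that would be hashed afterwards.
build_alc_input <- function(data, soundex) {
  stopifnot(length(soundex) == ncol(data))
  cols <- lapply(seq_along(soundex), function(j) {
    col <- as.character(data[[j]])
    if (soundex[j] == 1) soundex_sketch(col) else col
  })
  do.call(paste0, cols)                           # element-wise concatenation
}

# Made-up records for illustration
df <- data.frame(first = c("Robert", "Rupert"),
                 last  = c("Ashcraft", "Ashcroft"),
                 birth = c("19700101", "19700101"),
                 stringsAsFactors = FALSE)
alc_input <- build_alc_input(df, soundex = c(1, 1, 0))
```

In this sketch both made-up records produce the same pre-hash string, which shows why Soundex makes the code tolerant of small spelling differences in names.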

Usage

CreateALC(ID, data, soundex, password)

Arguments

ID

A character vector or integer vector containing the IDs of the data.frame.

data

a data.frame containing the data to be encoded.

soundex

a binary vector with one element for each input column, indicating whether Soundex is applied to that column: 1 = Soundex is used, 0 = Soundex is not used. Its length must equal the number of columns of the data.frame.
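
One way to avoid a length mismatch is to derive the flag vector from the column names instead of typing it by hand. The sketch below is only an illustration: the data.frame `df` and the vector `soundex_cols` are hypothetical, with column names chosen to mirror the package example.

```r
# Hypothetical input: four identifier columns, the last two holding names
df <- data.frame(V2 = "1970", V3 = "m", V7 = "Peter", V8 = "Smith",
                 stringsAsFactors = FALSE)

# Columns that should be Soundex-encoded (an assumption for this sketch)
soundex_cols <- c("V7", "V8")

# 1 where the column is listed, 0 otherwise; always one flag per column
flags <- as.numeric(names(df) %in% soundex_cols)

# Guard against a mismatch before passing the vector on
stopifnot(length(flags) == ncol(df))
```

The resulting `flags` vector could then be passed as the `soundex` argument together with `data = df`.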

password

a string used as a password for the HMAC.

Value

A data.frame containing IDs and the corresponding Anonymous Linkage Codes.

Source

Herzog, T. N., Scheuren, F. J., Winkler, W. E. (2007): Data Quality and Record Linkage Techniques. Springer.

See Also

Create581, StandardizeString

Examples

# Load test data
testFile <- file.path(path.package("PPRL"), "extdata/testdata.csv")
testData <- read.csv(testFile, header = FALSE, sep = "\t",
  colClasses = "character")

# Encrypt data, use Soundex for names
res <- CreateALC(ID = testData$V1,
  data = testData[, c(2, 3, 7, 8)],
  soundex = c(0, 0, 1, 1),
  password = "$6Uh*-Z204q")

[Package PPRL version 0.3.8 Index]