preprocLinkage {PreProcessRecordLinkage} | R Documentation |
Record Linkage with Data Preprocessing
Description
This function performs record linkage together with data preprocessing. It is designed to work across a wide range of datasets by standardizing variable names using synonyms, which makes it easier to integrate and analyze data from different sources.
Usage
preprocLinkage(d1,d2,chz="NULL",var=c("age","sex"),threshold=0.9)
Arguments
d1 |
A data frame. |
d2 |
A data frame. |
chz |
The number of the name of the variable that the user does not want to change, based on the output of the |
var |
A vector of the names of the blocking variables that the user chooses, based on the output of the |
threshold |
A numeric value between 0 and 1. |
Details
The results are written to .csv files. If the number of records exceeds one million, they are written to rdata files instead.
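The size-based choice of output format described above could be sketched as follows. This is only an illustration under stated assumptions, not the package's internal code: `save_result`, its `limit` argument, and the file stem `linked` are hypothetical names, and the one-million cutoff is taken from the sentence above.

```r
# Hypothetical helper (not part of the package) illustrating the
# size-based choice between csv and rdata output.
save_result <- function(df, stem, limit = 1e6) {
  if (nrow(df) > limit) {
    path <- paste0(stem, ".rdata")
    save(df, file = path)                    # binary R format for large results
  } else {
    path <- paste0(stem, ".csv")
    write.csv(df, path, row.names = FALSE)   # plain text for smaller results
  }
  path
}

# A tiny made-up data frame, written to the session's temporary directory.
small <- data.frame(id = 1:3, by = c(1980, 1975, 1990))
out <- save_result(small, file.path(tempdir(), "linked"))
```

With only three rows the helper takes the csv branch; lowering `limit` below the row count would exercise the rdata branch.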
Value
Two csv files or two rdata files.
Note
To view the results in the created files, first load the data.table package.
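One way to follow this note is sketched below. The note does not say which data.table function to use, so `fread` is an assumption, and because this page does not state the names of the output files, the sketch writes a small stand-in csv file (`example_result.csv`, a hypothetical name) and reads that back.

```r
library(data.table)

# Stand-in for a result file; the real output file names are not given here.
path <- file.path(tempdir(), "example_result.csv")
write.csv(data.frame(id = 1:2, by = c(1980, 1975)), path, row.names = FALSE)

# fread() reads the csv file and returns a data.table.
res <- fread(path)
head(res)

# A result saved in rdata format would instead be restored with base R's
# load(), which returns the names of the objects it loaded.
```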
Author(s)
Hossein Hassani and Leila Marvian Mashhad.
See Also
Examples
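library(RecordLinkage)  # provides the RLdata500 and RLdata10000 example data
data(RLdata500); data(RLdata10000)
d1 = RLdata500
d2 = RLdata10000
# Block on the birth-year column "by"
preprocLinkage(d1, d2, var = "by")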