which_are_bijection {dataPreparation} R Documentation

## Identify bijections

### Description

Find all the columns that are bijections of another column.

### Usage

which_are_bijection(data_set, keep_cols = NULL, verbose = TRUE)


### Arguments

| Argument | Description |
|---|---|
| data_set | Matrix, data.frame or data.table |
| keep_cols | List of columns not to drop (list of character, default to NULL) |
| verbose | Should the algorithm talk (logical, default to TRUE) |

### Details

Two columns are bijections of each other when they contain exactly the same information, possibly coded differently. For example, col1: Men/Women and col2: M/W.
This function searches by looking at every pair of columns. It computes the number of unique elements in each column and the number of unique tuples of values.
The computation uses an exponential search, which makes the function faster.
If verbose is TRUE, the column logged will be the one returned.
Ex: if column i and column j (with j > i) are bijections, it will return j, except if j is a character column, in which case it returns i.
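
The unique-count comparison described above can be illustrated with a minimal base R sketch. It assumes that two columns are bijections when the number of distinct values in each column equals the number of distinct pairs of values. The helper name is_bijection_sketch and the toy data are hypothetical and not part of dataPreparation; the sketch does not reproduce the package's exponential search or its rule for choosing which column to return.

```r
# Hypothetical helper, not part of dataPreparation: checks one pair of columns.
# Assumption: x and y are bijections when the number of distinct values in x,
# the number of distinct values in y, and the number of distinct (x, y) pairs
# are all equal.
is_bijection_sketch <- function(x, y) {
  n_x <- length(unique(x))
  n_y <- length(unique(y))
  n_pairs <- nrow(unique(data.frame(x = x, y = y)))
  n_x == n_pairs && n_y == n_pairs
}

# Toy data: col1 and col2 carry the same information, col3 does not.
toy <- data.frame(
  col1 = c("Men", "Women", "Women", "Men"),
  col2 = c("M", "W", "W", "M"),
  col3 = c(1, 2, 3, 4)
)

is_bijection_sketch(toy$col1, toy$col2)  # TRUE
is_bijection_sketch(toy$col1, toy$col3)  # FALSE
```

In the second call, col3 has four distinct values while col1 has only two, so several col3 values map to the same col1 value and the pair is not a bijection.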

### Value

A list of the indices of the columns that have an exact bijection with another column in data_set.

### Examples

# First let's get a data set