R: High Dimensional Dataset handler.

HDDataset {D2MCS}

R Documentation

High Dimensional Dataset handler.

Description

Creates a high dimensional dataset object. Only the required instances are loaded into memory, which avoids unnecessary resource consumption.

Methods

Public methods

HDDataset$new()
HDDataset$getColumnNames()
HDDataset$getNcol()
HDDataset$createSubset()

Method `new()`

Method for initializing the object arguments during runtime.

Usage

HDDataset$new(
  filepath,
  header = TRUE,
  sep = ",",
  skip = 0,
  normalize.names = FALSE,
  ignore.columns = NULL
)

Arguments

filepath: The name of the file which the data are to be read from. Each row of the table appears as one line of the file. If it does not contain an _absolute_ path, the file name is _relative_ to the current working directory, 'getwd()'.
header: A logical value indicating whether the file contains the names of the variables as its first line. If missing, the value is determined from the file format: 'header' is set to 'TRUE' if and only if the first row contains one fewer field than the number of columns.
sep: The field separator character. Values on each line of the file are separated by this character.
skip: The number of header lines to skip.
normalize.names: A logical value indicating whether the column names should be automatically renamed to ensure R compatibility.
ignore.columns: Specify the columns from the input file that should be ignored.
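As a minimal sketch of how the constructor arguments fit together, the following writes a small CSV to a temporary file and builds an `HDDataset` from it. The file contents and column names are invented for illustration, and the D2MCS call is guarded so the snippet still runs where the package is not installed. The `make.names()` line is only an assumption about the kind of renaming `normalize.names = TRUE` is meant to achieve; this page does not confirm the exact rule.

```r
# Hypothetical sample data, written to a temporary file for illustration.
csv.path <- tempfile(fileext = ".csv")
writeLines(c("patient-id,2nd.score,class",
             "p1,0.5,yes",
             "p2,0.7,no"),
           con = csv.path)

# Two of these header names are not syntactically valid R names.
raw.names <- strsplit(readLines(csv.path, n = 1), split = ",", fixed = TRUE)[[1]]

# Base R's make.names() shows the kind of renaming that
# normalize.names = TRUE presumably performs (an assumption).
safe.names <- make.names(raw.names, unique = TRUE)

# Only attempt the D2MCS call when the package is available.
if (requireNamespace("D2MCS", quietly = TRUE)) {
  dataset <- D2MCS::HDDataset$new(filepath = csv.path,
                                  header = TRUE,
                                  sep = ",",
                                  skip = 0,
                                  normalize.names = TRUE,
                                  ignore.columns = NULL)
}
```

Passing every argument by name, as above, keeps the call readable even though most of them match the defaults shown in Usage.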

Method `getColumnNames()`

Gets the names of the columns comprising the dataset.

Usage

HDDataset$getColumnNames()

Returns

A character vector with the name of each column.

Method `getNcol()`

Obtains the number of columns present in the dataset.

Usage

HDDataset$getNcol()

Returns

An integer of length 1 or NULL.
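The two accessors above can be tried together. This is a sketch under assumptions: the sample file and its column names are hypothetical, the base R lines are only a cross-check of what the accessors would be expected to report, and the D2MCS calls are skipped when the package is not installed.

```r
# Hypothetical sample data, written to a temporary file for illustration.
csv.path <- tempfile(fileext = ".csv")
writeLines(c("id,score,class",
             "p1,0.5,yes",
             "p2,0.7,no"),
           con = csv.path)

# Base R cross-check: read the header plus a single data row to learn
# the column names and the column count without loading the whole file.
first.row <- read.csv(csv.path, header = TRUE, nrows = 1)
expected.names <- colnames(first.row)
expected.ncol <- ncol(first.row)

# Only attempt the D2MCS calls when the package is available.
if (requireNamespace("D2MCS", quietly = TRUE)) {
  dataset <- D2MCS::HDDataset$new(filepath = csv.path, header = TRUE, sep = ",")
  print(dataset$getColumnNames())  # character vector of column names
  print(dataset$getNcol())         # number of columns
}
```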

Method `createSubset()`

Creates a blinded HDSubset for classification purposes.

Usage

HDDataset$createSubset(column.id = FALSE, chunk.size = 1e+05)

Arguments

column.id: An integer or character value identifying the column (by number or by name, respectively). A NULL value is valid and means that no identification column is defined.
chunk.size: An integer value indicating the size of the chunks read in each iteration.

Returns

An HDSubset object.
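To give a feel for what `chunk.size` controls, the sketch below reads a file a few lines at a time in base R and then, if D2MCS is installed, requests a subset with the same chunk size. The helper `read.in.chunks()` and the sample data are hypothetical, and the helper illustrates chunked reading in general rather than how HDSubset is implemented.

```r
# Hypothetical sample data: a header line followed by five rows.
csv.path <- tempfile(fileext = ".csv")
writeLines(c("id,score,class",
             "p1,0.5,yes", "p2,0.7,no", "p3,0.2,no",
             "p4,0.9,yes", "p5,0.1,yes"),
           con = csv.path)

# Hypothetical helper: read the file chunk.size lines at a time and
# return the number of rows seen in each chunk.
read.in.chunks <- function(path, chunk.size, sep = ",") {
  con <- file(path, open = "r")
  on.exit(close(con))
  header <- strsplit(readLines(con, n = 1), split = sep, fixed = TRUE)[[1]]
  chunk.rows <- integer(0)
  repeat {
    lines <- readLines(con, n = chunk.size)
    if (length(lines) == 0) break
    chunk <- read.table(text = lines, sep = sep, col.names = header,
                        stringsAsFactors = FALSE)
    chunk.rows <- c(chunk.rows, nrow(chunk))
  }
  chunk.rows
}

chunk.rows <- read.in.chunks(csv.path, chunk.size = 2)

# Only attempt the D2MCS calls when the package is available.
if (requireNamespace("D2MCS", quietly = TRUE)) {
  dataset <- D2MCS::HDDataset$new(filepath = csv.path, header = TRUE, sep = ",")
  hd.subset <- dataset$createSubset(chunk.size = 2)
}
```

A smaller `chunk.size` keeps fewer rows in memory at once at the cost of more read iterations, which is the trade-off the argument exposes.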
