getDbCovariateData {FeatureExtraction} | R Documentation |
Get covariate information from the database
Description
Uses one or several covariate builder functions to construct covariates.
Usage
getDbCovariateData(
connectionDetails = NULL,
connection = NULL,
oracleTempSchema = NULL,
cdmDatabaseSchema,
cdmVersion = "5",
cohortTable = "cohort",
cohortDatabaseSchema = cdmDatabaseSchema,
cohortTableIsTemp = FALSE,
cohortId = -1,
cohortIds = c(-1),
rowIdField = "subject_id",
covariateSettings,
aggregated = FALSE,
minCharacterizationMean = 0
)
Arguments
connectionDetails |
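An R object of type connectionDetails, as created using the createConnectionDetails function in the DatabaseConnector package. Either the connection or the connectionDetails argument should be specified. |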
connection |
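A connection to the server containing the schema, as created using the connect function in the DatabaseConnector package. Either the connection or the connectionDetails argument should be specified. |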
oracleTempSchema |
A schema where temp tables can be created in Oracle. |
cdmDatabaseSchema |
The name of the database schema that contains the OMOP CDM instance. Requires read permissions to this database. On SQL Server, this should specify both the database and the schema, so for example 'cdm_instance.dbo'. |
cdmVersion |
The OMOP CDM version used. Currently only "5" is supported. |
cohortTable |
Name of the (temp) table holding the cohort for which we want to construct covariates |
cohortDatabaseSchema |
If the cohort table is not a temp table, specify the database schema where the cohort table can be found. On SQL Server, this should specify both the database and the schema, so for example 'cdm_instance.dbo'. |
cohortTableIsTemp |
Is the cohort table a temp table? |
cohortId |
DEPRECATED: use cohortIds instead. For which cohort ID(s) should covariates be constructed? If set to -1, covariates will be constructed for all cohorts in the specified cohort table. |
cohortIds |
For which cohort ID(s) should covariates be constructed? If set to c(-1), covariates will be constructed for all cohorts in the specified cohort table. |
rowIdField |
The name of the field in the cohort table that is to be used as the row_id field in the output table. This can be especially useful if there is more than one period per person. |
covariateSettings |
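Either an object of type covariateSettings, as created using one of the createCovariate functions, or a list of such objects. |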
aggregated |
Should aggregate statistics be computed instead of covariates per cohort entry? If aggregated is set to FALSE, the results returned will be based on each subject_id and cohort_start_date in your cohort table. If your cohort contains multiple entries for the same subject_id (due to different cohort_start_date values), you must carefully set the rowIdField so you can identify the patients properly. See issue #229 for more discussion on this parameter. |
minCharacterizationMean |
The minimum mean value for characterization output. Values below this will be cut off from output. This will help reduce the file size of the characterization output, but will remove information on covariates that have very low values. The default is 0. |
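The effect of minCharacterizationMean can be pictured with a small base R sketch. This is only an illustration of the thresholding idea, not the package's implementation: the data frame, its column names, and the threshold value are all hypothetical.

```r
# Hypothetical aggregated output: one row per covariate with its mean value.
# The column names are illustrative and are not taken from the package.
aggregatedCovariates <- data.frame(
  covariateId = c(1001, 1002, 1003, 1004),
  averageValue = c(0.250, 0.004, 0.010, 0.0005)
)

# Assumed threshold, chosen only for this illustration.
minCharacterizationMean <- 0.01

# Keep covariates whose mean is at or above the threshold; rows below it
# are dropped, which shrinks the output but loses the rare covariates.
kept <- aggregatedCovariates[
  aggregatedCovariates$averageValue >= minCharacterizationMean,
]
print(kept)
```

With the default of 0, no covariate with a non-negative mean would be removed by such a filter.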
Details
This function uses the data in the CDM to construct a large set of covariates for the provided cohort. The cohort is assumed to be in an existing table with these fields: 'subject_id', 'cohort_definition_id', 'cohort_start_date'. Optionally, an extra field can be added containing the unique identifier that will be used as rowID in the output.
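As a minimal sketch of the cohort layout described above, the following base R data frame mimics a cohort table in which one person has two entries. The extra entry_id field and all values are hypothetical. The sketch only shows why subject_id may not be usable as a unique row identifier in that situation.

```r
# Hypothetical cohort table with the three expected fields plus an extra
# unique identifier (entry_id is an invented name for this illustration).
cohort <- data.frame(
  cohort_definition_id = c(1, 1, 1),
  subject_id = c(101, 101, 102),
  cohort_start_date = as.Date(c("2010-01-01", "2012-06-15", "2011-03-20")),
  entry_id = c(1, 2, 3)
)

# Person 101 has two cohort entries, so subject_id does not identify a row.
subjectIdIsUnique <- anyDuplicated(cohort$subject_id) == 0

# The extra field is unique per entry, so it can serve as the row ID.
entryIdIsUnique <- anyDuplicated(cohort$entry_id) == 0

print(c(subject_id = subjectIdIsUnique, entry_id = entryIdIsUnique))
```

For a table shaped like this one, passing rowIdField = "entry_id" instead of the default "subject_id" would let each cohort entry be identified separately in the output.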
Value
Returns an object of type covariateData, containing information on the covariates.
Examples
eunomiaConnectionDetails <- Eunomia::getEunomiaConnectionDetails()
covSettings <- createDefaultCovariateSettings()
Eunomia::createCohorts(
connectionDetails = eunomiaConnectionDetails,
cdmDatabaseSchema = "main",
cohortDatabaseSchema = "main",
cohortTable = "cohort"
)
covData <- getDbCovariateData(
connectionDetails = eunomiaConnectionDetails,
oracleTempSchema = NULL,
cdmDatabaseSchema = "main",
cdmVersion = "5",
cohortTable = "cohort",
cohortDatabaseSchema = "main",
cohortTableIsTemp = FALSE,
cohortIds = -1,
rowIdField = "subject_id",
covariateSettings = covSettings,
aggregated = FALSE
)
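For comparison, a hedged sketch of the same call with aggregated = TRUE, which requests aggregate statistics instead of covariates per cohort entry. The wrapper function runAggregatedExample is a hypothetical name introduced here, and the sketch assumes the FeatureExtraction and Eunomia packages are installed. If they are not, it returns NULL without doing anything.

```r
# Hypothetical wrapper; only the calls inside it come from the packages.
runAggregatedExample <- function() {
  # Skip quietly when the required packages are not available.
  if (!requireNamespace("FeatureExtraction", quietly = TRUE) ||
      !requireNamespace("Eunomia", quietly = TRUE)) {
    message("FeatureExtraction and Eunomia are required for this sketch.")
    return(invisible(NULL))
  }

  connectionDetails <- Eunomia::getEunomiaConnectionDetails()

  # Create the example cohorts in the "main" schema, as in the example above.
  Eunomia::createCohorts(
    connectionDetails = connectionDetails,
    cdmDatabaseSchema = "main",
    cohortDatabaseSchema = "main",
    cohortTable = "cohort"
  )

  # Same call as above, but asking for aggregate statistics.
  FeatureExtraction::getDbCovariateData(
    connectionDetails = connectionDetails,
    cdmDatabaseSchema = "main",
    cohortTable = "cohort",
    cohortDatabaseSchema = "main",
    cohortIds = c(-1),
    covariateSettings = FeatureExtraction::createDefaultCovariateSettings(),
    aggregated = TRUE
  )
}
```

Calling aggregatedData <- runAggregatedExample() would then return a covariateData object holding the aggregate statistics rather than one row per cohort entry.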