system_define_cohort {ubiquity}    R Documentation
Define Estimation Cohort
Description
Define a cohort to include in a parameter estimation
Usage
system_define_cohort(cfg, cohort)
Arguments
cfg       ubiquity system object
cohort    list with cohort information
Details
Each cohort has a name (e.g. d5mpk), and the dataset containing the information for this cohort is identified by the name defined in system_load_data:
cohort = list( name = "d5mpk", dataset = "pm_data", inputs = NULL, outputs = NULL)
Next, if only a portion of the dataset applies to the current cohort, you can define a cohort filter (the cf field). This will be applied to the dataset to return only the records relevant to this cohort. For example, to keep only records where the column DOSE is 5 (for the 5 mpk cohort), use the following:
cohort[["cf"]] = list(DOSE = c(5))
If the dataset has the headings ID, DOSE and SEX, and the cohort filter has the following format:
cohort[["cf"]] = list(ID = c(1:4), DOSE = c(5,10), SEX = c(1))
it would be translated into the boolean filter:
((ID==1) | (ID==2) | (ID==3) | (ID==4)) & ((DOSE==5) | (DOSE==10)) & (SEX==1)
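To make the translation concrete, the following is a minimal sketch in base R that applies a cohort filter to a small made-up data frame. The data values and the helper name apply_cohort_filter are assumptions for illustration only; the code mimics the boolean logic described above and is not ubiquity's internal implementation.

```r
# Hypothetical dataset with the headings ID, DOSE and SEX
pm_data <- data.frame(
  ID   = c(1, 2, 3, 4, 5, 6),
  DOSE = c(5, 10, 20, 5, 5, 10),
  SEX  = c(1, 1, 1, 0, 1, 1)
)

cf <- list(ID = c(1:4), DOSE = c(5, 10), SEX = c(1))

# Hypothetical helper: values listed for one column are OR-ed together,
# and the per-column conditions are AND-ed together
apply_cohort_filter <- function(df, cf) {
  keep <- rep(TRUE, nrow(df))
  for (col in names(cf)) {
    keep <- keep & (df[[col]] %in% cf[[col]])
  }
  df[keep, , drop = FALSE]
}

filtered <- apply_cohort_filter(pm_data, cf)
print(filtered)
```

In this toy dataset only subjects 1 and 2 satisfy all three conditions.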
Optionally, you may want to fix a system parameter to a different value for a given cohort. This can be done using the cohort parameter (cp) field. For example, if you had the body weight defined as a system parameter (BW), and you wanted to fix the body weight to 70 for the current cohort, you would do the following:
cohort[["cp"]] = list(BW = c(70))
Note that you can only fix parameters that are not being estimated.
By default the underlying simulation output times will be taken from the general output_times option (see system_set_option). However, it may also be necessary to specify simulation output times for a specific cohort. The output_times field can be used for this. Simply provide a vector of output times:
cohort[["output_times"]] = seq(0,100,2)
Next we define the dosing for this cohort. It is only necessary to define those inputs that are non-zero. Suppose the data here were generated from animals given a single 5 mpk IV dose at time 0. Bolus dosing is defined using <B:times> and <B:events>. If Cp is the central compartment, you would pass this information to the cohort in the following manner:
cohort[["inputs"]][["bolus"]] = list()
cohort[["inputs"]][["bolus"]][["Cp"]] = list(TIME=NULL, AMT=NULL)
cohort[["inputs"]][["bolus"]][["Cp"]][["TIME"]] = c( 0)
cohort[["inputs"]][["bolus"]][["Cp"]][["AMT"]] = c( 5)
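The same structure extends to repeated dosing by supplying vectors. The sketch below assumes a hypothetical regimen of three 5 mpk doses given 24 time units apart; the times and amounts are illustrative, and the two vectors are assumed to be paired element by element. It builds the nested list in a single assignment, which yields the same kind of structure as the step-by-step form above.

```r
cohort <- list(name = "d5mpk", dataset = "pm_data", inputs = NULL, outputs = NULL)

# Hypothetical regimen: three bolus doses into Cp
dose_times   <- c(0, 24, 48)
dose_amounts <- c(5, 5, 5)

# Assumption: each dose time is paired with exactly one amount
stopifnot(length(dose_times) == length(dose_amounts))

cohort[["inputs"]] <- list(
  bolus = list(
    Cp = list(TIME = dose_times, AMT = dose_amounts)
  )
)

# Inspect the nested structure
str(cohort[["inputs"]])
```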
Inputs can also include any infusion rates (infusion_rates) or covariates (covariates). Covariates will have the default value specified in the system file unless overwritten here. The units here are the same as those in the system file.
Next we need to map the outputs in the model to the observation data in the dataset. Under the outputs field there is a field for each output. Here the field ONAME can be replaced with something more useful (like PK).
cohort[["outputs"]][["ONAME"]] = list()
You may want to filter the dataset further. Say, for example, you have two outputs and the cf applied above reduces your dataset down to records containing both outputs. Here you can use the "of" field to apply an "output filter" that further reduces the records to those that apply to the current output ONAME.
cohort[["outputs"]][["ONAME"]][["of"]] = list( COLNAME = c(), COLNAME = c())
If you do not need further filtering of the data, you can just omit the field.
Next you need to identify the columns in the dataset that contain your times and observations. This is found in the obs field for the current observation:
cohort[["outputs"]][["ONAME"]][["obs"]] = list( time = "TIMECOL", value = "OBSCOL", missing = -1)
The times and observations in the dataset are found in the 'TIMECOL' column and the 'OBSCOL' column, respectively (the optional missing data option is specified here by -1).
These observations in the dataset need to be mapped to the appropriate elements of your model defined in the system file. This is done with the model field:
cohort[["outputs"]][["ONAME"]][["model"]] = list( time = "TS", value = "MODOUTPUT", variance = "PRED^2")
First, the system time scale indicated by the TS placeholder above must be specified. The time scale must correspond to the data found in TIMECOL above. Next, the model output indicated by the MODOUTPUT placeholder needs to be specified. This is defined in the system file using <O> and should correspond to OBSCOL from the dataset. Lastly, the variance field specifies the variance model. You can use the keyword PRED (the model predicted output) and any variance parameters. Some examples include:
- variance = "1": Least squares
- variance = "PRED^2": Weighted least squares proportional to the prediction squared
- variance = "(SLOPE*PRED)^2": Maximum likelihood estimation where SLOPE is defined as a variance parameter (<VP>)
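To illustrate how these variance strings act on the residuals, the sketch below evaluates each expression for made-up predictions and computes the sum of squared residuals weighted by the resulting variance. The numbers, the SLOPE value, and the helper weighted_ss are assumptions for illustration; the actual objective function used by ubiquity (in particular for maximum likelihood) is not reproduced here.

```r
PRED  <- c(10, 5, 2.5)    # hypothetical model predictions
OBS   <- c(11, 4.5, 2.4)  # hypothetical observations
SLOPE <- 0.1              # hypothetical variance parameter value

variance_models <- c("1", "PRED^2", "(SLOPE*PRED)^2")

# Hypothetical helper: evaluate the variance string with PRED and SLOPE
# available, then weight each squared residual by 1/variance
weighted_ss <- function(variance, PRED, OBS, SLOPE) {
  v <- eval(parse(text = variance), list(PRED = PRED, SLOPE = SLOPE))
  v <- rep_len(v, length(PRED))  # "1" evaluates to a single value
  sum((OBS - PRED)^2 / v)
}

results <- sapply(variance_models, weighted_ss,
                  PRED = PRED, OBS = OBS, SLOPE = SLOPE)
print(results)
```

With variance = "1" every residual contributes equally, whereas the PRED-based expressions down-weight residuals at larger predicted values.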
The following controls the plotting aspects associated with this output. The color, shape and line values are the values used by ggplot functions.
cohort[["outputs"]][["ONAME"]][["options"]] = list( marker_color = "black", marker_shape = 16, marker_line = 1 )
If the cohort has multiple outputs, simply repeat the process above for the additional outputs. The estimation vignette contains examples of this.
Note: Output names should be consistent between cohorts so they will be grouped together when plotting results.
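Putting the pieces together, the following sketch assembles a complete single-output cohort as a plain R list. The column names TIME_HR and CONC, the time scale hours, and the model output Cp_ngml are hypothetical placeholders that would need to match your own dataset and system file. The final call is left commented out because it requires a cfg object built from a system file with the dataset already loaded.

```r
cohort <- list(
  name         = "d5mpk",
  dataset      = "pm_data",
  cf           = list(DOSE = c(5)),   # keep only the 5 mpk records
  cp           = list(BW = c(70)),    # fix body weight for this cohort
  output_times = seq(0, 100, 2),
  inputs       = list(
    bolus = list(Cp = list(TIME = c(0), AMT = c(5)))
  ),
  outputs      = list(
    PK = list(
      # Hypothetical dataset columns
      obs     = list(time = "TIME_HR", value = "CONC", missing = -1),
      # Hypothetical time scale and model output names
      model   = list(time = "hours", value = "Cp_ngml", variance = "PRED^2"),
      options = list(marker_color = "black", marker_shape = 16, marker_line = 1)
    )
  )
)

# Inspect the top two levels of the cohort definition
str(cohort, max.level = 2)

# cfg = system_define_cohort(cfg, cohort)
```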
Value
ubiquity system object with cohort defined
See Also
Estimation vignette (vignette("Estimation", package = "ubiquity"))