R: Create an instance of a fitting problem

ml_problem {mlfit}

R Documentation

Create an instance of a fitting problem

Description

The ml_problem() function is the first step for fitting a reference sample to known control totals with mlfit. All algorithms (see ml_fit()) expect an object created by this function (or optionally processed with flatten_ml_fit_problem()).

The special_field_names() function is useful for the field_names argument to ml_problem.

Usage

ml_problem(
  ref_sample,
  controls = list(individual = individual_controls, group = group_controls),
  field_names,
  individual_controls,
  group_controls,
  prior_weights = NULL,
  geo_hierarchy = NULL
)

is_ml_problem(x)

## S3 method for class 'ml_problem'
format(x, ...)

## S3 method for class 'ml_problem'
print(x, ...)

special_field_names(
  groupId,
  individualId,
  individualsPerGroup = NULL,
  count = NULL,
  zone = NULL,
  region = NULL
)

Arguments

`ref_sample`	The reference sample
`controls`	Control totals, by default initialized from the `individual_controls` and `group_controls` arguments
`field_names`	Names of special fields, construct using `special_field_names()`
`individual_controls`, `group_controls`	Control totals at individual and group level, given as a list of data frames where each data frame defines a control
`prior_weights`	Prior (or design) weights at group level; by default a vector of ones will be used, which corresponds to random sampling of groups
`geo_hierarchy`	A table shows mapping between a larger zoning level to many zones of a smaller zoning level. The column name of the larger level should be specified in `field_names` as 'region' and the smaller one as 'zone'.
`x`	An object
`...`	Ignored.
`groupId`, `individualId`	Name of the column that defines the ID of the group or the individual
`individualsPerGroup`	Obsolete.
`count`	Name of control total column in control tables (use first numeric column in each control by default).
`region`, `zone`	Name of the column that defines the region of the reference sample or the zone of the controls. Note that region is a larger area that contains more than one zone.

Value

An object of class ml_problem or a list of them if geo_hierarchy was given, essentially a named list with the following components:

refSample: The reference sample, a data.frame.
controls: A named list with two components, individual and group. Each contains a list of controls as data.frames.
fieldNames: A named list with the names of special fields.

is_ml_problem() returns a logical.

Examples

# Create example from Ye et al., 2009

# Provide reference sample
ye <- tibble::tribble(
  ~HHNR, ~PNR, ~APER, ~HH_VAR, ~P_VAR,
  1, 1, 3, 1, 1,
  1, 2, 3, 1, 2,
  1, 3, 3, 1, 3,
  2, 4, 2, 1, 1,
  2, 5, 2, 1, 3,
  3, 6, 3, 1, 1,
  3, 7, 3, 1, 1,
  3, 8, 3, 1, 2,
  4, 9, 3, 2, 1,
  4, 10, 3, 2, 3,
  4, 11, 3, 2, 3,
  5, 12, 3, 2, 2,
  5, 13, 3, 2, 2,
  5, 14, 3, 2, 3,
  6, 15, 2, 2, 1,
  6, 16, 2, 2, 2,
  7, 17, 5, 2, 1,
  7, 18, 5, 2, 1,
  7, 19, 5, 2, 2,
  7, 20, 5, 2, 3,
  7, 21, 5, 2, 3,
  8, 22, 2, 2, 1,
  8, 23, 2, 2, 2
)
ye

# Specify control at household level
ye_hh <- tibble::tribble(
  ~HH_VAR, ~N,
  1,       35,
  2,       65
)
ye_hh

# Specify control at person level
ye_ind <- tibble::tribble(
  ~P_VAR, ~N,
  1, 91,
  2, 65,
  3, 104
)
ye_ind

ye_problem <- ml_problem(
  ref_sample = ye,
  field_names = special_field_names(
    groupId = "HHNR", individualId = "PNR", count = "N"
  ),
  group_controls = list(ye_hh),
  individual_controls = list(ye_ind)
)
ye_problem

fit <- ml_fit_dss(ye_problem)
fit$weights

[Package mlfit version 0.5.3 Index]