ipr {ipr} | R Documentation |
Iterative Proportional Repartition (IPR) algorithm
Description
Estimates the repartition of health costs among diseases in the presence of multimorbidity, i.e. when some patients have multiple diseases. Using the Iterative Proportional Repartition algorithm (see references below), the goal is to estimate the average cost of each disease, starting from the global health cost available for each patient.
Usage
ipr(X, y, print.it=FALSE, start=rep(1,dim(X)[2]), cutup=Inf, cutlow=cutup,
epsrel=0.001, epsabs=0.1, maxiter=1000, det=FALSE)
Arguments
X |
Matrix with |
y |
Vector where |
print.it |
Logical. If |
start |
Vector of initial estimates of the average cost for each disease to start IPR algorithm. Default is an initial average cost of 1 for all diseases. The length of |
cutup , cutlow |
Options which can be used to get a robust version of IPR. If the current allocated cost of disease |
epsrel |
Stopping criterion such that the IPR algorithm stops if for all diseases, the current estimated average cost differs by less than 100* |
epsabs |
Stopping criterion such that the IPR algorithm stops if for all diseases, the current estimated average cost differs (in absolute value) by less than |
maxiter |
Maximum number of iterations of the IPR algorithm. The default value is 1000. |
det |
Logical. If |
Details
Let us consider n patients and p diseases. We are given a matrix X such that x_{ij}=1 if patient i suffers from disease j and x_{ij}=0 otherwise. We are also given a vector y, where y_i is the global health cost of patient i. In order to estimate the average cost of each disease, the IPR algorithm works as follows:
1. Start with some initial estimates mu_j, e.g. mu_j=1 for all j=1,...,p. Those initial estimates are stored in the vector start.
2. Allocate the cost y_i among the diseases diagnosed for patient i, proportionally to the current estimates mu_j.
3. Update the current estimate of mu_j by averaging the specific costs obtained in step 2 for disease j over the patients having that disease.
4. Repeat steps 2 and 3 until a stopping criterion is met, based on the relative or absolute distance between two consecutive iterations. The stopping criterion can be defined with epsabs or epsrel.
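The four steps above can be sketched in a few lines of base R. This is only an illustrative sketch, not the code of ipr(): the helper name ipr_sketch is hypothetical, the robust options cutup and cutlow are left out, and the way epsrel and epsabs are combined (stop as soon as either one is satisfied for all diseases) is an assumption. It also assumes that every patient has at least one disease and that every disease occurs in at least one patient.

```r
# Hypothetical helper illustrating steps 1-4; not the package's ipr().
ipr_sketch <- function(X, y, start = rep(1, ncol(X)),
                       epsrel = 0.001, epsabs = 0.1, maxiter = 1000) {
  mu <- start                              # step 1: initial estimates mu_j
  niter <- 0
  for (iter in seq_len(maxiter)) {
    niter <- iter
    # Step 2: split y_i among the diseases of patient i,
    # proportionally to the current estimates mu_j.
    W <- sweep(X, 2, mu, "*")              # w_ij = x_ij * mu_j
    alloc <- W * (y / rowSums(W))          # each row i sums to y_i
    # Step 3: average the allocated costs over the patients with disease j.
    mu_new <- colSums(alloc) / colSums(X)
    # Step 4: stopping rule (assumed: either criterion is sufficient).
    delta <- abs(mu_new - mu)
    converged <- all(delta < epsabs) || all(delta < epsrel * abs(mu))
    mu <- mu_new
    if (converged) break
  }
  list(coef = mu, niter = niter)
}

# Second data set from the Examples section, with tight tolerances.
X <- matrix(c(1,0,0,1,1,1), nrow = 3, byrow = TRUE)
y <- c(5000, 500, 6600)
fit <- ipr_sketch(X, y, epsrel = 1e-10, epsabs = 1e-8)
fit$coef
```

With these tolerances the estimates should approach the values 5500 and 550 quoted in the Examples section for the same data.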
By construction, the IPR algorithm satisfies two properties. First, it yields positive estimates for each average disease cost. Second, it recovers the total health costs: the sum over all diseases of the estimates mu_j multiplied by the number of patients suffering from disease j is equal to the sum of the costs y_i.
The estimate tau_j of the total cost spent for disease j, as well as the estimated proportion pi_j of the total costs allocated to disease j, are also returned by the function.
Mathematically, tau_j is the sum over i=1 to i=n of x_{ij}*mu_j, while pi_j is defined as tau_j divided by the sum of all tau_k.
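As a small worked illustration of these two formulas, the following base R lines compute tau_j and pi_j by hand. The values in mu are the average costs quoted in the Examples section for the two-disease data set; the variable names n_j, tau and pi_hat are chosen here for illustration and are not part of the package.

```r
X  <- matrix(c(1,0,0,1,1,1), nrow = 3, byrow = TRUE)
y  <- c(5000, 500, 6600)
mu <- c(5500, 550)          # average costs quoted in the Examples section

n_j    <- colSums(X)        # number of patients suffering from each disease
tau    <- n_j * mu          # tau_j = sum over i of x_ij * mu_j
pi_hat <- tau / sum(tau)    # pi_j = tau_j / sum over k of tau_k

# The total health costs are recovered: both sums should agree.
all.equal(sum(tau), sum(y))
```

Here disease 1 receives 10/11 of the total costs and disease 2 the remaining 1/11.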
Value
coef |
A vector with the estimated average cost of each disease. |
total |
A vector with the estimated total cost spent for each disease. |
proportions |
A vector with the estimated proportion of total cost spent for each disease. |
niter |
The number of iterations of the IPR algorithm performed until the stopping criterion is met. |
epsrel |
The stopping criterion based on a relative distance between two consecutive iterations which has been used. |
epsabs |
The stopping criterion based on an absolute distance between two consecutive iterations which has been used. |
detail |
A matrix with the allocated costs of each disease for each patient, if |
Author(s)
Dr. Jean-Benoit Rossel (jean-benoit.rossel@unisante.ch), Prof. Valentin Rousson and Dr. Yves Eggli.
References
Rousson, V., Rossel, J.-B. & Eggli, Y. (2019). Estimating Health Cost Repartition Among Diseases in the Presence of Multimorbidity. Health Services Research and Managerial Epidemiology, 6.
Rossel, J.-B., Rousson, V. & Eggli, Y. A comparison of statistical methods for allocating disease costs in the presence of interactions. In preparation.
Examples
# Here is a first example with 10 patients and 4 diseases:
X <- matrix(c(1,0,0,0,
0,1,1,0,
0,1,0,1,
1,0,0,1,
1,1,1,0,
0,0,1,1,
0,1,0,0,
1,1,0,0,
0,1,1,1,
0,0,0,1),ncol=4,byrow=TRUE)
y <- c(500,200,100,400,1000,500,100,300,800,2000)
# If we used a linear model without intercept to estimate the average
# disease costs, we would obtain a negative value for disease 2.
lm(y~X-1)
# The IPR algorithm provides only positive estimates
ipr(X,y)
# Here is a second example:
X <- matrix(c(1,0,0,1,1,1),nrow=3,byrow=TRUE)
y <- c(5000,500,6600)
# We have three patients. The first one has only disease 1 with a cost of 5000.
# The second one has only disease 2 with a cost of 500 (i.e. ten times less
# expensive than disease 1). The third patient has both diseases with
# a cost of 6600 (i.e. 5000 + 500 + an extra cost of 1100).
# Using a linear model, one would allocate the extra cost equally between
# the three patients. The estimated average cost would thus be 5000+(1100/3)
# for disease 1 and 500+(1100/3) for disease 2.
lm(y~X-1)
# Using the IPR algorithm, one allocates the extra cost taking into account that
# disease 1 is ten times more expensive than disease 2 when occurring alone.
# One thus gets an estimated average cost of 5500 for disease 1 and
# of 550 for disease 2.
ipr(X,y)