mixcov {CAMAN} | R Documentation |

## Fitting mixture models with covariates

### Description

The function mixcov can be used to estimate the parameters of covariate
adjusted mixture models. This is done using the EM algorithm.
The function first performs the necessary data augmentation and then
applies the EM-algorithm. Covariates may be included as fixed and random
effects into the model.

Thus, the EM algorithm for covariate adjusted mixture models implies to
perform first the necessary data augmentation, and then based on
starting values for `p[j]`

and `beta[j]`

the computation of posterior
probabilities `e[ij]`

. This is the E-step. In the M-step new mixing
weights `p[j]`

and regression coefficients `beta[j]`

are computed.

Then the algorithm performs as follows:

\
Step 0: Let `P`

be any vector of starting values

Step 1: Compute the E-step, that is estimate the probability of component
membership for each observation

Step 2: Compute the M-step, that is compute new mixing weights and model
coefficients

Proceed until convergence is met.

### Usage

```
mixcov(dep, fixed, random="", data, k, weight=NULL, pop.at.risk=NULL,
var.lnOR=NULL, family="gaussian", maxiter=50,
acc=10^-7, returnHomogeneousModel = FALSE)
```

### Arguments

`dep` |
dependent variable. Vector or colname of |

`fixed` |
fixed effects. Vector or colname of |

`random` |
random effects. Vector or colname of |

`k` |
number of components. |

`weight` |
weights of the data. Vector or colname of |

`family` |
the underlying type density function as a character ("gaussian" or "poisson")! |

`data` |
an optional data frame. |

`pop.at.risk` |
population at risk: An offset that could be used to determine a mixture model for Poisson data from unequally large populations at risk. Vector or colname of |

`var.lnOR` |
variances of the data: These variances might be given when working with meta analyses! Vector or colname of |

`maxiter` |
parameter to control the maximal number of iterations in the VEM and EM loops. Default is 5000. |

`acc` |
convergence criterion. VEM and EM loops stop when deltaLL<acc. Default is 10^(-7). |

`returnHomogeneousModel` |
boolean to indicate whether the homogeneous model (simple glm) should be returned too. Default is FALSE. |

### Value

The function returns a CAMAN.glm.object (S4-class), describing a mixture model with covariates.
The main information about the mixture model is printed by just typing the <object>. Additional information is given in `summary(object)`

.
Single attributes can be accessed using the `@`

, e.g. mix@LL.

`dat` |
(input) data |

`family` |
underlying type density function |

`LL` |
Likelihood of the final (best) iteration |

`BIC` |
Likelihood of the final (best) iteration |

`num.k` |
number of components obtained |

`p` |
probability of each component |

`t` |
parameter of distribution (normal distr. -> mean, poisson distr. -> lambda, binomial distr. -> prob) |

`coefMatrix` |
complete coefficient matrix |

`prob` |
probabilies, belonging to each component |

`steps` |
number of steps performed (EM). |

`commonEffect` |
common effects |

`hetvar` |
heterogeneity variance |

`commonEffect` |
common effects |

`classification` |
classification labels for each observation ( |

`cl` |
the matched call. |

`fittedObs` |
predictions of the fitted model, given the observations |

### Author(s)

Peter Schlattmann and Johannes Hoehne

### Examples

```
### Toy data: simulate subjects with a different relationship between age and salariy
grps = sample(1:3,70, replace=TRUE) #assign each person to one group
salary=NULL
age = round(runif(70) * 47 + 18)
#random effects: age has a different influence (slope) on the salary
salary[grps == 1] = 2000 + 12 * age[grps==1]
salary[grps == 2] = 4000 + 4 * age[grps==2]
salary[grps == 3] = 3200 + (-15) * age[grps==3]
salary = salary + rnorm(70)*30 #some noise
sex =sample(c("m","w"), 70, replace=TRUE)
salary[sex=="m"] = salary[sex=="m"] * 1.2 #men earn 20 percent more than women
salaryData = data.frame(salary=salary, age=age, sex=sex)
tstSalary <- mixcov(dep="salary", fixed="sex", random="age" ,data=salaryData,
k=3,family="gaussian", acc=10^-3)
### POISSON data:
data(NoP)
ames3 <- mixcov(dep="count",fixed=c("dose", "logd"),random="",data=NoP,
k=3,family="poisson")
### Gaussian data
data(betaplasma)
beta4 <- mixcov(dep="betacaro", fixed=c("chol","sex","bmi"), random="betadiet",
data=betaplasma, k=4, family="gaussian")
```

*CAMAN*version 0.78 Index]