simulatey {optimStrat} | R Documentation
Simulate the Study Variable
Description
Simulate values for the study variable based on the auxiliary variable x
and an assumed superpopulation model.
Usage
simulatey(x, f, g, dist = "normal", rho = NULL, Sigma = NULL, ...)
Arguments
x | a numeric vector giving the values of the auxiliary variable.
f | the name of the function defining the desired trend (see ‘Details’).
g | the name of the function defining the desired spread (see ‘Details’).
dist | the desired distribution of the study variable conditioned on the auxiliary variable. Either 'normal' or 'gamma' (see ‘Details’).
rho | a number giving the absolute value of the desired correlation between
Sigma | a nonnegative number giving the scale of the spread term in the superpopulation model. Ignored if
... | other arguments passed to
Details
The values of the study variable y are simulated using a superpopulation model defined as:
Y_{k} = f(x_{k}) + \epsilon_{k}
with E(\epsilon_{k}) = 0, V(\epsilon_{k}) = \sigma^{2}g^{2}(x_{k}) and Cov(\epsilon_{k},\epsilon_{l}) = 0 if k \ne l. Also, Y_{k}|f(x_{k}) is distributed according to dist.
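As a rough illustration of the model above, the following base-R sketch draws independent normal errors with standard deviation sigma * g(x_k) around the trend f(x_k). It is not the code used by simulatey: the helper name simulate_normal_sketch and its sigma argument are hypothetical, and only the normal case is covered.

```r
# Hypothetical helper, NOT the optimStrat implementation: a sketch of
# Y_k = f(x_k) + eps_k with independent normal errors.
simulate_normal_sketch <- function(x, f, g, sigma, ...) {
  trend  <- f(x, ...)           # E(Y_k) = f(x_k)
  spread <- sigma * g(x, ...)   # sd(eps_k) = sigma * g(x_k)
  trend + rnorm(length(x), mean = 0, sd = spread)
}

# Trend and spread functions in the style of the 'Examples' section.
f <- function(x, b0, b1, b2, ...) b0 + b1 * x^b2
g <- function(x, b3, ...) x^b3

set.seed(1)
x <- 1 + sort(rgamma(5000, shape = 4 / 9, scale = 108))

# Linear trend, constant spread.
y  <- simulate_normal_sketch(x, f, g, sigma = 20, b0 = 0, b1 = 1, b2 = 1, b3 = 0)
# With sigma = 0 the errors vanish and y reduces to the trend.
y0 <- simulate_normal_sketch(x, f, g, sigma = 0, b0 = 0, b1 = 1, b2 = 1, b3 = 0)
```

Setting sigma to zero removes the error term, which is a quick way to check that the trend function behaves as intended.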
f and g should return a vector of the same length as x. Their first argument should be x, and they should not share the name of any other argument. Both f and g should have the ... argument (see ‘Examples’).
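The requirements on f and g are consistent with the extra arguments being forwarded to both functions at once. The sketch below assumes that behaviour (it has not been checked against the package source) and uses a hypothetical dispatcher, call_both, to show why the ... argument is needed.

```r
f <- function(x, b0, b1, b2, ...) b0 + b1 * x^b2
g <- function(x, b3, ...) x^b3

# Hypothetical dispatcher: every extra argument is sent to both functions.
call_both <- function(x, f, g, ...) {
  list(trend = f(x, ...), spread = g(x, ...))
}

x   <- c(1, 2, 3)
out <- call_both(x, f, g, b0 = 0, b1 = 2, b2 = 1, b3 = 0)

# Without `...`, g rejects the parameters that are meant for f.
g_strict <- function(x, b3) x^b3
res <- tryCatch(
  call_both(x, f, g_strict, b0 = 0, b1 = 2, b2 = 1, b3 = 0),
  error = function(e) "unused argument error"
)
```

Under the same assumption, this also shows why parameter names must not be shared: a name used by both f and g would receive the same value in both.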
Note that Sigma defines the degree of association between x and y: the larger Sigma, the smaller the correlation, rho, and vice versa. For this reason, only one of them should be defined. If both are defined, Sigma will be ignored.
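The inverse relation between Sigma and rho can be made explicit. Treating the values of x as a finite population, the model implies Cov(x, Y) = Cov(x, f(x)) and V(Y) = V(f(x)) + sigma^2 * mean(g^2(x)), so that rho = Cov(x, f(x)) / sqrt(V(x) * [V(f(x)) + sigma^2 * mean(g^2(x))]). The sketch below codes this relation and its inverse. It is a derivation from the stated model, not the package's code, and the function names are hypothetical.

```r
# Population (divide-by-N) moments, used consistently below.
pvar <- function(v) mean((v - mean(v))^2)
pcov <- function(u, v) mean((u - mean(u)) * (v - mean(v)))

# Signed correlation between x and Y implied by a given sigma.
rho_from_sigma <- function(x, fx, gx, sigma) {
  pcov(x, fx) / sqrt(pvar(x) * (pvar(fx) + sigma^2 * mean(gx^2)))
}

# Inverse: sigma needed to reach a target absolute correlation rho.
# This sketch stops when the target is out of reach.
sigma_from_rho <- function(x, fx, gx, rho) {
  s2 <- (pcov(x, fx)^2 / (rho^2 * pvar(x)) - pvar(fx)) / mean(gx^2)
  if (s2 < 0) stop("target correlation cannot be reached with this trend")
  sqrt(s2)
}

x  <- seq(1, 100, length.out = 200)
fx <- 2 + 3 * x           # linear trend evaluated at x
gx <- rep(1, length(x))   # constant spread (homoscedasticity)

s90 <- sigma_from_rho(x, fx, gx, rho = 0.90)
s50 <- sigma_from_rho(x, fx, gx, rho = 0.50)
```

Here s50 is larger than s90, matching the statement that a larger Sigma gives a smaller correlation.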
Depending on the trend function f, some correlations cannot be reached. In those cases, Sigma will automatically be set to zero, dist will automatically be set to 'normal' and rho will be ignored (see ‘Examples’).
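One way to see which correlations are out of reach: under the model, the error term can only weaken the association between x and y, so the largest attainable absolute correlation is |cor(x, f(x))|, obtained when the spread is zero. The check below follows from the model as stated rather than from the package source, and the names f_cubic, max_rho and target are illustrative.

```r
x <- seq(1, 100, length.out = 200)
f_cubic <- function(x, ...) x^3

# With no noise at all, Y equals f(x), so this is the ceiling for |rho|.
max_rho <- abs(cor(x, f_cubic(x)))

target    <- 0.99
reachable <- target <= max_rho   # FALSE for this cubic trend
```

A strongly nonlinear trend lowers this ceiling, which is why a high target correlation may be unattainable even without any noise.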
If the trend term takes negative values, dist will be automatically set to 'normal'.
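The restriction on the gamma case can be understood through moment matching: a gamma distribution with mean mu and variance v has shape mu^2 / v and scale v / mu, which requires both mu and v to be strictly positive. The helper below, gamma_params, is hypothetical and only illustrates this constraint. Whether simulatey uses this exact parameterization is an assumption.

```r
# Hypothetical moment-matching helper, not taken from optimStrat.
gamma_params <- function(mu, v) {
  if (any(mu <= 0)) stop("gamma needs a strictly positive mean")
  if (any(v <= 0))  stop("gamma needs a strictly positive variance")
  # mean = shape * scale, variance = shape * scale^2
  list(shape = mu^2 / v, scale = v / mu)
}

p <- gamma_params(mu = c(5, 10), v = c(4, 4))

# A negative trend value has no valid gamma parameters.
bad <- tryCatch(gamma_params(mu = -1, v = 1), error = function(e) e)
```

The same constraint rules out a conditional variance of zero under the gamma distribution.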
Value
A numeric vector giving the simulated value of y associated with each value in x.
Examples
# Trend and spread functions; both take x first and accept ...
f <- function(x, b0, b1, b2, ...) {b0 + b1 * x^b2}
g <- function(x, b3, ...) {x^b3}
# Auxiliary variable
x <- 1 + sort(rgamma(5000, shape = 4/9, scale = 108))
# Linear trend and homoscedasticity
y1 <- simulatey(x, f, g, dist = "normal", b0 = 0, b1 = 1, b2 = 1, b3 = 0, rho = 0.90)
y2 <- simulatey(x, f, g, dist = "gamma", b0 = 0, b1 = 1, b2 = 1, b3 = 0, rho = 0.90)
# Linear trend and heteroscedasticity
y3 <- simulatey(x, f, g, dist = "normal", b0 = 0, b1 = 1, b2 = 1, b3 = 1, rho = 0.90)
y4 <- simulatey(x, f, g, dist = "gamma", b0 = 0, b1 = 1, b2 = 1, b3 = 1, rho = 0.90)
# Quadratic trend and homoscedasticity
y5 <- simulatey(x, f, g, dist = "gamma", b0 = 0, b1 = 1, b2 = 2, b3 = 0, rho = 0.80)
# Correlation of minus one (rho is an absolute value; the sign comes from b1 = -1)
y6 <- simulatey(x, f, g, dist = "normal", b0 = 0, b1 = -1, b2 = 1, b3 = 0, rho = 1)
# Desired correlation cannot be attained
y7 <- simulatey(x, f, g, dist = "normal", b0 = 0, b1 = 1, b2 = 3, b3 = 0, rho = 0.99)
# Negative expectation not possible under gamma distribution
y8 <- simulatey(x, f, g, dist = "gamma", b0 = 0, b1 = -1, b2 = 1, b3 = 0, rho = 1)
# Conditional variance of zero not possible under gamma distribution
y9 <- simulatey(x, f, g, dist = "gamma", b0 = 0, b1 = 1, b2 = 3, b3 = 0, rho = 0.99)