| fit {optimalThreshold} | R Documentation |
Specify which distribution to fit to the marker values
Description
This function is a wrapper to create an S4 object to specify a distribution to fit the marker values.
Usage
fit(x, distr, ini = NULL, thin = NULL, burnin = NULL, model = NULL,
paraNames = NULL, mcmcList = NULL, cdf = NULL, gradient = NULL,
hessian = NULL)
Arguments
| x | a vector of marker values (NA values allowed, see Details). |
| distr | a character string that specifies the distribution to fit: "norm" (normal), "lnorm" (log-normal), "t" (scaled t), "gamma", "logis" (logistic), "user" (user-defined), or "undefined" (see Details). |
| ini | initial values for the parameters of the marker distribution, given as a list of named lists, one per MCMC chain. NULL for "norm" and "lnorm". |
| thin | the thinning interval between consecutive MCMC draws. NULL for "norm" and "lnorm". |
| burnin | a positive integer giving the number of burn-in iterations of the MCMC algorithm. NULL for "norm" and "lnorm". |
| model | a character string that defines the model. Must be a valid JAGS model definition. Used only for the scaled t and logistic distributions; if NULL, a default model with vague priors is used (see Details). |
| paraNames | a character vector containing the names of the parameters of the submitted distribution. To be provided only for a "user" defined distribution. |
| mcmcList | an object of class mcmc.list in which each element contains an MCMC chain. To be provided only for a "user" defined distribution. |
| cdf | a function that gives the cumulative distribution function. To be provided only for a "user" defined distribution (see Details). |
| gradient | a function that gives the probability density function, i.e. the first derivative of cdf. To be provided only for a "user" defined distribution (see Details). |
| hessian | a function that gives the first derivative of the probability density function, i.e. the second derivative of cdf. To be provided only for a "user" defined distribution (see Details). |
Details
This function allows the user to specify which distribution should be fitted to the marker values. If NA values are present in the x argument, a warning is produced. However, the user should not discard the NA values from the original data, because the length of the x argument is used internally to estimate the mean risk of event occurrence in each treatment arm. NA values are therefore handled internally by the function.
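As a small illustration of why NA values should be left in x, the sketch below uses base R only. The variable names are illustrative and the code does not reproduce the package's internal computations; it only shows that dropping NA values beforehand changes the length that the function would see.

```r
# Illustrative marker vector with two missing values
x <- c(1.2, NA, 0.7, 2.4, NA, 1.9)

n_total    <- length(x)        # number of subjects when NA values are kept
n_observed <- sum(!is.na(x))   # number of marker values actually observed

# Dropping NA values beforehand makes the vector shorter than the real sample
x_dropped <- x[!is.na(x)]
length(x_dropped) == n_total   # FALSE
```

Passing the original x (with its NA values) keeps the length equal to the number of subjects, which is what the Details paragraph above asks for.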
Five theoretical distributions are implemented by the package: normal, log-normal, gamma, scaled t, and logistic. This function is also where the user specifies which of the four marker distributions is of type 'undefined' (in other words, which distribution is expressed as a function of the three other distributions and the mean risks of event). The user may also define their own theoretical distribution. The details for each case are provided below:
Fit a normal distribution: when specifying distr="norm", you fit a normal distribution to the marker values passed to the x argument of the function. Non-informative priors are used (p(\mu, \sigma^2) \propto (\sigma^2)^{-1}). Posterior values of the normal distribution parameters are sampled directly from the exact posterior distributions. If you don't want to use non-informative priors, see the explanation on how to fit a user-defined distribution.

Fit a log-normal distribution: when specifying distr="lnorm", you fit a log-normal distribution to the marker values passed to the x argument of the function. Non-informative priors are used (p(\mu, \sigma^2) \propto (\sigma^2)^{-1}). Posterior values of the log-normal distribution parameters are sampled directly from the exact posterior distributions. If you don't want to use non-informative priors, see the explanation on how to fit a user-defined distribution.

Fit a gamma distribution: when specifying distr="gamma", you fit a gamma distribution to the marker values passed to the x argument of the function. Non-informative priors are used (p(shape, scale) \propto 1/scale). Posterior values of the gamma distribution parameters are sampled using the ARS method. This method requires a list of initial values passed to the ini argument of the function. Each element of this list must be a list with one element named 'shape'. It also requires the thinning interval of the MCMC chain, passed to the thin argument, and the length of the burn-in phase, passed to the burnin argument. If you don't want to use non-informative priors, see the explanation on how to fit a user-defined distribution.

Fit a scaled t distribution: when specifying distr="t", you fit a scaled t distribution to the marker values passed to the x argument of the function. Posterior values of the scaled t distribution parameters are sampled using an MCMC algorithm through the JAGS software, so the JAGS model can be provided as a character string through the model argument of the function. If model is NULL, a model with vague priors is supplied automatically:
mu ~ U(min(x), max(x))
log(sd) ~ U(-10, 10)
1/df ~ U(0, 1)
This method requires a list of initial values passed to the ini argument of the function. Each element of this list must be a list with three elements named 'mu', 'sd', and 'df'. It also requires the thinning interval of the MCMC chain, passed to the thin argument, and the length of the burn-in phase, passed to the burnin argument.

Fit a logistic distribution: when specifying distr="logis", you fit a logistic distribution to the marker values passed to the x argument of the function. Posterior values of the logistic distribution parameters are sampled using an MCMC algorithm through the JAGS software, so the JAGS model can be provided as a character string through the model argument of the function. If model is NULL, a model with vague priors is supplied automatically:
location ~ U(min(x), max(x))
log(scale) ~ U(-10, 10)
This method requires a list of initial values passed to the ini argument of the function. Each element of this list must be a list with two elements named 'location' and 'scale'. It also requires the thinning interval of the MCMC chain, passed to the thin argument, and the length of the burn-in phase, passed to the burnin argument.

Fit a user-defined distribution: when specifying distr="user", you fit a user-defined distribution to the marker values passed to the x argument of the function. First, the user must give the parameter names as a character vector in the paraNames argument. Then, the user provides a posterior sample of the distribution parameters, obtained using JAGS or other software, as an object of class mcmc.list passed to the mcmcList argument (this implies that the user performed the Bayesian inference themselves). Note that the variable names in the mcmc.list object must match the names given in the paraNames argument. Finally, the user must specify the cdf, gradient, and hessian functions associated with the fitted distribution. The cdf function is the cumulative distribution function fitted to the marker values, the gradient function is its first derivative, which corresponds to the probability density function fitted to the marker values, and the hessian function is the second derivative of cdf. When the fitted distribution is a supported distribution (e.g. a normal distribution with informative priors), the user may use getMethod(cdf, "normalDist") to reuse the standard method for the normal distribution provided by the package. When the fitted distribution is not supported, the user must specify the cdf directly, e.g. function(x, mu, sd) pnorm(x, mu, sd) (keeping the example of the normal distribution). The same approach may be used for the gradient and hessian functions (see the Examples for more details).

Specify which marker distribution is expressed as a function of the three others and the mean risks of event using distr="undefined".
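For the distr="norm" case described above, sampling "directly from the exact posterior distributions" can be sketched in base R. Under the prior p(\mu, \sigma^2) \propto (\sigma^2)^{-1}, the standard textbook result is that \sigma^2 given the data follows a scaled inverse chi-square distribution and \mu given \sigma^2 is normal. The code below is a minimal sketch of that result; it is an assumption that the package proceeds in exactly this way, and all variable names are illustrative.

```r
set.seed(1)
x   <- c(rnorm(50, mean = 2, sd = 1), NA)  # NA kept in the data, as advised in Details
obs <- x[!is.na(x)]                        # only observed values enter the likelihood
n    <- length(obs)
xbar <- mean(obs)
s2   <- var(obs)

n_draws <- 2000
# sigma^2 | x  ~  scaled inverse chi-square with n - 1 degrees of freedom and scale s2
sigma2 <- (n - 1) * s2 / rchisq(n_draws, df = n - 1)
# mu | sigma^2, x  ~  Normal(xbar, sigma^2 / n)
mu <- rnorm(n_draws, mean = xbar, sd = sqrt(sigma2 / n))

# Posterior sample with one column per parameter
post <- cbind(mu = mu, sd = sqrt(sigma2))
colMeans(post)
```

No MCMC chain, initial values, thinning, or burn-in are involved in this scheme, which is consistent with ini, thin, and burnin being NULL for "norm" and "lnorm".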
Value
Returns an object to be passed to the trtSelThresh and diagThresh functions.
See Also
trtSelThresh and diagThresh.
Examples
#Fit a normal distribution
x <- rnorm(250)
fitX <- fit(x, "norm")
#Fit a log-normal distribution
x <- rlnorm(250)
fitX <- fit(x, "lnorm")
#Fit a gamma distribution
x <- rgamma(250, shape = 2, scale = 1.2)
fitX <- fit(x, "gamma",
            ini = list(list(shape = 1),
                       list(shape = 2),
                       list(shape = 3)),
            thin = 1, burnin = 1000)
#Fit a scaled t distribution
x <- optimalThreshold:::rt.scaled(250, df = 4, mean = 2.5, sd = 2)
fitX <- fit(x, "t",
            ini = list(list(mu = 1, sd = 1, df = 2),
                       list(mu = 2, sd = 2, df = 4),
                       list(mu = 3, sd = 3, df = 6)),
            thin = 1, burnin = 1000, model = NULL)
#Fit a logistic distribution
x <- rlogis(250)
fitX <- fit(x, "logis",
            ini = list(list(location = 0.3, scale = 0.5),
                       list(location = 1, scale = 1),
                       list(location = 2, scale = 2)),
            thin = 1, burnin = 1000, model = NULL)
#Specify which distribution is 'undefined'
x <- rnorm(250)
fitX <- fit(x, "undefined")
#Fit a user-defined normal distribution with informative priors
library(rjags)
x <- rnorm(250, mean = 2, sd = 1)
model <- "model
{
  mu ~ dunif(0, 4)
  log_sd ~ dunif(-1, 1)
  sd <- exp(log_sd)
  tau <- 1 / (sd^2)
  for (i in 1:N)
  {
    x[i] ~ dnorm(mu, tau)
  }
}
"
# Compile the model with two chains, each with its own initial values
modelJAGS <- jags.model(file = textConnection(model), data = list(x = x, N = length(x)),
                        inits = list(list(mu = 1, log_sd = -0.5), list(mu = 3.5, log_sd = 0.5)),
                        n.chains = 2, quiet = TRUE)
# Burn-in phase
update(modelJAGS, 1000, progress.bar = "text")
# Posterior sample of mu and log_sd
mcmcpara <- coda.samples(modelJAGS, c("mu", "log_sd"), n.iter = 2000, thin = 1)
# The renaming below is positional: check varnames(mcmcpara) before running it
varnames(mcmcpara) <- c("mu", "sd")
# Back-transform log_sd to sd in each chain
mcmcpara[[1]][, "sd"] <- exp(mcmcpara[[1]][, "sd"])
mcmcpara[[2]][, "sd"] <- exp(mcmcpara[[2]][, "sd"])
# cdf and hessian are written by hand; gradient reuses the package method
fitX <- fit(x, "user", paraNames = varnames(mcmcpara), mcmcList = mcmcpara,
            cdf = function(x, mu, sd) pnorm(x, mu, sd),
            gradient = getMethod(gradient, "normalDist"),
            hessian = function(x, mu, sd) ((mu - x) / sd^2) * dnorm(x, mu, sd))
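When writing cdf, gradient, and hessian by hand for a "user" defined distribution, it can help to check numerically that gradient is the first derivative of cdf and hessian is the second. The sketch below does this for the normal case of the last example using base R only. Here dnorm stands in for the package's gradient method, and num_deriv is a hypothetical helper written for this illustration, not part of the package.

```r
# Functions as they could be passed to fit() for a normal distribution
cdf      <- function(x, mu, sd) pnorm(x, mu, sd)
gradient <- function(x, mu, sd) dnorm(x, mu, sd)   # stand-in for the package method
hessian  <- function(x, mu, sd) ((mu - x) / sd^2) * dnorm(x, mu, sd)

# Hypothetical helper: central finite difference of f with respect to x
num_deriv <- function(f, x, ..., h = 1e-5) {
  (f(x + h, ...) - f(x - h, ...)) / (2 * h)
}

grid <- seq(-1, 5, by = 0.5)
# Largest absolute gap between the numerical and the analytical derivative
err_gradient <- max(abs(num_deriv(cdf, grid, mu = 2, sd = 1) - gradient(grid, 2, 1)))
err_hessian  <- max(abs(num_deriv(gradient, grid, mu = 2, sd = 1) - hessian(grid, 2, 1)))
c(err_gradient, err_hessian)
```

Both gaps should be close to zero, up to finite-difference error. A large value usually points to a sign or scaling mistake in the hand-written function.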