fit {optimalThreshold} | R Documentation |
Specify which distribution to fit on the marker values
Description
This function is a wrapper to create an S4 object to specify a distribution to fit the marker values.
Usage
fit(x, distr, ini = NULL, thin = NULL, burnin = NULL, model = NULL,
    paraNames = NULL, mcmcList = NULL, cdf = NULL, gradient = NULL,
    hessian = NULL)
Arguments
x |
a vector of marker values (NA values allowed, see Details). |
distr |
a character string that specifies the distribution to fit: "norm" (normal), "lnorm" (log-normal), "t" (scaled t), "gamma", "logis" (logistic), "user" (user-defined) or "undefined" (see Details). |
ini |
initial values for the parameters of the marker distribution, given as a list that contains one named list per MCMC chain. NULL for "norm" and "lnorm". |
thin |
the thinning interval between consecutive observations. NULL for "norm" and "lnorm". |
burnin |
a positive integer that defines the length of the burn-in iterations when performing the MCMC algorithm. NULL for "norm" and "lnorm". |
model |
a character string used to define the model. Must be a model definition compatible with JAGS. Used only for the t and logistic distributions; if NULL, a default model with vague priors is used (see Details). |
paraNames |
a character vector containing the names of the parameters of the submitted distribution. Should be provided only for "user" defined distribution. |
mcmcList |
an object of class mcmc.list where each list contains an MCMC chain. To be provided only for "user" defined distribution. |
cdf |
a function that characterizes the cumulative distribution. To be provided only for "user" defined distribution (see Details). |
gradient |
a function that characterizes the probability density function, that is, the first derivative of cdf. To be provided only for "user" defined distribution (see Details). |
hessian |
a function that characterizes the first derivative of the probability density function. To be provided only for "user" defined distribution (see Details). |
Details
This function allows the user to specify which distribution should be fitted to the marker values. If NA values are present in the x argument passed to the function, a warning is produced. However, the user should not discard the NA values from the original data, because the length of the x argument is calculated internally to estimate the mean risk of event occurrence in each treatment arm. NA values are therefore managed internally by the function.
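The point about keeping NA values can be illustrated with a minimal base R sketch. It does not call the package and does not reproduce its internal code; the vector and the variable names n_total and n_observed are hypothetical and only show how removing NA values beforehand would change the length that the function relies on.

```r
# Hypothetical marker vector with two missing values
x <- c(1.2, NA, 0.7, 3.1, NA)

# Length of the vector as supplied: counts every entry, including NA
n_total <- length(x)

# Number of marker values that are actually observed
n_observed <- sum(!is.na(x))

# If NA values were discarded before calling fit(), only n_observed
# entries would remain and the information carried by n_total would be lost
x_dropped <- x[!is.na(x)]
length(x_dropped) == n_observed
```

Passing the full vector, NA values included, therefore keeps both quantities available.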
Five theoretical distributions are implemented by the package: normal, log-normal, gamma, scaled t, and logistic. It is also here that the user specifies which of the four marker distributions is of type 'undefined', in other words which distribution is expressed as a function of the three other distributions and the mean risks of event. Users may also define their own theoretical distribution. The details for each case are provided hereafter:
Fit a normal distribution: when specifying distr="norm" you fit a normal distribution to the marker values passed to the x argument of the function. Non-informative priors are used (p(\mu,\sigma^2) \propto (\sigma^2)^{-1}). Posterior values of the normal distribution parameters are sampled directly from the exact posterior distributions. If you don't want to use non-informative priors, see the explanation on how to fit a user-defined distribution.
Fit a log-normal distribution: when specifying
distr="lnorm" you fit a log-normal distribution to the marker values passed to the x argument of the function. Non-informative priors are used (p(\mu,\sigma^2) \propto (\sigma^2)^{-1}). Posterior values of the log-normal distribution parameters are sampled directly from the exact posterior distributions. If you don't want to use non-informative priors, see the explanation on how to fit a user-defined distribution.
Fit a gamma distribution: when specifying
distr="gamma" you fit a gamma distribution to the marker values passed to the x argument of the function. Non-informative priors are used (p(shape,scale) \propto 1/scale). Posterior values of the gamma distribution parameters are sampled using the ARS method. This method requires a list of initial values passed to the ini argument of the function. Each element of this list must be a list with one element named 'shape'. It also requires the thinning interval of the MCMC chain, passed to the thin argument, and the length of the burn-in phase, passed to the burnin argument. If you don't want to use non-informative priors, see the explanation on how to fit a user-defined distribution.
Fit a scaled t distribution: when specifying
distr="t" you fit a scaled t distribution to the marker values passed to the x argument of the function. Posterior values of the scaled t distribution parameters are sampled using an MCMC algorithm through the JAGS software, so the function requires the user to provide the JAGS model as a character string through the model argument of the function. If NULL, a model with vague priors is provided to the function automatically:
mu ~ U(min(x),max(x))
log(sd) ~ U(-10,10)
1/df ~ U(0,1)
This method requires a list of initial values passed to the ini argument of the function. Each element of this list must be a list with three elements named 'mu', 'sd', and 'df'. It also requires the thinning interval of the MCMC chain, passed to the thin argument, and the length of the burn-in phase, passed to the burnin argument.
Fit a logistic distribution: when specifying
distr="logis" you fit a logistic distribution to the marker values passed to the x argument of the function. Posterior values of the logistic distribution parameters are sampled using an MCMC algorithm through the JAGS software, so the function requires the user to provide the JAGS model as a character string through the model argument of the function. If NULL, a model with vague priors is provided to the function automatically:
location ~ U(min(x),max(x))
log(scale) ~ U(-10,10)
This method requires a list of initial values passed to the ini argument of the function. Each element of this list must be a list with two elements named 'location' and 'scale'. It also requires the thinning interval of the MCMC chain, passed to the thin argument, and the length of the burn-in phase, passed to the burnin argument.
Fit a user-defined distribution: when specifying
distr="user" you fit a user-defined distribution to the marker values passed to the x argument of the function. First, the user must give the parameter names as a character vector in the paraNames argument of the function. Then, the user provides a posterior sample of the parameters of the distribution, obtained using JAGS or other software, as an object of class mcmc.list passed to the mcmcList argument of the function (this implies that the user performed the Bayesian inference themselves). Note that the variable names of the mcmc.list object must match the names given in the paraNames argument. Finally, the user must specify the cdf, gradient, and hessian functions associated with the fitted distribution. The cdf function is the cumulative distribution function that is fitted to the marker values, the gradient function is its first derivative, which corresponds to the probability density function fitted to the marker values, and the hessian function is the second derivative of cdf. When the fitted distribution is a supported distribution (e.g. a normal distribution with informative priors), the user may use getMethod(cdf,"normalDist") to reuse the standard method for the normal distribution implemented in the package. When the fitted distribution is not supported, the user must specify the cdf directly, for example as function(x,mu,sd) pnorm(x,mu,sd) (to keep the example of the normal distribution). The same idea may be used for the gradient and hessian functions (see the Examples for more details).
Specify which marker distribution is expressed as a function of the three others and the mean risks of event using distr="undefined".
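To make the phrase "sampled directly from the exact posterior distributions" more concrete for the normal case, here is a minimal base R sketch of the standard textbook result under the prior p(\mu,\sigma^2) \propto (\sigma^2)^{-1}: \sigma^2 given x follows a scaled inverse chi-square distribution with n - 1 degrees of freedom, and \mu given \sigma^2 and x is normal with mean equal to the sample mean and variance \sigma^2 / n. Whether the package follows exactly these steps internally is an assumption; the names n_draws, sigma2 and post are hypothetical.

```r
set.seed(1)

# Simulated marker values (illustration only)
x <- rnorm(250, mean = 2, sd = 1)

# Sufficient statistics computed on the observed values
x_obs <- x[!is.na(x)]
n     <- length(x_obs)
xbar  <- mean(x_obs)
s2    <- var(x_obs)

# Number of posterior draws (hypothetical choice)
n_draws <- 2000

# sigma^2 | x ~ scaled inverse chi-square(n - 1, s2)
sigma2 <- (n - 1) * s2 / rchisq(n_draws, df = n - 1)

# mu | sigma^2, x ~ Normal(xbar, sigma^2 / n)
mu <- rnorm(n_draws, mean = xbar, sd = sqrt(sigma2 / n))

# Posterior sample on the (mu, sd) scale
post <- cbind(mu = mu, sd = sqrt(sigma2))
head(post)
```

Because both conditional distributions have closed forms, no burn-in or thinning is involved, which is consistent with ini, thin and burnin being NULL for "norm" and "lnorm".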
Value
Returns an object to be passed to the trtSelThresh and diagThresh functions.
See Also
trtSelThresh and diagThresh.
Examples
#Fit a normal distribution
x <- rnorm(250)
fitX <- fit(x, "norm")
#Fit a log-normal distribution
x <- rlnorm(250)
fitX <- fit(x, "lnorm")
#Fit a gamma distribution
x <- rgamma(250, shape = 2, scale = 1.2)
fitX <- fit(x, "gamma",
            ini = list(list(shape = 1),
                       list(shape = 2),
                       list(shape = 3)),
            thin = 1, burnin = 1000)
#Fit a scaled t distribution
x <- optimalThreshold:::rt.scaled(250, df = 4, mean = 2.5, sd = 2)
fitX <- fit(x, "t",
            ini = list(list(mu = 1, sd = 1, df = 2),
                       list(mu = 2, sd = 2, df = 4),
                       list(mu = 3, sd = 3, df = 6)),
            thin = 1, burnin = 1000, model = NULL)
#Fit a logistic distribution
x <- rlogis(250)
fitX <- fit(x, "logis",
            ini = list(list(location = 0.3, scale = 0.5),
                       list(location = 1, scale = 1),
                       list(location = 2, scale = 2)),
            thin = 1, burnin = 1000, model = NULL)
#Specify which distribution is 'undefined'
x <- rnorm(250)
fitX <- fit(x, "undefined")
#Fit a user-defined normal distribution with informative priors
library(rjags)
x <- rnorm(250, mean = 2, sd = 1)
model <- "model
  {
    mu ~ dunif(0, 4)
    log_sd ~ dunif(-1, 1)
    sd <- exp(log_sd)
    tau <- 1 / (sd^2)
    for (i in 1:N)
    {
      x[i] ~ dnorm(mu, tau)
    }
  }
"
#Compile the model with two chains, each with its own initial values
modelJAGS <- jags.model(file = textConnection(model), data = list(x = x, N = length(x)),
                        inits = list(list(mu = 1, log_sd = -0.5), list(mu = 3.5, log_sd = 0.5)),
                        n.chains = 2, quiet = TRUE)
#Burn-in phase, then posterior sampling
update(modelJAGS, 1000, progress.bar = "text")
mcmcpara <- coda.samples(modelJAGS, c("mu", "log_sd"), n.iter = 2000, thin = 1)
#Rename the monitored variables, then back-transform log_sd to the sd scale in each chain
varnames(mcmcpara) <- c("mu", "sd")
mcmcpara[[1]][, "sd"] <- exp(mcmcpara[[1]][, "sd"])
mcmcpara[[2]][, "sd"] <- exp(mcmcpara[[2]][, "sd"])
#cdf and hessian are written by hand, gradient reuses the package method
fitX <- fit(x, "user", paraNames = varnames(mcmcpara), mcmcList = mcmcpara,
            cdf = function(x, mu, sd) pnorm(x, mu, sd),
            gradient = getMethod(gradient, "normalDist"),
            hessian = function(x, mu, sd) ((mu - x) / sd^2) * dnorm(x, mu, sd))
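When cdf, gradient and hessian are written by hand for a user-defined distribution, it can help to check numerically that gradient is the first derivative of cdf and that hessian is the second. The base R sketch below does this with central finite differences. The helper num_deriv and the names cdf_fun, grad_fun and hess_fun are hypothetical, and dnorm is used here as a stand-in for the package's normalDist gradient method.

```r
# Hand-written functions for a normal distribution
cdf_fun  <- function(x, mu, sd) pnorm(x, mu, sd)
grad_fun <- function(x, mu, sd) dnorm(x, mu, sd)
hess_fun <- function(x, mu, sd) ((mu - x) / sd^2) * dnorm(x, mu, sd)

# Central finite-difference approximation of df/dx (hypothetical helper)
num_deriv <- function(f, x, ..., h = 1e-5) {
  (f(x + h, ...) - f(x - h, ...)) / (2 * h)
}

# Grid of marker values and parameter values used for the check
grid <- seq(-1, 5, by = 0.5)

# Largest absolute gap between numerical and analytical derivatives
err_grad <- max(abs(num_deriv(cdf_fun, grid, mu = 2, sd = 1) - grad_fun(grid, 2, 1)))
err_hess <- max(abs(num_deriv(grad_fun, grid, mu = 2, sd = 1) - hess_fun(grid, 2, 1)))

c(err_grad = err_grad, err_hess = err_hess)
```

Both gaps should be close to zero; a large value would suggest a sign or scaling error in one of the supplied functions.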