R: Miscellaneous Design Attributes and Utility Functions

rmsMisc {rms}

R Documentation

Miscellaneous Design Attributes and Utility Functions

Description

These functions are used internally by anova.rms, fastbw, etc., to retrieve various attributes of a design. These functions allow some fitting functions not in the rms series (e.g., lm, glm) to be used with rms.Design, fastbw, and similar functions.

For vcov, there are several functions. The method for orm fits is a bit different because the covariance matrix stored in the fit object only deals with the middle intercept. See the intercepts argument for more options. There is a method for lrm that also allows non-default intercept(s) to be selected (default is first).
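
As a rough illustration of the intercepts argument, the sketch below fits a three-level proportional odds model with lrm and compares the dimensions of the returned matrices. The simulated data, seed, and variable names are assumptions made for the example, and the block does nothing if rms is not installed.

```r
# Simulated data; the seed and all variable names are illustrative assumptions
if (requireNamespace("rms", quietly = TRUE)) {
  library(rms)
  set.seed(1)
  n  <- 200
  x1 <- rnorm(n)
  x2 <- rnorm(n)
  # Three ordered response levels, so the fit has two intercepts
  y  <- cut(x1 + 0.5 * x2 + rnorm(n), c(-Inf, -0.5, 0.5, Inf), labels = FALSE)
  f  <- lrm(y ~ x1 + x2)

  dim(vcov(f))                       # all intercepts plus slopes
  dim(vcov(f, intercepts = "none"))  # slopes only
  dim(vcov(f, intercepts = 1))       # first intercept plus slopes
}
```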

The oos.loglik function for each type of model implemented computes the -2 log likelihood for out-of-sample data (i.e., data not necessarily used to fit the model) evaluated at the parameter estimates from a model fit. Vectors for the model's linear predictors and response variable must be given. oos.loglik is used primarily by bootcov.
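
To make the idea of an out-of-sample -2 log likelihood concrete, here is a minimal base R sketch for the binary logistic case. The helper m2ll_binary is hypothetical and is not the rms implementation; it only shows the quantity being computed from a linear predictor vector and a response vector.

```r
# Hypothetical helper, not part of rms: -2 log likelihood of a binary
# logistic model evaluated at a supplied linear predictor
m2ll_binary <- function(lp, y) {
  p <- plogis(lp)
  -2 * sum(y * log(p) + (1 - y) * log(1 - p))
}

# Made-up training and test data
set.seed(2)
train   <- data.frame(x = rnorm(100))
train$y <- rbinom(100, 1, plogis(train$x))
test    <- data.frame(x = rnorm(50))
test$y  <- rbinom(50, 1, plogis(test$x))

fit    <- glm(y ~ x, family = binomial, data = train)
lp_new <- predict(fit, newdata = test, type = "link")
m2ll_binary(lp_new, test$y)   # -2 log likelihood on data not used in fitting

# Sanity check: on the training data this equals the glm deviance
in_sample <- m2ll_binary(predict(fit, type = "link"), train$y)
all.equal(in_sample, deviance(fit))
```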

The Getlim function retrieves distribution summaries from the fit or from a datadist object. It handles getting summaries from both sources to fill in characteristics for variables that were not defined during the model fit. Getlimi returns the summary for an individual model variable.

Mean is a generic function that creates an R function that calculates the expected value of the response variable given a fit from rms or rmsb.

The related.predictors function returns a list containing variable numbers that are directly or indirectly related to each predictor. The interactions.containing function returns indexes of interaction effects containing a given predictor. The param.order function returns a vector of logical indicators for whether parameters are associated with certain types of effects (nonlinear, interaction, nonlinear interaction). combineRelatedPredictors creates a list of inter-connected main effects and interactions for use with predictrms with type='ccterms' (useful for gIndex).

The Penalty.matrix function builds a default penalty matrix for non-intercept term(s) for use in penalized maximum likelihood estimation. The Penalty.setup function takes a constant or list describing penalty factors for each type of term in the model and generates the proper vector of penalty multipliers for the current model.

logLik.rms returns the maximized log likelihood for the model, whereas AIC.rms returns the AIC. The latter function has an optional argument for computing AIC on a "chi-square" scale (model likelihood ratio chi-square minus twice the regression degrees of freedom). logLik.ols handles the case for ols, just by invoking logLik.lm in the stats package. logLik.Gls is also defined.
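
The two AIC scales can be compared with plain arithmetic. The sketch below uses base R glm fits on made-up data rather than rms fits, so it illustrates the formulas only and not the AIC.rms code. On the chi-square scale larger values are better, and the value equals the AIC of the intercept-only model minus the AIC of the fitted model.

```r
# Made-up data; base R glm stands in for an rms fit
set.seed(3)
d   <- data.frame(x1 = rnorm(150), x2 = rnorm(150))
d$y <- rbinom(150, 1, plogis(0.8 * d$x1))
full_fit <- glm(y ~ x1 + x2, family = binomial, data = d)
null_fit <- glm(y ~ 1,       family = binomial, data = d)

k <- 2

# Log-likelihood scale: -2 log L plus k times the number of parameters,
# counting the intercept
ll         <- as.numeric(logLik(full_fit))
n_par      <- attr(logLik(full_fit), "df")
aic_loglik <- -2 * ll + k * n_par

# Chi-square scale: model LR chi-square minus k times the regression d.f.
lr_chisq  <- full_fit$null.deviance - full_fit$deviance
reg_df    <- full_fit$df.null - full_fit$df.residual
aic_chisq <- lr_chisq - k * reg_df

all.equal(aic_loglik, AIC(full_fit))
all.equal(aic_chisq, AIC(null_fit) - AIC(full_fit))
```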

nobs.rms returns the number of observations used in the fit.

The lrtest function does likelihood ratio tests for two nested models, from fits that have stats components with "Model L.R." values. For models such as psm, survreg, ols, lm which have scale parameters, it is assumed that the scale parameter for the smaller model is fixed at the estimate from the larger model (see the example).
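
The arithmetic behind a likelihood ratio test of nested models can be sketched in base R as follows. This is not the lrtest source: the helpers model_lr and model_df are hypothetical stand-ins for the "Model L.R." and d.f. values stored in rms fits, and the data are simulated. Taking absolute differences means the order of the two fits does not matter.

```r
# Made-up data and base R glm fits standing in for rms fits
set.seed(4)
d     <- data.frame(x1 = rnorm(200), x2 = rnorm(200))
d$y   <- rbinom(200, 1, plogis(0.5 * d$x1 + 0.5 * d$x2))
big   <- glm(y ~ x1 + x2, family = binomial, data = d)
small <- glm(y ~ x1,      family = binomial, data = d)

# Hypothetical helpers: model LR chi-square and its degrees of freedom
model_lr <- function(fit) fit$null.deviance - fit$deviance
model_df <- function(fit) fit$df.null - fit$df.residual

chisq <- abs(model_lr(big) - model_lr(small))
df    <- abs(model_df(big) - model_df(small))
p     <- pchisq(chisq, df, lower.tail = FALSE)
c(chisq = chisq, df = df, p = p)

# Cross-check against the analysis of deviance table from stats
ref <- anova(small, big, test = "Chisq")$Deviance[2]
all.equal(chisq, ref)
```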

univarLR takes a multivariable model fit object from rms and re-fits a sequence of models containing one predictor at a time. It prints a table of likelihood ratio chi^2 statistics from these fits.

The Newlabels function is used to override the variable labels in a fit object. Likewise, Newlevels can be used to create a new fit object with levels of categorical predictors changed. These two functions are especially useful when constructing nomograms.

rmsArgs handles ... arguments to functions such as Predict, summary.rms, nomogram so that variables to vary may be specified without values (after an equals sign).

prModFit is the workhorse for the print methods for highest-level rms model fitting functions, handling regular, html, and LaTeX printing, the latter two resulting in html or LaTeX code written to the console, automatically ready for knitr. The work of printing summary statistics is done by prStats, which uses the Hmisc print.char.matrix function to print overall model statistics if options(prType=) was not set to "latex" or "html". Otherwise it generates customized LaTeX or html code. The LaTeX longtable and epic packages must be in effect to use LaTeX.

reListclean allows one to rename a subset of a named list, ignoring the previous names and not concatenating them as R does. It also removes NULL elements and (by default) elements that are NA, as when an optional named element is fetched that doesn't exist. It has an argument dec whose elements are correspondingly removed, then dec is appended to the result vector.

formatNP is a function to format a vector of numerics. If digits is specified, formatNP will make sure that the formatted representation has digits positions to the right of the decimal place. If lang="latex" it will translate any scientific notation to LaTeX math form. If lang="html" it will convert to html. If pvalue=TRUE, it will replace formatted values with "< 0.0001" (if digits=4).
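
The P-value flooring described above can be mimicked in a few lines of base R. The helper fmt_p is hypothetical and covers only plain-text output with no missing values; it is a sketch of the behaviour, not the formatNP source.

```r
# Hypothetical helper, not part of rms: fixed decimal places, with values
# below 10^(-digits) reported as "< " followed by that threshold
fmt_p <- function(x, digits = 4) {
  out       <- formatC(x, format = "f", digits = digits)
  threshold <- 10^(-digits)
  out[x < threshold] <- paste("<", formatC(threshold, format = "f", digits = digits))
  out
}

fmt_p(c(0.03217, 0.00002, 0.5))
# "0.0322"   "< 0.0001" "0.5000"
```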

latex.naprint.delete will, if appropriate, use LaTeX to draw a dot chart of frequency of variable NAs related to model fits. html.naprint.delete does the same thing in the RStudio R markdown context, using Hmisc::dotchartp (which uses plotly) for drawing any needed dot chart.

removeFormulaTerms removes one or more terms from a model formula, using strictly character manipulation. This handles problems such as [.terms removing offset() if you subset on anything. The function can also be used to remove the dependent variable(s) from the formula.
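
A character-based approach of this kind might look like the sketch below. The helper drop_call_terms is hypothetical and deliberately simple: it assumes right-hand-side terms are separated by + and that no term contains a + inside a function call. It is meant to show the idea, not to reproduce removeFormulaTerms.

```r
# Hypothetical helper, not part of rms: drop terms that are calls to the
# functions named in `which`, working on the deparsed formula text
drop_call_terms <- function(form, which) {
  rhs   <- paste(deparse(form[[length(form)]]), collapse = " ")
  parts <- trimws(strsplit(rhs, "+", fixed = TRUE)[[1]])
  # A term is dropped when it starts with one of the function names and "("
  pattern <- paste0("^(", paste(which, collapse = "|"), ")\\(")
  keep    <- !grepl(pattern, parts)
  new_rhs <- if (any(keep)) paste(parts[keep], collapse = " + ") else "1"
  lhs     <- if (length(form) == 3) paste(deparse(form[[2]]), collapse = " ") else ""
  as.formula(paste(lhs, "~", new_rhs), env = environment(form))
}

f2 <- drop_call_terms(y ~ x1 + offset(z) + cluster(id), which = "cluster")
f2   # the offset() term is retained; only cluster(id) is removed
```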

Usage

## S3 method for class 'rms'
vcov(object, regcoef.only=TRUE, intercepts='all', ...)
## S3 method for class 'cph'
vcov(object, regcoef.only=TRUE, ...)
## S3 method for class 'Glm'
vcov(object, regcoef.only=TRUE, intercepts='all', ...)
## S3 method for class 'Gls'
vcov(object, intercepts='all', ...)
## S3 method for class 'lrm'
vcov(object, regcoef.only=TRUE, intercepts='all', ...)
## S3 method for class 'ols'
vcov(object, regcoef.only=TRUE, ...)
## S3 method for class 'orm'
vcov(object, regcoef.only=TRUE, intercepts='mid', ...)
## S3 method for class 'psm'
vcov(object, regcoef.only=TRUE, ...)

# Given Design attributes and the number of intercepts, creates an
# R-format assign list
DesignAssign(atr, non.slopes, Terms)

oos.loglik(fit, ...)

## S3 method for class 'ols'
oos.loglik(fit, lp, y, ...)
## S3 method for class 'lrm'
oos.loglik(fit, lp, y, ...)
## S3 method for class 'cph'
oos.loglik(fit, lp, y, ...)
## S3 method for class 'psm'
oos.loglik(fit, lp, y, ...)
## S3 method for class 'Glm'
oos.loglik(fit, lp, y, ...)

Getlim(at, allow.null=FALSE, need.all=TRUE)
Getlimi(name, Limval, need.all=TRUE)

related.predictors(at, type=c("all","direct"))
interactions.containing(at, pred)
combineRelatedPredictors(at)
param.order(at, term.order)

Penalty.matrix(at, X)
Penalty.setup(at, penalty)

## S3 method for class 'Gls'
logLik(object, ...)
## S3 method for class 'ols'
logLik(object, ...)
## S3 method for class 'rms'
logLik(object, ...)
## S3 method for class 'rms'
AIC(object, ..., k=2, type=c('loglik', 'chisq'))
## S3 method for class 'rms'
nobs(object, ...)

lrtest(fit1, fit2)
## S3 method for class 'lrtest'
print(x, ...)

univarLR(fit)

Newlabels(fit, ...)
Newlevels(fit, ...)
## S3 method for class 'rms'
Newlabels(fit, labels, ...)
## S3 method for class 'rms'
Newlevels(fit, levels, ...)

prModFit(x, title, w, digits=4, coefs=TRUE, footer=NULL,
         lines.page=40, long=TRUE, needspace, subtitle=NULL, ...)

prStats(labels, w, lang=c("plain", "latex", "html"))

reListclean(..., dec=NULL, na.rm=TRUE)

formatNP(x, digits=NULL, pvalue=FALSE,
         lang=c("plain", "latex", "html"))

## S3 method for class 'naprint.delete'
latex(object, file="", append=TRUE, ...)

## S3 method for class 'naprint.delete'
html(object, ...)

removeFormulaTerms(form, which=NULL, delete.response=FALSE)

Arguments

`fit`	result of a fitting function
`object`	result of a fitting function
`regcoef.only`	For fits such as parametric survival models which have a final row and column of the covariance matrix for a non-regression parameter such as a log(scale) parameter, setting `regcoef.only=TRUE` causes only the first `p` rows and columns of the covariance matrix to be returned, where `p` is the length of `object$coef`.
`intercepts`	set to `"none"` to omit any rows and columns related to intercepts. Set to an integer scalar or vector to include particular intercept elements. Set to `'all'` to include all intercepts, or for `orm` to `"mid"` to use the default for `orm`. The default is to use the first for `lrm` and the median intercept for `orm`.
`at`	`Design` element of a fit
`pred`	index of a predictor variable (main effect)
`fit1`, `fit2`	fit objects from `lrm,ols,psm,cph` etc. It doesn't matter which fit object is the sub-model.
`lp`	linear predictor vector for `oos.loglik`. For proportional odds ordinal logistic models, this should have used the first intercept only. If `lp` and `y` are omitted, the -2 log likelihood for the original fit is returned.
`y`	values of a new vector of responses passed to `oos.loglik`.
`name`	the name of a variable in the model
`Limval`	an object returned by `Getlim`
`allow.null`	prevents `Getlim` from issuing an error message if no limits are found in the fit or in the object pointed to by `options(datadist=)`
`need.all`	set to `FALSE` to prevent `Getlim` or `Getlimi` from issuing an error message if data for a variable are not found
`type`	For `related.predictors`, set to `"direct"` to return lists of indexes of directly related factors only (those in interactions with the predictor). For `AIC.rms`, `type` specifies the basis on which to return AIC. The default is minus twice the maximized log likelihood plus `k` times the degrees of freedom counting intercept(s). Specify `type='chisq'` to get a penalized model likelihood ratio chi-square instead.
`term.order`	1 for all parameters, 2 for all parameters associated with either nonlinear or interaction effects, 3 for nonlinear effects (main or interaction), 4 for interaction effects, 5 for nonlinear interaction effects.
`X`	a design matrix, not including columns for intercepts
`penalty`	a vector or list specifying penalty multipliers for types of model terms
`k`	the multiplier of the degrees of freedom to be used in computing AIC. The default is 2.
`x`	a result of `lrtest`, or the result of a high-level model fitting function (for `prModFit`)
`labels`	a character vector specifying new labels for variables in a fit. To give new labels for all variables, you can specify `labels` of the form `labels=c("Age in Years","Cholesterol")`, where the list of new labels is assumed to be the length of all main effect-type variables in the fit and in their original order in the model formula. You may specify a named vector to give new labels in any order or for a subset of the variables, e.g., `labels=c(age="Age in Years",chol="Cholesterol")`. For `prStats`, `labels` is a list with major column headings, which can themselves be vectors that are then stacked vertically.
`levels`	a list of named vectors specifying new level labels for categorical predictors. This will override `parms` as well as `datadist` information (if available) that were stored with the fit.
`title`	a single character string used to specify an overall title for the regression fit, which is printed first by `prModFit`. Set to `""` to suppress the title.
`w`	For `prModFit`, a special list of lists, with each list element specifying information about a block of information to include in the `print.` output for a fit. For `prStats`, `w` is a list of statistics to print, elements of which can be vectors that are stacked vertically. Unnamed elements specify number of digits to the right of the decimal place to which to round (`NA` means use `format` without rounding, as with integers and floating point values). Negative values of `digits` indicate that the value is a P-value to be formatted with `formatNP`. Digits are recycled as needed.
`digits`	number of digits to the right of the decimal point, for formatting numeric values in printed output
`coefs`	specify `coefs=FALSE` to suppress printing the table of model coefficients, standard errors, etc. Specify `coefs=n` to print only the first `n` regression coefficients in the model.
`footer`	a character string to appear at the bottom of the regression model output
`file`	name of file to which to write model output
`append`	specify `append=FALSE` when using `file` and you want to start over instead of adding to an existing file.
`lang`	specifies the typesetting language: plain text, LaTeX, or html
`lines.page`	see `latex`
`long`	set to `FALSE` to suppress printing of formula and certain other model output
`needspace`	optional character string to insert inside a LaTeX needspace macro call before the statistics table and before the coefficient matrix, to avoid bad page splits. This assumes the LaTeX needspace style is available. Example: `needspace='6\baselineskip'` or `needspace='1.5in'`.
`subtitle`	optional vector of character strings containing subtitles that will appear under `title` but not bolded
`dec`	vector of decimal places used for rounding
`na.rm`	set to `FALSE` to keep `NA`s in the vector created by `reListclean`
`pvalue`	set to `TRUE` if you want values below 10 to the minus `digits` to be formatted to be less than that value
`form`	a formula object
`which`	a vector of one or more character strings specifying the names of functions that are called from a formula, e.g., `"cluster"`. By default no right-hand-side terms are removed.
`delete.response`	set to `TRUE` to remove the dependent variable(s) from the formula
`atr`, `non.slopes`, `Terms`	`Design` function attributes, number of intercepts, and `terms` object
`...`	other arguments. For `reListclean` this contains the elements being extracted. For `prModFit` this information is passed to the `Hmisc latexTabular` function when a block of output is a vector to be formatted in LaTeX.

Value

vcov returns a variance-covariance matrix. oos.loglik returns a scalar -2 log likelihood value. Getlim returns a list with components limits and values, either stored in fit or retrieved from the object created by datadist and pointed to in options(datadist=). related.predictors and combineRelatedPredictors return a list of vectors, and interactions.containing returns a vector. param.order returns a logical vector corresponding to non-strata terms in the model.

Penalty.matrix returns a symmetric matrix with dimension equal to the number of slopes in the model. For all but categorical predictor main effect elements, the matrix is diagonal with values equal to the variances of the columns of X. For segments corresponding to c-1 dummy variables for c-category predictors, it puts a (c-1) x (c-1) sub-matrix in the result that is constructed so that a quadratic form with Penalty.matrix in the middle computes the sum of squared differences in parameter values about the mean, including a portion for the reference cell in which the parameter is by definition zero.

Newlabels returns a new fit object with the labels adjusted.
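
The quadratic-form property stated for the categorical-predictor block of Penalty.matrix can be checked numerically. The matrix A below is one matrix with that property, derived from the description (identity minus a constant 1/c matrix); it is an illustration and may differ from what rms constructs internally, for example in scaling. The coefficient values are made up.

```r
# Illustration only: a (c-1) x (c-1) matrix whose quadratic form equals the
# sum of squared deviations of all c cell parameters about their mean
c_levels <- 4                          # 4 categories, hence 3 dummy variables
A <- diag(c_levels - 1) - matrix(1 / c_levels, c_levels - 1, c_levels - 1)

beta      <- c(0.7, -1.2, 0.4)         # made-up dummy-variable coefficients
all_cells <- c(0, beta)                # reference cell parameter is zero

quad_form <- drop(t(beta) %*% A %*% beta)
direct    <- sum((all_cells - mean(all_cells))^2)
all.equal(quad_form, direct)
```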

reListclean returns a vector of named (by its arguments) elements. formatNP returns a character vector.

removeFormulaTerms returns a formula object.

Examples

## Not run: 
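# Fit the larger model, then the smaller model with the scale
# parameter fixed at the estimate from the larger model
f <- psm(S ~ x1 + x2 + sex + race, dist='gau')
g <- psm(S ~ x1 + sex + race, dist='gau', 
         fixed=list(scale=exp(f$parms)))
lrtest(f, g)   # likelihood ratio test for x2

# Relabel x2 and rename levels of sex and race before drawing a nomogram
g <- Newlabels(f, c(x2='Label for x2'))
g <- Newlevels(g, list(sex=c('Male','Female'),race=c('B','W')))
nomogram(g)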

## End(Not run)