HmiscOverview {Hmisc}  R Documentation 
Overview of Hmisc Library
Description
The Hmisc library contains many functions useful for data analysis, high-level graphics, utility operations, computing sample size and power, translating SAS datasets into R, imputing missing values, advanced table making, variable clustering, character string manipulation, converting R objects to LaTeX code, recoding variables, and bootstrap repeated measures analysis. Most of these functions were written by F Harrell, but a few were collected from statlib and from s-news; other authors are indicated below. This collection of functions includes all of Harrell's submissions to statlib other than the functions in the rms and display libraries. A few of the functions do not have “Help” documentation.
To make Hmisc load silently, issue options(Hverbose=FALSE) before library(Hmisc).
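A minimal sketch of the silent-load sequence described above. The requireNamespace() guard is an addition for illustration, so that the snippet also runs where Hmisc is not installed; whether the Hverbose option is still consulted may depend on the installed Hmisc version.

```r
# Set the option first; per the text above it must precede library(Hmisc).
options(Hverbose = FALSE)

# Guard (an assumption of this sketch, not part of the original instructions):
# attach Hmisc only when it is installed.
if (requireNamespace("Hmisc", quietly = TRUE)) {
  library(Hmisc)
}

# Record the option value so it can be inspected later in the session.
hverbose <- getOption("Hverbose")
```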
Functions
Function Name  Purpose 
abs.error.pred  Computes various indexes of predictive accuracy based 
on absolute errors, for linear models  
addMarginal  Add marginal observations over selected variables 
all.is.numeric  Check if character strings are legal numerics 
approxExtrap  Linear extrapolation 
aregImpute  Multiple imputation based on additive regression, 
bootstrapping, and predictive mean matching  
areg.boot  Nonparametrically estimate transformations for both 
sides of a multiple additive regression, and  
bootstrap these estimates and R^2 

ballocation  Optimum sample allocations in 2-sample proportion test 
binconf  Exact confidence limits for a proportion and more accurate 
(narrower!) score stat.-based Wilson interval  
(Rollin Brant, mod. FEH)  
bootkm  Bootstrap Kaplan-Meier survival or quantile estimates 
bpower  Approximate power of 2-sided test for 2 proportions 
Includes bpower.sim for exact power by simulation  
bpplot  Box-percentile plot 
(Jeffrey Banfield, umsfjban@bill.oscs.montana.edu)  
bpplotM  Chart extended box plots for multiple variables 
bsamsize  Sample size requirements for test of 2 proportions 
bystats  Statistics on a single variable by levels of >=1 factors 
bystats2  2-way statistics 
character.table  Shows numeric equivalents of all Latin characters 
Useful for putting many special chars. in graph titles  
(Pierre Joyet, pierre.joyet@bluewin.ch)  
ciapower  Power of Cox interaction test 
cleanup.import  More compactly store variables in a data frame, and clean up 
problem data when e.g. Excel spreadsheet had a non-numeric  
value in a numeric column  
combine.levels  Combine infrequent levels of a categorical variable 
confbar  Draws confidence bars on an existing plot using multiple 
confidence levels distinguished using color or gray scale  
contents  Print the contents (variables, labels, etc.) of a data frame 
cpower  Power of Cox 2-sample test allowing for noncompliance 
Cs  Vector of character strings from list of unquoted names 
csv.get  Enhanced importing of comma-separated files, including variable labels 
cut2  Like cut with better endpoint label construction and allows 
construction of quantile groups or groups with given n  
datadensity  Snapshot graph of distributions of all variables in 
a data frame. For continuous variables uses scat1d.  
dataRep  Quantify representation of new observations in a database 
ddmmmyy  SAS “date7” output format for a chron object 
deff  Kish design effect and intracluster correlation 
describe  Function to describe different classes of objects. 
Invoke by saying describe(object). It calls one of the  
following:  
describe.data.frame  Describe all variables in a data frame (generalization 
of SAS UNIVARIATE)  
describe.default  Describe a variable (generalization of SAS UNIVARIATE) 
dotchart3  A more flexible version of dotchart 
Dotplot  Enhancement of Trellis dotplot allowing for matrix 
xvar., auto generation of Key function, superposition  
drawPlot  Simple mouse-driven drawing program, including a function 
for fitting Bezier curves  
Ecdf  Empirical cumulative distribution function plot 
errbar  Plot with error bars (Charles Geyer, U. Chi., mod FEH) 
event.chart  Plot general event charts (Jack Lee, jjlee@mdanderson.org, 
Ken Hess, Joel Dubin; Am Statistician 54:63-70, 2000)  
event.history  Event history chart with time-dependent cov. status 
(Joel Dubin, jdubin@uwaterloo.ca)  
find.matches  Find matches (with tolerances) between columns of 2 matrices 
first.word  Find the first word in an R expression (R Heiberger) 
fit.mult.impute  Fit most regression models over multiple transcan imputations, 
compute imputation-adjusted variances and avg. betas  
format.df  Format a matrix or data frame with much user control 
(R Heiberger and FE Harrell)  
ftupwr  Power of 2-sample binomial test using Fleiss, Tytun, Ury 
ftuss  Sample size for 2-sample binomial test using Fleiss, Tytun, Ury 
(Both by Dan Heitjan, dheitjan@biostats.hmc.psu.edu)  
gbayes  Bayesian posterior and predictive distributions when both 
the prior and the likelihood are Gaussian  
getHdata  Fetch and list datasets on our web site 
hdquantile  Harrell-Davis nonparametric quantile estimator with s.e. 
histbackback  Back-to-back histograms (Pat Burns, Salomon Smith 
Barney, London, pburns@dorado.sbi.com)  
hist.data.frame  Matrix of histograms for all numeric vars. in data frame 
Use hist.data.frame(data.frame.name)  
histSpike  Add high-resolution spike histograms or density estimates 
to an existing plot  
hoeffd  Hoeffding's D test (omnibus test of independence of X and Y) 
impute  Impute missing data (generic method) 
interaction  More flexible version of built-in function 
is.present  Tests for non-blank character values or non-NA numeric values 
james.stein  James-Stein shrinkage estimates of cell means from raw data 
labcurve  Optimally label a set of curves that have been drawn on 
an existing plot, on the basis of gaps between curves.  
Also position legends automatically at emptiest rectangle.  
label  Set or fetch a label for an R object 
Lag  Lag a vector, padding on the left with NA or '' 
latex  Convert an R object to LaTeX (R Heiberger & FE Harrell) 
list.tree  Pretty-print the structure of any data object 
(Alan Zaslavsky, zaslavsk@hcp.med.harvard.edu)  
Load  Enhancement of load 
mask  8-bit logical representation of a short integer value 
(Rick Becker)  
matchCases  Match each case on one continuous variable 
matxv  Fast matrix * vector, handling intercept(s) and NAs 
mgp.axis  Version of axis() that uses appropriate mgp from 
mgp.axis.labels and gets around bug in axis(2, ...)  
that causes it to assume las=1  
mgp.axis.labels  Used by survplot and plot in rms library (and other 
functions in the future) so that different spacing  
between tick marks and axis tick mark labels may be  
specified for x- and y-axes.  
Use mgp.axis.labels('default') to set defaults.  
Users can set values manually using  
mgp.axis.labels(x,y) where x and y are 2nd value of  
par('mgp') to use. Use mgp.axis.labels(type=w) to  
retrieve values, where w='x', 'y', 'x and y', 'xy',  
to get 3 mgp values (first 3 types) or 2 mgp.axis.labels.  
minor.tick  Add minor tick marks to an existing plot 
mtitle  Add outer titles and subtitles to a multiple plot layout 
multLines  Draw multiple vertical lines at each x 
in a line plot  
%nin%  Opposite of %in% 
nobsY  Compute no. non-NA observations for left hand formula side 
nomiss  Return a matrix after excluding any row with an NA 
panel.bpplot  Panel function for trellis bwplot: box-percentile plots 
panel.plsmo  Panel function for trellis xyplot: uses plsmo 
pBlock  Block variables for certain lattice charts 
pc1  Compute first prin. component and get coefficients on 
original scale of variables  
plotCorrPrecision  Plot precision of estimate of correlation coefficient 
plsmo  Plot smoothed x vs. y with labeling and exclusion of NAs 
Also allows a grouping variable and plots unsmoothed data  
popower  Power and sample size calculations for ordinal responses 
(two treatments, proportional odds model)  
prn  prn(expression) does print(expression) but titles the 
output with 'expression'. Do prn(expression,txt) to add  
a heading (‘txt’) before the ‘expression’ title  
pstamp  Stamp a plot with date in lower right corner (pstamp()) 
Add ,pwd=T and/or ,time=T to add current directory  
name or time  
Put additional text for label as first argument, e.g.  
pstamp('Figure 1') will draw 'Figure 1 date'  
putKey  Different way to use key() 
putKeyEmpty  Put key at most empty part of existing plot 
rcorr  Pearson or Spearman correlation matrix with pairwise deletion 
of missing data  
rcorr.cens  Somers' Dxy rank correlation with censored data 
rcorrp.cens  Assess difference in concordance for paired predictors 
rcspline.eval  Evaluate restricted cubic spline design matrix 
rcspline.plot  Plot spline fit with nonparametric smooth and grouped estimates 
rcspline.restate  Restate restricted cubic spline in unrestricted form, and 
create TeX expression to print the fitted function  
reShape  Reshape a matrix into 3 vectors, reshape serial data 
rm.boot  Bootstrap spline fit to repeated measurements model, 
with simultaneous confidence region - least  
squares using spline function in time  
rMultinom  Generate multinomial random variables with varying prob. 
samplesize.bin  Sample size for 2-sample binomial problem 
(Rick Chappell, chappell@stat.wisc.edu)  
sas.get  Convert SAS dataset to S data frame 
sasxport.get  Enhanced importing of SAS transport dataset in R 
Save  Enhancement of save 
scat1d  Add 1-dimensional scatterplot to an axis of an existing plot 
(like bar-codes, FEH/Martin Maechler,  
maechler@stat.math.ethz.ch/Jens Oehlschlaegel-Akiyoshi,  
oehl@psyres-stuttgart.de)  
score.binary  Construct a score from a series of binary variables or 
expressions  
sedit  A set of character handling functions written entirely 
in R. sedit() does much of what the UNIX sed  
program does. Other functions included are  
substring.location, substring<-, replace.string.wild,  
and functions to check if a string is numeric or  
contains only the digits 0-9  
setTrellis  Set Trellis graphics to use blank conditioning panel strips, 
line thickness 1 for dot plot reference lines:  
setTrellis(); 3 optional arguments  
show.col  Show colors corresponding to col=0,1,...,99 
show.pch  Show all plotting characters specified by pch=. 
Just type show.pch() to draw the table on the  
current device.  
showPsfrag  Use LaTeX to compile, and dvips and ghostview to 
display a postscript graphic containing psfrag strings  
solvet  Version of solve with argument tol passed to qr 
somers2  Somers' rank correlation and c-index for binary y 
spearman  Spearman rank correlation coefficient spearman(x,y) 
spearman.test  Spearman 1 d.f. and 2 d.f. rank correlation test 
spearman2  Spearman multiple d.f. \rho^2, adjusted \rho^2, Wilcoxon-Kruskal-Wallis 
test, for multiple predictors  
spower  Simulate power of 2-sample test for survival under 
complex conditions  
Also contains the Gompertz2, Weibull2, Lognorm2 functions.  
spss.get  Enhanced importing of SPSS files using read.spss function 
src  src(name) = source("name.s") with memory 
store  store an object permanently (easy interface to assign function) 
strmatch  Shortest unique identifier match 
(Terry Therneau, therneau@mayo.edu)  
subset  More easily subset a data frame 
substi  Substitute one var for another when observations NA 
summarize  Generate a data frame containing stratified summary 
statistics. Useful for passing to trellis.  
summary.formula  General table making and plotting functions for summarizing 
data  
summaryD  Summarizing using user-provided formula and dotchart3 
summaryM  Replacement for summary.formula(..., method='reverse') 
summaryP  Multi-panel dot chart for summarizing proportions 
summaryS  Summarize multiple response variables for multi-panel 
dot chart or scatterplot  
summaryRc  Summary for continuous variables using lowess 
symbol.freq  X-Y frequency plot with circles' area prop. to frequency 
sys  Execute unix() or dos() depending on what's running 
tabulr  Front-end to tabular function in the tables package 
tex  Enclose a string with the correct syntax for using 
with the LaTeX psfrag package, for postscript graphics  
transace  ace() packaged for easily automatically transforming all 
variables in a matrix  
transcan  automatic transformation and imputation of NAs for a 
series of predictor variables  
trap.rule  Area under curve defined by arbitrary x and y vectors, 
using trapezoidal rule  
trellis.strip.blank  To make the strip titles in trellis more visible, you can 
make the backgrounds blank by saying trellis.strip.blank().  
Use before opening the graphics device.  
t.test.cluster  2-sample t-test for cluster-randomized observations 
uncbind  Form individual variables from a matrix 
upData  Update a data frame (change names, labels, remove vars, etc.) 
units  Set or fetch "units" attribute - units of measurement for var. 
varclus  Graph hierarchical clustering of variables using squared 
Pearson or Spearman correlations or Hoeffding D as similarities  
Also includes the naclus function for examining similarities in  
patterns of missing values across variables.  
wtd.mean  
wtd.var  
wtd.quantile  
wtd.Ecdf  
wtd.table  
wtd.rank  
wtd.loess.noiter  
num.denom.setup  Set of functions for obtaining weighted estimates 
xy.group  Compute mean x vs. function of y by groups of x 
xYplot  Like trellis xyplot but supports error bars and multiple 
response variables that are connected as separate lines  
ynbind  Combine a series of yes/no true/false present/absent variables into a matrix 
zoom  Zoom in on any graphical display 
(Bill Dunlap, bill@statsci.com) 
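As a brief illustration of how a few of the small utilities in the table above are called, here is a minimal sketch. The input values and the variable names (vars, outside, auc) are made up for this example, and the requireNamespace() guard is an addition so that the snippet is skipped where Hmisc is not installed.

```r
# Illustrative only: inputs and variable names are invented for this sketch.
if (requireNamespace("Hmisc", quietly = TRUE)) {
  library(Hmisc)

  # Cs: character vector from unquoted names.
  vars <- Cs(age, sex, height)

  # %nin%: opposite of %in% (TRUE where the left value is absent on the right).
  outside <- c(1, 5, 9) %nin% c(1, 2, 3)

  # trap.rule: area under the curve through (x, y) by the trapezoidal rule;
  # for y = x on [0, 2] this area is 2.
  auc <- trap.rule(c(0, 1, 2), c(0, 1, 2))

  print(vars)
  print(outside)
  print(auc)
}
```

Each of these functions has its own help page (for example ?Cs or ?trap.rule) that describes further arguments.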
Copyright Notice
GENERAL DISCLAIMER
This program is free software; you can redistribute it
and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation; either
version 2, or (at your option) any later version.
This program is distributed in the hope that it will be
useful, but WITHOUT ANY WARRANTY; without even the implied
warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
PURPOSE. See the GNU General Public License for more
details.
In short: You may use it any way you like, as long as you
don't charge money for it, remove this notice, or hold anyone liable
for its results. Also, please acknowledge the source and communicate
changes to the author.
If this software is used in work presented for publication, kindly
reference it using for example:
Harrell FE (2014): Hmisc: A package of miscellaneous R functions.
Programs available from https://hbiostat.org/R/Hmisc/.
Be sure to reference R itself and other libraries used.
Author(s)
Frank E Harrell Jr
Professor of Biostatistics
Vanderbilt University School of Medicine
Nashville, Tennessee
fh@fharrell.com
References
See Alzola CF, Harrell FE (2004): An Introduction to S and the Hmisc and Design Libraries at https://hbiostat.org/R/doc/sintro.pdf for extensive documentation and examples for the Hmisc package.