percentile {EdSurvey}    R Documentation

EdSurvey Percentiles


Calculates the percentiles of a numeric variable in an edsurvey.data.frame, a light.edsurvey.data.frame, or an edsurvey.data.frame.list.


percentile(
  variable,
  percentiles,
  data,
  weightVar = NULL,
  jrrIMax = 1,
  varMethod = c("jackknife", "Taylor"),
  alpha = 0.05,
  omittedLevels = TRUE,
  defaultConditions = TRUE,
  recode = NULL,
  returnVarEstInputs = FALSE,
  returnNumberOfPSU = FALSE,
  pctMethod = c("symmetric", "unbiased", "simple"),
  confInt = TRUE,
  dofMethod = c("JR", "WS")
)



variable: the character name of the variable for which percentiles are computed, typically a subject scale or subscale


percentiles: a numeric vector of percentiles in the range of 0 to 100 (inclusive)


data: an edsurvey.data.frame or an edsurvey.data.frame.list


weightVar: a character indicating the weight variable to use.


jrrIMax: a numeric value; when using the jackknife variance estimation method, the default estimation option, jrrIMax=1, uses the sampling variance from the first plausible value as the component for sampling variance estimation. The V_{jrr} term (see Statistical Methods Used in EdSurvey) can be estimated with any number of plausible values, and values larger than the number of plausible values on the survey (including Inf) will result in all plausible values being used. Higher values of jrrIMax lead to longer computing times and more accurate variance estimates.


varMethod: a character set to jackknife or Taylor that indicates the variance estimation method used when constructing the confidence intervals. The jackknife variance estimation method is always used to calculate the standard error.


alpha: a numeric value between 0 and 1 that sets the confidence level to 1 - alpha. The default, 0.05, yields a 95% confidence interval.


omittedLevels: a logical value. When set to the default value of TRUE, drops those levels of all factor variables that are specified in achievementVars and aggregateBy. Use print on an edsurvey.data.frame to see the omitted levels.


defaultConditions: a logical value. When set to the default value of TRUE, uses the default conditions stored in an edsurvey.data.frame to subset the data. Use print on an edsurvey.data.frame to see the default conditions.


recode: a list of lists to recode variables. Defaults to NULL. Can be set as recode = list(var1 = list(from = c("a", "b", "c"), to = "d")).


returnVarEstInputs: a logical value set to TRUE to return the inputs to the jackknife and imputation variance estimates, which allows for the computation of covariances between estimates.


returnNumberOfPSU: a logical value set to TRUE to return the number of primary sampling units (PSUs)


pctMethod: one of “unbiased”, “symmetric”, or “simple”. “unbiased” produces a weighted median unbiased percentile estimate, whereas “simple” uses a basic formula that matches previously published results. “symmetric” uses a more basic formula but requires that the percentile is symmetric to multiplying the quantity by negative one.


confInt: a logical value indicating whether the confidence interval should be returned


dofMethod: passed to DoFCorrection as the method argument
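
The list-of-lists shape expected by the recode argument can be illustrated with a small base R sketch. The helper apply_recode and the toy data frame are hypothetical and exist only for illustration; this is not how EdSurvey applies recodes internally.

```r
# Hypothetical helper (not EdSurvey code): apply a recode list of the form
# list(var = list(from = <old levels>, to = <new level>)) to a data.frame.
apply_recode <- function(df, recode) {
  for (v in names(recode)) {
    x <- as.character(df[[v]])
    # every value listed in `from` is replaced by the single `to` value
    x[x %in% recode[[v]]$from] <- recode[[v]]$to
    df[[v]] <- factor(x)
  }
  df
}

# toy data, invented for this sketch
df <- data.frame(var1 = c("a", "b", "c", "e"), stringsAsFactors = TRUE)
rc <- list(var1 = list(from = c("a", "b", "c"), to = "d"))
out <- apply_recode(df, rc)
levels(out$var1)
```

Here the levels "a", "b", and "c" collapse into "d", while "e" is left unchanged.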


Percentiles, their standard errors, and confidence intervals are calculated according to the vignette titled Statistical Methods Used in EdSurvey. The standard errors and confidence intervals are based on separate formulas and assumptions.
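
To make the idea of a weighted percentile concrete, the following base R sketch returns the smallest value whose cumulative weight share reaches the requested percentile. It is an assumption-laden illustration only: the helper weighted_percentile and the toy data are hypothetical, and the rule shown is not claimed to match any of the pctMethod formulas in the vignette.

```r
# Hypothetical helper (not part of EdSurvey): a basic weighted percentile.
# Returns the smallest x whose cumulative weight share is >= p / 100.
weighted_percentile <- function(x, w, p) {
  stopifnot(length(x) == length(w), all(w >= 0), p >= 0, p <= 100)
  ord <- order(x)
  x <- x[ord]
  w_cum <- cumsum(w[ord])
  # divide by the last cumulative weight so the final share is exactly 1
  share <- w_cum / w_cum[length(w_cum)]
  x[which(share >= p / 100)[1]]
}

# toy scores and weights, invented for this sketch
scores  <- c(210, 250, 270, 290, 330)
weights <- c(1, 1, 2, 1, 1)
weighted_percentile(scores, weights, 50)
```

With these toy inputs the cumulative weight shares are 1/6, 2/6, 4/6, 5/6, and 1, so the 50th percentile is the third score.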

The Taylor series variance estimation procedure is not relevant to percentiles because percentiles are not continuously differentiable.
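
The role of jrrIMax in the jackknife sampling variance can be sketched in base R. This sketch assumes that the sampling variance for one plausible value is the sum of squared deviations of the replicate estimates from the full-sample estimate, and that the result is averaged over the first jrrIMax plausible values; any survey-specific jackknife multiplier is omitted. The helper jackknife_var and the numbers are hypothetical, and the vignette remains the authority on the exact definition.

```r
# Hypothetical helper (not EdSurvey code): jackknife sampling variance
# averaged over the first `jrrIMax` plausible values.
#   full: length-M vector of full-sample estimates, one per plausible value
#   reps: J x M matrix of replicate-weight estimates
jackknife_var <- function(full, reps, jrrIMax = 1) {
  stopifnot(ncol(reps) == length(full), jrrIMax >= 1)
  m <- min(jrrIMax, length(full))  # Inf means "use all plausible values"
  per_pv <- vapply(seq_len(m), function(p) {
    sum((reps[, p] - full[p])^2)
  }, numeric(1))
  mean(per_pv)
}

# toy estimates, invented for this sketch: 3 replicates, 2 plausible values
full <- c(10, 12)
reps <- cbind(c(9, 11, 10), c(13, 12, 10))
v_first <- jackknife_var(full, reps)                 # first plausible value only
v_all   <- jackknife_var(full, reps, jrrIMax = Inf)  # both plausible values
```

Using more plausible values costs one extra pass over the replicates per plausible value, which is the source of the longer computing times noted for higher jrrIMax.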


The return type depends on whether the class of the data argument is an edsurvey.data.frame or an edsurvey.data.frame.list.

The data argument is an edsurvey.data.frame

When the data argument is an edsurvey.data.frame, percentile returns an S3 object of class percentile. This is a data.frame with typical attributes (names, row.names, and class) and additional attributes as follows:


number of rows on the edsurvey.data.frame before any conditions were applied


number of observations with valid data and weights larger than zero


number of PSUs used in the calculation


the call used to generate these results

The columns of the data.frame are as follows:


the percentile of this row


the estimated value of the percentile


the jackknife standard error of the estimated percentile


degrees of freedom


the lower bound of the confidence interval


the upper bound of the confidence interval


the number of units with more extreme results, averaged across plausible values

When the confInt argument is set to FALSE, the confidence intervals are not returned.

The data argument is an edsurvey.data.frame.list

When the data argument is an edsurvey.data.frame.list, percentile returns an S3 object of class percentileList. This is a data.frame with a call attribute. The columns in the data.frame are identical to those in the previous section, but there are also columns from the edsurvey.data.frame.list:


a column for each column in the covs value of the edsurvey.data.frame.list. See Examples.

When returnVarEstInputs is TRUE, an attribute varEstInputs also is returned that includes the variance estimate inputs used for calculating covariances with varEstToCov.


Paul Bailey


Hyndman, R. J., & Fan, Y. (1996). Sample quantiles in statistical packages. American Statistician, 50, 361–365.


## Not run: 
# read in the example data (generated, not real student data)
sdf <- readNAEP(system.file("extdata/data", "M36NT2PM.dat", package="NAEPprimer"))

# get the median of the composite
percentile("composite", 50, sdf)

# get several percentiles
percentile("composite", c(0,1,25,50,75,99,100), sdf)

# build an edsurvey.data.frame.list
sdfA <- subset(sdf, scrpsu %in% c(5,45,56))
sdfB <- subset(sdf, scrpsu %in% c(75,76,78))
sdfC <- subset(sdf, scrpsu %in% 100:200)
sdfD <- subset(sdf, scrpsu %in% 201:300)

sdfl <- edsurvey.data.frame.list(list(sdfA, sdfB, sdfC, sdfD),
                                 labels=c("A locations",
                                           "B locations",
                                           "C locations",
                                           "D locations"))
# this shows how these datasets will be described:
sdfl$covs

percentile("composite", 50, sdfl)
percentile("composite", c(25, 50, 75), sdfl)

## End(Not run)

[Package EdSurvey version 2.7.1 Index]