percentile {EdSurvey}  R Documentation 
Calculates the percentiles of a numeric variable in an edsurvey.data.frame, a light.edsurvey.data.frame, or an edsurvey.data.frame.list.
percentile(
variable,
percentiles,
data,
weightVar = NULL,
jrrIMax = 1,
varMethod = c("jackknife", "Taylor"),
alpha = 0.05,
omittedLevels = TRUE,
defaultConditions = TRUE,
recode = NULL,
returnVarEstInputs = FALSE,
returnNumberOfPSU = FALSE,
pctMethod = c("symmetric", "unbiased", "simple"),
confInt = TRUE,
dofMethod = c("JR", "WS")
)
variable 
the character name of the variable for which percentiles are computed, typically a subject scale or subscale 
percentiles 
a numeric vector of percentiles in the range of 0 to 100 (inclusive) 
data 
an edsurvey.data.frame, a light.edsurvey.data.frame, or an edsurvey.data.frame.list 
weightVar 
a character indicating the weight variable to use. 
jrrIMax 
a numeric value; when using the jackknife variance estimation method, the default estimation option, 
varMethod 
a character set to 
alpha 
a numeric value between 0 and 1 that sets the confidence level to 1 - alpha. The default alpha value of 0.05 corresponds to a 95% confidence interval.
omittedLevels 
a logical value. When set to the default value of

defaultConditions 
a logical value. When set to the default value
of 
recode 
a list of lists to recode variables. Defaults to

returnVarEstInputs 
a logical value set to 
returnNumberOfPSU 
a logical value set to 
pctMethod 
one of “unbiased”, “symmetric”, or “simple”. “unbiased” produces a weighted median-unbiased percentile estimate, whereas “simple” uses a basic formula that matches previously published results. “symmetric” uses a more basic formula but requires that the percentile be symmetric to multiplying the quantity by negative one. 
confInt 
a Boolean indicating if the confidence interval should be returned 
dofMethod 
passed to 
Percentiles, their standard errors, and confidence intervals are calculated according to the vignette titled Statistical Methods Used in EdSurvey. The standard errors and confidence intervals are based on separate formulas and assumptions.
The Taylor series variance estimation procedure is not relevant to percentiles because percentiles are not continuously differentiable.
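As a rough illustration of what a weighted percentile is, the following base R sketch takes the smallest value whose cumulative share of the total weight reaches the requested percentile. It is not the formula EdSurvey uses for any pctMethod option and it ignores plausible values and variance estimation; the function weighted_percentile and the example scores and weights are hypothetical names and data introduced here only for illustration.

```r
# Hypothetical helper, not part of EdSurvey: a plain weighted empirical
# percentile read off the weighted cumulative distribution.
weighted_percentile <- function(x, w, probs) {
  stopifnot(length(x) == length(w), all(probs >= 0 & probs <= 100))
  # drop missing values and observations without a positive weight
  keep <- !is.na(x) & !is.na(w) & w > 0
  x <- x[keep]
  w <- w[keep]
  # sort the values and accumulate the share of total weight at or below each
  ord <- order(x)
  x <- x[ord]
  cum_share <- cumsum(w[ord]) / sum(w)
  # for each percentile, take the smallest value whose share reaches it
  sapply(probs / 100, function(p) x[which(cum_share >= p)[1]])
}

# made-up scores and weights, for illustration only
scores  <- c(210, 250, 265, 280, 310)
weights <- c(1, 2, 4, 2, 1)
weighted_percentile(scores, weights, c(25, 50, 75))
# 250 265 280
```

Because this sketch only ever returns observed values, it will generally differ from estimators that interpolate between adjacent observations.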
The return type depends on whether the class of the data argument is an edsurvey.data.frame or an edsurvey.data.frame.list.
The data argument is an edsurvey.data.frame
When the data argument is an edsurvey.data.frame, percentile returns an S3 object of class percentile. This is a data.frame with typical attributes (names, row.names, and class) and additional attributes as follows:
n0 
number of rows on 
nUsed 
number of observations with valid data and weights larger than zero 
nPSU 
number of PSUs used in the calculation 
call 
the call used to generate these results 
The columns of the data.frame are as follows:
percentile 
the percentile of this row 
estimate 
the estimated value of the percentile 
se 
the jackknife standard error of the estimated percentile 
df 
degrees of freedom 
confInt.ci_lower 
the lower bound of the confidence interval 
confInt.ci_upper 
the upper bound of the confidence interval 
nsmall 
the number of units with more extreme results, averaged across plausible values 
When the confInt argument is set to FALSE, the confidence intervals are not returned.
The data argument is an edsurvey.data.frame.list
When the data argument is an edsurvey.data.frame.list, percentile returns an S3 object of class percentileList. This is a data.frame with a call attribute. The columns in the data.frame are identical to those in the previous section, but there also are columns from the edsurvey.data.frame.list.
covs 
a column for each column in the 
When returnVarEstInputs is TRUE, an attribute varEstInputs also is returned that includes the variance estimate inputs used for calculating covariances with varEstToCov.
Paul Bailey
Hyndman, R. J., & Fan, Y. (1996). Sample quantiles in statistical packages. American Statistician, 50, 361–365.
## Not run:
# read in the example data (generated, not real student data)
sdf <- readNAEP(system.file("extdata/data", "M36NT2PM.dat", package="NAEPprimer"))
# get the median of the composite
percentile("composite", 50, sdf)
# get several percentiles
percentile("composite", c(0,1,25,50,75,99,100), sdf)
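# a sketch of inspecting the result: the column and attribute names are those
# listed in the Value section; pct is a name chosen here for illustration
pct <- percentile("composite", c(25, 50, 75), sdf)
pct$estimate
pct$se
attr(pct, "nUsed")
# request the estimates without the confidence interval
percentile("composite", 50, sdf, confInt=FALSE)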
# build an edsurvey.data.frame.list
sdfA <- subset(sdf, scrpsu %in% c(5,45,56))
sdfB <- subset(sdf, scrpsu %in% c(75,76,78))
sdfC <- subset(sdf, scrpsu %in% 100:200)
sdfD <- subset(sdf, scrpsu %in% 201:300)
sdfl <- edsurvey.data.frame.list(list(sdfA, sdfB, sdfC, sdfD),
                                 labels=c("A locations",
                                          "B locations",
                                          "C locations",
                                          "D locations"))
# this shows how these datasets will be described:
sdfl$covs
percentile("composite", 50, sdfl)
percentile("composite", c(25, 50, 75), sdfl)
## End(Not run)