mml.sdf {EdSurvey}    R Documentation

EdSurvey Direct Estimation

Description

Prepare IRT parameters and score items and then estimate a linear model with direct estimation.

Usage

mml.sdf(
  formula,
  data,
  weightVar = NULL,
  omittedLevels = TRUE,
  composite = TRUE,
  dctPath = NULL,
  verbose = FALSE,
  multiCore = FALSE,
  numberOfCores = NULL,
  minNode = -4,
  maxNode = 4,
  Q = 34,
  scoreDict = defaultNAEPScoreCard(),
  idVar = NULL
)

Arguments

formula

a formula for the model.

data

an edsurvey.data.frame containing data from either the National Assessment of Educational Progress (NAEP) or the Trends in International Mathematics and Science Study (TIMSS).

weightVar

a character indicating the weight variable to use. The weightVar must be one of the weights for the edsurvey.data.frame. If NULL, it uses the default for the edsurvey.data.frame.

omittedLevels

a logical value. When set to the default value of TRUE, drops the omitted levels of all factor variables, as specified in the edsurvey.data.frame. Use print on an edsurvey.data.frame to see the omitted levels.

composite

logical; for a NAEP composite, setting to FALSE fits the model to all items at once, in a single construct, whereas setting to TRUE fits the model as a NAEP composite (i.e., a weighted average of the subscales). This argument is not applicable for TIMSS.

dctPath

a connection that points to the location of a NAEP dct file. A dct file can be used to supply custom item response theory (IRT) parameters and subscale/subtest weights for NAEP assessments in place of those provided in the NAEPirtparams package. If NULL (the default), the IRT parameters and subscale weights from NAEPirtparams are used. IRT parameters for TIMSS cannot be supplied through dctPath; they are downloaded by the downloadTIMSS function.

verbose

logical; indicates whether a detailed printout should be displayed during execution. Applies only to NAEP data.

multiCore

logical; if TRUE, allows the foreach package to be used for parallel processing. A parallel backend should already be registered with registerDoParallel.

numberOfCores

the number of cores to be used when using multiCore. Defaults to 75% of available cores. Users can check available cores with detectCores().

minNode

numeric; minimum integration point in direct estimation; see mml.

maxNode

numeric; maximum integration point in direct estimation; see mml.

Q

integer; number of integration points per student used when integrating over the levels of the latent outcome construct.

scoreDict

a data.frame that includes guidelines for scoring the provided NAEP data. Here, scoring refers to turning item responses into scores on each item. To see the default scoring guidelines, call the function defaultNAEPScoreCard(), or see the Examples section. See Details for more information on possible scores.

idVar

a character giving the name of the student identifier variable to use from data. Defaults to NULL, in which case sid is used as the student identifier.

Details

Typically, models are fit with NAEP data using plausible values to integrate out the uncertainty in the measurement of individual student outcomes. When direct estimation is used, the measurement error is instead integrated out explicitly using Q quadrature points between minNode and maxNode. See the documentation for mml in the Dire package.
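
As a rough illustration of what integrating over Q quadrature points means, the sketch below builds an evenly spaced grid from minNode to maxNode and approximates one student's marginal likelihood with a sum over the nodes. The even spacing, the 2PL item parameters (a, b), the response pattern (resp), and the conditional mean and standard deviation (mu, sigma) are all assumptions made for illustration. This is not the code used by mml; see the Dire documentation for the actual method.

```r
# Sketch only: a fixed grid of integration points, mirroring the minNode,
# maxNode, and Q arguments. Everything else below is hypothetical.
minNode <- -4
maxNode <- 4
Q <- 34
nodes <- seq(minNode, maxNode, length.out = Q)
delta <- nodes[2] - nodes[1]  # spacing between adjacent nodes

# Hypothetical 2PL item parameters and one student's scored responses
a <- c(1.0, 1.2, 0.8)     # discrimination
b <- c(-0.5, 0.0, 0.7)    # difficulty
resp <- c(1, 1, 0)        # 1 = correct, 0 = incorrect

# Likelihood of the response pattern evaluated at each node
likAtNode <- sapply(nodes, function(theta) {
  p <- plogis(a * (theta - b))
  prod(p^resp * (1 - p)^(1 - resp))
})

# Hypothetical conditional distribution of the latent outcome for this
# student (in a regression, mu would come from the linear predictor)
mu <- 0.3
sigma <- 0.9
prior <- dnorm(nodes, mean = mu, sd = sigma)

# Rectangle-rule approximation to the integral over the latent outcome
marginalLik <- sum(likAtNode * prior) * delta

# Sanity check: the density alone should integrate to roughly 1 on this grid
check <- sum(prior) * delta
```

In a sketch like this, a wider node range or a larger Q gives a finer approximation at the cost of more computation per student.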

The scoreDict turns response categories that are not simple item responses, such as Not Reached and Multiple, into values that can be passed as inputs to the mml function in Dire. How mml treats these values depends on the test. For NAEP dichotomous items, 8 is scored as the same proportion correct as the guessing parameter for that item, 0 is an incorrect response, 1 is a correct response, and NA does not change the student's score. TIMSS does not require a scoreDict.
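
A custom scoreDict can be sketched by editing a copy of the default score card. The example below rebuilds the default table by hand, using the values shown in the Examples section, so that it runs without EdSurvey installed; in practice one would start from defaultNAEPScoreCard(). Reading pointMult as the score applied to multiple-choice items and pointConst as the score applied to constructed-response items is an assumption based on the column names, and the change made here (scoring Omitted as incorrect) is purely hypothetical.

```r
# Hand-built copy of the default score card printed in the Examples section
scoreCard <- data.frame(
  resCat = c("Multiple", "Not Reached", "Missing", "Omitted",
             "Illegible", "Non-Rateable", "Off Task"),
  pointMult = c(8, NA, NA, 8, 0, 0, 0),
  pointConst = c(0, NA, NA, 0, 0, 0, 0),
  stringsAsFactors = FALSE
)

# Hypothetical change: score Omitted responses as incorrect (0) instead of 8
customCard <- scoreCard
customCard$pointMult[customCard$resCat == "Omitted"] <- 0

# The custom card would then be supplied through scoreDict (not run here):
# mmlNAEP <- mml.sdf(algebra ~ dsex + b013801, sdfNAEP,
#                    weightVar = "origwt", scoreDict = customCard)
```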

Value

An edSurveyMML object with the following elements:

mml

an object containing information from the mml procedure. See ?mml for further information.

scoreDict

the scoring used in the mml procedure.

itemMapping

the item mapping used in the mml procedure.

References

Cohen, J., & Jiang, T. (1999). Comparison of partially measured latent traits across nominal subgroups. Journal of the American Statistical Association, 94(448), 1035–1044. https://doi.org/10.2307/2669917

Examples

## Not run: 
## Direct Estimation with NAEP 
# Load data 
sdfNAEP <- readNAEP(system.file("extdata/data", "M36NT2PM.dat", package = "NAEPprimer"))

# Inspect scoring guidelines
defaultNAEPScoreCard()

# example output: 
#          resCat pointMult pointConst
# 1     Multiple         8          0
# 2  Not Reached        NA         NA
# 3      Missing        NA         NA
# 4      Omitted         8          0
# 5    Illegible         0          0
# 6 Non-Rateable         0          0
# 7     Off Task         0          0

# Run NAEP model, warnings are about item codings
mmlNAEP <- mml.sdf(algebra ~ dsex + b013801, sdfNAEP, weightVar='origwt')

# Call with Taylor
summary(mmlNAEP, varType="Taylor", strataVar="repgrp1", PSUVar="jkunit")

## Direct Estimation with TIMSS 
# Load data 
downloadTIMSS("~/", year=2015)
sdfTIMSS <- readTIMSS("~/TIMSS/2015", countries="usa", grade = "4")

# Run TIMSS model, warnings are about item codings 
mmlTIMSS <- mml.sdf(mmat ~ itsex + asbg04, sdfTIMSS, weightVar='totwgt')

# Call with Taylor
summary(mmlTIMSS, varType="Taylor", strataVar="jkzone", PSUVar="jkrep")

## End(Not run)



[Package EdSurvey version 2.7.1 Index]