dscore {dscore} | R Documentation |
D-score estimation
Description
The dscore() function estimates the following quantities:
the D-score, a numeric score that quantifies child development by one number;
the Development-for-Age Z-score (DAZ), which corrects the D-score for age;
and the standard error of measurement (SEM) of the D-score.
Usage
dscore(
data,
items = names(data),
key = NULL,
population = NULL,
xname = "age",
xunit = c("decimal", "days", "months"),
prepend = NULL,
itembank = dscore::builtin_itembank,
metric = c("dscore", "logit"),
prior_mean = NULL,
prior_sd = NULL,
transform = NULL,
qp = NULL,
dec = c(2L, 3L),
relevance = c(-Inf, Inf),
algorithm = c("current", "1.8.7"),
verbose = FALSE
)
dscore_posterior(
data,
items = names(data),
key = NULL,
population = NULL,
xname = "age",
xunit = c("decimal", "days", "months"),
prepend = NULL,
itembank = dscore::builtin_itembank,
metric = c("dscore", "logit"),
prior_mean = NULL,
prior_sd = NULL,
transform = NULL,
qp = NULL,
dec = c(2L, 3L),
relevance = c(-Inf, Inf),
algorithm = c("current", "1.8.7"),
verbose = FALSE
)
Arguments
data |
A |
items |
A character vector containing names of items to be
included into the D-score calculation. Milestone scores are coded
numerically as |
key |
String. Name of the key that bundles the difficulty estimates
pertaining to one and the same Rasch model. View |
population |
String. The name of the reference population to calculate DAZ.
Use |
xname |
A string with the name of the age variable in
|
xunit |
A string specifying the unit in which age is measured
(either |
prepend |
Character vector with column names in |
itembank |
A |
metric |
A string, either |
prior_mean |
A string or numeric scalar. If a string, it should
refer to a column name in |
prior_sd |
A string or a numeric scalar. If a string, it should
refer to a column name in |
transform |
Numeric vector, length 2, containing the intercept
and slope of the linear transform from the logit scale into
the D-score scale. The default ( |
qp |
Numeric vector of equally spaced quadrature points.
This vector should span the range of all D-score or logit values.
The default ( |
dec |
A vector of two integers specifying the number of
decimals for rounding the D-score and DAZ, respectively.
The default is |
relevance |
A numeric vector of length 2 with the lower and
upper bounds of the relevance interval. The procedure calculates
a dynamic EAP for each item. If the difficulty level (tau) of the
next item is outside the relevance interval around the EAP, the procedure
ignores the score on the item. The default is |
algorithm |
Computational method, for backward compatibility.
Either |
verbose |
Logical. Print settings. |
Details
The scoring algorithm is based on the method by Bock and Mislevy (1982). The method uses Bayes rule to update a prior ability into a posterior ability.
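The update described above can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's internal code: the quadrature grid, the normal starting prior, the item difficulties, the responses and the helper update_posterior() are all hypothetical, and the calculation is done on a logit-like scale with a plain Rasch model.

```r
# Hypothetical illustration; none of these values come from the dscore package.
qp <- seq(-6, 6, by = 0.1)                # assumed quadrature points
prior <- dnorm(qp, mean = 0, sd = 1.5)    # assumed normal starting prior
prior <- prior / sum(prior)               # normalize so the weights sum to 1

# One Bayesian update step for a single item (hypothetical helper)
update_posterior <- function(prior, score, tau, qp) {
  p_pass <- plogis(qp - tau)              # Rasch probability of a pass
  lik <- if (score == 1) p_pass else 1 - p_pass
  post <- prior * lik                     # Bayes rule, unnormalized
  post / sum(post)                        # renormalize
}

scores <- c(1, 1, 0)                      # made-up responses (1 = pass)
taus <- c(-1.0, 0.0, 1.5)                 # made-up item difficulties

post <- prior
for (i in seq_along(scores)) {
  # the posterior after item i serves as the prior for item i + 1
  post <- update_posterior(post, scores[i], taus[i], qp)
}

eap <- sum(qp * post)                     # posterior mean (EAP)
sem <- sqrt(sum((qp - eap)^2 * post))     # posterior standard deviation
```

In this sketch, eap plays the role of the ability estimate and sem the role of the standard error of measurement.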
The item names should correspond to the "gsed" lexicon.
A key is defined by the set of estimated item difficulties.
Key | Model | Quadrature | Instruments | Direct/Caregiver | Reference |
"dutch" | 75_0 | -10:80 | 1 | direct | Van Buuren, 2014/2020 |
"gcdg" | 565_18 | -10:100 | 13 | direct | Weber, 2019 |
"gsed1912" | 807_17 | -10:100 | 21 | mixed | GSED Team, 2019 |
"293_0" | 293_0 | -10:100 | 2 | mixed | GSED Team, 2022 |
"gsed2212" | 818_6 | -10:100 | 27 | mixed | GSED Team, 2022 |
"gsed2406" | 818_6 | -10:100 | 27 | mixed | GSED Team, 2024 |
As a general rule, one should only compare D-scores
that are calculated using the same key and the same
set of quadrature points. For calculating D-scores on new data,
the advice is to use the default, which currently is "gsed2406".
The mean of the default starting prior is calculated from a so-called
"Count model" that describes mean D-score as a function of age.
The Count models are implemented in the function count_mu().
By default, the spread of the starting prior
is 5 D-score points around the mean D-score, which corresponds to
approximately 1.5 to 2 times the normal spread of children of a given age. The
starting prior is informative for very short tests (say <5 items), but has
little impact on the posterior for larger tests.
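The effect of test length on the influence of the prior can be illustrated with a small base R simulation. This is a sketch under stated assumptions, not the package's procedure: the grid, the item difficulties, the response pattern and the helper eap_for() are hypothetical, and only the prior spread of 5 is taken from the text above.

```r
# Hypothetical illustration; none of these values come from the dscore package.
qp <- seq(-10, 80, by = 0.5)              # assumed quadrature grid

# Posterior mean for a given prior mean and test length (hypothetical helper)
eap_for <- function(prior_mean, n_items, prior_sd = 5) {
  post <- dnorm(qp, prior_mean, prior_sd)
  post <- post / sum(post)
  taus <- seq(25, 35, length.out = n_items)  # made-up difficulties
  scores <- as.integer(taus < 30)            # passes easy items, fails the rest
  for (i in seq_len(n_items)) {
    p_pass <- plogis(qp - taus[i])
    post <- post * (if (scores[i] == 1) p_pass else 1 - p_pass)
    post <- post / sum(post)
  }
  sum(qp * post)
}

# How far does the estimate move when the prior mean moves from 25 to 35?
shift_short <- eap_for(35, n_items = 3) - eap_for(25, n_items = 3)
shift_long <- eap_for(35, n_items = 40) - eap_for(25, n_items = 40)
c(short_test = shift_short, long_test = shift_long)
```

With these made-up inputs, the estimate from the 3-item test moves more with the prior mean than the estimate from the 40-item test.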
Value
The dscore() function returns a data.frame with nrow(data) rows.
Optionally, the first block of columns can be copied to the
result by using prepend. The second block consists of the
following columns:
Name | Label |
a | Decimal age |
n | Number of items with valid (0/1) data |
p | Percentage of passed milestones |
d | Ability estimate, mean of posterior |
sem | Standard error of measurement, standard deviation of the posterior |
daz | D-score corrected for age, calculated in Z-scale (for metric "dscore" ) |
For more detail, the dscore_posterior() function returns a data frame with
nrow(data) rows and length(qp) columns, plus any prepended columns, containing the
full posterior density of the D-score at each quadrature point.
If no valid responses are found, dscore_posterior() returns the
prior density. Versions prior to 1.8.5 returned a matrix (instead of
a data.frame). Code that depends on the result being a matrix may break
and may need adaptation.
Author(s)
Stef van Buuren, Iris Eekhout, Arjan Huizing (2022)
References
Bock RD, Mislevy RJ (1982). Adaptive EAP Estimation of Ability in a Microcomputer Environment. Applied Psychological Measurement, 6(4), 431-444.
Van Buuren S (2014). Growth charts of human development. Stat Methods Med Res, 23(4), 346-368. https://stefvanbuuren.name/publication/van-buuren-2014-gc/
Weber AM, Rubio-Codina M, Walker SP, van Buuren S, Eekhout I, Grantham-McGregor S, Caridad Araujo M, Chang SM, Fernald LCH, Hamadani JD, Hanlon A, Karam SM, Lozoff B, Ratsifandrihamanana L, Richter L, Black MM (2019). The D-score: a metric for interpreting the early development of infants and toddlers across global settings. BMJ Global Health, 4: e001724. https://gh.bmj.com/content/bmjgh/4/6/e001724.full.pdf
See Also
builtin_keys(), builtin_itembank(), builtin_itemtable(),
builtin_references(), get_tau(), posterior(), milestones()
Examples
# using all defaults and properly formatted data
ds <- dscore(milestones)
head(ds)
# step-by-step example
data <- data.frame(
id = c(
"Jane", "Martin", "ID-3", "No. 4", "Five", "6",
NA_character_, as.character(8:10)
),
age = rep(round(21 / 365.25, 4), 10),
ddifmd001 = c(NA, NA, 0, 0, 0, 1, 0, 1, 1, 1),
ddicmm029 = c(NA, NA, NA, 0, 1, 0, 1, 0, 1, 1),
ddigmd053 = c(NA, 0, 0, 1, 0, 0, 1, 1, 0, 1)
)
items <- names(data)[3:5]
# third item is not part of the default key
get_tau(items, verbose = TRUE)
# calculate D-score
dscore(data)
# prepend id variable to output
dscore(data, prepend = "id")
# or prepend all data
# dscore(data, prepend = colnames(data))
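# possible variations (not part of the original examples), using only
# arguments listed under Usage; the interval c(-5, 5) is an arbitrary,
# assumed choice for illustration
# return estimates in the logit metric instead of the D-score metric
dscore(data, metric = "logit")
# ignore items whose difficulty lies more than 5 units from the dynamic EAP
dscore(data, relevance = c(-5, 5))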
# calculate full posterior
p <- dscore_posterior(data)
# check that rows sum to 1
rowSums(p)
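# a sketch, assuming the columns of p correspond to the quadrature points
# -10:100 listed for the default key: the posterior mean should then be
# close to the d column of dscore() for rows with valid responses
# (rows without valid responses hold the prior density instead)
qp <- -10:100
if (ncol(p) == length(qp)) {
  eap <- as.vector(as.matrix(p) %*% qp)
  print(cbind(eap = round(eap, 2), d = dscore(data)$d))
}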
# plot full posterior for measurement 7
barplot(as.matrix(p[7, 12:36]),
names = 1:25,
xlab = "D-score", ylab = "Density", col = "grey",
main = "Full D-score posterior for measurement in row 7",
sub = "D-score (EAP) = 11.58, SEM = 3.99")
# plot P10, P50 and P90 of D-score references
g <- expand.grid(age = seq(0.1, 4, 0.1), p = c(0.1, 0.5, 0.9))
d <- zad(z = qnorm(g$p), x = g$age, verbose = TRUE)
matplot(
x = matrix(g$age, ncol = 3), y = matrix(d, ncol = 3), type = "l",
lty = 1, col = "blue", xlab = "Age (years)", ylab = "D-score",
main = "D-score preliminary standards: P10, P50 and P90")
abline(h = seq(10, 80, 10), v = seq(0, 4, 0.5), col = "gray", lty = 2)
# add measurements made on very preterms, ga < 32 weeks
ds <- dscore(milestones)
points(x = ds$a, y = ds$d, pch = 19, col = "red")