loglin.smooth {SNSequate}    R Documentation

Pre-smoothing using log-linear models.

Description

This function fits log-linear models to score data and provides estimates of the (vector of) score probabilities as well as the C matrix decomposition of their covariance matrix, according to the specified equating design (see Details).

Usage

loglin.smooth(scores, degree, design, scores2, degreeXA, degreeYA, 
J, K, L, wx, wy, w, gapsX, gapsY, gapsA, lumpX, lumpY, lumpA,...)

Arguments

Note that depending on the specified equating design, not all arguments are necessary as detailed below.

scores

If the "EG" design is specified, a vector containing the raw sample frequencies coming from one group taking the test.

If the "SG" design is specified, a matrix containing the (joint) bivariate sample frequencies for X (rows) and Y (columns).

If the "CB" design is specified, a two-column matrix containing the observed scores of the sample taking test X first, followed by test Y. The scores2 argument is then used for the scores of the sample taking test Y first, followed by test X.

If either the "NEAT_CE" or "NEAT_PSE" design is selected, a two-column matrix containing the observed scores on test X (first column) and the observed scores on the anchor test A (second column). The scores2 argument is then used for the observed scores on test Y.

degree

Either a number or a vector indicating the number of power moments to be fitted to the marginal distributions, or the number of cross moments to be fitted to the joint distributions, respectively. For the "EG" design it will be a number (see Details).

design

A character string indicating the equating design (one of "EG", "SG", "CB", "NEAT_CE", "NEAT_PSE")

scores2

Only used for the "CB", "NEAT_CE" and "NEAT_PSE" designs. See the description of scores.

degreeXA

A vector indicating the number of power moments to be fitted to the marginal distributions of X and A, and the number of cross moments to be fitted to the joint distribution (X, A) (see Details). Only used for the "NEAT_CE" and "NEAT_PSE" designs.

degreeYA

Only used for the "NEAT_CE" and "NEAT_PSE" designs (see the description for degreeXA)

J

The number of possible X scores. Only needed for the "CB", "NEAT_CE" and "NEAT_PSE" designs.

K

The number of possible Y scores. Only needed for the "CB", "NEAT_CE" and "NEAT_PSE" designs.

L

The number of possible A scores. Only needed for the "NEAT_CE" and "NEAT_PSE" designs.

wx

A number that satisfies $0\leq w_X\leq 1$ indicating the weight put on the data that is not subject to order effects. Only used for the "CB" design.

wy

A number that satisfies $0\leq w_Y\leq 1$ indicating the weight put on the data that is not subject to order effects. Only used for the "CB" design.

w

A number that satisfies $0\leq w\leq 1$ indicating the weight given to population P. Only used for the "NEAT" design.

gapsX

A list object containing:

index

A vector of indices between 0 and J at which to smooth "gaps", which usually occur at regular intervals because scores are rounded to integer values or because of other methodological factors.

degree

An integer indicating the maximum degree of the moments fitted by the log-linear model.

Only used for the "NEAT" design.

gapsY

A list object containing:

index

A vector of indices between 0 and K.

degree

An integer indicating the maximum degree of the moments fitted.

Only used for the "NEAT" design.

gapsA

A list object containing:

index

A vector of indices between 0 and L.

degree

An integer indicating the maximum degree of the moments fitted.

Only used for the "NEAT" design.

lumpX

An integer giving the index at which an artificial "lump" appears in the marginal distribution of frequencies for X, due to the recording of negative rounded formulas or any other methodological artifact.

lumpY

An integer giving the index at which an artificial "lump" appears in the marginal distribution of frequencies for Y.

lumpA

An integer giving the index at which an artificial "lump" appears in the marginal distribution of frequencies for A.

...

Further arguments to be passed.
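The effect of a lump argument can be illustrated outside the package with a plain Poisson glm. The following is a minimal sketch, assuming the lump is handled by adding an indicator column for the lumped score, which is a common way to do it; whether loglin.smooth does this internally is not confirmed here. The frequencies are the ones used in the Examples section, and the names lump0, fit_plain and fit_lump are made up for this illustration.

```r
# Frequencies with an artificial lump at score 0 (same data as in Examples)
score <- 0:20
freq <- c(10, 2, 5, 8, 7, 9, 8, 7, 8, 5, 5, 4, 3, 0, 2, 0, 1, 0, 2, 1, 0)

# Hypothetical indicator column: 1 at the lumped score, 0 elsewhere
lump0 <- as.numeric(score == 0)

ctrl <- glm.control(epsilon = 1e-10, maxit = 50)

# Cubic log-linear model without and with the lump indicator
fit_plain <- glm(freq ~ poly(score, 3, raw = TRUE),
                 family = poisson, control = ctrl)
fit_lump <- glm(freq ~ poly(score, 3, raw = TRUE) + lump0,
                family = poisson, control = ctrl)

# The indicator forces the fitted frequency at score 0 to match the observed one
cbind(observed = freq[1],
      plain = fitted(fit_plain)[1],
      with_lump = fitted(fit_lump)[1])
```

With the indicator in the model, the fitted frequency at the lumped score reproduces the observed one, while the polynomial terms smooth the remaining scores.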

Details

This function fits log-linear models as described in Holland and Thayer (1987) and Von Davier et al. (2004). The following general equation can be used to represent the models for the different designs, in which the vector (or matrix) o of (marginal or bivariate) score probabilities satisfies the log-linear model:

$$\log(o_{gh})=\alpha_m+Z_m(z_g)+W_m(w_h)+ZW_m(z_g,w_h)$$

where $Z_m(z_g)=\sum_{i=1}^{T_{Zm}}\beta_{Zmi}(z_g)^i$, $W_m(w_h)=\sum_{i=1}^{T_{Wm}}\beta_{Wmi}(w_h)^i$, and $ZW_m(z_g,w_h)=\sum_{i=1}^{I_{Zm}}\sum_{i'=1}^{I_{Wm}}\beta_{ZWmii'}(z_g)^i(w_h)^{i'}$.

The symbols vary according to the equating design specified. Possible values are: $o=p_{(12)}, p_{(21)}, p, q$; $Z=X, Y$; $W=Y, A$; $z=x, y$; $w=y, a$; $m=(12), (21), P, Q$; $g=j, k$; $h=l, k$.

Particular cases of this general equation for each of the equating designs can be found in Von Davier et al. (2004) (e.g., Equations (7.1) and (7.2) for the "EG" design, Equation (8.1) for the "SG" design, and Equations (9.1) and (9.2) for the "CB" design).
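For the univariate ("EG") case, the model reduces to a polynomial in the score on the log scale, which can be fitted as a Poisson regression. The sketch below uses base R's glm rather than loglin.smooth itself, so it illustrates the model, not the package's internals. It reuses the frequency vector from the Examples section; X, p_hat, obs_mom and fit_mom are names introduced here for illustration. It also checks the well-known property that a model with T power terms preserves the first T moments of the observed distribution.

```r
# Observed frequencies for scores 0..20 (same data as in Examples)
score <- 0:20
freq <- c(10, 2, 5, 8, 7, 9, 8, 7, 8, 5, 5, 4, 3, 0, 2, 0, 1, 0, 2, 1, 0)

# Design matrix of raw powers: score, score^2, score^3 (T = 3)
X <- outer(score, 1:3, `^`)

# log(expected frequency) = alpha + sum_i beta_i * score^i
fit <- glm(freq ~ X, family = poisson,
           control = glm.control(epsilon = 1e-10, maxit = 50))

# Smoothed score probabilities
p_hat <- fitted(fit) / sum(freq)

# First three raw moments: observed versus fitted
obs_mom <- colSums(X * freq) / sum(freq)
fit_mom <- colSums(X * p_hat)
rbind(observed = obs_mom, fitted = fit_mom)
```

The two rows agree up to numerical tolerance, which is why the degree argument is described in terms of the number of moments to be fitted.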

Value

sp.est

The estimated score probabilities

C

The C matrix, which satisfies $\Sigma=CC^t$, where $\Sigma$ is the covariance matrix of the estimated score probabilities.
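The factorization property can be checked numerically. The following is a generic sketch, not the computation used by the package: it takes the covariance matrix of unsmoothed multinomial proportions for a made-up probability vector p and sample size N, builds one possible factor from an eigendecomposition, and verifies that the product recovers the covariance matrix. All values and names here are assumptions for illustration.

```r
# Hypothetical score probabilities and sample size
p <- c(0.1, 0.2, 0.3, 0.4)
N <- 500

# Covariance matrix of multinomial sample proportions
Sigma <- (diag(p) - tcrossprod(p)) / N

# One possible factor: eigenvectors scaled by square roots of eigenvalues
eig <- eigen(Sigma, symmetric = TRUE)
vals <- pmax(eig$values, 0)  # clip tiny negative values from rounding
C <- eig$vectors %*% diag(sqrt(vals))

# Largest absolute discrepancy between C C^t and Sigma
max(abs(C %*% t(C) - Sigma))
```

Such a factor is not unique: multiplying C on the right by any orthogonal matrix gives another matrix with the same product, so the C returned by loglin.smooth need not match this one.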

Author(s)

Jorge Gonzalez jorge.gonzalez@mat.uc.cl

References

Gonzalez, J. (2014). SNSequate: Standard and Nonstandard Statistical Models and Methods for Test Equating. Journal of Statistical Software, 59(7), 1-30.

Holland, P. and Thayer, D. (1987). Notes on the use of loglinear models for fitting discrete probability distributions. Research Report 87-31, Princeton NJ: Educational Testing Service.

Von Davier, A., Holland, P., and Thayer, D. (2004). The Kernel Method of Test Equating. New York, NY: Springer-Verlag.

[1] Moses, T. and von Davier, A. A. Using PROC GENMOD for Loglinear Smoothing. Paper SA06_05. Educational Testing Service, Princeton, NJ.

See Also

glm, ker.eq

Examples

#Table 7.4 from Von Davier et al. (2004)
data(Math20EG)
rj<-loglin.smooth(scores=Math20EG[,1],degree=2,design="EG")$sp.est
sk<-loglin.smooth(scores=Math20EG[,2],degree=3,design="EG")$sp.est
score<-0:20
Table7.4<-cbind(score,rj,sk)
Table7.4

## Example taken from [1]
score <- 0:20
freq <- c(10, 2, 5, 8, 7, 9, 8, 7, 8, 5, 5, 4, 3, 0, 2, 0, 1, 0, 2, 1, 0)
ldata <- data.frame(score, freq)

plot(ldata, pch=16, main="Data w Lump at 0")
m1 = loglin.smooth(scores=ldata$freq,kert="gauss",degree=c(3),design="EG")
m2 = loglin.smooth(scores=ldata$freq,kert="gauss",degree=c(3),design="EG",lumpX=0)
Ns = sum(ldata$freq)
points(m1$sp.est*Ns, col=2, pch=16)
points(m2$sp.est*Ns, col=3, pch=16) # Preserves the lump

[Package SNSequate version 1.3-5 Index]