pairwise.S {pairwise}    R Documentation

The Fischer-Scheiblechner Statistic S on item level (Wald-like Test)

Description

This function calculates the S-statistic proposed by Fischer and Scheiblechner (1970) on item level for dichotomous or polytomous item response formats by splitting the data into two subsamples. For polytomous items the test is performed on item category level. Several splitting options are available (see arguments). The S-statistic is also discussed in van den Wollenberg (1982), an article in Psychometrika that might be more easily available (see details).

Usage

pairwise.S(
  daten,
  m = NULL,
  split = "random",
  splitseed = "no",
  verbose = FALSE,
  ...
)

Arguments

daten

a data.frame or matrix with optionally named columns (names of items), potentially with missing values, comprising polytomous or dichotomous (or mixed category numbers) responses of n respondents (rows) on k items (columns), coded starting with 0 for the lowest category up to m-1 for the highest category, with m being a vector (of length k) giving the number of categories for the respective item.

m

an integer (will be recycled to a vector of length k) or a vector giving the number of response categories for all items. By default (m = NULL), m is calculated from the data, assuming that every response category is observed at least once. For sparse data it is strongly recommended to define the number of categories explicitly via this argument.

split

Specifies the splitting criterion. Basically there are three different options available - each with several modes - which are controlled by passing the corresponding character expression to the argument.

1) Using the raw score for splitting into subsamples with the following modes: split = "median" median raw score split (high score group and low score group); split = "mean" mean raw score split (high score group and low score group).

2) Dividing the persons in daten into subsamples with equal size by random allocation with the following modes: split = "random" (which is equivalent to split = "random.2") divides persons into two subsamples with equal size. In general the number of desired subsamples must be expressed after the dot in the character expression - e.g. split = "random.6" divides persons into 6 subsamples (with equal size) by random allocation etc.

3) The third option is using a manifest variable as a splitting criterion. In this case a vector with the same length as number of cases in daten must be passed to the argument grouping the data into subsamples. This vector should be coded as "factor" or a "numeric" integer vector with min = 1.

splitseed

numeric, used for set.seed(splitseed) for random splitting (default "no"); see argument split.

verbose

logical, default FALSE; if verbose = TRUE a message about subsampling is sent to the console when calculating standard errors.

...

additional arguments nsample, size, seed, pot for calling pairSE are passed through; see the description of pairSE.
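The following base R sketch illustrates what the arguments m and split describe. It is not taken from the package source: the toy data, the rule used to infer m (highest observed code plus one), the handling of persons exactly at the median, and the variable sex are all assumptions made for illustration.

```r
set.seed(1)  # arbitrary seed, only to make this sketch reproducible

# Hypothetical data: 10 respondents, 3 items coded 0 .. m-1.
# item3 has three categories (0, 1, 2), but category 2 is never observed.
toy <- data.frame(
  item1 = rbinom(10, size = 1, prob = 0.5),
  item2 = rbinom(10, size = 1, prob = 0.5),
  item3 = rep(c(0, 1), times = 5)
)

# m: assumed inference rule (highest observed code + 1).
# For item3 this yields 2 instead of the true 3, which is why an
# explicit m is safer for sparse data.
m_inferred <- sapply(toy, function(x) max(x, na.rm = TRUE) + 1)
m_explicit <- c(2, 2, 3)

# 1) raw score split: persons above the median form the high score group
#    (where persons exactly at the median go is an assumption here)
raw <- rowSums(toy, na.rm = TRUE)
grp_median <- ifelse(raw > median(raw), 2L, 1L)

# 2) random allocation into k subsamples of (nearly) equal size,
#    corresponding in spirit to split = "random.k"
k <- 2
grp_random <- sample(rep_len(seq_len(k), nrow(toy)))

# 3) manifest variable: a factor or an integer vector with min = 1,
#    one entry per row of the data
sex <- factor(rep(c("f", "m"), each = 5))
grp_manifest <- as.integer(sex)
```

In an actual call, the character modes ("median", "mean", "random", "random.k") let the function do the splitting itself; only the third option requires a user-supplied vector such as sex above.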

Details

The data are split into two subsamples, and item thresholds, the item parameter (sigma) and their standard errors (SE) according to the PCM (or the RM in the case of dichotomous items) are calculated for each subsample. This function internally calls the function pairSE. Additional arguments for parameter calculation (see description of function pairSE) are passed through. This item fit statistic is also (perhaps misleadingly) named 'Wald test' in other R packages. The S-statistic, as implemented in pairwise, is defined according to Fischer and Scheiblechner (1970) by the following equation; see also equation (3) in van den Wollenberg (1982), p. 124:

S_i = \frac{\hat{\sigma}_i^{(1)} - \hat{\sigma}_i^{(2)}}{\sqrt{\left( S_{\hat{\sigma}_i}^{(1)} \right)^2 + \left( S_{\hat{\sigma}_i}^{(2)} \right)^2}}

where \hat{\sigma}_i^{(1)} is the estimate of the item parameter in subsample 1, \hat{\sigma}_i^{(2)} is the estimate of the item parameter in subsample 2, and S_{\hat{\sigma}_i}^{(1)} and S_{\hat{\sigma}_i}^{(2)} are the respective standard errors. In Fischer (1974), p. 297, the resulting test statistic (as defined above) is labeled Z_i, as it is asymptotically normally distributed. This is in contrast to the 'Wald-type' test statistic W_i, which was derived by Glas and Verhelst (1995) from the (general) \chi^2 distributed test of statistical hypotheses concerning several parameters introduced by Wald (1943).
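As a worked illustration of the equation above, the statistic can be computed by hand in base R. The parameter estimates and standard errors below are made-up numbers, not package output, and the two-sided p-value is only one plausible way of using the asymptotic normality mentioned above.

```r
# Hypothetical item parameter estimates and standard errors for three items
sigma_1 <- c(-0.50, 0.10, 0.80)  # estimates in subsample 1
sigma_2 <- c(-0.20, 0.10, 0.20)  # estimates in subsample 2
se_1    <- c(0.30, 0.20, 0.30)   # standard errors in subsample 1
se_2    <- c(0.40, 0.20, 0.40)   # standard errors in subsample 2

# S_i: difference of the estimates divided by the root of the summed
# squared standard errors
S <- (sigma_1 - sigma_2) / sqrt(se_1^2 + se_2^2)

# Assumed usage: two-sided p-values from the standard normal distribution
p <- 2 * pnorm(-abs(S))

round(S, 2)  # -0.6, 0.0, 1.2 for these made-up numbers
```

Items with identical estimates in both subsamples (the second item here) yield S = 0, while larger differences relative to the combined standard error move S away from zero.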

Value

A (list) object of class "pairS" containing the test statistic and item difficulty parameter sigma and their standard errors for the two or more subsamples.

A note on standard errors

Estimation of standard errors is done by repeated calculation of item parameters for subsamples of the given data. This procedure is mainly controlled by the arguments nsample and size (see arguments in pairSE). With regard to calculation time, the argument nsample is the 'time killer'. On the other hand, the estimation of standard errors will not necessarily improve when choosing large values for nsample. For example, choosing nsample = 400 will change the standard error estimates only minimally in comparison to the default setting nsample = 30 (see examples).
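The general idea of subsample-based standard errors can be sketched in base R as follows. This is not the pairSE implementation: the estimator est() is a cheap placeholder, taking the standard deviation over subsample estimates is an assumption, and the helper sub_se() merely borrows the argument names nsample and size, whose exact meaning is documented in pairSE.

```r
set.seed(42)  # arbitrary seed for reproducibility of the sketch

# Hypothetical responses of 500 persons to one dichotomous item
x <- rbinom(500, size = 1, prob = 0.6)

# Placeholder "item parameter": negative logit of the proportion solved.
# This is NOT the pairwise estimator.
est <- function(v) -qlogis(mean(v))

# Assumed scheme: draw nsample subsamples of a given size (without
# replacement) and use the standard deviation of the subsample estimates.
sub_se <- function(v, nsample, size) {
  ests <- replicate(nsample, est(sample(v, size = size)))
  sd(ests)
}

se_30  <- sub_se(x, nsample = 30,  size = 250)
se_400 <- sub_se(x, nsample = 400, size = 250)
c(nsample_30 = se_30, nsample_400 = se_400)
```

Comparing the two values for different seeds gives a feeling for how much (or how little) a larger nsample changes such an estimate, at roughly thirteen times the computational cost.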

References

description of function pairSE{pairwise}.

Fischer, G. H., & Scheiblechner, H. (1970). Algorithmen und Programme für das probabilistische Testmodell von Rasch. Psychologische Beiträge, 12, 23–51.

van den Wollenberg, A. (1982). Two new test statistics for the Rasch model. Psychometrika, 47(2), 123–140. https://doi.org/10.1007/BF02296270

Glas, C. A. W., & Verhelst, N. D. (1995). Testing the Rasch Model. In G. Fischer & I. Molenaar (Eds.), Rasch models: Foundations, recent developments, and applications. New York: Springer.

Wald, A. (1943). Tests of statistical hypotheses concerning several parameters when the number of observations is large. Transactions of the American Mathematical Society, 54(3), 426–482. https://doi.org/10.1090/S0002-9947-1943-0012401-3

Fischer, G. H. (1974). Einführung in die Theorie psychologischer Tests. Bern: Huber.

Examples

##########
data("kft5")
S_ran_kft <- pairwise.S(daten = kft5, m = 2, split = "random")
summary(S_ran_kft)
summary(S_ran_kft, thres = FALSE)
#### polytomous examples
data(bfiN)    # loading example data set
data(bfi_cov) # loading covariates to bfiN data set

# calculating item parameters and SE for two subsamples by gender
S_gen <- pairwise.S(daten = bfiN, split = bfi_cov$gender)
summary(S_gen)
summary(S_gen, thres = FALSE)

# other splitting criteria
## Not run: 
S_med <- pairwise.S(daten = bfiN, split = "median")
summary(S_med)

S_ran <- pairwise.S(daten = bfiN, split = "random")
summary(S_ran)

S_ran.4 <- pairwise.S(daten = bfiN, split = "random.4")
summary(S_ran.4) # currently not displayed

###### example from section 'A note on standard errors' ########
# default number of subsamples (nsample = 30)
S_def <- pairwise.S(daten = bfiN, split = "random", splitseed = 13)
summary(S_def)
######
# same split, but with nsample = 400 (considerably slower)
S_400 <- pairwise.S(daten = bfiN, split = "random", splitseed = 13, nsample = 400)
summary(S_400)

## End(Not run)


[Package pairwise version 0.6.1-0 Index]