response_set {irt} R Documentation

Create a Response_set-class object

Description

This function creates a Response_set-class object from various types of data sets. The supported input formats are described under the x and data_format arguments.

Usage

response_set(
  x,
  data_format = "wide",
  ip = NULL,
  examinee_id_var = NULL,
  testlet_id_var = NULL,
  item_id_var = NULL,
  score_var = NULL,
  raw_response_var = NULL,
  order_var = NULL,
  response_time_var = NULL,
  misc_var = NULL,
  misc_unique_var = NULL,
  misc = NULL,
  fill_na_score = NULL
)

Arguments

x

A matrix or data.frame holding item scores. See the data_format argument for the supported layouts. Alternatively, x can be a list of Response-class objects.

data_format

A string specifying the format of the data supplied in x. The default value is "wide". The following options are available:

"wide"

x is a matrix or data.frame in wide format, where rows represent examinees and columns represent items. Each row will be converted to a Response-class object.

If the columns have names (and an Itempool-class object has not been supplied), the column names will be used as item_ids. If neither column names nor an Itempool-class object is supplied, default item_ids will be assigned.

If the rows have names, those will be used as examinee_ids.

"long"

x is a data.frame in long format with at least three columns: (1) a column for examinee_id, (2) a column for item_id, and (3) a column for either scores or raw_responses. Additional columns, such as testlet_id, item order, and response_time, can be included.
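The relationship between the two layouts can be sketched in base R. This is only an illustration of the data shapes described above, not code from the package; the object names (x_wide, x_long, scr) are chosen for the example, and the final call is left as a comment because it requires the irt package.

```r
# Wide format: rows are examinees, columns are items (NA = not administered).
x_wide <- matrix(c(0, 1, NA,
                   1, 0, 1),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(c("stu1", "stu2"), c("i1", "i2", "i3")))

# Long format: one row per non-missing response.
idx <- which(!is.na(x_wide), arr.ind = TRUE)
x_long <- data.frame(examinee_id = rownames(x_wide)[idx[, "row"]],
                     item_id = colnames(x_wide)[idx[, "col"]],
                     scr = x_wide[idx],
                     stringsAsFactors = FALSE)
x_long <- x_long[order(x_long$examinee_id, x_long$item_id), ]
rownames(x_long) <- NULL
x_long

# With the irt package loaded, either layout could then be passed on, e.g.:
# response_set(x_long, data_format = "long", examinee_id_var = "examinee_id",
#              item_id_var = "item_id", score_var = "scr")
```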

ip

An optional Itempool-class object holding the item parameters. If supplied, it is used to check whether the created Response_set object is compatible with the item pool.

examinee_id_var

A string for the column name that holds examinee ids, if x is in long format.

testlet_id_var

A string for the column name that holds testlet ids, if x is in long format.

item_id_var

A string for the column name that holds item ids, if x is in long format.

score_var

A string for the column name that holds examinee scores, if x is in long format.

raw_response_var

A string for the column name that holds raw responses of the examinees, if x is in long format.

order_var

A string for the column name that holds the administration order of items, if x is in long format.

response_time_var

A string for the column name that holds response time information of the items, if x is in long format.

misc_var

A character vector of column names that hold miscellaneous item-level information. Available only when x is in long format. If the data set contains additional information for each item within an examinee (for example, the item's type, the item's reading level, the examinee's raw response to the item, whether the item is operational, the date/time the item was administered, or ratings from multiple raters), it can be passed through this argument. This information can later be extracted with the $ operator. See examples.

misc_unique_var

A character vector of column names that hold miscellaneous examinee-level information. Unlike misc_var, these columns are assumed to be constant within an examinee, so only the unique value of each column within an examinee is saved. Examples include the examinee's gender, race, ability score, or school, which do not vary from one item to another. This argument is available only when data_format = "long".
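Whether a column qualifies for misc_unique_var rather than misc_var can be checked before calling the function. The following base R sketch uses a hypothetical helper, is_constant_within(), which is not part of the irt package; it only tests the "same within an examinee" assumption stated above.

```r
x_long <- data.frame(examinee_id = c("stu1", "stu1", "stu1", "stu2", "stu2"),
                     item_type = c("MC", "MC", "MS", "SA", "MC"),
                     ability = c(1.1, 1.1, 1.1, -.2, -.2),
                     grade = c("7", "7", "7", "11", "11"))

# Hypothetical helper: TRUE if `values` takes a single value within each group.
is_constant_within <- function(values, group) {
  all(tapply(values, group, function(v) length(unique(v)) == 1))
}

is_constant_within(x_long$ability, x_long$examinee_id)    # TRUE
is_constant_within(x_long$grade, x_long$examinee_id)      # TRUE
is_constant_within(x_long$item_type, x_long$examinee_id)  # FALSE
```

Columns for which the check returns FALSE vary from item to item and are candidates for misc_var instead.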

misc

A list of miscellaneous variables that need to be added to the Response_set object.

fill_na_score

If some examinees did not answer all items, the scores of the unanswered items will be replaced by the value of fill_na_score. If ip is provided, 'all items' means all of the items in the item pool. Otherwise, it means all unique item_id values in the data.

Currently, this feature only works when x is a data frame or matrix.
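The effect described for fill_na_score can be illustrated with a small base R sketch. It is not the package's implementation; it assumes no ip is supplied, so 'all items' is taken to be the unique item_id values, and fill_value stands in for fill_na_score.

```r
x_long <- data.frame(examinee_id = c("stu1", "stu1", "stu1", "stu2", "stu2"),
                     item_id = c("i1", "i2", "i4", "i1", "i2"),
                     scr = c(0, 1, 0, 1, 0))

fill_value <- 0                          # plays the role of fill_na_score
all_items <- unique(x_long$item_id)      # "all items" when no ip is given
examinees <- unique(x_long$examinee_id)

# Start with every cell set to the fill value, then write the observed scores.
scores <- matrix(fill_value, nrow = length(examinees), ncol = length(all_items),
                 dimnames = list(examinees, all_items))
scores[cbind(x_long$examinee_id, x_long$item_id)] <- x_long$scr
scores
```

Here stu2 never saw item i4, so that cell keeps the fill value, while every observed score is left unchanged.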

Value

A Response_set-class object.

Author(s)

Emre Gonulates

Examples

##### Wide format data #####
## Example 1
x_wide <- matrix(sample(0:1, 35, TRUE), nrow = 7, ncol = 5)
response_set(x_wide)

## Example 2
ip <- generate_ip(n = 6)
# simulate responses for 10 examinees
resp_matrix <- sim_resp(ip = ip, theta = rnorm(10), prop_missing = .2,
                        output = "matrix")
# convert it to a data frame
resp_wide <- as.data.frame(resp_matrix)
resp_wide$stu_id <- rownames(resp_matrix)
# Create a Response_set object:
resp_set <- response_set(resp_wide, data_format = "wide", ip = ip,
                         examinee_id_var = "stu_id")
# Retrieve examinee ids:
resp_set$examinee_id
# Fourth examinee:
resp_set[[4]]
# Scores of 6th examinee
resp_set[[6]]$score


##### Long format data #####
x_long <- data.frame(examinee_id = c("stu1", "stu1", "stu1", "stu2", "stu2"),
                     item_id = c("i1", "i2", "i4", "i1", "i2"),
                     scr = c(0, 1, 0, 1, 0),
                     rwscore = c("A", "D", "B", "C", "D"),
                     resptime = c(33, 55, 22, 66, 31),
                     # These will be passed to misc
                     item_type = c("MC", "MC", "MS", "SA", "MC"),
                     lexile_level = c(1, 4, 3, 2, 1),
                     word_count = c(123, 442, 552, 342, 666),
                     ability = c(1.1, 1.1, 1.1, -.2, -.2),
                     grade = c("7", "7", "7", "11", "11")
                     )

resp_set <- response_set(x = x_long,
                         data_format = "long",
                         examinee_id_var = "examinee_id",
                         item_id_var = "item_id",
                         score_var = "scr",
                         raw_response_var = "rwscore",
                         response_time_var = "resptime",
                         misc_var = c("item_type", "lexile_level"),
                         misc_unique_var = c("ability", "grade")
                         )

resp_set[[1]]  # Response of the first examinee
resp_set$item_type  # extract item_type of each examinee
resp_set$grade  # extract grade of each examinee

# Also, additional examinee level miscellaneous information can be added:
resp_set$gender <- c("M", "F")
resp_set[[2]]$gender  # access second examinee's gender.
resp_set$gender

# Fill missing values with 0.
response_set(x = x_long,
             data_format = "long",
             examinee_id_var = "examinee_id",
             item_id_var = "item_id",
             score_var = "scr",
             raw_response_var = "rwscore",
             response_time_var = "resptime",
             misc_var = c("item_type", "lexile_level"),
             fill_na_score = 0
             )

[Package irt version 0.2.9 Index]