PISA {equate} | R Documentation |
Programme for International Student Assessment 2009 USA Data
Description
This dataset contains scored cognitive item response data from the 2009 administration of the Programme for International Student Assessment (PISA), an international study of education systems. The data, and license under which they are released, are available online at https://www.oecd.org/pisa/.
Usage
PISA
Format
PISA is a list containing four elements. The first, PISA$students, is a data.frame containing 233 variables across 5233 individuals, with one row per individual. All but one variable come from the USA PISA data file "INT_COG09_S_DEC11.txt". The remaining variable, language spoken at home, has been merged in from the student questionnaire file "INT_STQ09_DEC11.txt". Variable names match those found in the original files:
- stidstd
Unique student ID (one for each of the 5233 cases);
- schoolid
School ID (there are 165 different schools);
- bookid
ID for the test booklet given to a particular student, of which there were 13;
- langn
Student-reported language spoken at home, with 4466 students reporting English (code 313), 484 students reporting Spanish (code 156), and 185 students reporting "another language" (code 859);
- m033q01 to s527q04t
Scored item-response data across the 189 items included in the general cognitive assessment, described below; and
- pv1math to pv5read5
PISA scale scores, referred to in the PISA technical documentation as "plausible values".
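The language codes listed above can be turned into readable labels before tabulating. The following is a minimal sketch that uses a small invented stand-in for PISA$students rather than the real data, so the counts it prints are not the PISA counts. The object names students, lang_labels, and lang_counts are hypothetical.

```r
# Toy stand-in for PISA$students; all values are invented for illustration.
students <- data.frame(
  stidstd = 1:6,
  bookid = c(1, 1, 2, 2, 3, 3),
  langn = c(313, 313, 156, 859, 313, 156)
)

# Code-to-label mapping taken from the variable description above.
lang_labels <- c("313" = "English", "156" = "Spanish",
                 "859" = "Another language")

# as.character() lets the lookup work whether langn is stored as numeric
# or character (the storage type in the real data is not assumed here).
students$language <- factor(
  unname(lang_labels[as.character(students$langn)]),
  levels = unname(lang_labels)
)

lang_counts <- table(students$language)
print(lang_counts)
```

With the equate package installed, the same steps could presumably be applied by replacing the toy data.frame with PISA$students.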
Next, PISA$booklets is a data.frame containing 4 columns and 756 rows and describes the 13 general cognitive assessment booklets. Variables include:
- bookid
The test booklet ID, as in PISA$students;
- clusterid
ID for the cluster or item subset in which an item was placed; items were fully nested within clusters; however, each item cluster appeared in four different test booklets;
- itemid
Item ID, matching the columns of PISA$students; each item appears in PISA$booklets four times, once for each booklet; and
- order
The order in which the cluster was presented within a given booklet.
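The two structural properties described above (each item appearing four times, and items being nested within a single cluster) can be checked directly. This is a sketch on an invented four-booklet layout; the item IDs m001 and m002 and the object names are hypothetical, not taken from the real data.

```r
# Toy stand-in for PISA$booklets: one cluster of two items placed in
# four booklets at different positions. Values are invented.
booklets <- data.frame(
  bookid = rep(1:4, each = 2),
  clusterid = "m1",
  itemid = rep(c("m001", "m002"), times = 4),
  order = rep(c(1, 2, 3, 4), each = 2),
  stringsAsFactors = FALSE
)

# Each item should appear once per booklet that contains its cluster.
item_counts <- table(booklets$itemid)
all_four <- all(item_counts == 4)

# Nesting: every item should belong to exactly one cluster.
clusters_per_item <- tapply(booklets$clusterid, booklets$itemid,
                            function(x) length(unique(x)))
fully_nested <- all(clusters_per_item == 1)

print(c(all_four = all_four, fully_nested = fully_nested))
```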
PISA$items is a data.frame containing 4 columns and 189 rows, with one row per item. Variables include:
- itemid
Item ID, as in PISA$booklets;
- clusterid
Cluster ID, as in PISA$booklets;
- max
Maximum possible score value, either 1 or 2 points, with dichotomous scoring (max of 1) used for the majority of items;
- subject
The subject of an item, equivalent to the first character in itemid and clusterid;
- format
Item format, abbreviated as mc for multiple choice, cmc for complex multiple choice, ocr for open constructed response, and ccr for closed constructed response; and
- noptions
Number of options, zero except for some multiple choice items.
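The relationship between subject and the ID variables can be confirmed with substr(). The sketch below runs on three invented rows shaped like PISA$items; the IDs and values are hypothetical, and tolower() is applied only as a precaution because the letter case used in the real data is not assumed here.

```r
# Toy stand-in for PISA$items; all rows are invented for illustration.
items <- data.frame(
  itemid = c("m001", "r001", "s001"),
  clusterid = c("m1", "r1", "s1"),
  max = c(1, 2, 1),
  subject = c("m", "r", "s"),
  format = c("mc", "ocr", "cmc"),
  noptions = c(4, 0, 0),
  stringsAsFactors = FALSE
)

# subject should equal the first character of both itemid and clusterid.
first_item <- tolower(substr(items$itemid, 1, 1))
first_cluster <- tolower(substr(items$clusterid, 1, 1))
subject_ok <- first_item == tolower(items$subject) &
  first_cluster == tolower(items$subject)

# Cross-tabulate item format by subject.
format_by_subject <- table(items$subject, items$format)
print(format_by_subject)
```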
Finally, PISA$totals is a list of 13 data.frames, one per booklet, where the columns correspond to total scores for all students on each cluster for the corresponding booklet. These total scores were calculated using PISA$students and PISA$booklets. Elements within the PISA$totals list are named by booklet, and the columns in the data.frame are named by cluster. For example, PISA$totals$b1$m1 contains the total scores on cluster M1 for students taking booklet 1.
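To illustrate how cluster totals of this kind can be derived from a student table and a booklet table, here is a minimal sketch on invented data. It is not the code used to build PISA$totals. The function cluster_totals() is hypothetical, and counting missing responses as zero (na.rm = TRUE) is an assumption made for the example.

```r
# Hypothetical helper: total score per cluster for students who took
# one booklet. Missing responses add nothing to the total (assumption).
cluster_totals <- function(students, booklets, book) {
  in_book <- booklets[booklets$bookid == book, ]
  in_book <- in_book[order(in_book$order), ]
  takers <- students[students$bookid == book, ]
  clusters <- as.character(unique(in_book$clusterid))
  totals <- lapply(clusters, function(cl) {
    cols <- as.character(in_book$itemid[in_book$clusterid == cl])
    unname(rowSums(takers[, cols, drop = FALSE], na.rm = TRUE))
  })
  names(totals) <- clusters
  data.frame(totals, check.names = FALSE)
}

# Toy data shaped like PISA$students and PISA$booklets; values invented.
students <- data.frame(
  stidstd = 1:4,
  bookid = c(1, 1, 2, 2),
  m001 = c(1, 0, 1, 1),
  m002 = c(2, 1, 0, NA),
  r001 = c(0, 1, 1, 1)
)
booklets <- data.frame(
  bookid = c(1, 1, 1, 2, 2, 2),
  clusterid = c("m1", "m1", "r1", "r1", "m1", "m1"),
  itemid = c("m001", "m002", "r001", "r001", "m001", "m002"),
  order = c(1, 1, 2, 1, 2, 2),
  stringsAsFactors = FALSE
)

totals_b1 <- cluster_totals(students, booklets, book = 1)
totals_b2 <- cluster_totals(students, booklets, book = 2)
print(totals_b1)
```

Columns come out in presentation order within the booklet, so totals_b2 lists r1 before m1 in this toy layout.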
Source
OECD (2012). PISA 2009 Technical Report, PISA, OECD Publishing. http://dx.doi.org/10.1787/9789264167872-en
Additional information can be found at the PISA website: https://www.oecd.org/pisa/