| sweets {MM} | R Documentation |
Synthetic dataset due to Hankin
Description
Four objects:
sweets is a \(2\times 3\times 21\) array
sweets_tally is a length 37 vector
sweets_array is a \(2\times 3\times 37\) array
sweets_table is a \(37\times 6\) matrix
Usage
data(sweets)
Details
Object sweets is the raw dataset; objects sweets_table
and sweets_tally are processed versions which are easier to
analyze.
The father of a certain family brings home nine sweets of type
mm and nine sweets of type jb each day for 21 days to
his children, AMH, ZJH, and AGH.
The children share the sweets amongst themselves in such a way that each child receives exactly 6 sweets.
Array sweets has dimension c(2,3,21): 2 types of sweets, 3 children, and 21 days. Thus sweets[,,1] shows that on the first day, AMH chose 0 sweets of type mm and 6 sweets of type jb; child ZJH chose 3 of each, and child AGH chose 6 sweets of type mm and 0 sweets of type jb. Observe the constant marginal totals: each child has the same overall number of sweets, and there is a fixed number of each kind of sweet.
Array sweets_array has dimension c(2,3,37): 2 sweets, 3 children, and 37 possible ways of arranging a matrix with the specified marginal totals. This can be produced by allboards() of the aylmer package.
sweets_table is a dataframe with six columns, one for each combination of child and sweet, and 37 rows, each row showing a permissible arrangement. All possibilities are present. The six entries of sweets[,,1] correspond to the six elements of sweets_table[1,]; the column names are mnemonics.
sweets_tally shows how often each of the arrangements in sweets_table was observed (that is, it is a table of the 21 observations in sweets).
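The marginal constraints and the count of 37 arrangements can be checked in base R without loading the data. The following is a minimal sketch, not the method used to build the package objects: the names day1, mm, jb and boards are illustrative, the row and column order of day1 is an assumption taken from the description of sweets[,,1] above, and the order of the enumerated arrangements need not match that of sweets_array or allboards().

```r
# Day 1 as described in the text: rows are sweet types, columns are children.
# (Assumed layout; matrix() fills column by column, one child per column.)
day1 <- matrix(c(0, 6, 3, 3, 6, 0), nrow = 2,
               dimnames = list(c("mm", "jb"), c("AMH", "ZJH", "AGH")))
rowSums(day1)  # nine sweets of each type
colSums(day1)  # six sweets per child

# Enumerate every split of the nine mm sweets among the three children,
# each child taking between 0 and 6 of them.
mm <- expand.grid(AMH = 0:6, ZJH = 0:6, AGH = 0:6)
mm <- mm[rowSums(mm) == 9, ]

# The jb counts are then forced, because each child has six sweets in total.
jb <- 6L - mm

# Stack into a 2 x 3 x n array, one slice per permissible arrangement.
n <- nrow(mm)
boards <- array(NA_integer_, dim = c(2, 3, n),
                dimnames = list(c("mm", "jb"), names(mm), NULL))
boards["mm", , ] <- t(as.matrix(mm))
boards["jb", , ] <- t(as.matrix(jb))
dim(boards)
```

The third extent of boards should agree with the 37 arrangements stated above, since fixing the mm counts determines the whole matrix.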
Source
The Hankin family
Examples
data(sweets)
# show correspondence between sweets_table and sweets_tally:
cbind(sweets_table, sweets_tally)
# Sum the data, by sweet and child and test:
fisher.test(apply(sweets,1:2,sum))
# Not significant!
# Now test for overdispersion.
# First set up the regressors:
jj1 <- apply(sweets_array,3,tcrossprod)
jj2 <- apply(sweets_array,3, crossprod)
dim(jj1) <- c(2,2,37)
dim(jj2) <- c(3,3,37)
theta_xy <- jj1[1,2,]
phi_ab <- jj2[1,2,]
phi_ac <- jj2[1,3,]
phi_bc <- jj2[2,3,]
# Now the offset:
Off <- apply(sweets_array,3,function(x){-sum(lfactorial(x))})
# Now the formula:
f <- formula(sweets_tally~ -1 + theta_xy + phi_ab + phi_ac + phi_bc)
# Now the Lindsey Poisson device:
out <- glm(formula=f, offset=Off, family=poisson)
summary(out)
# See how the residual deviance is comparable with the degrees of freedom
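To see what the regressors and the offset in the example measure, they can be computed for a single arrangement. The sketch below uses the first day's allocation as described in Details; the object names with the _day1 suffix are illustrative only, and the row and column order of day1 is an assumption.

```r
# Day 1 allocation: rows are sweet types (mm, jb), columns are children.
day1 <- matrix(c(0, 6, 3, 3, 6, 0), nrow = 2,
               dimnames = list(c("mm", "jb"), c("AMH", "ZJH", "AGH")))

# tcrossprod(x) is x %*% t(x): its off-diagonal entry sums, over children,
# the product of that child's mm count and jb count.
theta_xy_day1 <- tcrossprod(day1)[1, 2]

# crossprod(x) is t(x) %*% x: each off-diagonal entry sums, over sweet types,
# the product of the counts held by a pair of children.
cp <- crossprod(day1)
phi_ab_day1 <- cp[1, 2]
phi_ac_day1 <- cp[1, 3]
phi_bc_day1 <- cp[2, 3]

# Offset for this arrangement: minus the sum of log-factorials of the entries.
off_day1 <- -sum(lfactorial(day1))

c(theta_xy = theta_xy_day1, phi_ab = phi_ab_day1,
  phi_ac = phi_ac_day1, phi_bc = phi_bc_day1)
# theta_xy = 9, phi_ab = 18, phi_ac = 0, phi_bc = 18
```

In the example above, the same quantities are computed for all 37 slices of sweets_array at once by applying crossprod and tcrossprod over the third dimension.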