umx_long2wide {umx} | R Documentation |
Take a long twin-data file and make it wide (one family per row)
Description
umx_long2wide merges on famID. Family members are ordered by twinID.
twinID is equivalent to birth order. Up to 10 twinIDs are allowed (family order).
Note: Not all data sets have an order column, but it is essential to rank subjects correctly.
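As a conceptual illustration only, the merge-on-famID idea can be sketched in base R. This is not the implementation of umx_long2wide; the toy data frame and its column names (fam, twinID, zygosity, bmi) are assumptions made up for the sketch.

```r
# Toy long-format data: one row per person (illustrative names and values)
long = data.frame(
  fam      = c(1, 1, 2, 2),
  twinID   = c(1, 2, 1, 2),
  zygosity = c("MZFF", "MZFF", "DZMM", "DZMM"),
  bmi      = c(21.0, 22.5, 24.1, 23.3)
)

# Take each twin's rows, suffix the twin variable with _T<twinID>,
# then merge the pieces on the family identifier
t1 = long[long$twinID == 1, c("fam", "zygosity", "bmi")]
t2 = long[long$twinID == 2, c("fam", "bmi")]
names(t1)[names(t1) == "bmi"] = "bmi_T1"
names(t2)[names(t2) == "bmi"] = "bmi_T2"

# all = TRUE keeps families in which one twin is missing
wide = merge(t1, t2, by = "fam", all = TRUE)
print(wide)
```

Each family now occupies one row, with the twin order fixed by twinID.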
You might start off with a TWID, which is a concatenation of a familyID and a 2-digit twinID.
Generating famID and twinID as used by this function
You can capture the last 2 digits with the mod function: twinID = df$TWID %% 100
You can drop the last 2 digits with integer division: famID = df$TWID %/% 100
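A minimal sketch of this split, assuming a hypothetical data frame df whose TWID column holds a family number followed by a 2-digit twin number (the values are made up):

```r
# Hypothetical IDs: family 101 (twins 01 and 02), family 102 (twin 01 and sib 50)
df = data.frame(TWID = c(10101, 10102, 10201, 10250))

df$twinID = df$TWID %% 100   # remainder after dividing by 100: last 2 digits
df$famID  = df$TWID %/% 100  # integer division by 100: everything before them

print(df)
```

The resulting famID and twinID columns can then be named in the famID and twinID arguments.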
Note: The function assumes that if zygosity or any passalong variable is NA in the first family member, it is NA everywhere, i.e., it does not hunt for values present in other family members to self-heal missing data.
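If your data may have zygosity missing for the first family member but present for another, one option is to fill it in within each family before reshaping. The sketch below uses base R only; the helper first_non_na and the toy data are assumptions for illustration, not part of umx. Note that it overwrites every member's value with the family's first non-missing value, so check for within-family conflicts first.

```r
# Toy data: zygosity is missing for twin 1 of family 1 only
long = data.frame(
  fam      = c(1, 1, 2, 2),
  twinID   = c(1, 2, 1, 2),
  zygosity = c(NA, "MZFF", "DZMM", "DZMM"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: repeat the first non-missing value across the group
first_non_na = function(x) {
  known = x[!is.na(x)]
  if (length(known) == 0) {
    return(rep(NA_character_, length(x)))
  }
  rep(known[1], length(x))
}

# Apply the helper within each family
long$zygosity = ave(long$zygosity, long$fam, FUN = first_non_na)
print(long)
```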
Usage
umx_long2wide(
data,
famID = NA,
twinID = NA,
zygosity = NA,
vars2keep = NA,
passalong = NA,
twinIDs2keep = NA
)
Arguments
data |
The original (long-format) data file |
famID |
The unique identifier for members of a family |
twinID |
The twinID. Typically 1, 2, 50, 51, etc. |
zygosity |
Typically MZFF, DZFF, MZMM, DZMM, DZOS |
vars2keep |
The variables you wish to analyse (these will be renamed by appending paste0("_T", twinID), e.g. bmi becomes bmi_T1) |
passalong |
Variables you wish to pass through (keep, even though they are not twin variables) |
twinIDs2keep |
If NA (the default), all twinIDs are kept; otherwise only those listed here. Useful for dropping sibs. |
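The naming convention for vars2keep can be previewed in base R. This is only a sketch of the suffix pattern, assuming two twinIDs (1 and 2); it does not call umx.

```r
vars2keep = c("bmi", "wt")
twinIDs   = 1:2

# For each twinID, append "_T<twinID>" to every kept variable
wide_names = unlist(lapply(twinIDs, function(id) paste0(vars2keep, "_T", id)))
print(wide_names)
# "bmi_T1" "wt_T1" "bmi_T2" "wt_T2"
```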
Value
A data frame in wide format (one family per row)
See Also
Other Twin Data functions: umx_make_TwinData(), umx_make_twin_data_nice(), umx_residualize(), umx_scale_wide_twin_data(), umx_wide2long(), umx
Examples
## Not run:
# ==============================================
# = First make a long format file for the demo =
# ==============================================
data(twinData)
tmp = twinData[, -2]
tmp$twinID1 = 1; tmp$twinID2 = 2
long = umx_wide2long(data = tmp, sep = "")
str(long)
# 'data.frame': 7616 obs. of 11 variables:
# $ fam : int 1 2 3 4 5 6 7 8 9 10 ...
# $ zyg : int 1 1 1 1 1 1 1 1 1 1 ...
# $ part : int 2 2 2 2 2 2 2 2 2 2 ...
# $ cohort : chr "younger" "younger" "younger" "younger" ...
# $ zygosity: Factor w/ 5 levels "MZFF","MZMM",..: 1 1 1 1 1 1 1 1 1 1 ...
# $ wt : int 58 54 55 66 50 60 65 40 60 76 ...
# $ ht : num 1.7 1.63 1.65 1.57 1.61 ...
# $ htwt : num 20.1 20.3 20.2 26.8 19.3 ...
# $ bmi : num 21 21.1 21 23 20.7 ...
# $ age : int 21 24 21 21 19 26 23 29 24 28 ...
# $ twinID : num 1 1 1 1 1 1 1 1 1 1 ...
# OK. Now to demo long2wide...
# Keeping all columns
wide = umx_long2wide(data= long, famID= "fam", twinID= "twinID", zygosity= "zygosity")
namez(wide) # some vars, like part, should have been passed along instead of made into "part_T1"
# ======================================
# = Demo requesting specific vars2keep =
# ======================================
# Just keep bmi and wt
wide = umx_long2wide(data= long, famID= "fam", twinID= "twinID",
	zygosity = "zygosity", vars2keep = c("bmi", "wt")
)
namez(wide)
# "fam" "twinID" "zygosity" "bmi_T1" "wt_T1" "bmi_T2" "wt_T2"
# ==================
# = Demo passalong =
# ==================
# Keep bmi and wt, and pass through 'cohort'
wide = umx_long2wide(data= long, famID= "fam", twinID= "twinID", zygosity= "zygosity",
	vars2keep = c("bmi", "wt"), passalong = "cohort"
)
namez(wide)
## End(Not run)