prepareCGPairedDifferenceData {cg} | R Documentation |
Prepare data object from a data frame for Paired Samples evaluations
Description
The function prepareCGPairedDifferenceData reads in a data frame and
settings in order to create a cgPairedDifferenceData object. The created
object is designed to have exploratory and fit methods applied to it.
Usage
prepareCGPairedDifferenceData(dfr, format = "listed", analysisname = "",
                              endptname = "", endptunits = "", logscale = TRUE,
                              zeroscore = NULL, addconstant = NULL, digits = NULL,
                              expunitname = "", refgrp = NULL, stamps = FALSE)
Arguments
dfr |
A valid data frame, see the |
format |
Default value of
|
analysisname |
Optional, a character text or
math-valid expression that will be set for
default use in graph title and table methods. The default
value is the empty |
endptname |
Optional, a character text or math-valid expression
that will be set for default use as the y-axis label of graph
methods, and also used for table methods. The default
value is the empty |
endptunits |
Optional, a character text or math-valid
expression that can be used in combination with the endptname
argument.
Parentheses are automatically added around this input, and the
result is appended to the end of the endptname character value
or expression. The default
value is the empty |
logscale |
Apply a log-transformation to the data for
evaluations. The default value is |
zeroscore |
Optional,
replace response values of zero with a derived or specified
numeric value, as an approach to overcome the presence of zeroes
when evaluating in the
logarithmic scale ( |
addconstant |
Optional,
add a numeric constant to all response values, as an
approach to overcome the presence of zeroes when evaluating in the
logarithmic scale |
digits |
Optional, for output display purposes in graphs
and table methods, values will be rounded to this numeric
value. Only the integers of 0, 1, 2, 3, and 4 are accepted. No
rounding is done during any calculations. The default value is
|
expunitname |
Optional, a character text
that will be set for default use as the experimental unit label of graph
methods, and also used for table methods. The default
value is the empty |
refgrp |
Optional, specify one of the factor levels to be the
“reference group”, such as a “control” group.
The default value is |
stamps |
Optional, specify a time stamp in graphs, along
with cg package
version identification. The default value is |
Details
- Input Data Frame
-
The input data frame dfr can be of the format "listed" or "groupcolumns".
If format="listed" for dfr is specified, then the input data frame must have three columns. The first column needs to be the experimental unit identifier, the second column needs to be the group identifier, and the third is the endpoint. The first column of the listed input data format needs to have two sets of distinct values, since it is the experimental unit identifier of response pairs. The second column of the listed input data format needs to have exactly 2 distinct values, since it is the group identifier.
If format="groupcolumns" for dfr is specified, then there can be two columns or three columns.
  - two columns
    The column headers specify the two paired group names. Each row contains the experimental unit of paired numeric values under those two groups. In the course of creating the cgPairedDifferenceData object, another column will be bound from the left and become the first column. Its column header is the value of expunitname if that is specified, and "expunit" if the default expunitname="" is used. A sequence of integers starting with 1 up to the number of pairs/rows will be generated to uniquely identify each experimental unit pair.
  - three columns
    The first column needs to be unique experimental unit identifiers of the paired numeric values in the second and third columns. The second and third column headers will be used to identify the two paired group names. Each row's second and third columns need to contain the experimental unit of paired numeric values under those two groups. The name of the first column will be assigned to the expunitname setting if expunitname is not explicitly specified to something other than its default expunitname="".
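The input layouts described above can be sketched with base R data frames. This is only an illustration: the column names (Subject, Before, After, Group, Value) and all values are hypothetical, and the checks simply restate the requirements listed for the "listed" format rather than reproduce the package's own validation.

```r
# Hypothetical paired data: 3 experimental units, each measured under 2 groups.
# Two-column "groupcolumns" layout: one column per group, one row per unit.
gc2 <- data.frame(Before = c(10.2, 11.5, 9.8),
                  After  = c(11.0, 12.1, 10.4))

# Three-column "groupcolumns" layout: unit identifier first, then the groups.
gc3 <- data.frame(Subject = c("s1", "s2", "s3"), gc2)

# Equivalent "listed" layout: unit identifier, group identifier, endpoint.
listed <- data.frame(
  Subject = rep(gc3$Subject, times = 2),
  Group   = rep(c("Before", "After"), each = nrow(gc3)),
  Value   = c(gc3$Before, gc3$After)
)

# Checks restating the documented requirements for the listed format:
# three columns, exactly 2 group values, and each unit appearing as a pair.
stopifnot(ncol(listed) == 3,
          length(unique(listed$Group)) == 2,
          all(table(listed$Subject) == 2))
```

Either gc2 or gc3 would be passed as dfr with format="groupcolumns", and listed with format="listed".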
As the evaluation data set is prepared for the cgPairedDifferenceData object, any experimental unit pairs/rows with missing values in the endpoint are flagged. This includes a check to make sure that each experimental unit identified has a complete pair of numeric observations.
- zeroscore
-
If zeroscore="estimate" is specified, a number close to zero is derived to replace all zeroes for subsequent log-scale analyses. A spline fit (using spline and method="natural") of the log of the response vector on the original response vector is performed. The zeroscore is then derived from the log-scale value of the spline curve at the original scale value of zero. This approach comes from the concept of arithmetic-logarithmic scaling discussed in Tukey, Ciminera, and Heyse (1985).
- addconstant
-
If addconstant="simple" is specified, a number is derived and added to all response values. The approach taken is from the "white" book on S (Chambers and Hastie, 1992), page 68. The range (max - min) of the response values is multiplied by 0.0001 to derive the number to add to all the response values.
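The zeroscore="estimate" and addconstant="simple" derivations described above can be sketched in base R as follows. This is a rough illustration, not the package's internal code. The response vector y is hypothetical, and two steps are assumptions: the spline is fit on the positive values only (since the log of zero is not finite), and the log-scale value at zero is back-transformed with exp to obtain the replacement value.

```r
# Hypothetical response vector that contains a zero.
y <- c(0, 0.5, 1, 2, 4, 8, 16)

# --- zeroscore = "estimate" (sketch) ---
# Fit a natural spline of log(response) on response, using positive values
# only (assumption), and read the curve at the original-scale value of zero.
pos <- y[y > 0]
fit0 <- spline(x = pos, y = log(pos), method = "natural", xout = 0)

# Back-transform the log-scale value to the original scale (assumption).
zeroscore <- exp(fit0$y)

# Replace zeroes with the derived score before any log-scale evaluation.
y.zs <- ifelse(y == 0, zeroscore, y)

# --- addconstant = "simple" (sketch) ---
# Range (max - min) of the response values multiplied by 0.0001,
# then added to every response value.
addconstant <- diff(range(y)) * 0.0001
y.ac <- y + addconstant
```

Both adjusted vectors, y.zs and y.ac, are strictly positive, so their logarithms are finite.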
Value
A cgPairedDifferenceData object is returned, with the following slots:
dfr |
The original input data frame that is the specified value of the
|
dfru |
Processed version of the input data frame, which will be used for the various evaluation methods. |
dfr.gcfmt |
A groupcolumns version of the input data frame with
an additional column of the differences between groups, where the
|
settings |
A list of properties associated with the data frame:
|
Note
Contact cg@billpikounis.net for bug reports, questions, concerns, and comments.
Author(s)
Bill Pikounis [aut, cre, cph], John Oleynick [aut], Eva Ye [ctb]
References
Tukey, J.W., Ciminera, J.L., and Heyse, J.F. (1985). "Testing the Statistical Certainty of a Response to Increasing Doses of a Drug," Biometrics, Volume 41, 295-301.
Chambers, J.M., and Hastie, T.J. (1992). Statistical Models in S. Chapman & Hall/CRC.
See Also
Examples
data(anorexiaFT)
anorexiaFT.data <- prepareCGPairedDifferenceData(anorexiaFT, format="groupcolumns",
analysisname="Anorexia FT",
endptname="Weight",
endptunits="lbs",
expunitname="Patient",
digits=1, logscale=TRUE)