KSperm {NU.Learning} | R Documentation |
Simulate a p-value for the significance of the Kolmogorov-Smirnov D-statistic from confirm().
Description
For a given confirm() output object, KSperm() simulates the NULL distribution of LTDs or LRCs resulting from Purely Random Clusterings of experimental units within the parent data.frame. This NULL distribution is discrete because Local Effect-Size estimates are TIED within clusters. The observed D-statistic from confirm() is compared with new NULL order statistics computed by KSperm(), again using stats::ks.test. When KSperm() is called immediately after confirm() and the seed value used in confirm() is known, both the simulated p-value and the additional NULL KS-D order statistics generated by KSperm() are reproducible.
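To illustrate why the distribution is discrete, here is a minimal base-R sketch. It does not call NU.Learning. The data frame `toy` and its columns `trt`, `y` and `clus` are made up for this example, and the LTD is taken to be the within-cluster mean outcome of treated units minus that of control units.

```r
# Toy data: 8 units, a binary treatment indicator and a numeric outcome.
# All names here (toy, trt, y, clus) are illustrative, not package objects.
toy <- data.frame(
  trt  = c(1, 0, 1, 0, 1, 0, 1, 0),
  y    = c(5, 3, 6, 2, 9, 4, 7, 7),
  clus = c(1, 1, 1, 1, 2, 2, 2, 2)
)

# Local Treatment Difference (LTD) per cluster:
# mean outcome of treated units minus mean outcome of control units.
ltd_by_cluster <- sapply(split(toy, toy$clus), function(d) {
  mean(d$y[d$trt == 1]) - mean(d$y[d$trt == 0])
})

# Every unit inherits the LTD of its cluster, so values are tied within clusters.
ltd_by_unit <- unname(ltd_by_cluster[as.character(toy$clus)])

# Number of distinct values across all 8 units (at most the number of clusters).
n_distinct <- length(unique(ltd_by_unit))
```

With 8 units in 2 clusters, the per-unit LTD vector can take at most 2 distinct values, which is the kind of heavily tied sample that the continuous-distribution theory behind ks.test() does not cover.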
Usage
KSperm(x, reps=100)
Arguments
x |
An output object from confirm(). |
reps |
The number of new NULL KS-D statistics to generate. Each experimental unit is used at most once within each full replication. No clusters will be empty, but some may be "uninformative". |
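The constraints on a single replication can be sketched in base R as follows. This is an assumption about one simple way to satisfy them (shuffling a label vector with fixed cluster sizes), not a description of the package internals. The names `n_units`, `K`, `trt`, `labels` and `informative` are hypothetical, and "uninformative" is read here as a cluster lacking either treated or control units.

```r
set.seed(42)  # arbitrary seed, only to make this sketch repeatable

n_units <- 20
K       <- 4
trt     <- rep(c(0, 1), length.out = n_units)  # hypothetical binary treatment

# Shuffle a label vector in which every cluster appears 5 times:
# each unit gets exactly one label and no cluster can be empty.
labels <- sample(rep(seq_len(K), length.out = n_units))

cluster_sizes <- table(labels)

# Assumed meaning of "uninformative": a cluster that lacks treated or
# control units, so no within-cluster treatment difference can be formed.
informative <- tapply(trt, labels, function(t) any(t == 1) && any(t == 0))
```

Inspecting `cluster_sizes` and `informative` after each shuffle shows that a random clustering can be non-empty everywhere and still contain clusters that contribute no local estimate.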
Details
The observed value of the Kolmogorov-Smirnov D-statistic from confirm() is used here, but its "p.value" from ks.test() is not, because it is badly biased downwards. This bias arises because the distribution of LTDs or LRCs across clusters is always discrete, due to TIED values within clusters that typically also vary in size. Thus, KSperm() generates "reps" additional, independent, NULL values of KS-D and saves their order statistics. Finally, KSperm() compares the observed KS-D from confirm() with its simulated NULL order statistics to estimate an appropriately "adjusted" p-value, pv.adj. Note that the simulated pv.adj estimate cannot be less than 1/reps.
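The final comparison can be sketched as a standard permutation p-value. The helper `perm_pvalue()` below is hypothetical and not part of NU.Learning. Its formula is an assumption chosen to be consistent with the 1/reps lower bound stated above, and the `Dvec` values are invented for illustration.

```r
# Hypothetical helper, not part of NU.Learning: estimate a permutation
# p-value from an observed statistic and a vector of simulated NULL values.
perm_pvalue <- function(obsD, Dvec) {
  reps <- length(Dvec)
  # Fraction of NULL statistics at least as extreme as the observed one ...
  frac <- sum(Dvec >= obsD) / reps
  # ... floored at 1/reps, the smallest p-value `reps` simulations can support.
  max(frac, 1 / reps)
}

# Ten made-up NULL D-statistics, stored as order statistics (sorted).
Dvec <- sort(c(0.12, 0.08, 0.31, 0.17, 0.22, 0.05, 0.27, 0.14, 0.19, 0.10))

p_moderate <- perm_pvalue(0.25, Dvec)  # two NULL values (0.27, 0.31) are >= 0.25
p_extreme  <- perm_pvalue(0.90, Dvec)  # none are >= 0.90, so the floor applies
```

Under this sketch, an observed D larger than every simulated value still yields 1/reps rather than zero, so resolving smaller p-values requires a larger "reps".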
Value
An output list object of class KSperm:
hiclus |
Hierarchical clustering object created by the designated method. |
dframe |
Name of data.frame containing X, t & Y variables. |
trtm |
Name of numerical treatment/exposure variable. |
yvar |
Name of numerical y-Outcome variable. |
Type |
1 ==> LTDs, otherwise LRCs. |
reps |
Number of overall Replications, each with the same number, K, of requested clusters. |
nclus |
Number of clusters requested. |
units |
Number of experimental units or patients. |
obsD |
Observed numerical value of KS D-statistic from confirm(). |
Dvec |
Vector of order statistics for simulated NULL KS D-statistics. |
pv.adj |
Simulated p-value adjusted for TIES within discrete LTD/LRC distributions. |
Author(s)
Bob Obenchain <wizbob@att.net>
References
Obenchain RL. (2010) Local Control Approach using JMP. Chapter 7 of Analysis of Observational Health Care Data using SAS, Cary, NC:SAS Press, pages 151-192.
Obenchain RL. (2019) NU.Learning_in_R.pdf http://localcontrolstatistics.org