drop_summary {dropout} | R Documentation |
Summarizing Dropouts in Surveys
Description
The drop_summary function provides a high-level summary of dropout occurrences in survey data.
It generates key statistics for understanding patterns of participant dropout across survey questions.
Usage
drop_summary(data, last_col = NULL, section_min = 3)
Arguments
data |
A dataframe or tibble containing the survey data. |
last_col |
The index position or column name of the last survey item. This is optional and is used when there are additional columns in the data frame that are not part of the survey questions you are interested in. |
section_min |
The minimum number of consecutive columns with missing values that counts as a section of missing values (n, defaults to 3). |
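As a rough illustration of the two ways last_col can be supplied (an index position or a column name), the sketch below trims a toy data frame to its survey items. The helper survey_columns and the toy data are hypothetical and written only for this example; they are not part of the dropout package and do not claim to reproduce how drop_summary handles the argument internally.

```r
# Illustrative only: a hypothetical helper, not a function from the dropout package.
# It shows the two forms last_col can take: a column name or an index position.
survey_columns <- function(data, last_col = NULL) {
  if (is.null(last_col)) {
    return(data)  # no cut-off given: keep every column
  }
  if (is.character(last_col)) {
    last_col <- match(last_col, names(data))  # translate name to position
    if (is.na(last_col)) stop("last_col not found in data")
  }
  data[, seq_len(last_col), drop = FALSE]
}

# Toy data: three survey items followed by a non-survey column
toy <- data.frame(
  q1 = c(1, 2),
  q2 = c(NA, 3),
  q3 = c(NA, NA),
  respondent_id = c(101, 102)
)

names(survey_columns(toy, "q3"))  # by column name
names(survey_columns(toy, 3))     # by index position
```

Both calls keep q1 through q3 and leave out respondent_id, which is the situation last_col is meant for: trailing columns that are not survey questions.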
Value
A dataframe or tibble containing summary statistics about dropouts. Typical columns might include:
- column_name: Lists the names of the columns from your dataset that have been analyzed for dropouts.
- dropout: Contains the frequency of dropouts within each listed column, allowing you to see where dropout rates might be the most significant.
- drop_rate: Shows the percentage of dropout incidents in each column. This is useful for understanding the relative impact of dropouts in various parts of your dataset.
- cum_drop_rate: Shows the cumulative percentage of dropout incidents up to and including each column.
- drop_na: Provides the percentage of missing values in each column that can be attributed specifically to dropouts. This offers insights into the nature of missing data.
- section_na: Indicates occurrences of missing values that span at least n consecutive columns (n defaults to 3). You can adjust this parameter using section_min.
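To make the idea of missing values spanning at least n consecutive columns concrete, here is a minimal base R sketch that counts such runs in a single respondent's answers. The helper count_na_sections and the example vector are hypothetical; this is one plausible reading of the section_na definition, not the package's actual implementation, and drop_summary may aggregate these runs differently (for example, per column rather than per respondent).

```r
# Illustrative only: a hypothetical helper, not the dropout package's implementation.
# Counts runs of at least `section_min` consecutive NAs in one respondent's answers.
count_na_sections <- function(answers, section_min = 3) {
  runs <- rle(is.na(answers))  # run-length encode the NA pattern
  sum(runs$values & runs$lengths >= section_min)
}

# One respondent: an NA run of length 3, a single NA, then an NA run of length 4
answers <- c(5, NA, NA, NA, 2, NA, 4, NA, NA, NA, NA, 3)

count_na_sections(answers)                   # runs of 3 and 4 both qualify
count_na_sections(answers, section_min = 4)  # only the run of 4 qualifies
```

The respondent answers the final item here, so none of these gaps would be trailing missing values; the sketch only concerns skipped sections in the middle of the survey.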
See Also
See the package vignette for detailed workflows, tips on interpretation, and practical examples.
Examples
library(dropout)
# Basic usage (last_col passed by position)
drop_summary(flying, "location_census_region")
# Summarizing dropouts up to a specific column
drop_summary(flying, last_col = "age")
# Read more in the vignette for interpreting summary statistics and plotting dropout trends.