desc_stats {descstatsr}    R Documentation

Descriptive Univariate Statistics

Description

The function summarizes the input data using descriptive univariate statistics, computed either for the entire dataset or separately for each group.

Usage

desc_stats(dataset, show_levels = 5, decimal_points = 2,
  group_variable = NULL, miss_val = NULL)

Arguments

dataset

A data.frame object: the input dataset for which descriptive statistics are to be calculated.

show_levels

An integer value. It controls how many of the top character/factor levels are displayed with their proportions, in descending order of proportion. Defaults to 5.

decimal_points

An integer value. It controls the number of decimal places to which numeric values are rounded. Defaults to 2.

group_variable

A character vector. Names the character or factor variable(s) on whose unique levels the data are split; univariate statistics are then generated for each group. Defaults to NULL, in which case statistics are generated for the entire data.

miss_val

A character vector. Lists the strings that should be treated as missing values. Defaults to NULL.
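
To illustrate what treating strings as missing values can look like, here is a minimal base R sketch. The helper recode_missing() and the toy data frame are hypothetical and are not part of descstatsr; the package's internal handling of miss_val may differ.

```r
# Hypothetical helper (not a descstatsr function): recode the given strings
# as NA in every character or factor column of a data frame.
recode_missing <- function(df, miss_val) {
  df[] <- lapply(df, function(col) {
    if (is.character(col) || is.factor(col)) {
      col[as.character(col) %in% miss_val] <- NA
    }
    col
  })
  df
}

# Toy data, invented for illustration only.
toy <- data.frame(
  city  = c("Pune", "unknown", "Delhi", "n/a"),
  sales = c(10, 20, 30, 40),
  stringsAsFactors = FALSE
)

cleaned <- recode_missing(toy, miss_val = c("unknown", "n/a"))

# Percentage of missing entries per column after recoding.
colMeans(is.na(cleaned)) * 100
```

Recoding before summarizing means the percentage of missing entries reflects placeholder strings as well as true NA values.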

Details

The function calculates the following measures on the input data:

Measures of Central Tendency: Mean, Median

Measures of Distribution: Count, Proportion

Measures of Dispersion: Min, Max, Quantile, Standard Deviation, Variance

Measures of Shape: Skewness, Kurtosis

In addition to these measures, the function reports the data type, the number of rows, the number of unique entries, and the percentage of missing entries.
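
The measures listed above can be approximated in base R as follows. This is a sketch only: skewness_sketch() and kurtosis_sketch() are hypothetical helpers using simple moment-based estimators, and the estimators that desc_stats actually uses are not stated in this documentation and may differ.

```r
# Hypothetical moment-based estimators (an assumption, not the package's code).
skewness_sketch <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

kurtosis_sketch <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^4) / mean((x - m)^2)^2
}

x <- iris$Sepal.Length

# Central tendency, dispersion and shape for one numeric column.
c(mean = mean(x), median = median(x), min = min(x), max = max(x),
  sd = sd(x), variance = var(x),
  skewness = skewness_sketch(x), kurtosis = kurtosis_sketch(x))

# Quartiles.
quantile(x, probs = c(0.25, 0.5, 0.75))

# Top two levels of a categorical column with their proportions,
# in descending order (comparable in spirit to show_levels = 2).
head(sort(prop.table(table(iris$Species)), decreasing = TRUE), 2)
```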

All the above statistics can be generated for the entire data or at a group level. The variable or variables passed to the group_variable parameter split the data into groups based on their unique levels, and the descriptive statistics are calculated for each of these groups.
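
The group-level behaviour described above can be pictured with a short base R sketch that splits one numeric column by a grouping variable and summarizes each piece. It only illustrates the split-then-summarize idea; the layout of the data.frame returned by desc_stats is not reproduced here.

```r
# Split one numeric column of iris by the grouping variable.
groups <- split(iris$Sepal.Length, iris$Species)

# Summarize each group and stack the results into a single data frame.
grouped <- do.call(rbind, lapply(names(groups), function(g) {
  x <- groups[[g]]
  data.frame(
    Species = g,
    count   = length(x),
    mean    = round(mean(x), 2),
    median  = median(x),
    sd      = round(sd(x), 2)
  )
}))

grouped
```

With more than one grouping variable, the same idea applies to each combination of levels, for example by splitting on interaction() of the grouping columns.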

Value

A data.frame object listing descriptive univariate statistics for numerical, categorical and date variables, at group level if group_variable is specified, otherwise for the entire data.

Examples

desc_stats(iris, show_levels = 2, decimal_points = 2,
  group_variable = c("Species"), miss_val = c("unknown"))
desc_stats(iris, show_levels = 2, decimal_points = 2,
  group_variable = c("Species"))
desc_stats(iris, show_levels = 2, decimal_points = 2)
desc_stats(iris, show_levels = 2)
desc_stats(iris)


[Package descstatsr version 0.1.0 Index]