new_p {pdqr}    R Documentation

Create new pdqr-function

Description

Functions for creating new pdqr-functions based on a numeric sample or a data frame describing a distribution. They construct appropriate "x_tbl" metadata based on the input and then create a pdqr-function (of the corresponding pdqr class) defined by that "x_tbl".

Usage

new_p(x, type, ...)

new_d(x, type, ...)

new_q(x, type, ...)

new_r(x, type, ...)

Arguments

x

Numeric vector or data frame with appropriate columns (see "Data frame input" section).

type

Type of pdqr-function. Should be one of "discrete" or "continuous".

...

Extra arguments for density().

Details

Data frame input x is treated as having enough information to create "x_tbl" metadata (after normalization of its "prob" or "y" column). For more details see "Data frame input" section.

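Numeric input is transformed into a data frame which is then used as "x_tbl" metadata. For "discrete" type the values of x are tabulated. For "continuous" type x serves as input to density(), or a dirac-like pdqr-function is created if x is a single number (see Examples). For more details see "Numeric input" section.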

Value

A pdqr-function of corresponding class ("p" for new_p(), etc.) and type.

Numeric input

If x is a numeric vector, it is transformed into a data frame which is then used as "x_tbl" metadata to create a pdqr-function of the corresponding class.

First, all NaN, NA, and infinite values are removed with warnings. If there are no elements left, an error is thrown. Then a data frame is created in a way which depends on the type argument.
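The filtering step can be sketched in base R as follows. This is a minimal illustration, not the package's internal code: the helper name filter_numbers() and the wording of the warnings and the error are hypothetical.

```r
# Hypothetical helper; pdqr's own internals and messages may differ.
filter_numbers <- function(x) {
  if (anyNA(x)) {  # anyNA() is TRUE for both NA and NaN
    warning("`x` has `NA` or `NaN` values, which are removed.")
    x <- x[!is.na(x)]
  }
  if (any(is.infinite(x))) {
    warning("`x` has infinite values, which are removed.")
    x <- x[!is.infinite(x)]
  }
  if (length(x) == 0) {
    stop("`x` has no finite elements left.")
  }
  x
}

# Only the finite values 1 and 2 survive the filtering
x_clean <- suppressWarnings(filter_numbers(c(1, NA, NaN, Inf, -Inf, 2)))
x_clean
```

Calling the helper on a vector with no finite values, such as c(NA, Inf), stops with an error instead of returning an empty vector.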

For "discrete" type the elements of the filtered x are tabulated directly (see Examples).
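Tabulating a numeric sample for the "discrete" type can be sketched in base R as below. This is an illustration under assumptions, not the package's code: in particular, computing "cumprob" as the cumulative sum of "prob" is assumed here.

```r
# Illustration only: tabulate a numeric sample into an "x_tbl"-like data frame.
x <- c(3, 1, 2, 3, 3, 1)

# Unique values in increasing order and how often each one occurs
x_unique <- sort(unique(x))
counts <- tabulate(match(x, x_unique), nbins = length(x_unique))

# Normalize counts so that probabilities sum to 1
prob <- counts / sum(counts)

# Assumption: "cumprob" is the running total of "prob"
x_tbl <- data.frame(x = x_unique, prob = prob, cumprob = cumsum(prob))
x_tbl
```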

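For "continuous" type the output data frame has columns "x", "y", "cumprob". Choice of algorithm depends on the number of x elements: if x is a single number, a dirac-like "continuous" pdqr-function is created; otherwise x serves as input to density(), together with any extra arguments supplied in ... (see Examples).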
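For a sample with more than one element, the "continuous" case can be approximated in base R as follows. This is a sketch under assumptions rather than the package's implementation: renormalizing the density() output with the trapezoidal rule, and defining "cumprob" as the running trapezoidal integral, are assumed here.

```r
set.seed(101)
x <- rnorm(10)

# Kernel density estimate; extra arguments such as `adjust` would go here
dens <- density(x)

# Area of each trapezoid under the piecewise-linear curve through the grid
trapez <- diff(dens$x) * (head(dens$y, -1) + tail(dens$y, -1)) / 2
total <- sum(trapez)

# Assumption: rescale so the total integral is exactly 1
x_tbl <- data.frame(
  x = dens$x,
  y = dens$y / total,
  cumprob = c(0, cumsum(trapez)) / total
)
head(x_tbl)
```

With the default settings of density() the grid has 512 points, so the resulting data frame has 512 rows.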

Data frame input

If x is a data frame, it should have numeric columns appropriate for "x_tbl" metadata of input type: "x", "prob" for "discrete" type and "x", "y" for "continuous" type ("cumprob" column will be computed inside new_*()). To become an appropriate "x_tbl" metadata, input data frame is ordered in increasing order of "x" column and then imputed in the way which depends on the type argument.

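For "discrete" type column "prob" is normalized automatically so that its values sum to 1 (see Examples), and column "cumprob" is then computed from it.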
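The handling of a "discrete" data frame can be sketched in base R as below. It is an illustration only: the cumulative-sum definition of "cumprob" is an assumption, and any further imputation the package performs is not shown.

```r
# Input with unordered "x" and unnormalized "prob"
df <- data.frame(x = c(3, 1, 4, 2), prob = c(3, 1, 4, 2))

# Order rows by increasing "x"
df <- df[order(df$x), ]
rownames(df) <- NULL

# Normalize "prob" to sum to 1, then accumulate (assumed "cumprob" definition)
df$prob <- df$prob / sum(df$prob)
df$cumprob <- cumsum(df$prob)
df
```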

For "continuous" type column "y" is normalized so that the piecewise-linear function passing through the "x"-"y" points has a total integral of 1. Column "cumprob" holds the cumulative probability of that piecewise-linear d-function at each "x" value.
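The normalization of column "y" can be worked through in base R with the trapezoidal rule, which integrates a piecewise-linear function exactly. This is a sketch of the computation described above, not the package's own code; the variable names are illustrative.

```r
# Triangle through the points (1, 0), (2, 10), (3, 0)
df <- data.frame(x = 1:3, y = c(0, 10, 0))

# Area of each trapezoid between neighboring "x" values
trapez <- diff(df$x) * (head(df$y, -1) + tail(df$y, -1)) / 2
total <- sum(trapez)  # total integral before normalization

# Rescale "y" so the integral is 1 and accumulate the areas
df$y <- df$y / total
df$cumprob <- c(0, cumsum(trapez)) / total
df
```

Here the unnormalized integral is 10, so "y" becomes 0, 1, 0 and "cumprob" becomes 0, 0.5, 1.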

Examples

set.seed(101)
x <- rnorm(10)

# Type "discrete": `x` values are directly tabulated
my_d_dis <- new_d(x, "discrete")
meta_x_tbl(my_d_dis)
plot(my_d_dis)

# Type "continuous": `x` serves as input to `density()`
my_d_con <- new_d(x, "continuous")
head(meta_x_tbl(my_d_con))
plot(my_d_con)

# Data frame input
## Values in "prob" column will be normalized automatically
my_p_dis <- new_p(data.frame(x = 1:4, prob = 1:4), "discrete")
## As are values in "y" column
my_p_con <- new_p(data.frame(x = 1:3, y = c(0, 10, 0)), "continuous")

# Using bigger bandwidth in `density()`
my_d_con_2 <- new_d(x, "continuous", adjust = 2)
plot(my_d_con, main = "Comparison of density bandwidths")
lines(my_d_con_2, col = "red")

# Dirac-like "continuous" pdqr-function is created if `x` is a single number
meta_x_tbl(new_d(1, "continuous"))

[Package pdqr version 0.3.1 Index]