fdt {fdth} | R Documentation |
Frequency distribution table for numerical data
Description
A set of S3 methods to easily build a frequency distribution table (‘fdt’) from
vector, data.frame and matrix objects.
Usage
## S3 generic
fdt(x, ...)
## S3 methods
## Default S3 method:
fdt(x,
k,
start,
end,
h,
breaks=c('Sturges', 'Scott', 'FD'),
right=FALSE,
na.rm=FALSE, ...)
## S3 method for class 'data.frame'
fdt(x,
k,
by,
breaks=c('Sturges', 'Scott', 'FD'),
right=FALSE,
na.rm=FALSE, ...)
## S3 method for class 'matrix'
fdt(x,
k,
breaks=c('Sturges', 'Scott', 'FD'),
right=FALSE,
na.rm=FALSE, ...)
Arguments
x |
a vector, data.frame or matrix object. |
k |
number of class intervals. |
start |
left endpoint of the first class interval. |
end |
right endpoint of the last class interval. |
h |
class interval width. |
by |
categorical variable used for grouping each numeric variable,
useful only on data.frame objects. |
breaks |
method used to determine the number of class intervals: one of “Sturges”, “Scott” or “FD”. |
right |
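logical. If TRUE, class intervals are closed on the right and open on the left; if FALSE, they are closed on the left and open on the right (default = FALSE). |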
na.rm |
logical. Should missing values be removed? (default = FALSE). |
... |
potential further arguments (required by generic). |
Details
The simplest way to run ‘fdt’ is to supply only the ‘x’ object, for
example: nm <- fdt(x). In this case the default values of ‘breaks’ and
‘right’ (“Sturges” and FALSE, respectively) are used.
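The two defaults can be illustrated with base R alone. The sketch below is not the internal code of ‘fdt’; it only assumes that “Sturges” refers to the usual Sturges rule (as implemented by nclass.Sturges in grDevices) and that right = FALSE means left-closed classes, as in cut.

```r
# Base-R sketch of the two defaults; this is not fdth's internal code.
x <- c(1, 2, 2, 3, 3, 3, 4)

# Sturges' rule: ceiling(log2(n) + 1) classes for n observations.
k <- nclass.Sturges(x)

# right = FALSE: each class contains its left endpoint, not its right one.
classes <- cut(x, breaks = 1:5, right = FALSE)

print(k)
print(table(classes))
```

With right = FALSE the value 2 is counted in the class [2,3) and not in [1,2).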
The following combinations may also be provided:
- ‘x’ and ‘k’ (number of class intervals);
- ‘x’, ‘start’ (left endpoint of the first class interval) and ‘end’ (right endpoint of the last class interval); or
- ‘x’, ‘start’, ‘end’ and ‘h’ (class interval width).
These options make ‘fdt’ easy and flexible to use.
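To make the roles of ‘start’, ‘end’ and ‘h’ concrete, here is a minimal base-R sketch of a frequency table built from explicit limits. The helper manual_fdt and its column names (f, rf, cf) are hypothetical and chosen for illustration only; the table produced by ‘fdt’ may be laid out differently. Note that k equal-width classes spanning ‘start’ to ‘end’ have width (end - start) / k.

```r
# Hypothetical helper, not part of fdth: build a frequency table from
# explicit class limits using base R only.
manual_fdt <- function(x, start, end, h, right = FALSE) {
  brk <- seq(start, end, by = h)               # class limits
  cls <- cut(x, breaks = brk, right = right)   # assign each value to a class
  f   <- as.vector(table(cls))                 # absolute frequency
  data.frame(class = levels(cls),
             f     = f,
             rf    = f / sum(f),               # relative frequency
             cf    = cumsum(f))                # cumulative frequency
}

x   <- rep(1:3, 3)
tab <- manual_fdt(x, start = 1, end = 4, h = 1)
print(tab)
```

Values outside the interval from ‘start’ to ‘end’ are not counted by this sketch, because cut returns NA for them.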
The ‘fdt’ object stores information to be used by the methods summary,
print, plot, mean, median and mfv. The result of plot is a histogram.
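A short usage sketch of these methods follows. It assumes the fdth package is installed (the guard skips the code otherwise) and uses simulated data; no particular output is implied.

```r
# Usage sketch; runs only if the fdth package is available.
if (requireNamespace("fdth", quietly = TRUE)) {
  library(fdth)

  set.seed(42)                        # reproducible simulated data
  x  <- rnorm(200, mean = 5, sd = 1)
  ft <- fdt(x)

  print(ft)                           # the frequency distribution table
  print(mean(ft))                     # mean computed from the 'fdt' object
  print(median(ft))                   # median computed from the 'fdt' object
  print(mfv(ft))                      # most frequent value (mode)
  # plot(ft) would draw the histogram on the active graphics device
}
```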
The methods summary, print and plot provide a reasonable set of
parameters to format and plot the ‘fdt’ object in a pretty (and
publishable) way.
Value
The method fdt.default returns a list of class fdt.default with the slots:
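‘table’ |
A data.frame storing the ‘fdt’. |
‘breaks’ |
A vector of length 4 storing ‘start’, ‘end’, ‘h’ and ‘right’ of the ‘fdt’ generated by this method. |
‘data’ |
A vector of the data ‘x’ provided. |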
The methods fdt.data.frame and fdt.matrix return a list of class
fdt.multiple. This list has one slot (an ‘fdt’) for each numeric
variable of the ‘x’ provided. Each of these slots stores the same slots
as the fdt.default object described above.
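The returned structure can be inspected with ordinary list tools. The sketch below assumes the fdth package is installed (the guard skips the code otherwise) and relies only on the slot names listed in this section; the exact contents printed depend on the package version.

```r
# Inspection sketch; runs only if the fdth package is available.
if (requireNamespace("fdth", quietly = TRUE)) {
  library(fdth)

  # Single variable: a list of class fdt.default
  ft <- fdt(c(1, 2, 2, 3, 3, 3, 4))
  print(class(ft))
  print(names(ft))        # slot names of the object
  str(ft$table)           # the frequency distribution table itself
  print(ft$breaks)        # information about the class limits used

  # Several variables: a list of class fdt.multiple
  fm <- fdt(iris, k = 5)
  print(class(fm))
  print(names(fm))        # one slot per numeric variable of iris
}
```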
Author(s)
Faria, J. C.
Allaman, I. B.
Jelihovschi, E. G.
See Also
hist, provided by graphics, and table and cut, both provided by base.
Examples
library(fdth)
#========
# Vector
#========
x <- rnorm(n=1e3,
mean=5,
sd=1)
str(x)
# x
(ft <- fdt(x))
# x, alternative breaks
(ft <- fdt(x,
breaks='Scott'))
# x, k
(ft <- fdt(x,
k=10))
# x, start, end
range(x)
(ft <- fdt(x,
start=floor(min(x)),
end=floor(max(x) + 1)))
# x, start, end, h
(ft <- fdt(x,
start=floor(min(x)),
end=floor(max(x) + 1),
h=1))
# Effect of right
sort(x <- rep(1:3, 3))
(ft <- fdt(x,
start=1,
end=4,
h=1))
(ft <- fdt(x,
start=0,
end=3,
h=1,
right=TRUE))
#================================================
# Data.frame: multivariate with two categorical
#================================================
mdf <- data.frame(c1=sample(LETTERS[1:3], 1e2, TRUE),
c2=as.factor(sample(1:10, 1e2, TRUE)),
n1=c(NA, NA, rnorm(96, 10, 1), NA, NA),
n2=rnorm(100, 60, 4),
n3=rnorm(100, 50, 4),
stringsAsFactors=TRUE)
head(mdf)
#(ft <- fdt(mdf)) # Error message due to presence of NA values
(ft <- fdt(mdf,
na.rm=TRUE))
str(mdf)
# By factor
(ft <- fdt(mdf,
k=5,
by='c1',
na.rm=TRUE))
# Choose the FD criterion
(ft <- fdt(mdf,
breaks='FD',
by='c1',
na.rm=TRUE))
# k
(ft <- fdt(mdf,
k=5,
by='c2',
na.rm=TRUE))
(ft <- fdt(iris,
k=10))
(ft <- fdt(iris,
k=5,
by='Species'))
#=========================
# Matrices: multivariate
#=========================
(ft <- fdt(state.x77))
summary(ft,
format=TRUE)
summary(ft,
format=TRUE,
pattern='%.2f')