Useful Functions for Data Processing

Documentation for package ‘baizer’ version 0.8.0

Help Pages

-- A --

adjacent_div	expand a number vector according to the adjacent two numbers
alias_arg	use aliases for function arguments
as_md_table	transform a tibble into a markdown format table
as_tibble_md	transform a table in markdown format into a tibble
atomic_expr	whether the expression is an atomic one

-- B --

broadcast the vector into length n

-- C --

c2r	wrapper of tibble::column_to_rownames
check_arg	check arguments by custom function
cmdargs	get the command line arguments
collapse_vector	dump a named vector into character
combn_vector	combine multiple vectors into one
correct_ratio	correct the numbers to a target ratio
cross_count	count two columns as a cross-tabulation table

-- D --

detect_dup	detect possible duplicates in a vector, ignoring case, blanks and special characters
diff_index	the index of different characters
diff_tb	differences between two tibbles
dx_tb	diagnose a tibble for character NA, NULL, all T/F columns, blanks in cells

-- E --

empty_dir	detect whether a directory is empty recursively
empty_file	detect whether a file is empty recursively
exist_matrix	generate a matrix to show whether each item exists in each element of a list
expr_pileup	pile up the subexpressions that are atomic
extract_kv	extract keys and values from a character vector

-- F --

fancy_count	fancy count to show an extended column
fetch_char	fetch character from strings
filterC	apply tbflt on dplyr filter
fix_to_regex	transform a fixed string into a regular expression string
float_to_percent	from float number to percent number
fps_vector	farthest point sampling (FPS) for a vector
full_expand	like 'dplyr::full_join' while ignoring the same columns in the right tibble

-- G --

generate_ticks	generate ticks for a number vector
gen_char	generate characters
gen_combn	generate all combinations
gen_outlier	generate outliers from a series of numbers
gen_str	generate strings
gen_tb	generate tibbles
geom_mean	geometric mean
group_vector	group character vector by a regex pattern

-- H --

separate numeric x into bins

-- I --

inner_expand	like 'dplyr::inner_join' while ignoring the same columns in the right tibble
int_digits	transform numbers to a fixed integer digit length
is.zero	whether a number has only zeros

-- L --

left_expand	like 'dplyr::left_join' while ignoring the same columns in the right tibble
list2df	transform a list into a data.frame

-- M --

max_depth	max depth of a list
melt_vector	melt a vector into single value
mini_diamond	Minimal tibble dataset adjusted from diamond
mm_norm	max-min normalization
move_row	move selected rows to target location

-- N --

nearest_tick	the nearest ticks around a number
near_ticks	the ticks near a number
not.na	not NA
not.null	not NULL
number_fun_wrapper	wrapper of the functions to process number string with prefix and suffix

-- O --

slice a tibble by an ordered vector

-- P --

percent_to_float	from percent number to float number
pileup_logical	pileup another logical vector on the TRUE values of first vector
pkginfo	information of packages
pkglib	load packages as a batch
pkgver	versions of packages
pos_int_split	split a positive integer number as a number vector

-- R --

r2c	wrapper of tibble::rownames_to_column
read_excel	read excel file
read_excel_list	read multi-sheet excel file as a list of tibbles
read_fmmd	read front matter markdown
ref_level	relevel a target column by another reference column
reg_join	join the matched parts into string
reg_match	regex match
remove_monocol	remove columns by the ratio of an identical single value (NA supported)
remove_nacol	remove columns by the ratio of NA
remove_narow	remove rows by the ratio of NA
remove_outliers	remove outliers and NA
replace_item	replace the items of one object by another
rewrite_na	rewrite the NA values in a tibble by another tibble
rng2seq	transform a range character into seq characters
round_string	from float number to fixed digits character
roxygen_fmt	add #' to each line of code for roxygen examples

-- S --

same_index	the index of identical character
seriate_df	dataframe rows seriation, which will reorder the rows in a better pattern
sftp_connect	connection parameters to remote server via sftp
sftp_download	download file from remote server via sftp
sftp_ls	list files from remote server via sftp
signif_ceiling	signif using ceiling
signif_floor	signif using floor
signif_round_string	signif or round string depending on the character length
signif_string	from float number to fixed significant digits character
slice_char	slice character vector
sortf	sort by a function
split_column	split a column and return a longer tibble
split_path	split a path into ancestor paths recursively
split_vector	split vector into list
stat_fc	fold change calculation which returns an extensible tibble
stat_phi	calculate phi coefficient of two binary variables
stat_test	statistical test which returns an extensible tibble
str_replace_loc	replace specific characters in a string by their locations
swap_vecname	swap the names and values of a vector

-- T --

tbflt	create a tbflt object to save filter conditions
tdf	transpose a dataframe
top_item	return top n items with highest frequency

-- U --

uniq	keep only unique vector values and their names
uniq_in_cols	count unique values in each column

-- W --

write a tibble into an excel file

-- misc --

%eq%	equal calculation operator, supports NA
%neq%	not equal calculation operator, supports NA
%nin%	not in calculation operator