Useful Functions for Data Processing

Documentation for package ‘baizer’ version 0.8.0

Help Pages

-- A --

adjacent_div expand a numeric vector according to the adjacent two numbers
alias_arg use aliases for function arguments
as_md_table transform a tibble into a markdown-format table
as_tibble_md transform a markdown-format table into a tibble
atomic_expr whether the expression is an atomic one

-- B --

broadcast_vector broadcast a vector to length n

-- C --

c2r wrapper of tibble::column_to_rownames
check_arg check arguments by custom function
cmdargs get the command line arguments
collapse_vector dump a named vector into character
combn_vector combine multiple vectors into one
correct_ratio correct the numbers to a target ratio
cross_count count two columns as a cross-tabulation table

-- D --

detect_dup detect possible duplicates in a vector, ignoring case, blanks and special characters
diff_index the index of differing characters
diff_tb differences between two tibbles
dx_tb diagnose a tibble for character NA, NULL, all-T/F columns and blanks in cells

-- E --

empty_dir detect whether a directory is empty, recursively
empty_file detect whether a file is empty, recursively
exist_matrix generate a matrix showing whether each item is in each element of a list
expr_pileup pile up the subexpressions that are atomic
extract_kv extract keys and values from a character vector

-- F --

fancy_count fancy count to show an extended column
fetch_char fetch character from strings
filterC apply tbflt on dplyr filter
fix_to_regex transform a fixed string into a regular-expression string
float_to_percent convert a float number to a percent number
fps_vector farthest point sampling (FPS) for a vector
full_expand like 'dplyr::full_join' while ignoring the same columns in the right tibble

-- G --

generate_ticks generate ticks for a number vector
gen_char generate characters
gen_combn generate all combinations
gen_outlier generate outliers from a series of numbers
gen_str generate strings
gen_tb generate tibbles
geom_mean geometric mean
group_vector group a character vector by a regex pattern

-- H --

hist_bins separate numeric x into bins

-- I --

inner_expand like 'dplyr::inner_join' while ignoring the same columns in the right tibble
int_digits transform numbers to a fixed integer digit length if a number only has zeros

-- L --

left_expand like 'dplyr::left_join' while ignoring the same columns in the right tibble
list2df transform a list into a data.frame

-- M --

max_depth max depth of a list
melt_vector melt a vector into a single value
mini_diamond Minimal tibble dataset adjusted from diamond
mm_norm max-min normalization
move_row move selected rows to target location

-- N --

nearest_tick the nearest ticks around a number
near_ticks the ticks near a number not NA
not.null not NULL
number_fun_wrapper wrapper for functions that process number strings with a prefix and suffix

-- O --

ordered_slice slice a tibble by an ordered vector

-- P --

percent_to_float convert a percent number to a float number
pileup_logical pile up another logical vector on the TRUE values of the first vector
pkginfo information of packages
pkglib load packages as a batch
pkgver versions of packages
pos_int_split split a positive integer into a numeric vector

-- R --

r2c wrapper of tibble::rownames_to_column
read_excel read an Excel file
read_excel_list read a multi-sheet Excel file as a list of tibbles
read_fmmd read front matter markdown
ref_level relevel a target column by another reference column
reg_join join the matched parts into string
reg_match regex match
remove_monocol remove columns by the ratio of an identical single value (NA supported)
remove_nacol remove columns by the ratio of NA
remove_narow remove rows by the ratio of NA
remove_outliers remove outliers and NA
replace_item replace the items of one object by another
rewrite_na rewrite the NA values in a tibble by another tibble
rng2seq transform a range character into seq characters
round_string convert a float number to a fixed-digits character
roxygen_fmt add #' to each line of code for roxygen examples

-- S --

same_index the index of identical character
seriate_df seriate dataframe rows, reordering them into a better pattern
sftp_connect connection parameters to a remote server via sftp
sftp_download download a file from a remote server via sftp
sftp_ls list files on a remote server via sftp
signif_ceiling signif while using ceiling
signif_floor signif while using floor
signif_round_string signif or round string depending on the character length
signif_string convert a float number to a fixed-significant-digits character
slice_char slice character vector
sortf sort by a function
split_column split a column and return a longer tibble
split_path split a path into ancestor paths recursively
split_vector split vector into list
stat_fc fold change calculation which returns an extensible tibble
stat_phi calculate phi coefficient of two binary variables
stat_test statistical test which returns an extensible tibble
str_replace_loc replace specific characters in a string by their locations
swap_vecname swap the names and values of a vector

-- T --

tbflt create a tbflt object to save filter conditions
tdf transpose a dataframe
top_item return top n items with highest frequency

-- U --

uniq keep only unique vector values and their names
uniq_in_cols count unique values in each column

-- W --

write_excel write a tibble into an Excel file

-- misc --

%eq% equal calculation operator, supports NA
%neq% not equal calculation operator, supports NA
%nin% not in calculation operator