transform {rrefine} | R Documentation |
Text transformation for OpenRefine project
Description
The text transform functions allow users to apply arbitrary text transformations to a column in an existing OpenRefine project via an API query to /command/core/apply-operations using the core/text-transform operation. Besides the generic refine_transform(), the package includes a series of transform functions that apply commonly used text operations. For more information on these functions, see 'Details'.
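As a rough illustration of what such a query carries, the sketch below assembles a core/text-transform operation as a plain R list. The field names follow the JSON that OpenRefine shows in a project's operation history; the helper build_text_transform() is hypothetical, and how rrefine assembles and sends the request internally is an assumption, not something this page documents.

```r
# Hypothetical helper (not part of rrefine): builds one operation as a list.
# Field names mirror OpenRefine's operation-history JSON (assumed here).
build_text_transform <- function(column_name, expression,
                                 mode = "row-based",
                                 on_error = "set-to-blank") {
  list(
    op = "core/text-transform",
    engineConfig = list(facets = list(), mode = mode),
    columnName = column_name,
    expression = expression,
    onError = on_error,
    `repeat` = FALSE,   # apply the expression once, not iteratively
    repeatCount = 10
  )
}

op <- build_text_transform("dotw", "value.toUppercase()")

# apply-operations expects an array of operations, so wrap the single op
operations <- list(op)
str(operations)
```

Serialising `operations` to JSON and posting it to /command/core/apply-operations for the target project would be the remaining step; refine_transform() handles that part for you.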
Usage
refine_transform(
column_name,
expression,
mode = "row-based",
on_error = "set-to-blank",
project.name = NULL,
project.id = NULL,
verbose = FALSE,
validate = TRUE,
...
)
refine_to_lower(
column_name,
mode = "row-based",
on_error = "set-to-blank",
project.name = NULL,
project.id = NULL,
verbose = FALSE,
validate = TRUE,
...
)
refine_to_upper(
column_name,
mode = "row-based",
on_error = "set-to-blank",
project.name = NULL,
project.id = NULL,
verbose = FALSE,
validate = TRUE,
...
)
refine_to_title(
column_name,
mode = "row-based",
on_error = "set-to-blank",
project.name = NULL,
project.id = NULL,
verbose = FALSE,
validate = TRUE,
...
)
refine_to_null(
column_name,
mode = "row-based",
on_error = "set-to-blank",
project.name = NULL,
project.id = NULL,
verbose = FALSE,
validate = TRUE,
...
)
refine_to_empty(
column_name,
mode = "row-based",
on_error = "set-to-blank",
project.name = NULL,
project.id = NULL,
verbose = FALSE,
validate = TRUE,
...
)
refine_to_text(
column_name,
mode = "row-based",
on_error = "set-to-blank",
project.name = NULL,
project.id = NULL,
verbose = FALSE,
validate = TRUE,
...
)
refine_to_number(
column_name,
mode = "row-based",
on_error = "set-to-blank",
project.name = NULL,
project.id = NULL,
verbose = FALSE,
validate = TRUE,
...
)
refine_to_date(
column_name,
mode = "row-based",
on_error = "set-to-blank",
project.name = NULL,
project.id = NULL,
verbose = FALSE,
validate = TRUE,
...
)
refine_trim_whitespace(
column_name,
mode = "row-based",
on_error = "set-to-blank",
project.name = NULL,
project.id = NULL,
verbose = FALSE,
validate = TRUE,
...
)
refine_collapse_whitespace(
column_name,
mode = "row-based",
on_error = "set-to-blank",
project.name = NULL,
project.id = NULL,
verbose = FALSE,
validate = TRUE,
...
)
refine_unescape_html(
column_name,
mode = "row-based",
on_error = "set-to-blank",
project.name = NULL,
project.id = NULL,
verbose = FALSE,
validate = TRUE,
...
)
Arguments
column_name | Name of the column on which text transformation should be performed
expression | Expression defining the text transformation to be performed
mode | Mode of operation; default is "row-based"
on_error | Behavior if there is an error on new column creation; default is "set-to-blank"
project.name | Name of project
project.id | Unique identifier for project
verbose | Logical specifying whether or not query result should be printed; default is FALSE
validate | Logical as to whether or not the operation should validate parameters against existing data in project; default is TRUE
... | Additional parameters to be inherited by
Details
The refine_transform() function allows the user to pass arbitrary text transformations to a given column in an OpenRefine project. The package includes a set of functions that wrap refine_transform() to execute common transformations:
- refine_to_lower(): Coerce text to lowercase
- refine_to_upper(): Coerce text to uppercase
- refine_to_title(): Coerce text to title case
- refine_to_null(): Set values to NULL
- refine_to_empty(): Set text values to empty string ("")
- refine_to_text(): Coerce value to string
- refine_to_number(): Coerce value to numeric
- refine_to_date(): Coerce value to date
- refine_trim_whitespace(): Remove leading and trailing whitespace
- refine_collapse_whitespace(): Collapse consecutive whitespace characters to a single space
- refine_unescape_html(): Unescape HTML in string
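To make the wrapper relationship concrete, the sketch below maps each convenience function to the GREL expression that OpenRefine's own "common transforms" menu uses for the same task. These expressions are an assumption about what the wrappers pass to refine_transform(); the package's internal expressions may differ. The helper transform_via_generic() is hypothetical and is only defined here, not called, because it needs a running OpenRefine instance.

```r
# Assumed GREL equivalents (OpenRefine's stock common transforms);
# not taken from the rrefine source.
grel_equivalents <- c(
  refine_to_lower            = "value.toLowercase()",
  refine_to_upper            = "value.toUppercase()",
  refine_to_title            = "value.toTitlecase()",
  refine_to_null             = "null",
  refine_to_empty            = "\"\"",
  refine_to_text             = "value.toString()",
  refine_to_number           = "value.toNumber()",
  refine_to_date             = "value.toDate()",
  refine_trim_whitespace     = "value.trim()",
  refine_collapse_whitespace = "value.replace(/\\s+/, ' ')",
  refine_unescape_html       = "value.unescape('html')"
)

# Hypothetical helper: run the generic function with the assumed expression.
# Requires rrefine and a running OpenRefine instance when actually called.
transform_via_generic <- function(wrapper, column_name, ...) {
  rrefine::refine_transform(
    column_name = column_name,
    expression = grel_equivalents[[wrapper]],
    ...
  )
}

# Look up the expression assumed to sit behind refine_to_title()
grel_equivalents[["refine_to_title"]]
```

Under these assumptions, transform_via_generic("refine_to_title", "dotw", project.name = "lfm") would be expected to behave like refine_to_title("dotw", project.name = "lfm").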
Value
Called for its side effect of passing operations to the OpenRefine instance. However, if verbose = TRUE, the function returns an object of class "response".
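When verbose = TRUE is used, the returned object can be inspected before continuing. The sketch below assumes the "response" object is an httr-style response carrying a status_code element; that is an assumption, and the helper transform_request_ok() is hypothetical. It only checks the HTTP status, so an error reported inside the response body would not be caught.

```r
# Hypothetical helper (not part of rrefine): HTTP-level check only.
transform_request_ok <- function(res) {
  inherits(res, "response") && isTRUE(res$status_code < 400)
}

# Stand-in for the object returned with verbose = TRUE
# (assumed structure: a list with class "response" and a status_code).
mock_res <- structure(list(status_code = 200L), class = "response")

transform_request_ok(mock_res)
transform_request_ok(NULL)  # nothing is returned when verbose = FALSE
```

In real use, the argument would be the result of a call such as refine_to_upper("dotw", project.name = "lfm", verbose = TRUE).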
Examples
## Not run:
# Upload the sample file bundled with rrefine as project "lfm"
fp <- system.file("extdata", "lateformeeting.csv", package = "rrefine")
refine_upload(fp, project.name = "lfm")

# Copy the day-of-the-week column into a new column, "dotw"
refine_add_column(new_column = "dotw",
                  base_column = "what day whas it",
                  value = "grel:value",
                  project.name = "lfm")
refine_export("lfm")$dotw

# Apply each case transformation and inspect the result
refine_to_lower("dotw", project.name = "lfm")
refine_export("lfm")$dotw
refine_to_upper("dotw", project.name = "lfm")
refine_export("lfm")$dotw
refine_to_title("dotw", project.name = "lfm")
refine_export("lfm")$dotw

# Set every value in the column to NULL, then drop the column
refine_to_null("dotw", project.name = "lfm")
refine_export("lfm")$dotw
refine_remove_column("dotw", project.name = "lfm")

# Copy the date column, coerce the copy to date, then clean up
refine_add_column(new_column = "date",
                  base_column = "theDate",
                  value = "grel:value",
                  project.name = "lfm")
refine_export("lfm")$date
refine_to_date("date", project.name = "lfm")
refine_export("lfm")$date
refine_remove_column("date", project.name = "lfm")
## End(Not run)