cluster_modify_pairs {reclin2} | R Documentation
Call a function on each of the worker nodes to modify the pairs on the node
Description
Call a function on each of the worker nodes to modify the pairs on the node
Usage
cluster_modify_pairs(pairs, fun, ..., new_name = NULL)
Arguments
pairs | an object of type cluster_pairs. |
fun | a function to call on each of the worker nodes. See Details for the arguments of this function. |
... | additional arguments passed on to fun. |
new_name | name of the new object to assign the pairs to on the cluster nodes. |
Details
The function will have to accept the following arguments as its first three arguments:
- pairs
  the data.table with the pairs of the worker node.
- x
  a data.table with the portion of x present on the worker node.
- y
  a data.table with y.
The function should return either a data.table with the new pairs, or NULL.
When a data.table is returned, it replaces the pairs when new_name is
missing, or is stored as a new set of pairs under the name new_name. When
the function returns NULL, it is assumed that the function modified the
pairs by reference (e.g. using pairs[, new_var := new_val]). Note that in
this case new_name is ignored.
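The two return conventions described above can be sketched locally, without a cluster. This is a minimal illustration only: the function names keep_diagonal and flag_diagonal, the toy tables, and the column names .x and .y are assumptions made for the example and are not taken from this page.

```r
library(data.table)

# Hypothetical stand-ins for what a worker node would pass to `fun`.
# The column names .x and .y are assumed for illustration only.
pairs <- data.table(.x = c(1L, 1L, 2L, 2L), .y = c(1L, 2L, 1L, 2L))
x <- data.table(id = 1:2)
y <- data.table(id = 1:2)

# Variant 1: return a new data.table with the modified pairs.
keep_diagonal <- function(pairs, x, y, ...) {
  pairs[.x == .y]
}

# Variant 2: modify the pairs by reference and return NULL.
flag_diagonal <- function(pairs, x, y, ...) {
  pairs[, same_index := .x == .y]
  NULL
}

new_pairs <- keep_diagonal(pairs, x, y)  # a new, smaller data.table
result <- flag_diagonal(pairs, x, y)     # NULL; `pairs` gains a column
```

Given the usage shown above, either function could then be passed as fun, for example cluster_modify_pairs(pairs, keep_diagonal, new_name = "diag") for the first variant, or cluster_modify_pairs(pairs, flag_diagonal) for the second.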
Value
Returns a cluster_pairs object. When new_name is not given, the input pairs
are returned invisibly. Otherwise a new cluster_pairs object is returned.
Examples
# Generate some pairs
library(parallel)
library(reclin2)
data("linkexample1", "linkexample2")
cl <- makeCluster(2)
pairs <- cluster_pair(cl, linkexample1, linkexample2)
compare_pairs(pairs, c("lastname", "firstname", "address", "sex"))

# Create a new set of pairs containing a random sample of the original
# pairs. The function returns a data.table, so with new_name set the
# result is stored as a new object on the nodes.
sample <- cluster_modify_pairs(pairs, new_name = "sample", function(pairs, ...) {
  sel <- sample(nrow(pairs), round(nrow(pairs) * 0.1))
  pairs[sel, ]
})

# Cleanup
stopCluster(cl)