| cluster_modify_pairs {reclin2} | R Documentation |
Call a function on each of the worker nodes to modify the pairs on the node
Description
Call a function on each of the worker nodes to modify the pairs on the node
Usage
cluster_modify_pairs(pairs, fun, ..., new_name = NULL)
Arguments
pairs |
an object of type cluster_pairs. |
fun |
a function to call on each of the worker nodes. See details on the arguments of this function. |
... |
additional arguments are passed on to fun. |
new_name |
name of new object to assign the pairs to on the cluster nodes. |
Details
The function will have to accept the following arguments as its first three arguments:
- pairs
the data.table with the pairs of the worker node.
- x
a data.table with the portion of x present on the worker node.
- y
a data.table with y.
The function should return either a data.table with the new pairs, or
NULL. When a data.table is returned, it replaces the pairs when
new_name is missing; otherwise it creates new pairs in the
environment new_name. When the function returns NULL, it is
assumed that the function modified the pairs by reference (e.g. using
pairs[, new_var := new_val]). Note that in this case
new_name is ignored.
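The by-reference case described above can be sketched as follows. This is a minimal illustration, assuming reclin2 and data.table are installed; the callback name add_row_id and the column name row_on_node are hypothetical and chosen only for this example. Because an assignment with := returns the data.table invisibly, the callback ends with an explicit NULL to signal that the pairs were modified in place.

```r
library(reclin2)
library(parallel)

data("linkexample1", "linkexample2")
cl <- makeCluster(2)
pairs <- cluster_pair(cl, linkexample1, linkexample2)

# Hypothetical callback: takes pairs, x and y as its first three
# arguments, adds a column by reference and returns NULL.
add_row_id <- function(pairs, x, y, ...) {
  pairs[, row_on_node := seq_len(.N)]
  NULL
}
cluster_modify_pairs(pairs, add_row_id)

# Gather the pairs from the worker nodes to inspect the new column.
collected <- cluster_collect(pairs)

stopCluster(cl)
```

Since the callback returns NULL, no new_name is needed here; the existing pairs on each node carry the added column.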
Value
Returns a cluster_pairs object. When new_name is not given,
the input pairs are returned invisibly. Otherwise a
new cluster_pairs object is returned.
Examples
# Generate some pairs
library(parallel)
data("linkexample1", "linkexample2")
cl <- makeCluster(2)
pairs <- cluster_pair(cl, linkexample1, linkexample2)
compare_pairs(pairs, c("lastname", "firstname", "address", "sex"))
# Create a new set of pairs containing a random sample of the original
# pairs.
sample <- cluster_modify_pairs(pairs, new_name = "sample", function(pairs, ...) {
  sel <- sample(nrow(pairs), round(nrow(pairs)*0.1))
  pairs[sel, ]
})
# Cleanup
stopCluster(cl)