mvl_write_groups {RMVL} | R Documentation
Write group information for each row
Description
This function takes a list of MVL vectors, which are interpreted as the columns of a data.frame. The rows
are split into groups so that identical rows are guaranteed to belong to the same group. The split is done internally using 20-bit hash values.
This function is a convenient way to partition very large datasets before applying mvl_group or mvl_find_matches.
The groups can be obtained by using mvl_get_groups.
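The idea of hash-based grouping can be illustrated with a small base R sketch. It does not use RMVL and does not reproduce its internal hash: toy_hash20() is a hypothetical helper invented for this illustration, and the sample data frame is made up. It only shows why identical rows always land in the same group, while rows that differ are not guaranteed to land in different groups.

```r
# Toy illustration only. toy_hash20() is a hypothetical helper, not an RMVL
# function, and RMVL's actual hash is different.
toy_hash20 <- function(key) {
  h <- 0
  for (b in utf8ToInt(key)) {
    h <- (h * 31 + b) %% 2^20   # keep the running value within 20 bits
  }
  h
}

# Made-up data: rows 1 and 3 are identical
df <- data.frame(x = c(1, 2, 1, 3), y = c("a", "b", "a", "c"))

# Build one key per row, hash it, and split the row indices by hash value
keys <- paste(df$x, df$y, sep = "|")
bucket <- vapply(keys, toy_hash20, numeric(1), USE.NAMES = FALSE)
groups <- split(seq_len(nrow(df)), bucket)

# Identical rows produce identical keys, hence the same bucket
print(bucket[1] == bucket[3])
```

Because the hash has a fixed width, a grouping of this kind is best treated as a coarse partition: each group can then be processed separately, for example with mvl_group or mvl_find_matches.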
Usage
mvl_write_groups(MVLHANDLE, L, name = NULL)
Arguments
MVLHANDLE | a handle to an MVL file produced by mvl_open()
L | a list of vector-like MVL_OBJECTs
name | if specified, add a named entry to the MVL file directory
Value
An object of class MVL_OFFSET that describes an offset into this MVL file. MVL offsets are vectors and can be concatenated. They can be written to an MVL file directly, or as part of another object such as a list.
See Also
mvl_order_vectors, mvl_find_matches, mvl_group, mvl_indexed_copy, mvl_merge, mvl_hash_vectors, mvl_get_groups
Examples
## Not run:
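# Open (or create) an MVL file for appending and write a data frame to it
Mtmp <- mvl_open("tmp_a.mvl", append=TRUE, create=TRUE)
mvl_write_object(Mtmp, data.frame(x=runif(100), y=1:100), "df1")
# Remap to gain access to the newly written object
Mtmp <- mvl_remap(Mtmp)
# Group the rows of df1 by columns x and y; store the result as "df1_groups"
mvl_write_groups(Mtmp, list(Mtmp$df1[,"x", ref=TRUE], Mtmp$df1[,"y", ref=TRUE]), "df1_groups")
Mtmp <- mvl_remap(Mtmp)
# Look up the groups that start at the first five entries of "first"
print(mvl_get_groups(Mtmp["df1_groups", ref=TRUE]["prev", ref=TRUE], Mtmp$df1_groups$first[1:5]))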
## End(Not run)