mvl_write_groups {RMVL}    R Documentation

Write group information for each row

Description

This function takes a list of MVL vectors that are interpreted in data.frame fashion, with each vector acting as a column. The rows are split into groups such that identical rows are guaranteed to belong to the same group. Grouping is done internally based on 20-bit hash values, so rows that are not identical may also end up in the same group. This makes the function a convenient way to partition very large datasets before applying mvl_group or mvl_find_matches. The groups can be retrieved with mvl_get_groups.
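The grouping guarantee can be illustrated with a minimal base R sketch. It does not use RMVL and does not reproduce the package's hash: toy_hash20 is a hypothetical stand-in that reduces each row to a value below 2^20, and the data frame is made up for illustration.

```r
# Toy stand-in for a 20-bit row hash; NOT the hash RMVL uses internally.
toy_hash20 <- function(df) {
  # Collapse each row into one string key, column by column.
  keys <- do.call(paste, c(unname(as.list(df)), sep = "\r"))
  vapply(keys, function(k) {
    h <- 0
    # Simple polynomial hash, reduced modulo 2^20 at every step.
    for (code in utf8ToInt(k)) h <- (h * 31 + code) %% 2^20
    h
  }, numeric(1), USE.NAMES = FALSE)
}

df <- data.frame(x = c(1, 2, 1, 3, 2), y = c("a", "b", "a", "c", "b"))
h <- toy_hash20(df)

# Rows with equal hash values form one group of row indices.
groups <- split(seq_len(nrow(df)), h)
```

Rows 1 and 3 are identical, as are rows 2 and 5, so each pair necessarily shares a group. The converse is not guaranteed by a hash of this kind: distinct rows can collide, which is why a partition like this is a preprocessing step rather than an exact grouping.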

Usage

mvl_write_groups(MVLHANDLE, L, name = NULL)

Arguments

MVLHANDLE

a handle to an MVL file produced by mvl_open()

L

a list of vector-like MVL_OBJECTs

name

if specified, add a named entry to the MVL file directory

Value

an object of class MVL_OFFSET that describes an offset into this MVL file. MVL offsets are vectors and can be concatenated. They can be written to an MVL file directly, or as part of another object such as a list.

See Also

mvl_order_vectors, mvl_find_matches, mvl_group, mvl_indexed_copy, mvl_merge, mvl_hash_vectors, mvl_get_groups

Examples

## Not run: 
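# Open (or create) an MVL file for appending and store a data frame in it
Mtmp<-mvl_open("tmp_a.mvl", append=TRUE, create=TRUE)
mvl_write_object(Mtmp, data.frame(x=runif(100), y=1:100), "df1")
# Remap so that the newly written data is accessible through the handle
Mtmp<-mvl_remap(Mtmp)
# Group rows using columns x and y, passed as references to on-disk vectors
mvl_write_groups(Mtmp, list(Mtmp$df1[,"x",ref=TRUE], Mtmp$df1[,"y", ref=TRUE]), "df1_groups")
Mtmp<-mvl_remap(Mtmp)
# Retrieve the row indices that belong to the first five groups
print(mvl_get_groups(Mtmp["df1_groups", ref=TRUE]["prev", ref=TRUE], Mtmp$df1_groups$first[1:5]))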

## End(Not run)
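The example above reads the groups back through the "prev" and "first" entries of the stored object. One way such a pair of vectors could encode groups is as linked lists of row indices. The base R sketch below works under that assumption only and is not taken from the RMVL sources: build_chains and walk_group are hypothetical helpers, and the convention that 0 terminates a chain is assumed.

```r
# Hypothetical encoding: prev[i] is the previous row in row i's group,
# and 0 marks the end of a chain. This is an assumption, not RMVL's format.
build_chains <- function(keys) {
  prev <- integer(length(keys))
  last_seen <- list()
  for (i in seq_along(keys)) {
    k <- as.character(keys[i])
    # Link this row to the most recent earlier row with the same key.
    prev[i] <- if (is.null(last_seen[[k]])) 0L else last_seen[[k]]
    last_seen[[k]] <- i
  }
  # The entry point of each chain is the last row seen for that key.
  list(prev = prev, first = unlist(last_seen, use.names = FALSE))
}

# Follow one chain from its entry point and collect the row indices.
walk_group <- function(prev, first) {
  idx <- integer(0)
  i <- first
  while (i > 0) {
    idx <- c(idx, i)
    i <- prev[i]
  }
  idx
}

ch <- build_chains(c(7, 9, 7, 7, 9))
rows_of_first_group <- walk_group(ch$prev, ch$first[1])
```

Under this encoding a whole partition is stored in two flat integer vectors, which suits a file format built around vectors; how RMVL lays out its group data in practice should be checked against the package documentation.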

[Package RMVL version 1.1.0.0 Index]