import {mpathsenser}    R Documentation
Import m-Path Sense files into a database
Description
Import JSON files from m-Path Sense into a structured database. This function is the bread and
butter of this package, as it populates the database with the data that most of the other functions
in this package use. It is recommended to first run test_jsons() and, if necessary, fix_jsons()
to repair JSON files with problematic syntax.
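A pre-import check could look like the following sketch. It assumes that test_jsons() accepts the
same path argument and returns the names of problematic files (with nothing to report when all
files are valid), and that fix_jsons() accepts that path as well; see their help pages for the
exact interface.

# Sketch of the recommended pre-import check (argument names are assumptions;
# see ?test_jsons and ?fix_jsons for the exact interface).
path <- "some/path"
broken <- test_jsons(path = path)   # names of files with problematic syntax
if (any(nzchar(broken))) {
  fix_jsons(path = path)            # attempt to repair them before importing
}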
Usage
import(
path = getwd(),
db,
sensors = NULL,
batch_size = 24,
backend = "RSQLite",
recursive = TRUE
)
Arguments
path
The path to the file directory.
db
A valid database connection, typically created by create_db().
sensors
One or multiple sensors to import. Defaults to NULL, which imports all sensors found in the files.
batch_size
The number of files to be processed in a single batch.
backend
Name of the database backend that is used. Currently, only RSQLite is supported.
recursive
Should the listing recurse into directories?
Details
import
allows you to specify which sensors to import (even though there may be more in the files) and
it also allows batching for a speedier writing process. If parallel processing is active, it is
recommended that batch_size be a scalar multiple of the number of CPU cores the parallel cluster
can use. If a single JSON file in the batch causes an error, the batch is terminated (but not the
function) and it is up to the user to fix the file. This means that if batch_size is large, many
files will not be processed. Set batch_size to 1 for sequential (one-by-one) file processing.
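As an illustration, the batch size can be tied to the size of the parallel cluster. This is only
a sketch: the factor of two is an arbitrary scalar multiple, parallelly::availableCores() comes
from the future framework that the Parallel section below relies on, and db is assumed to be a
connection created by create_db().

# Sketch: choose a batch_size that is a scalar multiple of the number of workers.
n_workers <- parallelly::availableCores()
future::plan("multisession", workers = n_workers)
import(path = "some/path", db = db, batch_size = 2 * n_workers)  # factor 2 is arbitrary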
Currently, only SQLite is supported as a backend. Due to its concurrency restriction, parallel
processing works for cleaning the raw data, but not for importing it into the database, because
SQLite does not allow multiple processes to write to the same database at the same time. This is
a limitation of SQLite and not of this package. However, while files are processed individually
(and in parallel, if specified), writing to the database happens for the entire batch specified
by batch_size at once. This means that if a single file in the batch causes an error, the entire
batch is skipped. This ensures that the database is not left in an inconsistent state.
Value
A message indicating how many files were imported. If all files were imported successfully, this function returns an empty string invisibly. Otherwise, the names of the files that were not imported are returned visibly.
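Because the return value distinguishes a clean run from one with failures, it can be captured to
find out which files still need attention. A minimal sketch, assuming path and db as in the
Examples below:

# Capture the return value to see which files were not imported.
failed <- import(path = path, db = db)
if (any(nzchar(failed))) {
  print(failed)  # inspect these files, repair them (e.g. with fix_jsons()), then re-run import()
}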
Parallel
This function supports parallel processing in the sense that it is able to distribute its
computational load among multiple workers. To make use of this functionality, run
future::plan("multisession")
before calling this function.
Progress
You can be updated on the progress of this function by using the progressr package. See
progressr's vignette on how to subscribe to these updates.
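For example, progress reporting can be enabled globally or wrapped around a single call using the
standard progressr functions handlers() and with_progress():

# Subscribe to progress updates for all calls in this session
progressr::handlers(global = TRUE)
import(path = path, db = db)
# Or limit the progress reporting to a single call
progressr::with_progress(import(path = path, db = db))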
See Also
create_db() for creating a database for import() to use; close_db() for closing this database;
index_db() to create indices on the database for faster future processing; and vacuum_db() to
shrink the database to its minimal size.
Examples
## Not run:
path <- "some/path"
# Create a database
db <- create_db(path = path, db_name = "my_db")
# Import all JSON files in this directory
import(path = path, db = db)
# Import the files sequentially (one by one)
import(path = path, db = db, batch_size = 1)
# Import only the accelerometer data
import(path = path, db = db, sensors = "accelerometer")
# Import only the accelerometer and gyroscope data
import(path = path, db = db, sensors = c("accelerometer", "gyroscope"))
# Remember to close the database
close_db(db)
## End(Not run)