Dataset {mstrio} | R Documentation |
Create, update, delete and certify MicroStrategy datasets
Description
When creating a new dataset, provide a dataset name and an optional description. When updating a pre-existing dataset, provide the dataset identifier. Tables are added to the dataset in an iterative manner using 'add_table()'.
Public fields
connection
MicroStrategy connection object
name
Name of the dataset
description
Description of the dataset. Must be less than or equal to 250 characters
folder_id
If specified the dataset will be saved in this folder
dataset_id
Identifier of a pre-existing dataset. Used when updating a pre-existing dataset
owner_id
Owner ID
path
Cube path
modification_time
Last modification time, "yyyy-MM-dd HH:mm:ss" in UTC
size
Cube size
cube_state
Cube status, for example, 0=unpublished, 1=publishing, 64=ready
verbose
If TRUE (default), displays additional messages.
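The example 'cube_state' codes listed above can be turned into readable labels with a small lookup. The following is a minimal sketch in base R; 'describe_cube_state' is a hypothetical helper that is not part of mstrio, and it covers only the three codes named on this page. Any other value is reported as "unknown".

```r
# Hypothetical helper (not part of mstrio): label the example cube_state
# codes documented above. Any other code is reported as "unknown".
describe_cube_state <- function(state) {
  labels <- c("0" = "unpublished", "1" = "publishing", "64" = "ready")
  out <- unname(labels[as.character(state)])
  out[is.na(out)] <- "unknown"
  out
}

describe_cube_state(c(0, 1, 64))
```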
Methods
Public methods
Method new()
Interface for creating, updating, and deleting MicroStrategy in-memory datasets.
Usage
Dataset$new( connection, name = NULL, description = NULL, dataset_id = NULL, verbose = TRUE )
Arguments
connection
MicroStrategy connection object returned by 'Connection$new()'.
name
(character): Name of the dataset.
description
(character, optional): Description of the dataset. Must be less than or equal to 250 characters.
dataset_id
(character, optional): Identifier of a pre-existing dataset. Used when updating a pre-existing dataset.
verbose
Setting to control the amount of feedback from the I-Server.
Details
When creating a new dataset, provide a dataset name and an optional description. When updating a pre-existing dataset, provide the dataset identifier. Tables are added to the dataset in an iterative manner using 'add_table()'.
Returns
A new 'Dataset' object
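Because the description is limited to 250 characters, it can be convenient to check its length locally before constructing the object. The sketch below uses only base R; 'check_description' is a hypothetical helper and not part of mstrio, and how the package itself reacts to an over-long description is not covered here.

```r
# Hypothetical pre-flight check (not part of mstrio): enforce the documented
# 250-character limit on the description before calling Dataset$new().
check_description <- function(description, max_chars = 250L) {
  if (is.null(description)) return(invisible(TRUE))  # description is optional
  stopifnot(is.character(description), length(description) == 1L)
  if (nchar(description) > max_chars) {
    stop(sprintf("Description has %d characters; the limit is %d.",
                 nchar(description), max_chars))
  }
  invisible(TRUE)
}

check_description("Quarterly HR snapshot")   # passes silently
# check_description(strrep("x", 251))        # would raise an error
```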
Method add_table()
Add a data.frame to a collection of tables which are later used to update the MicroStrategy dataset
Usage
Dataset$add_table( name, data_frame, update_policy, to_metric = NULL, to_attribute = NULL )
Arguments
name
(character): Logical name of the table that is visible to users of the dataset in MicroStrategy.
data_frame
('data.frame'): R data.frame to add or update.
update_policy
(character): Update operation to perform. One of 'add' (inserts new, unique rows), 'update' (updates data in existing rows and columns), 'upsert' (updates existing data and inserts new rows), or 'replace' (replaces the existing data with new data).
to_metric
(optional, vector): By default, R numeric data types are treated as metrics in the MicroStrategy dataset, while character and date types are treated as attributes. For example, a column of integer-like strings ("1", "2", "3") would, by default, be an attribute in the newly created dataset. To format such a column as a metric, provide its name as a character vector in the 'to_metric' parameter.
to_attribute
(optional, vector): Logical opposite of 'to_metric'. Helpful for formatting an integer-based row identifier as a primary key in the dataset.
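The default typing rule and the two overrides can be illustrated locally. This is a minimal sketch in base R, assuming only the rule stated above (numeric columns become metrics, everything else becomes an attribute); 'classify_columns' is a hypothetical function and does not reflect how mstrio implements the mapping internally.

```r
# Hypothetical illustration (not mstrio internals): apply the documented
# default rule, then the to_metric / to_attribute overrides.
classify_columns <- function(data_frame, to_metric = NULL, to_attribute = NULL) {
  # Default: numeric columns are metrics, everything else is an attribute.
  roles <- vapply(data_frame,
                  function(col) if (is.numeric(col)) "metric" else "attribute",
                  character(1))
  roles[names(roles) %in% to_metric] <- "metric"
  roles[names(roles) %in% to_attribute] <- "attribute"
  roles
}

df <- data.frame(id = c(1, 2, 3),
                 code = c("1", "2", "3"),
                 hired = as.Date(c("2020-01-01", "2021-06-15", "2022-03-30")),
                 stringsAsFactors = FALSE)

classify_columns(df)                                           # defaults
classify_columns(df, to_metric = "code", to_attribute = "id")  # overrides
```

In the second call, the string column 'code' is treated as a metric and the numeric 'id' column as an attribute, which mirrors the row-identifier use case described for 'to_attribute'.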
Details
Add tables to the dataset in an iterative manner using 'add_table()'.
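To make the four 'update_policy' values concrete, the sketch below emulates them on a local data.frame. It is only an illustration under stated assumptions: rows are matched on a single key column (here 'id'), both data frames share the same columns, and 'apply_policy' is a hypothetical function. The actual behavior is determined by the Intelligence Server, not by this code.

```r
# Hypothetical illustration (not mstrio code): emulate the four update
# policies on a local data.frame, assuming rows are matched on a key column.
apply_policy <- function(existing, incoming, policy, key = "id") {
  policy <- match.arg(policy, c("add", "update", "upsert", "replace"))
  if (policy == "replace") return(incoming)

  is_new <- !(incoming[[key]] %in% existing[[key]])
  result <- existing

  if (policy %in% c("update", "upsert")) {
    # Overwrite values in rows whose key already exists.
    matched <- incoming[!is_new, , drop = FALSE]
    idx <- match(matched[[key]], result[[key]])
    for (col in names(matched)) result[[col]][idx] <- matched[[col]]
  }
  if (policy %in% c("add", "upsert")) {
    # Append rows whose key is not present yet.
    result <- rbind(result, incoming[is_new, , drop = FALSE])
  }
  rownames(result) <- NULL
  result
}

existing <- data.frame(id = c(1, 2), salary = c(50000, 100000))
incoming <- data.frame(id = c(2, 3), salary = c(110000, 75000))

apply_policy(existing, incoming, "add")     # keeps id 2 as is, appends id 3
apply_policy(existing, incoming, "update")  # changes id 2, ignores id 3
apply_policy(existing, incoming, "upsert")  # changes id 2 and appends id 3
apply_policy(existing, incoming, "replace") # returns the incoming rows only
```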
Method create()
Create a new dataset.
Usage
Dataset$create( folder_id = NULL, auto_upload = TRUE, auto_publish = TRUE, chunksize = 1e+05 )
Arguments
folder_id
ID of the shared folder that the dataset should be created within. If 'NULL' (default), the dataset is created in the user's My Reports folder.
auto_upload
(default TRUE) If TRUE, automatically uploads the data to the I-Server. If FALSE, creates the dataset definition but does not upload data to it.
auto_publish
(default TRUE) If TRUE, automatically publishes the data used to create the dataset definition. If FALSE, creates the dataset but does not publish it. Data must be uploaded before the dataset can be published.
chunksize
(int, optional): Number of rows to transmit to the I-Server with each request when uploading.
Method update()
Updates an existing dataset with new data.
Usage
Dataset$update(chunksize = 1e+05, auto_publish = TRUE)
Arguments
chunksize
(int, optional): Number of rows to transmit to the I-Server with each request when uploading.
auto_publish
(default TRUE) If TRUE, automatically publishes the data. If FALSE, the data is uploaded but the cube is not published.
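The effect of 'chunksize' can be pictured as splitting the rows of a table into consecutive blocks, one block per request. The following base R sketch shows that splitting only; 'chunk_rows' is a hypothetical helper, and the way mstrio actually batches and transmits rows is not shown here.

```r
# Hypothetical illustration (not mstrio internals): split a data.frame into
# consecutive row chunks of at most `chunksize` rows.
chunk_rows <- function(data_frame, chunksize = 1e5) {
  n <- nrow(data_frame)
  if (n == 0) return(list())
  groups <- ceiling(seq_len(n) / chunksize)
  unname(split(data_frame, groups))
}

big <- data.frame(id = seq_len(250000))
chunks <- chunk_rows(big)
vapply(chunks, nrow, integer(1))  # rows per chunk: 100000 100000 50000
```

With the default of 100,000 rows, a 250,000-row table yields three chunks; raising 'chunksize' to 500,000 would place the same table in a single chunk.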
Method publish()
Publish the uploaded data to the selected dataset. A dataset can be published just once.
Usage
Dataset$publish()
Method publish_status()
Check the status of data that was uploaded to a dataset.
Usage
Dataset$publish_status()
Returns
Response status code
Method delete()
Delete a dataset that was previously created using the REST API.
Usage
Dataset$delete()
Returns
Response object from the Intelligence Server acknowledging the deletion process.
Method certify()
Certify a dataset that was previously created using the REST API.
Usage
Dataset$certify()
Returns
Response object from the Intelligence Server acknowledging the certification process.
Method clone()
The objects of this class are cloneable with this method.
Usage
Dataset$clone(deep = FALSE)
Arguments
deep
Whether to make a deep clone.
Examples
## Not run:
# Create data frames
df1 <- data.frame("id" = c(1, 2, 3, 4, 5),
                  "first_name" = c("Jason", "Molly", "Tina", "Jake", "Amy"),
                  "last_name" = c("Miller", "Jacobson", "Turner", "Milner", "Cooze"))
df2 <- data.frame("id" = c(1, 2, 3, 4, 5),
                  "age" = c(42, 52, 36, 24, 73),
                  "state" = c("VA", "NC", "WY", "CA", "CA"),
                  "salary" = c(50000, 100000, 75000, 85000, 250000))
# Create a list of tables containing one or more tables and their names
my_dataset <- Dataset$new(connection=conn, name="HR Analysis")
my_dataset$add_table("Employees", df1, "add")
my_dataset$add_table("Salaries", df2, "add")
my_dataset$create()
# By default, Dataset$create() uploads the data to the Intelligence Server and publishes the
# dataset.
# If you just want to create the dataset but not upload the row-level data, use
# my_dataset$create(auto_upload = FALSE)
# followed by
# my_dataset$update()
# my_dataset$publish()
# When the source data changes and users need the latest data for analysis and reporting in
# MicroStrategy, mstrio allows you to update the previously created dataset.
ds <- Dataset$new(connection=conn, dataset_id="...")
ds$add_table(name = "Stores", data_frame = stores_df, update_policy = 'update')
ds$add_table(name = "Sales", data_frame = sales_df, update_policy = 'upsert')
ds$update(auto_publish=TRUE)
# By default, Dataset$update() uploads the data to the Intelligence Server and publishes the
# dataset.
# If you just want to update the dataset but not publish the row-level data, use
# ds$update(auto_publish = FALSE)
# By default, the raw data is transmitted to the server in increments of 100,000 rows. On very
# large datasets (>1 GB), it is beneficial to increase the number of rows transmitted to the
# Intelligence Server with each request. Do this with the chunksize parameter:
ds$update(chunksize = 500000)
# If you want to certify an existing dataset, use
ds$certify()
## End(Not run)