emr {paws}    R Documentation

Amazon EMR

Description

Amazon EMR is a web service that makes it easier to process large amounts of data efficiently. Amazon EMR uses Hadoop processing combined with several Amazon Web Services services to do tasks such as web indexing, data mining, log file analysis, machine learning, scientific simulation, and data warehouse management.

Usage

emr(config = list(), credentials = list(), endpoint = NULL, region = NULL)

Arguments

config

Optional configuration of credentials, endpoint, and/or region.

  • credentials:

    • creds:

      • access_key_id: AWS access key ID

      • secret_access_key: AWS secret access key

      • session_token: AWS temporary session token

    • profile: The name of a profile to use. If not given, then the default profile is used.

    • anonymous: Set anonymous credentials.

  • endpoint: The complete URL to use for the constructed client.

  • region: The AWS Region used in instantiating the client.

  • close_connection: Immediately close all HTTP connections.

  • timeout: The time in seconds until a timeout exception is thrown when attempting to make a connection. The default is 60 seconds.

  • s3_force_path_style: Set this to true to force the request to use path-style addressing, i.e. http://s3.amazonaws.com/BUCKET/KEY.

  • sts_regional_endpoint: Set the STS regional endpoint resolver to regional or legacy. See https://docs.aws.amazon.com/sdkref/latest/guide/feature-sts-regionalized-endpoints.html

credentials

Optional credentials shorthand for the config parameter.

  • creds:

    • access_key_id: AWS access key ID

    • secret_access_key: AWS secret access key

    • session_token: AWS temporary session token

  • profile: The name of a profile to use. If not given, then the default profile is used.

  • anonymous: Set anonymous credentials.

endpoint

Optional shorthand for complete URL to use for the constructed client.

region

Optional shorthand for AWS Region used in instantiating the client.
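
As a rough illustration of the arguments above, the sketch below passes the same credentials and Region once through the nested config list and once through the shorthand arguments. The profile name "analytics", the Region "us-east-1", and the 120-second timeout are placeholder assumptions, not defaults. The clients are constructed only if the paws package is installed, and constructing a client is not expected to send a request.

```r
# Placeholder values for illustration; substitute your own profile and Region.
creds <- list(profile = "analytics")
region <- "us-east-1"

# Full form: everything nested inside `config`.
cfg <- list(credentials = creds, region = region, timeout = 120)

# Construct the clients only if paws is installed.
if (requireNamespace("paws", quietly = TRUE)) {
  # Full form.
  svc_full <- paws::emr(config = cfg)

  # Shorthand form: credentials and region as top-level arguments.
  # Settings without a shorthand, such as timeout, still go in `config`.
  svc_short <- paws::emr(
    config = list(timeout = 120),
    credentials = creds,
    region = region
  )
}
```

The shorthand form is mostly a convenience when only credentials, endpoint, or region need to be set.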

Value

A client for the service. You can call the service's operations using syntax like svc$operation(...), where svc is the name you've assigned to the client. The available operations are listed in the Operations section.
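
A minimal sketch of the svc$operation(...) calling style, using list_clusters. The helper name active_cluster_names is hypothetical, and the response is assumed to contain a Clusters list whose entries carry a Name field. A stand-in client with made-up data replaces a real emr() client so the sketch runs without AWS access.

```r
# Return the names of clusters in the given states.
# `svc` is assumed to be a client created with emr(); the response is
# assumed to hold a `Clusters` list whose entries have a `Name` field.
active_cluster_names <- function(svc, states = c("RUNNING", "WAITING")) {
  resp <- svc$list_clusters(ClusterStates = as.list(states))
  vapply(resp$Clusters, function(cluster) cluster$Name, character(1))
}

# Stand-in client so the sketch runs without AWS access (hypothetical data).
fake_svc <- list(
  list_clusters = function(ClusterStates = NULL, ...) {
    list(Clusters = list(
      list(Id = "j-EXAMPLE1", Name = "nightly-etl"),
      list(Id = "j-EXAMPLE2", Name = "adhoc-analysis")
    ))
  }
)

cluster_names <- active_cluster_names(fake_svc)
```

With a real client, the call would be active_cluster_names(emr()) instead.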

Service syntax

svc <- emr(
  config = list(
    credentials = list(
      creds = list(
        access_key_id = "string",
        secret_access_key = "string",
        session_token = "string"
      ),
      profile = "string",
      anonymous = "logical"
    ),
    endpoint = "string",
    region = "string",
    close_connection = "logical",
    timeout = "numeric",
    s3_force_path_style = "logical",
    sts_regional_endpoint = "string"
  ),
  credentials = list(
    creds = list(
      access_key_id = "string",
      secret_access_key = "string",
      session_token = "string"
    ),
    profile = "string",
    anonymous = "logical"
  ),
  endpoint = "string",
  region = "string"
)

Operations

add_instance_fleet Adds an instance fleet to a running cluster
add_instance_groups Adds one or more instance groups to a running cluster
add_job_flow_steps AddJobFlowSteps adds new steps to a running cluster
add_tags Adds tags to an Amazon EMR resource, such as a cluster or an Amazon EMR Studio
cancel_steps Cancels a pending step or steps in a running cluster
create_security_configuration Creates a security configuration, which is stored in the service and can be specified when a cluster is created
create_studio Creates a new Amazon EMR Studio
create_studio_session_mapping Maps a user or group to the Amazon EMR Studio specified by StudioId, and applies a session policy to refine Studio permissions for that user or group
delete_security_configuration Deletes a security configuration
delete_studio Removes an Amazon EMR Studio from the Studio metadata store
delete_studio_session_mapping Removes a user or group from an Amazon EMR Studio
describe_cluster Provides cluster-level details including status, hardware and software configuration, VPC settings, and so on
describe_job_flows This API is no longer supported and will eventually be removed
describe_notebook_execution Provides details of a notebook execution
describe_release_label Provides Amazon EMR release label details, such as the releases available in the Region where the API request is run, and the available applications for a specific Amazon EMR release label
describe_security_configuration Provides the details of a security configuration by returning the configuration JSON
describe_step Provides more detail about the cluster step
describe_studio Returns details for the specified Amazon EMR Studio including ID, Name, VPC, Studio access URL, and so on
get_auto_termination_policy Returns the auto-termination policy for an Amazon EMR cluster
get_block_public_access_configuration Returns the Amazon EMR block public access configuration for your Amazon Web Services account in the current Region
get_cluster_session_credentials Provides temporary, HTTP basic credentials that are associated with a given runtime IAM role and used by a cluster with fine-grained access control activated
get_managed_scaling_policy Fetches the attached managed scaling policy for an Amazon EMR cluster
get_studio_session_mapping Fetches mapping details for the specified Amazon EMR Studio and identity (user or group)
list_bootstrap_actions Provides information about the bootstrap actions associated with a cluster
list_clusters Provides the status of all clusters visible to this Amazon Web Services account
list_instance_fleets Lists all available details about the instance fleets in a cluster
list_instance_groups Provides all available details about the instance groups in a cluster
list_instances Provides information for all active Amazon EC2 instances and Amazon EC2 instances terminated in the last 30 days, up to a maximum of 2,000
list_notebook_executions Provides summaries of all notebook executions
list_release_labels Retrieves release labels of Amazon EMR services in the Region where the API is called
list_security_configurations Lists all the security configurations visible to this account, providing their creation dates and times, and their names
list_steps Provides a list of steps for the cluster in reverse order unless you specify stepIds with the request or filter by StepStates
list_studios Returns a list of all Amazon EMR Studios associated with the Amazon Web Services account
list_studio_session_mappings Returns a list of all user or group session mappings for the Amazon EMR Studio specified by StudioId
list_supported_instance_types A list of the instance types that Amazon EMR supports
modify_cluster Modifies the number of steps that can be executed concurrently for the cluster specified using ClusterID
modify_instance_fleet Modifies the target On-Demand and target Spot capacities for the instance fleet with the specified InstanceFleetID within the cluster specified using ClusterID
modify_instance_groups ModifyInstanceGroups modifies the number of nodes and configuration settings of an instance group
put_auto_scaling_policy Creates or updates an automatic scaling policy for a core instance group or task instance group in an Amazon EMR cluster
put_auto_termination_policy Creates or updates an auto-termination policy for an Amazon EMR cluster
put_block_public_access_configuration Creates or updates an Amazon EMR block public access configuration for your Amazon Web Services account in the current Region
put_managed_scaling_policy Creates or updates a managed scaling policy for an Amazon EMR cluster
remove_auto_scaling_policy Removes an automatic scaling policy from a specified instance group within an Amazon EMR cluster
remove_auto_termination_policy Removes an auto-termination policy from an Amazon EMR cluster
remove_managed_scaling_policy Removes a managed scaling policy from a specified Amazon EMR cluster
remove_tags Removes tags from an Amazon EMR resource, such as a cluster or Amazon EMR Studio
run_job_flow RunJobFlow creates and starts running a new cluster (job flow)
set_keep_job_flow_alive_when_no_steps Use SetKeepJobFlowAliveWhenNoSteps to configure a cluster (job flow) to terminate after the step execution
set_termination_protection SetTerminationProtection locks a cluster (job flow) so the Amazon EC2 instances in the cluster cannot be terminated by user intervention, an API call, or in the event of a job-flow error
set_unhealthy_node_replacement Specify whether to enable unhealthy node replacement, which lets Amazon EMR gracefully replace core nodes on a cluster if any nodes become unhealthy
set_visible_to_all_users The SetVisibleToAllUsers parameter is no longer supported
start_notebook_execution Starts a notebook execution
stop_notebook_execution Stops a notebook execution
terminate_job_flows TerminateJobFlows shuts a list of clusters (job flows) down
update_studio Updates an Amazon EMR Studio configuration, including attributes such as name, description, and subnets
update_studio_session_mapping Updates the session policy attached to the user or group for the specified Amazon EMR Studio
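
Several of the list operations above, such as list_clusters, list_steps, and list_instances, return results in pages and accept a Marker argument to fetch the next page. The sketch below shows one way to collect all pages. The helper collect_pages is hypothetical and not part of paws, and it assumes that a missing, zero-length, or empty Marker in the response means there are no further pages. A stand-in function with made-up data replaces the real operation so the sketch runs without AWS access.

```r
# Collect every page of a list operation that paginates with `Marker`.
# `list_fn` is a client operation such as svc$list_clusters;
# `field` names the response element holding the items (e.g. "Clusters").
collect_pages <- function(list_fn, field, ..., max_pages = 100) {
  items <- list()
  marker <- NULL
  for (page in seq_len(max_pages)) {
    resp <- if (is.null(marker)) list_fn(...) else list_fn(..., Marker = marker)
    items <- c(items, resp[[field]])
    marker <- resp$Marker
    # An absent, zero-length, or empty marker is treated as the last page.
    if (length(marker) == 0 || !nzchar(marker)) break
  }
  items
}

# Stand-in for svc$list_clusters that serves two pages (hypothetical data).
fake_list_clusters <- function(Marker = NULL, ...) {
  if (is.null(Marker)) {
    list(
      Clusters = list(list(Id = "j-EXAMPLE1"), list(Id = "j-EXAMPLE2")),
      Marker = "page-2"
    )
  } else {
    list(Clusters = list(list(Id = "j-EXAMPLE3")), Marker = character(0))
  }
}

all_clusters <- collect_pages(fake_list_clusters, "Clusters")
```

With a real client, the equivalent call would look like collect_pages(svc$list_clusters, "Clusters", ClusterStates = list("RUNNING")). The max_pages cap is a safeguard against looping indefinitely if the marker never clears.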

Examples

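## Not run: 
svc <- emr()
# Placeholder values; replace the cluster ID with that of a running cluster.
svc$add_instance_fleet(
  ClusterId = "j-XXXXXXXXXXXXX",
  InstanceFleet = list(
    InstanceFleetType = "TASK",
    TargetOnDemandCapacity = 1,
    InstanceTypeConfigs = list(list(InstanceType = "m5.xlarge"))
  )
)

## End(Not run)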


[Package paws version 0.6.0 Index]