S3FileSystem {s3fs} R Documentation
Access AWS S3 as if it were a file system.
Description
This creates a file-system-like API for AWS S3 storage, modeled on the fs package
(e.g. dir_ls, file_copy, etc.).
Public fields
s3_cache
Cache AWS S3
s3_cache_bucket
Cached s3 bucket
s3_client
paws s3 client
region_name
AWS region when creating new connections
profile_name
The name of a profile to use
multipart_threshold
Threshold to use multipart
request_payer
Confirms that the requester knows that they will be charged for the request
pid
Get the process ID of the R Session
Active bindings
retries
number of retries
Methods
Public methods
Method new()
Initialize S3FileSystem class
Usage
S3FileSystem$new( aws_access_key_id = NULL, aws_secret_access_key = NULL, aws_session_token = NULL, region_name = NULL, profile_name = NULL, endpoint = NULL, disable_ssl = FALSE, multipart_threshold = fs_bytes("2GB"), request_payer = FALSE, anonymous = FALSE, ... )
Arguments
aws_access_key_id
(character): AWS access key ID
aws_secret_access_key
(character): AWS secret access key
aws_session_token
(character): AWS temporary session token
region_name
(character): Default region when creating new connections
profile_name
(character): The name of a profile to use. If not given, then the default profile is used.
endpoint
(character): The complete URL to use for the constructed client.
disable_ssl
(logical): Whether or not to use SSL. By default, SSL is used.
multipart_threshold
(fs_bytes): Threshold to use multipart instead of standard copy and upload methods.
request_payer
(logical): Confirms that the requester knows that they will be charged for the request.
anonymous
(logical): Set up anonymous credentials when connecting to AWS S3.
...
Other parameters passed to the paws client.
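As a rough illustration of the constructor arguments above, the sketch below wraps three common ways of building the client in one helper. The helper name connect_s3, the profile name "analytics" and the region values are hypothetical; only the argument names come from the Usage section, and the s3fs and fs packages are assumed to be installed.

```r
# Hypothetical helper: build an S3FileSystem object in one of three ways.
connect_s3 <- function(how = c("default", "profile", "anonymous")) {
  how <- match.arg(how)
  switch(how,
    # Credentials and region are left for paws to resolve
    default = s3fs::S3FileSystem$new(),
    # Named profile, explicit region, and a lower multipart threshold
    profile = s3fs::S3FileSystem$new(
      profile_name = "analytics",            # hypothetical profile name
      region_name = "eu-west-1",
      multipart_threshold = fs::fs_bytes("500MB")
    ),
    # Anonymous credentials, e.g. for publicly readable buckets
    anonymous = s3fs::S3FileSystem$new(anonymous = TRUE, region_name = "us-east-1")
  )
}
```

For example, fs <- connect_s3("anonymous") would give an object whose methods are called as fs$dir_ls(...), fs$file_info(...), and so on.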
Method file_chmod()
Change file permissions
Usage
S3FileSystem$file_chmod( path, mode = c("private", "public-read", "public-read-write", "authenticated-read", "aws-exec-read", "bucket-owner-read", "bucket-owner-full-control") )
Arguments
path
(character): A character vector of path or s3 uri.
mode
(character): A character of the mode
Returns
character vector of s3 uri paths
Method file_copy()
copy files
Usage
S3FileSystem$file_copy( path, new_path, max_batch = fs_bytes("100MB"), overwrite = FALSE, ... )
Arguments
path
(character): Path to a local directory or file, or an s3 uri.
new_path
(character): Path to a local directory or file, or an s3 uri.
max_batch
(fs_bytes): Maximum batch size being uploaded with each multipart.
overwrite
(logical): Overwrite files if they exist. If this is FALSE and the file exists, an error will be thrown.
...
parameters to be passed to s3_put_object
Returns
character vector of s3 uri paths
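A minimal sketch of calling file_copy() with the arguments documented above. The wrapper name and the uris in the usage comment are hypothetical; fs is assumed to be an object created with S3FileSystem$new().

```r
# Hypothetical wrapper: copy one object to a new key.
copy_object <- function(fs, from, to, replace = FALSE) {
  # With overwrite = FALSE (the default) an existing `to` raises an error
  fs$file_copy(from, to, overwrite = replace)
}

# Usage (requires AWS access, so left commented out):
# fs <- s3fs::S3FileSystem$new()
# copy_object(fs, "s3://my-bucket/raw/report.csv", "s3://my-bucket/archive/report.csv")
```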
Method file_create()
Create a file on AWS S3. If the file already exists, it is left unchanged.
Usage
S3FileSystem$file_create(path, overwrite = FALSE, ...)
Arguments
path
(character): A character vector of path or s3 uri.
overwrite
(logical): Overwrite files if they exist. If this is FALSE and the file exists, an error will be thrown.
...
parameters to be passed to s3_put_object
Returns
character vector of s3 uri paths
Method file_delete()
Delete files in AWS S3
Usage
S3FileSystem$file_delete(path, ...)
Arguments
path
(character): A character vector of paths or s3 uris.
...
parameters to be passed to s3_delete_objects
Returns
character vector of s3 uri paths
Method file_download()
Downloads AWS S3 files to local
Usage
S3FileSystem$file_download(path, new_path, overwrite = FALSE, ...)
Arguments
path
(character): A character vector of paths or uris
new_path
(character): A character vector of paths to the new locations.
overwrite
(logical): Overwrite files if they exist. If this is FALSE and the file exists, an error will be thrown.
...
parameters to be passed to s3_get_object
Returns
character vector of s3 uri paths
Method file_exists()
Check if file exists in AWS S3
Usage
S3FileSystem$file_exists(path)
Arguments
path
(character) s3 path to check
Returns
logical vector if file exists
Method file_info()
Returns file information within AWS S3 directory
Usage
S3FileSystem$file_info(path)
Arguments
path
(character): A character vector of paths or uris.
Returns
A data.table with metadata for each file. Columns returned are as follows.
bucket_name (character): AWS S3 bucket of file
key (character): AWS S3 path key of file
uri (character): S3 uri of file
size (numeric): file size in bytes
type (character): file type (file or directory)
etag (character): An entity tag is an opaque identifier
last_modified (POSIXct): Created date of file.
delete_marker (logical): Specifies whether the object retrieved was a delete marker
accept_ranges (character): Indicates that a range of bytes was specified.
expiration (character): File expiration
restore (character): If file is archived
archive_status (character): Archive status
missing_meta (integer): Number of metadata entries not returned in "x-amz-meta" headers
version_id (character): version id of file
cache_control (character): caching behaviour for the request/reply chain
content_disposition (character): presentational information of file
content_encoding (character): file content encodings
content_language (character): what language the content is in
content_type (character): file MIME type
expires (POSIXct): date and time the file is no longer cacheable
website_redirect_location (character): redirects request for file to another
server_side_encryption (character): File server side encryption
metadata (list): metadata of file
sse_customer_algorithm (character): server-side encryption with a customer-provided encryption key
sse_customer_key_md5 (character): server-side encryption with a customer-provided encryption key
ssekms_key_id (character): ID of the Amazon Web Services Key Management Service
bucket_key_enabled (logical): s3 bucket key for server-side encryption with
storage_class (character): file storage class information
request_charged (character): indicates successfully charged for request
replication_status (character): return specific header if request involves a bucket that is either a source or a destination in a replication rule https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#S3.Client.head_object
parts_count (integer): number of parts the file has
object_lock_mode (character): the file lock mode
object_lock_retain_until_date (POSIXct): date and time of when object_lock_mode expires
object_lock_legal_hold_status (character): file legal holding
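Because file_info() returns a wide table, it can help to keep only a few columns. The sketch below assumes fs is an S3FileSystem object and uses column names taken from the list above; the helper name is hypothetical.

```r
# Hypothetical helper: return a compact view of file_info() output.
summarise_files <- function(fs, paths) {
  info <- fs$file_info(paths)
  # Columns chosen from the documented list above
  cols <- c("uri", "size", "type", "last_modified", "storage_class")
  # Convert to a plain data.frame so base-R column selection applies
  as.data.frame(info)[, cols]
}

# Usage (requires AWS access, so left commented out):
# fs <- s3fs::S3FileSystem$new()
# summarise_files(fs, "s3://my-bucket/data/report.csv")
```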
Method file_move()
Move files to another location on AWS S3
Usage
S3FileSystem$file_move( path, new_path, max_batch = fs_bytes("100MB"), overwrite = FALSE, ... )
Arguments
path
(character): A character vector of s3 uri
new_path
(character): A character vector of s3 uri.
max_batch
(fs_bytes): Maximum batch size being uploaded with each multipart.
overwrite
(logical): Overwrite files if they exist. If this is FALSE and the file exists, an error will be thrown.
...
parameters to be passed to s3_copy_object
Returns
character vector of s3 uri paths
Method file_size()
Return file size in bytes
Usage
S3FileSystem$file_size(path)
Arguments
path
(character): A character vector of s3 uri
Method file_stream_in()
Streams in AWS S3 file as a raw vector
Usage
S3FileSystem$file_stream_in(path, ...)
Arguments
path
(character): A character vector of paths or s3 uri
...
parameters to be passed to s3_get_object
Returns
list of raw vectors containing the contents of the file
Method file_stream_out()
Streams out raw vector to AWS S3 file
Usage
S3FileSystem$file_stream_out( obj, path, max_batch = fs_bytes("100MB"), overwrite = FALSE, ... )
Arguments
obj
(raw|character): A raw vector, rawConnection, or url to be streamed up to AWS S3.
path
(character): A character vector of paths or s3 uri
max_batch
(fs_bytes): Maximum batch size being uploaded with each multipart.
overwrite
(logical): Overwrite files if they exist. If this is FALSE and the file exists, an error will be thrown.
...
parameters to be passed to s3_put_object
Returns
character vector of s3 uri paths
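The two streaming methods can be combined into a round trip without touching the local disk. This is a sketch only: the helper name is hypothetical and fs is assumed to be an S3FileSystem object with write access to the target path.

```r
# Hypothetical helper: write a string to S3 and read it back.
stream_round_trip <- function(fs, path, text) {
  # file_stream_out() accepts a raw vector, so convert the string first
  fs$file_stream_out(charToRaw(text), path, overwrite = TRUE)
  # file_stream_in() returns a list of raw vectors, one element per path
  raw_list <- fs$file_stream_in(path)
  rawToChar(raw_list[[1]])
}

# Usage (requires AWS access, so left commented out):
# fs <- s3fs::S3FileSystem$new()
# stream_round_trip(fs, "s3://my-bucket/notes/hello.txt", "hello")
```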
Method file_temp()
Return a name which can be used as a temporary file
Usage
S3FileSystem$file_temp(pattern = "file", tmp_dir = "", ext = "")
Arguments
pattern
(character): A character vector with the non-random portion of the name.
tmp_dir
(character): The directory the file will be created in.
ext
(character): The file extension of the temporary file.
Returns
character vector of s3 uri paths
Method file_tag_delete()
Delete file tags
Usage
S3FileSystem$file_tag_delete(path)
Arguments
path
(character): A character vector of paths or s3 uri
...
parameters to be passed to s3_put_object
Returns
character vector of s3 uri paths
Method file_tag_info()
Get file tags
Usage
S3FileSystem$file_tag_info(path)
Arguments
path
(character): A character vector of paths or s3 uri
Returns
data.table of file version metadata
bucket_name (character): AWS S3 bucket of file
key (character): AWS S3 path key of file
uri (character): S3 uri of file
size (numeric): file size in bytes
version_id (character): version id of file
tag_key (character): name of tag
tag_value (character): tag value
Method file_tag_update()
Update file tags
Usage
S3FileSystem$file_tag_update(path, tags, overwrite = FALSE)
Arguments
path
(character): A character vector of paths or s3 uri
tags
(list): Tags to be applied
overwrite
(logical): Whether to overwrite existing tags or modify them in place. The default modifies in place.
Returns
character vector of s3 uri paths
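The sketch below shows the two overwrite modes of file_tag_update() side by side. It assumes that tags is given as a named list of key = value pairs; the tag names and the helper name are hypothetical, and fs is assumed to be an S3FileSystem object.

```r
# Hypothetical helper: apply tags, then replace them, then inspect the result.
tag_file <- function(fs, path) {
  # Default (overwrite = FALSE): modify the existing tag set in place
  fs$file_tag_update(path, tags = list(project = "demo", stage = "raw"))
  # overwrite = TRUE: replace the existing tagging with the supplied tags
  fs$file_tag_update(path, tags = list(stage = "clean"), overwrite = TRUE)
  # Returns a table with tag_key and tag_value columns
  fs$file_tag_info(path)
}
```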
Method file_touch()
Similar to fs::file_touch, this does not create the file if it does not exist.
Use s3fs$file_create() to do this if needed.
Usage
S3FileSystem$file_touch(path, ...)
Arguments
path
(character): A character vector of paths or s3 uri
...
parameters to be passed to s3_copy_object
Returns
character vector of s3 uri paths
Method file_upload()
Uploads files to AWS S3
Usage
S3FileSystem$file_upload( path, new_path, max_batch = fs_bytes("100MB"), overwrite = FALSE, ... )
Arguments
path
(character): A character vector of local file paths to upload to AWS S3
new_path
(character): A character vector of AWS S3 paths or uris of the new locations.
max_batch
(fs_bytes): Maximum batch size being uploaded with each multipart.
overwrite
(logical): Overwrite files if they exist. If this is FALSE and the file exists, an error will be thrown.
...
parameters to be passed to s3_put_object and s3_create_multipart_upload
Returns
character vector of s3 uri paths
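As an illustration of file_upload() together with file_download(), the following sketch writes a small CSV locally, uploads it, and downloads it again. The helper name and the uri in the usage comment are hypothetical; fs is assumed to be an S3FileSystem object with write access.

```r
# Hypothetical helper: upload a local CSV and read it back from S3.
csv_round_trip <- function(fs, s3_uri) {
  # Write a small sample file to a temporary local path
  local_in <- tempfile(fileext = ".csv")
  write.csv(head(mtcars), local_in, row.names = FALSE)
  # Upload, replacing any existing object at s3_uri
  fs$file_upload(local_in, s3_uri, overwrite = TRUE)
  # Download to a second temporary path and read the result
  local_out <- tempfile(fileext = ".csv")
  fs$file_download(s3_uri, local_out, overwrite = TRUE)
  read.csv(local_out)
}

# Usage (requires AWS access, so left commented out):
# fs <- s3fs::S3FileSystem$new()
# csv_round_trip(fs, "s3://my-bucket/tmp/mtcars.csv")
```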
Method file_url()
Generate presigned url for S3 object
Usage
S3FileSystem$file_url(path, expiration = 3600L, ...)
Arguments
path
(character): A character vector of paths or uris
expiration
(numeric): The number of seconds the presigned url is valid for. By default it expires in an hour (3600 seconds)
...
parameters passed to s3_get_object
Returns
character vector of urls
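A short sketch of generating a presigned url with a non-default lifetime. The helper name is hypothetical; the only documented behaviour relied on is that expiration is given in seconds.

```r
# Hypothetical helper: presigned url valid for a given number of minutes.
share_link <- function(fs, path, minutes = 15) {
  # expiration is expressed in seconds, so convert from minutes
  fs$file_url(path, expiration = minutes * 60)
}

# Usage (requires AWS credentials, so left commented out):
# fs <- s3fs::S3FileSystem$new()
# share_link(fs, "s3://my-bucket/data/report.csv", minutes = 5)
```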
Method file_version_info()
Get file versions
Usage
S3FileSystem$file_version_info(path, ...)
Arguments
path
(character): A character vector of paths or uris
...
parameters to be passed to s3_list_object_versions
Returns
return data.table with file version info, columns below:
bucket_name (character): AWS S3 bucket of file
key (character): AWS S3 path key of file
uri (character): S3 uri of file
size (numeric): file size in bytes
version_id (character): version id of file
owner (character): file owner
etag (character): An entity tag is an opaque identifier
last_modified (POSIXct): Created date of file.
Method is_file()
Test for file types
Usage
S3FileSystem$is_file(path)
Arguments
path
(character): A character vector of paths or uris
Returns
logical vector if object is a file
Method is_dir()
Test for file types
Usage
S3FileSystem$is_dir(path)
Arguments
path
(character): A character vector of paths or uris
Returns
logical vector if object is a directory
Method is_bucket()
Test for file types
Usage
S3FileSystem$is_bucket(path, ...)
Arguments
path
(character): A character vector of paths or uris
...
parameters to be passed to s3_list_objects_v2
Returns
logical vector if object is an AWS S3 bucket
Method is_file_empty()
Test for file types
Usage
S3FileSystem$is_file_empty(path)
Arguments
path
(character): A character vector of paths or uris
Returns
logical vector if file is empty
Method bucket_chmod()
Change bucket permissions
Usage
S3FileSystem$bucket_chmod( path, mode = c("private", "public-read", "public-read-write", "authenticated-read") )
Arguments
path
(character): A character vector of path or s3 uri.
mode
(character): A character of the mode
Returns
character vector of s3 uri paths
Method bucket_create()
Create bucket
Usage
S3FileSystem$bucket_create( path, region_name = NULL, mode = c("private", "public-read", "public-read-write", "authenticated-read"), versioning = FALSE, ... )
Arguments
path
(character): A character vector of path or s3 uri.
region_name
(character): aws region
mode
(character): A character of the mode
versioning
(logical): Whether to set the bucket to versioning or not.
...
parameters to be passed to s3_create_bucket
Returns
character vector of s3 uri paths
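The arguments of bucket_create() can be combined as in the sketch below, which creates a private, versioned bucket. The helper name, the default region and the bucket name in the usage comment are hypothetical; fs is assumed to be an S3FileSystem object with permission to create buckets.

```r
# Hypothetical helper: create a private bucket with versioning switched on.
create_versioned_bucket <- function(fs, bucket, region = "eu-west-1") {
  fs$bucket_create(
    bucket,
    region_name = region,
    mode = "private",     # one of the modes listed in Usage
    versioning = TRUE
  )
}

# Usage (requires AWS access, so left commented out):
# fs <- s3fs::S3FileSystem$new()
# create_versioned_bucket(fs, "s3://my-hypothetical-bucket")
```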
Method bucket_delete()
Delete bucket
Usage
S3FileSystem$bucket_delete(path)
Arguments
path
(character): A character vector of path or s3 uri.
Method dir_copy()
Copies the directory recursively to the new location.
Usage
S3FileSystem$dir_copy( path, new_path, max_batch = fs_bytes("100MB"), overwrite = FALSE, ... )
Arguments
path
(character): Path to a local directory or file, or an s3 uri.
new_path
(character): Path to a local directory or file, or an s3 uri.
max_batch
(fs_bytes): Maximum batch size being uploaded with each multipart.
overwrite
(logical): Overwrite files if they exist. If this is FALSE and the file exists, an error will be thrown.
...
parameters to be passed to s3_put_object and s3_create_multipart_upload
Returns
character vector of s3 uri paths
Method dir_create()
Create empty directory
Usage
S3FileSystem$dir_create(path, overwrite = FALSE, ...)
Arguments
path
(character): A vector of directory or uri to be created in AWS S3
overwrite
(logical): Overwrite files if they exist. If this is FALSE and the file exists, an error will be thrown.
...
parameters to be passed to s3_put_object
Returns
character vector of s3 uri paths
Method dir_delete()
Delete contents and directory in AWS S3
Usage
S3FileSystem$dir_delete(path)
Arguments
path
(character): A vector of paths or uris to directories to be deleted.
Returns
character vector of s3 uri paths
Method dir_exists()
Check if path exists in AWS S3
Usage
S3FileSystem$dir_exists(path = ".")
Arguments
path
(character) aws s3 path to be checked
Returns
logical vector if directory exists
Method dir_download()
Downloads AWS S3 files to local
Usage
S3FileSystem$dir_download(path, new_path, overwrite = FALSE, ...)
Arguments
path
(character): A character vector of paths or uris
new_path
(character): A character vector of paths to the new locations. Please ensure directories end with a /.
overwrite
(logical): Overwrite files if they exist. If this is FALSE and the file exists, an error will be thrown.
...
parameters to be passed to s3_get_object
Returns
character vector of s3 uri paths
Method dir_info()
Returns file information within AWS S3 directory
Usage
S3FileSystem$dir_info( path = ".", type = c("any", "bucket", "directory", "file"), glob = NULL, regexp = NULL, invert = FALSE, recurse = FALSE, refresh = FALSE, ... )
Arguments
path
(character): A character vector of one or more paths. Can be path or s3 uri.
type
(character): File type(s) to return. Default ("any") returns all AWS S3 object types.
glob
(character): A wildcard pattern (e.g. *.csv), passed on to grep() to filter paths.
regexp
(character): A regular expression (e.g. [.]csv$), passed on to grep() to filter paths.
invert
(logical): If TRUE, return files which do not match.
recurse
(logical): Returns all AWS S3 objects in lower sub directories
refresh
(logical): Refresh the cache in s3_cache.
...
parameters to be passed to s3_list_objects_v2
Returns
data.table with directory metadata
bucket_name (character): AWS S3 bucket of file
key (character): AWS S3 path key of file
uri (character): S3 uri of file
size (numeric): file size in bytes
version_id (character): version id of file
etag (character): An entity tag is an opaque identifier
last_modified (POSIXct): Created date of file
Method dir_ls()
Returns file name within AWS S3 directory
Usage
S3FileSystem$dir_ls( path = ".", type = c("any", "bucket", "directory", "file"), glob = NULL, regexp = NULL, invert = FALSE, recurse = FALSE, refresh = FALSE, ... )
Arguments
path
(character): A character vector of one or more paths. Can be path or s3 uri.
type
(character): File type(s) to return. Default ("any") returns all AWS S3 object types.
glob
(character): A wildcard pattern (e.g. *.csv), passed on to grep() to filter paths.
regexp
(character): A regular expression (e.g. [.]csv$), passed on to grep() to filter paths.
invert
(logical): If TRUE, return files which do not match.
recurse
(logical): Returns all AWS S3 objects in lower sub directories
refresh
(logical): Refresh the cache in s3_cache.
...
parameters to be passed to s3_list_objects_v2
Returns
character vector of s3 uri paths
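To make the glob, regexp and invert arguments concrete, the sketch below first applies equivalent filters to made-up uris locally, then shows the corresponding dir_ls() calls. Using utils::glob2rx() to turn the wildcard into a regular expression is an assumption for illustration, not a statement about how the package translates globs. The uris and helper name are hypothetical.

```r
# Local illustration on made-up uris (no AWS call is made here).
uris <- c("s3://my-bucket/data/a.csv", "s3://my-bucket/data/b.txt")
glob_hits <- grepl(utils::glob2rx("*.csv"), uris)  # wildcard translated to a regex
regexp_hits <- grepl("[.]csv$", uris)              # regular expression used directly
inverted <- !regexp_hits                           # what invert = TRUE would keep

# The same filters through the class; `fs` is an S3FileSystem object.
list_by_type <- function(fs, dir) {
  list(
    csv_glob = fs$dir_ls(dir, glob = "*.csv", recurse = TRUE),
    csv_regexp = fs$dir_ls(dir, regexp = "[.]csv$", recurse = TRUE),
    # refresh = TRUE asks for the cached listing to be refreshed
    other = fs$dir_ls(dir, regexp = "[.]csv$", invert = TRUE,
                      recurse = TRUE, refresh = TRUE)
  )
}
```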
Method dir_ls_url()
Generate presigned url to list S3 directories
Usage
S3FileSystem$dir_ls_url(path, expiration = 3600L, recurse = FALSE, ...)
Arguments
path
(character): A character vector of paths or uris
expiration
(numeric): The number of seconds the presigned url is valid for. By default it expires in an hour (3600 seconds)
recurse
(logical): Returns all AWS S3 objects in lower sub directories
...
parameters passed to s3_list_objects_v2
Returns
character vector of urls
Method dir_tree()
Print contents of directories in a tree-like format
Usage
S3FileSystem$dir_tree(path, recurse = TRUE, ...)
Arguments
path
(character): A path to print the tree from
recurse
(logical): Returns all AWS S3 objects in lower sub directories
...
Additional arguments passed to s3_dir_ls.
Returns
character vector of s3 uri paths
Method dir_upload()
Uploads local directory to AWS S3
Usage
S3FileSystem$dir_upload( path, new_path, max_batch = fs_bytes("100MB"), overwrite = FALSE, ... )
Arguments
path
(character): A character vector of local file paths to upload to AWS S3
new_path
(character): A character vector of AWS S3 paths or uris of the new locations.
max_batch
(fs_bytes): Maximum batch size being uploaded with each multipart.
overwrite
(logical): Overwrite files if they exist. If this is FALSE and the file exists, an error will be thrown.
...
parameters to be passed to s3_put_object and s3_create_multipart_upload
Returns
character vector of s3 uri paths
Method path()
Constructs an s3 uri path
Usage
S3FileSystem$path(..., ext = "")
Arguments
...
(character): Character vectors
ext
(character): An optional extension to append to the generated path
Returns
character vector of s3 uri paths
Method path_dir()
Returns the directory portion of s3 uri
Usage
S3FileSystem$path_dir(path)
Arguments
path
(character): A character vector of paths
Returns
character vector of s3 uri paths
Method path_ext()
Returns the last extension for a path.
Usage
S3FileSystem$path_ext(path)
Arguments
path
(character): A character vector of paths
Returns
character s3 uri file extension
Method path_ext_remove()
Removes the last extension and returns the rest of the s3 uri.
Usage
S3FileSystem$path_ext_remove(path)
Arguments
path
(character): A character vector of paths
Returns
character vector of s3 uri paths
Method path_ext_set()
Replace the extension with a new extension.
Usage
S3FileSystem$path_ext_set(path, ext)
Arguments
path
(character): A character vector of paths
ext
(character): New file extension
Returns
character vector of s3 uri paths
Method path_file()
Returns the file name portion of the s3 uri path
Usage
S3FileSystem$path_file(path)
Arguments
path
(character): A character vector of paths
Returns
character vector of file names
Method path_join()
Construct an s3 uri path from path vector
Usage
S3FileSystem$path_join(parts)
Arguments
parts
(character): A character vector of one or more paths
Returns
character vector of s3 uri paths
Method path_split()
Split s3 uri path to core components bucket, key and version id
Usage
S3FileSystem$path_split(path)
Arguments
path
(character): A character vector of one or more paths or s3 uri
Returns
list of character vectors splitting the s3 uri path into "Bucket", "Key" and "VersionId"
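The path helpers documented above can be exercised together on a single uri. This is a sketch: the helper name and the bucket and key parts are hypothetical, and fs is assumed to be an S3FileSystem object. No expected values are shown because the exact output format is not specified here.

```r
# Hypothetical helper: build one s3 uri and take it apart again.
path_demo <- function(fs) {
  uri <- fs$path("my-bucket", "data", "report", ext = "csv")
  list(
    uri = uri,
    dir = fs$path_dir(uri),                # directory portion
    file = fs$path_file(uri),              # file name portion
    ext = fs$path_ext(uri),                # last extension
    no_ext = fs$path_ext_remove(uri),      # uri without the last extension
    as_txt = fs$path_ext_set(uri, "txt"),  # extension replaced
    parts = fs$path_split(uri)             # "Bucket", "Key" and "VersionId"
  )
}

# Usage (assumes the s3fs package is installed):
# fs <- s3fs::S3FileSystem$new()
# str(path_demo(fs))
```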
Method clear_cache()
Clear S3 Cache
Usage
S3FileSystem$clear_cache(path = NULL)
Arguments
path
(character): s3 path to be cl
Method clone()
The objects of this class are cloneable with this method.
Usage
S3FileSystem$clone(deep = FALSE)
Arguments
deep
Whether to make a deep clone.
Note
file_touch() will only update the modification time of the AWS S3 object.