mdb_txn {thor} | R Documentation |
Use mdb transactions
Description
Transactions are required for every mdb operation. Even when using the convenience functions in mdb_env (get, etc), a transaction is created and committed each time. Within a transaction, either everything happens or nothing happens, and everything gets a single consistent view of the database.
Details
There can be many read transactions per environment, but only one write transaction. Because R is single-threaded, this means that only one object can hold a write transaction on an mdb environment at a time: any further attempt to open a write transaction would block forever, waiting for a lock that can't be released because there is only one thread!
Methods
id
-
Return the mdb internal id of the transaction
Usage:
id()
Value: An integer
Note: In lmdb.h this is
mdb_txn_id()
stat
-
Brief statistics about the database. This is the same as mdb_env's stat() but applying to the transaction.
Usage:
stat()
Value: An integer vector with elements psize (the size of a database page), depth (depth of the B-tree), branch_pages (number of internal (non-leaf) pages), leaf_pages (number of leaf pages), overflow_pages (number of overflow pages) and entries (number of data items).
Note: In lmdb.h this is
mdb_stat()
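As a minimal sketch of reading these statistics (the keys and values are illustrative, and it assumes the thor package is installed), the following writes two entries in a write transaction and then picks out the entries element by name:

```r
# Illustrative sketch: a throwaway environment in a temporary directory
env <- thor::mdb_env(tempfile())
txn <- env$begin(write = TRUE)
txn$put("a", "1")
txn$put("b", "2")

# stat() returns an integer vector; individual elements can be
# looked up by the names listed above
st <- txn$stat()
n_entries <- st[["entries"]]

# Discard the changes and remove the temporary environment
txn$abort()
env$destroy()
```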
commit
-
Commit all changes made in this transaction to the database, and invalidate the transaction, and any cursors belonging to it (i.e., once committed the transaction cannot be used again)
Usage:
commit()
Value: Nothing, called for its side effects only
Note: In lmdb.h this is
mdb_txn_commit()
abort
-
Abandon all changes made in this transaction to the database, and invalidate the transaction, and any cursors belonging to it (i.e., once aborted the transaction cannot be used again). For read-only transactions there is no practical difference between abort and commit, except that using abort allows the transaction to be recycled more efficiently.
Usage:
abort(cache = TRUE)
Arguments:
cache: Logical, indicating if a read-only transaction should be cached for recycling
Value: Nothing, called for its side effects only
Note: In lmdb.h this is
mdb_txn_abort()
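To illustrate the difference from commit, here is a small sketch (the key and value are illustrative) in which a write is abandoned with abort and is therefore not visible afterwards:

```r
# Illustrative sketch: a throwaway environment in a temporary directory
env <- thor::mdb_env(tempfile())
txn <- env$begin(write = TRUE)
txn$put("a", "hello")

# Abandon the write; the transaction cannot be used after this
txn$abort()

# The value was never committed, so a fresh lookup finds nothing
value <- env$get("a", missing_is_error = FALSE)
is.null(value)

env$destroy()
```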
cursor
-
Create an mdb_cursor object in this transaction. This can be used for more powerful database interactions.
Usage:
cursor()
Value: An mdb_cursor object.
Note: In lmdb.h this is
mdb_cursor_open()
get
-
Retrieve a value from the database
Usage:
get(key, missing_is_error = TRUE, as_proxy = FALSE, as_raw = NULL)
Arguments:
key: A string (or raw vector) - the key to get
missing_is_error: Logical, indicating if a missing value is an error (by default it is). Alternatively, with missing_is_error = FALSE, a missing value will return NULL. Because no value can be NULL (all values must have nonzero length) a NULL is unambiguously missing.
as_proxy: Return a "proxy" object, which defers doing a copy into R. See mdb_proxy for more information.
as_raw: Either NULL, or a logical, to indicate the result type required. With as_raw = NULL, the default, the value will be returned as a string if possible. If not possible it will return a raw vector. With as_raw = TRUE, get() will always return a raw vector, even when it is possible to represent the value as a string. If as_raw = FALSE, get will return a string, but throw an error if this is not possible. This is discussed in more detail in the thor vignette (vignette("thor")).
Note: In lmdb.h this is
mdb_get()
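The arguments above can be sketched as follows. This is an illustrative example only (the keys and values are made up), showing the default string return, a forced raw return, and a missing key returning NULL:

```r
# Illustrative sketch: a throwaway environment in a temporary directory
env <- thor::mdb_env(tempfile())
txn <- env$begin(write = TRUE)
txn$put("a", "hello")

# Default (as_raw = NULL): returned as a string where possible
as_string <- txn$get("a")

# as_raw = TRUE: always a raw vector, even for string-like data
as_bytes <- txn$get("a", as_raw = TRUE)

# A missing key is an error by default; with missing_is_error = FALSE
# it returns NULL instead
missing_value <- txn$get("b", missing_is_error = FALSE)

txn$abort()
env$destroy()
```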
put
-
Put values into the database. In other systems, this might be called "set".
Usage:
put(key, value, overwrite = TRUE, append = FALSE)
Arguments:
key: The name of the key (string or raw vector)
value: The value to save (string or raw vector)
overwrite: Logical - when TRUE it will overwrite existing data; when FALSE, throw an error if the key already exists
append: Logical - when TRUE, append the given key/value to the end of the database. This option allows fast bulk loading when keys are already known to be in the correct order. But if you load unsorted keys with append = TRUE an error will be thrown
Note: In lmdb.h this is
mdb_put()
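A short sketch of the overwrite argument, with illustrative keys and values. It assumes, as in LMDB itself, that a refused overwrite leaves the existing value and the transaction intact; the error is caught with tryCatch so that the script can continue:

```r
# Illustrative sketch: a throwaway environment in a temporary directory
env <- thor::mdb_env(tempfile())
txn <- env$begin(write = TRUE)
txn$put("a", "first")

# With overwrite = FALSE, writing to an existing key throws an error
result <- tryCatch({
  txn$put("a", "second", overwrite = FALSE)
  "no error"
}, error = function(e) "error")

# The original value is still the one stored
current <- txn$get("a")

txn$abort()
env$destroy()
```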
del
-
Remove a key/value pair from the database
Usage:
del(key)
Arguments:
key: The name of the key (string or raw vector)
Value: A scalar logical, indicating if the value was deleted
Note: In lmdb.h this is
mdb_del()
exists
-
Test if a key exists in the database.
Usage:
exists(key)
Arguments:
key: The name of the key to test (string or raw vector). Unlike get, put and del (but like mget, mput and mdel), exists is vectorised. So the input here can be: a character vector of any length (returning the same length logical vector), a raw vector (representing one key, returning a scalar logical) or a list with each element being either a scalar character or a raw vector, returning a logical the same length as the list.
Details: This is an extension of the raw LMDB API and works by using mdb_get for each key (which for lmdb need not copy data) and then testing whether the return value is MDB_SUCCESS or MDB_NOTFOUND.
Value: A logical vector
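The three accepted input shapes can be sketched like this (keys and values are illustrative; the variable names are not part of the package):

```r
# Illustrative sketch: a throwaway environment in a temporary directory
env <- thor::mdb_env(tempfile())
txn <- env$begin(write = TRUE)
txn$put("a", "1")
txn$put("b", "2")

# Character vector: one logical per key
by_chr <- txn$exists(c("a", "b", "c"))

# Raw vector: treated as a single key, giving a scalar logical
by_raw <- txn$exists(charToRaw("a"))

# List mixing strings and raw vectors: one logical per element
by_list <- txn$exists(list("a", charToRaw("c")))

txn$abort()
env$destroy()
```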
list
-
List keys in the database
Usage:
list(starts_with = NULL, as_raw = FALSE, size = NULL)
Arguments:
starts_with: Optionally, a prefix for all strings. Note that this is not a regular expression or a filename glob. Using foo will match foo, foo:bar and foobar but not fo or FOO. Because LMDB stores keys in a sorted tree, using a prefix can greatly reduce the number of keys that need to be tested.
as_raw: Same interpretation as as_raw in $get() but with a different default. It is expected that most of the time keys will be strings, so by default we'll try and return a character vector (as_raw = FALSE). Change the default if your database contains raw keys.
size: For use with starts_with, optionally a guess at the number of keys that would be returned. With starts_with = NULL we can look the number of keys up directly so this is ignored.
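The prefix behaviour described for starts_with can be sketched with the same keys used in the explanation above (the values stored are illustrative):

```r
# Illustrative sketch: a throwaway environment in a temporary directory
env <- thor::mdb_env(tempfile())
txn <- env$begin(write = TRUE)

# Store the keys from the starts_with description
keys <- c("fo", "foo", "foo:bar", "foobar", "FOO")
for (k in keys) {
  txn$put(k, "x")
}

# All keys in the database
all_keys <- txn$list()

# Only keys beginning with "foo"; "fo" and "FOO" are not matched
foo_keys <- txn$list(starts_with = "foo")

txn$abort()
env$destroy()
```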
mget
-
Get values for multiple keys at once (like $get but vectorised over key).
Usage:
mget(key, as_proxy = FALSE, as_raw = NULL)
Arguments:
key: The keys to get values for. Zero, one or more keys are allowed.
as_proxy: Logical, indicating if a list of mdb_proxy objects should be returned.
as_raw: As for $get(), logical (or NULL) indicating if raw or string output is expected or desired.
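A minimal sketch of a vectorised lookup, using $mput to populate the database first. The keys and values are illustrative, and unlist() is used here only so that the example does not depend on the exact container type returned by mget:

```r
# Illustrative sketch: a throwaway environment in a temporary directory
env <- thor::mdb_env(tempfile())
txn <- env$begin(write = TRUE)

# Store two key/value pairs in one call
txn$mput(c("a", "b"), c("1", "2"))

# Fetch both values in one call, in the order the keys were given
values <- txn$mget(c("a", "b"))
flat <- unname(unlist(values))

txn$abort()
env$destroy()
```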
mput
-
Put multiple values into the database (like $put but vectorised over key/value).
Usage:
mput(key, value, overwrite = TRUE, append = FALSE)
Arguments:
key: The keys to set
value: The values to set against these keys. Must be the same length as key.
overwrite: As for $put
append: As for $put
Details: The implementation simply calls mdb_put repeatedly (but with a single round of error checking) so if key contains duplicates, the last entry wins.
mdel
-
Delete multiple values from the database (like $del but vectorised over key).
Usage:
mdel(key)
Arguments:
key: The keys to delete
Value: A logical vector, the same length as key, indicating if each key was deleted.
replace
-
Use a temporary cursor to replace an item; this function will replace the data held at key and return the previous value (or NULL if it doesn't exist). See mdb_cursor for fuller documentation.
Usage:
replace(key, value, as_raw = NULL)
Arguments:
key: The key to replace
value: The new value to set key to
as_raw: For the returned value, how should the data be returned?
Value: As for $get(), a single data item as either a string or raw vector.
pop
-
Use a temporary cursor to "pop" an item; this function will delete an item but return the value that it had as it deletes it.
Usage:
pop(key, as_raw = NULL)
Arguments:
key: The key to pop
as_raw: For the returned value, how should the data be returned?
Value: As for $get(), a single data item as either a string or raw vector.
cmp
-
Compare two keys for ordering
Usage:
cmp(a, b)
Arguments:
a: A key (string or raw); it need not be in the database
b: A key to compare with a (string or raw)
Value: A scalar integer, being -1 (if a < b), 0 (if a == b) or 1 (if a > b).
Note: In lmdb.h this is
mdb_cmp()
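As a small sketch of the ordering convention (the keys are illustrative and, as noted above, need not exist in the database):

```r
# Illustrative sketch: a throwaway environment in a temporary directory
env <- thor::mdb_env(tempfile())
txn <- env$begin(write = TRUE)

# Negative when a sorts before b, zero when equal, positive when after
lt <- txn$cmp("a", "b")
eq <- txn$cmp("a", "a")
gt <- txn$cmp("b", "a")

txn$abort()
env$destroy()
```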
Examples
# Start by creating a new environment, and within that a write
# transaction
env <- thor::mdb_env(tempfile())
txn <- env$begin(write = TRUE)
# With this transaction we can write values and see them as set
txn$put("a", "hello")
txn$get("a")
# But because the transaction is not committed, any new
# transaction will not see the values:
env$get("a", missing_is_error = FALSE) # NULL
txn2 <- env$begin()
txn2$get("a", missing_is_error = FALSE) # NULL
# Once we commit a transaction, *new* transactions will see the
# value
txn$commit()
env$get("a") # "hello"
env$begin()$get("a") # "hello"
# But old transactions retain their consistent view of the database
txn2$get("a", missing_is_error = FALSE) # NULL
# Cleanup
env$destroy()