bit64-package {bit64}  R Documentation
An S3 class for vectors of 64-bit integers
Description
Package 'bit64' provides fast serializable S3 atomic 64-bit (signed) integers
that can be used in vectors, matrices, arrays and data.frames. Methods are
available for coercion from and to logicals, integers, doubles, characters
and factors as well as many elementwise and summary functions.
Version 0.8
With 'integer64' vectors you can store very large integers at a cost
of 64 bits per element, which is better by a factor of 7 than 'int64' from package 'int64'.
Due to the smaller memory footprint, the atomic vector architecture and
the use of S3 instead of S4 classes, most operations are one to three orders
of magnitude faster. Example speedups are 4x for serialization, 250x for
adding, 900x for coercion and 2000x for object creation. Also, 'integer64'
avoids an ongoing (potentially infinite) penalty for garbage collection
observed while 'int64' objects exist (see code in the example section).
Version 0.9
Package 'bit64', which extends R with fast 64-bit integers, now has fast
(single-threaded) implementations of the most important univariate algorithmic
operations (those based on hashing and sorting). We now have methods for
'match', '%in%', 'duplicated', 'unique', 'table', 'sort', 'order', 'rank',
'quantile', 'median' and 'summary'. Regarding data management we also have
novel generics 'unipos' (positions of the unique values), 'tiepos'
(positions of ties), 'keypos' (positions of foreign keys in a sorted
dimension table) and derived methods 'as.factor' and 'as.ordered'. This
64-bit functionality is implemented carefully to be no slower than the
respective 32-bit operations in Base R and also to avoid the outlying waiting
times observed with 'order', 'rank' and 'table' (speedup factors 20/16/200
respectively). This increases the dataset size with which we can work truly
interactively. The speed is achieved by simple heuristic optimizers in
high-level functions choosing the best from multiple low-level algorithms and
further taking advantage of a novel caching if activated. In an example R
session using a couple of these operations the 64-bit integers performed 22x
faster than base 32-bit integers, hash-caching improved this to 24x, and
sortorder-caching was most efficient with 38x (caching both hashing and sorting
is not worth it, with 32x at duplicated RAM consumption).
Usage
integer64(length)
## S3 method for class 'integer64'
is(x)
## S3 replacement method for class 'integer64'
length(x) <- value
## S3 method for class 'integer64'
print(x, quote=FALSE, ...)
## S3 method for class 'integer64'
str(object, vec.len = strO$vec.len, give.head = TRUE, give.length = give.head, ...)
Arguments
length       length of vector using
x            an integer64 vector
object       an integer64 vector
value        an integer64 vector of values to be assigned
quote        logical, indicating whether or not strings should be printed with surrounding quotes.
vec.len      see str
give.head    see str
give.length  see str
...          further arguments to the
Details
Package:  bit64 
Type:  Package 
Version:  0.5.0 
Date:  2011-12-12 
License:  GPL-2 
LazyLoad:  yes 
Encoding:  latin1 
Value
integer64 returns a vector of 'integer64', i.e. a vector of double decorated with class 'integer64'.
Design considerations
64-bit integers are related to big data: we need them to overcome address space limitations.
Therefore the performance of the 64-bit integer type is critical.
In the S language, designed in 1975, atomic objects were defined to be vectors for a couple of good reasons:
simplicity, the option of implicit parallelization and good cache locality.
In recent years many analytical databases have learnt that lesson: column-based databases provide superior performance
for many applications; the result is products such as MonetDB, Sybase IQ, Vertica, Exasol and Ingres Vectorwise.
If we introduce 64-bit integers not natively in Base R but as an external package, we should at least strive to
make them as 'basic' as possible. Therefore the design choice of bit64 not only differs from int64, it is obvious:
like the other atomic types in Base R, we model data type 'integer64' as a contiguous atomic vector in memory,
and we use the more basic S3 class system, not S4. Like package int64 we want our 'integer64' to be serializable,
therefore we also use an existing data type as the basis. Again the choice is obvious: R has only one 64-bit data type: doubles.
By using doubles, integer64 inherits some functionality such as is.atomic, length, length<-, names, names<-, dim, dim<-, dimnames, dimnames<-.
Our R-level functions strictly follow the functional programming paradigm:
no modification of arguments or other side effects. Before version 0.93 we internally deviated from the strict paradigm
in order to boost performance. Our C functions do not create new return values;
instead we pass in the memory to be returned as an argument. This gives us the freedom to apply the C function
to new or old vectors, which helps to avoid unnecessary memory allocation, unnecessary copying and unnecessary garbage collection.
Prior to 0.93 within our R functions we also deviated from conventional R programming by not using attr<- and attributes<-
because they always did new memory allocation and copying in older R versions. If we wanted to set attributes of return values that we had freshly created,
we instead used functions setattr and setattributes from package bit.
From version 0.93 setattr is only used for manipulating cache objects, in ramsort.integer64 and sort.integer64 and in as.data.frame.integer64.
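The design described above can be sketched in base R without bit64 itself. The class name 'toy64' and its print method are hypothetical and chosen only for illustration; the sketch merely shows how a double vector decorated with an S3 class attribute keeps the behaviour of an atomic vector, and it says nothing about how bit64 interprets the 64 bits at C level.

```r
# Hypothetical toy class 'toy64' (not part of bit64): a double vector
# decorated with an S3 class attribute, as described in the text above.
toy64 <- function(length = 0L) {
  structure(double(length), class = "toy64")
}

x <- toy64(3)

# Functionality inherited from the underlying double vector
is.atomic(x)                   # TRUE
typeof(x)                      # "double"
length(x)                      # 3
names(x) <- c("a", "b", "c")   # names<- works as for any atomic vector

# S3 dispatch is what lets a class reinterpret the stored values
print.toy64 <- function(x, ...) {
  cat("<toy64 of length", length(x), ">\n")
  invisible(x)
}
print(x)
```

Because the object stays a plain atomic vector, no special support is needed for serialization or for use as a data.frame column; only the methods that must interpret the values differently need to be written.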
Arithmetic precision and coercion
The fact that we introduce 64-bit long long integers, without introducing 128-bit long doubles, creates some subtle challenges:
unlike 32-bit integers, the integer64 are no longer a proper subset of double.
If a binary arithmetic operation involves a double and an integer, it is a no-brainer to return double without loss of information. If an integer64 meets a double, it is not trivial what type to return.
Switching to integer64 limits our ability to represent very large numbers, switching to double limits our ability to distinguish x from x+1. Since the latter is the purpose of introducing 64-bit integers, we usually return integer64 from functions involving integer64, for example in c, cbind and rbind.
Different from Base R, our operators +, -, %/% and %% coerce their arguments to integer64 and always return integer64.
The multiplication operator * coerces its first argument to integer64 but allows its second argument to be also double: the second argument is internally coerced to 'long double' and the result of the multiplication is returned as integer64.
The division / and power ^ operators also coerce their first argument to integer64 and internally coerce their second argument to 'long double'; they return double, like sqrt, log, log2 and log10 do.
argument1  op   argument2  ->  coerced1   op   coerced2     ->  result
integer64  +    double     ->  integer64  +    integer64    ->  integer64
double     +    integer64  ->  integer64  +    integer64    ->  integer64
integer64  -    double     ->  integer64  -    integer64    ->  integer64
double     -    integer64  ->  integer64  -    integer64    ->  integer64
integer64  %/%  double     ->  integer64  %/%  integer64    ->  integer64
double     %/%  integer64  ->  integer64  %/%  integer64    ->  integer64
integer64  %%   double     ->  integer64  %%   integer64    ->  integer64
double     %%   integer64  ->  integer64  %%   integer64    ->  integer64
integer64  *    double     ->  integer64  *    long double  ->  integer64
double     *    integer64  ->  integer64  *    integer64    ->  integer64
integer64  /    double     ->  integer64  /    long double  ->  double
double     /    integer64  ->  integer64  /    long double  ->  double
integer64  ^    double     ->  integer64  ^    long double  ->  double
double     ^    integer64  ->  integer64  ^    long double  ->  double
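The trade-off behind these rules can be checked in base R alone. The following minimal sketch uses no bit64 functions; the variable names are illustrative. It shows that every 32-bit integer survives a round trip through double, while a double beyond 2^53 can no longer distinguish x from x+1.

```r
# Base R only: 32-bit integers are a proper subset of double ...
i <- .Machine$integer.max
int_roundtrip <- identical(as.integer(as.double(i)), i)
int_roundtrip          # TRUE

# ... but a double has a 53-bit significand, so beyond 2^53
# it can no longer distinguish x from x + 1.
x <- 2^53
double_distinguishes <- (x + 1) != x
double_distinguishes   # FALSE
```

This is why returning double from an operation on integer64 would defeat the purpose of the type for values of this magnitude.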
Creating and testing S3 class 'integer64'
Our creator function integer64 takes an argument length, creates an atomic double vector of this length,
attaches an S3 class attribute 'integer64' to it, and that's it. We simply rely on S3 method dispatch and interpret those
64-bit elements as 'long long int'.
is.double currently returns TRUE for integer64 and might return FALSE in a later release.
Consider is.double to have undefined behaviour and do query is.integer64 before querying is.double.
The methods is.integer64 and is.vector both return TRUE for integer64.
Note that we did not patch storage.mode and typeof, which both continue returning 'double'.
Like for 32-bit integer, mode returns 'numeric' and as.double tries coercing to double.
It is possible that 'integer64' becomes a vmode in package ff.
Further methods for creating integer64 are range, which returns the range of the data type if called without arguments, rep and seq.
For all available methods on integer64 vectors see the index below and the examples.
Index of implemented methods
creating,testing,printing  see also  description 
NA_integer64_  NA_integer_  NA constant 
integer64  integer  create zero atomic vector 
runif64  runif  create random vector 
rep.integer64  rep  
seq.integer64  seq  
is.integer64  is  
is.integer  inherited from Base R  
is.vector.integer64  is.vector  
identical.integer64  identical  
length<-.integer64  length<-  
length  inherited from Base R  
names<-  inherited from Base R  
names  inherited from Base R  
dim<-  inherited from Base R  
dim  inherited from Base R  
dimnames<-  inherited from Base R  
dimnames  inherited from Base R  
str  inherited from Base R, does not print values correctly  
print.integer64  print  
str.integer64  str  
coercing to integer64  see also  description 
as.integer64  generic  
as.integer64.bitstring  as.bitstring  
as.integer64.character  character  
as.integer64.double  double  
as.integer64.integer  integer  
as.integer64.integer64  integer64  
as.integer64.logical  logical  
as.integer64.NULL  NULL  
coercing from integer64  see also  description 
as.bitstring  as.bitstring  generic 
as.bitstring.integer64  
as.character.integer64  as.character  
as.double.integer64  as.double  
as.integer.integer64  as.integer  
as.logical.integer64  as.logical  
data structures  see also  description 
c.integer64  c  vector concatenate 
cbind.integer64  cbind  column bind 
rbind.integer64  rbind  row bind 
as.data.frame.integer64  as.data.frame  coerce atomic object to data.frame 
data.frame  inherited from Base R since we have coercion  
subscripting  see also  description 
[.integer64  [  vector and array extract 
[<-.integer64  [<-  vector and array assign 
[[.integer64  [[  scalar extract 
[[<-.integer64  [[<-  scalar assign 
binary operators  see also  description 
+.integer64  +  returns integer64 
-.integer64  -  returns integer64 
*.integer64  *  returns integer64 
^.integer64  ^  returns double 
/.integer64  /  returns double 
%/%.integer64  %/%  returns integer64 
%%.integer64  %%  returns integer64 
comparison operators  see also  description 
==.integer64  ==  
!=.integer64  !=  
<.integer64  <  
<=.integer64  <=  
>.integer64  >  
>=.integer64  >=  
logical operators  see also  description 
!.integer64  !  
&.integer64  &  
|.integer64  |  
xor.integer64  xor  
math functions  see also  description 
is.na.integer64  is.na  returns logical 
format.integer64  format  returns character 
abs.integer64  abs  returns integer64 
sign.integer64  sign  returns integer64 
log.integer64  log  returns double 
log10.integer64  log10  returns double 
log2.integer64  log2  returns double 
sqrt.integer64  sqrt  returns double 
ceiling.integer64  ceiling  dummy returning its argument 
floor.integer64  floor  dummy returning its argument 
trunc.integer64  trunc  dummy returning its argument 
round.integer64  round  dummy returning its argument 
signif.integer64  signif  dummy returning its argument 
cumulative functions  see also  description 
cummin.integer64  cummin  
cummax.integer64  cummax  
cumsum.integer64  cumsum  
cumprod.integer64  cumprod  
diff.integer64  diff  
summary functions  see also  description 
range.integer64  range  
min.integer64  min  
max.integer64  max  
sum.integer64  sum  
mean.integer64  mean  
prod.integer64  prod  
all.integer64  all  
any.integer64  any  
algorithmically complex functions  see also  description (caching) 
match.integer64  match  position of x in table (h//o/so) 
%in%.integer64  %in%  is x in table? (h//o/so) 
duplicated.integer64  duplicated  is current element duplicate of previous one? (h//o/so) 
unique.integer64  unique  (shorter) vector of unique values only (h/s/o/so) 
unipos.integer64  unipos  positions corresponding to unique values (h/s/o/so) 
tiepos.integer64  tiepos  positions of values that are tied (//o/so) 
keypos.integer64  keypos  position of current value in sorted list of unique values (//o/so) 
as.factor.integer64  as.factor  convert to (unordered) factor with sorted levels of previous values (//o/so) 
as.ordered.integer64  as.ordered  convert to ordered factor with sorted levels of previous values (//o/so) 
table.integer64  table  unique values and their frequencies (h/s/o/so) 
sort.integer64  sort  sorted vector (/s/o/so) 
order.integer64  order  positions of elements that would create sorted vector (//o/so) 
rank.integer64  rank  (average) ranks of nonNAs, NAs kept in place (/s/o/so) 
quantile.integer64  quantile  (existing) values at specified percentiles (/s/o/so) 
median.integer64  median  (existing) value at percentile 0.5 (/s/o/so) 
summary.integer64  summary  (/s/o/so) 
all.equal.integer64  all.equal  test if two objects are (nearly) equal (/s/o/so) 
helper functions  see also  description 
minusclass  minusclass  removing class attribute 
plusclass  plusclass  inserting class attribute 
binattr  binattr  define binary op behaviour 
tested I/O functions  see also  description 
read.table  inherited from Base R  
write.table  inherited from Base R  
serialize  inherited from Base R  
unserialize  inherited from Base R  
save  inherited from Base R  
load  inherited from Base R  
dput  inherited from Base R  
dget  inherited from Base R  
Limitations inherited from implementing 64 bit integers via an external package

* vector size of atomic vectors is still limited to .Machine$integer.max. However, external memory extending packages such as ff or bigmemory can extend their address space now with integer64. Having 64-bit integers also helps with those not so obvious address issues that arise once we exchange data with SQL databases and data warehouses, which use big integers as surrogate keys, e.g. on indexed primary key columns. This puts R into a relatively strong position compared to certain commercial statistical software packages, which sell database connectivity but neither have the range of 64-bit integers, nor have integers at all, nor have a single numeric data type in their macro-glue-language.
* literals such as 123LL would require changes to Base R; until then we need to write (and call) as.integer64(123L) or as.integer64(123) or as.integer64('123'). Only the latter allows specifying numbers beyond Base R's numeric data types and is therefore the recommended way; using only one way may facilitate migrating code to literals at a later stage.
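The recommendation to pass strings can be illustrated with a value just above 2^53. This is a minimal sketch: the variable names are illustrative, and the bit64 calls are guarded with requireNamespace so that the base R part still runs where the package is not installed.

```r
# A numeric literal is parsed as a double before as.integer64() sees it,
# so digits beyond the precision of double are already lost at parse time.
literal_collapses <- (9007199254740993 == 9007199254740992)
literal_collapses   # TRUE: both literals denote the same double

# Guarded so that the sketch also runs where bit64 is not installed.
if (requireNamespace("bit64", quietly = TRUE)) {
  from_chr <- bit64::as.integer64("9007199254740993")
  from_dbl <- bit64::as.integer64(9007199254740993)
  print(as.character(from_chr))   # all digits of the string are kept
  print(as.character(from_dbl))   # digits reflect the already rounded double
}
```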
Limitations inherited from Base R, Core team, can you change this?

* identical with default parameters does not distinguish all bit-patterns of doubles. For testing purposes we provide a wrapper identical.integer64 that will distinguish all bit-patterns. It would be desirable to have a single call of identical handle both, double and integer64.
* the colon operator : officially does not dispatch S3 methods; however, we have made it generic:
    from <- lim.integer64()[1]
    to <- from+99
    from:to
  One limitation remains: it will only dispatch at its first argument from but not at its second to.
* is.double does not dispatch S3 methods; however, we have made it generic and it will return FALSE on integer64.
* c only dispatches c.integer64 if the first argument is integer64 and it does not recursively dispatch the proper method when called with argument recursive=TRUE. Therefore c(list(integer64,integer64)) does not work and for now you can only call c.integer64(list(x,x)).
* generic binary operators fail to dispatch *any* user-defined S3 method if the two arguments have two different S3 classes. For example we have two classes bit and bitwhich sparsely representing boolean vectors and we have methods &.bit and &.bitwhich. For an expression involving both as in bit & bitwhich, none of the two methods is dispatched. Instead a standard method is dispatched, which neither handles bit nor bitwhich. Although it lacks symmetry, the better choice would be to dispatch simply the method of the class of the first argument in case of class conflict. This choice would allow authors of extension packages to provide coherent behaviour at least within their contributed classes. But as long as none of the package author's methods is dispatched, the author cannot handle the conflicting classes at all.
* unlist is not generic and if it were, we would face similar problems as with c()
* vector with argument mode='integer64' cannot work without adjustment of Base R
* as.vector with argument mode='integer64' cannot work without adjustment of Base R
* is.vector does not dispatch its method is.vector.integer64
* mode<- drops the class 'integer64' which is returned from as.integer64. Also it does not remove an existing class 'integer64' when assigning mode 'integer'.
* storage.mode<- does not support external data types such as as.integer64
* matrix does drop the 'integer64' class attribute.
* array does drop the 'integer64' class attribute. In current R versions (2.15.1) this can be circumvented by activating the function as.vector.integer64 further down this file. However, the CRAN maintainer has requested to remove as.vector.integer64, even at the price of breaking previously working functionality of the package.
* str does not print the values of integer64 correctly
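The limitation that c only dispatches on its first argument is a property of base R and can be reproduced without bit64. In the following minimal sketch the class 'toy64' and its method c.toy64 are hypothetical stand-ins for integer64 and c.integer64; they are not part of any package.

```r
# Hypothetical toy class 'toy64' (not part of bit64), used only to show
# how base R's c() selects its method.
c.toy64 <- function(...) {
  structure(unlist(lapply(list(...), as.double)), class = "toy64")
}

x <- structure(c(1, 2), class = "toy64")

first  <- c(x, 3)   # first argument is 'toy64': c.toy64 is dispatched
second <- c(3, x)   # first argument is a plain double: default c() is used

class(first)    # "toy64"
class(second)   # "numeric"
```

When the first argument is not of the class, the class attribute is silently dropped, which is why calling the method directly (as in c.integer64(list(x,x)) above) is the only reliable workaround mentioned in the text.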
further limitations

* subscripting non-existing elements and subscripting with NAs is currently not supported. Such subscripting currently returns 9218868437227407266 instead of NA (the NA value of the underlying double code). Following the full R behaviour here would either destroy performance or require extensive C-coding.
Note
integer64 are useful for handling database keys and exact counting in +-2^63.
Do not use them as replacement for 32-bit integers: integer64 are not
supported for subscripting by R-core and they have different semantics
when combined with double. Do understand that integer64 can only be
useful over double if we do not coerce it to double.
While
integer + double -> double + double -> double
or
1L + 0.5 -> 1.5
for additive operations we coerce to integer64
integer64 + double -> integer64 + integer64 -> integer64
hence
as.integer64(1) + 0.5 -> 1LL + 0LL -> 1LL
see section "Arithmetic precision and coercion" above
Author(s)
Jens Oehlschlägel <Jens.Oehlschlaegel@truecluster.com> Maintainer: Jens Oehlschlägel <Jens.Oehlschlaegel@truecluster.com>
See Also
integer in base R
Examples
message("Using integer64 in vector")
x <- integer64(8) # create 64 bit vector
x
is.atomic(x) # TRUE
is.integer64(x) # TRUE
is.numeric(x) # TRUE
is.integer(x) # FALSE - debatable
is.double(x) # FALSE - might change
x[] <- 1:2 # assigned value is recycled as usual
x[1:6] # subscripting as usual
length(x) <- 13 # changing length as usual
x
rep(x, 2) # replicate as usual
seq(as.integer64(1), 10) # seq.integer64 is dispatched on first given argument
seq(to=as.integer64(10), 1) # seq.integer64 is dispatched on first given argument
seq.integer64(along.with=x) # or call seq.integer64 directly
# c.integer64 is dispatched only if *first* argument is integer64 ...
x <- c(x,runif(length(x), max=100))
# ... and coerces everything to integer64 - including double
x
names(x) <- letters # use names as usual
x
message("Using integer64 in array - note that 'matrix' currently does not work")
message("as.vector.integer64 removed as requested by the CRAN maintainer")
message("as consequence 'array' also does not work anymore")
message("we still can create a matrix or array by assigning 'dim'")
y <- rep(as.integer64(NA), 12)
dim(y) <- c(3,4)
dimnames(y) <- list(letters[1:3], LETTERS[1:4])
y["a",] <- 1:2 # assigning as usual
y
y[1:2,4] # subscripting as usual
# cbind.integer64 dispatched on any argument and coerces everything to integer64
cbind(E=1:3, F=runif(3, 0, 100), G=c("1","0","1"), y)
message("Using integer64 in data.frame")
str(as.data.frame(x))
str(as.data.frame(y))
str(data.frame(y))
str(data.frame(I(y)))
d <- data.frame(x=x, y=runif(length(x), 0, 100))
d
d$x
message("Using integer64 with csv files")
fi64 <- tempfile()
write.csv(d, file=fi64, row.names=FALSE)
e <- read.csv(fi64, colClasses=c("integer64", NA))
unlink(fi64)
str(e)
identical.integer64(d$x,e$x)
message("Serializing and unserializing integer64")
dput(d, fi64)
e <- dget(fi64)
identical.integer64(d$x,e$x)
e <- d[,]
save(e, file=fi64)
rm(e)
load(file=fi64)
identical.integer64(d,e)
### A couple of unit tests follow hidden in a dontshow{} directive ###
## Not run:
message("== Differences between integer64 and int64 ==")
require(bit64)
require(int64)
message(" integer64 is atomic ")
is.atomic(integer64())
#is.atomic(int64())
str(integer64(3))
#str(int64(3))
message(" The following performance numbers are measured under RWin64 ")
message(" under RWin32 the advantage of integer64 over int64 is smaller ")
message(" integer64 needs 7x/5x less RAM than int64 under 64/32 bit OS
(and twice the RAM of integer as it should be) ")
#as.vector(object.size(int64(1e6))/object.size(integer64(1e6)))
as.vector(object.size(integer64(1e6))/object.size(integer(1e6)))
message(" integer64 creates 2000x/1300x faster than int64 under 64/32 bit OS
(and 3x the time of integer) ")
t32 <- system.time(integer(1e8))
t64 <- system.time(integer64(1e8))
#T64 <- system.time(int64(1e7))*10 # using 1e8 as above stalls our R on an i7 8 GB RAM Thinkpad
#T64/t64
t64/t32
i32 <- sample(1e6)
d64 <- as.double(i32)
message(" the following timings are rather conservative since timings
of integer64 include garbage collection - due to looped calls")
message(" integer64 coerces 900x/100x faster than int64
under 64/32 bit OS (and 2x the time of coercing to integer) ")
t32 <- system.time(for(i in 1:1000)as.integer(d64))
t64 <- system.time(for(i in 1:1000)as.integer64(d64))
#T64 <- system.time(as.int64(d64))*1000
#T64/t64
t64/t32
td64 <- system.time(for(i in 1:1000)as.double(i32))
t64 <- system.time(for(i in 1:1000)as.integer64(i32))
#T64 <- system.time(for(i in 1:10)as.int64(i32))*100
#T64/t64
t64/td64
message(" integer64 serializes 4x/0.8x faster than int64
under 64/32 bit OS (and less than 2x/6x the time of integer or double) ")
t32 <- system.time(for(i in 1:10)serialize(i32, NULL))
td64 <- system.time(for(i in 1:10)serialize(d64, NULL))
i64 <- as.integer64(i32);
t64 <- system.time(for(i in 1:10)serialize(i64, NULL))
rm(i64); gc()
#I64 <- as.int64(i32);
#T64 <- system.time(for(i in 1:10)serialize(I64, NULL))
#rm(I64); gc()
#T64/t64
t64/t32
t64/td64
message(" integer64 adds 250x/60x faster than int64
under 64/32 bit OS (and less than 6x the time of integer or double) ")
td64 <- system.time(for(i in 1:100)d64+d64)
t32 <- system.time(for(i in 1:100)i32+i32)
i64 <- as.integer64(i32);
t64 <- system.time(for(i in 1:100)i64+i64)
rm(i64); gc()
#I64 <- as.int64(i32);
#T64 <- system.time(for(i in 1:10)I64+I64)*10
#rm(I64); gc()
#T64/t64
t64/t32
t64/td64
message(" integer64 sums 3x/0.2x faster than int64
(and at about 5x/60X the time of integer and double) ")
td64 <- system.time(for(i in 1:100)sum(d64))
t32 <- system.time(for(i in 1:100)sum(i32))
i64 <- as.integer64(i32);
t64 <- system.time(for(i in 1:100)sum(i64))
rm(i64); gc()
#I64 <- as.int64(i32);
#T64 <- system.time(for(i in 1:100)sum(I64))
#rm(I64); gc()
#T64/t64
t64/t32
t64/td64
message(" integer64 diffs 5x/0.85x faster than integer and double
(int64 version 1.0 does not support diff) ")
td64 <- system.time(for(i in 1:10)diff(d64, lag=2L, differences=2L))
t32 <- system.time(for(i in 1:10)diff(i32, lag=2L, differences=2L))
i64 <- as.integer64(i32);
t64 <- system.time(for(i in 1:10)diff(i64, lag=2L, differences=2L))
rm(i64); gc()
t64/t32
t64/td64
message(" integer64 subscripts 1000x/340x faster than int64
(and at the same speed / 10x slower as integer) ")
ts32 <- system.time(for(i in 1:1000)sample(1e6, 1e3))
t32 <- system.time(for(i in 1:1000)i32[sample(1e6, 1e3)])
i64 <- as.integer64(i32);
t64 <- system.time(for(i in 1:1000)i64[sample(1e6, 1e3)])
rm(i64); gc()
#I64 <- as.int64(i32);
#T64 <- system.time(for(i in 1:100)I64[sample(1e6, 1e3)])*10
#rm(I64); gc()
#(T64-ts32)/(t64-ts32)
(t64-ts32)/(t32-ts32)
message(" integer64 assigns 200x/90x faster than int64
(and 50x/160x slower than integer) ")
ts32 <- system.time(for(i in 1:100)sample(1e6, 1e3))
t32 <- system.time(for(i in 1:100)i32[sample(1e6, 1e3)] <- 1:1e3)
i64 <- as.integer64(i32);
# store the timing in t64 (not i64) so that the ratio below uses this measurement
t64 <- system.time(for(i in 1:100)i64[sample(1e6, 1e3)] <- 1:1e3)
rm(i64); gc()
#I64 <- as.int64(i32);
#T64 <- system.time(for(i in 1:10)I64[sample(1e6, 1e3)] <- 1:1e3)*10
#rm(I64); gc()
#(T64-ts32)/(t64-ts32)
(t64-ts32)/(t32-ts32)
tdfi32 <- system.time(dfi32 <- data.frame(a=i32, b=i32, c=i32))
tdfsi32 <- system.time(dfi32[1e6:1,])
fi32 <- tempfile()
tdfwi32 <- system.time(write.csv(dfi32, file=fi32, row.names=FALSE))
tdfri32 <- system.time(read.csv(fi32, colClasses=rep("integer", 3)))
unlink(fi32)
rm(dfi32); gc()
i64 <- as.integer64(i32);
tdfi64 <- system.time(dfi64 <- data.frame(a=i64, b=i64, c=i64))
tdfsi64 <- system.time(dfi64[1e6:1,])
fi64 <- tempfile()
tdfwi64 <- system.time(write.csv(dfi64, file=fi64, row.names=FALSE))
tdfri64 <- system.time(read.csv(fi64, colClasses=rep("integer64", 3)))
unlink(fi64)
rm(i64, dfi64); gc()
#I64 <- as.int64(i32);
#tdfI64 <- system.time(dfI64 <- data.frame(a=I64, b=I64, c=I64))
#tdfsI64 <- system.time(dfI64[1e6:1,])
#fI64 <- tempfile()
#tdfwI64 <- system.time(write.csv(dfI64, file=fI64, row.names=FALSE))
#tdfrI64 <- system.time(read.csv(fI64, colClasses=rep("int64", 3)))
#unlink(fI64)
#rm(I64, dfI64); gc()
message(" integer64 coerces 40x/6x faster to data.frame than int64
(and factor 1/9 slower than integer) ")
#tdfI64/tdfi64
tdfi64/tdfi32
message(" integer64 subscripts from data.frame 20x/2.5x faster than int64
(and 3x/13x slower than integer) ")
#tdfsI64/tdfsi64
tdfsi64/tdfsi32
message(" integer64 csv writes about 2x/0.5x faster than int64
(and about 1.5x/5x slower than integer) ")
#tdfwI64/tdfwi64
tdfwi64/tdfwi32
message(" integer64 csv reads about 3x/1.5 faster than int64
(and about 2x slower than integer) ")
#tdfrI64/tdfri64
tdfri64/tdfri32
rm(i32, d64); gc()
message(" investigating the impact on garbage collection: ")
message(" the fragmented structure of int64 messes up R's RAM ")
message(" and slows down R's garbage collection just by existing ")
td32 <- double(21)
td32[1] <- system.time(d64 <- double(1e7))[3]
for (i in 2:11)td32[i] <- system.time(gc(), gcFirst=FALSE)[3]
rm(d64)
for (i in 12:21)td32[i] <- system.time(gc(), gcFirst=FALSE)[3]
t64 <- double(21)
t64[1] <- system.time(i64 <- integer64(1e7))[3]
for (i in 2:11)t64[i] <- system.time(gc(), gcFirst=FALSE)[3]
rm(i64)
for (i in 12:21)t64[i] <- system.time(gc(), gcFirst=FALSE)[3]
#T64 <- double(21)
#T64[1] <- system.time(I64 <- int64(1e7))[3]
#for (i in 2:11)T64[i] <- system.time(gc(), gcFirst=FALSE)[3]
#rm(I64)
#for (i in 12:21)T64[i] <- system.time(gc(), gcFirst=FALSE)[3]
#matplot(1:21, cbind(td32, t64, T64), pch=c("d","i","I"), log="y")
matplot(1:21, cbind(td32, t64), pch=c("d","i"), log="y")
## End(Not run)