compareOverlap {Ecfun}R Documentation

Compare y between newDat and refDat for shared values of x


Compute dy <- (y - yRef) for all cases where x == xRef, where x and y are columns of newDat and xRef and yRef are columns of refDat.

Also compute dyRef <- dy / yRef.

Return silently a data.frame with columns x, y, yRef, dy, and dyRef.

Also if min(yRef)*max(yRef)>0 plot(dyRef) else plot(dy).


compareOverlap(y=2, yRef=y, x=1, 
      xRef=x, newDat, refDat, 
      ignoreCase=TRUE, ...)


y, yRef

columns of newDat, refDat, respectively, to compare, ignoring case in the names unless ignoreCase is FALSE.

x, xRef

columns of newDat, refDat, respectively, to match when comparing y with yRef.

As with y and yRef, ignore case in name matching unless ignoreCase is FALSE.

newDat, refDat

data.frames of new and reference data in which to search for overlap, i.e., common values of newD[, x] and refDat[, xRef], and for those observations to compare newDat[, y] to refDat[, yRef].


logical: If TRUE, ignore case when searching for columns of newDat and refDat to match y, yRef, x, and xRef.


optional arguments to pass to plot


This function is particularly useful for updating datasets that are obtained from sources like the Bureau of Justice Statistics, which publish many series with each update including the most recent 11 years. This function can be used to evaluate the extent of equivalence between, e.g., historical data in refDat with the latest data in newDat.


Invisibly return a data.frame with columns x, paste0(y, 'New'), past0(yRef, 'Ref'), dy, and dyRef of the data compared.


Spencer Graves


nDat <- data.frame(yr=2000:2015, 
rDat <- data.frame(Yr=2018:2011, 
          y=c(17:13, 13:11))
nrDat <- compareOverlap(
  newDat=nDat, refDat=rDat)

# Correct answer
NRdat <- data.frame(yr=2011:2015, 
  YNew=11:15, yRef=c(11:13, 13:14), 
  dy=c(0,0,0, 1, 1), 
  dyRef=c(0,0,0, 1,1) / 
        c(11:13, 13:14))

all.equal(nrDat, NRdat)

[Package Ecfun version 0.3-2 Index]