DataCheck {BaSTA} | R Documentation |
Error checking for BaSTA input data.
Description
A function to check the input data file for a Bayesian Survival Trajectory Analysis (BaSTA) for capture-mark-recapture (CMR) or census data.
Usage
DataCheck (object, dataType = "CMR", studyStart = NULL, studyEnd = NULL, silent = TRUE)
Arguments
object |
A |
dataType |
A |
studyStart |
Only required for |
studyEnd |
Only required for |
silent |
Logical to indicate whether the results should be printed to the console. |
Details
The function checks the dataset for inconsistencies and reports them. See the Value
section for details on the types of errors the function detects.
DATA SPECIFICATIONS:
1) CMR data:
The input data object requires the following structure: the first column should be a vector of unique individual IDs, and the second and third columns are the birth and death years, respectively. The next columns form the recapture matrix, with one column per year of the observation window. These are followed (optionally) by columns for categorical and continuous covariates.
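As a rough illustration of this layout, the sketch below builds a small CMR data frame in base R. The column names, the study years (51 to 55), and all values are invented for the example, and it assumes that a 0 in the birth or death column marks an unknown year, which this page does not state explicitly.

```r
# Hypothetical study window (values chosen only for illustration).
studyStart <- 51
studyEnd <- 55
years <- studyStart:studyEnd

# Recapture matrix: one row per individual, one column per study year;
# 1 = observed in that year, 0 = not observed.
recap <- matrix(c(0, 1, 1, 0, 0,
                  1, 0, 1, 1, 0,
                  0, 0, 1, 0, 1),
                nrow = 3, byrow = TRUE,
                dimnames = list(NULL, paste0("Y", years)))

# Columns 1 to 3: unique ID, birth year, death year.
# Assumption: 0 stands for an unknown birth or death year.
cmrDat <- data.frame(ID = 1:3,
                     Birth = c(51, 0, 52),
                     Death = c(54, 0, 0),
                     recap,
                     Sex = factor(c("f", "m", "f")),  # categorical covariate
                     Weight = c(2.1, 2.6, 1.9))       # continuous covariate

str(cmrDat)
```

A data frame shaped like `cmrDat` could then be passed as `object`, with `dataType = "CMR"` and matching `studyStart` and `studyEnd` values.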
2) census data:
The input data object requires at least five date columns, namely “Birth.Date”, “Min.Birth.Date”, “Max.Birth.Date”, “Entry.Date”, and “Depart.Date”. All dates must be formatted as “%Y-%m-%d”. In addition, a “Depart.Type” column is required, with two types of departure: “C” for censored and “D” for dead.
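The sketch below shows one way such a census table might be assembled and pre-screened in base R before calling the function. All dates are invented, and storing them as character strings (rather than as `Date` objects) is an assumption made for the example; the pre-screening step is not part of BaSTA.

```r
# Invented records, for illustration only.
censDat <- data.frame(
  Birth.Date     = c("2001-04-12", "2003-06-01", "1999-01-01"),
  Min.Birth.Date = c("2001-04-12", "2003-01-01", "1998-07-01"),
  Max.Birth.Date = c("2001-04-12", "2003-12-31", "1999-06-30"),
  Entry.Date     = c("2001-04-12", "2005-03-15", "2002-09-01"),
  Depart.Date    = c("2010-08-30", "2012-11-02", "2015-01-20"),
  Depart.Type    = c("D", "C", "D"),
  stringsAsFactors = FALSE
)

dateCols <- c("Birth.Date", "Min.Birth.Date", "Max.Birth.Date",
              "Entry.Date", "Depart.Date")

# as.Date() returns NA for strings that do not match the given format,
# so any NA here flags a badly formatted date.
parsed <- lapply(censDat[dateCols], as.Date, format = "%Y-%m-%d")
allDatesOk <- !any(vapply(parsed, anyNA, logical(1)))

# Only "C" (censored) and "D" (dead) are valid departure types.
allTypesOk <- all(censDat$Depart.Type %in% c("C", "D"))
```

If both flags are `TRUE`, the table at least meets the column and format requirements described above.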
Value
1) CMR data:
newData |
The original data frame (for consistency with previous versions of BaSTA). |
type1 |
A vector of row numbers in the original data frame where there are deaths occurring before the study starts. |
type2 |
A vector of row numbers in the original data frame where there are no birth/death AND no observations. |
type3 |
A vector of row numbers in the original data frame where there are births recorded after death. |
type4 |
A vector of row numbers in the original data frame where there are observations (i.e. recaptures) after death. |
type5 |
A vector of row numbers in the original data frame where there are observations (i.e. recaptures) before birth. |
type6 |
A vector of row numbers in the original data frame where the recapture matrix entry for the year of birth is not zero. |
summary |
List with summary information, e.g., sample size, number of records with known birth, number of records with known death, etc. |
stopExec |
Logical that indicates if the data are free of errors or not. i.e. |
probDescr |
Character vector explaining the six types of problems the |
dataType |
Type of dataset, i.e., “ |
studyStart |
Integer indicating the study start time. |
studyEnd |
Integer indicating the study end time. |
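To make the problem types more concrete, the sketch below shows how checks analogous to type3, type4, and type5 could be computed in base R on a toy dataset. This is not BaSTA's implementation; the variable names and data are hypothetical, and it assumes that a 0 marks an unknown birth or death year.

```r
years <- 51:55

# Toy recapture matrix (1 = observed, 0 = not observed).
Y <- matrix(c(0, 1, 1, 0, 0,   # row 1: consistent record
              0, 1, 0, 0, 1,   # row 2: observed in 55, after death in 53
              1, 0, 1, 0, 0,   # row 3: observed in 51, before birth in 52
              0, 0, 0, 0, 0),  # row 4: birth recorded after death
            nrow = 4, byrow = TRUE)
birth <- c(51, 51, 52, 54)
death <- c(54, 53, 0, 52)      # assumption: 0 = unknown

# Matrix holding the calendar year of each cell: yearMat[i, j] == years[j].
yearMat <- matrix(years, nrow = nrow(Y), ncol = ncol(Y), byrow = TRUE)

# Births recorded after death (analogous to type3).
birthAfterDeath <- which(birth > 0 & death > 0 & birth > death)

# Observations after death (analogous to type4). Comparing a matrix with a
# vector of length nrow recycles the vector down the columns, so each row
# is compared with its own death year.
obsAfterDeath <- which(death > 0 & rowSums(Y == 1 & yearMat > death) > 0)

# Observations before birth (analogous to type5).
obsBeforeBirth <- which(birth > 0 & rowSums(Y == 1 & yearMat < birth) > 0)
```

Here `birthAfterDeath` flags row 4, `obsAfterDeath` flags row 2, and `obsBeforeBirth` flags row 3.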
2) census data:
n |
Integer for the number of rows (i.e., records) in the dataset. |
stopExec |
Logical that indicates if the data are free of errors or not. i.e. |
nas |
List organised by column indicating whether |
DateRan |
Matrix of dates ranges (as character strings) for each date column in the dataset. |
probDescr |
Character vector explaining the seven types of problems the |
MinBBirth |
Vector of indices of rows where “ |
BirthMaxB |
Vector of indices of rows where “ |
MinBMaxB |
Vector of indices of rows where “ |
BirthEntr |
Vector of indices of rows where “ |
MinBEntr |
Vector of indices of rows where “ |
MaxBEntr |
Vector of indices of rows where “ |
EntrDep |
Vector of indices of rows where “ |
DepartType |
Vector of indices of rows where “ |
idUnCens |
Vector of indices of rows for uncensored (i.e., death) records. |
nUnCens |
Integer indicating the number of uncensored records. |
idCens |
Vector of indices of rows for censored records. |
nCens |
Integer indicating the number of censored records. |
idNoBirth |
Vector of indices of rows for records with uncertain birth date. |
nNoBirth |
Integer indicating the number of records with uncertain birth date. |
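As a small illustration of the censoring summaries, the sketch below derives index vectors and counts comparable to `idUnCens`, `nUnCens`, `idCens`, and `nCens` from a departure-type vector in base R. The data are invented and the code is only a guess at an equivalent computation, not the function's actual code.

```r
# Invented departure types: "D" = dead (uncensored), "C" = censored.
departType <- c("D", "C", "D", "C", "C")

# Row indices and count of uncensored (death) records.
idUnCens <- which(departType == "D")
nUnCens <- length(idUnCens)

# Row indices and count of censored records.
idCens <- which(departType == "C")
nCens <- length(idCens)
```

With valid departure types, `nUnCens + nCens` equals the number of records.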
Author(s)
Fernando Colchero fernando_colchero@eva.mpg.de
See Also
FixCMRdata to fix potential issues in capture-mark-recapture data.
Examples
## CMR data:
## --------- #
## Load data:
data("bastaCMRdat", package = "BaSTA")
## Check data consistency:
checkedData <- DataCheck(bastaCMRdat, dataType = "CMR", studyStart = 51,
                         studyEnd = 70)
## census data:
## ------------ #
## Load data:
data("bastaCensDat", package = "BaSTA")
## Check data consistency:
checkedData <- DataCheck(object = bastaCensDat, dataType = "census")
## Printed output:
## --------------- #
## Print DataCheck results:
print(checkedData)