physician_debias {InSilicoVA}    R Documentation

Implement physician debias algorithm

Description

This function implements the physician debias algorithm proposed by Salter-Townshend and Murphy (2013).

Usage

physician_debias(
  data,
  phy.id,
  phy.code,
  phylist,
  causelist,
  tol = 1e-04,
  max.itr = 5000,
  verbose = FALSE
)

Arguments

data

The original data to be used. The suggested format is similar to the InterVA4 input, with the first column holding the death IDs. The only difference is that InSilicoVA distinguishes three levels: “present”, “absent”, and “missing (no data)”. As in the InterVA software, “present” symptoms take the value “Y” and “absent” symptoms take the value “NA” or “”. Missing symptoms, e.g., questions not asked or not answered in the original interview, or corrupted data, should be coded as “.” to distinguish them from the “absent” category. The order of the columns does not matter as long as the column names are correct. Currently the function cannot handle other non-symptom columns such as subpopulation. Every column other than the death ID, physician ID, and physician code columns should be a symptom.
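
The three-level coding described above can be illustrated with a small base R sketch. The data frame and its column names (ID, fever, cough, rev1, code1) are hypothetical and chosen only for illustration, and the check is not part of InSilicoVA. It treats both the string "NA" and a true NA as “absent”, which is an assumption about how the description should be read.

```r
# Hypothetical toy input: one death ID column, two symptom columns,
# one physician ID column and one physician code column.
toy <- data.frame(
  ID    = c("d1", "d2", "d3"),
  fever = c("Y", "", "."),    # present, absent, missing
  cough = c("", "Y", "Y"),    # absent, present, present
  rev1  = c("doc1", "doc2", "doc1"),
  code1 = c("NCD", "External", "NCD"),
  stringsAsFactors = FALSE
)

# Everything except the ID and physician columns is taken to be a symptom.
symptom_cols <- setdiff(names(toy), c("ID", "rev1", "code1"))

# Assumed set of accepted symptom codes; a true NA is also accepted as "absent".
allowed <- c("Y", "", "NA", ".")
ok <- vapply(
  toy[symptom_cols],
  function(x) all(is.na(x) | x %in% allowed),
  logical(1)
)
```

Here ok holds one logical value per symptom column, so any column with an unexpected code is easy to spot before the data are passed on.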

phy.id

vector of column names for physician ID

phy.code

vector of column names for physician code

phylist

vector of physician IDs used in the physician ID columns

causelist

vector of causes used in physician code columns

tol

tolerance of the EM algorithm

max.itr

maximum number of iterations to run

verbose

logical indicator for printing out likelihood change
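
Since phylist and causelist are described as the IDs and causes used in the physician columns, it may be useful to confirm that they cover every value in the data before calling the function. The following base R sketch shows one way to do so. The reviews data frame is hypothetical, and whether physician_debias performs a similar check internally is not stated here.

```r
# Hypothetical physician columns for three deaths, two reviewers each.
reviews <- data.frame(
  rev1  = c("doc1", "doc2", "doc3"),
  rev2  = c("doc2", "doc3", "doc1"),
  code1 = c("NCD", "External", "Maternal"),
  code2 = c("NCD", "Unknown", "Maternal"),
  stringsAsFactors = FALSE
)

phy.id    <- c("rev1", "rev2")
phy.code  <- c("code1", "code2")
phylist   <- paste0("doc", 1:3)
causelist <- c("Communicable", "TB/AIDS", "Maternal",
               "NCD", "External", "Unknown")

# Collect the distinct values that actually appear in the data.
ids_seen   <- unique(unlist(reviews[phy.id], use.names = FALSE))
codes_seen <- unique(unlist(reviews[phy.code], use.names = FALSE))

# Values present in the data but absent from the supplied lists.
unknown_ids   <- setdiff(ids_seen, phylist)
unknown_codes <- setdiff(codes_seen, causelist)
```

Empty unknown_ids and unknown_codes vectors indicate that the supplied lists cover the data.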

Value

code.debias

Individual cause likelihood distribution

csmf

Cause-specific distribution in the sample

phy.bias

Bias matrix for each physician

cond.prob

Conditional probability of symptoms given causes

References

M. Salter-Townshend and T. B. Murphy (2013). Sentiment analysis of online media.
In Algorithms from and for Nature and Life, pages 137-145, Springer.

Examples


data(RandomPhysician)
head(RandomPhysician[, 1:10])
## Not run: 
causelist <- c("Communicable", "TB/AIDS", "Maternal", 
               "NCD", "External", "Unknown")
phydebias <- physician_debias(RandomPhysician, phy.id = c("rev1", "rev2"),
                              phy.code = c("code1", "code2"),
                              phylist = paste0("doc", 1:15),
                              causelist = causelist, tol = 0.0001, max.itr = 5000)

# see the first physician's bias matrix
round(phydebias$phy.bias[[1]], 2)

## End(Not run)

[Package InSilicoVA version 1.4.0 Index]