estimate_p_hat {bumblebee} | R Documentation |
estimate_p_hat
Estimates probability of linkage between two individuals
Description
This function computes the probability that pathogen sequences from two individuals randomly sampled from their respective population groups (e.g. communities) are linked.
Usage
estimate_p_hat(df_counts, ...)
## Default S3 method:
estimate_p_hat(df_counts, ...)
Arguments
df_counts | A data.frame returned by the function prep_p_hat |
... | Further arguments. |
Details
For a population group pairing (u,v), p_hat is computed as the fraction of distinct possible pairs between samples from groups u and v that are linked. Note: The number of distinct possible (u,v) pairs in the sample is the product of the numbers of individuals sampled in groups u and v. If u = v, then the number of distinct possible pairs is the number of individuals sampled in population group u choose 2. See the bumblebee website for more details: https://magosil86.github.io/bumblebee/.
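The calculation described above can be sketched in a few lines of base R. This is only an illustration under stated assumptions, not the package's implementation: the helper toy_p_hat and its arguments are hypothetical names, and the sample sizes and linked-pair counts are made up.

```r
# Hypothetical helper (not part of bumblebee): fraction of distinct possible
# pairs between samples from groups u and v that are linked.
toy_p_hat <- function(n_sampled_u, n_sampled_v, n_linked, same_group = FALSE) {
  max_pairs <- if (same_group) {
    choose(n_sampled_u, 2)       # u = v: number sampled in u, choose 2
  } else {
    n_sampled_u * n_sampled_v    # u != v: product of the two sample sizes
  }
  n_linked / max_pairs
}

# Between-group pairing: 10 and 20 sampled individuals, 4 linked pairs
p_between <- toy_p_hat(10, 20, n_linked = 4)                    # 4 / 200

# Within-group pairing: 10 sampled individuals, 3 linked pairs
p_within <- toy_p_hat(10, 10, n_linked = 3, same_group = TRUE)  # 3 / 45
```

The only difference between the two cases is the denominator: a within-group pairing counts each unordered pair of sampled individuals once, so it uses choose(n, 2) rather than n * n.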
Value
Returns a data.frame containing:
H1_group, Name of population group 1
H2_group, Name of population group 2
number_hosts_sampled_group_1, Number of individuals sampled from population group 1
number_hosts_sampled_group_2, Number of individuals sampled from population group 2
number_hosts_population_group_1, Estimated number of individuals in population group 1
number_hosts_population_group_2, Estimated number of individuals in population group 2
max_possible_pairs_in_sample, Number of distinct possible transmission pairs between individuals sampled from population groups 1 and 2
max_possible_pairs_in_population, Number of distinct possible transmission pairs between individuals in population groups 1 and 2
num_linked_pairs_observed, Number of observed directed transmission pairs between samples from population groups 1 and 2
p_hat, Probability that pathogen sequences from two individuals randomly sampled from their respective population groups are linked
Methods (by class)
- default: Estimates probability of linkage between two individuals
References
Magosi LE, et al., Deep-sequence phylogenetics to quantify patterns of HIV transmission in the context of a universal testing and treatment trial – BCPP/ Ya Tsie trial. To submit for publication, 2021.
Carnegie, N.B., et al., Linkage of viral sequences among HIV-infected village residents in Botswana: estimation of linkage rates in the presence of missing data. PLoS Computational Biology, 2014. 10(1): p. e1003430.
See Also
See prep_p_hat to prepare input data to estimate p_hat.
Examples
library(bumblebee)
library(dplyr)
# Estimate the probability of linkage between two individuals randomly sampled from
# two population groups of interest.
# We shall use the data of HIV transmissions within and between intervention and control
# communities in the BCPP/Ya Tsie HIV prevention trial. To learn more about the data, see
# ?counts_hiv_transmission_pairs and ?sampling_frequency
# Prepare input to estimate p_hat
# View counts of observed directed HIV transmissions within and between intervention
# and control communities
counts_hiv_transmission_pairs
# View the estimated number of individuals with HIV in intervention and control
# communities and the number of individuals sampled from each
sampling_frequency
results_prep_p_hat <- prep_p_hat(group_in = sampling_frequency$population_group,
                                 individuals_sampled_in = sampling_frequency$number_sampled,
                                 individuals_population_in = sampling_frequency$number_population,
                                 linkage_counts_in = counts_hiv_transmission_pairs,
                                 verbose_output = FALSE)
# View results
results_prep_p_hat
# Estimate p_hat
results_estimate_p_hat <- estimate_p_hat(df_counts = results_prep_p_hat)
# View results
results_estimate_p_hat