proteins_1host {vDiveR} | R Documentation
DiMA (v4.1.1) JSON converted-CSV Output Sample 1
Description
A dummy dataset containing two proteins (A and B) from a single host (human).
Usage
proteins_1host
Format
A data frame with 806 rows and 17 variables:
- proteinName
name of the protein
- position
starting position of the aligned, overlapping k-mer window
- count
number of k-mer sequences at the given position
- lowSupport
whether the k-mer position has low support in terms of sample size: TRUE if the number of sequences at the position is below the minimum support threshold
- entropy
level of variability at the k-mer position, with zero indicating a completely conserved position
- indexSequence
the predominant sequence (index motif) at the given k-mer position
- index.incidence
the fraction (in percentage) of the index sequences at the k-mer position
- major.incidence
the fraction (in percentage) of the major sequence (the predominant variant of the index) at the k-mer position
- minor.incidence
the fraction (in percentage) of minor sequences (those with a frequency lower than the major variant, excluding singletons) at the k-mer position
- unique.incidence
the fraction (in percentage) of unique sequences (singletons, observed only once) at the k-mer position
- totalVariants.incidence
the fraction (in percentage) of sequences at the k-mer position that are variants of the index (includes major, minor and unique variants)
- distinctVariant.incidence
incidence of the distinct k-mer peptides at the k-mer position
- multiIndex
presence of more than one index sequence of equal incidence
- host
species name of the host organism of the virus
- highestEntropy.position
k-mer position that has the highest entropy value
- highestEntropy
highest entropy value observed in the studied protein
- averageEntropy
average entropy value across all the k-mer positions of the protein
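Examples
The column descriptions above suggest a few common operations: dropping low-support positions, summarising entropy per protein, and checking that the major, minor and unique incidences add up to the total variant incidence. The sketch below illustrates these on a small toy data frame. The name `toy` and all of its values are hypothetical and are not taken from the actual dataset; only a subset of the 17 columns is reproduced. With vDiveR installed, `proteins_1host` could presumably be used in place of `toy`.

```r
# Toy stand-in for proteins_1host (hypothetical values, subset of columns)
toy <- data.frame(
  proteinName             = c("A", "A", "A", "B", "B"),
  position                = c(1L, 2L, 3L, 1L, 2L),
  count                   = c(100L, 100L, 20L, 80L, 80L),
  lowSupport              = c(FALSE, FALSE, TRUE, FALSE, FALSE),
  entropy                 = c(0, 1.2, 0.8, 0.3, 2.1),
  index.incidence         = c(100, 70, 85, 95, 50),
  major.incidence         = c(0, 20, 10, 5, 30),
  minor.incidence         = c(0, 8, 0, 0, 15),
  unique.incidence        = c(0, 2, 5, 0, 5),
  totalVariants.incidence = c(0, 30, 15, 5, 50),
  host                    = "human",
  stringsAsFactors        = FALSE
)

# Keep only k-mer positions with adequate sample size
supported <- toy[!toy$lowSupport, ]

# Per-protein entropy summaries over the supported positions
avg_entropy <- tapply(supported$entropy, supported$proteinName, mean)
max_entropy <- tapply(supported$entropy, supported$proteinName, max)

# Assumption drawn from the field descriptions: total variants comprise
# the major, minor and unique variants, so their incidences should sum
# to totalVariants.incidence (compared with a numeric tolerance)
variant_sum <- toy$major.incidence + toy$minor.incidence + toy$unique.incidence
all_consistent <- isTRUE(all.equal(variant_sum, toy$totalVariants.incidence))
```

Filtering on `lowSupport` before summarising is a design choice rather than a requirement; whether low-support positions should be excluded depends on the analysis.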