proteins_1host {vDiveR}    R Documentation

DiMA (v4.1.1) JSON converted-CSV Output Sample 1

Description

A dummy dataset with two proteins (A and B) from a single host (human).

Usage

proteins_1host
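
A minimal sketch of how the dataset might be inspected, assuming the vDiveR package is installed and exports the data for direct access as the usage line suggests. The `has_vdiver` flag and the `d` variable are illustrative names, not part of the package.

```r
# Check for the package first so the snippet does nothing if vDiveR is absent.
has_vdiver <- requireNamespace("vDiveR", quietly = TRUE)

if (has_vdiver) {
  d <- vDiveR::proteins_1host

  str(d)                        # column names and types
  print(table(d$proteinName))   # number of k-mer positions per protein
  print(unique(d$host))         # host(s) present in the data
}
```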

Format

A data frame with 806 rows and 17 variables:

proteinName

name of the protein

position

starting position of the aligned, overlapping k-mer window

count

number of k-mer sequences at the given position

lowSupport

TRUE if the number of sequences at the k-mer position is below the minimum support threshold; such positions are considered of low support in terms of sample size

entropy

level of variability at the k-mer position, with zero representing completely conserved

indexSequence

the predominant sequence (index motif) at the given k-mer position

index.incidence

the fraction (in percentage) of the index sequences at the k-mer position

major.incidence

the fraction (in percentage) of the major sequence (the predominant variant to the index) at the k-mer position

minor.incidence

the fraction (in percentage) of minor sequences (of frequency lower than that of the major variant, but not singletons) at the k-mer position

unique.incidence

the fraction (in percentage) of unique sequences (singletons, observed only once) at the k-mer position

totalVariants.incidence

the fraction (in percentage) of sequences at the k-mer position that are variants to the index (includes: major, minor and unique variants)

distinctVariant.incidence

incidence of the distinct k-mer peptides at the k-mer position

multiIndex

presence of more than one index sequence of equal incidence

host

species name of the host organism of the virus

highestEntropy.position

k-mer position that has the highest entropy value

highestEntropy

highest entropy value observed in the studied protein

averageEntropy

average entropy value across all the k-mer positions
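
The relationships among the columns above can be sketched on toy data. This is an illustration under stated assumptions, not DiMA's implementation: the k-mers and entropy values are invented, `summarise_position()` is a hypothetical helper, entropy is computed as plain Shannon entropy in bits (DiMA's estimator may differ), and a variant is treated as major only if it occurs more than once.

```r
# Toy k-mers at one aligned position (invented, not taken from proteins_1host).
kmers <- c(rep("MKTLLVAGA", 10), rep("MKTLLVAGS", 5), rep("MKTLIVAGA", 3),
           "MRTLLVAGA", "MKTLLVAGT")

# Hypothetical helper: derive the per-position columns from raw k-mers.
summarise_position <- function(kmers) {
  freq   <- sort(table(kmers), decreasing = TRUE)  # most frequent first
  counts <- as.integer(freq)
  n      <- sum(counts)

  index_n  <- counts[1]                  # predominant sequence
  variants <- counts[-1]                 # everything that is not the index
  unique_n <- sum(variants == 1L)        # singletons, one sequence each
  multi    <- variants[variants > 1L]    # variants seen more than once
  major_n  <- if (length(multi) > 0) multi[1] else 0L  # assumption, see lead-in
  minor_n  <- sum(multi[-1])             # remaining non-singleton variants

  p <- counts / n
  data.frame(
    count                   = n,
    entropy                 = -sum(p * log2(p)),  # plain Shannon entropy (bits)
    indexSequence           = names(freq)[1],
    index.incidence         = 100 * index_n / n,
    major.incidence         = 100 * major_n / n,
    minor.incidence         = 100 * minor_n / n,
    unique.incidence        = 100 * unique_n / n,
    totalVariants.incidence = 100 * (n - index_n) / n,
    multiIndex              = sum(counts == index_n) > 1
  )
}

res <- summarise_position(kmers)
print(res)

# Protein-level summaries from per-position entropy (toy values).
toy <- data.frame(proteinName = c("A", "A", "A", "B", "B"),
                  position    = c(1, 2, 3, 1, 2),
                  entropy     = c(0, 1.5, 0.3, 0.8, 0.2))

per_protein <- do.call(rbind, lapply(split(toy, toy$proteinName), function(d) {
  data.frame(proteinName             = d$proteinName[1],
             highestEntropy.position = d$position[which.max(d$entropy)],
             highestEntropy          = max(d$entropy),
             averageEntropy          = mean(d$entropy))
}))
print(per_protein)
```

In this sketch the index, major, minor and unique incidences sum to 100, and totalVariants.incidence equals the sum of the last three, which follows from the definitions above.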


[Package vDiveR version 1.2.1 Index]