Data Formats {cubfits} | R Documentation
Data Formats
Description
Data formats used in cubfits.
Format
All formats are simple S3 objects: default lists or data frames.
Details
-
Format b: A named list A with one element per amino acid. Each element A[[i]] is a list with the elements coefficients (coefficients of log(mu) and Delta.t), coef.mat (matrix form of coefficients), and R (covariance matrix of coefficients). Note that coefficients and R are typically as in the output of vglm() of the VGAM package. Also, coef.mat and R may be missing in some cases.
e.g. A[[i]]$coef.mat is the regression beta matrix of the i-th amino acid.
-
Format bVec: A vector that simply contains all coefficients of a b object A. Note that this is probably only used inside MCMC or the output of vglm() of the VGAM package.
e.g. do.call("c", lapply(A, function(x) x$coefficients)).
-
Format n: A named list A with one element per amino acid. Each element A[[i]] is a vector of total codon counts.
e.g. A[[i]][j] is the total count for the j-th ORF of the i-th amino acid names(A)[i].
-
Format n.list: A named list A with one element per ORF. Each element A[[i]] is a named list, with one element per amino acid, holding total counts.
e.g. A[[i]][[j]] is the total count of the j-th amino acid in the i-th ORF.
-
Format phi.df: A data frame A with two columns, ORF and phi.value.
e.g. A[i,] is for the i-th ORF.
-
Format reu13.df: A named list A with one element per amino acid. Each element is a data frame summarizing ORF and expression. The data frame has four or five columns: ORF, phi (expression), Pos (amino acid position), Codon (synonymous codon), and Codon.id (synonymous codon id, for computing only). Note that Codon.id may be missing in some cases.
e.g. A[[i]][17,] is the 17th record of the i-th amino acid.
-
Format reu13.list: A named list A with one element per ORF. Each element A[[i]] is a named list with one element per amino acid. Each element of the nested list, A[[i]][[j]], is a position vector of synonymous codons.
e.g. A[[i]][[j]][k] is the k-th synonymous codon position of the j-th amino acid in the i-th ORF.
-
Format scuo: A data frame of 8 named columns: AA (amino acid), ORF, and C1, ..., C6, where the C* columns hold codon counts.
-
Format seq.string: Default output of read.fasta() of the seqinr package. A named list A with one element per ORF. Each element of the list is the long string of an ORF.
e.g. A[[i]][1] or A[[i]] is the sequence of the i-th ORF.
-
Format seq.data: Converted from the seq.string format. A named list A with one element per ORF. Each element A[[i]] is a string vector, and each element of that vector is a codon string.
e.g. A[[i]][j] is the j-th codon of the i-th ORF.
-
Format phi.Obs: A named vector A of observed expression values, possibly with measurement errors.
e.g. A[i] is the observed phi value of the i-th ORF.
-
Format y: A named list A with one element per amino acid. Each element A[[i]] is a matrix with ORFs in rows and synonymous codons in columns. Each entry of the matrix is a codon count.
e.g. A[[i]][j, k] is the count for the i-th amino acid, j-th ORF, and k-th synonymous codon.
-
Format y.list: A named list A with one element per ORF. Each element A[[i]] is a named list with one element per amino acid, and each element A[[i]][[j]] is a codon count vector.
e.g. A[[i]][[j]][k] is the count for the i-th ORF, j-th amino acid, and k-th synonymous codon.
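The count formats y, n, y.list, and n.list describe the same data in different layouts. The following is a minimal base R sketch of how they relate, not the package's own conversion code. The ORF names, codon labels, counts, and the presence of dimnames on the matrices are all hypothetical.

```r
# Hypothetical counts for 3 ORFs; amino acid A has 4 codons, C has 2.
orf <- c("ORF1", "ORF2", "ORF3")
y <- list(
  A = matrix(c(2, 0, 1, 3,
               1, 1, 0, 0,
               0, 4, 2, 1),
             nrow = 3, byrow = TRUE,
             dimnames = list(orf, c("GCA", "GCC", "GCG", "GCT"))),
  C = matrix(c(1, 0,
               0, 2,
               3, 1),
             nrow = 3, byrow = TRUE,
             dimnames = list(orf, c("TGC", "TGT")))
)

# n: for each amino acid, the total codon count of every ORF (row sums of y).
n <- lapply(y, rowSums)

# y.list: for each ORF, a named list of codon count vectors, one per amino acid.
y.list <- lapply(setNames(orf, orf), function(i) {
  lapply(y, function(m) m[i, ])
})

# n.list: for each ORF, a named list of total counts, one per amino acid.
n.list <- lapply(y.list, function(aa) lapply(aa, sum))
```

Here y[["A"]]["ORF3", "GCC"] and y.list[["ORF3"]][["A"]]["GCC"] address the same count, only with the ORF and amino acid indices swapped.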
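To illustrate how a bVec vector is obtained from a b object, here is a small sketch using the flattening expression given in the bVec entry. The coefficient values and their lengths are made up for illustration. A real b object would typically come from a fit such as vglm() and may also carry coef.mat and R.

```r
# Hypothetical b object with only the coefficients element per amino acid.
b <- list(
  A = list(coefficients = c(0.10, -0.20, 0.30, 0.05, 0.15, -0.10)),
  C = list(coefficients = c(-0.40, 0.25))
)

# bVec: concatenate the coefficients of all amino acids, in list order.
bVec <- do.call("c", lapply(b, function(x) x$coefficients))

# The coefficients of A come first, followed by those of C.
length(bVec)
```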
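The relation between seq.string and seq.data can be sketched in base R as below. This is an illustration under stated assumptions, not the package's converter. The sequences are invented, split.codon is a hypothetical helper, and each sequence length is assumed to be a multiple of three.

```r
# Hypothetical ORF sequences; real input would be read with read.fasta().
seq.string <- list(ORF1 = "ATGGCTTGCTAA",
                   ORF2 = "ATGTGTGCAGCGTAA")

# Hypothetical helper: cut a string into consecutive, non-overlapping triplets.
split.codon <- function(s) {
  start <- seq(1, nchar(s) - 2, by = 3)
  substring(s, start, start + 2)
}

# seq.data: for each ORF, a character vector with one codon per element.
seq.data <- lapply(seq.string, split.codon)
```

With this layout, seq.data[["ORF2"]][3] is the third codon of ORF2.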
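The two expression formats, phi.df and phi.Obs, hold the same information as a data frame and as a named vector. A minimal sketch of moving from one to the other, with made-up ORF names and values, could look like this:

```r
# Hypothetical phi.df with the two documented columns.
phi.df <- data.frame(ORF = c("ORF1", "ORF2", "ORF3"),
                     phi.value = c(0.8, 1.5, 0.3),
                     stringsAsFactors = FALSE)

# phi.Obs: expression values named by ORF.
phi.Obs <- setNames(phi.df$phi.value, phi.df$ORF)
```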
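As a rough illustration of how reu13.df and reu13.list describe the same codon positions, the sketch below regroups a hypothetical reu13.df by ORF. All values are invented, Codon.id is omitted, and whether the package attaches names to the position vectors is not stated here, so the vectors are left unnamed.

```r
# Hypothetical reu13.df: one data frame per amino acid.
reu13.df <- list(
  A = data.frame(ORF = c("ORF1", "ORF1", "ORF2"),
                 phi = c(0.8, 0.8, 1.5),
                 Pos = c(2, 5, 3),
                 Codon = c("GCT", "GCA", "GCG"),
                 stringsAsFactors = FALSE),
  C = data.frame(ORF = c("ORF1", "ORF2"),
                 phi = c(0.8, 1.5),
                 Pos = c(3, 2),
                 Codon = c("TGC", "TGT"),
                 stringsAsFactors = FALSE)
)

# Regroup by ORF: for each ORF and amino acid, keep the codon positions.
orf <- c("ORF1", "ORF2")
reu13.list <- lapply(setNames(orf, orf), function(i) {
  lapply(reu13.df, function(d) d$Pos[d$ORF == i])
})
```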
Author(s)
Wei-Chen Chen wccsnow@gmail.com.
References
https://github.com/snoweye/cubfits/