CombPredictSpecific {IntegratedMRF}    R Documentation

Prediction for testing samples using specific combination weights from integrated RF or MRF model

Description

Builds a Random Forest (one output feature) or Multivariate Random Forest (more than one output feature) model for each data subtype and predicts the testing samples with these models. The predictions are combined using the specific combination weights provided by the user. For the given combination weights, the testing cell lines must have data for every subtype with a non-zero weight.

Usage

CombPredictSpecific(finalX, finalY_train, Cell, finalY_train_cell,
  finalY_test_cell, n_tree, m_feature, min_leaf, Coeff)

Arguments

finalX

List of matrices, where each matrix represents a specific data subtype (such as genomic characterizations for drug sensitivity prediction). Each subtype can have different types of features. For example, if there are three subtypes containing 100, 200 and 250 features respectively, finalX will be a list of 3 matrices of sizes M x 100, M x 200 and M x 250, where M is the number of samples.

finalY_train

An M x T matrix of output features for training samples, where M is the number of samples and T is the number of output features. The dataset is assumed to contain no missing values. If there are missing values, an imputation method should be applied before using the function. A function 'Imputation' is included within the package.

Cell

A list of samples for each data subtype (the samples can be represented either numerically by indices or by names). With 3 data subtypes, for example, it will be a list of 3 arrays, where each array contains the sample information for one data subtype.

finalY_train_cell

Sample names of output features for training samples

finalY_test_cell

Sample names of output features for testing samples (every testing sample must have features for each data subtype)

n_tree

Number of trees in the forest, which must be a positive integer

m_feature

Number of randomly selected features considered for a split in each regression tree node, which must be a positive integer

min_leaf

Minimum number of samples in the leaf node, which must be a positive integer less than or equal to M (number of training samples)

Coeff

Combination weights, either user-defined or generated using the 'Combination' function. The length must be C, the number of data subtypes given in finalX.
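
As a rough illustration of how the arguments above fit together, the following sketch builds toy inputs and checks the documented constraints. The sample names, the random data, and the helpers `is_pos_int` and `check_inputs` are hypothetical and are not part of IntegratedMRF. It also assumes, for simplicity, that every sample is present in every subtype.

```r
# Toy inputs for three data subtypes (all values are made up for illustration).
set.seed(1)
samples <- paste0("cell", 1:6)      # hypothetical sample names
n_features <- c(100, 200, 250)      # feature counts from the finalX description

# finalX: one matrix per subtype, rows = samples, columns = features.
finalX <- lapply(n_features, function(p) {
  matrix(rnorm(length(samples) * p), nrow = length(samples), ncol = p)
})

# Cell: one array of sample identifiers per subtype.
Cell <- rep(list(samples), length(finalX))

# Coeff: one combination weight per subtype (here subtype 3 is switched off).
Coeff <- c(0.5, 0.5, 0)

# Hypothetical helper: TRUE for a single positive whole number.
is_pos_int <- function(x) {
  is.numeric(x) && length(x) == 1 && x >= 1 && x == round(x)
}

# Hypothetical helper: checks only the constraints stated in the Arguments section.
check_inputs <- function(finalX, Cell, Coeff, n_tree, m_feature, min_leaf, M) {
  stopifnot(
    is.list(finalX), is.list(Cell),
    length(Cell) == length(finalX),    # one sample array per subtype
    length(Coeff) == length(finalX),   # one weight per subtype
    is_pos_int(n_tree), is_pos_int(m_feature), is_pos_int(min_leaf),
    min_leaf <= M                      # M = number of training samples
  )
  invisible(TRUE)
}

ok <- check_inputs(finalX, Cell, Coeff,
                   n_tree = 10, m_feature = 5, min_leaf = 2,
                   M = length(samples))
```

A check of this kind can catch a weight vector of the wrong length before any forest is trained.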

Details

The input feature matrices and the output feature matrix are used to build a Random Forest (one output feature) or Multivariate Random Forest (more than one output feature) model for each data subtype separately. Each subtype model then generates predictions for the testing samples, and these predictions are combined using the specific combination weights provided by the user. For the given combination weights, the testing cell lines must have data for every subtype with a non-zero weight. For instance, if the combination weights are [0.6 0.3 0 0.1], then subtypes 1, 2 and 4 need to be present for the testing samples. Furthermore, all the features of the required subtypes must be present for the testing samples.
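
The combination step described above can be sketched as follows. This is a minimal illustration that assumes the final prediction is a plain weighted sum of the per-subtype predictions; the package's internal computation may differ. The prediction values and the helper `combine_predictions` are hypothetical.

```r
# Hypothetical per-subtype predictions: 3 testing samples (rows), 4 subtypes (columns).
# Subtype 3 is unavailable for the testing samples, so its column is NA.
preds <- cbind(
  c(0.50, 0.40, 0.30),
  c(0.70, 0.20, 0.10),
  c(NA, NA, NA),
  c(0.90, 0.60, 0.30)
)
weights <- c(0.6, 0.3, 0, 0.1)   # the weights used in the example in the text

# Hypothetical helper: weighted sum over the subtypes with non-zero weight.
combine_predictions <- function(preds, weights) {
  stopifnot(ncol(preds) == length(weights))
  used <- weights != 0
  # Subtypes with non-zero weight must be present for every testing sample.
  if (anyNA(preds[, used, drop = FALSE])) {
    stop("testing samples lack data for a subtype with non-zero weight")
  }
  drop(preds[, used, drop = FALSE] %*% weights[used])
}

combined <- combine_predictions(preds, weights)
```

Because subtype 3 has weight 0, its missing predictions do not affect the result. Giving it a non-zero weight would trigger the error in this sketch.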

Value

Final predictions for the testing samples named in finalY_test_cell

Examples

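library(IntegratedMRF)
data(Dream_Dataset)

# Model settings
Tree <- 1         # number of trees (n_tree)
Feature <- 1      # features considered per split (m_feature)
Leaf <- 10        # minimum samples in a leaf node (min_leaf)
Confidence <- 80  # passed to Combination() below

# Unpack the example dataset
finalX <- Dream_Dataset[[1]]
Cell <- Dream_Dataset[[2]]
Y_train_Dream <- Dream_Dataset[[3]]
Y_train_cell <- Dream_Dataset[[4]]
Y_test <- Dream_Dataset[[5]]
Y_test_cell <- Dream_Dataset[[6]]

# Use the first drug as the single output feature
Drug <- 1
Y_train_Drug <- matrix(Y_train_Dream[, Drug], ncol = length(Drug))
Result <- Combination(finalX, Y_train_Drug, Cell, Y_train_cell, Tree, Feature, Leaf, Confidence)

# Predict the testing samples with random weights, one per data subtype
CombPredictSpecific(finalX, Y_train_Drug, Cell, Y_train_cell, Y_test_cell, Tree,
        Feature, Leaf, runif(length(Cell)))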

[Package IntegratedMRF version 1.1.9 Index]