discovery_probability {cellOrigins} | R Documentation |

## In situ discovery probability as a function of FPKM

### Description

Groups transcripts by expression strength and calculates for each such group the percentage of genes that gave a positive staining signal in the in situ hybridisation.

If the sequenced material matches the in situ hybridisation tissue, weakly expressed genes in the sequenced material should rarely appear among the genes that stain in situ, whereas strongly expressed genes should often stain. Overall, if the match is not spurious, there should be a logarithmic dose-response relationship between sequencing read coverage and staining probability. In a plot of discovery probability against log(coverage), this appears as an approximately straight line (see example).
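The straight-line expectation can be illustrated with simulated data. The following is a minimal sketch in base R, not the package's implementation: the linear model for the staining probability, its coefficients, and all variable names are assumptions made up for illustration.

```r
# Simulated data only: no real expression or in situ measurements are used.
set.seed(42)
n <- 5000
log2_fpkm <- runif(n, min = 0, max = 10)

# Assumed model: staining probability rises linearly with log2(FPKM).
p_true <- 0.05 + 0.08 * log2_fpkm
stained <- runif(n) < p_true

# Group transcripts into unit-wide log2 bins and take the stained fraction per bin.
bin <- cut(log2_fpkm, breaks = 0:10, right = FALSE)
p_hat <- as.vector(tapply(stained, bin, mean))

# Regress the per-bin fraction on the bin midpoints; a good linear fit
# corresponds to the approximately straight line described above.
bin_mid <- 0:9 + 0.5
fit <- lm(p_hat ~ bin_mid)
print(coef(fit))
```

With real data, a slope close to zero or a strongly non-monotone curve would suggest that the chosen anatomical terms do not match the sequenced material.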

### Usage

```
discovery_probability(seq_signature, terms, cut.points,
                      insitu = cellOrigins::BDGP_insitu_dmel_embryo)
```

### Arguments

`seq_signature` |
A named vector containing FPKM RNAseq data. Each element name must correspond to the names used in the |

`terms` |
A vector of anatomical terms which together are assumed to be the origin of the RNAseq data. |

`cut.points` |
A vector of cut points for grouping of values. E.g. 0:3 denotes the bins 0<=x<1, 1<=x<2, 2<=x<3. |

`insitu` |
Matrix with in situ hybridisation data. Rows are transcript names (same names as used for |
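The half-open bins described for `cut.points` can be reproduced with base R's `cut()`. This is only a sketch of the binning convention as documented above; whether the package uses `cut()` internally is an assumption, and the gene names and FPKM values are made up.

```r
# Cut points 0:3 define the bins 0<=x<1, 1<=x<2, 2<=x<3.
cut_points <- 0:3

# Hypothetical FPKM values, chosen to sit on and between the cut points.
fpkm <- c(geneA = 0, geneB = 0.5, geneC = 1, geneD = 2.9, geneE = 3)

# right = FALSE makes each interval closed on the left and open on the right.
bins <- cut(fpkm, breaks = cut_points, right = FALSE)
print(data.frame(fpkm, bin = bins))
```

Under this reading, a value equal to the last cut point (here 3) falls outside every bin and is reported as `NA`.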

### Value

A matrix with one row per bin and three columns. The first column is the probability of discovery, the second is the number of transcripts in the expression bin that were discovered by in situ hybridisation, and the third is the total number of transcripts in the bin.
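To make the layout of the result concrete, the sketch below builds a matrix of the same shape from made-up data. It assumes that the first column is simply the second divided by the third; the column names and input vectors are hypothetical and are not taken from the package.

```r
# Made-up inputs: FPKM values and whether each transcript stained in situ.
fpkm    <- c(0.2, 0.7, 1.5, 1.8, 2.1, 2.4, 2.6)
stained <- c(FALSE, FALSE, TRUE, FALSE, TRUE, TRUE, FALSE)
cut_points <- 0:3

# Assign each transcript to a half-open expression bin.
bin <- cut(fpkm, breaks = cut_points, right = FALSE)

# Per-bin counts: stained transcripts and all transcripts.
n_discovered <- as.vector(tapply(stained, bin, sum))
n_total      <- as.vector(tapply(stained, bin, length))

# One row per bin, three columns (assumed: probability = discovered / total).
result <- cbind(p_discovery  = n_discovered / n_total,
                n_discovered = n_discovered,
                n_total      = n_total)
rownames(result) <- levels(bin)
print(result)
```

Bins with few transcripts give noisy probabilities, so the third column is worth checking before interpreting the first.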

### See Also

`iterating_seqVsInsitu`, `BDGP_insitu_dmel_embryo`, `discovery.log`, `discovery.linear`, `discovery.identic`, `prior.temporal_proximity_is_good`, `prior.all_equal`, `diagnosticPlots`.

### Examples

```
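# Load the median coverage table shipped with the package
fpath <- system.file("extdata", "vncMedianCoverage.tsv", package = "cellOrigins")
vncExpression <- read.delim(file = fpath, header = FALSE, as.is = TRUE)

# Build a named FPKM vector: values from column 2, transcript names from column 1
expression <- vncExpression$V2
names(expression) <- vncExpression$V1

# Bins: [0,1), [1,2), [2,4), ..., [512,1024), i.e. 11 bins in total
p <- discovery_probability(expression,
                           "6|ventral nerve cord", c(0, 2^(0:10)))

# One x position per bin on a log2 scale, matching the 11 rows of p
plot(x = -1:9, y = p[, 1], type = "l",
     xlab = "log2(FPKM)", ylab = "p(discovery in situ)")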
```

Package *cellOrigins* version 0.1.3