bfsl {bfsl} R Documentation

## Calculates the Best-fit Straight Line

### Description

`bfsl` calculates the best-fit straight line to independent points with (possibly correlated) normally distributed errors in both coordinates.

### Usage

```
bfsl(x, y = NULL, sd_x = 0, sd_y = 1, r = 0, control = bfsl_control())
```

### Arguments

`x` A vector of x observations or a data frame (or an object coercible by `as.data.frame` to a data frame) containing the named vectors `x`, `y`, and optionally `sd_x`, `sd_y` and `r`. If weights `w_x` and `w_y` are given, then `sd_x` and `sd_y` are calculated from `sd_x = 1/sqrt(w_x)` and `sd_y = 1/sqrt(w_y)`. Specifying `y`, `sd_x`, `sd_y` or `r` directly as function arguments overwrites these variables in the data structure.

`y` A vector of y observations.

`sd_x` A vector of x measurement error standard deviations. If it is of length one, all data points are assumed to have the same x standard deviation.

`sd_y` A vector of y measurement error standard deviations. If it is of length one, all data points are assumed to have the same y standard deviation.

`r` A vector of correlation coefficients between errors in x and y. If it is of length one, all data points are assumed to have the same correlation coefficient.

`control` A list of control settings. See `bfsl_control` for the names of the settable control values and their effect.

### Details

`bfsl` provides the general least-squares estimation solution to the problem of fitting a straight line to independent data with (possibly correlated) normally distributed errors in both `x` and `y`.

With `sd_x = 0` the (weighted) ordinary least squares solution is obtained. The calculated standard errors of the slope and intercept, multiplied by `sqrt(chisq)`, correspond to the ordinary least squares standard errors.
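The relationship between the two kinds of standard errors can be checked in base R without calling `bfsl` at all. The sketch below uses simulated data (the values are illustrative assumptions, not package data) and assumes that the standard errors reported for `sd_x = 0` are the ones that take the assigned `sd_y` at face value, as described above.

```r
# Simulated data with known y measurement errors (illustrative values only)
set.seed(1)
n    <- 20
x    <- seq(1, 10, length.out = n)
sd_y <- rep(c(0.5, 1), length.out = n)
y    <- 2 + 0.5 * x + rnorm(n, sd = sd_y)

w <- 1 / sd_y^2        # weights derived from the assigned standard deviations
X <- cbind(1, x)       # design matrix: intercept column and x column

# Weighted least squares estimate and standard errors that trust sd_y as given
XtWX     <- t(X) %*% (w * X)
beta     <- solve(XtWX, t(X) %*% (w * y))
se_known <- sqrt(diag(solve(XtWX)))

# Reduced chi-squared of the fit (n - 2 degrees of freedom)
res   <- y - X %*% beta
chisq <- sum(w * res^2) / (n - 2)

# lm() instead estimates the error scale from the residuals
fit_lm <- lm(y ~ x, weights = w)
se_lm  <- unname(coef(summary(fit_lm))[, "Std. Error"])

# Scaling by sqrt(chisq) recovers the lm() standard errors
all.equal(unname(se_known * sqrt(chisq)), se_lm)
```

The point of the comparison is that `lm()` rescales the weights by the residual scatter, whereas a fit that trusts the assigned errors does not; the factor between the two is `sqrt(chisq)`.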

With `sd_x = c`, `sd_y = d`, where `c` and `d` are positive numbers, and `r = 0` the Deming regression solution is obtained. If additionally `c = d`, the orthogonal distance regression solution, also known as major axis regression, is obtained.
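For constant, uncorrelated errors the Deming slope has a well-known closed form, which makes this special case easy to reproduce independently. The following base R sketch uses a hypothetical helper, `deming_fit`, that is not part of the bfsl package, and made-up data; it also checks that equal error standard deviations give the direction of the first principal component (the major axis).

```r
# Hypothetical helper (not part of bfsl): closed-form Deming regression.
# delta is the ratio of the y error variance to the x error variance.
# Requires a non-zero sample covariance between x and y.
deming_fit <- function(x, y, sd_x = 1, sd_y = 1) {
  delta <- (sd_y / sd_x)^2
  s_xx  <- var(x)
  s_yy  <- var(y)
  s_xy  <- cov(x, y)
  d     <- s_yy - delta * s_xx
  slope <- (d + sqrt(d^2 + 4 * delta * s_xy^2)) / (2 * s_xy)
  c(intercept = mean(y) - slope * mean(x), slope = slope)
}

# Made-up illustrative data
x <- c(1, 2, 3, 4, 5, 6)
y <- c(1.2, 1.9, 3.2, 3.8, 5.1, 6.3)

# Deming regression with unequal error standard deviations
deming_fit(x, y, sd_x = 0.1, sd_y = 0.2)

# Equal standard deviations: orthogonal distance (major axis) regression
oda <- deming_fit(x, y, sd_x = 1, sd_y = 1)

# The major axis is the first principal component of the centred data
pc <- prcomp(cbind(x, y))$rotation[, 1]
all.equal(unname(oda["slope"]), unname(pc[2] / pc[1]))
```

Only the ratio `sd_y / sd_x` enters the closed form, which is why any pair of equal positive values yields the same orthogonal fit.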

Setting `sd_x = sd(x)`, `sd_y = sd(y)` and `r = 0` leads to the geometric mean regression solution, also known as reduced major axis regression or standardised major axis regression.
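The name "geometric mean regression" comes from the fact that the slope is the geometric mean of the two ordinary least squares slopes (y on x, and the reciprocal of x on y), which reduces to `sd(y) / sd(x)` with the sign of the correlation. A minimal base R sketch with made-up data, independent of the bfsl package:

```r
# Made-up illustrative data
x <- c(1, 2, 3, 4, 5, 6)
y <- c(1.2, 1.9, 3.2, 3.8, 5.1, 6.3)

# Geometric mean (reduced major axis) regression in closed form
slope_gmr     <- sign(cov(x, y)) * sd(y) / sd(x)
intercept_gmr <- mean(y) - slope_gmr * mean(x)

# The two ordinary least squares slopes
b_yx <- unname(coef(lm(y ~ x))[2])   # regression of y on x
b_xy <- unname(coef(lm(x ~ y))[2])   # regression of x on y

# Geometric mean of b_yx and 1 / b_xy, carrying the sign of the correlation
all.equal(slope_gmr, sign(b_yx) * sqrt(b_yx / b_xy))
```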

The goodness of fit metric `chisq` is a weighted reduced chi-squared statistic. It compares the deviations of the points from the fit line to the assigned measurement error standard deviations. If `x` and `y` are indeed related by a straight line, and if the assigned measurement errors are correct (and normally distributed), then `chisq` is expected to be close to 1. A `chisq > 1` indicates underfitting: either the fit does not fully capture the data, or the measurement errors have been underestimated. A `chisq < 1` indicates overfitting: either the model is improperly fitting noise, or the measurement errors have been overestimated.
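To make the statistic concrete, the sketch below computes a reduced chi-squared for a given line using the weighting commonly used in York-type fits, where each point's residual is scaled by the combined variance of its x and y errors projected onto the line. The helper `reduced_chisq` is hypothetical, and it is an assumption that `bfsl` computes `chisq` in exactly this way.

```r
# Hypothetical helper (not part of bfsl): reduced chi-squared of the line
# y = a + b * x, given error standard deviations and error correlations.
reduced_chisq <- function(x, y, a, b, sd_x = 0, sd_y = 1, r = 0) {
  # Per-point weight: inverse variance of the residual y - a - b * x
  w <- 1 / (sd_y^2 + b^2 * sd_x^2 - 2 * b * r * sd_x * sd_y)
  # Two fitted parameters, hence length(x) - 2 degrees of freedom
  sum(w * (y - a - b * x)^2) / (length(x) - 2)
}

# Made-up data: three points on the line y = x, one point off by 1
x <- 1:4
y <- c(1, 2, 3, 5)

reduced_chisq(x, y, a = 0, b = 1)                      # 0.5
reduced_chisq(x, y, a = 0, b = 1, sd_x = 1, sd_y = 1)  # 0.25
```

Assigning a larger error budget (here by adding an x error) lowers the statistic for the same residuals, which is the sense in which overestimated errors push `chisq` below 1.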

### Value

An object of class "`bfsl`", which is a `list` containing the following components:

`coefficients` A `2x2` matrix with columns of the fitted coefficients (intercept and slope) and their standard errors.

`chisq` The goodness of fit (see Details).

`control` The control `list` used, see the `control` argument.

`convInfo` A `list` with convergence information.

`call` The matched call.

`data` A `list` containing `x`, `y`, `sd_x`, `sd_y` and `r`.

### References

York, D. (1968). Least squares fitting of a straight line with correlated errors. Earth and Planetary Science Letters, 5, 320–324, https://doi.org/10.1016/S0012-821X(68)80059-7

### Examples

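```
library(bfsl)  # provides bfsl() and the pearson_york data set

# Fit from individual vectors; standard deviations are derived from the weights
x = pearson_york$x
y = pearson_york$y
sd_x = 1/sqrt(pearson_york$w_x)
sd_y = 1/sqrt(pearson_york$w_y)
bfsl(x, y, sd_x, sd_y)

# Fit directly from the data frame (weights w_x, w_y are converted internally)
fit = bfsl(pearson_york)
plot(fit)
```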

[Package bfsl version 0.1.0 Index]