impacts {MLID} | R Documentation |
Impact calculations
Description
Calculates the total contribution to the index of dissimilarity of neighbourhoods grouped by regions or other higher-level geographies
Usage
impacts(data, vars, levels, omit = NULL)
Arguments
data |
a data frame with |
vars |
a character or numeric vector of length 2 or 3 giving either
the names or column positions of the variables in
|
levels |
a character or numeric vector of minimum length 1 identifying
either the names or column positions of the variables in |
omit |
(optional) a character vector containing the names of places to search for in the data and to omit from the calculations |
Details
When the index of dissimilarity (ID) is estimated as a regression model,
the residuals from that model are the differences between the share of
population group Y and the share of population group X observed in
each neighbourhood. The impacts function summarises those differences
by higher-level geographies to show which places or regions contain the
neighbourhoods that contribute most to the ID. The measures are useful
for understanding where the separation of the two population groups is
greatest. However, to look at scale effects, where the effect of each level
net of the other levels is wanted, fit a multilevel index using the
id function.
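As a rough illustration of the idea above, the sketch below computes the neighbourhood-level differences in shares directly (rather than as residuals from a fitted model) and sums their absolute values by region. The data frame toy and its values are hypothetical, and this is not the package's own code; it only shows that the regional contributions add up to the overall ID.

```r
# Hypothetical counts for two population groups (Y and X) in six
# neighbourhoods nested in two regions; not taken from aggdata
toy <- data.frame(
  region = c("A", "A", "A", "B", "B", "B"),
  Y = c(10, 20, 30, 5, 5, 30),
  X = c(30, 20, 10, 20, 15, 5)
)

# Difference between each neighbourhood's share of Y and its share of X
d <- toy$Y / sum(toy$Y) - toy$X / sum(toy$X)

# Index of dissimilarity: half the sum of the absolute differences
ID <- 0.5 * sum(abs(d))

# Contribution of each region to the ID (assumed here to be the
# region's share of the summed absolute differences)
contrib <- tapply(0.5 * abs(d), toy$region, sum)

print(ID)
print(contrib)
```

In this toy case the two regional contributions sum to the ID, which is what allows them to be expressed as percentages of the overall score.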
Value
A list of data.frames, each containing the impact calculations for the higher-level geographies. The variables are
- pcntID: The total contribution of the neighbourhoods within the region to the overall ID score, expressed as a percentage
- pcntN: The number of neighbourhoods within the region, expressed as a percentage of the total number in data
- impact: The ratio of pcntID to pcntN multiplied by 100. Values over 100 indicate a group of neighbourhoods that have a disproportionately high impact on the ID
- scldMean: The average difference between the share of the Y population and the share of the X population, scaled by the standard error of the differences for the whole data set (to give a z-value). Positive values mean that, on average, the region has a greater share of the Y population than the X. Negative values mean it has less.
- scldSD: A measure of how much the differences between the shares of the two populations vary within the region. It is the standard deviation of those differences scaled by the standard error for the whole data set. Higher values indicate greater variability within the region.
- scldMin: The minimum difference between the share of the Y population and the share of the X for neighbourhoods within the region, scaled by the standard error
- scldMax: The maximum difference between the share of the Y population and the share of the X for neighbourhoods within the region, scaled by the standard error
- pNYgtrNX: The percentage of neighbourhoods within the region where the count of population group Y (as opposed to the share) is greater than the count of population group X
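The arithmetic behind pcntID, pcntN, impact and pNYgtrNX can be sketched in base R as follows. The data frame toy is hypothetical, and the use of absolute differences in shares for pcntID is an assumption based on the definitions above, not a copy of the package's internals. The scaled measures (scldMean and so on) are left out because the exact form of the standard error is not stated here.

```r
# Hypothetical data: region A has four neighbourhoods, region B has two
toy <- data.frame(
  region = c("A", "A", "A", "A", "B", "B"),
  Y = c(10, 10, 10, 10, 20, 40),
  X = c(20, 20, 20, 20, 10, 10)
)

# Absolute difference between the share of Y and the share of X
absd <- abs(toy$Y / sum(toy$Y) - toy$X / sum(toy$X))

# Percentage of the overall ID contributed by each region (assumed form)
pcntID <- 100 * tapply(absd, toy$region, sum) / sum(absd)

# Percentage of all neighbourhoods that fall in each region
pcntN <- 100 * as.vector(table(toy$region)) / nrow(toy)

# Ratio of the two, multiplied by 100
impact <- 100 * as.vector(pcntID) / pcntN

# Percentage of neighbourhoods where the count of Y exceeds the count of X
pNYgtrNX <- 100 * as.vector(tapply(toy$Y > toy$X, toy$region, mean))

res <- data.frame(pcntID = as.vector(pcntID), pcntN, impact, pNYgtrNX,
                  row.names = names(pcntID))
print(res)
```

Here both regions contribute half of the ID, but region B does so with only a third of the neighbourhoods, so its impact score is above 100 while region A's is below.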
Examples
data(aggdata)
impx <- impacts(aggdata, c("Bangladeshi", "WhiteBrit"), c("LAD","RGN"))
head(impx)
# sorted by impact score
# For $RGN London has the greatest impact on the ID
# The 'excess' share of the Bangladeshi population is not especially
# significant (see scldMean) but there is a lot of variation between
# neighbourhoods (see scldSD)
# For $LAD note the impacts of Tower Hamlets and Newham