| DSM_GoodsMatrix {wordspace} | R Documentation | 
A Scored Co-occurrence Matrix of Nouns Denoting Goods (wordspace)
Description
A pre-scored verb-object co-occurrence matrix for 240 target nouns denoting goods and the three feature verbs own, buy and sell. Because it has only three feature dimensions, the matrix is useful for illustrating the application and purpose of dimensionality reduction techniques.
Usage
DSM_GoodsMatrix
Format
A numeric matrix with 240 rows, corresponding to target nouns denoting goods, and 4 columns:
- own, buy, sell: association scores for co-occurrences of the nouns with the verbs own, buy and sell
- fringe: an indicator of how close each point is to the “fringe” of the data set (ranging from 0 to 1)
Details
Co-occurrence data are based on verb-object dependency relations in the British National Corpus, obtained from DSM_VerbNounTriples_BNC.  Only nouns that co-occur with all three verbs are included in the data set.
The co-occurrence matrix is weighted with non-sparse log-likelihood (simple-ll) and an additional logarithmic transformation (log).  Row vectors are not normalized.
The fringeness score in column fringe indicates how close a data point is to the fringe of the data set.  Values are distance quantiles based on PCA-whitened Manhattan distance from the centroid.  For example, fringe >= 0.8 selects the 20% of points that are closest to the fringe.  Fringeness is mainly used to select points to be labelled in plots or to take stratified samples from the data set.
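To make the definition of fringeness more concrete, the following base R sketch shows one way a comparable score could be computed. It is an illustration only and not necessarily the procedure used to build the fringe column: the helper name fringeness, the use of prcomp for whitening, the rank-based quantile and the random toy data are all assumptions.

```r
# Hypothetical helper: a fringeness-like score for the rows of a numeric matrix.
fringeness <- function(X) {
  pca <- prcomp(X, center = TRUE, scale. = FALSE)
  # PCA whitening: rescale every principal component to unit variance
  Z <- sweep(pca$x, 2, pca$sdev, "/")
  # Manhattan distance from the centroid (the origin after centering)
  d <- rowSums(abs(Z))
  # Turn distances into quantiles: the most distant point gets the value 1
  rank(d, ties.method = "max") / length(d)
}

# Toy data (random numbers, not the real association scores)
set.seed(42)
X <- matrix(rnorm(300), ncol = 3,
            dimnames = list(paste0("w", 1:100), c("own", "buy", "sell")))

fr <- fringeness(X)
outer_points <- names(fr)[fr >= 0.8]  # roughly the outermost 20% of points
```

Because the score is a quantile, a threshold such as 0.8 always selects about the same share of points, whatever the scale of the underlying distances.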
Examples
DSM_GoodsMatrix[c("time", "goods", "service"), ]
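As a further illustration of the uses mentioned above, the sketch below projects the three association scores onto two principal components and labels only the points near the fringe. It assumes the wordspace package is installed and that the columns are named own, buy, sell and fringe as listed under Format. The choice of prcomp and the 0.8 threshold are illustrative, not prescribed by the package.

```r
library(wordspace)  # provides DSM_GoodsMatrix

M <- DSM_GoodsMatrix[, c("own", "buy", "sell")]
fringe <- DSM_GoodsMatrix[, "fringe"]

# Reduce the three score dimensions to two principal components
pca <- prcomp(M, center = TRUE)
proj <- pca$x[, 1:2]

# Label only the outermost 20% of points to keep the plot readable
lab <- fringe >= 0.8
plot(proj, pch = 20, col = "grey60", xlab = "PC1", ylab = "PC2")
text(proj[lab, , drop = FALSE], labels = rownames(proj)[lab], cex = 0.7, pos = 3)
```

Lowering the threshold labels more nouns at the cost of a more crowded plot.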