SolrList-class {rsolr} | R Documentation
SolrList
Description
The SolrList object makes Solr data accessible through a list-like
interface. This interface is appropriate when the data are highly ragged.
Details
A SolrList is intended to behave much like a list. It provides the same
basic accessors (length, names, [, [<-, [[, [[<-, $, $<-, head, tail,
etc.) and can be coerced to a list via as.list. Supported types of data
manipulation include subset, transform, sort, xtabs, aggregate, unique,
summary, etc.
An obvious difference between a SolrList and an ordinary list is that we
know the SolrList contains only documents, which are themselves
represented as named lists of fields, usually vectors of length one. This
constraint enables us to provide the convenience of accessing fields by
slicing across every document. We can pass a field selection to the
second argument of [. As with a data frame, selecting a single column
with e.g. x[,"foo"] will return the field as a vector, filling in NAs
wherever a document lacks a value for the field.
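The slicing behaviour described above can be illustrated with ordinary R lists. This is a minimal sketch in base R, not rsolr's implementation: the documents, the field names, and the helper sliceField are all hypothetical, and the sketch assumes a numeric field.

```r
# Hypothetical ragged documents; the names and values are illustrative only.
docs <- list(
  a = list(name = "widget", price = 120),
  b = list(name = "gadget"),               # no price field
  c = list(name = "gizmo", price = 80)
)

# Hypothetical helper: extract one numeric field from every document,
# using NA wherever a document lacks the field.
sliceField <- function(docs, field) {
  vapply(docs, function(doc) {
    value <- doc[[field]]
    if (is.null(value)) NA_real_ else as.numeric(value)
  }, numeric(1))
}

prices <- sliceField(docs, "price")
print(prices)
```

The result is a vector with one element per document, named after the documents, which mirrors what x[,"foo"] is described as returning.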
The names are taken from the field declared in the schema to represent
the unique document key. Schemas are not strictly required to declare
such a field, so if there is no unique key, the names are NULL.
Field restrictions passed to e.g. [ or subset(fields=) may be specified
by name or by wildcard pattern (glob). Similarly, a row index passed to [
must be either a character vector of identifiers (of length <= 1024; NAs
are not supported, and this requires a unique key in the schema) or a
SolrPromise/SolrExpression. Note that if the promise or expression
evaluates to NAs, the corresponding rows are excluded from the result, as
with subset. Using a SolrPromise or SolrExpression is recommended, as
filtering happens at the database.
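To give a feel for what a wildcard (glob) field restriction selects, here is a small base R sketch. It only shows how a glob pattern can be matched against a set of field names with utils::glob2rx; the field names and the helper matchGlob are hypothetical, and rsolr may resolve such patterns differently (for example, on the Solr side).

```r
# Hypothetical field names, for illustration only.
fields <- c("id", "name_s", "price_f", "category_s")

# Hypothetical helper: keep the field names that match a glob pattern.
# glob2rx() converts a glob such as "*_s" into an anchored regular expression.
matchGlob <- function(pattern, fields) {
  fields[grepl(utils::glob2rx(pattern), fields)]
}

selected <- matchGlob("*_s", fields)
print(selected)
```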
A SolrList can be made lazy by calling defer on it, so that all column
retrieval, e.g., via [, returns a SolrPromise object. Many operations on
promises are deferred until they are finally fulfilled by being shown or
through explicit coercion to an R vector.
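The idea of deferring work until a promise is fulfilled can be sketched with closures in base R. This is an analogy only, not how SolrPromise is implemented: deferField, fulfill, and fetchPrices are hypothetical names, and the "retrieval" here is a local function rather than a Solr request.

```r
# Hypothetical constructor: wrap a computation without running it.
deferField <- function(compute) {
  structure(list(compute = compute), class = "toyPromise")
}

# Hypothetical fulfilment step: run the wrapped computation on demand.
fulfill <- function(promise) promise$compute()

# Count how often the stand-in retrieval actually runs.
calls <- 0
fetchPrices <- function() {
  calls <<- calls + 1
  c(120, 80)
}

promise <- deferField(fetchPrices)
callsBefore <- calls          # nothing has been retrieved yet
prices <- fulfill(promise)    # retrieval happens only here
print(prices)
```

Creating the promise does not trigger the retrieval; the values are computed only when fulfill is called, which loosely corresponds to showing a promise or coercing it to an R vector.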
A note for developers: SolrFrame and SolrList share common functionality
through the base Solr class. Much of the functionality mentioned here is
actually implemented as methods on the Solr class.
Accessors
These are some accessors that SolrList adds on top of the basic data
frame accessors. Most of these are for advanced use only.
- ndoc(x): Gets the number of documents (rows); serves as an abstraction over SolrFrame and SolrList.
- nfield(x): Gets the number of fields (columns); serves as an abstraction over SolrFrame and SolrList.
- ids(x): Gets the document unique identifiers (may be NULL, treated as rownames); serves as an abstraction over SolrFrame and SolrList.
- fieldNames(x, ...): Gets the name of each field represented by any document in the Solr core, with ... being passed down to fieldNames on SolrCore.
- core(x): Gets the SolrCore wrapped by x.
- query(x): Gets the query that is being constructed by x.
Extended API
Most of the typical data frame accessors and data manipulation functions
will work analogously on SolrList (see Details). Below, we list some of
the non-standard methods that might be seen as an extension of the data
frame API.
- rename(x, ...): Renames the columns of x, where the names and character values of ... indicate the mapping (newname = oldname).
- defer(x): Returns a SolrList that yields SolrPromise objects instead of vectors whenever a field is retrieved.
- searchDocs(x, q): Performs a conventional document search using the query string q. The main difference from filtering is that (by default) Solr will order the result by score, i.e., how well each document matches the query.
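The newname = oldname convention used by rename can be illustrated on a plain character vector of field names. This is a base R sketch of the mapping direction only; renameFields is a hypothetical helper and does not reflect how rsolr implements rename.

```r
# Hypothetical helper: apply a newname = oldname mapping to field names.
renameFields <- function(fields, ...) {
  mapping <- c(...)                 # names are new names, values are old names
  idx <- match(mapping, fields)     # positions of the old names
  stopifnot(!anyNA(idx))            # every old name must exist
  fields[idx] <- names(mapping)
  fields
}

renamed <- renameFields(c("id", "price", "name"),
                        cost = "price", title = "name")
print(renamed)
```

Here "price" becomes "cost" and "name" becomes "title", while "id" is left untouched.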
Constructor
- SolrList(uri, ...): Constructs a new SolrList instance, representing a Solr core located at uri, which should be a string or a RestUri object. The ... are passed to the SolrQuery constructor.
Evaluation
- eval(expr, envir, enclos): Evaluates the R language expr in the SolrList envir, using enclos as the enclosing environment.
Coercion
- as.data.frame(x, row.names=NULL, optional=FALSE, fill=FALSE): Downloads the data into an actual data.frame, specifically an instance of DocDataFrame. If fill is FALSE, only the fields represented in at least one document are added as columns.
- as.list(x), as(x, "DocCollection"): Coerces x into the corresponding list, specifically an instance of DocList.
Author(s)
Michael Lawrence
See Also
SolrFrame for representing a Solr collection as a table instead of a list
Examples
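library(rsolr)

# Start a test Solr instance and wrap its core in a SolrList
solr <- TestSolr()
sr <- SolrList(solr$uri)
# Number of documents, and the first few of them
length(sr)
head(sr)
# Retrieve a single document by name
sr[["GB18030TEST"]]
# Solr tends to crash for some reason running this inside R CMD check
## Not run:
as.list(subset(sr, price > 100))[,"price"]
## End(Not run)
# Shut down the test Solr instance
solr$kill()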