cr_works {rcrossref} | R Documentation |
Search CrossRef works (articles)
Description
Search CrossRef works (articles)
Usage
cr_works(
  dois = NULL,
  query = NULL,
  filter = NULL,
  offset = NULL,
  limit = NULL,
  sample = NULL,
  sort = NULL,
  order = NULL,
  facet = FALSE,
  cursor = NULL,
  cursor_max = 5000,
  .progress = "none",
  flq = NULL,
  select = NULL,
  async = FALSE,
  ...
)
cr_works_(
  dois = NULL,
  query = NULL,
  filter = NULL,
  offset = NULL,
  limit = NULL,
  sample = NULL,
  sort = NULL,
  order = NULL,
  facet = FALSE,
  cursor = NULL,
  cursor_max = 5000,
  .progress = "none",
  parse = FALSE,
  flq = NULL,
  select = NULL,
  async = FALSE,
  ...
)
Arguments
dois |
Search by a single DOI or many DOIs. Note that using this
parameter at the same time as the |
query |
Query terms |
filter |
Filter options. See examples for usage examples
and |
offset |
Record number to start at. Minimum: 1. For
|
limit |
Number of results to return in the query. Not relevant when searching with specific DOIs. Default: 20. Max: 1000 |
sample |
(integer) Number of random results to return. When you use
the sample parameter, the limit (rows in the Crossref API) and offset parameters are ignored.
Ignored unless |
sort |
Field to sort on. Acceptable set of fields to sort on:
|
order |
(character) Sort order, one of 'asc' or 'desc' |
facet |
(logical) Include facet results. Boolean or string with
field to facet on. Valid fields are *, affiliation, funder-name,
funder-doi, orcid, container-title, assertion, archive, update-type,
issn, published, source, type-name, publisher-name, license,
category-name, assertion-group. Default: |
cursor |
(character) Cursor character string to do deep paging.
Default: NULL. Pass in '*' to start deep paging. Any combination of
query, filters and facets may be used with deep paging cursors.
While the |
cursor_max |
(integer) Max records to retrieve. Only used when
cursor param used. Because deep paging can result in continuous requests
until all are retrieved, use this parameter to set a maximum number of
records. If fewer records are found than this value,
you will get only those found. When cursor pagination is being used
the |
.progress |
Show a |
flq |
One or more field queries. The acceptable field query parameters are:
Note: |
select |
(character) One or more fields to return (only those fields are returned) |
async |
(logical) use async HTTP requests. Default: |
... |
Named parameters passed on to |
parse |
(logical) Whether to output json |
Deep paging (using the cursor)
When using the cursor, a character string called next-cursor is returned from Crossref, which we use to make the next request, and so on. We use a while loop to retrieve results up to the value of cursor_max. Since we make each request for you, you may not need the next-cursor string, but if you do want it, you can get it by indexing into the result like x$meta$next_cursor.
Note that you can pass in curl options when using cursor, via "...".
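The loop described above can be sketched roughly as follows. This is only an illustration of the paging pattern, not the code rcrossref actually runs: fake_fetch() and page_with_cursor() are hypothetical names, and fake_fetch() stands in for a real Crossref request by serving integers from a pretend result set of 250 records.

```r
# Hypothetical stand-in for one Crossref request (not part of rcrossref):
# returns one page of items plus the cursor to use for the next request.
fake_fetch <- function(cursor, limit, total = 250L) {
  start <- if (identical(cursor, "*")) 1L else as.integer(cursor)
  if (start > total) {
    return(list(items = integer(0), next_cursor = cursor))
  }
  end <- min(start + limit - 1L, total)
  list(items = seq.int(start, end), next_cursor = as.character(end + 1L))
}

# Keep requesting pages until cursor_max records are collected or a page
# comes back empty (assumed here to mean that no records remain).
page_with_cursor <- function(fetch, limit = 100L, cursor_max = 300L) {
  cursor <- "*"
  items <- integer(0)
  while (length(items) < cursor_max) {
    page <- fetch(cursor, limit)
    if (length(page$items) == 0) break
    items <- c(items, page$items)
    cursor <- page$next_cursor
  }
  # Trim in case the last page overshot cursor_max
  list(data = head(items, cursor_max), meta = list(next_cursor = cursor))
}

x <- page_with_cursor(fake_fetch, limit = 100L, cursor_max = 300L)
length(x$data)       # number of records collected
x$meta$next_cursor   # last cursor seen, analogous to x$meta$next_cursor above
```

Because the pretend result set holds fewer records than cursor_max, the loop stops when a page comes back empty, which mirrors the behaviour described for cursor_max in the Arguments section.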
Beware
The API will only work for CrossRef DOIs.
Functions
cr_works(): Does data request and parses to data.frame for easy downstream consumption
cr_works_(): Does data request, and gives back json (default) or lists, with no attempt to parse to data.frame
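To make the difference concrete, the sketch below parses a JSON string by hand, roughly the kind of work that cr_works() spares you. The JSON is a hand-written, heavily trimmed stand-in for what cr_works_() might return; the DOIs and titles are made up, and the real response contains many more fields.

```r
library(jsonlite)

# Hand-written stand-in for a JSON string from cr_works_().
# DOIs and titles are invented for illustration only.
raw_json <- '{
  "status": "ok",
  "message": {
    "total-results": 2,
    "items": [
      {"DOI": "10.1234/example.1", "title": ["First example"]},
      {"DOI": "10.1234/example.2", "title": ["Second example"]}
    ]
  }
}'

# Parse to nested lists, then pull out the fields of interest by hand
parsed <- fromJSON(raw_json, simplifyVector = FALSE)
items <- parsed$message$items
df <- data.frame(
  doi = vapply(items, function(z) z$DOI, ""),
  title = vapply(items, function(z) z$title[[1]], ""),
  stringsAsFactors = FALSE
)
df
```

If you only need a data.frame, cr_works() is the simpler choice; cr_works_() is useful when you want the raw JSON or your own parsing.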
Explanation of some data fields
score: a term frequency, inverse document frequency score that comes from the Crossref Solr backend, based on bibliographic metadata fields title, publication title, authors, ISSN, publisher, and date of publication.
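As a rough intuition for what a term frequency, inverse document frequency score measures, here is a toy textbook version. It is an assumption-laden illustration only: the exact formula used by the Crossref backend is not described here, and the tf_idf() helper and the three tiny "documents" are made up.

```r
# Three made-up documents, each a vector of words
docs <- list(
  c("ecology", "of", "forest", "birds"),
  c("forest", "fire", "ecology"),
  c("bird", "song", "learning")
)

# Textbook tf-idf: frequency of the term within a document, weighted by
# how rare the term is across all documents
tf_idf <- function(term, doc, docs) {
  n_containing <- sum(vapply(docs, function(d) term %in% d, logical(1)))
  if (n_containing == 0) return(0)
  tf <- sum(doc == term) / length(doc)
  idf <- log(length(docs) / n_containing)
  tf * idf
}

scores <- vapply(docs, function(d) tf_idf("ecology", d, docs), numeric(1))
scores
```

Documents that do not contain the term score zero, and a shorter document containing the term scores higher than a longer one with the same number of matches.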
Note
See the "Rate limiting" section in rcrossref to get into the "fast lane".
References
https://github.com/CrossRef/rest-api-doc
See Also
Other crossref:
cr_funders(), cr_journals(), cr_licenses(), cr_members(), cr_prefixes(), cr_types()
Examples
## Not run:
# Works funded by the NSF
cr_works(query="NSF")
# Works that include renear but not ontologies
cr_works(query="renear+-ontologies")
# Filter
cr_works(query="global state", filter=c(has_orcid=TRUE), limit=3)
# Filter by multiple fields
cr_works(filter=c(has_orcid=TRUE, from_pub_date='2004-04-04'))
# Only full text articles
cr_works(filter=c(has_full_text = TRUE))
# has affiliation data
cr_works(filter=c(has_affiliation = TRUE))
# has abstract
cr_works(filter=c(has_abstract = TRUE))
# has clinical trial number
cr_works(filter=c(has_clinical_trial_number = TRUE))
# Querying dois
cr_works(dois='10.1063/1.3593378')
cr_works('10.1371/journal.pone.0033693')
cr_works(dois='10.1007/12080.1874-1746')
cr_works(dois=c('10.1007/12080.1874-1746','10.1007/10452.1573-5125',
  '10.1111/(issn)1442-9993'))
# progress bar
cr_works(dois=c('10.1007/12080.1874-1746','10.1007/10452.1573-5125'),
  .progress="text")
# Include faceting in results
cr_works(query="NSF", facet=TRUE)
## Get facets only, by setting limit=0
cr_works(query="NSF", facet=TRUE, limit=0)
## you can also set facet to a query
cr_works(facet = "license:*", limit=0)
# Sort results
cr_works(query="ecology", sort='relevance', order="asc")
res <- cr_works(query="ecology", sort='score', order="asc")
res$data$score
cr_works(query="ecology", sort='published')
x <- cr_works(query="ecology", sort='published-print')
x <- cr_works(query="ecology", sort='published-online')
# Get a random sample of results
cr_works(sample=1)
cr_works(sample=10)
# You can pass in dot separated fields to filter on specific fields
cr_works(filter=c(award.number='CBET-0756451',
  award.funder='10.13039/100000001'))
# Use the cursor for deep paging
cr_works(query="NSF", cursor = "*", cursor_max = 300, limit = 100)
cr_works(query="NSF", cursor = "*", cursor_max = 300, limit = 100,
  facet = TRUE)
## with optional progress bar
x <- cr_works(query="NSF", cursor = "*", cursor_max = 1200, limit = 200,
  .progress = TRUE)
# Low level function - does no parsing to data.frame, get json or a list
cr_works_(query = "NSF")
cr_works_(query = "NSF", parse=TRUE)
cr_works_(query="NSF", cursor = "*", cursor_max = 300, limit = 100)
cr_works_(query="NSF", cursor = "*", cursor_max = 300, limit = 100,
  parse=TRUE)
# field queries
## query.author
res <- cr_works(query = "ecology", flq = c(query.author = 'Boettiger'))
## query.container-title
res <- cr_works(query = "ecology",
  flq = c(`query.container-title` = 'Ecology'))
## query.author and query.bibliographic
res <- cr_works(query = "ecology",
  flq = c(query.author = 'Smith', query.bibliographic = 'cell'))
# select only certain fields to return
res <- cr_works(query = "NSF", select = c('DOI', 'title'))
names(res$data)
# async
queries <- c("ecology", "science", "cellular", "birds", "European",
  "bears", "beets", "laughter", "happiness", "funding")
res <- cr_works(query = queries, async = TRUE)
res_json <- cr_works_(query = queries, async = TRUE)
unname(vapply(res_json, class, ""))
jsonlite::fromJSON(res_json[[1]])
queries <- c("ecology", "science", "cellular")
res <- cr_works(query = queries, async = TRUE, verbose = TRUE)
res
# time
queries <- c("ecology", "science", "cellular", "birds", "European",
  "bears", "beets", "laughter", "happiness", "funding")
system.time(cr_works(query = queries, async = TRUE))
system.time(lapply(queries, function(z) cr_works(query = z)))
## End(Not run)