w_Wikipedias {wikiTools} | R Documentation |
Get Wikipedia pages of Wikidata entities
Description
Get from Wikidata the Wikipedia page titles and URLs of the Wikidata entities
in entity_list. If the parameter wikilangs="", the function returns the
Wikipedia page titles in all languages; otherwise it returns only those in the
languages listed in wikilangs. The returned data-frame also includes the
Wikidata entity classes of which each searched entity is an instance. If the
parameter instanceof is set, the function returns only the pages for Wikidata
entities that are instances of the Wikidata class indicated in it. The
data-frame does not return labels or descriptions of the entities: the function
w_LabelDesc can be used for this. Duplicated entities are removed before the
search. The index of the returned data-frame is also set to entity_list.
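The input handling described above can be pictured with a short sketch. It is an illustration in base R only, not the package's internal code; the vector of entities and the variable names are assumptions made for the example.

```r
# Hypothetical input with a repeated entity (illustrative values only).
entity_list <- c("Q517", "Q5", "Q517", "Q142")
wikilangs <- "es|en|fr"

# Drop duplicated entities, keeping the order of first appearance.
entities <- unique(entity_list)

# Split the language string on the literal "|" character.
# fixed = TRUE is needed because "|" means alternation in a regular expression.
langs <- if (nzchar(wikilangs)) {
  strsplit(wikilangs, "|", fixed = TRUE)[[1]]
} else {
  character(0)  # empty string: no language restriction
}

print(entities)
print(langs)
```

Keeping the languages as an ordered vector mirrors the documented behaviour that page titles come back in the same order as the languages in wikilangs.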
Usage
w_Wikipedias(
entity_list,
wikilangs = "",
instanceof = "",
nlimit = 1500,
debug = FALSE
)
Arguments
entity_list | A vector of Wikidata entities. |
wikilangs | List of languages to limit the search, using "|" as separator. Wikipedia page titles are returned in the same order as the languages in this parameter. If wikilangs="", the function returns Wikipedia page titles in any language, not sorted. |
instanceof | Wikidata entity class used to limit the result to the instances of that class. For example, instanceof='Q5' limits the results to "human". |
nlimit | Number of entities requested in each query. If the number of entities exceeds this value, the request is split into chunked queries of at most nlimit entities each. Reduce the default value if an error is raised. |
debug | For debugging purposes (default FALSE). If debug='info', information about chunked queries is shown. If debug='query', the query launched is also shown. |
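To make the role of nlimit concrete, the following sketch shows one way a long vector of entities could be split into chunks of at most nlimit elements. The helper chunk_entities is hypothetical and is not part of wikiTools; the package may chunk its queries differently.

```r
# Hypothetical helper: split a vector of entities into chunks of at most
# `nlimit` elements. Not the package's implementation.
chunk_entities <- function(entity_list, nlimit = 1500) {
  entity_list <- unique(entity_list)                  # duplicates are dropped first
  groups <- ceiling(seq_along(entity_list) / nlimit)  # 1,1,...,2,2,...
  unname(split(entity_list, groups))
}

# Illustrative identifiers, roughly the size used in the Examples section.
ids <- paste0("Q", 1:3600)
chunks <- chunk_entities(ids, nlimit = 1500)

print(length(chunks))   # number of chunked queries that would be needed
print(lengths(chunks))  # entities per chunk
```

With 3600 entities and the default nlimit, this yields three chunks; lowering nlimit produces more, smaller requests, which is the adjustment suggested when an error is raised.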
Value
A data-frame with five columns: entities, instanceof, npages, page titles and page URLs. The last three use "|" as separator. The index of the data-frame is also set to entity_list.
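Because several columns pack multiple values into one "|"-separated string, a caller will usually split them before use. The sketch below works on a mock data-frame: the column names titles and urls, the placeholder titles and the example.org URLs are all assumptions, not the names or values actually returned by w_Wikipedias.

```r
# Mock result with placeholder values; column names are hypothetical.
w <- data.frame(
  entity = c("Q1", "Q2"),
  titles = c("Title A (en)|Title A (fr)", "Title B (en)"),
  urls   = c("https://example.org/en/A|https://example.org/fr/A",
             "https://example.org/en/B"),
  stringsAsFactors = FALSE
)
rownames(w) <- w$entity  # the index mirrors the entity list

# Split each "|"-separated string into a character vector.
titles <- strsplit(w$titles, "|", fixed = TRUE)
names(titles) <- rownames(w)

print(titles[["Q1"]])   # all page titles for one entity
print(lengths(titles))  # number of titles per entity
print(w["Q2", "urls"])  # direct lookup by entity through the index
```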
Author(s)
Angel Zazo, Department of Computer Science and Automatics, University of Salamanca
Examples
## Not run:
# Auxiliary step: get a vector of entities (l).
df <- w_SearchByLabel(string='Napoleon', langsorder='en', mode='inlabel')
l <- df$entity  # approx. 3600 entities
# Wikipedia pages in any language:
w <- w_Wikipedias(entity_list=l, debug='info')
# Only Spanish, English and French pages, in that order:
w <- w_Wikipedias(entity_list=l, wikilangs='es|en|fr', debug='info')
# Filter instanceof=Q5 (human) after the query:
w_Q5 <- w[grepl("\\bQ5\\b", w$instanceof), ]
# Or restrict to instances of Q5 in the query itself:
w_Q5b <- w_Wikipedias(entity_list=l, wikilangs='es|en|fr', instanceof='Q5', debug='info')
## End(Not run)
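The filter in the example uses word boundaries (\\b) around Q5. The sketch below, on mock instanceof values chosen only for illustration, shows why: a plain substring match would also select classes whose identifier merely starts with Q5.

```r
# Mock instanceof values (illustrative only); several classes are "|"-separated.
w <- data.frame(
  instanceof = c("Q5", "Q515|Q1549591", "Q4167410|Q5", "Q50"),
  row.names = c("a", "b", "c", "d"),
  stringsAsFactors = FALSE
)

# Substring match: also hits "Q515" and "Q50".
loose <- grepl("Q5", w$instanceof)

# Word-boundary match: hits only the exact identifier Q5.
strict <- grepl("\\bQ5\\b", w$instanceof)

print(rownames(w)[loose])
print(rownames(w)[strict])
```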