can_fetch {spiderbar} | R Documentation |
Test URL paths against a robxp robots.txt object
Description
Given a character vector of URL paths and an optional user agent, this function returns a logical vector indicating whether you have permission to fetch the content at each respective path.
Usage
can_fetch(obj, path = "/", user_agent = "*")
Arguments
obj | a robxp object, as created by robxp() |
path | character vector of URL paths to test (default "/") |
user_agent | user agent to test (default "*") |
Value
A logical vector, one element per path, indicating whether you have permission to fetch the content.
Examples
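# Read the GitHub robots.txt bundled with the package into a single string
gh <- paste0(readLines(system.file("extdata", "github-robots.txt",
                                   package = "spiderbar")), collapse = "\n")
# Parse the text into a robxp object
gh_rt <- robxp(gh)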
can_fetch(gh_rt, "/humans.txt", "*") # TRUE
can_fetch(gh_rt, "/login", "*") # FALSE
can_fetch(gh_rt, "/oembed", "CCBot") # FALSE
can_fetch(gh_rt, c("/humans.txt", "/login", "/oembed")) # one result per path; user_agent defaults to "*"
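
Because the result is a logical vector aligned with `path`, it can be used directly to subset the paths that were tested. The following is a minimal sketch rather than part of the package's own examples: the inline rules and the paths are hypothetical, chosen only for illustration, and it assumes `robxp()` accepts a single string holding a complete robots.txt file, as in the example above.

```r
library(spiderbar)

# Hypothetical robots.txt rules, written inline for illustration only
rules <- paste(
  "User-agent: *",
  "Disallow: /private/",
  sep = "\n"
)
rt <- robxp(rules)

# Hypothetical paths to check
paths <- c("/index.html", "/private/report.html", "/about")

# One logical value per path, tested against the default user agent "*"
allowed <- can_fetch(rt, paths)

# Keep only the paths the rules permit
paths[allowed]
```

Passing a specific `user_agent` instead of the default would test the same paths against the rules that apply to that agent.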
[Package spiderbar version 0.2.5 Index]