RobotParser {Rcrawler} | R Documentation |
RobotParser: fetch and parse robots.txt
Description
This function fetches and parses the robots.txt file of the website specified in the first argument and returns a list of the corresponding rules.
Usage
RobotParser(website, useragent)
Arguments
website
character, the URL of the website whose robots.txt rules are to be extracted.
useragent
character, the user agent of the crawler.
Value
Returns a list of three elements: the first is a character vector of the Disallowed directories, and the third is a Boolean value which is TRUE if the crawler's user agent is blocked by the site's robots.txt.
Examples
#RobotParser("http://www.glofile.com","AgentX")
#Returns the robots.txt rules and checks whether AgentX is blocked or not.
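A minimal usage sketch, following the Value description above: element positions (first = Disallowed directories, third = blocked flag) are taken from that description, the URL and user agent are illustrative, and the call needs network access, so the package check guards it.
# Not run automatically: requires network access to fetch robots.txt.
if (requireNamespace("Rcrawler", quietly = TRUE)) {
  library(Rcrawler)

  # Fetch and parse the site's robots.txt for the crawler "AgentX".
  rules <- RobotParser("http://www.glofile.com", "AgentX")

  # Per the Value section: first element holds the Disallowed directories,
  # third element is TRUE when the user agent is blocked.
  disallowed <- rules[[1]]
  blocked    <- rules[[3]]

  if (isTRUE(blocked)) {
    message("AgentX is blocked by this site's robots.txt.")
  } else {
    message("AgentX may crawl; ", length(disallowed), " Disallow rule(s) found.")
  }
}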
[Package Rcrawler version 0.1.9-1 Index]