RobotParser {Rcrawler} | R Documentation |
RobotParser: fetch and parse robots.txt
Description
This function fetches and parses the robots.txt file of the website specified in the first argument and returns a list of the corresponding rules.
Usage
RobotParser(website, useragent)
Arguments
website
character, the URL of the website whose robots.txt rules are to be extracted.
useragent
character, the user agent of the crawler.
Value
Returns a list of three elements: the first is a character vector of the Disallowed directories, and the third is a Boolean value which is TRUE if the crawler's user agent is blocked by the site's robots.txt.
Examples
#RobotParser("http://www.glofile.com","AgentX")
#Returns the robots.txt rules and checks whether AgentX is blocked or not.
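A minimal usage sketch, following the Value description above: element positions (first = Disallowed directories, third = blocked flag) are taken from that description, the URL and user agent are illustrative, and the call needs network access, so the package check guards it.
# Not run automatically: requires network access to fetch robots.txt.
if (requireNamespace("Rcrawler", quietly = TRUE)) {
  library(Rcrawler)

  # Fetch and parse the site's robots.txt for the crawler "AgentX".
  rules <- RobotParser("http://www.glofile.com", "AgentX")

  # Per the Value section: first element holds the Disallowed directories,
  # third element is TRUE when the user agent is blocked.
  disallowed <- rules[[1]]
  blocked    <- rules[[3]]

  if (isTRUE(blocked)) {
    message("AgentX is blocked by this site's robots.txt.")
  } else {
    message("AgentX may crawl; ", length(disallowed), " Disallow rule(s) found.")
  }
}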
[Package Rcrawler version 0.1.9-1 Index]