spiderbar {spiderbar}    R Documentation

Parse and Test Robots Exclusion Protocol Files and Rules


The 'Robots Exclusion Protocol' (https://www.robotstxt.org/orig.html) documents a set of standards for allowing or excluding robot/spider crawling of different areas of site content. Tools are provided which wrap the 'rep-cpp' (https://github.com/seomoz/rep-cpp) C++ library for processing these 'robots.txt' files.
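
As a minimal sketch of how parsing and testing might look in practice, the example below builds a small 'robots.txt' body in memory and checks two paths against it. The rules and paths are hypothetical, made up for illustration; robxp() and can_fetch() are the package functions assumed here for parsing and testing.

```r
library(spiderbar)

# Hypothetical robots.txt content, made up for illustration only.
rules <- paste(
  c(
    "User-agent: *",
    "Disallow: /private/"
  ),
  collapse = "\n"
)

# Parse the text into a robots.txt object.
rt <- robxp(rules)

# Test individual paths against the parsed rules for a given user agent.
allowed <- can_fetch(rt, path = "/index.html", user_agent = "*")
blocked <- can_fetch(rt, path = "/private/data.html", user_agent = "*")

print(allowed)
print(blocked)
```

Parsing once and reusing the resulting object keeps repeated path checks cheap, which matters when a crawler tests many URLs against the same site.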


Bob Rudis (bob@rud.is)

[Package spiderbar version 0.2.5 Index]