scrapeR {scrapeR} | R Documentation
Web Page Content Scraper
Description
The scrapeR function fetches a web page and extracts its text content. It checks the response for HTTP errors and parses the returned HTML.
Usage
scrapeR(url)
Arguments
url    A character string specifying the URL of the web page to be scraped.
Details
The function uses tryCatch to handle potential web scraping errors. It fetches the webpage content, checks for HTTP errors, and then parses the HTML content to extract text. The text from different HTML nodes, such as headings and paragraphs, is combined into a single string.
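The steps described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the package's actual source: the names scrape_sketch and extract_text are hypothetical, the CSS selector "h1, h2, h3, p" is a guess at which "headings and paragraphs" are collected, and joining the pieces with a single space is likewise assumed.

```r
library(httr)
library(rvest)

# Hypothetical helper: collect text from heading and paragraph nodes
# of an HTML string and join it into one string.
extract_text <- function(html) {
  page <- read_html(html)
  nodes <- html_nodes(page, "h1, h2, h3, p")  # selector is an assumption
  paste(html_text(nodes, trim = TRUE), collapse = " ")
}

# Hypothetical stand-in for scrapeR(): fetch, check for HTTP errors, parse.
scrape_sketch <- function(url) {
  tryCatch({
    resp <- GET(url)
    if (http_error(resp)) {
      NA_character_  # HTTP status of 400 or above
    } else {
      extract_text(content(resp, as = "text", encoding = "UTF-8"))
    }
  }, error = function(e) NA_character_)  # network or parsing failure
}
```

Splitting the parsing step into its own helper makes it possible to try the extraction on an inline HTML string without any network access, for example extract_text("<h1>Title</h1><p>Body</p>").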
Value
A character string containing the combined text from the specified HTML nodes of the web page. Returns NA if an error occurs or if the page content is not accessible.
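Because a failed request yields NA rather than an error, callers may want to check the result before using it. The sketch below shows one way to do so; has_content is a hypothetical helper, not part of the package, and it assumes a length-one result. Treating an empty string as "no content" is also an assumption.

```r
# Hypothetical helper: TRUE only when the result is a non-missing,
# non-empty string (assumes x has length one).
has_content <- function(x) {
  !is.na(x) && nzchar(x)
}

# Stand-in for a value returned by scrapeR(); NA mimics a failed request.
scraped_text <- NA

if (has_content(scraped_text)) {
  cat(substr(scraped_text, 1, 80), "\n")
} else {
  message("No text was retrieved.")
}
```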
Note
This function requires the httr and rvest packages. Ensure that these dependencies are installed and loaded in your R environment.
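One way to check for these dependencies before calling the function is sketched below. The variable names are illustrative only, and the snippet reports missing packages rather than installing them.

```r
# Packages named in the note above.
required <- c("httr", "rvest")

# requireNamespace() returns FALSE (without raising an error) when a
# package is not installed.
installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
not_installed <- required[!installed]

if (length(not_installed) > 0) {
  message("Missing packages: ", paste(not_installed, collapse = ", "),
          ". Install them with install.packages().")
}
```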
Author(s)
Mathieu Dubeau, Ph.D.
References
Refer to the rvest package documentation for underlying HTML parsing and extraction methods.
See Also
GET, read_html, html_nodes, html_text
Examples
# Scrape the text of a page; the result is NA if the request fails
url <- "http://www.example.com"
scraped_text <- scrapeR(url)