scrapeToCsv {CHCN}    R Documentation

A function to scrape files to local csv files


This function uses the monthly list of stations and downloads each station's file to a local directory. There are 7676 files as of July 2011. The function throws warnings about wrong file sizes. These can be ignored, or suppressed by setting warning options.
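As a minimal sketch of suppressing such warnings in base R: noisyDownload below is a hypothetical stand-in for a download call and is not part of the package. Only suppressWarnings and options(warn = ) are real base R calls.

```r
# Hypothetical stand-in for a download call that warns about a file size mismatch.
noisyDownload <- function() {
  warning("downloaded length differs from reported length")
  "finished"
}

# Option 1: silence warnings for a single call.
result <- suppressWarnings(noisyDownload())

# Option 2: silence warnings globally, then restore the previous setting.
old <- options(warn = -1)
result2 <- noisyDownload()
options(old)
```

Wrapping a single call keeps warnings from other code visible, so it is usually the safer of the two.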


scrapeToCsv(Stations, get = seq(from = 1, to = 1e+05), directory = "EnvCanada")



Stations: A data structure returned from readMonthlyStations. If the monthly station file already exists, it can simply be read from disk with read.csv.


get: A sequence of numbers used to index the monthly station list. It defaults to 1:100000, which results in the function trying to download all 7676 files from Environment Canada. Alternatively, one can download the files in chunks, for example by setting get to 1:1000 or any other sequence of numbers. Internal checking ensures that the sequence sought is available for download. Irregular sequences are also supported: get = c(23, 65, 257, 7000) would get those elements from the list of stations in monthly.env.csv.
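The internal check is not shown on this page. One way such a check could work is to clip the requested indices to the rows that exist in the station list. The sketch below assumes that behaviour; clipGet and the small stations data frame are hypothetical and not part of the package.

```r
# Hypothetical stand-in for the monthly station list (the real one has 7676 rows).
stations <- data.frame(WebId = 1001:1010)

# Hypothetical helper: keep only the indices that exist in the station list.
clipGet <- function(get, n) {
  get <- unique(as.integer(get))
  get[get >= 1 & get <= n]
}

all_rows  <- clipGet(seq(from = 1, to = 1e+05), nrow(stations))  # rows 1 to 10
some_rows <- clipGet(c(2, 5, 700), nrow(stations))               # 2 and 5; 700 is out of range
```

Under this assumption the default sequence simply selects every station, which matches the documented behaviour of trying to download all files.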


directory: The local directory to write the csv files to. Defaults to "EnvCanada".


When createMonthlyStations is executed, the master list is parsed and only those stations that report monthly are copied into a file. The file contains a web Id that is used when downloading. To scrape the files in the monthly data structure, you call scrapeToCsv and provide a sequence of stations you want to download. The download will occasionally fail because of server timeouts. By using the function getMissingScrapes you can determine which files are missing from the directory. So if you try to download all 7676 files and the server times out after 2365 files, the function getMissingScrapes will provide a sequence of files to be downloaded to complete your scrape.
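The idea behind recovering from a timeout can be sketched in base R without calling the package. The file naming scheme ("<WebId>.csv"), the small stations data frame, and the temporary directory below are assumptions for illustration only; getMissingScrapes may work differently.

```r
# Hypothetical station list; the real one comes from the monthly station file.
stations <- data.frame(WebId = c(101, 102, 103, 104, 105))

# Simulate an interrupted scrape in a fresh temporary directory:
# only the first three files made it to disk.
directory <- tempfile("EnvCanada")
dir.create(directory)
# Assumed naming scheme "<WebId>.csv"; the package may name files differently.
for (id in stations$WebId[1:3]) {
  writeLines("placeholder", file.path(directory, paste0(id, ".csv")))
}

# Compare the expected file names with what is on disk.
expected   <- paste0(stations$WebId, ".csv")
missingIdx <- which(!(expected %in% list.files(directory)))
missingIdx  # indices that could be passed as 'get' on the next attempt
```

The resulting index vector has the same form as the get argument, so a follow-up call only needs to request the files that are still missing.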


The function downloads files according to the sequence of values in the "get" parameter.


Steven Mosher

See Also



## Not run: 
   Stations <- writeMonthlyStations()

## End(Not run)

[Package CHCN version 1.5 Index]