indexFTP {rdwd} | R Documentation |
Create a recursive index of an FTP Server
Description
Create a list of all the files (in all subfolders) of an FTP server.
Defaults to the German Weather Service (DWD, Deutscher WetterDienst) OpenData server at
https://opendata.dwd.de/climate_environment/CDC/observations_germany/climate/.
The R package RCurl must be available to do this.
Indexing all folders at once is not recommended, as it can take quite some time
and you may get kicked off the FTP server. This package contains an index
of the climatic observations at weather stations (fileIndex)
and gridded datasets (gridIndex).
If they are out of date, please let me know!
Getting banned from the FTP Server
Normally, this shouldn't happen anymore: since version 0.10.10 (2018-11-26),
a single RCurl handle is used for all FTP requests, and since version 1.0.17 (2019-05-14),
the file tree provided by the DWD is used to obtain all folders first,
eliminating the recursive calls.
There's a provision in case the FTP server still detects bot requests and denies access.
If RCurl::getURL() fails, there will still be an output,
which you can pass in a second run via folder to index the remaining directories.
You might need to wait a bit and set sleep to a higher value in that case.
Here's an example:
gridindex <- indexFTP("", gridbase)
gridindex <- indexFTP(gridindex, gridbase, sleep=15)
Of course, with a higher sleep value, the execution will take longer!
Usage
indexFTP(
folder = "currentfindex",
base = dwdbase,
is.file.if.has.dot = TRUE,
exclude.latest.bin = TRUE,
fast = TRUE,
sleep = 0,
dir = "DWDdata",
filename = folder[1],
overwrite = FALSE,
quiet = rdwdquiet(),
progbar = !quiet,
verbose = FALSE
)
Arguments
folder |
Folder(s) to be indexed recursively, e.g. "/hourly/wind/".
Leading slashes will be removed.
Use |
base |
Main directory of FTP server. Trailing slashes will be removed.
DEFAULT: |
is.file.if.has.dot |
Logical: if some of the input paths contain a dot, treat those as files, i.e. do not try to read those as if they were a folder. Only set this to FALSE if you know what you're doing. DEFAULT: TRUE |
exclude.latest.bin |
Exclude latest file at opendata.dwd.de/weather/radar/radolan? RCurl::getURL indicates this is a pointer to the last regularly named file. DEFAULT: TRUE |
fast |
Read tree file with |
sleep |
If not 0, a random number of seconds between 0 and |
dir |
Name of a writable directory in which to save the downloaded file.
Created if it does not exist.
DEFAULT: "DWDdata" at current |
filename |
Character: Part of output filename. "INDEX_of_DWD_" is prepended, "/" replaced with "_", ".txt" appended. DEFAULT: folder[1] |
overwrite |
Logical: Overwrite existing file? If not, "_n" is added to the
filename, see |
quiet |
Logical: suppress progress bars and the message about directory/files?
DEFAULT: FALSE through |
progbar |
Logical: present a progress bar in each level? DEFAULT: TRUE |
verbose |
Logical: write a lot of messages from |
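The path and file name rules described for folder, base, is.file.if.has.dot and filename can be sketched in base R as follows. This is a minimal illustration of the rules as worded above, not the package's actual code: all helper names are hypothetical, the base URL is a placeholder, and applying the dot check to the whole path is an assumption.

```r
# Illustrative helpers only; none of these are rdwd functions.

# folder: leading slashes are removed. base: trailing slashes are removed.
strip_leading_slash  <- function(x) sub("^/+", "", x)
strip_trailing_slash <- function(x) sub("/+$", "", x)

# is.file.if.has.dot: a path containing a dot is treated as a file.
# Assumption: the dot may appear anywhere in the path.
looks_like_file <- function(path) grepl(".", path, fixed = TRUE)

# filename: "INDEX_of_DWD_" prepended, "/" replaced with "_", ".txt" appended.
index_filename <- function(filename) {
  paste0("INDEX_of_DWD_", gsub("/", "_", filename, fixed = TRUE), ".txt")
}

folder <- strip_leading_slash("/hourly/wind/")
base   <- strip_trailing_slash("https://example.org/data/")  # placeholder URL
full   <- paste(base, folder, sep = "/")

is_file <- looks_like_file(c("hourly/wind", "hourly/wind/readme.pdf"))
outname <- index_filename("daily/solar")
```

Here, folder becomes "hourly/wind/", only the second path is flagged as a file, and outname is "INDEX_of_DWD_daily_solar.txt".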
Value
A character vector of file paths.
Author(s)
Berry Boessenkool, berry-b@gmx.de, Oct 2016
See Also
createIndex(), updateIndexes(), website index chapter
Examples
## Not run: ## Needs internet connection
sol <- indexFTP(folder="/daily/solar", dir=tempdir())
head(sol)
# mon <- indexFTP(folder="/monthly/kl", dir=tempdir(), verbose=TRUE)
## End(Not run)
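Since the result is a vector of file paths, it can be filtered with ordinary base R tools. The following offline sketch uses made-up paths in place of real indexFTP() output; the file names are purely hypothetical and do not reflect actual server content.

```r
# Made-up paths standing in for the output of indexFTP(); not real server content.
paths <- c("daily/solar/station_list.txt",
           "daily/solar/data_00001.zip",
           "daily/solar/data_00002.zip",
           "daily/solar/readme.pdf")

# Keep only the zip archives.
zips <- paths[grepl("\\.zip$", paths)]

# Count the files per extension.
ext_count <- table(tools::file_ext(paths))
```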