read_mnist {dslabs} | R Documentation |
Download and read the MNIST dataset
Description
This function downloads the MNIST training and test data available at http://yann.lecun.com/exdb/mnist/.
Usage
read_mnist(
path = NULL,
download = FALSE,
destdir = tempdir(),
url = "https://www2.harvardx.harvard.edu/courses/IDS_08_v2_03/",
keep.files = TRUE
)
Arguments
path |
A character giving the full path of the directory to look for files. It assumes the filenames are the same as the originals. If path is |
download |
If |
destdir |
A character giving the full path of the directory in which to save the downloaded files. The default is to use a temporary directory. |
url |
A character giving the URL from which to download files. A copy of the data is currently available at https://www2.harvardx.harvard.edu/courses/IDS_08_v2_03/, which is the default URL. |
keep.files |
A logical. If |
Value
A list with two components: train and test. Each of these is a list with two components: images and labels. The images component is a matrix with one row per image and each column representing one of the 28*28 = 784 pixels. The values are integers between 0 and 255 representing grey scale. The labels component is a vector representing the digit shown in each image.
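The structure described above can be explored without downloading anything. The following is a minimal sketch that builds a small stand-in object with the documented shape; `make_split` and `fake_mnist` are hypothetical names and the pixel values are random, so only the nesting (train/test, images/labels) and the 784-column layout reflect the description.

```r
# Hypothetical stand-in with the documented shape; the values are random, not MNIST.
set.seed(1)
make_split <- function(n) {
  list(
    images = matrix(sample(0:255, n * 784, replace = TRUE), nrow = n, ncol = 784),
    labels = sample(0:9, n, replace = TRUE)
  )
}
fake_mnist <- list(train = make_split(6), test = make_split(3))

# One row per image, one column per pixel (28 * 28 = 784).
dim(fake_mnist$train$images)
# One label per image.
length(fake_mnist$train$labels)
# Pixel values stay within the 0-255 grey-scale range.
range(fake_mnist$test$images)
```

The same accessors (`$train$images`, `$test$labels`, and so on) should apply to the object returned by read_mnist, assuming it matches the description above.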
Note that the data is over 10 MB, so the download may take several seconds depending on internet speed. If you plan to load the data more than once, we recommend that you download it once and read it from disk afterwards. See examples.
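As an alternative to re-reading the original files, the parsed object itself could be cached with base R's saveRDS and readRDS. This is a sketch of a generic caching pattern, not something read_mnist does; `load_cached` and `slow_loader` are hypothetical names, and the loader here returns a tiny made-up list so the example runs offline.

```r
# Hypothetical helper: return the cached object if the file exists,
# otherwise call `loader`, save its result, and return it.
load_cached <- function(cache_file, loader) {
  if (file.exists(cache_file)) {
    return(readRDS(cache_file))
  }
  obj <- loader()
  saveRDS(obj, cache_file)
  obj
}

# Stand-in loader that counts its calls; in practice this might be
# function() read_mnist(), which needs the dslabs package and a connection.
calls <- 0
slow_loader <- function() {
  calls <<- calls + 1
  list(train = list(labels = c(5L, 0L, 4L)), test = list(labels = c(7L, 2L)))
}

cache_file <- tempfile(fileext = ".rds")
first  <- load_cached(cache_file, slow_loader)  # runs the loader and writes the cache
second <- load_cached(cache_file, slow_loader)  # reads the cache; the loader is not called again
```

A temporary file is used here only to keep the sketch self-contained; a persistent path would be needed for the cache to survive across R sessions.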
Author(s)
Samuela Pollack
Rafael A. Irizarry, rafael_irizarry@dfci.harvard.edu
References
Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. "Gradient-based learning applied to document recognition." Proceedings of the IEEE, 86(11):2278-2324, November 1998.
Examples
# this can take several seconds, depending on internet speed.
## Not run:
mnist <- read_mnist()
i <- 5
image(1:28, 1:28, matrix(mnist$test$images[i,], nrow=28)[ , 28:1],
col = gray(seq(0, 1, 0.05)), xlab = "", ylab="")
## the label for this image is:
mnist$test$labels[i]
## End(Not run)
# You can download and save the data to a directory like this:
## Not run:
mnist <- read_mnist(download = TRUE, destdir = "~/Downloads")
# and then, going forward, read from disk
mnist <- read_mnist("~/Downloads")
## End(Not run)
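The plotting call in the first example reshapes one row of the images matrix into a 28 x 28 grid and reverses the column order before passing it to image(). The sketch below walks through that step on synthetic data. `pixels` is a hypothetical stand-in for a row such as mnist$test$images[i, ], and the sketch assumes the pixels are stored row by row starting at the top-left corner, as in the original MNIST files.

```r
# Hypothetical stand-in for one image row: 784 values, assumed to be stored
# row by row starting at the top-left pixel.
pixels <- rep(0L, 784)
pixels[1:28] <- 255L          # light up the top row of the image only

# matrix() fills column by column, so m[x, y] holds the pixel in
# image column x and image row y (counted from the top).
m <- matrix(pixels, nrow = 28)

# image() draws increasing y upward, so reversing the columns puts
# image row 1 at y = 28, that is, at the top of the plot.
flipped <- m[, 28:1]

all(m[, 1] == 255)         # the top image row sits in column 1 of m
all(flipped[, 28] == 255)  # after the flip it sits in column 28
```

Without the `[ , 28:1]` step, image() would draw the digit upside down under the storage assumption stated above.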