R: read colorSpec objects from files

readSpectra {colorSpec}

R Documentation

read colorSpec objects from files

Description

These functions read colorSpec objects from files. In case of ERROR, they return NULL. There are 5 different file formats supported; see Details.

Usage

readSpectra( pathvec, ... )

readSpectraXYY( path )
readSpectraSpreadsheet( path )
readSpectrumScope( path )
readSpectraCGATS( path )
readSpectraControl( path )

Arguments

`pathvec`	a character vector of paths to (possibly) multiple files. The file extension and a few lines from each file are read and a guess is made regarding the file format.
`...`	optional arguments passed on to `resample()`. The most important is `wavelength`. If these are missing then `resample()` is not called.
`path`	a path to a single file with the corresponding format: `XYY`, `Spreadsheet`, `Scope`, `CGATS`, or `Control`. See Details. If the function cannot recognize the format, it returns NULL.

Details

readSpectra() reads the first few lines of the file in order to determine the format, and then calls the corresponding format-specific function. If readSpectra() cannot determine the format, it returns NULL. The 5 file formats are:

XYY
There is a column header line matching ^(wave|wv?l) (not case sensitive) followed by the names of the spectra. All lines above this one are taken to be metadata. The separator on this header line can be space, tab, or comma; the line is examined and the separator found there is used in the lines with data below. The organization of the returned object is 'df.col'. This is probably the most common file format; see the sample file ciexyz31_1.csv.
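The header search and separator guess described above can be sketched in base R. This is only an illustration under stated assumptions, not the code used by readSpectraXYY(); the file content, the spectrum names, and the variable names are all invented.

```r
# Hypothetical file content; metadata lines and spectrum names are invented.
lines <- c(
  "Illustrative metadata line",
  "quantity energy",
  "Wavelength,SpecA,SpecB",
  "400,0.10,0.20",
  "410,0.15,0.25"
)

# Locate the column header line: it starts with wave, wl, or wvl (any case).
header_idx <- grep("^(wave|wv?l)", lines, ignore.case = TRUE)[1]
header     <- lines[header_idx]

# Guess the separator from the header line itself.
sep <- if (grepl(",", header, fixed = TRUE)) {
  ","
} else if (grepl("\t", header, fixed = TRUE)) {
  "\t"
} else {
  ""   # read.table() treats "" as any white space
}

# Everything above the header is metadata; the rest is the data table.
metadata <- lines[seq_len(header_idx - 1)]
data <- read.table(text = lines[header_idx:length(lines)],
                   header = TRUE, sep = sep)
```

Here each spectrum ends up as a column of `data`, which mirrors the 'df.col' organization mentioned above.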

Spreadsheet
There is a line matching "^(ID|SAMPLE|Time)". This line and lines below must be tab-separated. Fields matching '^[A-Z]+([0-9.]+)nm$' are taken to be spectral data and other fields are taken to be extradata. All lines above this one are taken to be metadata. The organization of the returned object is 'df.row'. This is a good format for automated acquisition, using a spectrometer, of many spectra. See the sample file N130501.txt from Wolf Faust.
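As a rough sketch of the field classification described above, the following base R code splits a tab-separated table into spectral data and extradata. It is not the implementation in readSpectraSpreadsheet(); the file content and all variable names are invented for illustration.

```r
# Hypothetical tab-separated content; names and values are invented.
lines <- c(
  "Created by an invented tool",
  "SAMPLE\tNote\tSPEC380nm\tSPEC390nm",
  "A1\tfirst\t0.11\t0.12",
  "A2\tsecond\t0.21\t0.22"
)

# The table starts at the line matching ^(ID|SAMPLE|Time).
header_idx <- grep("^(ID|SAMPLE|Time)", lines)[1]
tbl <- read.table(text = lines[header_idx:length(lines)], header = TRUE,
                  sep = "\t", check.names = FALSE, stringsAsFactors = FALSE)

# Fields such as SPEC380nm carry spectral data; the group holds the wavelength.
pattern     <- "^[A-Z]+([0-9.]+)nm$"
is_spectral <- grepl(pattern, names(tbl))

wavelength <- as.numeric(sub(pattern, "\\1", names(tbl)[is_spectral]))
spectra    <- as.matrix(tbl[, is_spectral, drop = FALSE])  # one row per spectrum
extradata  <- tbl[, !is_spectral, drop = FALSE]
```

Each row of `spectra` is one spectrum, which corresponds to the 'df.row' organization.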

Scope
This is a file format used by Ocean Optics spectrometer software. There is a line
>>>>>Begin Processed Spectral Data<<<<<
followed by wavelength and energy separated by a tab. There is only 1 spectrum per file. The organization of the returned object is 'vector'. See the sample file pos1-20x.scope.
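A minimal base R sketch of locating the marker line and reading the tab-separated pairs that follow it is given below. The sample lines are invented, and this is an illustration only, not the code in readSpectrumScope().

```r
# Hypothetical file content; the header line and the numbers are invented.
lines <- c(
  "Invented header line",
  ">>>>>Begin Processed Spectral Data<<<<<",
  "400.0\t12.5",
  "400.5\t13.0",
  "401.0\t12.8"
)

# Find the marker line literally (fixed = TRUE avoids regex interpretation).
marker <- ">>>>>Begin Processed Spectral Data<<<<<"
start  <- grep(marker, lines, fixed = TRUE)[1]

# The lines after the marker hold wavelength and energy, separated by a tab.
body     <- lines[(start + 1):length(lines)]
spectrum <- read.table(text = body, sep = "\t",
                       col.names = c("wavelength", "energy"))
```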

CGATS
This is a complex format that is best understood by looking at some samples, such as extdata/objects/Rosco.txt; see also the References. The function readCGATS() is first called to get all the tables, and then for each table the column names are examined. There are 2 conventions for presenting the spectral data:

In the standard convention the fields SPECTRAL_DEC or SPECTRAL_PCT have the spectral values. The former is the true value, and the latter is the true value x 100. Each value column is preceded by a corresponding wavelength column, which has the field name SPECTRAL_NM. Note that these field names are highly duplicated. In principle, this convention allows each record in a CGATS table to have a different wavelength vector. However, this complication is rejected by readSpectraCGATS(), which treats it as an ERROR.
In the non-standard convention the field names that match the pattern "^(nm|SPEC_|SPECTRAL_)[_A-Z]*([0-9.]+)$" are considered to be spectral value data, and other fields are considered extradata. The wavelength is the numerical value of the 2nd parenthesized expression ([0-9.]+) in nanometers. Note that every record in this CGATS table has the same wavelength vector. Although this convention is non-standard, it appears in files from many companies, including X-Rite.

If a data.frame has spectral data, it is converted to a colorSpec object and placed in the returned list. The organization of the resulting colorSpec object is 'df.row'. If the data.frame of extradata contains a column SAMPLE_NAME, SAMPLE_ID, SampleID, or Name (examined in that order), then that column is taken to be the specnames of the object. If a table has no spectral data, then it is ignored. If the CGATS file has no tables with spectral data, then it is an ERROR and the function returns NULL.
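The non-standard field-name convention and the choice of the specnames column can be illustrated with a few lines of base R. The field names below are invented, and the code is only a sketch of the matching rules stated above, not the code in readSpectraCGATS().

```r
# Hypothetical field names from one CGATS table.
fields <- c("SAMPLE_ID", "LAB_L", "nm380", "nm390", "SPEC_400", "SPECTRAL_NM_410")

# Non-standard convention: the 2nd parenthesized group is the wavelength in nm.
pattern     <- "^(nm|SPEC_|SPECTRAL_)[_A-Z]*([0-9.]+)$"
is_spectral <- grepl(pattern, fields)
wavelength  <- as.numeric(sub(pattern, "\\2", fields[is_spectral]))

# All other fields are extradata.
extradata_fields <- fields[!is_spectral]

# The first candidate present, in the stated order, supplies the specnames.
candidates <- c("SAMPLE_NAME", "SAMPLE_ID", "SampleID", "Name")
name_field <- intersect(candidates, extradata_fields)[1]
```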

Control
This is a personal format used for digitizing images of plots from manufacturer datasheets and academic papers. It is structured like a .INI file. There is a [Control] section establishing a simple linear map from pixels to the wavelength and spectrum quantities. Only 3 points are really necessary. It is OK for there to be a little rotation of the plot axes relative to the image. This is followed by a section for each spectrum, in XY pixel units only. Conversion to wavelength and spectral quantities is done on-the-fly after they are read. Extrapolation can be a problem, especially when the value is near 0. To force constant extrapolation (see resample()), repeat the control point (knot) at the endpoint. See the sample file Lumencor-SpectraX.txt. The organization of the returned objects is 'vector'.
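To show why 3 control points suffice, here is a hedged base R sketch that fits an affine map from pixel coordinates to (wavelength, value) and applies it to digitized points. An affine map also absorbs a small rotation of the axes. The control points, the curve points, and the helper pixel_to_plot() are all hypothetical; the layout of the [Control] section itself is not reproduced here, and this is not the code in readSpectraControl().

```r
# Hypothetical control points: pixel coordinates and the plot values they mark.
pixel <- rbind(c(100, 500),   # axis origin in the image
               c(700, 500),   # far end of the wavelength axis
               c(100, 100))   # top of the value axis
plotval <- rbind(c(400, 0),   # (wavelength in nm, spectral value)
                 c(700, 0),
                 c(400, 1))

# Affine map: [wavelength, value] = [1, px, py] %*% coef.
# Three non-collinear points determine the 3x2 coefficient matrix exactly.
coef <- solve(cbind(1, pixel), plotval)

# Hypothetical helper: convert an n x 2 matrix of pixels to plot units.
pixel_to_plot <- function(xy) {
  xy  <- matrix(xy, ncol = 2)
  out <- cbind(1, xy) %*% coef
  colnames(out) <- c("wavelength", "value")
  out
}

# Digitized curve points in pixel units (invented).
curve_px   <- rbind(c(100, 500), c(400, 300), c(700, 500))
curve_plot <- pixel_to_plot(curve_px)
```

With these invented numbers, the middle pixel (400, 300) maps to a wavelength of 550 nm and a value of 0.5.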

Value

readSpectra() returns a single colorSpec object or NULL in case of ERROR. If there are multiple files in pathvec and they cannot be combined using bind() because their wavelengths are different, it is an ERROR. To avoid this ERROR, the wavelength argument can be used for resampling to common wavelengths. If there are multiple files, the organization of the returned object is 'df.row' and the first column is the path from which the spectrum was read.

The functions readSpectraXYY(), readSpectraSpreadsheet(), and readSpectrumScope() return a single colorSpec object, or NULL in case of ERROR.

The functions readSpectraCGATS() and readSpectraControl() are more complicated. These 2 file formats can contain multiple spectra with different wavelength sequences, so both functions return a list of colorSpec objects, even when that list has length 1. If no spectral objects are found, they return NULL.

If readSpectra() calls readSpectraCGATS() or readSpectraControl() and receives a list of colorSpec objects, readSpectra() attempts to bind() them into a single object. If they all have the same wavelength vector, then the bind() succeeds and the single colorSpec object is returned. Otherwise the bind() fails, and it is an ERROR. To avoid this ERROR, readSpectra() can be called with a wavelength argument. The multiple spectra are then resampled using resample() and combined using bind(), which makes it much more convenient to read such files.
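A short usage sketch of the wavelength argument follows. It assumes the colorSpec package is installed and that the CGATS sample mentioned in Details is available at extdata/objects/Rosco.txt; the 5 nm grid is an arbitrary choice for illustration.

```r
if (requireNamespace("colorSpec", quietly = TRUE)) {
  library(colorSpec)

  # Path taken from the Details section; system.file() returns "" if it is absent.
  path <- system.file("extdata/objects/Rosco.txt", package = "colorSpec")

  if (nzchar(path)) {
    # The wavelength argument is passed on to resample(), so every spectrum
    # is put on a common 400 to 700 nm grid before bind() is attempted.
    rosco <- readSpectra(path, wavelength = seq(400, 700, by = 5))

    # NULL signals an ERROR; otherwise inspect the common wavelength range.
    if (!is.null(rosco)) print(range(wavelength(rosco)))
  }
}
```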

Note

During import, each read function tries to guess the quantity from spectrum names or other cues. For example the first line in N130501.txt is IT8.7/1, which indicates that the quantity is 'transmittance' (a reflective target is denoted by IT8.7/2). If a confident guess cannot be made, it makes a wild guess and issues a warning. If the quantity is incorrect, one can assign the correct value after import. Alternatively one can add a line to the header part of the file with the keyword 'quantity' followed by some white-space and the correct value. It is OK to put the value in quotes. See example files under folder extdata.
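Assigning the quantity after import might look like the sketch below. It assumes the colorSpec package is installed, reuses the sample path extdata/objects/Rosco.txt from Details, and assumes for illustration that 'transmittance' is the right quantity for that file.

```r
if (requireNamespace("colorSpec", quietly = TRUE)) {
  library(colorSpec)

  path    <- system.file("extdata/objects/Rosco.txt", package = "colorSpec")
  filters <- if (nzchar(path)) readSpectra(path, wavelength = 400:700) else NULL

  if (!is.null(filters)) {
    print(quantity(filters))               # the quantity guessed during import
    quantity(filters) <- "transmittance"   # override; an assumption for this file
  }
}
```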

References

CGATS.17 Text File Format. http://www.colorwiki.com/wiki/CGATS.17_Text_File_Format.

ANSI/CGATS.17. Graphic technology - Exchange format for colour and process control data using XML or ASCII text. https://webstore.ansi.org/. 2009.

ISO/28178. Graphic technology - Exchange format for colour and process control data using XML or ASCII text. https://www.iso.org/standard/44527.html. 2009.

Examples

#   read file with header declaring the quantity to be "photons->neural"
bird = readSpectra( system.file( "extdata/eyes/BirdEyes.txt", package='colorSpec' ) )
quantity(bird)   # [1] "photons->neural"

[Package colorSpec version 1.5-0 Index]