diff --git a/data/people.RData b/data/people.RData
new file mode 100755
index 0000000..df44f92
Binary files /dev/null and b/data/people.RData differ
diff --git a/man/people.Rd b/man/people.Rd
new file mode 100755
index 0000000..71754df
--- /dev/null
+++ b/man/people.Rd
@@ -0,0 +1,40 @@
+\name{people}
+\alias{people}
+\docType{data}
+
+\title{
+People data
+}
+
+\description{
+Dataset for exploratory analysis with 32 objects (male and female persons) and 12 variables.
+}
+
+\usage{data(people)}
+\format{
+ A matrix with 32 observations (persons) and 12 variables.
+ \tabular{rlll}{
+ \code{[, 1]} \tab Height in cm. \cr
+ \code{[, 2]} \tab Weight in kg. \cr
+ \code{[, 3]} \tab Hair length (-1 for short, +1 for long). \cr
+ \code{[, 4]} \tab Shoe size (EU standard). \cr
+ \code{[, 5]} \tab Age, years. \cr
+ \code{[, 6]} \tab Income, euro per year. \cr
+ \code{[, 7]} \tab Beer consumption, liters per year. \cr
+ \code{[, 8]} \tab Wine consumption, liters per year. \cr
+ \code{[, 9]} \tab Sex (-1 for male, +1 for female). \cr
+ \code{[, 10]} \tab Swimming ability (index, based on 500 m swimming time). \cr
+ \code{[, 11]} \tab Region (-1 for Scandinavia, +1 for Mediterranean). \cr
+ \code{[, 12]} \tab IQ (European standardized test). \cr
+ }
+}
+
+\details{
+The data was taken from the book [1] and is a small subset of a pan-European demographic survey. It includes information about 32 persons: 16 from northern Europe (Scandinavia) and 16 from the Mediterranean region. Each group has 8 male and 8 female persons. The data includes both quantitative and qualitative variables and is particularly useful for benchmarking exploratory data analysis methods.
+}
+
+\source{
+1. K. Esbensen. Multivariate Data Analysis in Practice. Camo, 2002.
+}
+
+\keyword{datasets}
diff --git a/man/simdata.Rd b/man/simdata.Rd
new file mode 100755
index 0000000..7f3c8ec
--- /dev/null
+++ b/man/simdata.Rd
@@ -0,0 +1,29 @@
+\name{simdata}
+\alias{simdata}
+\docType{data}
+\title{
+Spectral data of polyaromatic hydrocarbon mixtures
+}
+\description{
+Simdata contains training and test sets with spectra and concentration values of polyaromatic hydrocarbon mixtures.
+}
+
+\usage{data(simdata)}
+\format{
+ The data is a list with the following fields:
+ \tabular{rlll}{
+ \code{$spectra.c} \tab a matrix (100x150) with spectral values for the training set. \cr
+ \code{$spectra.t} \tab a matrix (50x150) with spectral values for the test set. \cr
+ \code{$conc.c} \tab a matrix (100x3) with concentrations of components for the training set. \cr
+ \code{$conc.t} \tab a matrix (50x3) with concentrations of components for the test set. \cr
+ \code{$wavelength} \tab a vector of spectral wavelengths in nm. \cr
+ }
+}
+
+\details{
+This is a simulated dataset containing UV/Vis spectra of three-component mixtures of polyaromatic hydrocarbons (C1, C2 and C3). The spectral range is between 210 and 360 nm. The spectra were simulated as a linear combination of pure component spectra plus 5\% of random noise. The concentration ranges (in moles) are: C1 [0, 1], C2 [0, 0.5], C3 [0, 0.1].
+
+There are 100 mixtures in the training set and 50 mixtures in the test set. The data can be used for multivariate regression examples.
+}
+
+\keyword{datasets}