Merge branch 'develop'
svkucheryavski committed May 2, 2016
2 parents 9256d23 + 5ee53fa commit 72f37f1
Showing 3 changed files with 69 additions and 0 deletions.
Binary file added data/people.RData
Binary file not shown.
40 changes: 40 additions & 0 deletions man/people.Rd
@@ -0,0 +1,40 @@
\name{people}
\alias{people}
\docType{data}

\title{
People data
}

\description{
Dataset for exploratory analysis with 32 objects (male and female persons) and 12 variables.
}

\usage{data(people)}
\format{
A matrix with 32 observations (persons) and 12 variables:
\tabular{rlll}{
\code{[, 1]} \tab Height in cm. \cr
\code{[, 2]} \tab Weight in kg. \cr
\code{[, 3]} \tab Hair length (-1 for short, +1 for long). \cr
\code{[, 4]} \tab Shoe size (EU standard). \cr
\code{[, 5]} \tab Age, years. \cr
\code{[, 6]} \tab Income, euro per year. \cr
\code{[, 7]} \tab Beer consumption, liters per year. \cr
\code{[, 8]} \tab Wine consumption, liters per year. \cr
\code{[, 9]} \tab Sex (-1 for male, +1 for female). \cr
\code{[, 10]} \tab Swimming ability (index, based on 500 m swimming time). \cr
\code{[, 11]} \tab Region (-1 for Scandinavia, +1 for Mediterranean). \cr
\code{[, 12]} \tab IQ (European standardized test). \cr
}
}

\details{
The dataset was taken from the book [1] and is a small subset of a pan-European demographic survey. It covers 32 persons: 16 from northern Europe (Scandinavia) and 16 from the Mediterranean region, with 8 male and 8 female persons in each group. The dataset includes both quantitative and qualitative variables and is particularly useful for benchmarking exploratory data analysis methods.
}

\source{
1. K. Esbensen. Multivariate Data Analysis in Practice. Camo, 2002.
}
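
\examples{
# A minimal sketch, not part of the original commit: load the matrix and
# check its size against the description above.
data(people)
dim(people)

# Cross-tabulate Sex (column 9) against Region (column 11); the details
# section describes 8 persons in each of the four groups.
table(Sex = people[, 9], Region = people[, 11])
}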

\keyword{datasets}
29 changes: 29 additions & 0 deletions man/simdata.Rd
@@ -0,0 +1,29 @@
\name{simdata}
\alias{simdata}
\docType{data}
\title{
Spectral data of polyaromatic hydrocarbon mixtures
}
\description{
Simdata contains a training set and a test set with spectra and concentration values of polyaromatic hydrocarbon mixtures.
}

\usage{data(simdata)}
\format{
The data is a list with the following fields:
\tabular{rlll}{
\code{$spectra.c} \tab a matrix (100x150) with spectral values for the training set. \cr
\code{$spectra.t} \tab a matrix (50x150) with spectral values for the test set. \cr
\code{$conc.c} \tab a matrix (100x3) with concentration of components for the training set. \cr
\code{$conc.t} \tab a matrix (50x3) with concentration of components for the test set. \cr
\code{$wavelength} \tab a vector with the wavelengths of the spectra, in nm. \cr
}
}

\details{
This is a simulated dataset containing UV/Vis spectra of three-component mixtures of polyaromatic hydrocarbons (C1, C2 and C3). The spectral range is between 210 and 360 nm. The spectra were simulated as a linear combination of pure component spectra plus 5\% of random noise. The concentration ranges (in moles) are: C1 [0, 1], C2 [0, 0.5], C3 [0, 0.1].

There are 100 mixtures in the training set and 50 mixtures in the test set. The data can be used for multivariate regression examples.
}
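
\examples{
# A minimal sketch, not part of the original commit: load the list and
# inspect the fields named in the format section.
data(simdata)
str(simdata)

# Plot the training spectra, one line per mixture. The x axis uses the
# column index here because the length of simdata$wavelength is not
# checked against the number of columns in this sketch.
matplot(t(simdata$spectra.c), type = "l", lty = 1, col = "gray40",
        xlab = "Variable index", ylab = "Spectral value")
}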

\keyword{datasets}
