Data Frames

A data frame is a 2-dimensional data structure. Unlike a matrix, whose elements must all have the same type, a data frame can hold a different type in each column. The rows are observations and the columns are variables. All columns/variables must have the same number of elements, and they are expected to be aligned so that the i-th element in each column corresponds to the same i-th observational unit.

The purpose of a data frame is to allow each column to have a different type. This allows us to have integers in one column, logical values in another, Dates in another, and even a list of more complex objects, e.g., each element in a column might be a data frame itself, or a matrix, or a function.
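As a minimal sketch of mixed column types, the example below builds a small data frame and then attaches a list column. The data frame `df` and its column names are made up for illustration; they are not part of the original notes.

```r
# Hypothetical example data; the names df, id, flag, when and extra are illustrative.
df <- data.frame(
  id   = 1:3,
  flag = c(TRUE, FALSE, TRUE),
  when = as.Date(c("2024-01-01", "2024-01-02", "2024-01-03"))
)

# A list column: each element can be an arbitrary R object
# (here a matrix, a function, and another data frame).
df$extra <- list(matrix(1:4, nrow = 2), mean, data.frame(x = 1))

# The first class of each column shows that the types differ by column.
col_classes <- vapply(df, function(col) class(col)[1], character(1))
print(col_classes)
```

The list column is assigned after construction because passing a list directly to data.frame() would try to split it into several columns.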

A data frame is a list whose elements are the columns. Check this with typeof(mtcars), which returns "list".
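The following short check, using the built-in mtcars data set, is one way to see the list nature of a data frame.

```r
# The underlying storage type is a list ...
storage_type <- typeof(mtcars)   # "list"

# ... while the class attribute is what makes it behave as a data frame.
cls <- class(mtcars)             # "data.frame"

# The list has one element per column.
same_count <- length(mtcars) == ncol(mtcars)

# Every element (column) has the same length, namely the number of rows.
col_lengths <- lengths(mtcars)
all_equal <- all(col_lengths == nrow(mtcars))

print(storage_type)
print(cls)
print(same_count)
print(all_equal)
```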

So we can use list subsetting:

mtcars[ c(1, 2) ]
mtcars[ c("mpg", "wt") ]
mtcars[ grepl("^d", names(mtcars) ) ]
mtcars[[ "mpg" ]]
mtcars$mpg
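To make the difference between these forms concrete, the sketch below inspects what each one returns. The variable names are illustrative only.

```r
# Single brackets keep the list structure, so the result is a data frame.
two_cols <- mtcars[c("mpg", "wt")]
one_col  <- mtcars["mpg"]          # still a data frame, with one column

# Double brackets and $ extract the element itself, here a numeric vector.
mpg_vec <- mtcars[["mpg"]]

# A logical vector selects the columns whose names start with "d".
d_names <- names(mtcars)[grepl("^d", names(mtcars))]

print(class(two_cols))                  # "data.frame"
print(class(one_col))                   # "data.frame"
print(is.numeric(mpg_vec))              # TRUE
print(identical(mtcars$mpg, mpg_vec))   # TRUE
print(d_names)                          # "disp" "drat"
```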

When we assign a value to a column, e.g.,

mtcars$old = TRUE

the recycling rule is used. R requires each column (element of the list) of the data.frame to have the same length, so it repeats TRUE nrow(mtcars) times.
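A small check of this behaviour, done on a copy (here called cars, a name chosen for this sketch) so that the built-in mtcars is left untouched:

```r
cars <- mtcars        # work on a copy
cars$old <- TRUE      # a single value is recycled to every row

# The new column has one element per row, all of them TRUE.
print(length(cars$old) == nrow(cars))
print(all(cars$old))
```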

So what does

mtcars$old = c(TRUE, FALSE)

yield?

And what does

mtcars$old = c(TRUE, FALSE, TRUE)

do?
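One way to explore both questions is sketched below, again on a copy named cars (an illustrative name). The second assignment is wrapped in tryCatch() so that the script continues if R raises an error. mtcars has 32 rows, which is a multiple of 2 but not of 3.

```r
cars <- mtcars

# Length 2 divides the 32 rows evenly, so the vector is recycled 16 times.
cars$old <- c(TRUE, FALSE)
print(head(cars$old, 4))   # TRUE FALSE TRUE FALSE

# Length 3 does not divide 32, so the assignment fails with an error.
result <- tryCatch({
  cars$old <- c(TRUE, FALSE, TRUE)
  "ok"
}, error = function(e) conditionMessage(e))
print(result)
```

Because the failed assignment never completes, cars$old still holds the alternating values from the first assignment.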

We can also use 2-dimensional subsetting; see Subsetting2D.md.
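As a brief preview, and only a sketch of the basic forms, 2-dimensional subsetting takes a row index and a column index separated by a comma. The variable names below are illustrative.

```r
# Rows 1 to 3, columns mpg and wt: the result is a data frame.
first_rows <- mtcars[1:3, c("mpg", "wt")]
print(dim(first_rows))   # 3 2

# A logical condition in the row position keeps the matching rows, all columns.
efficient <- mtcars[mtcars$mpg > 30, ]

# Selecting a single column this way drops to a vector by default;
# drop = FALSE keeps the data frame structure.
wt_vec <- mtcars[, "wt"]
wt_df  <- mtcars[, "wt", drop = FALSE]
print(is.numeric(wt_vec))
print(is.data.frame(wt_df))
```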
