forked from hadley/adv-r
# Base types {#base-types}
## Introduction
\index{base objects}
\index{OO objects}
```{r setup, include = FALSE}
source("common.R")
```
To talk about objects and OOP in R we first need to clear up a fundamental confusion about two uses of the word "object". So far in this book, we've used the word in the general sense captured by John Chambers' pithy quote: "Everything that exists in R is an object". However, while everything _is_ an object, not everything is object-oriented. This confusion arises because the base objects come from S, and were developed before anyone thought that S might need an OOP system. The tools and nomenclature evolved organically over many years without a single guiding principle.
Most of the time, the distinction between objects and object-oriented objects is not important. But here we need to get into the nitty-gritty details, so we'll use the terms __base objects__ and __OO objects__ to distinguish them.
```{r, out.width = NULL, echo = FALSE}
knitr::include_graphics("diagrams/oo-venn.png")
```
### Outline {-}
* Section \@ref(base-vs-oo) shows you how to identify base and OO objects.
* Section \@ref(base-types-2) gives a complete set of the base types used to build all objects.
## Base versus OO objects {#base-vs-oo}
\indexc{is.object()}
\indexc{otype()}
\index{attributes!class}
\indexc{class()}
To tell the difference between a base and OO object, use `is.object()` or `sloop::otype()`:
```{r}
# A base object:
is.object(1:10)
sloop::otype(1:10)
# An OO object
is.object(mtcars)
sloop::otype(mtcars)
```
Technically, the difference between base and OO objects is that OO objects have a "class" attribute:
```{r}
attr(1:10, "class")
attr(mtcars, "class")
```
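As a minimal sketch of that rule, you can turn a base object into an OO object, and back again, just by adding or removing the class attribute. The class name `"my_class"` is made up for this illustration and has no methods attached.

```{r}
x <- 1:10
is.object(x)                      # no class attribute, so a base object

attr(x, "class") <- "my_class"    # hypothetical class name
is.object(x)                      # now an OO object
typeof(x)                         # the underlying data is unchanged

y <- unclass(x)                   # strip the class attribute again
is.object(y)
attributes(y)
```

Note that only the attribute changes: the underlying integer data is the same throughout.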
You may already be familiar with the `class()` function. This function is safe to apply to S3 and S4 objects, but it returns misleading results when applied to base objects. It's safer to use `sloop::s3_class()`, which returns the implicit class that the S3 and S4 systems will use to pick methods. You'll learn more about `s3_class()` in Section \@ref(implicit-class).
```{r}
x <- matrix(1:4, nrow = 2)
class(x)
sloop::s3_class(x)
```
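To see why the implicit class matters, here is a small sketch using only base R. The generic `describe()` and its methods are hypothetical names invented for this example. Even though `"integer"` does not appear in `class(x)`, S3 dispatch still finds the integer method because it works from the implicit class.

```{r}
# Hypothetical generic with an integer method and a default
describe <- function(x) UseMethod("describe")
describe.integer <- function(x) "integer method"
describe.default <- function(x) "default method"

x <- matrix(1:4, nrow = 2)
"integer" %in% class(x)   # class() does not mention integer...
describe(x)               # ...but dispatch still uses the integer method
describe("a")             # no character method, so the default is used
```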
## Base types {#base-types-2}
\indexc{typeof()}
\index{base type|see {\texttt{typeof()}}}
While only OO objects have a class attribute, every object has a __base type__:
```{r}
typeof(1:10)
typeof(mtcars)
```
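The following sketch makes the same point for a few more objects: whether or not an object is OO, `typeof()` always reports the type of the underlying data. The objects in `objs` are arbitrary examples chosen for illustration.

```{r}
objs <- list(
  int  = 1:3,
  df   = data.frame(x = 1),
  fct  = factor("a"),
  date = as.Date("2020-01-01")
)

# Base type of each object
vapply(objs, typeof, character(1))

# Which of them are OO objects?
vapply(objs, is.object, logical(1))
```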
Base types do not form an OOP system because functions that behave differently for different base types are primarily written in C code that uses switch statements. This means that only R-core can create new types, and creating a new type is a lot of work because every switch statement needs to be modified to handle a new case. As a consequence, new base types are rarely added. The most recent change, in 2011, added two exotic types that you never see in R itself, but that are needed for diagnosing memory problems. Before that, the last addition was a special base type for S4 objects, in 2005.
<!--
https://github.com/wch/r-source/blob/f5bb85782509ddadbcec94ab7648886c2d008bda/src/main/util.c#L185-L211-->
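To get a feel for why switch-based code is hard to extend, here is a rough R analogue of the C pattern. It is only a sketch: `describe_type()` is a hypothetical helper, and the real code in R's C sources is considerably more involved.

```{r}
# Hypothetical helper: one branch per base type, like a C switch statement
describe_type <- function(x) {
  switch(typeof(x),
    logical = "a logical vector",
    integer = ,                      # falls through to the double branch
    double = "a numeric vector",
    character = "a character vector",
    stop("unhandled base type: ", typeof(x))
  )
}

describe_type(1L)
describe_type("a")

# A type without a branch fails until the function itself is edited
try(describe_type(globalenv()))
```

With S3, by contrast, anyone can add a method for a new class without touching the generic. With a switch, every function written this way has to be edited by hand.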
In total, there are 25 different base types. They are listed below, loosely grouped according to where they're discussed in this book. These types are most important in C code, so you'll often see them called by their C type names. I've included those in parentheses.
* Vectors, Chapter \@ref(vectors-chap), include types `NULL` (`NILSXP`),
  `logical` (`LGLSXP`), `integer` (`INTSXP`), `double` (`REALSXP`), `complex`
  (`CPLXSXP`), `character` (`STRSXP`), `list` (`VECSXP`), and `raw` (`RAWSXP`).

    ```{r}
    typeof(NULL)
    typeof(1L)
    typeof(1i)
    ```

* Functions, Chapter \@ref(functions), include types `closure` (regular R
  functions, `CLOSXP`), `special` (internal functions, `SPECIALSXP`), and
  `builtin` (primitive functions, `BUILTINSXP`).

    ```{r}
    typeof(mean)
    typeof(`[`)
    typeof(sum)
    ```

  Internal and primitive functions are described in Section
  \@ref(primitive-functions).

* Environments, Chapter \@ref(environments), have type `environment`
  (`ENVSXP`).

    ```{r}
    typeof(globalenv())
    ```

* The `S4` type (`S4SXP`), Chapter \@ref(s4), is used for S4 classes that
  don't inherit from an existing base type.

    ```{r}
    mle_obj <- stats4::mle(function(x = 1) (x - 2) ^ 2)
    typeof(mle_obj)
    ```

* Language components, Chapter \@ref(expressions), include `symbol` (aka
  name, `SYMSXP`), `language` (usually called calls, `LANGSXP`), and
  `pairlist` (used for function arguments, `LISTSXP`) types.

    ```{r}
    typeof(quote(a))
    typeof(quote(a + 1))
    typeof(formals(mean))
    ```

* `expression` (`EXPRSXP`) is a special purpose type that's only returned by
  `parse()` and `expression()`. Expressions are generally not needed in user
  code.

* The remaining types are esoteric and rarely seen in R. They are important
  primarily for C code: `externalptr` (`EXTPTRSXP`), `weakref` (`WEAKREFSXP`),
  `bytecode` (`BCODESXP`), `promise` (`PROMSXP`), `...` (`DOTSXP`), and
  `any` (`ANYSXP`).
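
The S4 item in the list above is perhaps the least obvious, so here is a small sketch of it. The class names `Person` and `Celsius` are hypothetical, chosen only for this illustration. An S4 class gets the `S4` base type only when it is not built on top of an existing base type.

```{r}
library(methods)

# Hypothetical class with slots only: no underlying base type
setClass("Person", slots = c(name = "character"))
# Hypothetical class built on top of a numeric vector
setClass("Celsius", contains = "numeric")

p <- new("Person", name = "Ada")
temp <- new("Celsius", 21.5)

typeof(p)      # the dedicated S4 base type
typeof(temp)   # the base type of the data it inherits from
isS4(temp)     # still an S4 object
```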
\indexc{mode()}
You may have heard of `mode()` and `storage.mode()`. Do not use these functions: they exist only to provide type names that are compatible with S.
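If you're curious what these functions report, the sketch below puts the three side by side for a handful of arbitrary objects. Notice how `mode()` collapses or renames several of the types returned by `typeof()`.

```{r}
objs <- list(1L, 1.5, quote(a), quote(a + 1), sum, mean)

types <- data.frame(
  typeof       = vapply(objs, typeof, character(1)),
  mode         = vapply(objs, mode, character(1)),
  storage.mode = vapply(objs, storage.mode, character(1)),
  stringsAsFactors = FALSE
)
types
```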
### Numeric type {#numeric-type}
\index{numeric vectors}
\index{vectors!numeric|see {numeric vectors}}
Be careful when talking about the numeric type, because R uses "numeric" to mean three slightly different things:
1.  In some places numeric is used as an alias for the double type. For
    example `as.numeric()` is identical to `as.double()`, and `numeric()` is
    identical to `double()`.

    (R also occasionally uses real instead of double; `NA_real_` is the one
    place that you're likely to encounter this in practice.)

1.  In the S3 and S4 systems, numeric is used as a shorthand for either
    integer or double type, and is used when picking methods:

    ```{r}
    sloop::s3_class(1)
    sloop::s3_class(1L)
    ```

1.  `is.numeric()` tests for objects that _behave_ like numbers. For example,
    factors have type "integer" but don't behave like numbers (i.e. it doesn't
    make sense to take the mean of a factor).

    ```{r}
    typeof(factor("x"))
    is.numeric(factor("x"))
    ```
In this book, I consistently use numeric to mean an object of type integer or double.
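
The three meanings are easier to keep apart when you see them side by side. The sketch below uses only base R. The generic `size()` and its method are hypothetical names made up for this example.

```{r}
# 1. numeric as an alias for double
typeof(as.numeric(1L))
typeof(numeric(1))

# 2. numeric as "integer or double" when picking S3 methods
size <- function(x) UseMethod("size")        # hypothetical generic
size.numeric <- function(x) "numeric method"
size(1)
size(1L)

# 3. is.numeric() as "behaves like a number"
is.numeric(1L)
is.numeric(factor("x"))
```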