-
Notifications
You must be signed in to change notification settings - Fork 10
/
Copy pathindex.Rmd
564 lines (407 loc) · 30.5 KB
/
index.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
---
pagetitle: "Intro to Raster"
author: "Loïc Dutrieux, Jan Verbesselt, Johannes Eberenz, Dainius Masiliūnas, Sabina Rosca"
date: "`r format(Sys.time(), '%Y-%m-%d')`"
output:
rmdformats::html_clean:
title: "Intro to Raster"
highlight: zenburn
---
```{css, echo=FALSE}
@import url("https://netdna.bootstrapcdn.com/bootswatch/3.0.0/simplex/bootstrap.min.css");
.main-container {max-width: none;}
pre {color: inherit; background-color: inherit;}
code[class^="sourceCode"]::before {
content: attr(class);
display: block;
text-align: right;
font-size: 70%;
}
code[class^="sourceCode r"]::before { content: "R Source";}
code[class^="sourceCode python"]::before { content: "Python Source"; }
code[class^="sourceCode bash"]::before { content: "Bash Source"; }
```
<font size="6">[WUR Geoscripting](https://geoscripting-wur.github.io/)</font> <img src="https://www.wur.nl/upload/854757ab-168f-46d7-b415-f8b501eebaa5_WUR_RGB_standard_2021-site.svg" alt="WUR logo" style="height: 35px; margin:inherit;"/>
# Intro to Raster
## Learning objectives
* Read & write raster data
* Perform basic raster file operations/conversions
* Perform simple raster calculations
## Introduction
Raster data is like any image. Although it may portray various properties of objects in the real world, these objects don’t exist as separate objects; rather, they are represented using pixels of various values which are assigned a color.
Today's tutorial is about constructing a simple spatio-temporal analysis using raster data, R, and git.
In a [previous tutorial](../Scripting4GeoIntro/index.html#reading-and-writing-data) you briefly saw how to read and use vector data from a file into your R environment. These vector read/write operations were made possible thanks to the *GDAL* library. You can check the project home page at [http://www.gdal.org/](http://www.gdal.org/). You will be surprised to see that a lot of the software you have used in the past to read gridded geospatial data use GDAL (i.e.: ArcGIS, QGIS, GRASS, etc). In this tutorial, we will use *GDAL* indirectly via the *terra* package. However, it is also possible to call *GDAL* functionalities directly through the command line from a terminal, which is equivalent to calling a `system()` command directly from within R. In addition, if you are familiar with R and its string handling utilities, it may facilitate the building of the expressions that have to be passed to *GDAL*. (*Note*: This is also doable in *Bash* scripting, as learned in a [previous tutorial](../Intro2Linux/index.html), and you can even combine the two.)
Let's start working with *terra* by performing system setup checks.
```{r}
# Example to perform system set-up checks
if(!"terra" %in% installed.packages()){install.packages("terra")}
library(terra)
gdal()
```
The previous function should return the version number of the current version of *GDAL* installed on your machine. Starting with [GDAL 2.0](http://trac.osgeo.org/gdal/wiki/Release/2.0.1-News) vector processing becomes incorporated into *GDAL*. In case the function above returns an error, or if you cannot install *terra* at all, you should verify that all required software and libraries are properly installed. Please refer to the [system setup page](http://geoscripting-wur.github.io/system_setup/).
# Overview of the *terra* package
The *raster* package used to be the reference R package for raster processing, with Robert J. Hijmans as its the original developer. The introduction of the *raster* package to R was a revolution for geo-processing and analysis using R. The *raster* package is now deprecated, as Robert Hijmans has developed a successor to it called *[terra](https://cran.r-project.org/web/packages/terra/)* which is both simpler and much faster, as it's rewritten in C++.
Among other things the *terra* package allows to:
* Read and write raster data of most commonly used formats.
* Perform most raster operations, such as creation of raster objects, performing spatial/geometric operations (re-projections, resampling, etc), filtering and raster calculations.
* Work on large raster datasets thanks to its built-in block processing functionalities.
* Perform fast operations thanks to optimized back-end C++ code.
* Visualize and interact with the data.
* etc...
```{block, type="alert alert-info"}
**Tip**: Check the [home page](http://cran.r-project.org/web/packages/terra/) of the *terra* package. The package is extremely well documented, including vignettes and demos. See also the [reference manual](https://cran.r-project.org/web/packages/terra/terra.pdf) there. Another useful resource for information on spatial data handling with terra can be found [here](https://rspatial.org/spatial/index.html).
```
## Explore the terra objects
The terra package produces and uses R objects of two main classes: **SpatRaster** and **SpatVector**. A SpatRaster represents a spatially referenced surface divided into three dimensional cells (rows, columns, and layers). A SpatVector represents geometries as well as attributes (variables) describing the geometries. Note: we will be using *terra* only for raster processing, as you will see in subsequent tutorials.
Let's take a look into the structure of a SpatRaster.
```{r}
# Generate a SpatRaster
r <- rast(ncol = 40, nrow = 20)
class(r)
# Simply typing the object name displays its general properties / metadata
r
```
From the metadata displayed above, we can see that the SpatRaster contains all the properties that geo-data should have; that is to say a coordinate reference system, an extent and a pixel size.
Multi-layer SpatRasters can also fairly easily be generated directly in R, as shown in the example below.
```{r}
# Using the previously generated SpatRaster
# Let's first put some values in the cells of the layer
r[] <- rnorm(n = ncell(r))
# Create a SpatRaster with 3 layers
s <- c(r, r*2, r)
# Let's look at the properties of the resulting object
s
```
Also note that using `c()` here behaves different from using `list()`:
```{r}
# Create a list of three separate SpatRasters
s2 <- list(r, r*2, r)
# Let's look at the properties of the resulting object
s2
```
# SpatRaster manipulations
## Reading and writing from/to file
The actual data used in geo-processing projects often comes as geo-data, stored on files such as *GeoTIFF* or other commonly used file formats. Reading data directly from these files into the R working environment is made possible thanks to the *terra* package. The main command for reading raster objects from files is the `rast()` function, which returns a SpatRaster.
Writing a SpatRaster to file is achieved using the `writeRaster()` function.
To illustrate the reading and writing of raster files, we will use data subsets that we have prepared for the course and need to be downloaded from the repository. For that, first make sure your working directory is set properly. Then run the following line; it will handle the download:
```{r, eval=TRUE}
# Start by making sure that your working directory is properly set
# If not you can set it using setwd()
getwd()
# Create data directory if it does not yet exist
if (!dir.exists("data")) {
dir.create("data")
}
# Download the data
# In case the download code doesn't work, use method = 'wget'
download.file(url = 'https://github.com/GeoScripting-WUR/IntroToRaster/releases/download/tahiti/gewata.zip', destfile = 'data/gewata.zip')
# Unpack the archive
unzip('data/gewata.zip', exdir = "data")
```
*Gewata* is the name of the dataset that we just downloaded. It is a multi-layer GeoTIFF object, and its file name is *LE71700552001036SGS00\_SR\_Gewata\_INT1U.tif*, informing us that this is a subset from a scene acquired by NASA's [Landsat 7 ETM+ sensor](https://www.usgs.gov/landsat-missions/landsat-7). Let's not worry about the region that the data covers for now, we will find a nice way to discover that later on in the tutorial.
Now that we have downloaded and unpacked the GeoTIFF file, it should be present in our working directory. We can investigate the content of the working directory (or any directory) using the `list.files()` function.
```{r, eval=FALSE}
# When passed without arguments, list.files() returns a character vector, listing the content of the working directory
list.files()
# To get only the files with .tif extension in the data directory
list.files('data', pattern = glob2rx('*.tif'))
# Or if you are familiar with regular expressions
list.files('data', pattern = '^.*\\.tif$')
```
We can now load this object in R:
```{r}
gewata <- rast('data/LE71700552001036SGS00_SR_Gewata_INT1U.tif')
```
Let's take a look at the structure of this object.
```{r}
gewata
```
The metadata above informs us that the gewata object is a relatively small (593x653 pixels) SpatRaster with 6 layers.
## Data type is (still) important
When writing files to disk using `writeRaster()` or the `filename =` argument in most raster processing functions, you are able set an appropriate data type. Using the `datatype =` argument, you can save some precious disk space compared to the default datatype, and thus increase read and write speed.
## Geoprocessing: in memory vs. on disk
When looking at the documentation of most functions of the *terra* package, you will notice that the list of arguments is almost always ended by `...`. These 'three dots' are called an ellipsis; it means that extra arguments can be passed to the function. Often these arguments are those that can be passed to the `writeRaster()` function; meaning that most geoprocessing functions are able to write their output directly to file, on disk. This reduces the number of steps and is always a good consideration when working with big raster objects that tend to overload the memory if not written directly to file.
## Cropping a SpatRaster
`crop()` is the terra package function that allows you to crop data to smaller spatial extents. It accepts objects of classes SpatRaster, SpatVector or SpatExtent to crop to. One way of obtaining such a SpatExtent object interactively is by using the `draw()` function. In the example below, we will manually draw a regular extent that we will use later to crop the *gewata* SpatRaster.
```{r, eval=FALSE}
# Plot the first layer of the SpatRaster
plot(gewata, 1)
e <- draw()
```
Now you have to define a rectangular bounding box that will define the spatial extent of the extent object. Click twice, for the two opposite corners of the rectangle. Note that R will wait until you finish drawing before it will let you run any more code (!). Now we can crop the data following the boundaries of this extent.
```{r, eval=FALSE}
# Crop gewata using e
gewataSub <- crop(gewata, e)
# Visualize the new cropped object
plot(gewataSub, 1)
```
You should see on the resulting plot that the original image has been cropped.
## Creating layer stacks
To end this section on general files and raster object manipulations, we will see how multi-layer objects can be created from multiple single-layer objects. We have already derived the [Normalized Difference Vegetation Index (NDVI)](https://gisgeography.com/ndvi-normalized-difference-vegetation-index/) from the Landsat 7 imagery in advance for you. The resulting product is an image (2-dimensional SpatRaster) where high pixel values represent green vegetated areas. The dataset that we are working with is a series of NDVI images of a region in Tahiti over time. Let's create a multi-layer NDVI object, composed of NDVI layers derived from Landsat acquisitions at different dates. We first need to fetch the data file by file.
```{r, eval=FALSE}
# Again, make sure that your working directory is properly set
getwd()
# Download and unzip the data
download.file(url = 'https://github.com/GeoScripting-WUR/IntroToRaster/releases/download/tahiti/tura.zip', destfile = 'data/tura.zip')
unzip(zipfile = 'data/tura.zip', exdir = 'data')
# Retrieve the content of the tura sub-directory
turaList <- list.files(path = 'data/tura/', full.names = TRUE)
```
The object `turaList` contains the file names of all the single layers we have to stack. Let's open the first one to visualize it.
```{r, eval=FALSE}
plot(rast(turaList[1]))
```
We see an NDVI layer, with the clouds masked out. Now let's create a multi-layer SpatRaster.
```{r, eval=FALSE}
turaStack <- rast(turaList)
turaStack
```
Now that we have our SpatRaster with 166 layers in memory, let's write it to disk using the `writeRaster()` function. Note that we adjust the layer names to the original file names (in which information on dates is written). Also note that the data range is comprised between -10000 and +10000, therefore such a file can be stored as signed 2 byte integer (INT2S).
```{r, eval=FALSE}
# Create output directory if it does not yet exist
if (!dir.exists("output")) {
dir.create("output")
}
# Write this file to the
writeRaster(x = turaStack, filename = 'output/turaStack.tif', names = list.files(path = 'data/tura/'), datatype = "INT2S")
```
Now this object is stored on your computer, ready to be archived for later use.
# Simple raster arithmetic
## Adding, subtracting, multiplying and dividing SpatRasters
Performing simple raster operations with a SpatRaster is fairly easy. For instance, if you want to subtract two SpatRasters of same extent, `r1` and `r2`; simply doing `r1 - r2` will give the expected output, which is, every pixel value of `r2` will be subtracted from the matching pixel value of `r1`. These types of pixel-based operations almost always require a set of conditions to be met in order to be executed; the two SpatRasters need to be identical in term of extent, pixel size, coordinate reference system, etc.
## Subsetting layers from SpatRaster
Previously, we used data with already derived NDVI. Now, we will make use of the Landsat 7 spectral bands to create the NDVI SpatRaster ourselves. The different spectral bands of the same satellite scene are often stored in multi-layer objects. This means that you will very likely import them in your R working environment as one multi-layer SpatRaster. As a consequence, to perform calculations between these bands, you will have to write an expression referring to individual layers of the object. Referring to individual layers in a SpatRaster can be done by using double square brackets `[[]]`.
Let's look for instance at how the famous NDVI index would have to be calculated from the *gewata* SpatRaster read earlier. The *gewata* SpatRaster already contains [6 bands of the Landsat 7 sensor](https://www.usgs.gov/faqs/what-are-band-designations-landsat-satellites). The formula below uses Landsat 7's NIR and Red bands, corresponding to layers 4 and 3 of the gewata SpatRaster, respectively.
$$
NDVI=\frac{NIR-Red}{NIR+Red}
$$
with NIR and Red being band 4 and 3 of Landsat 7 respectively.
```{r}
ndvi <- (gewata[[4]] - gewata[[3]]) / (gewata[[4]] + gewata[[3]])
```
The `plot()` function automatically recognises the objects of `terra` classes and returns an appropriate spatial plot.
```{r, ndvi, fig.align='center'}
plot(ndvi)
```
The resulting NDVI can be viewed in the figure above. As expected the NDVI ranges from about 0.2, which corresponds to nearly bare soils, to 0.9, which means that there is some dense vegetation in the area.
Although this is a quick way to perform the calculation, directly adding, subtracting, multiplying, etc, the layers of big raster objects is not recommended. When working with big objects, it is advisable to use the `app()` function to perform these types of calculations. The reason is that R needs to load all the data first into its internal memory before performing the calculation and then runs everything in one block. It is really easy to run out of memory when doing that. A big advantage of the `app()` function is that it has a built-in block processing option for any vectorized function, allowing such calculations to be fully "RAM friendly". The example below illustrates how to calculate NDVI from the same date set using the `app()` function.
```{r}
# Define the function to calculate NDVI using app()
ndviApp <- function(x) {
ndvi <- (x[[4]] - x[[3]]) / (x[[4]] + x[[3]])
return(ndvi)
}
ndvi2 <- app(x = gewata, fun = ndviApp)
```
We can verify that the two layers ndvi and ndvi2 are actually identical using the `all.equal()` function from the *terra* package.
```{r}
all.equal(ndvi, ndvi2)
```
Here we see that only the name of `ndvi2` is different. This is because for the first method, which resulted in `ndvi`, the layer name was copied from `gewata[[4]]`, whereas this was not the case for the second approach.
## Simple raster statistics
The `global()` function allows the extraction of basic global statistics from an entire SpatRaster, such as the mean value, maximum value, the range of values, or the number of NA cells. `global` will return a dataframe, from which we can consequently select the value we need. See also `?global` for more information and documentation. Let's try some of these options for our `ndvi` SpatRaster.
```{r}
# Calculate the mean (this produces a dataframe)
mean <- global(ndvi, fun = 'mean')
mean
# Get only the value from the dataframe
mean$mean
# Calculate the amount of NA values (which should be zero in this case)
global(ndvi, fun = 'isNA')
# Calculate the amount of non-NA values
global(ndvi, fun = 'notNA')
# Which, as we have no NA values, should be equal to
ncell(ndvi)
```
## Reprojections
The `project()` function allows re-projection of raster objects to any projection one can think of. It accepts the following formats to define coordinate reference systems: *WKT*, *PROJ.4* (e.g., `+proj=longlat +datum=WGS84`), or an *EPSG* code (e.g., `epsg:4326`). Note that the PROJ.4 notation has been deprecated, and you can only use it with the WGS84/NAD83 and NAD27 datums. Other datums are silently ignored.
A central place to search for coordinate reference systems is the spatial reference website ([http://spatialreference.org/](http://spatialreference.org/)), from this database you will be able to query almost any reference and retrieve it in any format, including its *EPSG* code. Well-Known Text (*WKT*) expressions are preferred for scientific correctness and lack of ambiguity.
Instead of specifying a coordinate reference system to convert to, you can also provide `project()` with another SpatRaster, and it will convert your input to the CRS of that SpatRaster.
```{r}
# One single line is sufficient to project any raster to any CRS
ndviLL <- project(ndvi, 'epsg:4326')
```
Note that if re-projecting and mosaicking is really a large part of your project, you may want to consider using the `gdalwarp` command line utility ([gdalwarp](http://www.gdal.org/gdalwarp.html)) directly. The `gdalUtils` R package provides utilities to run GDAL commands from R, including `gdalwarp`, for reprojection, resampling and mosaicking.
## Exporting and inspecting
By the way, we still don't know where this area is. In order to investigate that, we are going to try displaying it in QGIS. Let's write our NDVI layer in Lat/Long to a .tif file first.
```{r, eval = FALSE}
# Since this function will write a file to your working directory
# you want to make sure that it is set where you want the file to be written
# It can be changed using setwd()
getwd()
# Note that we are using the filename argument, contained in the ellipsis (...) of
# the function, since we want to write the output directly to file.
writeRaster(x = ndviLL, filename = 'output/gewataNDVI.tif')
```
Now open this file in QGIS, and add a base layer (in QGIS for instance OpenStreetMap, using XYZ tiles with the url `https://tile.openstreetmap.org/{z}/{x}/{y}.png`). Once we zoom out a bit, we see we are all the way in ... Ethiopia. More information will come later in the course about that specific area.
```{block, type="alert alert-success"}
> **Question 1**: Could you also have used `ndvi` SpatRaster instead of `ndviLL` for this final step? Why (not)?
```
We are done with this data set for this tutorial. So let's explore another data set, from the Landsat sensors. This dataset will allow us to find other interesting raster operations to perform.
## Performing simple value replacements
Since 2014, the USGS has started releasing Landsat data processed to surface reflectance. This means that they are taking care of important steps such as atmospheric correction and conversion from sensor radiance to reflectance factors. Additionally, they provide a cloud mask with this product. The cloud mask is an extra layer, at the same resolution as the surface reflectance bands, that contains information about the presence or absence of cloud as well as shadowing effects from the clouds. The cloud mask of Landsat surface reflectance product is named *cfmask*, after the name of the algorithm used to detect the clouds. For more information about cloud detection, see the [algorithm page](https://github.com/GERSL/Fmask), and the publication by [Zhu & Woodcock](https://doi.org/10.1016/j.rse.2011.10.028). In the following section we will use that cfmask layer to mask out remaining clouds in a Landsat scene.
### About the area
The area selected for this exercise covers most of the South Pacific island of Tahiti, French Polynesia. It is a mountainous, volcanic island, and according to Wikipedia about 180,000 people live on the island. For convenience, the Landsat scene was subsetted to cover only the area of interest and is stored online.
```{r, eval=TRUE}
# Download the data
download.file(url = 'https://github.com/GeoScripting-WUR/IntroToRaster/releases/download/tahiti/tahiti.zip', destfile = 'data/tahiti.zip')
unzip(zipfile = 'data/tahiti.zip', exdir = 'data')
# Load the data as a SpatRaster and investigate its contents
tahiti <- rast('data/LE70530722000126_sub.tif')
tahiti
# Display names of each individual layer
names(tahiti)
# Visualize the data
plotRGB(tahiti, 3, 4, 5, stretch = "lin")
```
We can also visualize the cloud mask layer (layer 7).
```{r, fig.align='center'}
plot(tahiti, 7)
```
According to the [algorithm description](https://github.com/GERSL/Fmask), water is coded as 1, cloud as 4 and cloud shadow as 2.
```{block, type="alert alert-success"}
> **Question 2**: Does the cloud mask fit with the visual interpretation of the RGB image we plotted before?
```
We can also plot the two on top of each other, but before that we need to assign no values (NA) to the 'clear land pixels' so that they appear transparent on the overlay plot.
```{r, fig.align='center'}
# Extract cloud layer from the SpatRaster
cloud <- tahiti[[7]]
# Replace 'clear land' with 'NA'
cloud[cloud == 0] <- NA
# Plot the stack and the cloud mask on top of each other
plotRGB(tahiti, 3, 4, 5, stretch = "lin")
plot(cloud, add = TRUE, legend = FALSE)
```
Applying a cloud mask to a dataset simply consists in performing value replacement. In this case, a condition on the 7th layer of the stack (the `fmask` layer) will determine whether values in the other layers are kept, or replaced by NA, which is equivalent to masking them. It is more convenient to work on the cloud mask as a separate SpatRaster. We will therefore subset the SpatRaster using the `subset()` function.
```{r}
# Extract cloud mask layer
fmask <- tahiti[[7]]
# Create a subset of the first six layers
tahiti6 <- subset(tahiti, 1:6)
```
We will first do the masking using simple vector arithmetic, as if `tahiti6` and `fmask` were simple vectors. We want to keep any value with a 'clean land pixel' flag in the cloud mask; or rather, since we are assigning NAs, we want to discard any value of the stack which has a corresponding cloud mask pixel different from 0. This can be done in one line of code.
```{r}
# Perform value replacement
tahiti6[fmask != 0] <- NA
```
However, this is possible here because both objects are relatively small and the values can all be loaded in the computer memory without any risk of overloading it. When working with very large raster objects, you will very likely run into problems if you do that. It is then preferable, as presented earlier in this tutorial to use `app()`.
```{r}
# First define a value replacement function
cloud2NA <- function(x) {
x[1:6][x[7] != 0] <- NA
return(x)
}
```
The value replacement function takes one argument, `x`. `x` corresponds to a SpatRaster, where the 7th layer is our cloud mask.
```{r, fig.align='center'}
# Apply the function on the two SpatRasters
tahitiCloudFree <- app(x = tahiti, fun = cloud2NA)
# Visualize the output
plotRGB(tahitiCloudFree, 3, 4, 5, stretch = "lin")
```
There are holes in the image, but at least the clouds are gone. We could use another image from another date, to create a composite image, but that is a little bit too much for today.
# Obtaining satellite imagery
So far we have worked on a prepared dataset.
However, in actual work (and the project), you will need to obtain your own data.
A lot of satellite imagery is free to use, but sometimes downloading the data in a script can be challenging.
Thankfully, there are packages developed to help automate it.
One package that is very helpful for downloading, cleaning, performing classification and postprocessing satellite image time series is called `sits` (for Satellite Image Time Series).
We will not go over most of its functionality (though it can be useful to study it for the project, [see its documentation here](https://e-sensing.github.io/sitsbook/)), but we will use its image downloading capabitilites to quickly get our own dataset to work with.
First, let's get the `sits` package and its dependencies (this can take hours if you are not using `r2u`):
```{r}
if(!"sits" %in% installed.packages()){install.packages("sits")}
if(!"geojsonsf" %in% installed.packages()){install.packages("geojsonsf")}
if(!"stars" %in% installed.packages()){install.packages("stars")}
if(!"tmap" %in% installed.packages()){install.packages("tmap")}
if(!"httr2" %in% installed.packages()){install.packages("httr2")}
library(sits)
```
Next, let's take a look at what kind of satellite data the package lets us download.
There are several data providers, but the most useful are `AWS` (Amazon Web Services), `MPC` (Microsoft Planetary Computer), `CDSE` (Copernicus Dataspace Ecosystem), and `USGS` (US Geological Survey).
Let's see what they have:
```{r}
sits_list_collections(c("AWS", "MPC", "CDSE", "USGS"))
```
We can see that several providers provide the same data, such as Sentinel-2 imagery.
Note that their level of preprocessing may differ, so it is always good to check the details to know what exactly is the data you are downloading!
In this tutorial we were working with Tahiti data, so let's get some more images over Tahiti!
First, let's find what area our Tahiti images cover.
At the moment `sits` requires either the coordinates in the WGS84 CRS, or a vector object, which you will learn about in the next tutorial.
For now, let's get the WGS84 coordinates by reprojecting the raster.
```{r}
tahiti84 <- project(tahiti, "EPSG:4326")
(roi <- ext(tahiti84))
```
We have a Landsat image so far, so let's now get a few MODIS images to compare with.
It's easiest to get data from Microsoft Planetary Computer, since it does not require authentication.
```{r}
tahitiMODIS <- sits_cube("MPC", "MOD13Q1-6.1",
roi = as.vector(roi), # At the moment SpatExtent objects need to be converted to a vector manually
bands = c("RED", "NIR", "BLUE"),
start_date = "2023-07-01", end_date = "2023-08-01")
```
Let's explore what we got. We can see all details in the `file_info` slot:
```{r}
tahitiMODIS$file_info[[1]]
```
We got six dates and three bands, that is 18 items in total. What are the dates of the images we downloaded?
```{r}
# This is equivalent to unique(tahitiMODIS$file_info[[1]]$date)
sits_timeline(tahitiMODIS)
```
Let's look at one of these images:
```{r}
plot(tahitiMODIS, red="RED", blue="BLUE", green="NIR", date="2023-07-04")
```
As you can see, this is way zoomed out! MODIS has a coarser pixel size compared to Landsat, and therefore a tile in MODIS is much larger than in Landsat.
Let's load the data with `terra`.
First, we will save all the files on our disk.
Do not forget to again include the `roi`, else it will download the entire tile (as shown above), rather than what you actually need, and will waste your time and space!
```{r results='hide'}
outdir <- "output"
if (!dir.exists(outdir))
dir.create(outdir)
sits_cube_copy(tahitiMODIS, output_dir = outdir, roi=c(lon_min=xmin(roi),lat_min=ymin(roi),lon_max=xmax(roi),lat_max=ymax(roi)))
```
Now load the files in `terra`:
```{r}
(MODISfiles <- list.files(outdir, full.names=TRUE))
tahitiMOD = rast(MODISfiles)
plotRGB(tahitiMOD, 13, 7, 1, stretch = "lin")
```
That looks zoomed in to Tahiti, though the island looks quite distorted!
This is because MODIS uses a sinusoidal projection that distorts shapes that are far away from the center (Africa), and Tahiti is almost as far away as you can get!
Let's reproject and crop to our region of interest:
```{r}
tahitiMOD = project(tahitiMOD, tahiti)
tahitiMOD = crop(tahitiMOD, tahiti)
plotRGB(tahitiMOD, 13, 7, 1, stretch = "lin")
```
It looks a bit more pixelated than the Landsat image, but now the shape of the island is no longer distorted!
# Summary
Today you got a general introduction to the *terra* package, its basic functions, its object classes and methods. You also learned how to use a few functions from the `sits` package to retrieve data. The functions can be categorized as follows:
## Terra classes
* `SpatRaster`: Single- or multi-layer raster object.
* `SpatVector`: Vector object.
## Functions
### Reading and writing
* `rast()`: Read a raster object from disk.
* `ext()`: get the extent of the raster
* `writeRaster()`: Write a `SpatRaster` to disk.
* `filename =` argument: Available for most functions of the *terra* package that produce SpatRaster objects, write directly the output of the function to disk.
### Reformat data
* `crop()`: Modify the extent of a SpatRaster based on another spatial object or a SpatExtent object.
* `project()`: Reproject (and resample) a SpatRaster to a desired coordinate reference system or reference SpatRaster.
* `subset()`: Create a subset of a SpatRaster.
* `c()`: Combine multiple SpatRasters into a single SpatRaster.
### Simple visualization
* `plot()`: Plot a SpatRaster. Use `add = TRUE` to overlay several objects.
* `plotRGB()`: Plot an RGB color composite. Specifying the bands to assign colors as arguments.
### Raster calculations
* First of all, SpatRasters work just like vectors of numerics (`c(1, 2, 3)`). They can be subsetted, added, subtracted, etc.
* `app()`: Apply a function to every pixel independently of a single SpatRaster (Single or multi-layer). RAM friendly and can write output directly to disk using the `filename =` argument.
### sits functions
* `sits_cube()`: Download satellite imagery from a cloud platform
* `sits_cube_copy()`: Write the imagery to disk
# Extra
* The [gdalUtilities](http://cran.r-project.org/web/packages/gdalUtilities/index.html) package provides interesting wrappers facilitating the use of GDAL functions from within R.