forked from FAO-SID/SOC-Mapping-Cookbook
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy path03-setting-up-software-environment.Rmd
188 lines (113 loc) · 13.4 KB
/
03-setting-up-software-environment.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
# Setting-up the software environment
*Y. Yigini*
This cookbook focuses on SOC modeling using open source digital mapping tools. The instructions in this chapter will guide user through installing and manually configuring the software to be used for DSM procedures for Microsoft Windows desktop platform. Instructions for the other platforms (e.g. Linux Flavours, MacOS) can be found through free online resources.
## Use of R, RStudio and R Packages
**R** is a language and environment for statistical computing. It provides a wide variety of statistical (e.g. linear modeling, statistical tests, time-series, classification, clustering, etc.) and graphical methods, and is highly extensible.
### Obtaining and installing R
Installation files and instructions can be downloaded from the Comprehensive R Archive Network (CRAN).
1. Go to the following link https://cloud.r-project.org/index.html to download and install **R**.
2. Pick an installation file for your platform.
### Obtaining and installing RStudio
Beginners will find it very hard to start using **R** because it has no Graphical User Interface (GUI). There are some GUIs which offer some of the functionality of **R**. **RStudio** makes **R** easier to use. It includes a code editor, debugging and visualization tools. Similar steps need to be followed to install **RStudio**.
1. Go to https://www.rstudio.com/products/rstudio/download/ to download and install **RStudio**'s open source edition.
2. On the download page, *RStudio Desktop, Open Source License* option should be selected.
3. Pick an installation file for your platform.
### Getting started with R
* **R** manuals: http://cran.r-project.org/manuals.html
* Contributed documentation: http://cran.r-project.org/other-docs.html
* Quick-**R**: http://www.statmethods.net/index.html
* Stackoverflow **R** community: https://stackoverflow.com/questions/tagged/r
## R packages
When you download **R**, you get the basic **R** system which implements the **R** language. **R** becomes more useful with the large collection of packages that extend the basic functionality of it. **R** packages are developed by the **R** community.
### Finding R packages
The primary source for **R** packages is [CRAN’s](https://cran.r-project.org/) official website, where currently about 12,000 available packages are listed. For spatial applications, various packages are available. You can obtain information about the available packages directly on CRAN with the `available.packages()` function. The function returns a matrix of details corresponding to packages currently available at one or more repositories. An easier way to browse the list of packages is using the [*Task Views*](https://cran.r-project.org/web/views/) link, which groups together packages related to a given topic.
### Most used R packages for digital soil mapping
As was previously mentioned, **R** is extensible through packages. **R** packages are collections of **R** functions, data, documentation and compiled code easy to share with others. In the following Subsections, we are going to present the most used packages related to digital soil property mapping.
#### Soil science and pedometrics{#SoilPedometrics}
[**aqp**](https://CRAN.R-project.org/package=aqp): Algorithms for quantitative pedology. A collection of algorithms related to modeling of soil resources, soil classification, soil profile aggregation, and visualization.
[**GSIF**](https://CRAN.R-project.org/package=GSIF): Global Soil Information Facility (GSIF). Tools, functions and sample datasets for digital soil mapping. GSIF tools (standards and functions) and sample datasets for global soil mapping.
[**ithir**](http://ithir.r-forge.r-project.org/): A collection of functions and algorithms specific to pedometrics. The package was developed by Brendan Malone at the University of Sydney.
[**soiltexture**](https://CRAN.R-project.org/package=soiltexture): The [*Soil Texture Wizard*](https://cran.r-project.org/web/packages/soiltexture/vignettes/soiltexture_vignette.pdf) is a set of **R** functions designed to produce texture triangles (also called texture plots, texture diagrams, texture ternary plots), classify and transform soil textures data. These functions virtually allow to plot any soil texture triangle (classification) into any triangle geometry (isosceles, right-angled triangles, etc.). The set of functions is expected to be useful to people using soil textures data from different soil texture classification or different particle size systems. Many (> 15) texture triangles from all around the world are predefined in the package. A simple text-based GUI is provided: `soiltexture_gui()`.
#### Spatial analysis
[**raster**](https://CRAN.R-project.org/package=raster): Reading, writing, manipulating, analyzing and modeling of gridded spatial data. The package implements basic and high-level functions, processing of very large files is supported.
[**rgdal**](https://CRAN.R-project.org/package=rgdal): Provides bindings to Frank Warmerdam’s Geospatial Data Abstraction Library (GDAL).
[**RSAGA**](https://CRAN.R-project.org/package=RSAGA): The package provides access to geocomputing and terrain analysis functions of [SAGA GIS](\url{http://www.saga-gis.org/en/index.html} ) from within **R** by running the command line version of System for Automated Geoscientific Analyses (SAGA).
[**sf**](https://cran.r-project.org/web/packages/sf/index.html): The package provides an improvement on **sp** and **rgdal** for spatial datasets. It is using the newly Simple Feratures (SF) standard which is widely implemented in spatial databases (PostGIS, ESRI ArcGIS), and forms the vector data basis for libraries such as GDAL and web standards such as GeoJSON (http://geojson.org/).
[**sp**](https://CRAN.R-project.org/package=sp): The package provides classes and methods for spatial data. The classes document where the spatial location information resides, for 2D or 3D data.
#### Modelling
[**automap**](https://CRAN.R-project.org/package=automap): This package performs an automatic interpolation by automatically estimating the variogram and then calling [**gstat**](https://CRAN.R-project.org/package=gstat).
[**caret**](https://CRAN.R-project.org/package=caret): Extensive range of functions for training and plotting classification and regression models.
[**Cubist**](https://CRAN.R-project.org/package=Cubist): Regression modeling using rules with added instance-based corrections. Cubist models were developed by Ross Quinlan.
[**C5.0**](https://CRAN.R-project.org/package=C5.0): C5.0 decision trees and rule-based models for pattern recognition. Another model structure developed by Ross Quinlan.
[**gam**](https://CRAN.R-project.org/package=gam): Functions for fitting and working with generalized additive models.
[**gstat**](https://CRAN.R-project.org/package=gstat): Variogram modeling with simple, ordinary and universal point or block (co)kriging, sequential Gaussian or indicator (co)simulation. The package includes variogram and variogram map plotting utility functions.
[**nnet**](https://CRAN.R-project.org/package=nnet): Software for feed-forward neural networks with a single hidden layer, and for multinomial log-linear models.
#### Mapping and plotting
Both **raster** and **sp** have handy functions for plotting spatial data. The following packages can be used as functional extentions.
[**ggplot2**](https://cran.r-project.org/web/packages/ggplot2/index.html): Besides using the base plotting functionality, this is another useful plotting package.
[**leaflet**](https://CRAN.R-project.org/package=leaflet): Create and customize interactive maps using the Leaflet JavaScript library and the [**htmlwidgets**](https://cran.r-project.org/web/packages/htmlwidgets/index.html) package. These maps can be used directly from the **R** console, from **RStudio**, in [**Shiny**](https://shiny.rstudio.com/) apps and **RMarkdown** documents.
[**plotKML**](https://CRAN.R-project.org/package=plotKML): Writes sp-class, spacetime-class, raster-class and similar spatial and spatiotemporal objects to KML following some basic cartographic rules.
### Packages used in this cookbook
The authors of this cookbook used a number of different **R** packages. All required packages used in the cookbook can be installed using the following code and the `install.packages()` function when starting a new SOC mapping project. Alternatively, the code for the installation of the needed packages is included at the beginning of each Chapter.
```{r, eval=FALSE}
# Install all required R packages used in the cookbook
install.packages(c("aqp", "automap", "car", "caret", "e1071",
"GSIF", "htmlwidgets", "leaflet", "mapview",
"Metrics", "openair", "plotKML", "psych",
"quantregForest", "randomForest", "raster",
"rasterVis", "reshape", "rgdal", "RSQLite",
"snow", "soiltexture", "sf", "sp"))
```
## R and spatial data
**R** has a large and growing number of spatial data packages. We recommend taking a quick browse on **R**’s official website to see the spatial packages available: http://cran.r-project.org/web/views/Spatial.html.
### Reading shapefiles
The Environmental Systems Research Institute (ESRI) provides a shapefile format SHP which is widely used for storing vector-based spatial data (i.e., points, lines, polygons). This example demonstrates use of **raster** package that provides functions for reading and/or writing shapefiles.
```{r}
library(raster)
# Load the soil map from a shapefile *.shp file
soilmap <- shapefile("MK_soilmap_simple.shp")
```
We may want to use these data in other GIS environments such as ArcGIS, QGIS, SAGA GIS, etc. This means we need to export the `SpatialPointsDataFrame` to an appropriate spatial data format such as a shapefile `*.shp`.
```{r}
# For example, we can select the soil units classified as
# fluvisols according to WRB
Fluvisols <- soilmap[soilmap$WRB == "Fluvisol",]
# Save this as a new shapefile
shapefile(Fluvisols, filename = 'results/fluvisols.shp',
overwrite = TRUE)
```
### Coordinate reference systems in R
We need to define the Coordinate Reference System (CRS) to be able to perform any sort of spatial analysis in **R**. To clearly tell **R** this information we define the CRS which describes a reference system in a way understood by the PROJ.4 projection library. Find more information at this link: http://trac.osgeo.org/proj/.
An interface to the PROJ.4 library is available in the **rgdal** package. An alternative to using PROJ.4 character strings, we can use the corresponding yet simpler EPSG (European Petroleum Survey Group) code. **rgdal** also recognizes these codes. If you are unsure of the PROJ.4 or EPSG code for the spatial data that you have but know the CRS, you should consult http://spatialreference.org/ for assistance.
```{r}
# Print the CRS for the object soilmap
soilmap@proj4string
```
The following example shows how a spatial object from a (comma-separated values) CSV file (`*.csv`) can be created. We can use the `coordinates()` function from the **sp** package to define which columns in the data frame refer to actual spatial coordinates. Here, the coordinates are listed in columns X and Y.
```{r}
# Load the table with the soil observations site information
dat_sites <- read.csv(file = "data/site-level.csv")
# Convert from table to spatial points object
coordinates(dat_sites) <- ~ X + Y
# Check the coordinate system
dat_sites@proj4string
# As the CRS is not defined, we can assign the correct CRS as we
# have information about it. In this case, it should be EPSG:4326
dat_sites@proj4string <- CRS("+init=epsg:4326")
# Check the CRS again
dat_sites@proj4string
```
### Working with rasters
Most of the functions for handling raster data are available in the **raster** package. There are functions for reading and writing raster files from and to different formats. In DSM, we mostly work with data in table format and then rasterize this data so that we can produce a continuous map. For doing this in **R** environment, we will load raster data in a data frame. This data is a digital elevation model (DEM) provided by the International Soil Reference and Information Centre (ISRIC) for FYROM.
```{r}
# For handling raster data, we load raster package
library(raster)
# Load DEM from the raster *.tif files
DEM <- raster("covs/DEMENV5.tif")
```
We may want to export this raster to a suitable format to work in a standard GIS environment. See the help file for writing a raster `?writeRaster` to get information regarding the supported grid types that data can be exported into. Here, we will export our raster to ESRI ASCII (American Standard Code for Information Interchange), as it is a common and universal raster format.
We may also want to export our DEM to KML (Keyhole Markup Language) file (`*.kml`) using the `KML()` function. `KML()` is a handy function from the **raster** package for exporting grids to KML format. Note that we need to re-project the data to the World Geodetic System (WGS) WGS84 geographic. The raster re-projection is performed using the `projectRaster()` function. Look at the help file `?projectRaster` for more information.
## Other DSM software and tools
* **GRASS GIS**: Available at https://grass.osgeo.org/ (free and open source).
* **SAGA GIS**: Available at https://sourceforge.net/projects/saga-gis/files/ (free and open source).
* **QGIS**: Available at http://www.qgis.org/en/site/forusers/download.html (free and open source).