This notebook will take a basic overview of raster data.
Rasters are a form of data that is georeferenced and (somewhat) continuous. Raster data is perhaps best envisioned as a matrix of values, whose entries represent spatially referenced data points. The raster itself can be visualized as you could for normal matrix output. What makes a raster different, however, is that it is (or should be) georeferenced. This means that each element of the matrix represents some measurement on the ground having a specific location and spread. This is analogous to an image, where if you zoom in on it enough, you will be able to differentiate between individual pixels, it is just that for rasters, each pixel has a spatial location and size associated with it that we can map onto the earth.
You can either create raster objects de novo or you can acquire them from some external source. To create one from scratch, you start with a matrix of values and then construct the raster object using the
raster() function as:
## Loading required package: sp
r <- matrix(runif(10000),nrow=100) rnd <- raster( r ) rnd
## class : RasterLayer ## dimensions : 100, 100, 10000 (nrow, ncol, ncell) ## resolution : 0.01, 0.01 (x, y) ## extent : 0, 1, 0, 1 (xmin, xmax, ymin, ymax) ## crs : NA ## source : memory ## names : layer ## values : 3.676349e-05, 0.9999721 (min, max)
which can be visualized using the normal plot command. The raster library has overridden several of the plotting functions and you can plot raster objects and decorate the images in the same way you do for normal plotting materials (??).
A Raster is simply a matrix of values, and in fact, we can get to the contents of the raster data using the function
values() (these R people are sneaky!)
values(rnd) -> x class(x)
##  "numeric"
##  0.5657748 0.7406126 0.9772058 0.9305299 0.4996966 0.8370530
Just like any other matrix, we can do operations on the contents of the raster itself.
values( rnd ) <- values(rnd) * -1 plot(rnd)
Loading Rasters from Files
Rasters come in many formats and there is a wide array of file types that we can import. For simplicity, we are going to use GeoTIFF formats for input because they represent a large fraction of content we may need to get.
From the Blackboard site, download the
data.zip file and uncomprtess it in the same folder as this notebook. You are using RStudio as a project, right? It should decompress into a folder called
data and if you look into it it will have the following data files.
Some of these are raw text files (the .csv files), some are
R data (the .rda), some are netCDF files (the .nc) and the rest are GeoTiff files (the .tif ones).
To load in the data, we first load in the
Then we can load in one of these files using the
raster() function and passing it the path to the file relative to this document.
elev <- raster("../data/alt_22.tif") elev
## class : RasterLayer ## dimensions : 3600, 3600, 12960000 (nrow, ncol, ncell) ## resolution : 0.008333333, 0.008333333 (x, y) ## extent : -120, -90, 0, 30 (xmin, xmax, ymin, ymax) ## crs : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0 ## source : /Users/rodney/Desktop/Dyerlab-Webpage/content/post/R/data/alt_22.tif ## names : alt_22 ## values : -202, 5469 (min, max)
Take a minute and look at the raster funciton help file.
For an object like a
raster we can identify some basic commands they understand and know how to do.
methods( class="raster" )
##  [ [<- anyNA as.matrix as.raster is.na Ops ##  plot print ## see '?methods' for accessing help and source code
So, it is clear that we can plot the raster using the normal plotting commands (don’t worry,
ggplot comes into play in a little bit).
plot( elev, xlab="Longitude", ylab="Latitude", col)
Congratulations!!!!! You did it.
That is a rather large extent we are dealing with and for the purpose of our needs, we are going to only use a subset of that big raster. To do this, we need to define an extent (e.g., the bounding box in terms of minimum and maximum longitude and latitude that we want) and then create a new raster that is a cut-out of the previous one.
Let’s make an extent that contains just the Baja California section.
e <- extent( -116, -108, 22, 30) e
## class : Extent ## xmin : -116 ## xmax : -108 ## ymin : 22 ## ymax : 30
We can do this by
eyeballing it, or use an interactive plotting on the raster (and the old mouse) to get some points. There is a great function for this called
And can be used like the code below. This is an interactive thing, so I am not going to run the chunk below (notice the
eval=FALSE code in the first line of the chunk), do it IRL and you will see the output of the x, y, and the value of the cell you are clicking on printed out to the console.
plot(elev) click( elev, xy=TRUE, n=2 )
click() function can create points, line, polygons, have them plotted, or whatever you like. Quite helpful. But for our purposes, we just need it to identify the extent of Baja California and then we will use that to
crop() the original raster.
baja_elev <- crop( elev, e ) baja_elev
## class : RasterLayer ## dimensions : 960, 960, 921600 (nrow, ncol, ncell) ## resolution : 0.008333333, 0.008333333 (x, y) ## extent : -116, -108, 22, 30 (xmin, xmax, ymin, ymax) ## crs : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0 ## source : memory ## names : alt_22 ## values : -202, 2774 (min, max)
Which now looks like:
plot( baja_elev, xlab="Longitude", ylab="Latitude", main = "Elevation for Baja California", col = terrain.colors( 50 ) )
Now that we have the elevation for our target region, we can delete the big raster from our memory.
rm( elev )
GGPlot and Rasters
As you probably can guess,
ggplot has the ability to do this kind of work as well.
We are going to use our way cool piping function
1. load in the raster,
2. crop it to the extent we want
3. turn it into a
raster("../data/alt_22.tif") %>% crop( extent( -116, -108, 22, 30) ) %>% rasterToPoints() %>% data.frame() -> baja_df summary( baja_df )
## x y alt_22 ## Min. :-115.8 Min. :22.88 Min. :-202.0 ## 1st Qu.:-112.3 1st Qu.:26.64 1st Qu.: 96.0 ## Median :-110.6 Median :27.93 Median : 315.0 ## Mean :-110.9 Mean :27.65 Mean : 501.9 ## 3rd Qu.:-109.2 3rd Qu.:29.05 3rd Qu.: 683.0 ## Max. :-108.0 Max. :30.00 Max. :2774.0
What this did was make a large
data.frame object that we will use for plotting. It is rather large.
dim( baja_df )
##  318933 3
Now, we can plot it using the
geom_tile() or it’s identical twin (that is a bit faster in execution)
baja_df %>% ggplot( aes(x,y,fill=alt_22) ) + geom_raster() + coord_equal()
We can adjust the color ramp to be more approriate as:
baja_df %>% ggplot( aes(x,y,fill=alt_22) ) + geom_raster() + coord_equal() + scale_fill_gradientn( name="Elevation (m)", colours = terrain.colors(100)) + theme_bw() + xlab("Longitude") + ylab("Latitude")
Extracting Data From Rasters
OK, it is all nice an good that we can make, load, manipulate, and plot rasters, what about extracting raw data from them?
To extract data at points, we need to identify the points.
Cities <- c("Cabo San Lucas", "La Pax", "Loreto", "San Javier") Longitude <- c(-109.916, -110.310, -111.343, -111.544) Latitude <- c(22.890, 25.144, 26.012, 25.860 ) cities_df <- data.frame(Cities, Longitude, Latitude) summary( cities_df )
## Cities Longitude Latitude ## Length:4 Min. :-111.5 Min. :22.89 ## Class :character 1st Qu.:-111.4 1st Qu.:24.58 ## Mode :character Median :-110.8 Median :25.50 ## Mean :-110.8 Mean :24.98 ## 3rd Qu.:-110.2 3rd Qu.:25.90 ## Max. :-109.9 Max. :26.01
Now let’s plot this.
library( ggrepel ) baja_df %>% ggplot( ) + geom_raster( aes(x,y,fill=alt_22) ) + geom_label_repel( aes(x = Longitude, y = Latitude, label = Cities), data=cities_df) + coord_equal() + scale_fill_gradientn( name="Elevation (m)", colours = terrain.colors(100)) + theme_bw() + xlab("Longitude") + ylab("Latitude")
To extract the data, we need to identify the items listed in the data frame as representing “Longitude” and “Latitude” objects and make it a type of data that can be
city_pts <- SpatialPoints( cbind(Longitude, Latitude) ) city_pts
## class : SpatialPoints ## features : 4 ## extent : -111.544, -109.916, 22.89, 26.012 (xmin, xmax, ymin, ymax) ## crs : NA
Next AND THIS IS EXTREMELY IMPORTANT, we need to specify the Coordinate Reference System for these data. Notice how ours is missing,
crs( city_pts )
## CRS arguments: NA
What we want to do is make it the same as the raster we will be extracting data from.
crs( baja_elev )
## CRS arguments: ## +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
To do this, we assign the
crs of the raster to the points. THIS IS NOT A RE-PROJECTIONS OF THE POINTS, THESE ARE ALERADY IN THE SAME PROJECTION, WE ARE JUST TELLING R WHAT THEY ARE
crs( city_pts ) <- crs( baja_elev ) city_pts
## class : SpatialPoints ## features : 4 ## extent : -111.544, -109.916, 22.89, 26.012 (xmin, xmax, ymin, ymax) ## crs : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
We will come back to reprojecting points, lines, and polygons later.
Now we can do the extraction.
raster::extract( baja_elev, city_pts )
##  11 NA 7 409
Why might that second point be missing?
If you do not have it installed, just type
install.packages("raster")and it will be installed on your machine↩