Rasters In R

This notebook will take a basic overview of raster data.

Rasters are a form of data that is georeferenced and (somewhat) continuous. Raster data is perhaps best envisioned as a matrix of values, whose entries represent spatially referenced data points. The raster itself can be visualized as you could for normal matrix output. What makes a raster different, however, is that it is (or should be) georeferenced. This means that each element of the matrix represents some measurement on the ground having a specific location and spread. This is analogous to an image, where if you zoom in on it enough, you will be able to differentiate between individual pixels, it is just that for rasters, each pixel has a spatial location and size associated with it that we can map onto the earth.

You can either create raster objects de novo or you can acquire them from some external source. To create one from scratch, you start with a matrix of values and then construct the raster object using the raster() function as:

## Loading required package: sp
r <- matrix(runif(10000),nrow=100)
rnd <- raster( r )
## class      : RasterLayer 
## dimensions : 100, 100, 10000  (nrow, ncol, ncell)
## resolution : 0.01, 0.01  (x, y)
## extent     : 0, 1, 0, 1  (xmin, xmax, ymin, ymax)
## crs        : NA 
## source     : memory
## names      : layer 
## values     : 3.676349e-05, 0.9999721  (min, max)

which can be visualized using the normal plot command. The raster library has overridden several of the plotting functions and you can plot raster objects and decorate the images in the same way you do for normal plotting materials (??).


Raster Structure

A Raster is simply a matrix of values, and in fact, we can get to the contents of the raster data using the function values() (these R people are sneaky!)

values(rnd) -> x
## [1] "numeric"
## [1] 0.5657748 0.7406126 0.9772058 0.9305299 0.4996966 0.8370530

Just like any other matrix, we can do operations on the contents of the raster itself.

values( rnd ) <- values(rnd) * -1 

Loading Rasters from Files

Rasters come in many formats and there is a wide array of file types that we can import. For simplicity, we are going to use GeoTIFF formats for input because they represent a large fraction of content we may need to get.

From the Blackboard site, download the data.zip file and uncomprtess it in the same folder as this notebook. You are using RStudio as a project, right? It should decompress into a folder called data and if you look into it it will have the following data files.

## character(0)

Some of these are raw text files (the .csv files), some are R data (the .rda), some are netCDF files (the .nc) and the rest are GeoTiff files (the .tif ones).

To load in the data, we first load in the raster library1.


Then we can load in one of these files using the raster() function and passing it the path to the file relative to this document.

elev <- raster("../data/alt_22.tif")
## class      : RasterLayer 
## dimensions : 3600, 3600, 12960000  (nrow, ncol, ncell)
## resolution : 0.008333333, 0.008333333  (x, y)
## extent     : -120, -90, 0, 30  (xmin, xmax, ymin, ymax)
## crs        : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0 
## source     : /Users/rodney/Desktop/Dyerlab-Webpage/content/post/R/data/alt_22.tif 
## names      : alt_22 
## values     : -202, 5469  (min, max)

Take a minute and look at the raster funciton help file.


For an object like a raster we can identify some basic commands they understand and know how to do.

methods( class="raster" )
## [1] [         [<-       anyNA     as.matrix as.raster is.na     Ops      
## [8] plot      print    
## see '?methods' for accessing help and source code

So, it is clear that we can plot the raster using the normal plotting commands (don’t worry, ggplot comes into play in a little bit).

plot( elev, xlab="Longitude", ylab="Latitude", col)

Congratulations!!!!! You did it.

Cropping Rasters

That is a rather large extent we are dealing with and for the purpose of our needs, we are going to only use a subset of that big raster. To do this, we need to define an extent (e.g., the bounding box in terms of minimum and maximum longitude and latitude that we want) and then create a new raster that is a cut-out of the previous one.


Let’s make an extent that contains just the Baja California section.

e <- extent( -116, -108, 22, 30)
## class      : Extent 
## xmin       : -116 
## xmax       : -108 
## ymin       : 22 
## ymax       : 30

We can do this by eyeballing it, or use an interactive plotting on the raster (and the old mouse) to get some points. There is a great function for this called click().


And can be used like the code below. This is an interactive thing, so I am not going to run the chunk below (notice the eval=FALSE code in the first line of the chunk), do it IRL and you will see the output of the x, y, and the value of the cell you are clicking on printed out to the console.

click( elev, xy=TRUE, n=2 )

The click() function can create points, line, polygons, have them plotted, or whatever you like. Quite helpful. But for our purposes, we just need it to identify the extent of Baja California and then we will use that to crop() the original raster.

baja_elev <- crop( elev, e )
## class      : RasterLayer 
## dimensions : 960, 960, 921600  (nrow, ncol, ncell)
## resolution : 0.008333333, 0.008333333  (x, y)
## extent     : -116, -108, 22, 30  (xmin, xmax, ymin, ymax)
## crs        : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0 
## source     : memory
## names      : alt_22 
## values     : -202, 2774  (min, max)

Which now looks like:

plot( baja_elev, 
      main = "Elevation for Baja California",
      col = terrain.colors( 50 ) )

Now that we have the elevation for our target region, we can delete the big raster from our memory.

rm( elev )

GGPlot and Rasters

As you probably can guess, ggplot has the ability to do this kind of work as well.


We are going to use our way cool piping function %>% to:
1. load in the raster,
2. crop it to the extent we want
3. turn it into a data.frame object

raster("../data/alt_22.tif") %>%
  crop( extent( -116, -108, 22, 30) ) %>%
  rasterToPoints() %>%
  data.frame() -> baja_df

summary( baja_df )
##        x                y             alt_22      
##  Min.   :-115.8   Min.   :22.88   Min.   :-202.0  
##  1st Qu.:-112.3   1st Qu.:26.64   1st Qu.:  96.0  
##  Median :-110.6   Median :27.93   Median : 315.0  
##  Mean   :-110.9   Mean   :27.65   Mean   : 501.9  
##  3rd Qu.:-109.2   3rd Qu.:29.05   3rd Qu.: 683.0  
##  Max.   :-108.0   Max.   :30.00   Max.   :2774.0

What this did was make a large data.frame object that we will use for plotting. It is rather large.

dim( baja_df )
## [1] 318933      3

Now, we can plot it using the geom_tile() or it’s identical twin (that is a bit faster in execution) geom_raster() functions.

baja_df %>%
  ggplot( aes(x,y,fill=alt_22) ) + 
  geom_raster() + 

We can adjust the color ramp to be more approriate as:

baja_df %>%
  ggplot( aes(x,y,fill=alt_22) ) + 
  geom_raster() + 
  coord_equal() + 
  scale_fill_gradientn( name="Elevation (m)", 
                        colours = terrain.colors(100)) + 
  theme_bw() + 
  xlab("Longitude") + ylab("Latitude")

Extracting Data From Rasters

OK, it is all nice an good that we can make, load, manipulate, and plot rasters, what about extracting raw data from them?

To extract data at points, we need to identify the points.

Cities <- c("Cabo San Lucas", "La Pax", "Loreto", "San Javier")
Longitude <- c(-109.916, -110.310, -111.343, -111.544)
Latitude <- c(22.890, 25.144, 26.012, 25.860 )

cities_df <- data.frame(Cities, Longitude, Latitude)

summary( cities_df )
##     Cities            Longitude         Latitude    
##  Length:4           Min.   :-111.5   Min.   :22.89  
##  Class :character   1st Qu.:-111.4   1st Qu.:24.58  
##  Mode  :character   Median :-110.8   Median :25.50  
##                     Mean   :-110.8   Mean   :24.98  
##                     3rd Qu.:-110.2   3rd Qu.:25.90  
##                     Max.   :-109.9   Max.   :26.01

Now let’s plot this.

library( ggrepel )
baja_df %>%
  ggplot( ) + 
  geom_raster( aes(x,y,fill=alt_22) ) + 
  geom_label_repel( aes(x = Longitude, 
                        y = Latitude, 
                        label = Cities), 
                    data=cities_df) + 
  coord_equal() + 
  scale_fill_gradientn( name="Elevation (m)", 
                        colours = terrain.colors(100)) + 
  theme_bw() + 
  xlab("Longitude") + ylab("Latitude")  

To extract the data, we need to identify the items listed in the data frame as representing “Longitude” and “Latitude” objects and make it a type of data that can be

city_pts <- SpatialPoints( cbind(Longitude, Latitude) )
## class       : SpatialPoints 
## features    : 4 
## extent      : -111.544, -109.916, 22.89, 26.012  (xmin, xmax, ymin, ymax)
## crs         : NA

Next AND THIS IS EXTREMELY IMPORTANT, we need to specify the Coordinate Reference System for these data. Notice how ours is missing,

crs( city_pts )
## CRS arguments: NA

What we want to do is make it the same as the raster we will be extracting data from.

crs( baja_elev )
## CRS arguments:
##  +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0


crs( city_pts ) <- crs( baja_elev )
## class       : SpatialPoints 
## features    : 4 
## extent      : -111.544, -109.916, 22.89, 26.012  (xmin, xmax, ymin, ymax)
## crs         : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0

We will come back to reprojecting points, lines, and polygons later.

Now we can do the extraction.

raster::extract( baja_elev, city_pts )
## [1]  11  NA   7 409

Why might that second point be missing?

  1. If you do not have it installed, just type install.packages("raster") and it will be installed on your machine

Rodney J. Dyer

Research interests include applied population and landscape genetics, contemporary gene flow, and examining the effects anthropogenic landscape modifications have on extant plant and animal species persistence.