Polygons

ImportantSummary

This tutorial explore how to handle spatial polygons in R with terra package:

  • read a spatial object with (terra::vect())
  • calculate length of lines with (terra::perim())
  • calculate distances between points and lines with (terra::distance())
TipThe ecologist mind

In which type of habitats were the otter observed? To answer this question, we will need to discover another type of spatial vector: polygons.

Setup

Follow the setup instructions if you haven’t followed the tutorial on points

If haven’t done it already, please follow the setup instructions. Let’s start with loading the required packages.

suppressPackageStartupMessages({
  library(mapview)
  library(here)
  library(sf)
  library(terra)
})
pt_otter <- vect(here("data", "gbif_otter_2021_mpl50km.gpkg"))
pt_otter <- vect(
  "https://github.com/FRBCesab/spatial-r/raw/main/data/gbif_otter_2021_mpl50km.gpkg"
)

Load polygons from a shapefile

In this example, we will load land cover information for the area of interest from IGN data BD CARTO.

Note that this dataset has rough resolution (OSO or Corine land cover would be more suited for real analysis), but it’s perfect for our illustration and learning purposes.

You can load all vector dataset in R with the function terra::vect().

landuse <- vect(here("data", "BDCARTO-LULC_mpl50km.shp"))
landuse_sf <- st_read(here("data", "BDCARTO-LULC_mpl50km.shp"))
Reading layer `BDCARTO-LULC_mpl50km' from data source 
  `/home/romain/GitHub/spatial-r/data/BDCARTO-LULC_mpl50km.shp' 
  using driver `ESRI Shapefile'
Simple feature collection with 2910 features and 3 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 3.18983 ymin: 43.20373 xmax: 4.56447 ymax: 44.16754
Geodetic CRS:  WGS 84
landuse <- vect(
  "https://github.com/FRBCesab/spatial-r/raw/refs/heads/main/data/BDCARTO-LULC_mpl50km.shp"
)
NoteYour turn
  1. How many different polygons make the land cover of the area?
  2. What is the coordinate reference system (CRS) of the loaded river data?
  3. How many land cover classes are there? which class has most polygons?
Click to see the answer
  1. There was 2910 polygons in the dataset. You can access it with dim(landuse), nrow(landuse) or just by typing landuse in the console.
  2. The coordinates are in WGS 84 (EPSG 4326). You can access this information with crs(landuse, describe = TRUE) (or in sf with st_crs(landuse_sf)).
  3. The column that stored the cover classes is nature. You could identify it with head(landuse) or names(landuse). There are 12 land cover classes, Prairie is the class with most polygons (table(landuse$nature)).

Visualization

mapview(landuse, z = "nature") +
  mapview(pt_otter, col.regions = "black")
plot(landuse, y = "nature", border = NA)
plot(pt_otter, add = TRUE)

plot(landuse_sf["nature"], reset = FALSE, border = FALSE, axes = TRUE)
plot(pt_otter, add = TRUE, pch = 16)

Warning

If you want to overlay multiple spatial object with base plot from sf, don’t forget to use the argument reset=FALSE.

Calculate the area

The function expanse calculate the area in \(m^2\). When calculating areas, be careful with projection systems. Some are not suited to calculate areas. Prefer equalarea projections or use local projection system (if your study area is small).

The package terra recommends the calculation of areas in lat/long to get more accurate results (accounting for Earth’s curvature). The package sf handle lat/long with s2 which is not very accurate or prone to errors.

So it is recommended to use terra when dealing with geographic coordinates, or projecting the coordinates in an appropriate projection system.

# calculate the area in ha
area_polygons <- expanse(landuse) * 0.0001
# store the area as atribute
landuse$area_ha <- as.numeric(area_polygons)

For sf, it is recommended to project the dataset to a local projection, such as the Universal Transverse Mercator defined for Europe (EPSG:25832).

# 25832 is a commun projection for France
landuse_25832 <- st_transform(landuse_sf, crs = 25832)
# calculate area of polygons
area_polygons <- st_area(landuse_25832) * 0.0001
# store as attribute in landuse
landuse_25832$area_ha <- as.numeric(area_polygons)
NoteYour turn
  1. Which is the largest land cover class in our study area? Why?
  2. Remove the Mediterranean from the land cover dataset (hint: it is the largest polygon).
  3. Calculate the percentage of terrestrial land cover in the study area.
Click to see the answer
# see area per land use classes
tapply(landuse$area_ha, landuse$nature, sum)
              Bâti       Broussailles Carrière, décharge          Eau libre 
       55654.39730        97923.34835         1916.22618       272561.76875 
             Forêt      Marais salant  Marais, tourbière            Prairie 
      278134.18692         3455.79520        29467.47117       206833.73332 
   Rocher, éboulis     Sable, gravier      Vigne, verger   Zone d'activités 
          37.78454         2353.14638       206514.68518        15224.16350 
Click to see the answer 2
# the largest polygon correspond to the mediterranean
landuse$nature[which.max(landuse$area_ha)]
[1] "Eau libre"
# remove the largest polygon
landuse_nomed <- landuse[-which.max(landuse$area_ha), ]
# verify the operation
plot(landuse_nomed, y = "nature", border = NA)

Click to see the answer 3
cover_ha <- tapply(landuse_nomed$area_ha, landuse_nomed$nature, sum)
# percentage of land cover classes in the terrestrial study area
cover_perc <- cover_ha / sum(cover_ha) * 100
#show rounded values
round(cover_perc, 2)
              Bâti       Broussailles Carrière, décharge          Eau libre 
              5.98              10.52               0.21               3.61 
             Forêt      Marais salant  Marais, tourbière            Prairie 
             29.87               0.37               3.16              22.21 
   Rocher, éboulis     Sable, gravier      Vigne, verger   Zone d'activités 
              0.00               0.25              22.18               1.64 

Points to polygons

Before making extraction, it is recommended to plot the data (if not too big), to make sure the projection systems are the same and the extents match. Do not use mapview (interactive map) because it will automatically project the data.

Extract the land cover class of the points with terra::extract():

pt_landcover <- extract(landuse, pt_otter)

Extract the land cover class of the points with sf::st_join():

pt_landcover <- st_join(st_as_sf(pt_otter), landuse_sf)
# see area per land use classes of the observations
table(pt_landcover$nature)

             Bâti      Broussailles         Eau libre             Forêt 
                8                17                 1                29 
Marais, tourbière           Prairie     Vigne, verger 
                3                16                 9 
NoteYour turn
  • [stat] Which classes are over or under represented compare to expected? (to complex?)
Click to see the answer
# see area per land use classes of the observations
cover_obs_perc <- table(pt_landcover$nature) / nrow(pt_landcover) * 100
# add missing classes
m0 <- match(names(cover_perc), names(cover_obs_perc))
cover <- data.frame(
  class = names(cover_perc),
  all = as.numeric(cover_perc),
  obs = as.numeric(cover_obs_perc[m0])
)
cover[is.na(cover)] <- 0
cover$delta <- cover$obs - cover$all
barplot(cover$delta, horiz = TRUE, names = cover$class, las = 1)

Polygons to polygons

In many cases, we want to characterize the area surrounding the observations. So we will create buffers.

Create buffer

We can create buffer with terra::buffer().

dist_buffer <- 1000 # buffer of 1000 m
poly_otter <- buffer(pt_otter, dist_buffer)

Same here, sf uses s2 with geographic coordinates (e.g. EPSG:4326) which creates buffers with low quality (as in sf 1.0-21 on 03/11/2025).
So it is recommended to use projected coordinates with sf::st_buffer().

dist_buffer <- 1000 # buffer of 1000 m
# transform as UTM : better for area
pt_otter_25832 <- st_as_sf(pt_otter) |> st_transform(25832)
poly_otter_25832 <- st_buffer(pt_otter_25832, dist_buffer)

Visualize buffer

plot(poly_otter[1])
plot(pt_otter[1], add = TRUE)

plot(st_geometry(poly_otter_25832)[1], reset = FALSE, axes = TRUE)
plot(st_geometry(pt_otter_25832)[1], add = TRUE)

NoteYour turn
  • What are the areas of the newly created buffers?
Click to see the answer
# see area per land use classes of the observations
summary(expanse(poly_otter))
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
3128689 3128689 3128689 3128689 3128689 3128689 
# for sf
summary(st_area(poly_otter_25832))
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
3140157 3140157 3140157 3140157 3140157 3140157 
The difference with 3.1415927\times 10^{6} is due to the way the circle is defined (in only 40 points in terra and 120 points in sf). But what is more important is that all buffers have the same area.

Intersection

buffer_landcover <- intersect(landuse, poly_otter)

# visualize the intersection (not plotted to lighten the webpage)
# mapview(buffer_landcover, z = "nature")

# show a zoom on the land cover of the first buffer
plot(
  buffer_landcover[buffer_landcover$key %in% poly_otter$key[1]],
  y = "nature"
)

buffer_landcover_sf <- st_intersection(poly_otter_25832, landuse_25832)
Warning: attribute variables are assumed to be spatially constant throughout
all geometries
# visualize the intersection (not plotted to lighten the webpage)
# mapview(buffer_landcover, z = "nature")

Land cover per buffer

We need to recalculate the area of the intersected land cover.

buffer_landcover$area_ha <- expanse(buffer_landcover) * 0.0001

# Calculate area per buffer and nature class
class_area <- tapply(
  buffer_landcover$area_ha,
  list(buffer_landcover$key, buffer_landcover$nature),
  sum
)
#replace NA by 0
class_area[is.na(class_area)] <- 0
# calculate the overlay area
sum_area <- rowSums(class_area)
# calculate the percentage per class
perc_class <- class_area / sum_area * 100

barplot(apply(perc_class, 2, mean), las = 2)