The funbiogeo package requires that information be structured in three different datasets:

  • the species x traits data.frame (species_traits in funbiogeo), which contains trait values for several traits (in columns) for several species (in rows).
  • the site x species data.frame (site_species in funbiogeo), which contains the presence/absence, abundance, or cover information for species (in columns) by sites (in rows).
  • the site x locations object (site_locations in funbiogeo), which contains the physical locations of the sites of interest.

Optionally, an additional dataset can be provided:

  • a species x categories data.frame (species_categories in funbiogeo), which contains two columns: one for species and one for a categorization of species (taxonomic classes, specific diets, or any arbitrary classification).

Wide vs long format

funbiogeo expects these datasets in a wide format (where one row holds several variables spread across columns), but data are often stored in a long format (one observation per row, also called tidy format).

For instance, the following dataset illustrates the wide format (the presence/absence of all species is spread across columns).

Wide format dataset (used in funbiogeo)
site  species_1  species_2  species_3  species_4
A     1          0          1          1
B     0          0          1          1
C     1          1          1          0

The following dataset illustrates the long format (the column species contains the name of the species and the column occurrence contains the presence/absence of species).

Long format dataset
site  species    occurrence
A     species_1  1
B     species_1  0
C     species_1  1
A     species_2  0
B     species_2  0
C     species_2  1
A     species_3  1
B     species_3  1
C     species_3  1
A     species_4  1
B     species_4  1
C     species_4  0
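
To make the relationship between the two layouts concrete, here is a minimal base R sketch that rebuilds the wide table from the long one. It only illustrates the reshaping idea and is not how funbiogeo does it internally; the object names (long_data, wide_matrix, wide_data) are chosen for this example.

```r
# Toy long-format data, copied from the long table above
long_data <- data.frame(
  site       = rep(c("A", "B", "C"), times = 4),
  species    = rep(paste0("species_", 1:4), each = 3),
  occurrence = c(1, 0, 1,  0, 0, 1,  1, 1, 1,  1, 1, 0)
)

# Cross-tabulate: sites in rows, species in columns
wide_matrix <- tapply(long_data$occurrence,
                      list(long_data$site, long_data$species),
                      FUN = sum)

# Convert the matrix to a data.frame with an explicit 'site' column
wide_data <- data.frame(site = rownames(wide_matrix), wide_matrix,
                        row.names = NULL, check.names = FALSE)

wide_data
```

Each site x species pair appears exactly once in the long table, so summing within each cell simply carries the original value over.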

The fb_format_*() functions

If your data are not already split into these wide datasets, you can use the fb_format_*() functions to create them from a long format dataset.

All these functions take a long dataset as input (argument data), where one row corresponds to the occurrence/abundance/coverage of one species at one site, and return a wide object.

Usage

funbiogeo provides a small excerpt of long format data to show how to use the functions. This data sits at system.file("extdata", "raw_mammals_data.csv", package = "funbiogeo").

Let’s import the long format dataset provided by funbiogeo:

# Define the path to long format dataset ----
file_name <- system.file("extdata", "raw_mammals_data.csv", package = "funbiogeo")


# Read the file ----
all_data <- read.csv(file_name)
Long table example
species order site longitude latitude count adult_body_mass gestation_length litter_size max_longevity sexual_maturity_age diet_breadth
sp_001 Cetartiodactyla fb_103 7.27182 59.09736 1 461900.76 235.00 1.25 324 668.20 1
sp_001 Cetartiodactyla fb_1001 20.77182 52.59736 1 461900.76 235.00 1.25 324 668.20 1
sp_001 Cetartiodactyla fb_102 6.77182 59.09736 1 461900.76 235.00 1.25 324 668.20 1
sp_001 Cetartiodactyla fb_104 7.77182 59.09736 1 461900.76 235.00 1.25 324 668.20 1
sp_001 Cetartiodactyla fb_101 6.27182 59.09736 1 461900.76 235.00 1.25 324 668.20 1
sp_001 Cetartiodactyla fb_1000 20.27182 52.59736 1 461900.76 235.00 1.25 324 668.20 1
sp_001 Cetartiodactyla fb_1002 21.27182 52.59736 1 461900.76 235.00 1.25 324 668.20 1
sp_002 Rodentia fb_1000 20.27182 52.59736 1 21.11 19.89 5.64 48 76.04 NA
sp_002 Rodentia fb_1002 21.27182 52.59736 1 21.11 19.89 5.64 48 76.04 NA
sp_002 Rodentia fb_1001 20.77182 52.59736 1 21.11 19.89 5.64 48 76.04 NA

Extracting species x traits data

The function fb_format_species_traits() extracts species trait values from this long table to create the species x traits dataset. Note that each species must have a single value for each trait (no trait variation across sites is allowed).

# Extract species x traits data ----
species_traits <- fb_format_species_traits(
  data    = all_data, 
  species = "species", 
  traits  = c("adult_body_mass", "gestation_length", "litter_size",
              "max_longevity", "sexual_maturity_age", "diet_breadth")
)

# Preview ----
head(species_traits, 10)
#>    species adult_body_mass gestation_length litter_size max_longevity
#> 1   sp_001       461900.76           235.00        1.25         324.0
#> 2   sp_002           21.11            19.89        5.64          48.0
#> 3   sp_005           31.60            24.50        4.94          48.0
#> 4   sp_006           21.90            23.68        5.16          52.8
#> 5   sp_010            8.31               NA        1.73         252.0
#> 6   sp_013        31756.51            63.50        4.98         354.0
#> 7   sp_016        22502.01           196.00        1.79         204.0
#> 8   sp_017       240867.13           235.61        1.09         321.6
#> 9   sp_022            9.89            29.00        4.04          38.4
#> 10  sp_026        57224.61           230.00        1.00         300.0
#>    sexual_maturity_age diet_breadth
#> 1               668.20            1
#> 2                76.04           NA
#> 3                43.27           NA
#> 4                57.93            4
#> 5                   NA            1
#> 6               679.37            1
#> 7               400.97           NA
#> 8               659.91            5
#> 9                66.88            2
#> 10              543.28            2
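
The requirement that each species has a single value per trait can be checked before formatting. The sketch below uses base R on a small made-up table (toy, with hypothetical species and values); how fb_format_species_traits() itself reacts to conflicting values is not covered here.

```r
# Hypothetical long table: sp_b has two different body masses
toy <- data.frame(
  species         = c("sp_a", "sp_a", "sp_b", "sp_b"),
  site            = c("s1", "s2", "s1", "s2"),
  adult_body_mass = c(10, 10, 5, 7)
)

# Keep distinct species x trait value combinations
unique_pairs <- unique(toy[, c("species", "adult_body_mass")])

# Count how many distinct values each species has
n_values <- table(unique_pairs$species)

# Species with more than one distinct value for this trait
conflicting <- names(n_values)[n_values > 1]
conflicting
```

Any species listed in conflicting would need to be resolved (for example by averaging or correcting the records) before building the species x traits dataset.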

Extracting site x species data

The function fb_format_site_species() extracts species occurrence/abundance/coverage from this long table to create the site x species dataset. Note that each species must be recorded only once per site (funbiogeo does not yet handle temporal surveys).

# Format site x species data ----
site_species <- fb_format_site_species(data       = all_data, 
                                       site       = "site", 
                                       species    = "species", 
                                       value      = "count",
                                       na_to_zero = TRUE
)

# Preview ----
head(site_species[ , 1:8], 10)
#>       site sp_001 sp_002 sp_005 sp_006 sp_010 sp_013 sp_016
#> 1   fb_103      1      0      0      1      0      1      1
#> 2  fb_1001      1      1      1      1      1      1      1
#> 3   fb_102      1      0      0      1      0      1      1
#> 4   fb_104      1      0      0      1      0      1      1
#> 5   fb_101      1      0      0      1      0      1      1
#> 6  fb_1000      1      1      1      1      1      1      1
#> 7  fb_1002      1      1      1      1      1      1      1
#> 8  fb_1022      0      0      1      1      1      0      1
#> 9  fb_1018      0      0      1      1      1      0      0
#> 10 fb_1024      0      0      1      1      1      0      1
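
Since each species should appear only once per site, it can help to look for repeated site x species combinations first. This is a base R sketch on made-up data (toy and its values are hypothetical), not a funbiogeo function.

```r
# Hypothetical long table: species sp_a is recorded twice at site s1
toy <- data.frame(
  site    = c("s1", "s1", "s2", "s1"),
  species = c("sp_a", "sp_b", "sp_a", "sp_a"),
  count   = c(1, 2, 1, 3)
)

# Flag every row whose site x species combination occurs more than once
keys      <- toy[, c("site", "species")]
is_repeat <- duplicated(keys) | duplicated(keys, fromLast = TRUE)

# Rows to inspect before formatting
repeated <- toy[is_repeat, ]
repeated
```

Combining duplicated() with its fromLast = TRUE counterpart returns all copies of a repeated combination rather than only the later ones.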

Extracting site x locations data

The function fb_format_site_locations() extracts site coordinates from this long table to create the site x locations dataset. Note that each site must have a single pair of longitude and latitude values.

# Format site x locations data ----
site_locations <- fb_format_site_locations(data       = all_data, 
                                           site       = "site", 
                                           longitude  = "longitude", 
                                           latitude   = "latitude",
                                           na_rm      = FALSE)

# Preview ----
head(site_locations)
#> Simple feature collection with 6 features and 1 field
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: 52.59736 ymin: 6.271821 xmax: 59.09736 ymax: 20.77182
#> Geodetic CRS:  WGS 84
#>      site                  geometry
#> 1  fb_103 POINT (59.09736 7.271821)
#> 2 fb_1001 POINT (52.59736 20.77182)
#> 3  fb_102 POINT (59.09736 6.771821)
#> 4  fb_104 POINT (59.09736 7.771821)
#> 5  fb_101 POINT (59.09736 6.271821)
#> 6 fb_1000 POINT (52.59736 20.27182)
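
Sites with more than one pair of coordinates can be spotted in the same spirit. This is again a base R sketch on made-up data (toy and its coordinates are hypothetical).

```r
# Hypothetical long table: site s2 has two different longitudes
toy <- data.frame(
  site      = c("s1", "s1", "s2", "s2"),
  longitude = c(7.27, 7.27, 20.27, 20.77),
  latitude  = c(59.10, 59.10, 52.60, 52.60)
)

# Keep distinct site x coordinates combinations
coords <- unique(toy[, c("site", "longitude", "latitude")])

# Sites that appear with more than one coordinate pair
n_coords        <- table(coords$site)
ambiguous_sites <- names(n_coords)[n_coords > 1]
ambiguous_sites
```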

Extracting species x categories data

The function fb_format_species_categories() extracts the values of one (optional) supra-category for each species from this long table to create the species x categories dataset. This category (e.g. order, family, endemism status, conservation status, etc.) can later be used by several functions in funbiogeo to aggregate metrics at this level.

# Extract species x categories data ----
species_categories <- fb_format_species_categories(data     = all_data, 
                                                   species  = "species",
                                                   category = "order"
)

# Preview ----
head(species_categories, 10)
#>     species           order
#> 1    sp_001 Cetartiodactyla
#> 8    sp_002        Rodentia
#> 11   sp_005        Rodentia
#> 27   sp_006        Rodentia
#> 59   sp_010      Chiroptera
#> 81   sp_013       Carnivora
#> 89   sp_016 Cetartiodactyla
#> 113  sp_017 Cetartiodactyla
#> 132  sp_022    Eulipotyphla
#> 138  sp_026 Cetartiodactyla
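
As a rough illustration of what grouping at the category level means, the base R sketch below counts and lists species per order. The toy_categories object is a hand-typed copy of the first five rows of the preview above; this is not one of the funbiogeo aggregation functions.

```r
# First five rows of the preview above, typed by hand
toy_categories <- data.frame(
  species = c("sp_001", "sp_002", "sp_005", "sp_006", "sp_010"),
  order   = c("Cetartiodactyla", "Rodentia", "Rodentia", "Rodentia",
              "Chiroptera")
)

# Number of species in each order
n_by_order <- table(toy_categories$order)

# Species names grouped by order
by_order <- split(toy_categories$species, toy_categories$order)
by_order
```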

Once your data are in the right format, you can get started with funbiogeo.