---
title: "sdmpredictors quickstart guide"
author: "Samuel Bosch"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{sdmpredictors quickstart guide}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

The goal of sdmpredictors is to make environmental data commonly used for species distribution modelling (SDM), also called ecological niche modelling (ENM) or habitat suitability modelling, easy to use in R. It contains methods for getting metadata about the available environmental data for the current climate as well as for future and paleo climatic conditions, a way to download the rasters and load them into R, and some general statistics about the different layers.

## Getting the metadata

Different list_* functions are available to find out which datasets and environmental layers can be downloaded.

### list_datasets

With the *list_datasets* function you can view all the available datasets. If you only want terrestrial datasets, set the marine parameter to FALSE, and vice versa.

```{r, message=FALSE, warning=FALSE}
library(sdmpredictors)

# exploring the marine datasets
datasets <- list_datasets(terrestrial = FALSE, marine = TRUE)
```

```{r, echo = FALSE}
knitr::kable(datasets, row.names = FALSE)
```

### list_layers

Using the *list_layers* function we can view all layer information, filtered by datasets, terrestrial (TRUE/FALSE), marine (TRUE/FALSE) and/or whether the data should be monthly. The table below only shows the first 4 columns of the first 3 layers.

```{r}
# exploring the marine layers
layers <- list_layers(datasets)
```

```{r, echo = FALSE}
knitr::kable(layers[1:3,1:4], row.names = FALSE)
```

## Citing data

With the *dataset_citations* and *layer_citations* functions you can fetch plain text or bibentries for the datasets and layers used, allowing for proper citation of the data.

```{r}
# print the Bio-ORACLE citation
print(dataset_citations("Bio-ORACLE"))

# print the citation for WorldClim as Bibtex
print(lapply(dataset_citations("WorldClim", astext = FALSE), toBibtex))

# print the citation for a MARSPEC paleo layer
print(layer_citations("MS_biogeo02_aspect_NS_21kya"))
```

## Loading the data

### load_layers

To be able to use the layers you want in R you have to call the *load_layers* function with either rows of the table returned by *list_layers* or a vector of layer codes, as in the examples below.

```{r, eval = FALSE}
# download pH and Salinity to the temporary directory
load_layers(layers[layers$name %in% c("pH", "Salinity") &
                     layers$dataset_code == "Bio-ORACLE",],
            datadir = tempdir())

# set a default datadir, preferably something different from tempdir()
options(sdmpredictors_datadir = tempdir())

# (down)load specific layers
specific <- load_layers(c("BO_ph", "BO_salinity"))

# equal area data (Behrmann equal area projection)
equalarea <- load_layers("BO_ph", equalarea = TRUE)
```

## Loading future and paleo data

Similarly to the current climate layers, the available future and paleo layers can be listed with *list_layers_future* and *list_layers_paleo*.

```{r}
# exploring the available future marine layers
future <- list_layers_future(terrestrial = FALSE)

# available scenarios and years
unique(future$scenario)
unique(future$year)

# exploring the available paleo marine layers
paleo <- list_layers_paleo(terrestrial = FALSE)

# available epochs and models
unique(paleo$epoch)
unique(paleo$model_name)
```

Other functions related to layer metadata and to future and paleo layers are:

```{r}
# common metadata for current, future and paleo layer codes
get_layers_info(c("BO_calcite","BO_B1_2100_sstmax","MS_bathy_21kya"))$common

# function to get the equivalent future layer code for a current climate layer
get_future_layers(c("BO_sstmax", "BO_salinity"),
                  scenario = "B1", year = 2100)$layer_code

# function to get the equivalent paleo layer code for a current climate layer
get_paleo_layers(c("MS_bathy_5m", "MS_biogeo13_sst_mean_5m"),
                 model_name = c("21kya_geophysical", "21kya_ensemble_adjCCSM"),
                 years_ago = 21000)$layer_code
```

## Statistics

Two types of statistics are available for the current climate layers:

- individual layer statistics
    - summary statistics (minimum, q1, median, q3, maximum, mad, mean and sd)
    - spatial autocorrelation (Moran's I and Geary's C)
- Pearson correlation coefficient between the different layers and their quadratic terms

```{r, message=FALSE, warning=FALSE}
# looking up statistics and correlations for marine annual layers
datasets <- list_datasets(terrestrial = FALSE, marine = TRUE)
layers <- list_layers(datasets)

# filter out monthly layers
layers <- layers[is.na(layers$month),]

# statistics of the first two layers
layer_stats(layers)[1:2,]

correlations <- layers_correlation(layers)

# create groups of layers where no layers in one group
# have a correlation > 0.7 with a layer from another group
groups <- correlation_groups(correlations, max_correlation = 0.7)

# group lengths
sapply(groups, length)

# print the groups that contain more than one layer
for(group in groups) {
  if(length(group) > 1) {
    cat(paste(group, collapse = ", "))
    cat("\n")
  }
}

# plot correlations (requires ggplot2)
plot_correlation(correlations)
```
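
To make the grouping rule more concrete, the following base R sketch applies the same idea to a small, made-up correlation matrix: layers are linked whenever their absolute correlation exceeds the threshold, and each set of linked layers forms one group. The matrix values, the layer names and the helper `toy_correlation_groups` are hypothetical and only serve as an illustration; this is not necessarily how *correlation_groups* is implemented inside sdmpredictors.

```{r}
# Hypothetical correlation matrix for four made-up layers (not real data)
layer_names <- c("layer_a", "layer_b", "layer_c", "layer_d")
toy_cor <- matrix(c(1.0, 0.9, 0.1, 0.2,
                    0.9, 1.0, 0.3, 0.1,
                    0.1, 0.3, 1.0, 0.8,
                    0.2, 0.1, 0.8, 1.0),
                  nrow = 4, byrow = TRUE,
                  dimnames = list(layer_names, layer_names))

# Illustrative helper (not part of sdmpredictors): finds the connected
# components of the graph that links layers whose absolute correlation
# exceeds max_correlation
toy_correlation_groups <- function(cor_matrix, max_correlation = 0.7) {
  linked <- abs(cor_matrix) > max_correlation
  unassigned <- rownames(cor_matrix)
  groups <- list()
  while (length(unassigned) > 0) {
    group <- character(0)
    queue <- unassigned[1]
    # breadth-first walk over all layers reachable from the first one
    while (length(queue) > 0) {
      current <- queue[1]
      queue <- queue[-1]
      if (current %in% group) next
      group <- c(group, current)
      neighbours <- colnames(linked)[linked[current, ]]
      queue <- c(queue, setdiff(neighbours, group))
    }
    groups[[length(groups) + 1]] <- group
    unassigned <- setdiff(unassigned, group)
  }
  groups
}

toy_groups <- toy_correlation_groups(toy_cor, max_correlation = 0.7)
toy_groups
```

With this toy matrix, `layer_a` and `layer_b` end up in one group and `layer_c` and `layer_d` in another, because only those pairs are correlated above 0.7. Picking a single layer from each group is one way to arrive at a set of predictors whose pairwise correlations stay at or below the threshold.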