```
library(targets)
library(ggplot2)
library(tidyverse)
library(tidybayes)
```

The Ricker model is a discrete model of population growth. Let’s get to know this model with a few extensions, and fit a demo of it in Stan.

## Simulating the Ricker model

The model in its simplest form looks like this:

\[ N_{t+1} = N_te^{r\left(1 - \frac{N_t}{K}\right)} \] You can see that there is density dependence: when \(N_t << K\), the population grows by a factor of \(e^r\) every time step. When \(N_t = K\), the population hits carrying capacity and the population grows by a factor of 1 : e.g. no change.

To start to draw this function I’ll assign values to the parameters and simulate a curve:

```
<- 0.05
r <- 4
c <- 5
N0 <- 500
K
<- 300
time
<- numeric(time)
N
1] <- N0
N[
for (t in 2:time){
<- rnorm(1, mean = 0, sd = 1)
ri # print(ri)
<- N[t-1]*exp(r * (1 - N[t-1]/K))
N[t]
}
plot(N, type = "l")
```

## Extending it: uncertainty and Allee effects

Now let’s add some uncertainty every time step! This is a kind of process error: growth rates bounce around randomly from one year to the next, perhaps because of unmodelled processes that affect the average growth rate in the population:

\[ \begin{align} N_{t+1} &= N_te^{r_t\left(1 - \frac{N_t}{K}\right)} \\ \log(r_t) &\sim \text{Normal}(\mu,\sigma) \end{align} \] Here I’m using a log link on \(r_t\) because I want to be able to give \(r_t\) a mean and a variance without worrying about the growth rate flipping sign and suddenly becoming negative.

The second modification I want to make is to add Allee effects to the model. This is another kind of density dependence, but at the other extreme of density: instead of the population getting so *large* that it grows ever more slowly, it is also possible for the density to be so *small* that it grows slowly. Its the same kind of thing, and we add it in the same kind of way:

\[ \begin{align} N_{t+1} &= N_te^{r_t\left(1 - \frac{N_t}{K}\right)\left(1 - \frac{C}{N_t}\right)} \\ \log(r_t) &\sim \text{Normal}(\mu,\sigma) \end{align} \]

Now we’re ready to simulate it all again, and this time I want to wrap it in a function for easy repetition later:

```
<- function(
sim_ricker_allee r = 0.05,
C = 9,#44
N0 = 7,
K = 500,
time = 300,
sd_process = 1.2){
<- numeric(time)
N
1] <- N0
N[
for (t in 2:time){
<- rnorm(1, mean = 0, sd = sd_process)
ri # print(ri)
<- N[t-1]*exp(exp(log(r) + ri) * (1 - N[t-1]/K) * (1 - C/N[t-1]))
N[t]
}
return(N)
}
```

We can repeat it many times using a sprinkle of syntatic sugar thanks to the `purrr`

package^{1}:

```
<- map_df(1:50, ~ tibble(
allee_sim N = sim_ricker_allee(
time = 300,C = 6, N0 = 7),
time = 1:300,
7),
.id = "sim") |>
bind_cols(type = "allee")
<- map_df(1:50, ~ tibble(
no_allee_sim N = sim_ricker_allee(
time = 300,
C = 0,
N0 = 7),
time = 1:300,
7),
.id = "sim") |>
bind_cols(type = "no allee")
<- bind_rows(allee_sim, no_allee_sim)
two_sims
|>
two_sims ggplot(aes(x = time, y = N, group = sim)) +
geom_line() +
coord_cartesian(ylim = c(0, 1000)) +
facet_wrap(~type)
```

```
|>
two_sims ggplot(aes(x = time, y = N, group = sim)) +
geom_line() +
facet_wrap(~type) +
coord_cartesian(xlim = c(0, 50), ylim = c(0, 100))
```

and if we look at these another way we can see the difference:

```
<- two_sims |>
deltaN_data filter(type == "no allee") |>
group_by(sim) |>
mutate(deltaN = log(N/lag(N))) |>
drop_na(deltaN)
|>
deltaN_data ggplot(aes(x = N, y = deltaN)) +
geom_point() +
coord_cartesian(ylim = c(-10, 5), xlim = c(0, 600))
```

```
|>
deltaN_data ggplot(aes(x = N, y = deltaN)) +
geom_point() +
coord_cartesian(ylim = c(-.0,1), xlim = c(0, 600))
```

This really doesn’t seem right! The y intercept is supposed to be \(r\), and the X intercept is supposed to be \(K\)! Is the cause the variation in growth rate? that is, the parameter `sd_process`

above?

Let’s make a new simulation where this is set to a very low value: :swea

```
map_df(1:50, ~ tibble(
N = sim_ricker_allee(
time = 300,
C = 0,
N0 = 7,
sd_process = 0.01),
time = 1:300,
7),
.id = "sim") |>
group_by(sim) |>
mutate(deltaN = log(N/lag(N))) |>
drop_na(deltaN) |>
ggplot(aes(x = N, y = deltaN)) +
geom_point() +
coord_cartesian(ylim = c(0, .1), xlim = c(0, 600)) +
geom_hline(yintercept = .05) +
geom_vline(xintercept = 500)
```

Much better! so the process error makes this pretty traditional plot go kind of haywire.

Interestingly, looking back at the previous figures, you can see that the error is not *around* the “correct” line at all but mostly below it. That suggests that trying to model error based on lagged growth is probably not going to give a useful answer for the parameters

## Observation error

So far all of this has been in a perfect, imaginary world where we have perfect information on the population density. In reality, we’ll always have a **sample** of the population density. A simple model for this variation is that it follows a Poisson distribution:

\[ \begin{align} Y_t &\sim \text{Poisson}(N_t)\\ N_{t+1} &= N_t e^{r\left(1 - \frac{N_t}{K}\right)} \\ \end{align} \]

Lets do a simulation of several observations of *one single* time series:

```
set.seed(1618)
<- sim_ricker_allee(C = 5, N0 = 7, time = 120)
avg_dens
<- avg_dens |>
obs_dens imap(
~tibble(
time_id = as.numeric(.y),
obs = rpois(5, lambda = .x),
obs_id = seq_along(obs)
)
|>
) bind_rows()
|>
obs_dens ggplot(aes(x = time_id, y = obs,group = obs_id)) +
geom_line() +
geom_line(aes(x = time_id, y = avg), inherit.aes = FALSE,
col = "red",
data = tibble(
time_id = seq_along(avg_dens),
avg = avg_dens))
```

This shows a few things of interest: the wiggling red line, which shows variation in growth rate at each timestep. This is process error. We can also see variation around this; these is variation coming from a Poisson distribution centered on the true mean population size.

## Coding and validating a model

We’re going to torture the math a little bit, to make it more convenient to write it Stan:

Take the expression for the average and log both sides:

\[ \begin{align} N_{t+1} &= N_te^{r\left(1 - \frac{N_t}{K}\right)} \\ \ln(N_{t+1}) &= \ln{N_t} + r(1 - \frac{N_t}{K}) \\ L_{t+1} &= L_t + e^{\ln{r} + \ln(1 - e^{L_t - \ln{K}})} \\ L_{t+1} &= L_t + e^{s + \ln(1 - e^{L_t - J})} \\ \end{align} \] Here to keep notation simple I’m just writing

- \(s = \ln{r}\)
- \(J = \ln{K}\)
- \(Lt = \ln{N_t}\)

I know what you’re thinking: get help, Andrew!

There’s a couple reasons for this violence:

- Working on the log scale is easier for parameter estimation. it keeps values on similar scales, even though, for example \(r\) and \(K\) have very different magnitudes.
- We can take advantage of Stan’s built-in composed functions
- it will be easier to add hierarchical effects if (when :P ) we want to do that!

Here’s a more complete rendering of the model, which will set us up for writing Stan code in the next section:

\[ \begin{align} Y_i &\sim \text{Poisson\_log}(L_{t[i]}) \\ L_{t+1} &= L_t + e^{\left(s + \ln\left(1 - e^{L_t - J}\right)\right)} \\ L_0 &\sim \text{Normal}(2,1) \\ s &\sim \text{Normal}(-3, 0.5) \\ J &\sim \text{Normal}(6, 1) \end{align} \]

## Footnotes

I’m old enough to remember when we did everything with for loops and the *apply family and I just hope the youth are grateful for the advances they have now!↩︎