---
title: "Example #1: Calibration Protocol - Time-Averaged"
author: Julia L. Blanchard
date: July 2020
place: Hobart, Australia
#output: pdf_document
output: html_document
always_allow_html: true
#runtime: shiny
---

# Calibrating a multi-species model to time-averaged species' catches

In this example we will explore how to calibrate a size spectrum model to data using the "mizer" R package.

Recall, there are three different kinds of size spectrum models in mizer, of increasing complexity:

1) community model: purely size-based, representing a single "average" species across the whole community.

2) trait-based model: disaggregates the size spectrum into different groups with different life histories, through differences in each "species" asymptotic size, which determines other life-history parameters such as the size at maturity (Hartvig et al. 2011, Andersen & Pedersen, 2010).

3) multispecies model: has the same equations and parameters as the trait-based model but is parameterised to represent multiple species in a real system, where each species can have many differing species-specific traits (Blanchard et al. 2014).

Here we focus on multispecies size spectrum models. In practice, these models have been parametrised in a few different ways depending on data availability for a system or research questions.

Some studies have focused on many species-specific values, for example where each species has different values of life-history and size-selective feeding trait parameters (e.g. $\beta$ and $\sigma$) and details of species interactions (Blanchard et al. 2014, Reum et al. 208), to better capture the dynamics of marine food webs.

Others, such as Jacobsen et al. (2014, 2016), have represented variation in only a couple of the most important life-history parameters for each species - asymptotic size (which links to other parameters such as maturation size), growth, vulnerability and recruitment parameters ($R_{max}$, $erepro$) - to broadly capture fished communities or carry out cross-ecosystem comparisons.

Once you have parametrised the multispecies model for your system (section 1), you may find that species do not coexist or that the biomasses or catches are very different from your observations. After the model is parameterised and checked for basic principles and coexistence (section 3), further calibration to observational data is used to ensure the relative abundance of each species reflects the system (section 4), at least over a stable period for which the data are time-averaged.

The background resource parameters and the recruitment parameters, $R_{max}$ (maximum recruitment) and $erepro$ (reproductive efficiency), greatly affect the biomasses of species in your system.

The recruitment parameters capture those density dependent effects that are not explicitly modelled. Here, we use a Beverton-Holt type stock recruitment relationship to represent these effects.
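
As a minimal sketch of what this relationship does, the base R snippet below writes the Beverton-Holt form as $RDD = RDI / (1 + RDI / R_{max})$, which is the form documented for mizer's `BevertonHoltRDD()`. The function name `beverton_holt` and all numbers are hypothetical and only meant for illustration.

```r
# Beverton-Holt density dependence: RDD = RDI / (1 + RDI / R_max).
# `beverton_holt` is an illustrative helper, not a mizer function.
beverton_holt <- function(rdi, r_max) {
  rdi / (1 + rdi / r_max)
}

# Density-independent recruitment spanning several orders of magnitude
rdi <- 10^(0:6)

# With a finite R_max, realised recruitment saturates below R_max
rdd_limited <- beverton_holt(rdi, r_max = 1e3)

# With R_max = Inf there is no density dependence: RDD equals RDI
rdd_unlimited <- beverton_holt(rdi, r_max = Inf)

print(data.frame(rdi, rdd_limited, rdd_unlimited))
```

With a finite $R_{max}$ the realised recruitment levels off however large $RDI$ becomes, whereas with $R_{max} = Inf$ recruitment is left entirely to the emergent dynamics.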

As a starting point, we will estimate these parameters as a means of fitting the modelled species catches to the observed catches. This could similarly be carried out with biomasses. Other, more detailed approaches also exist (see the main paper), but this approach has been used to get models in the right "ball-park", which can then be further evaluated using diagnostics (example X) and fitted to time series data (example XX).

TODO new structure

1: Set up the model and initial simulation (as now). Use eRepro = 0.01 and Rmax = inf.
2: Calibrating Rmax only. Do it by reducing Rmax of the larger species “by hand” (as now). Show the RDI/RDD plot. Show the plotPredObsYield plot
3: Calibrating Rmax with optimisation (the current section 3). show the plotPredObsYield plot.
4: Calibrate growth. Plot the growth curves. Introduce how they can (to some degree) be affected by kappa.
5: Calibrate recruitment (eRepro). Explain what eRepro does, and use the app to manually play around with eRepro. Show how eRepro affects the Fmsy. Go through adjustments of eRepro of individual species to obtain a better predicted Fmsy.
[6: Calibrate food competition (rPP) — to be discussed/written]


### A Simple Protocol for Multispecies Model Calibration

We will adapt the "recipe" for calibration in Jacobsen et al 2014 (see supp. mat.) and Blanchard et al (2014), into the following steps:

0. Run the model with the chosen species-specific parameters. This will relate some of the missing parameters to $w_{inf}$ ($h$ and $\gamma$ - explain through simple example of how the model works?). $R_{max}$ (see example that explains $R_{max}$?) could also be automatically calculated based on equilibrium assumptions (Andersen et al. 2016), but by default it is "$Inf$", which means there is no density dependence associated with spawner-recruit dynamics (RF: default is the 2016 method at the moment, but we set it to $Inf$ to start with something that doesn't coexist below).

1. Obtain the time-averaged data (e.g. catches or biomasses for each species) and the time-averaged fishing mortality inputs (e.g. from stock assessments). Typically this should be over a stable part of the time series for your system.

2. Start with the chosen parameters for $\kappa$ and $\lambda$ of the resource spectrum that are obtained from the literature regarding the community size spectrum. These can be very uncertain and sometimes are not available. Calibrate the carrying capacity of the background resource spectrum, $\kappa$, by examining the feeding level, biomass through time, and overall size spectrum.

3. Calibrate the maximum recruitment, $R_{max}$, which will affect the relative biomass of each species (and, combined with the fishing parameters, the catches), by minimising the error between observed and estimated catches (or, again, biomasses).

4. Check that the physiological recruitment, $RDI$, is much higher than the realised recruitment, $RDD$. This can be done using the `getRDD()` and `getRDI()` functions and calculating the ratio, which should be around 100 for a species with $w_{inf} = 1500$ g (e.g. Whiting/Plaice), but varies with asymptotic size and fishing mortality (Andersen 2019). A high $RDI/RDD$ ratio indicates that the carrying capacity is controlling the population rather than predation or competition. Larger species often require more of this density-dependent control than smaller ones. If $RDI/RDD$ is too high, the efficiency of reproduction ($erepro$) can be lowered to ensure species do not outcompete others or become over-resilient to fishing. Biologically, lowering $erepro$ means a higher egg mortality rate or more wasteful energy investment into gonads. If $RDI/RDD = 1$, the species is in the linear part of the stock-recruitment relationship (no spawner-recruit density dependence).

5. Verify the model after the above step by comparing the model with: species biomass or abundance distributions, feeding level, natural mortality, growth, vulnerability to fishing (Fmsy) and catch, and diet composition. Many handy functions for plotting these are available here: https://sizespectrum.org/mizer/reference/index.html

6. The final verification step is to force the model with time-varying fishing mortality to assess whether changes in the time series of biomasses and catches capture observed trends. The model will not capture all of the fluctuations from environmental processes (unless some of these are included), but should match the magnitude and general trend in the data. (RF: this is going to be Example2)
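
To make the $RDI/RDD$ check in step 4 concrete, here is a small worked sketch in base R. It assumes the Beverton-Holt form $RDD = RDI / (1 + RDI / R_{max})$, under which the ratio simplifies to $RDI/RDD = 1 + RDI/R_{max}$. It also assumes that $RDI$ scales linearly with $erepro$ when abundances are held fixed, which ignores the feedback through the population dynamics. All numbers are hypothetical.

```r
# Hypothetical values, for illustration only
r_max <- 1e6
rdi <- c(low = 1e4, medium = 1e6, high = 1e8)

# Under Beverton-Holt, RDI / RDD = 1 + RDI / R_max
ratio <- 1 + rdi / r_max

# Assumption: a tenfold lower erepro gives a tenfold lower RDI
# (abundances held fixed), which pulls the ratio back towards 1
ratio_low_erepro <- 1 + (rdi / 10) / r_max

print(rbind(ratio, ratio_low_erepro))
```

A ratio close to 1 means the species sits in the linear part of the stock-recruitment curve, while a ratio of 100 or more means $R_{max}$ is doing most of the regulation. In a real model the ratio would be computed from `getRDI()` and `getRDD()` rather than from made-up numbers.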

## Step 0. Run the model with the chosen species-specific parameters.

In this section you will:

- obtain or create a dataframe of species-specific parameters

- run the dataframe through the `mizer` package and examine the model output


A species-specific dataframe is already stored in `mizer`, which contains the North Sea model parameters (RF: however, it is probably not up to date, so we use the .csv in the repository).


```{r step 0 - initial parameters, message= F }

# if the user has not installed the required packages
# install.packages("tidyverse")
# install.packages("plotly")
#devtools::install_github("sizespectrum/mizer")
#devtools::install_github("sizespectrum/mizerExperimental")

library(mizerExperimental) # for projectToSteady()
library(mizer)
library(tidyverse)
library(plotly)

# loading North Sea data
nsParams <- read.csv("data/nsparams.csv")[,-1]

# This data frame already has Rmax values, let's remove them to calibrate them again later
nsParams[,"r_max"] <- Inf
nsParams <- nsParams[order(nsParams$w_inf),] # ordering by asymptotic size for color gradient

# If you want to make it less multi-species and more trait-based model
# nsParams[,"beta"] <-100
# nsParams[,"sigma"] <-1.5


```


```{r step 0 - loading plot function, include = F}
require(cowplot)
require(gridExtra)

g_legend<-function(a.gplot){
  tmp <- ggplot_gtable(ggplot_build(a.gplot))
  leg <- which(sapply(tmp$grobs, function(x) x$name) == "guide-box")
  legend <- tmp$grobs[[leg]]
  return(legend)}

plotSummary <- function (x, y, power = 2, wlim = c(.1,NA), short = F, ...)
{
  xlim = c(NA,10^log10(max(x@params@species_params$w_inf)))
  font_size = 7

  # need to display the legend at the bottom and only p1 has the background so using that one

  p1 <- plotSpectra(x, power = power, wlim = wlim, ...)
  p1 <- p1  + scale_x_continuous(limits = xlim, trans = "log10", name = "Individual size [g]") +
             theme(axis.title.x=element_blank(),
                    axis.text.x=element_blank(),
                    axis.ticks.x=element_blank(),
                   text = element_text(size=font_size),
                    legend.position = "bottom", legend.key = element_rect(fill = "white"))+
    guides(color = guide_legend(nrow=2))

  mylegend<-g_legend(p1) # save the legend
    p1 <- p1 + theme(legend.position = "none") # now remove it from the plot itself

      p7 <- plotBiomass(x)
  p7 <- p7 + theme(legend.position = "none",
                   text = element_text(size=font_size))

    if(short)
{
      p1 <- p1 +theme(axis.title.x=element_text(),
                    axis.text.x=element_text(),
                    axis.ticks.x=element_line())
  p10 <- plot_grid(p1, p7, mylegend,
                   rel_heights = c(3,3,1),
                   # rel_widths = c(2,1),
           ncol = 1, align = "v", axis = "l")

} else
{


  p2 <- plotFeedingLevel(x, include_critical = F, ...)
  p2 <- p2 + scale_x_continuous(limits = xlim, trans = "log10") +
    theme(axis.title.x=element_blank(),
          axis.text.x=element_blank(),
          axis.ticks.x=element_blank(),
          text = element_text(size=font_size),
          legend.position = "none")
  p3 <- plotPredMort(x, ...)
  p3 <- p3 + scale_x_continuous(limits = xlim, trans = "log10") +
    theme(axis.title.x=element_blank(),
          axis.text.x=element_blank(),
          axis.ticks.x=element_blank(),
          text = element_text(size=font_size),
          legend.position = "none")
  p4 <- plotFMort(x, ...)
  p4 <- p4 + scale_x_continuous(limits = xlim, trans = "log10", name = "Individual size [g]") +
    theme(legend.position = "none",
          text = element_text(size=font_size))


# yield and ssb

  bm <- getBiomassFrame(x)
  plot_dat <- filter(bm, Year == max(unique(bm$Year)))
  # plot_dat$w_inf <- x@params@species_params$w_inf
  yieldDat <- getYield(x)
  plot_dat$yield <- yieldDat[dim(yieldDat)[1],]
  plot_dat$Year <- NULL
  plot_dat <- reshape2::melt(plot_dat,"Species") # reshape2 is not attached, so call melt() explicitly
  p5 <- ggplot(plot_dat) +
    geom_bar(aes(x = Species,y = value, fill = Species, alpha = variable), stat = "identity", position = position_dodge()) +
    coord_cartesian(ylim = c(0.5*min(plot_dat$value),NA)) +
    # geom_text(aes(x = Species, y = value, label = Species), check_overlap = T)+

    # geom_point(aes(x = w_inf, y = Biomass, color = species)) +
    # geom_point(aes(x = w_inf, y = yield, color = species), shape = "+", size = 5) +
    # scale_x_continuous(name = "SSB and Yield") +
    scale_y_continuous(trans = "log10", name = "SSB and Yield") + #, limits = c(0.5*min(plot_dat$value),NA)) +
    scale_fill_manual(name = "Species", values = x@params@linecolour) +
    scale_alpha_manual(name = "Stat", values = c(1, 0.5), labels = c("SSB","Yield")) +
    theme(axis.title.x=element_blank(),
          axis.text.x=element_blank(),
          axis.ticks.x=element_blank(),
          text = element_text(size=font_size),
          legend.position = "bottom", legend.key = element_rect(fill = "white"))

  mylegend<-g_legend(p5) # save the legend
  p5 <- p5 + theme(legend.position = "none")

  # try yield divided by total biomass, we want to see the difference

  # r0

  plot_dat <- as.data.frame(getRDI(x@params)/getRDD(x@params))
  plot_dat$species <- factor(rownames(plot_dat),x@params@species_params$species)
  colnames(plot_dat)[1] <- "ratio"
  plot_dat$w_inf <- as.numeric(x@params@species_params$w_inf)

  # trying to have bars at their w_inf but on a continuous scale
  plot_dat$label <- plot_dat$species
  plot_dat2 <- plot_dat
  plot_dat2$ratio <- 0
  plot_dat2$label <- NA
  plot_dat <- rbind(plot_dat,plot_dat2)


  p6 <- ggplot(plot_dat) +
    geom_line(aes(x = w_inf, y = ratio, color = species), size = 15, alpha = .8) +
    geom_text(aes(x = w_inf, y = ratio, label = label),position = position_stack(vjust = 0.5), angle = 30, size = 3)+
    scale_color_manual(name = "Species", values = x@params@linecolour) +
    scale_y_continuous(name = "RDI/RDD") +
    scale_x_continuous(name = "Asymptotic size (g)", trans = "log10") +
    theme(legend.position = "none",
          text = element_text(size=font_size))


  p10 <- plot_grid(p1,p2,p3,p4, p5, p6, p7, mylegend, byrow = F,
                   # rel_heights = c(1,1,1,2),
                   rel_widths = c(2,1),
            nrow = 4, align = "v", axis = "l")
}
 # p <- grid.arrange(p10,mylegend, nrow=2,heights=c(9.5,0.5))
return(p10)
}
```


```{r step 0 - parameters}

params_uncalibrated <- newMultispeciesParams(nsParams, inter, kappa = 1e11, max_w=1e6) # inter comes with loading "mizer"

# note the volume of this model is set to reflect the entire volume of the North Sea - hence the very large kappa value. This is system-specific and you may want to work with per m^3 as in the defaults.

#  Add other params for info
#  param$Volumecubicmetres=5.5e13    #unit of volume. Here total volume of North sea is used (Andersen & Ursin 1977)

# have a look at species parameters that have been calculated
# params_uncalibrated@species_params

# alternative params without redundant parameters to reduce the size of the dataframe on the screen
params_uncalibrated@species_params[,-which(colnames(params_uncalibrated@species_params) %in%
                                             c("sel_func","gear","interaction_resource","pred_kernel_type","m","alpha","n","p","q","w_min"))]

```

w_inf: asymptotic size

w_mat: maturation size (size at which 50% of individuals are mature)

beta: preferred predator/prey mass ratio

sigma: width of the feeding kernel

R_max: Beverton-Holt density dependence parameter

k_vb: von Bertalanffy growth parameter

l25: length at ...

l50: length at ...

a: coefficient of the length-to-weight conversion ($w = a l^b$)

b: exponent of the length-to-weight conversion

catchability: fisheries efficiency

h: maximum intake rate

k: metabolism constant

ks: metabolism coefficient

z0: background mortality coefficient

gamma: search volume (obtained from beta and sigma)

w_mat25: weight at which 25% of individuals are mature

erepro: coefficient that weights reproductive output (reproductive efficiency)


```{r step 0 - first run,message=F, warning = F}

# let's change the plotting colours, by far the hardest and trickiest part

# looks good but hard to distinguish some species
library(viridis)
params_uncalibrated@linecolour[1:12] <-viridis(dim(params_uncalibrated@species_params)[1])

# easier to read plots but color meaningless
# library(pals)
# params_uncalibrated@linecolour[1:12] <-glasbey(dim(params_uncalibrated@species_params)[1])

# Gradient over asymptotic size but might get difficult to distinguish species past 9
# colfunc <- colorRampPalette(c("firebrick3","darkturquoise", "orange"))
# colGrad <- colfunc(dim(params_uncalibrated@species_params)[1])
#  #names(colGrad) <- params_uncalibrated@species_params$species[order(params_uncalibrated@species_params$w_inf)] # gradient is ordered per winf
#
# params_uncalibrated@linecolour[1:12] <- colGrad
#
params_uncalibrated@linecolour["Resource"] <-"seagreen3"


# write color routine for color gradient readable but related to winf (can use my usual one but limited to ~9 species for clarity)
# maybe have different colors depending on species number

# run with fishing
sim_uncalibrated <- project(params_uncalibrated, t_max = 100, effort = 1)

plotSummary(sim_uncalibrated, short = T)

```

The top panel shows the size spectrum of each species at the last time step of the simulation, while the bottom panel shows the biomass of each species through time.
These plots show that all but three of the species have collapsed. This is because there was no external density dependence ($R_{max}$ is set to $Inf$), so the largest species (Cod and Saithe) have outcompeted all of the rest.

## Step 1. Obtain the time-averaged data (e.g. catches or biomasses for each species) and the time-averaged fishing mortality inputs (e.g. from stock assessments).

In this section you will:

- Download fisheries data and process them in a format comparable to the model output


The following .csv files were extracted from the ICES database using "data/getICESFishdata_param.R". Fishing data are averaged over 2014-2019, as this is a relatively stable period in catches.
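
The time-averaging itself is simple. The sketch below shows one way it could be done in base R, using a made-up yearly catch table; the data frame `catch_ts`, its column names and its values are hypothetical and are not the ICES data used in this example.

```r
# Hypothetical yearly catches (tonnes) for two species; not the ICES data
catch_ts <- data.frame(
  year = rep(2012:2019, times = 2),
  species = rep(c("Cod", "Sprat"), each = 8),
  catch = c(30, 32, 35, 36, 34, 35, 37, 33,
            120, 150, 140, 145, 150, 135, 140, 130)
)

# Keep only the period considered stable
stable <- subset(catch_ts, year >= 2014 & year <= 2019)

# Average the catches per species over that period
catch_avg <- aggregate(catch ~ species, data = stable, FUN = mean)
print(catch_avg)
```

The same pattern would apply to SSB or fishing mortality time series, giving one time-averaged value per species to compare with the model at steady state.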

```{r step 1 - loading data}

# fisheries mortality F
fMat <- read.csv("data/fmat.csv")
fMatWeighted <- read.csv("data/fmatWeighted.csv") # Sandeel and Cod appear in multiple databases, so their F values are averaged, weighted by SSB

# read in time-averaged  catches
catchAvg <-read.csv("data/time-averaged-catches.csv") # only that one is used at the moment | catches are estimated from fMatW

# ssb
ssbAvg <- read.csv("data/time-averaged-SSB.csv")

```


```{r step 1 - visual representation of the data, warning=F}

plot_dat <- reshape2::melt(fMatWeighted,"X")
colnames(plot_dat) <- c("Time", "Species", "F")

ggplot(plot_dat)+
  geom_line(aes(x = Time, y = F, color = Species))+
  scale_y_continuous(name = "Fisheries mortality rate") +
  scale_color_manual(values = params_uncalibrated@linecolour)

plot_dat <- data.frame(catchAvg,ssbAvg)
plot_dat$species.1 <- NULL
colnames(plot_dat) <- c("Species", "average catch", "average SSB")
plot_dat$Species <- factor(as.character(plot_dat$Species),levels = c(as.character(params_uncalibrated@species_params$species)))
plot_dat <- reshape2::melt(plot_dat,"Species")


  ggplot(plot_dat) +
    geom_bar(aes(x = Species,y = value, fill = Species, alpha = variable), stat = "identity", position = position_dodge()) +
    # coord_cartesian(ylim = c(0.5*min(plot_dat$value),NA)) +

    scale_y_continuous(trans = "log10", name = "Catch (clear) and SSB (solid)") + #, limits = c(0.5*min(plot_dat$value),NA)) +
    scale_fill_manual(name = "Species", values = params_uncalibrated@linecolour) +
    scale_alpha_manual(name = "Stat", values = c(0.5, 1), labels = c("Catch", "SSB")) +
    theme(
          legend.position = "none", legend.key = element_rect(fill = "white"))


```
RF: Dab and Gurnard have some issues

## Step 2. Calibrate the carrying capacity of the background resource spectrum, $\kappa$, at steady state

In this section you will:

- guess reasonable $erepro$ and $R_{max}$ values which will stop out-competition from a few species

- vary $\kappa$ values until you reach coexistence for all species

- check the species' growth curves, which are influenced by $\kappa$, to see whether they diverge from the von Bertalanffy curves (data-poor case) or growth data (data-rich case)


```{r step 2 - guessing coexistence, warning=F, message=F}

# the fishing mortality rates are already stored in the param object as
params_uncalibrated@species_params$catchability

# let's start again and replace with the initial pre-calibration "guessed" Rmax
params_guessed <- params_uncalibrated
# penalise the large species with higher density dependence
params_guessed@species_params$R_max <- params_guessed@resource_params$kappa*params_guessed@species_params$w_inf^-1
# and reduce erepro
params_guessed@species_params$erepro <- 1e-3

params_guessed <- setParams(params_guessed)
# run with fishing
sim_guessed <- project(params_guessed, t_max = 100, effort =1)
plotSummary(sim_guessed, short = T)


```

Here, Sprat's biomass is orders of magnitude lower than the other species and so are Saithe's largest individuals, but at least species coexist.
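
The initial guess used in the chunk above sets $R_{max} = \kappa \, w_{inf}^{-1}$, so that larger species receive a lower maximum recruitment. As a rough illustration of what this scaling implies, the sketch below applies it to three hypothetical asymptotic sizes, using the $\kappa = 10^{11}$ of this example.

```r
# Resource carrying capacity used in this example
kappa <- 1e11

# Hypothetical asymptotic sizes [g], for illustration only
w_inf <- c(small = 10, medium = 1000, large = 40000)

# Guessed maximum recruitment: inversely proportional to asymptotic size
r_max_guess <- kappa * w_inf^-1

print(r_max_guess)
print(log10(r_max_guess))
```

Every tenfold increase in $w_{inf}$ lowers the guessed $R_{max}$ tenfold, which is what penalises the large species that otherwise outcompete the rest.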

In `mizerExperimental`, the `projectToSteady()` function looks for a biomass equilibrium state for a set of initial parameters (RF: is that right?). Let's see how it behaves with our model:

```{r step 2 - project to Steady, message = F, warning = F}

## compare with Gustav's projectToSteady
params_steady <- projectToSteady(params_guessed, t_max = 100, return_sim = F)
sim_steady <- project(params_steady, t_max = 300, effort =1)
plotSummary(sim_steady, short = T)

# zoom on the size spectrum
plotSpectra(sim_steady,power=2,total = T,wlim = c(.1,NA))


```
The species are coexisting (but Sprat is still low). This is in part because we applied a stronger $R_{max}$ effect for larger species. You can play with the above parameters, but it would take a lot of trial and error to find the combination that makes the biomasses or catches similar to the observations.

We could explore the effects further using an Rshiny app, where we also have a plot of the biomass or catch data. First, let's look at the basic diagnostics and tune $\kappa$ and $erepro$ to make sure the feeding levels are high enough for each species and that the species coexist.

RF: Rshiny section disabled for pdf

```{r step 2 r-shiny, eval = F, cache = F, include = F}
# Optional
# adjust Rmax and/or reproductive efficiency to examine whether improved steady state is achievable that way
library(shiny) # no need if runtime = shiny is in the YAML
# runApp("shiny-equilibrium")
# is there a way to save the final chosen values?
params_shiny <- params_guessed

shinyApp(

  ui=fluidPage(

  # Application title
  titlePanel("North Sea Model Example"),

  fluidRow(
    column(4, wellPanel(
       sliderInput("kappa", "log10 Resource Carrying Capacity:", min = 8, max = 12, value = 10.7,
                   step = 0.1),
    #   sliderInput("Rmax", "log10 Maximum Recruitment:", min = 1, max = 12, value = 12,
    #              step = 0.1),
       sliderInput("erepro", "log10 Reproductive Efficiency:", min = -8, max = 1, value = -3,
                   step = 0.1)
          )),
    column(6,
           plotOutput("distPlot", width = 600, height = 600)
    ))


  ),

  server = function(input, output) {

  output$distPlot <- renderPlot({
    # set up params using values given, need check and change parameter values so units work in days units
    params_shiny@species_params$erepro <- rep(10^input$erepro,12)
   # params@species_params$Rmax <- rep(10^input$Rmax,12)
    params_shiny <- setParams(params_shiny,kappa=10^input$kappa)
    # run with fishing (effort = 1)
    sim_shiny <- project(params_shiny, effort = 1, t_max = 100)
    plot(sim_shiny)
     })

},

  options = list(height = 500)
)


```

Rshiny makes it easier to fine tune one parameter at a time, but we need to make some species-specific adjustments.

The shiny app helps with understanding the model but it is tricky to arrive at the best fit especially if we want to change several species parameter combinations at a time.

However, varying $\kappa$ will affect the species' growth curves, so we need to check the modelled growth curves against the "observed" von Bertalanffy growth curves in case varying $\kappa$ makes the emergent growth curves diverge too much.
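
For reference, the von Bertalanffy weight-at-age curve used for this comparison can be sketched in a few lines of base R. It combines the length-at-age equation with the length-to-weight conversion $w = a l^b$ and takes $L_{inf} = (w_{inf}/a)^{1/b}$. The helper name `vb_weight` and the parameter values are hypothetical and do not correspond to any species in the North Sea parameter set.

```r
# von Bertalanffy weight-at-age: w(t) = a * (L_inf * (1 - exp(-k_vb * (t - t0))))^b
# `vb_weight` is an illustrative helper, not a mizer function.
vb_weight <- function(age, w_inf, a, b, k_vb, t0 = 0) {
  l_inf <- (w_inf / a)^(1 / b)  # asymptotic length implied by w_inf
  a * (l_inf * (1 - exp(-k_vb * (age - t0))))^b
}

# Hypothetical parameters, for illustration only
age <- 0:20
w <- vb_weight(age, w_inf = 40000, a = 0.01, b = 3, k_vb = 0.2)

print(round(w))
```

The curve starts at zero weight at age `t0` and approaches `w_inf` asymptotically. An emergent growth curve from the model that sits far below this one suggests that the species is food-limited.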

```{r step 2 - growth curves | function, include = F }

# This function is going to be integrated in sizespectrum/mizer
plotGrowthCurves <- function (object,
                              species,
                              max_age = 20,
                              percentage = FALSE,
                              species_panel = FALSE,
                              highlight = NULL)
{
    if (is(object, "MizerSim")) {
        params <- object@params
        t <- dim(object@n)[1]
        params@initial_n[] <- object@n[t, , ] # Designed to work also with single species
        params@initial_n_pp <- object@n_pp[t, ]
    } else if (is(object, "MizerParams")) {
        params <- validParams(object)
    }
    if (missing(species)) {
        species <- params@species_params$species
    }
    ws <- getGrowthCurves(params, species, max_age, percentage)
    plot_dat <- reshape2::melt(ws)
    plot_dat$Species <- factor(plot_dat$Species, params@species_params$species)
    plot_dat$legend <- "model"

    # creating some VB
    if (all(c("a", "b", "k_vb") %in% names(params@species_params)))
    {
        VBdf <- data.frame("species" = params@species_params$species, "w_inf" = params@species_params$w_inf, "a" = params@species_params$a, "b" = params@species_params$b, "k_vb" = params@species_params$k_vb, "t0" = 0)
        VBdf$L_inf <- (VBdf$w_inf / VBdf$a)^(1/VBdf$b)
        plot_dat2 <- plot_dat
        plot_dat2$value <- apply(plot_dat,1,function(x, VBdf){VBdf[which(VBdf$species == x[1]),]$a *
                (VBdf[which(VBdf$species == x[1]),]$L_inf * (1 - exp(-VBdf[which(VBdf$species == x[1]),]$k_vb *
                                                                         (as.numeric(x[2]) - VBdf[which(VBdf$species == x[1]),]$t0))))^VBdf[which(VBdf$species == x[1]),]$b}, VBdf)
        plot_dat2$legend <- "von Bertalanffy"
        plot_dat <- rbind(plot_dat,plot_dat2)
    }

    p <- ggplot(filter(plot_dat, legend == "model")) + geom_line(aes(x = Age, y = value,
                                                                     colour = Species, linetype = Species, size = Species))
    y_label <- if (percentage)
        "Percent of maximum size"
    else "Size [g]"
    linesize <- rep(0.8, length(params@linetype))
    names(linesize) <- names(params@linetype)
    linesize[highlight] <- 1.6
    p <- p + scale_x_continuous(name = "Age [Years]") +
        scale_y_continuous(name = y_label) +
        scale_colour_manual(values = params@linecolour) +
        scale_linetype_manual(values = params@linetype) +
        scale_size_manual(values = linesize)

    # starting cases now
    if(!percentage)
    {

        if (length(species) == 1) {
            idx <- which(params@species_params$species == species)
            w_inf <- params@species_params$w_inf[idx]
            p <- p + geom_hline(yintercept = w_inf, colour = "grey") +
                annotate("text", 0, w_inf, vjust = -1, label = "Maximum")
            w_mat <- params@species_params$w_mat[idx]
            p <- p + geom_hline(yintercept = w_mat, linetype = "dashed", colour = "grey") +
                annotate("text", 0, w_mat, vjust = -1, label = "Maturity")
            if ("von Bertalanffy" %in% plot_dat$legend)
                p <- p + geom_line(data = filter(plot_dat, legend == "von Bertalanffy"), aes(x = Age, y = value))

        } else if(species_panel) # need to add either no panel if no param for VB or create a panel without VB
        {
            p <- ggplot(plot_dat) +
                geom_line(aes(x = Age, y = value , colour = legend)) +
                scale_x_continuous(name = "Age [years]") +
                scale_y_continuous(name = "Size [g]") +
                # use `params` (not `object@params`) so a MizerParams input also works
                geom_hline(aes(yintercept = w_mat),
                           data = tibble(Species = params@species_params$species,
                                         w_mat = params@species_params$w_mat),
                           linetype = "dashed",
                           colour = "grey") +
                geom_hline(aes(yintercept = w_inf),
                           data = tibble(Species = params@species_params$species,
                                         w_inf = params@species_params$w_inf),
                           linetype = "solid",
                           colour = "grey") +
                facet_wrap(~Species, scales = "free_y")

        }
    }
    return(p)
}
```

```{r step 2 - growth curves | results}
# All emergent and observed growth curves as panels
plotGrowthCurves(sim_guessed, species_panel = T)

# All emergent growth curves together
plotGrowthCurves(sim_guessed, percentage = T)

# One by one observed vs emergent
plotGrowthCurves(sim_guessed, species = "Cod")

```


To conclude this section, we choose parameter values that allow the largest number of species to coexist, as a starting point for the optimisation. Note that we won't vary $erepro$ at the same time as $R_{max}$, because they depend on each other. Instead, we will use the value of $erepro$ selected from the shiny app.

## Step 3. Calibrate the maximum recruitment

In this section you will:

- use a package that will calibrate $R_{max}$ per species

$R_{max}$ affects the relative biomass of each species (and, combined with the fishing parameters, the catches). We calibrate it by minimising the error between observed and estimated catches or biomasses. We could also include $\kappa$ in the estimation here (as in Blanchard et al. 2014 & Spence et al. 2016), but instead we will use the value that seemed OK in terms of feeding levels in the Rshiny app, roughly $10^{11.5}$. The same goes for $erepro$: a value of $10^{-3}$ seemed OK.

First, let's set up a function that runs the model and outputs the difference between predicted catches (`getYield()`) and actual catches (`catchAvg`). `err` is the sum of squared errors between the two, computed on the log scale.

```{r step 3 - getError | function, include=F}


## the following getError function combines the steps of the optimisation above - this time with the multispecies model - and outputs the error between predicted and observed data

## update below with project_steady and saving the state from each iteration
#RF the function takes a vector of log10(R_max) values and compares the predicted catches with the data
getError <- function(vary,params,dat,env=state,data_type="catch", tol = 0.1,timetorun=10) {

  #env$params@species_params$R_max[]<-10^vary[1:12]
  params@species_params$R_max[]<-10^vary[1:12]

  params <- setParams(params)
  # run to steady state and update params
  # env$params<- projectToSteady(env$params, distance_func = distanceSSLogN,
  #                 tol = tol, t_max = 200,return_sim = F)
  params<- projectToSteady(params, distance_func = distanceSSLogN,
                   tol = tol, t_max = 200,return_sim = F)

  # create sim object

  sim <- project(params, effort = 1, t_max = timetorun) #Change t_max to determine how many years the model runs for

  #
  # sim <- project(env$params, effort = 1, t_max = timetorun) #Change t_max to determine how many years the model runs for
  #
  # env$params <-sim@params
  #

  ## what kind of data and output do we have?
  if (data_type == "SSB") {
    # could change to getBiomass if using survey, also check units
    output <- getSSB(sim)[timetorun, ]
  }

  if (data_type == "catch") {
    # getYield() uses n . w . dw, so g per year per volume (i.e. North Sea since kappa is set up this way).
    # The data are in tonnes per year, so convert g to tonnes.
    output <- getYield(sim)[timetorun, ] / 1e6
  }

  pred <- log(output)
  dat  <- log(dat)

  # sum of squared errors, here on log-scale of predictions and data (could change this or use other error or likelihood options)
  discrep <- pred - dat
  discrep <- sum(discrep^2)

  # can use a strong penalty on the error to ensure we reach a minimum of 10% of the data (biomass or catch) for each species
  # if(any(pred < 0.1*dat)) discrep <- discrep + 1e10

  return(discrep)
}

```
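
The error measure returned by `getError()` can be illustrated in isolation. The sketch below is a minimal base R example, assuming the same log-scale sum of squared errors; the species names and numbers are made up for illustration and `log_sse()` is a hypothetical helper, not part of mizer.

```r
# Toy numbers only: three hypothetical species, yields in tonnes per year.
observed  <- c(Cod = 1200, Herring = 5400, Sprat = 800)
predicted <- c(Cod = 1500, Herring = 4900, Sprat = 650)

# Hypothetical helper: sum of squared errors on the log scale.
log_sse <- function(pred, obs) {
  stopifnot(length(pred) == length(obs), all(pred > 0), all(obs > 0))
  sum((log(pred) - log(obs))^2)
}

err_toy <- log_sse(predicted, observed)
err_toy

# A perfect match gives zero error.
log_sse(observed, observed)

# Only ratios matter: rescaling both vectors leaves the error unchanged
# (up to floating point rounding).
log_sse(10 * predicted, 10 * observed)
```

Working on the log scale means that a species caught in small quantities contributes as much to the error as a heavily fished one, provided the relative mismatch is the same.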

```{r step 3 - getError | result}


# we need 12 Rmaxs, log10 scale
vary <- log10(params_steady@species_params$R_max)
#vary<-runif(10,3,12) # or use completely made up values, same for each species test for effects of initial values

## set up the environment to keep the current state of the simulations
state <- new.env(parent = emptyenv())
state$params <-  params_steady

catchAvg <-read.csv("data/time-averaged-catches.csv") # only that one is used at the moment | catches are estimated from fMatW

## test it
err<-getError(vary = vary, params = params_steady, dat = catchAvg$Catch_1419_tonnes)
# err<-getError(vary,params,dat=rep(100,12),data_type="biomass")
err
```


Now, carry out the optimisation. There are several optimisation methods to choose from, and we need to select the most robust one to share here. The R package optimParallel, a parallel version of the L-BFGS-B method of `optim()`, seems to be the most robust general option and is used here in place of `optim()`. The procedure often has to be repeated several times, but running in parallel makes it faster than packages such as optimx.

This might take a while. The output is saved as "optim_para_result.RDS" if you wish to skip this block.
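
To see the mechanics without waiting for the full model, here is a minimal sketch using base R's `optim()` with the same L-BFGS-B method and the same bounds on $log_{10}(R_{max})$. The "model" is a made-up stand-in in which predicted catch is simply proportional to $R_{max}$; `catch_per_recruit`, `observed_catch` and `toy_error()` are hypothetical names and numbers, not taken from the North Sea data.

```r
# Toy stand-in for the model: "predicted catch" is proportional to R_max.
# The scaling factors and observed catches are made-up numbers.
catch_per_recruit <- c(2e-6, 5e-7, 1e-5)
observed_catch    <- c(4000, 1500, 90000)

# Same shape as getError(): takes log10(R_max), returns the log-scale SSE.
toy_error <- function(vary) {
  predicted <- catch_per_recruit * 10^vary
  sum((log(predicted) - log(observed_catch))^2)
}

start <- rep(8, 3) # initial guess for log10(R_max)

fit <- optim(par = start, fn = toy_error, method = "L-BFGS-B",
             lower = rep(3, 3), upper = rep(15, 3))

fit$convergence # 0 indicates successful convergence
10^fit$par      # estimated R_max on the natural scale
```

Optimising on the $log_{10}$ scale keeps parameters that span many orders of magnitude within a narrow numerical range, which is why the bounds are 3 and 15 rather than $10^3$ and $10^{15}$. `optimParallel()` is designed to take the same arguments as `optim()`, plus a `parallel` list.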

```{r step 3 - optimisation, message = F, eval=F}

library("parallel")
library("optimParallel")
library("tictoc")

# change kappa and erepro based on shiny exploration, set up initial values based on "close to" equilibrium values from above sim
# params_steady already set to erepro = 0.001 and kappa = 10^11

params_optim <- params_guessed
vary <-  log10(params_optim@species_params$R_max)


params_optim@resource_params$kappa<-3.2e11 # better kappa estimated from Rshiny
params_optim<-setParams(params_optim)

noCores <- detectCores() - 1 # keep a spare core

cl <- makeCluster(noCores, setup_timeout = 0.5)
setDefaultCluster(cl = cl)
clusterExport(cl, as.list(ls()))
clusterEvalQ(cl, {
  library(mizerExperimental)
  library(optimParallel)
})

tic()
optim_result <- optimParallel(par = vary, fn = getError, params = params_optim,
                              dat = catchAvg$Catch_1419_tonnes,
                              method = "L-BFGS-B",
                              lower = rep(3, 12), upper = rep(15, 12),
                              parallel = list(loginfo = TRUE, forward = TRUE))

stopCluster(cl)
toc() # 80'' using 47 cores
saveRDS(optim_result,"optim_para_result.RDS")
```


```{r step3 - results, message=FALSE,warning=FALSE}

# if previous block not evaluated | have some issue enabling "runtime:shiny" and loading .csv somehow
params_optim <- params_guessed
params_optim@resource_params$kappa<-3.2e11

optim_result <- readRDS("optim_para_result.RDS")
# optim values:
params_optim@species_params$R_max <- 10^optim_result$par

# set the param object
params_optim <-setParams(params_optim)
sim_optim <- project(params_optim, effort = 1, t_max = 100, dt=0.1,initial_n = sim_guessed@n[100,,],initial_n_pp = sim_guessed@n_pp[100,])
saveRDS(sim_optim,"optim_para_sim.RDS")
plotSummary(sim_optim)
```

## Step 4. Check the level of density dependence.


In this section you will:

- check the $RDI/RDD$ ratio and infer its consequences for the ecosystem

```{r step 4}


  plot_dat <- as.data.frame(getRDI(sim_optim@params)/getRDD(sim_optim@params))
  plot_dat$species <- factor(rownames(plot_dat),sim_optim@params@species_params$species)
  colnames(plot_dat)[1] <- "ratio"
  plot_dat$w_inf <- as.numeric(sim_optim@params@species_params$w_inf)

  # trying to have bars at their w_inf but on a continuous scale
  plot_dat$label <- plot_dat$species
  plot_dat2 <- plot_dat
  plot_dat2$ratio <- 0
  plot_dat2$label <- NA
  plot_dat <- rbind(plot_dat,plot_dat2)

ggplot(plot_dat) +
    geom_line(aes(x = w_inf, y = ratio, color = species), size = 20, alpha = .8) +
    geom_text(aes(x = w_inf, y = ratio, label = label),position = position_stack(vjust = 0.5), angle = 30)+
    scale_color_manual(name = "Species", values = sim_optim@params@linecolour) +
    scale_y_continuous(name = "R0") +
    scale_x_continuous(name = "Asymptotic size (g)", trans = "log10") +
    theme(legend.position = "none")

getRDI(sim_optim@params)/getRDD(sim_optim@params)


# seems like there is little density dependence

# # if needed change erepro & plug back into model
 # params@species_params$erepro[] <-1e-3
 # params <- setParams(params)
 # sim <- project(params, effort = 1, t_max = 500, dt=0.1)
 # plot(sim)

```

Is the physiological recruitment, $RDI$, much higher than the realised recruitment, $RDD$? A high $RDI/RDD$ ratio indicates strong density dependence.
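
To make the link with $R_{max}$ explicit, the sketch below assumes the Beverton-Holt form that mizer uses by default, $RDD = RDI / (1 + RDI/R_{max})$, under which the ratio is $RDI/RDD = 1 + RDI/R_{max}$. The numbers are made up and `beverton_holt_rdd()` is a hypothetical helper written for this illustration.

```r
# Assumed Beverton-Holt form: RDD = RDI / (1 + RDI / R_max).
beverton_holt_rdd <- function(rdi, r_max) rdi / (1 + rdi / r_max)

r_max <- 1e9               # hypothetical maximum recruitment
rdi   <- c(1e7, 1e9, 1e11) # hypothetical density-independent recruitment
rdd   <- beverton_holt_rdd(rdi, r_max)

# The ratio grows as RDI exceeds R_max; RDD itself never exceeds R_max.
data.frame(rdi = rdi, rdd = rdd, ratio = rdi / rdd)
```

Under this assumption, a ratio close to 1 means $R_{max}$ hardly constrains recruitment, while a large ratio means recruitment is capped near $R_{max}$.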


## Step 5. Verify the model after the above step by comparing the model with data.

E.g. species biomass or abundance distributions, feeding level, natural mortality, growth, vulnerability to fishing ($F_{msy}$) and catch, diet composition... Many handy functions for plotting these are available here: https://sizespectrum.org/mizer/reference/index.html


```{r step 5 - diagnostic functions, include = FALSE}

# hopefully all of this will go on sizespectrum/mizer, in the meantime

# Plot observed vs. predicted yield on a log10 scale.
# `dat` is the vector of observed yields (tonnes), in the same species order as the model.
plotPredObsYield <- function(sim, dat) {
  # predicted yield at the final time step, converted from g to tonnes
  yield <- getYield(sim)
  pred_yield <- melt(yield[nrow(yield), ] / 1e6)
  pred_yield$obs <- dat
  pred_yield$species <- row.names(pred_yield)

  p <- ggplot() +
    geom_point(data = pred_yield,
               aes(x = log10(value), y = log10(obs), color = species)) +
    # use the colours of the simulation passed in, not a global object
    scale_color_manual(values = sim@params@linecolour) +
    # 1:1 line: points on it are perfectly fitted
    geom_abline(color = "black", slope = 1, intercept = 0) +
    xlab("log10 Predicted Yield") +
    ylab("log10 Observed Yield")
  return(p)
}

plotDiet2 <- function (object, species, xlim = c(1,NA))
{
    params <- validParams(object)
    if (is.integer(species)) {
        species <- params@species_params$species[species]
    }
    diet <- getDiet(params)[params@species_params$species ==
        species, , ]
    prey <- dimnames(diet)$prey
    prey <- factor(prey, levels = rev(prey))
    plot_dat <- data.frame(Proportion = c(diet), w = params@w,
        Prey = rep(prey, each = length(params@w)))
    plot_dat <- plot_dat[plot_dat$Proportion > 0, ]
    ggplot(plot_dat) + geom_area(aes(x = w, y = Proportion, fill = Prey)) +
        scale_x_log10(limits = xlim) + labs(x = "Size [g]") + scale_fill_manual(values = params@linecolour)
}

```


```{r step 5 - plots, message = F, warning=F}

plotSummary(sim_optim)


plotPredObsYield(sim_optim,catchAvg$Catch_1419_tonnes)

plotDiet2(sim_optim@params,"Cod")

plotGrowthCurves(sim_optim, species_panel = T)
```

```{r step5 - plotly}
# interactive plots / won't be displayed in pdf
plotlyBiomass(sim_optim)
plotlySpectra(sim_optim)
plotlySpectra(sim_optim,power=2,total = T)

plotlyGrowthCurves(sim_optim,percentage = T)
plotlyFeedingLevel(sim_optim)

plotlyPredMort(sim_optim)
plotlyFMort(sim_optim)


# What would happen if we also parameterised the interaction matrix or beta and sigma?


```


Now that our model is calibrated, let's take a look at the $F_{msy}$.

```{r step 5 - msy, warning=F}

# need panel plots (per species) of yield at equilibrium (per million) vs effort * catchability

# need catch values / biomass values for different efforts
# sim <- readRDS("~/HowToMizer/optim_para_sim.RDS")

sim <- sim_optim
# plot_datFisheries <-NULL # df that gets biomass/yield/effort/growth
# effortSeq <- seq(0,7,.1)
# # trade-off -> some species have low catchability so need to run the effort really high to get them to drop in biomass
# # but at the same time this can make some species go extinct and crash some functions
# for(iEffort in effortSeq)
# {
#   tempSim <- project(sim, effort = iEffort,t_max = 30)
#   #biomass
#   bm <- getBiomassFrame(tempSim)
#   myDat <- filter(bm, Year == max(unique(bm$Year)))
#   #catch
#   yieldDat <- getYield(tempSim)
#   myDat$yield <- yieldDat[dim(yieldDat)[1],]
#   myDat$Year <- NULL
#   myDat$effort <- iEffort
#   #growth
#   growth_dat <- getEGrowth(tempSim@params, n = tempSim@n[dim(tempSim@n)[1],,], n_pp = tempSim@n_pp[dim(tempSim@n_pp)[1],])
#   myDat$growth <- apply(growth_dat,1,sum)
#   plot_datFisheries <- rbind(plot_datFisheries,myDat)
#   #fisheries
#
# }
# # adjust effort > effort * catchability
# plot_datFisheries$effort <- plot_datFisheries$effort*rep(sim@params@species_params$catchability,length(effortSeq))
# saveRDS(plot_datFisheries, "Fmsy.rds")

plot_datFisheries <- readRDS("Fmsy.rds")

# fishing mortality vs yield
ggplot(plot_datFisheries) +
geom_line(aes(x = effort, y = yield, color = Species))+
  facet_wrap(Species~., scales = "free") +
  scale_x_continuous(limits= c(0,1))+#, limits = c(1e10,NA))+
  scale_y_continuous(trans = "log10") +
  scale_color_manual(name = "Species", values = sim@params@linecolour) +
    theme(legend.position = "bottom", legend.key = element_rect(fill = "white"))


```
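
The plot shows where each yield curve peaks, but the peak can also be read off numerically. The sketch below is a minimal base R example on a toy data frame with the same `Species`, `effort` and `yield` columns as `plot_datFisheries`; the two species and their dome-shaped curves are invented for illustration and are not model output.

```r
# Toy yield curves for two hypothetical species (not model output):
# yield rises with fishing mortality, peaks, then declines.
f_seq <- seq(0, 1, by = 0.05)
toy <- rbind(
  data.frame(Species = "A", effort = f_seq, yield = f_seq * exp(-f_seq / 0.3)),
  data.frame(Species = "B", effort = f_seq, yield = f_seq * exp(-f_seq / 0.6))
)

# For each species, keep the row where yield is highest.
msy <- do.call(rbind, lapply(split(toy, toy$Species), function(d) {
  d[which.max(d$yield), c("Species", "effort", "yield")]
}))
msy
```

The `effort` value in each row is then an estimate of $F_{msy}$ for that species, with a resolution limited by the step of the effort sequence.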


## Step 6. Force the model with time-varying fishing mortality

The final verification step is to force the model with time-varying fishing mortality, to assess whether the changes through time in biomasses and catches capture observed trends. The model will not capture all of the fluctuations from environmental processes (unless some of these are included), but it should match the magnitude and general trend in the data. We explore this in Example #2 - Changes through time.
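
As a rough sketch of what time-varying forcing could look like, the block below builds a time by gear effort matrix in base R. The years, gear names and effort values are all hypothetical. mizer's `project()` is documented to accept an effort array of this shape, but check `?project` for the exact requirements before relying on it.

```r
# Hypothetical years and gear names; replace with those of your own model.
years <- 1990:2000
gears <- c("gear_1", "gear_2")

# Made-up effort trends: gear_1 ramps up, gear_2 stays constant.
effort_mat <- cbind(
  seq(0.5, 1.5, length.out = length(years)),
  rep(1, length(years))
)
dimnames(effort_mat) <- list(time = years, gear = gears)
effort_mat

# With mizer loaded and gear names matching the model, this could be passed as
# sim_time <- project(params_optim, effort = effort_mat)
```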