#--------------------------- chapter Extensions, for the Transitions part
## Hidden semi-Markov models
See @ChoquetEtAl2011, @choquet2013stopover, @choquet2014stopover, @guerin2017stopover and @king2016semi.
The latter illustrates the approach with the house finch data, which we use in the book.
Ruth and Roland say: "We extend the Arnason–Schwarz model by specifying a semi-Markov model for the state
process, where the dwell-time distribution is specified more generally, using, for example, a shifted
Poisson or negative binomial distribution. A state expansion technique is applied in order to represent
the resulting semi-Markov Arnason–Schwarz model in terms of a simpler and computationally tractable
hidden Markov model. Semi-Markov Arnason–Schwarz models come with only a very modest increase in the
number of parameters, yet permit a significantly more flexible state process." The expansion method
is from @langrock2011expansion, and is very well explained in Chapter 12 of @ZucchiniEtAl2016. Totally
doable, I think. If I remember correctly, in the house finch application only one state has a
non-geometric dwell time; if so, see Section 12.3.3 in @ZucchiniEtAl2016. There is even an R function
to convert the HSMM into an HMM.
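To fix ideas, here is a minimal base R sketch of the state expansion for a two-state model in which only state 1 has a non-geometric dwell time (shifted Poisson), while state 2 keeps a geometric dwell time; the aggregate size `m` and all values are illustrative.
```{r eval = FALSE}
m <- 30                          # number of sub-states in the state-1 aggregate
lambda <- 2                      # dwell time in state 1: 1 + Poisson(lambda)
d <- dpois(0:(m - 1), lambda)    # dwell-time pmf on r = 1, ..., m
surv <- 1 - c(0, cumsum(d))[1:m] # Pr(dwell >= r)
haz <- pmin(d / surv, 1)         # hazard: Pr(leave at r | dwell >= r)
p22 <- 0.7                       # self-transition prob. of state 2 (geometric)
Gamma <- matrix(0, m + 1, m + 1) # expanded transition matrix
for (k in 1:(m - 1)){
  Gamma[k, k + 1] <- 1 - haz[k]  # stay in state 1, move along the aggregate
  Gamma[k, m + 1] <- haz[k]      # leave state 1 for state 2
}
Gamma[m, m] <- 1 - haz[m]        # last sub-state loops on itself (geometric tail)
Gamma[m, m + 1] <- haz[m]
Gamma[m + 1, m + 1] <- p22
Gamma[m + 1, 1] <- 1 - p22       # entering state 1 starts in sub-state 1
rowSums(Gamma)                   # sanity check: all rows sum to 1
```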
## Continuous-time HMM
It shouldn't be too difficult to code the matrix exponential ourselves, or to call an existing R
implementation (e.g. `expm::expm()`) through `nimbleRcall()`. The likelihood is basically the same,
with the dwell-time distribution entering through the transition probabilities over each time interval.
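A minimal sketch of the `nimbleRcall()` route, assuming the expm package is available; the names are illustrative.
```{r eval = FALSE}
library(nimble)
Rmatexp <- function(M) {
  as.matrix(expm::expm(M)) # matrix exponential computed on the R side
}
nimMatexp <- nimbleRcall(prototype = function(M = double(2)) {},
                         returnType = double(2),
                         Rfun = "Rmatexp")
# in nimbleCode, the transition matrix over an interval of length dt[t]
# could then be obtained as
# gamma[1:S, 1:S, t] <- nimMatexp(Q[1:S, 1:S] * dt[t])
```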
#---------------------------- case studies to consider
## Dependence among individuals
See @culina_multievent_2013 and @cubaynes_modeling_2021.
## Others
Multispecies? Phylogeny? Social networks? Path analysis? Structural Equation Modelling?
## Prevalence
Prevalence estimation with the hybrid (@SantostasiEtAl2019) or sex-ratio (@pradel2008sex) example.
Insist on the fact that the estimated proportion refers to newly marked individuals, and show how
to go from there to the proportion in the whole population. See formula derived for C. Duchamp.
## Covariate on multinomial logit link or Dirichlet
Example?
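Maybe something like the following minimal sketch: a covariate $x_t$ on the movements out of site A towards three destinations (stay in A, move to B, move to C), with A as the reference category on the multinomial logit scale. All names are illustrative, and the priors for the remaining sites as well as the rest of the model are elided.
```{r eval = FALSE}
multinomlogit.sketch <- nimbleCode({
  for (t in 1:(T-1)){
    q[1,t] <- 1                              # reference: stay in A
    q[2,t] <- exp(alpha[1] + beta[1] * x[t]) # A -> B
    q[3,t] <- exp(alpha[2] + beta[2] * x[t]) # A -> C
    psiA[1:3,t] <- q[1:3,t] / sum(q[1:3,t])  # back-transform to probabilities
  }
  for (j in 1:2){
    alpha[j] ~ dnorm(0, sd = 1.5)            # prior intercepts
    beta[j] ~ dnorm(0, sd = 1.5)             # prior slopes
  }
  # ... transitions out of B and C, and the rest of the model, as usual
})
```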
#---------------------------- Dirichlet
```{r dirichlet, echo = FALSE, fig.cap = "Dirichlet prior with parameter alpha"}
library(gtools) # to make the rdirichlet() function available
library(ggtern) # to visually represent multidimensional prob. distributions
library(dplyr)  # for the pipe, as_tibble() and mutate()
set.seed(123)
n <- 500
alpha1 <- c(1, 1, 1)
p1 <- rdirichlet(n, alpha1)
alpha2 <- c(5, 5, 5)
p2 <- rdirichlet(n, alpha2)
alpha3 <- c(1, 2, 2)
p3 <- rdirichlet(n, alpha3)
alpha4 <- c(2, 4, 8)
p4 <- rdirichlet(n, alpha4)
p <- rbind(p1, p2, p3, p4)
colnames(p) <- c("x", "y", "z") # name columns so as_tibble() keeps them numeric
df <- p %>%
  as_tibble() %>%
  mutate(alpha = rep(c("alpha = c(1, 1, 1)",
                       "alpha = c(5, 5, 5)",
                       "alpha = c(1, 2, 2)",
                       "alpha = c(2, 4, 8)"), each = n))
df %>%
  ggtern(aes(x = x, y = y, z = z)) +
  stat_density_tern(aes(fill = ..level.., alpha = ..level..),
                    geom = 'polygon',
                    bdl = 0.005) + # 2D kernel density estimate on the simplex
  scale_fill_viridis_b() +
  theme_light(base_size = 14) +
  guides(color = "none", fill = "none", alpha = "none") +
  facet_wrap(~alpha)
```
#------------------------------ Arnason–Schwarz (AS) model
```{r, echo = FALSE}
ggplot() +
geom_point(aes(1, 1), size = 2.5, alpha = .7) +
geom_point(aes(1, 1.5), size = 2.5, alpha = .7) +
geom_point(aes(1, 2), size = 2.5, alpha = .7) +
geom_text(aes(1, 2, label = 'non-detection'), nudge_x = -0.6, size = 7) +
geom_text(aes(1, 1.5, label = 'detection in site A'), nudge_x = -0.6, size = 7) +
geom_text(aes(1, 1, label = 'detection in site B'), nudge_x = -0.6, size = 7) +
geom_point(aes(2, 1), size = 2.5, alpha = .7) +
geom_point(aes(2, 1.5), size = 2.5, alpha = .7) +
geom_point(aes(2, 2), size = 2.5, alpha = .7) +
geom_text(aes(2, 2, label = 'alive in site A'), nudge_x = 0.5, size = 7) +
geom_text(aes(2, 1.5, label = 'alive in site B'), nudge_x = 0.5, size = 7) +
geom_text(aes(2, 1, label = 'dead'), nudge_x = 0.5, size = 7) +
xlim(0, 3) +
ylim(0.5, 3) +
annotate('text', x = 1, y = 2.6, label = 'Observations', size = 10) +
annotate('text', x = 2, y = 2.6, label = 'States', size = 10) +
theme_void()
```
```{r, echo = FALSE}
ggplot() +
geom_point(aes(1, 1), size = 2.5, alpha = .7) +
geom_point(aes(1, 1.5), size = 2.5, alpha = .7) +
geom_point(aes(1, 2), size = 2.5, alpha = .7) +
geom_text(aes(1, 2, label = 'non-detection'), nudge_x = -0.6, size = 7) +
geom_text(aes(1, 1.5, label = 'detection in site A'), nudge_x = -0.6, size = 7) +
geom_text(aes(1, 1, label = 'detection in site B'), nudge_x = -0.6, size = 7) +
geom_point(aes(2, 1), size = 2.5, alpha = .7) +
geom_point(aes(2, 1.5), size = 2.5, alpha = .7) +
geom_point(aes(2, 2), size = 2.5, alpha = .7) +
geom_text(aes(2, 2, label = 'alive in site A'), nudge_x = 0.5, size = 7) +
geom_text(aes(2, 1.5, label = 'alive in site B'), nudge_x = 0.5, size = 7) +
geom_text(aes(2, 1, label = 'dead'), nudge_x = 0.5, size = 7) +
xlim(0, 3) +
ylim(0.5, 3) +
annotate('text', x = 1, y = 2.6, label = 'Observations', size = 10) +
annotate('text', x = 2, y = 2.6, label = 'States', size = 10) +
geom_segment(aes(x = 1, y = 1.5, xend = 2, yend = 2), alpha = 0.7, arrow = arrow(length = unit(0.02, "npc"))) +
theme_void()
```
```{r, echo = FALSE}
ggplot() +
geom_point(aes(1, 1), size = 2.5, alpha = .7) +
geom_point(aes(1, 1.5), size = 2.5, alpha = .7) +
geom_point(aes(1, 2), size = 2.5, alpha = .7) +
geom_text(aes(1, 2, label = 'non-detection'), nudge_x = -0.6, size = 7) +
geom_text(aes(1, 1.5, label = 'detection in site A'), nudge_x = -0.6, size = 7) +
geom_text(aes(1, 1, label = 'detection in site B'), nudge_x = -0.6, size = 7) +
geom_point(aes(2, 1), size = 2.5, alpha = .7) +
geom_point(aes(2, 1.5), size = 2.5, alpha = .7) +
geom_point(aes(2, 2), size = 2.5, alpha = .7) +
geom_text(aes(2, 2, label = 'alive in site A'), nudge_x = 0.5, size = 7) +
geom_text(aes(2, 1.5, label = 'alive in site B'), nudge_x = 0.5, size = 7) +
geom_text(aes(2, 1, label = 'dead'), nudge_x = 0.5, size = 7) +
xlim(0, 3) +
ylim(0.5, 3) +
annotate('text', x = 1, y = 2.6, label = 'Observations', size = 10) +
annotate('text', x = 2, y = 2.6, label = 'States', size = 10) +
geom_segment(aes(x = 1, y = 1.5, xend = 2, yend = 2), alpha = 0.7, arrow = arrow(length = unit(0.02, "npc"))) +
geom_segment(aes(x = 1, y = 1, xend = 2, yend = 1.5), alpha = 0.7, arrow = arrow(length = unit(0.02, "npc"))) +
theme_void()
```
```{r, echo = FALSE}
ggplot() +
geom_point(aes(1, 1), size = 2.5, alpha = .7) +
geom_point(aes(1, 1.5), size = 2.5, alpha = .7) +
geom_point(aes(1, 2), size = 2.5, alpha = .7) +
geom_text(aes(2, 2, label = 'non-detection'), nudge_x = 0.5, size = 7) +
geom_text(aes(2, 1.5, label = 'detection in site A'), nudge_x = 0.6, size = 7) +
geom_text(aes(2, 1, label = 'detection in site B'), nudge_x = 0.6, size = 7) +
geom_point(aes(2, 1), size = 2.5, alpha = .7) +
geom_point(aes(2, 1.5), size = 2.5, alpha = .7) +
geom_point(aes(2, 2), size = 2.5, alpha = .7) +
geom_text(aes(1, 2, label = 'alive in site A'), nudge_x = -0.6, size = 7) +
geom_text(aes(1, 1.5, label = 'alive in site B'), nudge_x = -0.6, size = 7) +
geom_text(aes(1, 1, label = 'dead'), nudge_x = -0.6, size = 7) +
xlim(0, 3) +
ylim(0.5, 3) +
annotate('text', x = 1, y = 2.6, label = 'States', size = 10) +
annotate('text', x = 2, y = 2.6, label = 'Observations', size = 10) +
theme_void()
```
### The model construction: how we should think
```{r, echo = FALSE}
ggplot() +
geom_point(aes(1, 1), size = 2.5, alpha = .7) +
geom_point(aes(1, 1.5), size = 2.5, alpha = .7) +
geom_point(aes(1, 2), size = 2.5, alpha = .7) +
geom_text(aes(2, 2, label = 'non-detection'), nudge_x = 0.5, size = 7) +
geom_text(aes(2, 1.5, label = 'detection in site A'), nudge_x = 0.6, size = 7) +
geom_text(aes(2, 1, label = 'detection in site B'), nudge_x = 0.6, size = 7) +
geom_point(aes(2, 1), size = 2.5, alpha = .7) +
geom_point(aes(2, 1.5), size = 2.5, alpha = .7) +
geom_point(aes(2, 2), size = 2.5, alpha = .7) +
geom_text(aes(1, 2, label = 'alive in site A'), nudge_x = -0.6, size = 7) +
geom_text(aes(1, 1.5, label = 'alive in site B'), nudge_x = -0.6, size = 7) +
geom_text(aes(1, 1, label = 'dead'), nudge_x = -0.6, size = 7) +
xlim(0, 3) +
ylim(0.5, 3) +
annotate('text', x = 1, y = 2.6, label = 'States', size = 10) +
annotate('text', x = 2, y = 2.6, label = 'Observations', size = 10) +
geom_segment(aes(x = 1, y = 1, xend = 2, yend = 2), alpha = 0.7, arrow = arrow(length = unit(0.02, "npc"))) +
theme_void()
```
### The model construction: how we should think
```{r, echo = FALSE}
ggplot() +
geom_point(aes(1, 1), size = 2.5, alpha = .7) +
geom_point(aes(1, 1.5), size = 2.5, alpha = .7) +
geom_point(aes(1, 2), size = 2.5, alpha = .7) +
geom_text(aes(2, 2, label = 'non-detection'), nudge_x = 0.5, size = 7) +
geom_text(aes(2, 1.5, label = 'detection in site A'), nudge_x = 0.6, size = 7) +
geom_text(aes(2, 1, label = 'detection in site B'), nudge_x = 0.6, size = 7) +
geom_point(aes(2, 1), size = 2.5, alpha = .7) +
geom_point(aes(2, 1.5), size = 2.5, alpha = .7) +
geom_point(aes(2, 2), size = 2.5, alpha = .7) +
geom_text(aes(1, 2, label = 'alive in site A'), nudge_x = -0.6, size = 7) +
geom_text(aes(1, 1.5, label = 'alive in site B'), nudge_x = -0.6, size = 7) +
geom_text(aes(1, 1, label = 'dead'), nudge_x = -0.6, size = 7) +
xlim(0, 3) +
ylim(0.5, 3) +
annotate('text', x = 1, y = 2.6, label = 'States', size = 10) +
annotate('text', x = 2, y = 2.6, label = 'Observations', size = 10) +
geom_segment(aes(x = 1, y = 2, xend = 2, yend = 2), alpha = 0.7, arrow = arrow(length = unit(0.02, "npc"))) +
geom_segment(aes(x = 1, y = 2, xend = 2, yend = 1.5), alpha = 0.7, arrow = arrow(length = unit(0.02, "npc"))) +
geom_segment(aes(x = 1, y = 1, xend = 2, yend = 2), alpha = 0.7, arrow = arrow(length = unit(0.02, "npc"))) +
theme_void()
```
```{r, echo = FALSE}
ggplot() +
geom_point(aes(1, 1), size = 2.5, alpha = .7) +
geom_point(aes(1, 1.5), size = 2.5, alpha = .7) +
geom_point(aes(1, 2), size = 2.5, alpha = .7) +
geom_text(aes(1.5, 2, label = 'not encountered (0)'), nudge_x = 1, size = 7) +
geom_text(aes(1.5, 1.5, label = 'found, ascertained as breeder (1)'), nudge_x = 1.5, size = 7) +
geom_text(aes(1.5, 1, label = 'found, ascertained as non-breeder (2)'), nudge_x = 1.7, size = 7) +
geom_text(aes(1.5, 0.5, label = 'found, status unknown (3)'), nudge_x = 1.2, size = 7) +
geom_point(aes(1.5, 0.5), size = 2.5, alpha = .7) +
geom_point(aes(1.5, 1), size = 2.5, alpha = .7) +
geom_point(aes(1.5, 1.5), size = 2.5, alpha = .7) +
geom_point(aes(1.5, 2), size = 2.5, alpha = .7) +
geom_text(aes(.5, 2, label = 'breeding'), nudge_x = 0, size = 7) +
geom_text(aes(.5, 1.5, label = 'non-breeding'), nudge_x = -0.2, size = 7) +
geom_text(aes(.5, 1, label = 'dead'), nudge_x = 0.1, size = 7) +
xlim(0, 4.5) +
ylim(0.5, 3) +
annotate('text', x = .5, y = 2.6, label = 'States', size = 10) +
annotate('text', x = 2.5, y = 2.6, label = 'Observations', size = 10) +
theme_void()
```
```{r, echo = FALSE}
ggplot() +
geom_point(aes(1, 1), size = 2.5, alpha = .7) +
geom_point(aes(1, 1.5), size = 2.5, alpha = .7) +
geom_point(aes(1, 2), size = 2.5, alpha = .7) +
geom_text(aes(1.5, 2, label = 'not encountered (0)'), nudge_x = 1, size = 7) +
geom_text(aes(1.5, 1.5, label = 'found, ascertained as breeder (1)'), nudge_x = 1.5, size = 7) +
geom_text(aes(1.5, 1, label = 'found, ascertained as non-breeder (2)'), nudge_x = 1.7, size = 7) +
geom_text(aes(1.5, 0.5, label = 'found, status unknown (3)'), nudge_x = 1.2, size = 7) +
geom_point(aes(1.5, 0.5), size = 2.5, alpha = .7) +
geom_point(aes(1.5, 1), size = 2.5, alpha = .7) +
geom_point(aes(1.5, 1.5), size = 2.5, alpha = .7) +
geom_point(aes(1.5, 2), size = 2.5, alpha = .7) +
geom_text(aes(.5, 2, label = 'breeding'), nudge_x = 0, size = 7) +
geom_text(aes(.5, 1.5, label = 'non-breeding'), nudge_x = -0.2, size = 7) +
geom_text(aes(.5, 1, label = 'dead'), nudge_x = 0.1, size = 7) +
xlim(0, 4.5) +
ylim(0.5, 3) +
annotate('text', x = .5, y = 2.6, label = 'States', size = 10) +
annotate('text', x = 2.5, y = 2.6, label = 'Observations', size = 10) +
geom_segment(aes(x = 1, y = 1, xend = 1.5, yend = 2), alpha = 0.7, arrow = arrow(length = unit(0.02, "npc"))) +
theme_void()
```
```{r, echo = FALSE}
ggplot() +
geom_point(aes(1, 1), size = 2.5, alpha = .7) +
geom_point(aes(1, 1.5), size = 2.5, alpha = .7) +
geom_point(aes(1, 2), size = 2.5, alpha = .7) +
geom_text(aes(1.5, 2, label = 'not encountered (0)'), nudge_x = 1, size = 7) +
geom_text(aes(1.5, 1.5, label = 'found, ascertained as breeder (1)'), nudge_x = 1.5, size = 7) +
geom_text(aes(1.5, 1, label = 'found, ascertained as non-breeder (2)'), nudge_x = 1.7, size = 7) +
geom_text(aes(1.5, 0.5, label = 'found, status unknown (3)'), nudge_x = 1.2, size = 7) +
geom_point(aes(1.5, 0.5), size = 2.5, alpha = .7) +
geom_point(aes(1.5, 1), size = 2.5, alpha = .7) +
geom_point(aes(1.5, 1.5), size = 2.5, alpha = .7) +
geom_point(aes(1.5, 2), size = 2.5, alpha = .7) +
geom_text(aes(.5, 2, label = 'breeding'), nudge_x = 0, size = 7) +
geom_text(aes(.5, 1.5, label = 'non-breeding'), nudge_x = -0.2, size = 7) +
geom_text(aes(.5, 1, label = 'dead'), nudge_x = 0.1, size = 7) +
xlim(0, 4.5) +
ylim(0.5, 3) +
annotate('text', x = .5, y = 2.6, label = 'States', size = 10) +
annotate('text', x = 2.5, y = 2.6, label = 'Observations', size = 10) +
geom_segment(aes(x = 1, y = 2, xend = 1.5, yend = .5), alpha = 0.7, arrow = arrow(length = unit(0.02, "npc"))) +
geom_segment(aes(x = 1, y = 2, xend = 1.5, yend = 1.5), alpha = 0.7, arrow = arrow(length = unit(0.02, "npc"))) +
geom_segment(aes(x = 1, y = 2, xend = 1.5, yend = 2), alpha = 0.7, arrow = arrow(length = unit(0.02, "npc"))) +
theme_void()
```
```{r, echo = FALSE}
ggplot() +
geom_point(aes(1, 1), size = 2.5, alpha = .7) +
geom_point(aes(1, 1.5), size = 2.5, alpha = .7) +
geom_point(aes(1, 2), size = 2.5, alpha = .7) +
geom_text(aes(1.5, 2, label = 'not encountered (0)'), nudge_x = 1, size = 7) +
geom_text(aes(1.5, 1.5, label = 'found, ascertained as breeder (1)'), nudge_x = 1.5, size = 7) +
geom_text(aes(1.5, 1, label = 'found, ascertained as non-breeder (2)'), nudge_x = 1.7, size = 7) +
geom_text(aes(1.5, 0.5, label = 'found, status unknown (3)'), nudge_x = 1.2, size = 7) +
geom_point(aes(1.5, 0.5), size = 2.5, alpha = .7) +
geom_point(aes(1.5, 1), size = 2.5, alpha = .7) +
geom_point(aes(1.5, 1.5), size = 2.5, alpha = .7) +
geom_point(aes(1.5, 2), size = 2.5, alpha = .7) +
geom_text(aes(.5, 2, label = 'breeding'), nudge_x = 0, size = 7) +
geom_text(aes(.5, 1.5, label = 'non-breeding'), nudge_x = -0.2, size = 7) +
geom_text(aes(.5, 1, label = 'dead'), nudge_x = 0.1, size = 7) +
xlim(0, 4.5) +
ylim(0.5, 3) +
annotate('text', x = .5, y = 2.6, label = 'States', size = 10) +
annotate('text', x = 2.5, y = 2.6, label = 'Observations', size = 10) +
geom_segment(aes(x = 1, y = 1.5, xend = 1.5, yend = 1), alpha = 0.7, arrow = arrow(length = unit(0.02, "npc"))) +
geom_segment(aes(x = 1, y = 1.5, xend = 1.5, yend = 2), alpha = 0.7, arrow = arrow(length = unit(0.02, "npc"))) +
geom_segment(aes(x = 1, y = 1.5, xend = 1.5, yend = .5), alpha = 0.7, arrow = arrow(length = unit(0.02, "npc"))) +
theme_void()
```
#----------------------------------- random effects
#### Temporal
Include temporal covariates, say $x_t$ with $\text{logit}(\phi_t) = \beta_1 + \beta_2 x_t$.
If temporal variation not fully explained by covariates, add random effects $\text{logit}(\phi_t)
= \beta_1 + \beta_2 x_t + \varepsilon_t, \; \varepsilon_t \sim N(0,\sigma^2)$.
We may wish to allow for extra variation in the survival vs. water flow relationship.
To do so, we consider a yearly random effect. The prior on the standard deviation of
the random effect is uniform between 0 and 10. **explain how to pick prior for SD**
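One quick way to motivate this choice is a prior predictive check; a sketch, assuming a baseline survival of 0.5 (intercept 0 on the logit scale).
```{r eval = FALSE}
sdeps <- runif(1000, 0, 10)              # candidate prior for the SD
eps <- rnorm(1000, mean = 0, sd = sdeps) # implied random effects
summary(plogis(0 + eps))                 # implied survival probabilities
# U(0, 10) allows extreme year effects (survival piling up near 0 and 1);
# something like U(0, 2) may better reflect plausible annual variation
```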
```{r eval = FALSE}
hmm.phiflowREpt <- nimbleCode({
  delta[1] <- 1                # Pr(alive t = 1) = 1
  delta[2] <- 0                # Pr(dead t = 1) = 0
  for (t in 1:(T-1)){
    logit(phi[t]) <- beta[1] + beta[2] * flow[t] + eps[t] # eps is random effect
    eps[t] ~ dnorm(0, sd = sdeps)
    gamma[1,1,t] <- phi[t]     # Pr(alive t -> alive t+1)
    gamma[1,2,t] <- 1 - phi[t] # Pr(alive t -> dead t+1)
    gamma[2,1,t] <- 0          # Pr(dead t -> alive t+1)
    gamma[2,2,t] <- 1          # Pr(dead t -> dead t+1)
    p[t] ~ dunif(0, 1)         # prior detection
    omega[1,1,t] <- 1 - p[t]   # Pr(alive t -> non-detected t)
    omega[1,2,t] <- p[t]       # Pr(alive t -> detected t)
    omega[2,1,t] <- 1          # Pr(dead t -> non-detected t)
    omega[2,2,t] <- 0          # Pr(dead t -> detected t)
  }
  beta[1] ~ dnorm(0, 1.5)      # prior intercept
  beta[2] ~ dnorm(0, 1.5)      # prior slope
  sdeps ~ dunif(0, 10)
  # likelihood
  for (i in 1:N){
    z[i,first[i]] ~ dcat(delta[1:2])
    for (j in (first[i]+1):T){
      z[i,j] ~ dcat(gamma[z[i,j-1], 1:2, j-1])
      y[i,j] ~ dcat(omega[z[i,j], 1:2, j-1])
    }
  }
})
```
Initial values.
```{r eval = FALSE}
initial.values <- function() list(beta = rnorm(2, 0, 1),
                                  p = runif(my.constants$T - 1, 0, 1),
                                  sdeps = runif(1, 0, 3),
                                  z = zinits)
```
Parameters to be monitored.
```{r eval = FALSE}
parameters.to.save <- c("beta", "p", "phi", "sdeps")
```
MCMC details. Note that we've increased the number of iterations and the length of the burn-in period.
```{r eval = FALSE}
n.iter <- 10000
n.burnin <- 5000
n.chains <- 2
```
Run NIMBLE:
```{r eval = FALSE}
mcmc.phiflowREpt <- nimbleMCMC(code = hmm.phiflowREpt,
                               constants = my.constants,
                               data = my.data,
                               inits = initial.values,
                               monitors = parameters.to.save,
                               niter = n.iter,
                               nburnin = n.burnin,
                               nchains = n.chains)
```
Display the outputs. It seems that the water flow effect is not so important anymore.
```{r eval = FALSE}
MCMCsummary(object = mcmc.phiflowREpt, round = 2)
```
```{r echo = FALSE}
load(here::here("dat/phiflowREpt.RData"))
MCMCsummary(object = mcmc.phiflowREpt, round = 2)
```
Trace plots for the standard deviation of the random effect.
```{r}
MCMCtrace(object = mcmc.phiflowREpt,
          params = "sdeps",
          pdf = FALSE)
```
#### Individual
#---------------------------------- covariates
Sex effect

+ Let's use a covariate $\text{sex}$ that takes the value 0 for males and 1 for females
+ And write $\text{logit}(\phi_i) = \beta_1 + \beta_2 \; \text{sex}_i$ for bird $i$
+ Then male survival satisfies
$$\text{logit}(\phi_i) = \beta_1$$
+ And female survival satisfies
$$\text{logit}(\phi_i) = \beta_1 + \beta_2$$
NIMBLE implementation with sex as a covariate:
```{r eval = FALSE}
hmm.phisexp <- nimbleCode({
  ...
  for (i in 1:N){
    logit(phi[i]) <- beta[1] + beta[2] * sex[i]
    gamma[1,1,i] <- phi[i]     # Pr(alive t -> alive t+1)
    gamma[1,2,i] <- 1 - phi[i] # Pr(alive t -> dead t+1)
    gamma[2,1,i] <- 0          # Pr(dead t -> alive t+1)
    gamma[2,2,i] <- 1          # Pr(dead t -> dead t+1)
  }
  beta[1] ~ dnorm(mean = 0, sd = 1.5)
  beta[2] ~ dnorm(mean = 0, sd = 1.5)
  phi_male <- 1/(1+exp(-beta[1]))             # male survival, back-transformed
  phi_female <- 1/(1+exp(-(beta[1]+beta[2]))) # female survival, back-transformed
  ...
  # likelihood
  for (i in 1:N){
    z[i,first[i]] ~ dcat(delta[1:2])
    for (j in (first[i]+1):T){
      z[i,j] ~ dcat(gamma[z[i,j-1], 1:2, i])
      y[i,j] ~ dcat(omega[z[i,j], 1:2])
    }
  }
})
```
```{r echo = FALSE}
load(here::here("dat/phisexp.RData"))
MCMCsummary(object = mcmc.phisexp, round = 2)
```
#----------------------------------- further examples multievent
#------------------------------------ flexibility of multistate
#------------------------------------- prior elicitation
#---------------------------------------------------- Prior predictive checks
### Linear regression
Unreasonable prior $\beta \sim N(0, 1000^2)$
```{r, echo = FALSE}
df <- tibble(value = rnorm(1000, 0, 1000))
df %>%
  ggplot(aes(x = value)) +
  geom_density(linewidth = 2) +
  labs(x = "Height (m)") +
  theme_light(base_size = 20)
```
Reasonable prior $\beta \sim N(2, 0.5^2)$
```{r, echo = FALSE}
df <- tibble(value = rnorm(1000, 2, 0.5))
df %>%
  ggplot(aes(x = value)) +
  geom_density(linewidth = 2) +
  labs(x = "Height (m)") +
  theme_light(base_size = 20)
```
### Logistic regression
Unreasonable prior $\text{logit}(\phi) = \beta \sim N(0, 10^2)$
```{r, echo = FALSE}
df <- tibble(value = plogis(rnorm(1000, 0, 10)))
df %>%
  ggplot(aes(x = value)) +
  geom_density(linewidth = 2) +
  labs(x = "survival") +
  theme_light(base_size = 20)
```
Reasonable prior $\text{logit}(\phi) = \beta \sim N(0, 1.5^2)$
```{r, echo = FALSE}
df <- tibble(value = plogis(rnorm(1000, 0, 1.5)))
df %>%
  ggplot(aes(x = value)) +
  geom_density(linewidth = 2) +
  labs(x = "survival") +
  theme_light(base_size = 20)
```
##------------------------------------------------- Parameter-redundancy issue
There are two potential issues: intrinsic or extrinsic parameter redundancy.
Intrinsic redundancy means that the model likelihood can be expressed with a smaller
number of parameters; it is a feature of the model. Extrinsic redundancy means that
the model structure is fine, but the lack of data makes a parameter non-estimable;
this is a feature of the data.
```{r, echo = FALSE}
load(here::here("dat", "profiledeviance.RData"))
df <- data.frame(last_survival = grid_lastphi,
                 max_dev = devmax)
mytable <- df %>% slice(c(20, 27, 38, 44)) %>% round(2)
ggplot() +
  geom_line(data = df,
            aes(x = last_survival, y = max_dev),
            linewidth = 1.5,
            color = "gray70") +
  geom_point(data = df %>% slice(c(20, 27, 38, 44)),
             aes(x = last_survival, y = max_dev),
             size = 3.5,
             pch = 16,
             color = "darkblue") +
  labs(x = "survival over last time interval", y = "-log-likelihood") +
  theme_light(base_size = 14) +
  annotation_custom(gridExtra::tableGrob(mytable, rows = NULL),
                    xmin = 0.4, xmax = 0.9, ymin = 350, ymax = 380)
```
The last survival and recapture probabilities cannot be estimated separately: only their
product is identifiable, which shows up as poor mixing of the chains. This is an instance
of the intrinsic redundancy described above.
### Prior-posterior overlap for $\phi_4$ and $\phi_6$
```{r, echo = FALSE}
load(here::here("dat", "dipper.RData"))
PR <- runif(1500, 0, 1)
MCMCtrace(mcmc.phitpt,
          params = c('phi[4]'),
          ISB = FALSE,
          exact = TRUE,
          priors = PR,
          pdf = FALSE,
          Rhat = FALSE,
          n.eff = FALSE,
          ind = FALSE,
          type = "density",
          lwd_den = 3,
          lwd_pr = 3,
          col_pr = "gray70",
          lty_pr = 2,
          main_den = "",
          xlab_den = "survival prob. between years 1984 and 1985",
          sz_txt = 1.8,
          sz_ax = 1.8,
          sz_ax_txt = 1.8,
          sz_tick_txt = 1.8,
          sz_main_txt = 1.8)
```
```{r, echo = FALSE}
load(here::here("dat", "dipper.RData"))
PR <- runif(1500, 0, 1)
MCMCtrace(mcmc.phitpt,
          params = c('phi[6]'),
          ISB = FALSE,
          exact = TRUE,
          priors = PR,
          pdf = FALSE,
          Rhat = FALSE,
          n.eff = FALSE,
          ind = FALSE,
          type = "density",
          lwd_den = 3,
          lwd_pr = 3,
          col_pr = "gray70",
          lty_pr = 2,
          main_den = "",
          xlab_den = "survival prob. between years 1986 and 1987",
          sz_txt = 1.8,
          sz_ax = 1.8,
          sz_ax_txt = 1.8,
          sz_tick_txt = 1.8,
          sz_main_txt = 1.8)
```
### Prior-posterior overlap for $p_3$ and $p_7$
```{r, echo = FALSE}
load(here::here("dat", "dipper.RData"))
PR <- runif(1500, 0, 1)
MCMCtrace(mcmc.phitpt,
          params = c('p[2]'),
          ISB = FALSE,
          exact = TRUE,
          priors = PR,
          pdf = FALSE,
          Rhat = FALSE,
          n.eff = FALSE,
          ind = FALSE,
          type = "density",
          lwd_den = 3,
          lwd_pr = 3,
          col_pr = "gray70",
          lty_pr = 2,
          main_den = "",
          xlab_den = "recapture prob. at year 1983",
          sz_txt = 1.8,
          sz_ax = 1.8,
          sz_ax_txt = 1.8,
          sz_tick_txt = 1.8,
          sz_main_txt = 1.8)
```
```{r, echo = FALSE}
load(here::here("dat", "dipper.RData"))
PR <- runif(1500, 0, 1)
MCMCtrace(mcmc.phitpt,
          params = c('p[6]'),
          ISB = FALSE,
          exact = TRUE,
          priors = PR,
          pdf = FALSE,
          Rhat = FALSE,
          n.eff = FALSE,
          ind = FALSE,
          type = "density",
          lwd_den = 3,
          lwd_pr = 3,
          col_pr = "gray70",
          lty_pr = 2,
          main_den = "",
          xlab_den = "recapture prob. at year 1987",
          sz_txt = 1.8,
          sz_ax = 1.8,
          sz_ax_txt = 1.8,
          sz_tick_txt = 1.8,
          sz_main_txt = 1.8)
```
#----------------------------------------------------- WAIC
Models with smaller DIC values represent more parsimonious descriptions
of the data than models with larger DIC values. The relative parsimony
of the models can be assessed by comparing the difference in the DIC values.
The DIC is a Bayesian equivalent of Akaike's information criterion (AIC;
Akaike 1973; Hilborn & Mangel 1997; Burnham & Anderson 2002) and the rules
of thumb suggested by Burnham & Anderson (2002) for comparing models with
AIC seem to apply for DIC (Spiegelhalter et al. 2002). Therefore, differences
of less than 2 indicate that the two models are indistinguishable, differences
of 4–7 that the poorer model has considerably less support, and differences of
more than 10 that the poorer model
has essentially no support (Burnham & Anderson 2002).
The effective number of parameters $p_\text{WAIC}$ can be used as a measure of the
complexity of the model, but it should not be overinterpreted, as the original goal is to
estimate the difference between lpd (the log pointwise predictive density) and elpd (the
expected log pointwise predictive density for new data).
This selection is based on assessing the trade-off between the fit and
complexity of the models, with the aim of finding the most parsimonious model
of a set of models. This is what the Akaike information criterion (AIC)
tries to achieve, with $\text{AIC} = - 2 \log(L(\hat{\theta}_1,\ldots,\hat{\theta}_K)) + 2 K$,
where $L$ is the likelihood and $K$ the number of parameters $\theta_i$.
The first term is a measure of the goodness-of-fit of the model to the data: the more
parameters you have, the smaller the deviance (or the bigger the likelihood).
The second term is a penalty: twice the number of parameters $K$. AIC strikes a
balance between the *quality of fit* and the *complexity* of a model. The best
model is the one with the lowest AIC value.
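As a toy illustration with base R and the built-in `cars` data: the richer model fits better, but its extra parameters must buy enough likelihood to offset the penalty.
```{r eval = FALSE}
fit1 <- lm(dist ~ speed, data = cars)          # simple model
fit2 <- lm(dist ~ poly(speed, 5), data = cars) # more flexible model
AIC(fit1, fit2) # compare penalized fits; lowest AIC wins
```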
In Bayesian statistics, the relative parsimony of the models can be compared
using the deviance information criterion (DIC). DIC is a measure of the fit of
the model to the data that is penalized for the model's complexity. The measure
of fit is based on the likelihood of obtaining the observed data given the means
of the posterior distribution of the parameters. Parameter values provide a better
fit if they are more likely to have produced the observed data. The complexity
of the model is measured by the effective number of estimated parameters.
**difference between AIC and DIC, and problems with DIC.**
A Bayesian version exists with the Watanabe-Akaike (or Widely Applicable) Information
Criterion (WAIC), given by
$\textrm{WAIC} = -2 \sum_{i = 1}^n \log E[p(y_i \mid \theta)] + 2 p_\text{WAIC}$,
where $E[p(y_i \mid \theta)]$ is the posterior mean of the likelihood evaluated
pointwise at each $i$th observation, and $p_\text{WAIC}$ is a penalty computed
using the posterior variance of the likelihood.
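In NIMBLE, the WAIC can be requested directly from `nimbleMCMC()`. A minimal sketch, reusing the objects from the random-effect example above; note that which flavor of WAIC is computed depends on the monitored nodes.
```{r eval = FALSE}
mcmc.out <- nimbleMCMC(code = hmm.phiflowREpt,
                       constants = my.constants,
                       data = my.data,
                       inits = initial.values,
                       monitors = parameters.to.save,
                       niter = n.iter,
                       nburnin = n.burnin,
                       nchains = n.chains,
                       WAIC = TRUE) # also return the WAIC
mcmc.out$WAIC # smaller values suggest better expected predictive accuracy
```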
Even if all of the models being considered have mismatches with the data
(forward reference to gof section, say relative fit vs absolute fit), it can
be informative to evaluate their predictive accuracy, compare them, and consider
where to go next. The challenge then is to estimate predictive model accuracy,
correcting for the bias inherent in evaluating a model’s predictions of the data
that were used to fit it.
A more general summary of predictive fit is the log predictive density, $\log p(y \mid \theta)$,
which is proportional to the mean squared error if the model is normal with constant
variance. The log predictive density is also sometimes called the log-likelihood.
The log predictive density has an important role in statistical model comparison
because of its connection to the Kullback-Leibler information measure (see Burnham
and Anderson, 2002, and Robert, 1996).
In the limit of large sample sizes, the model with the lowest Kullback-Leibler
information—and thus, the highest expected log predictive density—will have the
highest posterior probability. Thus, it seems reasonable to use expected log
predictive density as a measure of overall model fit.
AIC uses the maximum likelihood as a measure of goodness-of-fit, and the number
of free parameters as a measure of flexibility, with more parameters resulting in
harsher penalties.

DIC uses the average log-likelihood over the posterior distribution
as a measure of goodness-of-fit, and the difference between this average and the
log-likelihood at some fixed, central point of the posterior as a measure of
flexibility, with greater differences resulting in harsher penalties. Although
the mean of the parameter values over the joint posterior is often used as the
point estimate in this calculation (Spiegelhalter et al., 2002), I instead use
the point of minimum deviance in the posterior for the point estimate (also
recommended by Spiegelhalter et al., 2002), as the use of the mean results in
the strong assumption that the joint posterior distribution is a multivariate
normal, and can result in negative estimates of flexibility when this assumption
is violated (Vehtari et al., 2017).

WAIC uses a similar measure of goodness-of-fit
as DIC, being the log of the average posterior likelihood for each data point,
but uses the variance in log-likelihood over the posterior distribution as a
measure of flexibility, with greater variances resulting in harsher penalties.
A natural way to estimate out-of-sample prediction error is cross-validation (see Geisser and
Eddy, 1979, and Vehtari and Lampinen, 2002, for a Bayesian perspective), but researchers have
always sought alternative measures, as cross-validation requires repeated model fits and can run
into trouble with sparse data. For practical reasons alone, there remains a place for simple bias
corrections such as AIC (Akaike, 1973), DIC (Spiegelhalter et al., 2002, van der Linde, 2005), and,
more recently, WAIC (Watanabe, 2010), and all these can be viewed as approximations to different
versions of cross-validation (Stone, 1977).
All the different measures discussed above are based on adjusting the log predictive density of the
observed data by subtracting an approximate bias correction. The measures differ both in their
starting points and in their adjustments.
These methods penalize models for overfitting, as overfitting will
result in poor prediction of future data.
In addition to varying in computational tractability, these methods vary on their
theoretical basis for selecting between models. AIC, DIC, and WAIC all attempt
to find the model with the best “predictive accuracy”, which is the model that
is able to best predict future empirical data, given some fixed set of parameter
values or distributions (Akaike, 1974; Spiegelhalter et al., 2002; Vehtari et al., 2017).
#------------------------------------------ model validation
The fit of the model to the data can be assessed using posterior predictive checks (Rubin,
1984), and prior predictive checks can be used when evaluating potential replications
involving new parameter values.
#------------------------------------------- covariates
Proportion of variance explained. Path analyses, structural equation
models. Splines (more about spatial stats? CAR models?). Imputation
and multistate models to account for missing data. Explain the basics
of parametric statistical modeling (linear models, GLMs and random effects).
Talk about ANODEV?