---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include=FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/"
)
library(compboost)
ggplot2::theme_set(ggthemes::theme_tufte())
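# Wrapper that applies the "Set1" color palette to every plot; `ggplot2` is not attached, so the scale is namespaced.
ggplot = function(...) ggplot2::ggplot(...) + ggplot2::scale_color_brewer(palette = "Set1")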
set.seed(31415)
```
# compboost: Fast and Flexible Component-Wise Boosting Framework <a href='https://danielschalk.com/compboost/'><img src='man/figures/logo.png' align="right" height="139" /></a>
[Build status](https://github.com/schalkdaniel/compboost/actions)
[Code coverage](https://codecov.io/gh/schalkdaniel/compboost)
[License: LGPL v3](https://www.gnu.org/licenses/lgpl-3.0) [CRAN](https://cran.r-project.org/package=compboost) [JOSS paper](http://joss.theoj.org/papers/94cfdbbfdfc8796c5bdb1a74ee59fcda)
[Documentation](https://danielschalk.com/compboost/) |
[Contributors](CONTRIBUTORS.md) |
[Release Notes](NEWS.md)
## Overview
Component-wise boosting applies the boosting framework to
statistical models, e.g., generalized additive models using component-wise smoothing
splines. Boosting these kinds of models maintains interpretability and enables
unbiased model selection in high-dimensional feature spaces.
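
To make the idea concrete, here is a minimal base-`R` sketch of component-wise boosting with the squared loss and one simple linear base-learner per feature. It is an illustration only and not how `compboost` is implemented; the function `componentwiseBoost()` and its arguments are hypothetical names chosen for this example.

```r
# Illustrative sketch: component-wise L2 boosting with linear base-learners.
# `componentwiseBoost`, `iterations`, and `learning_rate` are hypothetical names.
componentwiseBoost = function(X, y, iterations = 100L, learning_rate = 0.1) {
  # Center the features so that no per-feature intercept is required.
  X = scale(as.matrix(X), center = TRUE, scale = FALSE)
  if (is.null(colnames(X))) colnames(X) = paste0("x", seq_len(ncol(X)))

  intercept = mean(y)                  # loss-optimal constant for the squared loss
  coefs = setNames(numeric(ncol(X)), colnames(X))
  pred = rep(intercept, length(y))
  selected = character(iterations)
  ss = colSums(X^2)

  for (m in seq_len(iterations)) {
    res = y - pred                       # negative gradient of 0.5 * (y - f)^2
    beta = drop(crossprod(X, res)) / ss  # fit every base-learner to the residuals
    sse = sum(res^2) - beta^2 * ss       # remaining sum of squared errors per base-learner
    best = which.min(sse)                # only the best base-learner is updated

    coefs[best] = coefs[best] + learning_rate * beta[best]
    pred = pred + learning_rate * beta[best] * X[, best]
    selected[m] = colnames(X)[best]
  }
  # Coefficients refer to the centered features.
  list(intercept = intercept, coefs = coefs, selected = selected,
    risk = 0.5 * mean((y - pred)^2))
}

features = iris[, c("Sepal.Width", "Petal.Length", "Petal.Width")]
fit = componentwiseBoost(features, iris$Sepal.Length)
table(fit$selected)  # how often each feature was chosen
```

Because each iteration updates only one base-learner, features that are never selected keep a coefficient of zero, which is where the built-in model selection comes from.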
The `R` package `compboost` is an alternative implementation of component-wise
boosting written in `C++` to obtain high runtime
performance and full memory control. The main idea is to provide a modular
class system which can be extended without editing the
source code. Therefore, it is possible to use `R` functions as well as
`C++` functions for custom base-learners, losses, logging mechanisms or
stopping criteria.
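
As a rough sketch of what such an extension can look like, the quadratic loss can be written as three plain `R` functions: the loss, its gradient with respect to the prediction, and a constant initializer. The function names below are hypothetical, and the call to `LossCustom$new()` assumes that the constructor takes these three functions in this order; check the package documentation for the exact signature.

```r
# Quadratic loss expressed as plain R functions (names are illustrative).
lossFun = function(truth, prediction) 0.5 * (truth - prediction)^2
gradFun = function(truth, prediction) prediction - truth  # derivative w.r.t. the prediction
initFun = function(truth) mean(truth)                      # loss-optimal constant

# Assumption: `LossCustom$new()` accepts loss, gradient, and initializer in this order.
# The guard keeps the snippet runnable when `compboost` is not installed.
if (requireNamespace("compboost", quietly = TRUE)) {
  custom_loss = compboost::LossCustom$new(lossFun, gradFun, initFun)
}
```

The resulting loss object would then be handed to the model in place of a built-in loss.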
For an introduction to and overview of the functionality, visit the [project page](https://schalkdaniel.github.io/compboost/).
## Installation
<!--
#### CRAN version:
```r
install.packages("compboost")
```
-->
#### Developer version:
```r
devtools::install_github("schalkdaniel/compboost")
```
## Examples
The examples are rendered using <code>compboost `r packageVersion("compboost")`</code>.
The fastest way to train a `Compboost` model is to use the wrapper functions `boostLinear()` or `boostSplines()`:
```{r, results="hide", warning=FALSE, fig.width=10, fig.height=2, out.width="100%"}
cboost = boostSplines(data = iris, target = "Sepal.Length",
oob_fraction = 0.3, iterations = 500L, trace = 100L)
ggrisk = plotRisk(cboost)
ggpe = plotPEUni(cboost, "Petal.Length")
ggicont = plotIndividualContribution(cboost, iris[70, ], offset = FALSE)
library(patchwork)
ggrisk + ggpe + ggicont
```
For more extensive examples and a guide to the `R6` interface, visit the [project page](https://danielschalk.com/compboost/articles/getting_started/use_case.html).
## mlr3 learner
`compboost` also ships [`mlr3`](https://mlr3.mlr-org.com/) learners for regression and binary classification, which can be used to apply `compboost` within the whole [`mlr3verse`](https://mlr3.mlr-org.com/):
```{r}
library(mlr3)
ts = tsk("spam")
lcboost = lrn("classif.compboost", iterations = 500L, bin_root = 2)
lcboost$train(ts)
lcboost$predict_type = "prob"
lcboost$predict(ts)
# Use the `$model` field to access all the `compboost` functionality:
plotBaselearnerTraces(lcboost$model) +
plotPEUni(lcboost$model, "charDollar")
```
## Save and load models
Because the backend consists of `C++` objects, models cannot be saved with `R`'s `save()` function. Instead, use `$saveToJson("mymodel.json")` to save the model to `mymodel.json` and `Compboost$new(file = "mymodel.json")` to load it again:
```{r, eval=FALSE}
cboost = boostSplines(iris, "Sepal.Width")
cboost$saveToJson("mymodel.json")
cboost_new = Compboost$new(file = "mymodel.json")
# Save the model without data:
cboost$saveToJson("mymodel_without_data.json", rm_data = TRUE)
```
```{r, include=FALSE}
file.remove("mymodel.json", "mymodel_without_data.json")
```
## Benchmark
- A small benchmark compares the runtime behavior and memory consumption of `compboost` with those of [`mboost`](https://cran.r-project.org/web/packages/mboost/index.html). The results are available [here](https://github.com/schalkdaniel/compboost/tree/master/benchmark).
- A larger benchmark, including adaptations to increase runtime and memory efficiency, can be found [here](https://doi.org/10.1080/10618600.2022.2116446).
## Citing
To cite `compboost` in publications, please use:
> Schalk et al. (2018). compboost: Modular Framework for Component-Wise Boosting. Journal of Open Source Software, 3(30), 967, https://doi.org/10.21105/joss.00967
```
@article{schalk2018compboost,
author = {Daniel Schalk and Janek Thomas and Bernd Bischl},
title = {compboost: Modular Framework for Component-Wise Boosting},
URL = {https://doi.org/10.21105/joss.00967},
year = {2018},
publisher = {Journal of Open Source Software},
volume = {3},
number = {30},
pages = {967},
journal = {JOSS}
}
```
## Testing
### On your local machine
To check the package functionality on your local machine, run the test suite with `devtools`:
```{r, eval=FALSE}
devtools::test()
```