Default to dropping data information if kept? #27

Some models store a copy of the data (e.g. the model frame) inside the fitted object so that predictions can be made later. This can be convenient but has at least two downsides (described below). This issue proposes that, wherever the stored data is not needed, models that keep it by default have it removed from the fitted model. E.g., by default, `lm` should set the arg `model = FALSE` (and the `model`, `x`, and `y` args should all be reviewed).

## Downsides of keeping the original data frame by default

1. It creates a memory overhead. E.g., for `lm`:

```
object.size(lm(mpg ~ ., mtcars))
#> 45768 bytes
object.size(lm(mpg ~ ., mtcars, model = FALSE))
#> 28152 bytes
```

twidlr requires a data frame for `predict`, so if the stored data is retained only to support `predict`, it can be dropped.
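As a minimal sketch of this point, using only base R's `lm` rather than twidlr's own wrappers (which are not shown here), a fit made with `model = FALSE` should still give the same predictions as the default fit when a data frame is passed explicitly as `newdata`:

```r
# Fit the same model with and without the stored model frame.
fit_keep <- lm(mpg ~ wt + hp, data = mtcars)
fit_drop <- lm(mpg ~ wt + hp, data = mtcars, model = FALSE)

# The model frame is only present in the default fit.
has_model_keep <- !is.null(fit_keep$model)
has_model_drop <- !is.null(fit_drop$model)

# Predict from an explicitly supplied data frame in both cases.
pred_keep <- predict(fit_keep, newdata = mtcars)
pred_drop <- predict(fit_drop, newdata = mtcars)

# Compare the two sets of predictions.
same_predictions <- isTRUE(all.equal(pred_keep, pred_drop))
```

Here `same_predictions` is expected to be `TRUE`, since `predict` builds the design matrix from `newdata` and the stored terms rather than from the stored model frame.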

2. It is inconsistent between models and thus misleading. For example, `lm` stores the original data by default, so `predict` without new data keeps working from the stored copy. Other models do not store it and instead refer back to the original data frame in the global environment. E.g., see examples [here](https://gist.github.com/drsimonj/5b2cfc428fce350676db5dc77c059052). The same behaviour can be reproduced with `lm` by fitting with `model = FALSE`.
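The `lm` case can be sketched as follows. This is an illustration in base R, not the code from the linked gist, and the data frame `d` is a hypothetical working copy of `mtcars`. It assumes that, without a stored model frame, `predict` rebuilds the data by re-evaluating the original call:

```r
# Hypothetical working copy of the data.
d <- mtcars

fit_keep <- lm(mpg ~ wt, data = d)                 # stores the model frame
fit_drop <- lm(mpg ~ wt, data = d, model = FALSE)  # does not

# Predictions without newdata, before the data changes.
p_keep_before <- predict(fit_keep)
p_drop_before <- predict(fit_drop)

# Modify the original data frame after fitting.
d$wt <- d$wt * 2

# fit_keep predicts from its stored copy; fit_drop looks `d` up again.
p_keep_after <- predict(fit_keep)
p_drop_after <- predict(fit_drop)

keep_unchanged <- isTRUE(all.equal(p_keep_before, p_keep_after))
drop_unchanged <- isTRUE(all.equal(p_drop_before, p_drop_after))
```

Under that assumption, `keep_unchanged` is `TRUE` while `drop_unchanged` is `FALSE`: the fit without a stored model frame silently follows whatever `d` currently contains. Passing `newdata` explicitly avoids the difference.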
