Create a distinction between habitual and goal-directed control algorithms

Model-free (habitual) and model-based (goal-directed) control algorithms will most likely expose different internal interfaces to the agent. If both algorithms propose an action, how do we know which one to trust?

Confidence is the key: habitual control algorithms can be trusted in high-confidence subsections of the environment. For example, if the agent has mined 1,000 blocks of iron ore, it probably does not need planning or a model of the environment to mine the 1,001st block. However, if the agent is exploring a never-before-visited biome or underground mine, goal-directed control algorithms would be more useful. With no experience in this new portion of the environment, the agent has to depend on the _dynamics_ of the environment it has learned thus far.
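One way to picture the arbitration described above is a small sketch that counts experience per context and hands control to the habitual policy only once confidence passes a threshold. Everything here is an assumption made for illustration: the `ConfidenceArbiter` class, the visit-count confidence formula `n / (n + half_point)`, the `threshold` value, and the string context keys are hypothetical and not part of the existing agent.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, Tuple

# A policy maps an observation to an action name (hypothetical interface).
Policy = Callable[[object], str]


class ConfidenceArbiter:
    """Pick between a habitual and a goal-directed policy by confidence."""

    def __init__(
        self,
        habitual: Policy,
        goal_directed: Policy,
        threshold: float = 0.9,     # assumed cut-off for trusting habits
        half_point: float = 100.0,  # visits at which confidence reaches 0.5
    ) -> None:
        self.habitual = habitual
        self.goal_directed = goal_directed
        self.threshold = threshold
        self.half_point = half_point
        self.visits: Dict[Hashable, int] = defaultdict(int)

    def record(self, context: Hashable) -> None:
        """Count one more experience in the given context."""
        self.visits[context] += 1

    def confidence(self, context: Hashable) -> float:
        """Saturating confidence in [0, 1): 0 when unseen, near 1 when familiar."""
        n = self.visits.get(context, 0)
        return n / (n + self.half_point)

    def select(self, context: Hashable, observation: object) -> Tuple[str, str]:
        """Return (controller name, proposed action) for this context."""
        if self.confidence(context) >= self.threshold:
            return "habitual", self.habitual(observation)
        return "goal_directed", self.goal_directed(observation)


arbiter = ConfidenceArbiter(
    habitual=lambda obs: "mine",
    goal_directed=lambda obs: "plan_then_act",
)
for _ in range(1000):
    arbiter.record("iron_ore")

familiar = arbiter.select("iron_ore", None)
novel = arbiter.select("unvisited_biome", None)
```

Visit counts are only one possible confidence signal. The prediction error of a learned dynamics model could play the same role, with low error standing in for high confidence.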

The Intrinsic Curiosity Module (ICM) already introduces models that predict the dynamics of the environment. How can we extend the idea of this model-based approach to introduce planning to the agent?
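As a starting point for discussion, here is a minimal sketch of planning on top of a forward dynamics model: roll candidate action sequences through the model, score the predicted states, and execute the first action of the best sequence. This is not ICM's API. The `plan_first_action` function, the `forward_model` and `score` callables, and the 1-D toy dynamics are hypothetical stand-ins; in the agent, the forward model would presumably be the learned network operating on feature embeddings, and the score would come from a reward or value estimate.

```python
from itertools import product
from typing import Callable, Sequence

State = float
Action = int


def plan_first_action(
    state: State,
    forward_model: Callable[[State, Action], State],
    score: Callable[[State], float],
    actions: Sequence[Action],
    horizon: int = 5,
) -> Action:
    """Exhaustively search action sequences of length `horizon` using the
    model's predictions, and return the first action of the best sequence."""
    best_total = float("-inf")
    best_first = actions[0]
    for sequence in product(actions, repeat=horizon):
        predicted = state
        total = 0.0
        for action in sequence:
            # Imagined step: no interaction with the real environment.
            predicted = forward_model(predicted, action)
            total += score(predicted)
        if total > best_total:
            best_total, best_first = total, sequence[0]
    return best_first


# Toy stand-ins: the state is a position on a line, actions move one step.
GOAL = 3.0


def toy_forward_model(state: State, action: Action) -> State:
    return state + action


def toy_score(state: State) -> float:
    return -abs(state - GOAL)


first_from_left = plan_first_action(0.0, toy_forward_model, toy_score, [-1, 1])
first_from_right = plan_first_action(5.0, toy_forward_model, toy_score, [-1, 1])
```

Exhaustive search only works for tiny action sets and short horizons. With a larger action space, sampling-based variants such as random shooting would replace the `product` loop while keeping the same imagine-score-select structure.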

Create a distinction between habitual and goal-directed control algorithms #26
