Create a distinction between habitual and goal-directed control algorithms #26

@thopkins32

Description

Model-free (habitual) and model-based (goal-directed) control algorithms will most likely expose different internal interfaces to the agent. If both propose an action, how do we decide which one to trust?

Confidence is the key: habitual control can be trusted in regions of the environment where the agent has high confidence. For example, if the agent has mined 1,000 blocks of iron ore, it probably does not need planning or a model of the environment to mine the 1,001st block. However, if the agent is exploring a never-before-visited biome or underground mine, goal-directed control would be more useful: with no experience in this new part of the environment, the agent must rely on the environment dynamics it has learned so far.
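To make the arbitration idea concrete, here is a minimal sketch that assumes confidence can be approximated by how often the agent has already seen a given context. The `ConfidenceArbiter` class, the `threshold` hyperparameter, and the string-valued contexts and actions are all hypothetical names used for illustration; a prediction-error signal from a learned dynamics model could serve as the confidence measure instead of visit counts.

```python
from collections import Counter
from typing import Callable, Hashable, Tuple

Action = str  # hypothetical action type, for illustration only
Policy = Callable[[Hashable], Action]


class ConfidenceArbiter:
    """Choose between a habitual and a goal-directed proposal per context."""

    def __init__(self, threshold: int = 100) -> None:
        # Assumed hyperparameter: visits needed before the habit is trusted.
        self.threshold = threshold
        self.visits: Counter = Counter()

    def confidence(self, context: Hashable) -> float:
        # Saturating count-based proxy for confidence, in [0, 1].
        return min(1.0, self.visits[context] / self.threshold)

    def select(
        self, context: Hashable, habitual: Policy, goal_directed: Policy
    ) -> Tuple[str, Action]:
        # Trust the habit only once confidence has saturated.
        use_habit = self.confidence(context) >= 1.0
        source = "habitual" if use_habit else "goal_directed"
        action = habitual(context) if use_habit else goal_directed(context)
        self.visits[context] += 1
        return source, action
```

With `threshold=1000`, the iron-ore example above would hand control to the habitual policy once 1,000 visits to that context have been recorded, while a new biome would start at zero confidence and be routed to the goal-directed controller.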

The Intrinsic Curiosity Module (ICM) already introduces models that predict the dynamics of the environment. How can we extend this model-based idea to give the agent the ability to plan?
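One possible direction, sketched under heavy assumptions, is to roll a learned forward model out over candidate action sequences and act on the best one (random-shooting planning). This is not something the ICM does by itself: `forward_model` and `reward_fn` below are hypothetical callables, and states are plain floats for illustration, whereas the ICM forward model predicts next-state features from the current features and the action.

```python
import random
from typing import Callable, Optional, Sequence

State = float  # stand-in for a learned feature vector
Action = int


def plan_random_shooting(
    state: State,
    forward_model: Callable[[State, Action], State],
    reward_fn: Callable[[State], float],
    actions: Sequence[Action],
    horizon: int = 5,
    n_candidates: int = 64,
    rng: Optional[random.Random] = None,
) -> Action:
    """Return the first action of the best sampled action sequence."""
    if rng is None:
        rng = random.Random(0)
    best_return = float("-inf")
    best_first = actions[0]
    for _ in range(n_candidates):
        # Sample a random action sequence of length `horizon`.
        seq = [rng.choice(actions) for _ in range(horizon)]
        s, total = state, 0.0
        for a in seq:
            # Imagine the next state with the model instead of acting.
            s = forward_model(s, a)
            total += reward_fn(s)
        if total > best_return:
            best_return, best_first = total, seq[0]
    return best_first
```

Only the first action is returned so the agent can replan at every step, which limits how far model errors compound over the imagined rollout.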

Metadata

    Labels

    idea (Something to consider), question (Further information is requested)
