[Feature request] Nuke the Assistant Axis #163

@kabachuha

The "Assistant Axis" refers to a concept in language models that represents the default helpful persona these models adopt. It helps stabilize their behavior by preventing them from drifting into less desirable character archetypes during interactions.

The research behind this stems from Anthropic. https://arxiv.org/abs/2601.10387

Basically, how closely a model aligns with this axis reflects how strongly it adopts the generic "helpful assistant" persona. That persona hinders role-playing and leads to refusals.

I don't know whether this is in the scope of this project, but eliminating the impact of this "Assistant" persona in models could be great for role-playing in general, and it would definitely save resources for community fine-tuners when they fine-tune models on RP data.
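
To make the request a bit more concrete, here is a rough sketch of what "eliminating" a direction could look like, assuming the axis were available as a single vector in hidden-state space. This is generic directional ablation (projecting a direction out of the activations), not the method from the paper and not this project's implementation. The names `ablate_direction`, `hidden`, and `axis` are hypothetical, and the data is random.

```python
import numpy as np

def ablate_direction(hidden: np.ndarray, axis: np.ndarray) -> np.ndarray:
    """Remove the component of each hidden state that lies along `axis`.

    hidden: array of shape (n_tokens, d_model), hypothetical activations.
    axis:   array of shape (d_model,), hypothetical "assistant" direction.
    """
    unit = axis / np.linalg.norm(axis)      # normalize to unit length
    coeffs = hidden @ unit                  # projection coefficient per token
    return hidden - np.outer(coeffs, unit)  # subtract the projected component

# Toy data standing in for real activations (an assumption, not model output).
rng = np.random.default_rng(0)
hidden = rng.normal(size=(4, 8))
axis = rng.normal(size=8)

cleaned = ablate_direction(hidden, axis)
```

After this, `cleaned` has (numerically) zero component along `axis`, while everything orthogonal to it is unchanged. Whether something like this actually helps with role-play or refusals is an open question.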


Edit: It seems this axis is forced onto the model rather than built in. However, if it ends up being built in somehow, we will need to look into this again.
