Intermediate Rewards Implementation #1

@mayug

Hi, great work on LogicPuzzleRL,

Quick question: in Equation 1 (Section 3.2), you describe an intermediate reward r_int that evaluates reasoning steps. I can find r_fmt and r_final in the code, but I can't locate where r_int is implemented. Could you point me to it?

Thanks
