## PR #378: LLM Prisoner's Dilemma: when agents reason about trust (#379)
abhinavk0220 started this conversation in Show and tell
The Prisoner's Dilemma has fascinated game theorists, economists,
and biologists for 70 years because it captures something fundamental:
individual rationality and collective optimality are often in conflict.
Classical ABM tackles this beautifully with fixed strategies:
tit-for-tat, always-defect, Pavlov. Axelrod's tournaments showed
that tit-for-tat wins in repeated games. Clean, elegant, reproducible.
But here's what fixed strategies can't do: change their mind based
on context.
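To make the contrast concrete, here is a minimal sketch of a classical fixed strategy. Tit-for-tat is a pure lookup on the partner's last move; nothing in the agent's state lets it reinterpret that move in context. (The function name and history representation are illustrative, not from PR #378.)

```python
def tit_for_tat(partner_history):
    """Classical fixed strategy: cooperate first, then mirror
    the partner's previous move. Returns 'C' or 'D'."""
    if not partner_history:       # first round: cooperate
        return "C"
    return partner_history[-1]    # otherwise copy the last move
```

However the game has unfolded, the same input always yields the same output; there is no mechanism for the strategy to "change its mind".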
In PR #378, agents use LLM Chain-of-Thought reasoning at each round.
They receive their full interaction history (score, last action,
partner's recent moves) and reason about what to do next. Not
lookup. Reasoning.
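A rough sketch of what that per-round step might look like, assuming the internal state described above. The prompt wording, function names, and `query_llm` callback are all hypothetical stand-ins for whatever PR #378 actually uses:

```python
def build_prompt(score, last_action, partner_moves):
    """Serialize the agent's internal state into a CoT prompt.
    Hypothetical format, not taken from PR #378."""
    return (
        "You are playing an iterated Prisoner's Dilemma.\n"
        f"Your current score: {score}\n"
        f"Your last action: {last_action}\n"
        f"Partner's recent moves: {', '.join(partner_moves)}\n"
        "Think step by step about your partner's likely strategy, "
        "then answer with exactly COOPERATE or DEFECT."
    )

def decide(score, last_action, partner_moves, query_llm):
    """One round of reasoning: prompt the LLM, parse its reply.
    `query_llm` is any callable mapping a prompt string to a reply."""
    reply = query_llm(build_prompt(score, last_action, partner_moves))
    # Conservative parse: default to cooperation on a malformed reply.
    return "DEFECT" if "DEFECT" in reply.upper() else "COOPERATE"
```

The key design point is that the decision is a function of the whole serialized context, not a table lookup on the last move.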
This means an agent can distinguish a partner who defected once
out of apparent caution from one who is systematically exploiting it.
It can choose to forgive. It can bluff. It can build reputation.
The cooperation-rate and average-score plots tell the story:
you can watch trust emerge or collapse in real time, driven purely
by how agents reason about each other.
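For reference, the cooperation-rate curve in such plots is typically just the fraction of agents choosing to cooperate at each round. A minimal sketch, assuming actions are logged as 'C'/'D' per agent per round (the data layout here is an assumption, not PR #378's actual format):

```python
def cooperation_rate(rounds):
    """rounds: list of per-round action lists, e.g.
    [['C', 'D', 'C'], ['C', 'C', 'C'], ...].
    Returns the fraction of 'C' actions in each round."""
    return [sum(a == "C" for a in acts) / len(acts) for acts in rounds]
```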
Would love feedback on the reasoning prompt design, specifically
whether the internal state passed to the LLM gives enough context
for meaningful strategic reasoning.
PR: #378