An AI agent system for mathematical reasoning using Ollama local language models.
- Python 3.8+
- Ollama
```bash
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh
ollama serve

# Pull model
ollama pull llama3.2:3b

# Setup Python environment
python3 -m venv .venv
source .venv/bin/activate
```
```bash
pip install pandas requests langgraph
```

Edit `agent/nodes_ollama.py` line 10:

```python
MODEL_NAME = "llama3.2:3b"
```

Run evaluation or prediction:

```bash
# Evaluate
python3 evaluate.py

# Predict
python3 predict.py
```

Or invoke the agent directly:

```python
from agent.graph_ollama import app

result = app.invoke({'query': 'What is 7 times 8?'}, config={'recursion_limit': 50})
print(result['final_answer'])
```

Project structure:

```
agent/
├── few_shot.py              # Few-shot learning
├── graph_ollama.py          # Workflow graph
├── nodes_ollama.py          # Core reasoning
└── state.py                 # State management
tools/
├── python_repl_enhanced.py  # Python execution
└── symbolic_solver.py       # Symbolic math
dataset/
├── train.csv                # Training examples
├── test.csv                 # Test problems
└── output.csv               # Expected outputs
```
| Model | Accuracy | Speed | RAM |
|---|---|---|---|
| llama3.2:1b | ~40% | Fast | 2GB |
| llama3.2:3b | ~65% | Medium | 4GB |
| llama3.1:8b | ~80% | Slow | 8GB |
- Spatial reasoning (room navigation, positioning)
- Optimization (shortest path, resource allocation)
- Logic puzzles (switches, riddles)
- Arithmetic (calculations, error rates)
MIT

# RARES Agent for Logic Problem Solving
A multi-agent reasoning system implementing the RARES (Reasoning, Action, Reflection, Execution, Self-correction) framework for solving complex logic problems.
- Tool-Integrated Reasoning (TIR): Structured approach to problem-solving with tools
- Chain-of-Draft (CoD): Efficient token usage for intermediate steps
- Rejection Sampling: Validates tool outputs before accepting them
- Self-Correction: Reflection module for error recovery
- Multiple LLM Backends: Support for both HuggingFace API and local Ollama
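The rejection-sampling idea above can be sketched as a simple validate-and-retry loop. The helper names below are hypothetical; the project's real logic lives in the nodes modules:

```python
# Minimal sketch of rejection sampling for tool outputs (hypothetical names).
def run_with_rejection(tool, action_input, validate, max_attempts=3):
    """Run a tool, rejecting outputs that fail validation, up to max_attempts."""
    for attempt in range(max_attempts):
        output = tool(action_input)
        if validate(output):
            return output  # accepted sample
    return None  # all samples rejected: hand off to the reflection module

# Example: accept only integer results from a toy stand-in for the Python REPL tool
result = run_with_rejection(
    tool=lambda code: eval(code),              # toy executor, for illustration only
    action_input="7 * 8",
    validate=lambda out: isinstance(out, int),
)
print(result)  # 56
```

A rejected output returns `None`, which is where the self-correction path would take over.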
```
ETHOOS/
├── agent/
│   ├── state.py           # Agent state definition
│   ├── nodes.py           # Original nodes (basic version)
│   ├── nodes_enhanced.py  # Enhanced nodes with HuggingFace API
│   ├── nodes_ollama.py    # Ollama nodes for local testing
│   ├── graph.py           # Main graph (uses nodes_enhanced)
│   └── graph_ollama.py    # Ollama graph for local testing
├── tools/
│   ├── python_repl.py     # Safe Python code execution
│   └── symbolic_solver.py # SymPy equation solver
├── dataset/
│   ├── train.csv          # Training data
│   ├── test.csv           # Test data
│   └── output.csv         # Prediction outputs
├── main.py                # Single problem testing
├── evaluate.py            # Training set evaluation (HuggingFace)
├── predict.py             # Full test set prediction (HuggingFace)
├── evaluate_ollama.py     # Training set evaluation (Ollama)
└── predict_ollama.py      # Full test set prediction (Ollama)
```
Using uv (recommended):

```bash
uv sync
```

Or using pip:

```bash
pip install -r requirements.txt
```

Create a `.env` file with your HuggingFace API tokens:

```
HUGGINGFACE_HUB_TOKEN_1=your_token_here
HUGGINGFACE_HUB_TOKEN_2=your_token_here  # Optional: for round-robin
HUGGINGFACE_HUB_TOKEN_3=your_token_here  # Optional
HUGGINGFACE_HUB_TOKEN_4=your_token_here  # Optional
HUGGINGFACE_HUB_TOKEN_5=your_token_here  # Optional
```

Run with HuggingFace:
```bash
# Test on a single problem
python3 main.py

# Evaluate on first 5 training problems
python3 evaluate.py

# Run on full test set
python3 predict.py
```

Install Ollama:
Visit ollama.ai or install via:

```bash
# Linux
curl -fsSL https://ollama.ai/install.sh | sh

# macOS
brew install ollama

# Windows: download the installer from ollama.ai
```

Start Ollama:

```bash
ollama serve
```

Pull a Llama model:
For best results (recommended):

```bash
ollama pull llama3.2:3b
```

For fastest inference (may struggle with complex problems):

```bash
ollama pull llama3.2:1b
```

For highest quality (slower, requires more RAM):

```bash
ollama pull llama3.1:8b
```

Run with Ollama:
```bash
# Evaluate on first 5 training problems
python3 evaluate_ollama.py

# Run on full test set
python3 predict_ollama.py
```

Note: Ollama writes to `dataset/output_ollama.csv` to avoid overwriting HuggingFace results.
| Feature | HuggingFace (Llama 3.1 8B) | Ollama (Llama 3.2 3B) | Ollama (Llama 3.2 1B) |
|---|---|---|---|
| Speed | Moderate (API latency) | Moderate (local) | Fast (local) |
| Quality | Highest (8B parameters) | Good (3B parameters) | Basic (1B parameters) |
| Cost | API calls (rate limits) | Free (local) | Free (local) |
| Setup | Need API tokens | Need Ollama installed | Need Ollama installed |
| Use Case | Production/Final runs | Testing/Development | Quick testing only |
| Recommended | ✅ Production | ✅ Development | ⚠️ Quick tests only |
- Python REPL: Execute single-line Python code with access to:
  - `math`, `itertools`, `Counter`, `defaultdict`
  - List comprehensions and calculations
  - Example: `list(itertools.permutations([1,2,3]))`
- SymbolicSolver: Solve algebraic equations using SymPy
  - Example: `x**2 - 4` (solves x² - 4 = 0)
- Finish: Provide the final answer when the problem is solved
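To illustrate what the SymbolicSolver does with an input like `x**2 - 4`, SymPy can solve it directly. This is a sketch of the underlying call; the project's actual wrapper lives in `tools/symbolic_solver.py`:

```python
import sympy

x = sympy.symbols("x")
# An expression string such as "x**2 - 4" is treated as an equation equal to zero
roots = sympy.solve(sympy.sympify("x**2 - 4"), x)
print(roots)  # [-2, 2]
```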
The agent expects responses in this format:
```
Rationale: [One concise sentence about what to do next]
Action: [Python REPL | SymbolicSolver | Finish]
Action_Input: [executable code or final answer]
```
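A minimal sketch of parsing this three-line format with a regular expression. The helper name is hypothetical; the real parsing lives in the nodes modules:

```python
import re

def parse_response(text):
    """Extract Rationale, Action, and Action_Input from a model response."""
    pattern = (
        r"Rationale:\s*(?P<rationale>.*?)\s*"
        r"Action:\s*(?P<action>.*?)\s*"
        r"Action_Input:\s*(?P<action_input>.*)"
    )
    m = re.search(pattern, text, re.DOTALL)
    return m.groupdict() if m else None

resp = "Rationale: Compute the product.\nAction: Python REPL\nAction_Input: 7 * 8"
parsed = parse_response(resp)
print(parsed["action"])  # Python REPL
```

Returning `None` on a non-matching response gives the caller a hook for retrying the LLM call.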
Edit the `NUM_PROBLEMS_TO_EVALUATE` variable in `evaluate.py` or `evaluate_ollama.py`:

```python
NUM_PROBLEMS_TO_EVALUATE = 10  # Test on first 10 problems
```

Edit the max retry limit in the verification modules:

```python
if iteration >= 15:  # Change this number
    print("❌ MAX RETRIES REACHED")
```

To use HuggingFace:

```python
from agent.graph import app  # Uses nodes_enhanced.py
```

To use Ollama:

```python
from agent.graph_ollama import app  # Uses nodes_ollama.py
```

"Cannot connect to Ollama"
```bash
# Make sure Ollama is running
ollama serve
```

"Model not found"

```bash
# Pull the model
ollama pull llama3.2:1b

# Verify it's installed
ollama list
```

"Request timed out"
- The 1B model might be slow on some hardware
- Try increasing the timeout in `nodes_ollama.py`: `timeout=120`
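For context, the timeout is the per-request limit on the HTTP call to Ollama. A sketch of such a call, assuming the default Ollama endpoint on `localhost:11434` (the function name is hypothetical):

```python
import requests

def ask_ollama(prompt, model="llama3.2:3b", timeout=120):
    """Call the local Ollama generate endpoint with a configurable timeout."""
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=timeout,  # seconds; raise this value if requests time out
    )
    resp.raise_for_status()
    return resp.json()["response"]
```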
"Model is currently loading"
- Wait a few moments and retry
- The round-robin system will automatically try other keys
Rate limit errors
- Add more API tokens to `.env` (TOKEN_2, TOKEN_3, etc.)
- The system rotates between available keys
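The round-robin rotation over the `HUGGINGFACE_HUB_TOKEN_N` variables can be sketched as follows. The helper is hypothetical; the real rotation logic is in the nodes modules:

```python
import itertools

def build_token_cycle(env, max_tokens=5):
    """Round-robin iterator over HUGGINGFACE_HUB_TOKEN_N values present in env."""
    tokens = [
        t for i in range(1, max_tokens + 1)
        if (t := env.get(f"HUGGINGFACE_HUB_TOKEN_{i}"))
    ]
    return itertools.cycle(tokens)

# In real code, pass os.environ; here a literal dict is used for illustration.
cycle = build_token_cycle({"HUGGINGFACE_HUB_TOKEN_1": "a", "HUGGINGFACE_HUB_TOKEN_2": "b"})
print(next(cycle), next(cycle), next(cycle))  # a b a
```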
- `dataset/output.csv`: Predictions using the HuggingFace API
- `dataset/output_ollama.csv`: Predictions using Ollama
- Both files have a `prediction` column with one answer per row
- Copy `agent/nodes_enhanced.py` as a template
- Modify the `call_*_llm()` function for your LLM backend
- Create a corresponding graph file: `agent/graph_custom.py`
- Update imports to use your new nodes
- Create a tool file in the `tools/` directory
- Add tool execution logic in the `tool_execution()` function
- Update `tool_router()` to recognize the new tool
- Update the system prompt with the tool description
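A sketch of what registering a new tool in the router might look like. The function body and the `MyNewTool` name are hypothetical; the real `tool_router()` is in the nodes modules:

```python
# Hypothetical sketch of extending the tool router with a new tool name.
def tool_router(action: str) -> str:
    """Map the model's Action string to an executable tool node."""
    routes = {
        "Python REPL": "python_repl",
        "SymbolicSolver": "symbolic_solver",
        "MyNewTool": "my_new_tool",  # <- register the new tool here
    }
    return routes.get(action, "finish")  # unknown actions fall through to Finish

print(tool_router("MyNewTool"))  # my_new_tool
```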
This project is for educational and research purposes.
Built on the RARES framework with LangGraph for agent orchestration.