Dev #23 (Merged)

2 changes: 1 addition & 1 deletion .github/workflows/ci.yml
@@ -76,7 +76,7 @@ jobs:
run: uv sync --extra dev

- name: Run tests
run: uv run pytest --cov --cov-report=xml --cov-fail-under=79 -v -m "not integration"
run: uv run pytest --cov --cov-report=xml --cov-fail-under=80 -v -m "not integration"

- name: Upload coverage
uses: codecov/codecov-action@v4
4 changes: 1 addition & 3 deletions .gitignore
@@ -9,6 +9,4 @@ deriva/adapters/neo4j/logs/*
deriva/adapters/database/sql.db
deriva/adapters/database/sql.db.wal
.coverage
coverage.xml
.export/*
todo/*
coverage.xml
40 changes: 35 additions & 5 deletions CHANGELOG.md
@@ -6,17 +6,47 @@ Deriving ArchiMate models from code using knowledge graphs, heuristics and LLM's

# v0.6.x - Deriva (December 2025 - January 2026)

## v0.6.8 - PydanticAI Migration (Unreleased)
## v0.6.8 - Library Migration & Overall Cleanup (Unreleased)

Big migration replacing 6 custom implementations with off-the-shelf libraries, reducing the amount of code and improving maintainability.

### LLM Adapter Rewrite
- **PydanticAI Integration**: Replaced custom REST provider implementations with PydanticAI library
- **Code Reduction**: Same (or better) llm adapter with way less code, deleted entire `providers.py`
- **PydanticAI Integration**: Replaced custom REST provider implementations with `pydantic-ai` library
- **Model Registry**: New `model_registry.py` maps Deriva config to PydanticAI model identifiers with URL normalization for Azure/LM Studio
- **Code Reduction**: Same (or better) LLM adapter with way less code, deleted entire `providers.py`
- **Native Structured Output**: PydanticAI handles validation and retry automatically
- **Removed ClaudeCode Provider**: Use `anthropic` provider directly instead (CLI subprocess no longer supported)
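
For illustration, a minimal sketch of how the pieces above fit together: `get_pydantic_ai_model()` maps a Deriva provider config to a PydanticAI model, and an `Agent` parameterized with a Pydantic output model returns validated structured output. The config values, the `Concept` schema, and the `output_type` keyword are assumptions for this sketch, not the project's actual defaults.

```python
from pydantic import BaseModel
from pydantic_ai import Agent

from deriva.adapters.llm.model_registry import get_pydantic_ai_model


class Concept(BaseModel):
    """Illustrative output schema; the real pipeline defines its own models."""
    name: str
    description: str


# Hypothetical config dict; real values come from .env / benchmark model configs.
config = {
    "provider": "ollama",
    "api_url": "http://localhost:11434/api/chat",
    "model": "llama3.2",
}

model = get_pydantic_ai_model(config)

# PydanticAI validates the output against Concept and retries on validation
# failures, replacing the custom parsing/retry code that lived in providers.py.
agent = Agent(model, output_type=Concept)
result = agent.run_sync(
    "Name one ArchiMate concept found in a REST controller.",
    model_settings={"temperature": 0.0},
)
print(result.output.name)
```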

### Configuration
- **Pydantic Settings**: New `config_models.py` with type-safe environment validation using `pydantic-settings`
- **Standard API Keys**: Added PydanticAI standard env vars (`OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, `MISTRAL_API_KEY`, `AZURE_OPENAI_*`)
- **Updated .env.example**: Removed `claudecode` provider, added Anthropic direct API configs
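
A sketch of what type-safe environment validation with `pydantic-settings` looks like; the field names and prefix below are illustrative, not the actual contents of `config_models.py`.

```python
from pydantic import Field
from pydantic_settings import BaseSettings, SettingsConfigDict


class LLMSettings(BaseSettings):
    """Hypothetical settings model; the real config_models.py defines its own fields."""

    model_config = SettingsConfigDict(env_prefix="LLM_", env_file=".env", extra="ignore")

    provider: str = "openai"
    temperature: float = Field(default=0.7, ge=0.0, le=2.0)
    max_tokens: int | None = None


# Reads LLM_PROVIDER, LLM_TEMPERATURE, LLM_MAX_TOKENS from the environment / .env
settings = LLMSettings()
```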

### Caching
- **diskcache Integration**: Replaced custom SQLite-based caching with `diskcache` library
- **Simplified Cache Utils**: Rewrote `cache_utils.py` to wrap diskcache with `BaseDiskCache` class
- **Preserved Features**: Kept `hash_inputs()`, `bench_hash` isolation, and `export_to_json()` functionality
- **LLM & Graph Caches**: Updated both adapters to use new base cache class
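
A rough sketch of wrapping `diskcache` in a base class: `BaseDiskCache` and `hash_inputs()` exist in the codebase, but the bodies below are assumptions about their shape, not the actual implementation.

```python
import hashlib
import json
from typing import Any

from diskcache import Cache


def hash_inputs(*parts: Any) -> str:
    """Deterministic cache key from arbitrary inputs (illustrative reimplementation)."""
    payload = json.dumps(parts, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode()).hexdigest()


class BaseDiskCache:
    """Thin wrapper over diskcache.Cache (sketch of the idea, not the real class)."""

    def __init__(self, directory: str) -> None:
        self._cache = Cache(directory)

    def get(self, key: str) -> Any | None:
        return self._cache.get(key)

    def set(self, key: str, value: Any) -> None:
        self._cache.set(key, value)
```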

### Retry Logic
- **backoff Library**: Replaced custom retry implementation with `backoff` library
- **New retry.py**: Centralized retry decorator with exponential backoff and jitter
- **Simplified Rate Limiter**: Token bucket rate limiting now separate from retry logic
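
With `backoff`, the centralized retry decorator reduces to roughly the following; the exception type, signature, and parameters are placeholders for whatever `retry.py` actually configures.

```python
import backoff


class TransientLLMError(Exception):
    """Placeholder for whatever exception the real retry.py targets."""


def create_retry_decorator(max_tries: int = 5):
    """Exponential backoff with jitter (illustrative, not the actual retry.py)."""
    return backoff.on_exception(
        backoff.expo,          # exponential wait: 1s, 2s, 4s, ...
        TransientLLMError,     # retry only on transient failures (assumption)
        max_tries=max_tries,
        jitter=backoff.full_jitter,
    )


@create_retry_decorator()
def call_provider() -> str:
    ...  # would raise TransientLLMError on rate limits, triggering a retry
```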

### Small CLI refactor
- **Typer Framework**: Replaced argparse-based CLI with `typer`
- **Command Modules**: Split CLI into `deriva/cli/commands/` with separate files for `benchmark.py`, `config.py`, `repo.py`, `run.py`
- **Modern CLI Features**: Auto-completion, better help generation, type hints via `Annotated`
- **Subcommand Groups**: `config`, `repo`, `benchmark` as typer subapps
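
The subcommand layout translates to `typer` roughly as below; the command names and options are illustrative, not the actual CLI surface.

```python
from typing import Annotated

import typer

app = typer.Typer(help="Deriva CLI (illustrative wiring, not the actual commands).")
config_app = typer.Typer(help="Manage configuration.")
app.add_typer(config_app, name="config")  # `deriva config ...` subcommand group


@config_app.command("show")
def show(
    verbose: Annotated[bool, typer.Option("--verbose", "-v", help="Verbose output")] = False,
) -> None:
    typer.echo("config values..." + (" (verbose)" if verbose else ""))


if __name__ == "__main__":
    app()
```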

### Logging

- **structlog Integration**: Rewrote `logging.py` with `structlog` for structured logging
- **Preserved API**: Same RunLogger, StepContext, and JSONL output format
- **OCEL Unchanged**: OCEL module kept intact for benchmark process mining
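
A minimal sketch of the kind of `structlog` configuration that yields JSONL output; the actual processor chain in `logging.py` may differ.

```python
import structlog

structlog.configure(
    processors=[
        structlog.processors.add_log_level,
        structlog.processors.TimeStamper(fmt="iso"),
        structlog.processors.JSONRenderer(),  # one JSON object per line (JSONL)
    ],
)

log = structlog.get_logger()
log.info("step_completed", step="extract_graph", duration_s=1.8)
```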

### Tests & Quality
- **CLI Tests Rewritten**: Updated all 51 CLI tests to use typer's `CliRunner`
- **Tree-sitter Test Consolidation**: Merged per-language test files into single `test_languages.py`
- **Coverage Threshold**: Updated CI coverage threshold to 80%
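
Typer ships a `CliRunner` for invoking commands in-process; a test in the rewritten suite looks roughly like this. The imported app module and the invoked command are assumptions for the sketch.

```python
from typer.testing import CliRunner

from deriva.cli.main import app  # assumed entry-point module

runner = CliRunner()


def test_config_show_exits_cleanly() -> None:
    result = runner.invoke(app, ["config", "show"])
    assert result.exit_code == 0
```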

---

4 changes: 4 additions & 0 deletions deriva/adapters/llm/__init__.py
@@ -48,6 +48,7 @@ class Concept(BaseModel):
ValidationError,
)
from .rate_limiter import RateLimitConfig, RateLimiter, get_default_rate_limit
from .retry import create_retry_decorator, retry_on_rate_limit

__all__ = [
# Main service
@@ -70,6 +71,9 @@ class Concept(BaseModel):
"RateLimitConfig",
"RateLimiter",
"get_default_rate_limit",
# Retry
"create_retry_decorator",
"retry_on_rate_limit",
# Exceptions
"LLMError",
"ConfigurationError",
63 changes: 45 additions & 18 deletions deriva/adapters/llm/manager.py
@@ -40,11 +40,11 @@ class Concept(BaseModel):
from dotenv import load_dotenv
from pydantic import BaseModel
from pydantic_ai import Agent
from pydantic_ai.settings import ModelSettings

from .cache import CacheManager
from .model_registry import VALID_PROVIDERS, get_pydantic_ai_model
from .models import (
APIError,
BenchmarkModelConfig,
CachedResponse,
ConfigurationError,
@@ -208,7 +208,9 @@ def from_config(
load_dotenv(override=True)

effective_temperature = (
temperature if temperature is not None else float(os.getenv("LLM_TEMPERATURE", "0.7"))
temperature
if temperature is not None
else float(os.getenv("LLM_TEMPERATURE", "0.7"))
)

instance = object.__new__(cls)
@@ -265,8 +267,12 @@ def _load_config_from_env(self) -> dict[str, Any]:
if default_model:
benchmark_models = load_benchmark_models()
if default_model not in benchmark_models:
available = ", ".join(benchmark_models.keys()) if benchmark_models else "none"
raise ConfigurationError(f"LLM_DEFAULT_MODEL '{default_model}' not found. Available: {available}")
available = (
", ".join(benchmark_models.keys()) if benchmark_models else "none"
)
raise ConfigurationError(
f"LLM_DEFAULT_MODEL '{default_model}' not found. Available: {available}"
)
config = benchmark_models[default_model]
provider = config.provider
api_url = config.get_api_url()
@@ -288,15 +294,19 @@ def _load_config_from_env(self) -> dict[str, Any]:
api_key = os.getenv("LLM_ANTHROPIC_API_KEY")
model = os.getenv("LLM_ANTHROPIC_MODEL", "claude-sonnet-4-20250514")
elif provider == "ollama":
api_url = os.getenv("LLM_OLLAMA_API_URL", "http://localhost:11434/api/chat")
api_url = os.getenv(
"LLM_OLLAMA_API_URL", "http://localhost:11434/api/chat"
)
api_key = None
model = os.getenv("LLM_OLLAMA_MODEL", "llama3.2")
elif provider == "mistral":
api_url = "https://api.mistral.ai/v1/chat/completions"
api_key = os.getenv("LLM_MISTRAL_API_KEY")
model = os.getenv("LLM_MISTRAL_MODEL", "mistral-large-latest")
elif provider == "lmstudio":
api_url = os.getenv("LLM_LMSTUDIO_API_URL", "http://localhost:1234/v1/chat/completions")
api_url = os.getenv(
"LLM_LMSTUDIO_API_URL", "http://localhost:1234/v1/chat/completions"
)
api_key = None
model = os.getenv("LLM_LMSTUDIO_MODEL", "local-model")
else:
@@ -325,7 +335,9 @@ def _validate_config(self) -> None:
"""Validate configuration has required fields."""
provider = self.config.get("provider", "")
if provider not in VALID_PROVIDERS:
raise ConfigurationError(f"Invalid provider: {provider}. Must be one of {VALID_PROVIDERS}")
raise ConfigurationError(
f"Invalid provider: {provider}. Must be one of {VALID_PROVIDERS}"
)

# Ollama and LM Studio don't require api_key
if provider in ("ollama", "lmstudio"):
Expand All @@ -335,7 +347,9 @@ def _validate_config(self) -> None:

missing = [f for f in required_fields if not self.config.get(f)]
if missing:
raise ConfigurationError(f"Missing required config fields: {', '.join(missing)}")
raise ConfigurationError(
f"Missing required config fields: {', '.join(missing)}"
)

@overload
def query(
@@ -394,7 +408,9 @@ def query(
If response_model is provided: Validated Pydantic model instance or FailedResponse
Otherwise: LiveResponse, CachedResponse, or FailedResponse
"""
effective_temperature = temperature if temperature is not None else self.temperature
effective_temperature = (
temperature if temperature is not None else self.temperature
)
effective_max_tokens = max_tokens if max_tokens is not None else self.max_tokens

# Generate cache key
@@ -446,12 +462,12 @@ def query(
)

# Run query
settings: ModelSettings = {"temperature": effective_temperature}
if effective_max_tokens is not None:
settings["max_tokens"] = effective_max_tokens
result = agent.run_sync(
prompt,
model_settings={
"temperature": effective_temperature,
"max_tokens": effective_max_tokens,
},
model_settings=settings,
)

self._rate_limiter.record_success()
@@ -461,21 +477,30 @@ def query(
if hasattr(result, "usage") and result.usage:
usage = {
"prompt_tokens": getattr(result.usage, "request_tokens", 0) or 0,
"completion_tokens": getattr(result.usage, "response_tokens", 0) or 0,
"completion_tokens": getattr(result.usage, "response_tokens", 0)
or 0,
"total_tokens": getattr(result.usage, "total_tokens", 0) or 0,
}

# Handle response
if response_model:
# Cache the serialized model
if write_cache:
content = result.output.model_dump_json() if hasattr(result.output, "model_dump_json") else str(result.output)
self.cache.set_response(cache_key, content, prompt, self.model, usage)
content = (
result.output.model_dump_json()
if hasattr(result.output, "model_dump_json")
else str(result.output)
)
self.cache.set_response(
cache_key, content, prompt, self.model, usage
)
return result.output
else:
content = str(result.output) if result.output else ""
if write_cache:
self.cache.set_response(cache_key, content, prompt, self.model, usage)
self.cache.set_response(
cache_key, content, prompt, self.model, usage
)
return LiveResponse(
prompt=prompt,
model=self.model,
@@ -541,7 +566,9 @@ def get_token_usage_stats(self) -> dict[str, Any]:
"total_tokens": total_prompt + total_completion,
"total_calls": total_calls,
"avg_prompt_tokens": total_prompt / total_calls if total_calls else 0,
"avg_completion_tokens": total_completion / total_calls if total_calls else 0,
"avg_completion_tokens": total_completion / total_calls
if total_calls
else 0,
}

def __repr__(self) -> str:
12 changes: 9 additions & 3 deletions deriva/adapters/llm/model_registry.py
@@ -12,7 +12,9 @@
from pydantic_ai.models import Model

# Valid provider names
VALID_PROVIDERS = frozenset({"azure", "openai", "anthropic", "ollama", "mistral", "lmstudio"})
VALID_PROVIDERS = frozenset(
{"azure", "openai", "anthropic", "ollama", "mistral", "lmstudio"}
)


def get_pydantic_ai_model(config: dict[str, Any]) -> "Model | str":
@@ -63,12 +65,16 @@ def get_pydantic_ai_model(config: dict[str, Any]) -> "Model | str":
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.openai import OpenAIProvider

base_url = _normalize_openai_url(api_url) if api_url else "http://localhost:1234/v1"
base_url = (
_normalize_openai_url(api_url) if api_url else "http://localhost:1234/v1"
)
openai_provider = OpenAIProvider(base_url=base_url)
return OpenAIChatModel(model, provider=openai_provider)

else:
raise ValueError(f"Unknown provider: {provider}. Valid providers: {VALID_PROVIDERS}")
raise ValueError(
f"Unknown provider: {provider}. Valid providers: {VALID_PROVIDERS}"
)


def _normalize_azure_url(url: str) -> str:
4 changes: 3 additions & 1 deletion deriva/adapters/llm/models.py
@@ -67,7 +67,9 @@ class BenchmarkModelConfig:
def __post_init__(self):
"""Validate provider."""
if self.provider not in VALID_PROVIDERS:
raise ValueError(f"Invalid provider: {self.provider}. Must be one of {VALID_PROVIDERS}")
raise ValueError(
f"Invalid provider: {self.provider}. Must be one of {VALID_PROVIDERS}"
)

def get_api_key(self) -> str | None:
"""Get API key from direct value or environment variable."""