98 changes: 80 additions & 18 deletions aieng-agents/README.md
@@ -29,18 +29,55 @@ by the AI Engineering team.

## Installation

-### Using uv (recommended)
+### Core library only

```bash
uv pip install aieng-agents
```

-### Using pip
+#### What the core package gives you

Installing **`aieng-agents`** with **no extras** still gets you a usable toolkit for agent demos:

- **`Configs`** — typed settings from environment variables (`pydantic-settings`).
- **`AsyncClientManager`** — shared **`openai_client`**; Weaviate-backed helpers load only if you use them **and** install **`[weaviate]`**.
- **Async utilities** — **`gather_with_progress`**, **`rate_limited`**, **`register_async_cleanup`** (Rich progress output).
- **Logging / printing** — **`set_up_logging`**, **`pretty_print`**.
- **`agent_session`** — SQLite-backed session helpers for the OpenAI Agents SDK (no Gradio import at runtime).
- **Tools that only need core deps** — e.g. **`GeminiGroundingWithGoogleSearch`** (HTTP client to your grounding proxy).

Imports for heavier integrations live in **subpackages** (for example **`aieng.agents.tools.weaviate_kb`**, **`aieng.agents.langfuse`**) so optional stacks are loaded only when you import them or call into code that needs them.
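
One way to see which optional stacks are present in an environment is to probe for their import names without importing them. This is only a sketch: `has_module` is a hypothetical helper, not part of `aieng-agents`, and `weaviate`, `gradio`, and `langfuse` are assumed to be the import names that the corresponding extras install.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if the top-level module ``name`` can be located (hypothetical helper)."""
    try:
        # find_spec locates the module without executing it.
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False


# Assumed import names for the weaviate, gradio, and observability extras.
for module_name in ("weaviate", "gradio", "langfuse"):
    status = "available" if has_module(module_name) else "not installed"
    print(f"{module_name}: {status}")
```

Probing with `find_spec` does not execute the package, so the check itself stays as light as the core install.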

### Full bootcamp stack

For implementations that use Weaviate, Gradio, Langfuse, Hugging Face datasets, E2B, etc., install everything at once:

```bash
-pip install aieng-agents
+uv pip install 'aieng-agents[all]'
```

### Optional extras (pick what you need)

Pinned versions are listed in **`pyproject.toml`**; names below match **`uv pip install 'aieng-agents[<extra>]'`**.

| Extra | Purpose |
| -------- | --------- |
| **`data`** | Hugging Face **`datasets`** / **`pandas`** / **`transformers`** / **`click`** / **`python-dotenv`** / **`pymupdf`**. Needed for **`get_dataset()`** when loading HF data, dataset chunking/PDF CLIs, and heavy data scripts. |
| **`weaviate`** | **`weaviate-client`**. RAG search against a Weaviate collection (**`AsyncWeaviateKnowledgeBase`**, **`get_weaviate_async_client`**). |
| **`code-interpreter`** | **`e2b-code-interpreter`**. Sandboxed Python execution (**`CodeInterpreter`**). |
| **`gemini-proxy`** | **FastAPI**, **Google Gen AI**, **Firestore**. Running the **Gemini grounding HTTP proxy** under **`aieng.agents.web_search`** (deployable service), not required for the **`GeminiGroundingWithGoogleSearch`** client tool alone. |
| **`news`** | **BeautifulSoup** + **lxml**. Wikipedia current-events scraping (**`get_news_events`**). |
| **`gradio`** | **Gradio** (+ typical image deps). Chat UI helpers under **`aieng.agents.gradio`**. |
| **`observability`** | **Langfuse**, **Logfire** / OpenTelemetry wiring. Tracing helpers under **`aieng.agents.langfuse`**. |

Compose extras explicitly when you do not want **`all`**:

```bash
uv pip install 'aieng-agents[weaviate,observability,gradio]'
```

Examples below call out which extras each snippet assumes.

## Quick Start

### Environment Setup
@@ -74,13 +111,15 @@ EMBEDDING_BASE_URL=https://your-embedding-service

#### Using Tools with OpenAI Agents SDK

**Requires:** `[weaviate]` (knowledge base) and `[code-interpreter]` (sandbox tool). Install with e.g. `uv pip install 'aieng-agents[weaviate,code-interpreter]'` or `[all]`.

```python
-from aieng.agents.tools import (
-    CodeInterpreter,
+from aieng.agents import AsyncClientManager
+from aieng.agents.tools.code_interpreter import CodeInterpreter
+from aieng.agents.tools.weaviate_kb import (
     AsyncWeaviateKnowledgeBase,
     get_weaviate_async_client,
 )
-from aieng.agents import AsyncClientManager
import agents

# Initialize client manager
@@ -112,11 +151,13 @@ await manager.close()

#### Using the Code Interpreter

**Requires:** `[code-interpreter]` (E2B sandbox). Install with `uv pip install 'aieng-agents[code-interpreter]'` or `[all]`.

```python
-from aieng.agents.tools import CodeInterpreter
+from aieng.agents.tools.code_interpreter import CodeInterpreter

 interpreter = CodeInterpreter(
-    template="<your template ID",
+    template="<your template ID>",
     timeout=300,
 )

@@ -141,8 +182,10 @@ print(result.results) # Contains base64 PNG data

#### Fetching News Events

**Requires:** `[news]`. Install with `uv pip install 'aieng-agents[news]'` or `[all]`.

```python
-from aieng.agents.tools import get_news_events
+from aieng.agents.tools.news_events import get_news_events

news_events = await get_news_events()

@@ -155,8 +198,10 @@ for category, events in news_events.root.items():

#### Using Gemini Grounding with Google Search

**Core install is enough** for this client module (HTTP + Pydantic). You still need a running grounding proxy URL unless you mock it; deploying the proxy service uses **`[gemini-proxy]`** (see **`aieng.agents.web_search`**).

```python
-from aieng.agents.tools import GeminiGroundingWithGoogleSearch
+from aieng.agents.tools.gemini_grounding import GeminiGroundingWithGoogleSearch

search_tool = GeminiGroundingWithGoogleSearch(
base_url="https://your-search-proxy",
@@ -173,6 +218,8 @@ print(f"Citations: {response.citations}")

#### Knowledge Base Search

**Requires:** `[weaviate]`. Install with `uv pip install 'aieng-agents[weaviate]'` or `[all]`.

```python
from aieng.agents import AsyncClientManager

@@ -193,8 +240,11 @@ await manager.close()

#### Langfuse Tracing

**Requires:** `[observability]`. Install with `uv pip install 'aieng-agents[observability]'` or `[all]`.

```python
-from aieng.agents import setup_langfuse_tracer, set_up_logging
+from aieng.agents import set_up_logging
+from aieng.agents.langfuse import setup_langfuse_tracer
from dotenv import load_dotenv

load_dotenv()
@@ -208,6 +258,8 @@ tracer = setup_langfuse_tracer(service_name="my_agent_app")

#### Async Operations with Progress

**Core install only** — no optional extra required.

```python
from aieng.agents import gather_with_progress, rate_limited
import asyncio
@@ -234,7 +286,9 @@ results = await gather_with_progress(

## Command-Line Tools

-The package includes console scripts for data processing:
+The package includes console scripts for data processing. Both entrypoints use a thin CLI so missing extras produce a clear install message.

**Requires:** `[data]` for either script.

### Convert PDFs to HuggingFace Dataset

@@ -259,6 +313,8 @@ Key options:

### Chunk Existing Dataset

Same **`[data]`** extra as above.

```bash
chunk_hf_dataset \
--hf_dataset_path_or_name my-org/my-dataset \
@@ -272,6 +328,8 @@ chunk_hf_dataset \

### Custom Client Configuration

**Core install** loads **`Configs`** and **`AsyncClientManager`**. Touching **`manager.weaviate_client`** / **`knowledgebase`** still needs **`[weaviate]`**.

```python
from aieng.agents import Configs, AsyncClientManager

@@ -287,8 +345,10 @@ manager = AsyncClientManager(configs=configs)

### Gradio Integration

**Requires:** `[gradio]`. Install with `uv pip install 'aieng-agents[gradio]'` or `[all]`.

```python
-from aieng.agents import (
+from aieng.agents.gradio.messages import (
     gradio_messages_to_oai_chat,
     oai_agent_stream_to_gradio_messages,
 )
@@ -312,6 +372,8 @@ with gr.Blocks() as demo:

### Session Persistence

**Core install** for **`get_or_create_agent_session`**. Full Gradio chat flows that pass **`ChatMessage`** history typically use **`[gradio]`** where those types come from the UI.

```python
from aieng.agents import get_or_create_agent_session

@@ -326,12 +388,12 @@ response = await agents.Runner.run(agent, input=message, session=session)

### Running Tests

-```bash
-# Install with dev dependencies
-uv pip install -e ".[dev]"
+From this directory (`aieng-agents/`), dev dependencies are included via the `dev` dependency group (`uv sync` enables them by default),
+but all extras are needed to run the tests:

-# Run tests
-uv run --env-file .env pytest tests/
+```bash
+uv sync --all-extras
+uv run --env-file .env pytest
```

### Project Layout
11 changes: 0 additions & 11 deletions aieng-agents/aieng/agents/__init__.py
@@ -8,27 +8,16 @@
)
from aieng.agents.client_manager import AsyncClientManager
from aieng.agents.env_vars import Configs
-from aieng.agents.gradio.messages import (
-    gradio_messages_to_oai_chat,
-    oai_agent_items_to_gradio_messages,
-    oai_agent_stream_to_gradio_messages,
-)
-from aieng.agents.langfuse.oai_sdk_setup import setup_langfuse_tracer
 from aieng.agents.logging import set_up_logging
 from aieng.agents.pretty_printing import pretty_print
-from aieng.agents.tools import *


 __all__ = [
     "AsyncClientManager",
     "Configs",
     "gather_with_progress",
     "get_or_create_agent_session",
-    "gradio_messages_to_oai_chat",
-    "oai_agent_items_to_gradio_messages",
-    "oai_agent_stream_to_gradio_messages",
     "set_up_logging",
-    "setup_langfuse_tracer",
     "pretty_print",
     "rate_limited",
     "register_async_cleanup",
30 changes: 30 additions & 0 deletions aieng-agents/aieng/agents/_optional_extras.py
@@ -0,0 +1,30 @@
"""Helpers for errors when optional dependency extras are not installed."""

from __future__ import annotations


EXTRA_DATA = "data"
EXTRA_NEWS = "news"
EXTRA_GRADIO = "gradio"
EXTRA_WEAVIATE = "weaviate"
EXTRA_OBSERVABILITY = "observability"
EXTRA_CODE_INTERPRETER = "code-interpreter"


def install_hint(extra: str) -> str:
    """Human-readable pip/uv install line for the given extra."""
    return f"pip install 'aieng-agents[{extra}]'"


def raise_missing_optional(
    extra: str,
    *,
    missing: str | None = None,
    from_exc: BaseException | None = None,
) -> None:
    """Raise ``ImportError`` naming the extra and how to install it."""
    suffix = f" (missing module: {missing!r})" if missing else ""
    raise ImportError(
        f"This feature requires the '{extra}' optional dependency{suffix}. "
        f"Install with: {install_hint(extra)}"
    ) from from_exc
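
A short sketch of how a module might use these helpers to guard an optional import. The two helpers are restated from the diff above so the snippet runs without the package installed; `load_optional` and the module name `module_that_is_not_installed_xyz` are hypothetical and only serve to trigger the error path.

```python
from __future__ import annotations

import importlib


def install_hint(extra: str) -> str:
    return f"pip install 'aieng-agents[{extra}]'"


def raise_missing_optional(extra, *, missing=None, from_exc=None) -> None:
    suffix = f" (missing module: {missing!r})" if missing else ""
    raise ImportError(
        f"This feature requires the '{extra}' optional dependency{suffix}. "
        f"Install with: {install_hint(extra)}"
    ) from from_exc


def load_optional(module_name: str, extra: str):
    """Hypothetical guard: import ``module_name`` or raise the friendly error."""
    try:
        return importlib.import_module(module_name)
    except ModuleNotFoundError as exc:
        # Chain the original error so the traceback still shows the root cause.
        raise_missing_optional(extra, missing=exc.name, from_exc=exc)


try:
    load_optional("module_that_is_not_installed_xyz", "weaviate")
except ImportError as err:
    message = str(err)
    print(message)
```

Because the replacement error is raised `from` the original exception, the chained traceback keeps the underlying `ModuleNotFoundError` visible while the top-level message names the extra to install.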
10 changes: 6 additions & 4 deletions aieng-agents/aieng/agents/agent_session.py
@@ -1,15 +1,17 @@
"""Session management utilities for agent conversations."""

 import uuid
-from typing import Any
-
-from gradio.components.chatbot import ChatMessage
+from typing import TYPE_CHECKING, Any

 import agents


+if TYPE_CHECKING:
+    from gradio.components.chatbot import ChatMessage
+
+
 def get_or_create_agent_session(
-    history: list[ChatMessage], session_state: dict[str, Any]
+    history: list["ChatMessage"], session_state: dict[str, Any]
 ) -> agents.SQLiteSession:
     """Get existing session or create a new one for conversation persistence."""
     if len(history) == 0:
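
The change above relies on the `TYPE_CHECKING` guard plus a quoted annotation, so the module no longer imports Gradio at runtime. A minimal sketch of the same pattern follows; `package_that_is_not_installed` and `history_length` are hypothetical names chosen so the snippet runs even though the annotated type cannot be imported.

```python
from typing import TYPE_CHECKING, Any

if TYPE_CHECKING:
    # Seen only by static type checkers; this branch never runs at runtime.
    from package_that_is_not_installed import ChatMessage  # hypothetical package


def history_length(history: list["ChatMessage"], state: dict[str, Any]) -> int:
    """Count messages; the quoted annotation is never resolved at runtime."""
    return len(history)


print(history_length([], {}))
```

`typing.TYPE_CHECKING` is `False` at runtime, so the guarded import is skipped and the string annotation stays unevaluated unless something explicitly resolves type hints.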
33 changes: 33 additions & 0 deletions aieng-agents/aieng/agents/cli.py
@@ -0,0 +1,33 @@
"""Console entrypoints."""

import importlib
from types import ModuleType
from typing import Any


def _run_with_extra(module_path: str, attr: str, extra: str) -> None:
    """Import a data script module and invoke its Click-wrapped ``main``."""
    try:
        module: ModuleType = importlib.import_module(module_path)
    except ModuleNotFoundError as exc:
        missing = getattr(exc, "name", None) or str(exc)
        raise SystemExit(
            f"Missing optional dependency ({missing!r}). "
            f"Install with: pip install 'aieng-agents[{extra}]'"
        ) from exc
    except ImportError as exc:
        # Scripts may call ``raise_missing_optional``, which raises ``ImportError``, while importing optional deps.
        raise SystemExit(str(exc)) from exc

    target: Any = getattr(module, attr)
    target()


def pdf_to_hf_dataset_main() -> None:
    """Entry for ``pdf_to_hf_dataset`` console script."""
    _run_with_extra("aieng.agents.data.pdf_to_hf_dataset", "main", "data")


def chunk_hf_dataset_main() -> None:
    """Entry for ``chunk_hf_dataset`` console script."""
    _run_with_extra("aieng.agents.data.chunk_hf_dataset", "main", "data")
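
To illustrate what a user would see when the `data` extra is absent, the sketch below restates the core of `_run_with_extra` under the name `run_with_extra` and points it at a deliberately nonexistent module. The module name is hypothetical, and the `SystemExit` is caught only so the message can be inspected.

```python
import importlib


def run_with_extra(module_path: str, attr: str, extra: str) -> None:
    """Simplified restatement of the thin CLI wrapper from the diff above."""
    try:
        module = importlib.import_module(module_path)
    except ModuleNotFoundError as exc:
        missing = getattr(exc, "name", None) or str(exc)
        raise SystemExit(
            f"Missing optional dependency ({missing!r}). "
            f"Install with: pip install 'aieng-agents[{extra}]'"
        ) from exc
    getattr(module, attr)()


try:
    # Hypothetical module path that is guaranteed not to exist.
    run_with_extra("module_that_is_not_installed_xyz", "main", "data")
except SystemExit as stop:
    exit_message = str(stop)
    print(exit_message)
```

When a `SystemExit` carrying a string is left uncaught, the interpreter prints that string to stderr and exits with status 1, so a console script fails with a one-line install hint rather than a traceback.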