Claude/build ai system project eb6ui by JusSil501 · Pull Request #27 · codepath/ai110-module2show-pawpal-starter

JusSil501 · 2026-04-15T00:10:35Z

No description provided.

- Add pawpal_system.py: Task/Pet/Owner dataclasses + Scheduler with sorting, filtering, daily/weekly recurrence, and conflict detection
- Add main.py: CLI demo verifying data flow end-to-end
- Add tests/test_pawpal.py: 17 tests covering all core behaviors and edge cases
- Update app.py: full Streamlit UI wired to logic layer with Claude AI schedule explanation via Anthropic API
- Update README.md: architecture overview, UML diagram, features, testing docs
- Update reflection.md: design decisions, tradeoffs, AI collaboration notes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

The reliability logger writes to pawpal.log; don't check it in. Also drop the accidentally-tracked .pyc from the starter. https://claude.ai/code/session_01NdDTc7eYWfQ8b1Fmsc1Dtp

logger_setup.py:
- get_logger() wires a shared console + file handler (pawpal.log).
- sanitize_user_text() screens user input before any LLM call: empty / oversized / prompt-injection / secret-leak patterns.
- Raises GuardrailError so callers can surface a clean UI message.

knowledge_base.py:
- 15 curated pet-care snippets covering exercise, feeding, medication, grooming, enrichment, seniors, puppies, hydration, training.
- Deterministic keyword + tag retriever (TF-style scorer, no deps).
- min_score guard returns [] for off-topic queries rather than grounding the model on noise.
- format_context() renders [n]-numbered context for insertion into the prompt.

https://claude.ai/code/session_01NdDTc7eYWfQ8b1Fmsc1Dtp
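
A minimal sketch of the kind of keyword + tag retriever described above, to make the min_score guard concrete. It is not the code in knowledge_base.py: the `Snippet` fields, the two sample entries, the stopword list, the tag bonus of 2, and the `retrieve` signature are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Snippet:
    title: str
    text: str
    tags: tuple = ()


# Two illustrative entries (hypothetical; the PR describes 15 curated snippets).
KB = [
    Snippet("Daily walks",
            "Most adult dogs need at least one walk per day for exercise.",
            ("exercise", "dog")),
    Snippet("Fresh water",
            "Refill the water bowl daily to keep pets hydrated.",
            ("hydration",)),
]

# Assumed stopword list so filler words cannot push noise over min_score.
STOPWORDS = {"a", "an", "the", "to", "for", "of", "and", "how", "is", "in", "at", "per"}


def _tokens(text):
    """Lowercase alphabetic tokens with stopwords removed."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def retrieve(query, k=3, min_score=1.0):
    """Return up to k snippets scoring at least min_score, best first."""
    query_terms = set(_tokens(query))
    scored = []
    for idx, snippet in enumerate(KB):
        words = _tokens(snippet.title + " " + snippet.text)
        score = sum(words.count(term) for term in query_terms)       # term frequency
        score += 2 * sum(1 for tag in snippet.tags if tag in query_terms)  # tag bonus
        if score >= min_score:
            scored.append((score, idx, snippet))
    # Sort by score descending, then by KB position, so results are deterministic.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [snippet for _, _, snippet in scored[:k]]


def format_context(snippets):
    """Render snippets as [n]-numbered lines for the prompt."""
    return "\n".join(f"[{i}] {s.title}: {s.text}" for i, s in enumerate(snippets, 1))
```

Returning an empty list for an off-topic query lets the caller skip the LLM call or answer "no grounded sources" instead of citing irrelevant snippets.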

ai_agent.py:
- answer_question(): RAG-grounded Q&A via Claude Haiku. Retrieves top-k snippets, forces inline [n] citations, returns an AgentResult with confidence derived from citation coverage.
- ScheduleReviewAgent: plan -> act -> check loop.
  - PLAN retrieves guidance from the KB.
  - ACT asks the LLM to emit JSON listing at most 3 concrete issues.
  - CHECK runs each issue through a deterministic validator against the real Scheduler. Hallucinated conflicts are dropped.
- Confidence decays with every validator rejection; AgentResult carries the full trace so the UI can show what happened.
- _extract_json tolerates prose wrapping + code fences (Haiku sometimes wraps JSON in "Sure! Here you go:" preamble).

evaluator.py:
- 8 offline checks: KB size, retriever relevance, retriever noise rejection, guardrail block/allow paths, scheduler conflict detection, empty-owner safety, agent JSON parse tolerance.
- All deterministic / no network -- safe for CI and runs from the Streamlit reliability tab.
- EvalReport exposes passed/total/score + markdown rendering.

https://claude.ai/code/session_01NdDTc7eYWfQ8b1Fmsc1Dtp
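
One way a tolerant extractor like _extract_json could work is sketched below. This is an assumption about the approach, not the PR's implementation: the function name `extract_json` is hypothetical, and the sketch assumes the model's payload is a JSON object (it scans only for `{`, so bracketed citations such as [1] in the preamble are not mistaken for JSON).

```python
import json


def extract_json(raw):
    """Return the first JSON object embedded in raw, or None if there is none.

    Scans for each '{' and asks the standard-library decoder to parse from
    that position, so leading prose and surrounding code fences are ignored.
    """
    decoder = json.JSONDecoder()
    text = raw or ""
    for index, char in enumerate(text):
        if char != "{":
            continue
        try:
            value, _end = decoder.raw_decode(text, index)
        except json.JSONDecodeError:
            continue  # not valid JSON from this brace; try the next one
        return value
    return None
```

Returning None rather than raising lets the caller treat an unparseable reply as "no issues reported" and lower confidence accordingly.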

34 new tests (all deterministic, no network):
- test_knowledge_base.py: canonical-query coverage, noise rejection, k-respect, determinism, empty-context safety.
- test_guardrails.py: each block pattern (injection, secret leak, system-prompt exfil, empty/None, length cap) plus happy-path.
- test_agent.py: JSON extractor (strict / prose-wrapped / code-fenced / empty), conflict validator (accepts real / rejects fake times / rejects when no real conflicts exist), and a stubbed plan-act-check run that confirms hallucinated claims are dropped.
- test_evaluator.py: meta-tests that the evaluator itself passes and exposes a sane score.

Total suite: 51 tests, ~0.2s, 100% pass.

https://claude.ai/code/session_01NdDTc7eYWfQ8b1Fmsc1Dtp
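
For readers unfamiliar with this style of guardrail, the block paths exercised by test_guardrails.py might look roughly like the sketch below. The names sanitize_user_text and GuardrailError come from the PR; the regex patterns, the 500-character cap, and the error messages are assumptions chosen for illustration and are unlikely to match the real logger_setup.py.

```python
import re


class GuardrailError(ValueError):
    """Raised when user text must not be forwarded to the LLM."""


MAX_CHARS = 500  # assumed length cap; the PR does not state the real limit

# Assumed patterns, one per block category named in the PR.
_BLOCK_PATTERNS = {
    "prompt injection": re.compile(
        r"ignore (all |any )?(previous|prior) instructions", re.IGNORECASE),
    "system-prompt exfiltration": re.compile(
        r"(reveal|show|print).{0,40}system prompt", re.IGNORECASE),
    "secret leak": re.compile(r"sk-[A-Za-z0-9_-]{16,}"),
}


def sanitize_user_text(text):
    """Return stripped text, or raise GuardrailError if it should be blocked."""
    if text is None or not text.strip():
        raise GuardrailError("empty input")
    cleaned = text.strip()
    if len(cleaned) > MAX_CHARS:
        raise GuardrailError("input too long")
    for label, pattern in _BLOCK_PATTERNS.items():
        if pattern.search(cleaned):
            raise GuardrailError(f"blocked: {label}")
    return cleaned
```

A UI caller would wrap the call in `try/except GuardrailError` and show the message, which is how a guardrail failure stays a clean notice rather than a stack trace.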

app.py:
- Four tabs mapping 1:1 to the architecture: Schedule, Ask PawPal (RAG), Review Agent, Reliability.
- Every AI response shows a traffic-light confidence indicator (green >=0.7, yellow >=0.4, red otherwise) and the sources used.
- Sidebar indicates whether ANTHROPIC_API_KEY is set so users know which tabs will work offline.
- Guardrail errors surface as clean UI messages, not stack traces.

main.py:
- CLI demo now runs the full applied-AI surface: scheduler, retriever, guardrails, evaluator, and (if key present) live LLM calls for Q&A + review agent.

https://claude.ai/code/session_01NdDTc7eYWfQ8b1Fmsc1Dtp
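
The traffic-light thresholds stated above map to a small pure function. The thresholds are taken from the commit message; the function name `confidence_light` and the string return values are hypothetical.

```python
def confidence_light(confidence):
    """Map a confidence score to a traffic-light label.

    Thresholds follow the commit message: green >= 0.7, yellow >= 0.4,
    red otherwise.
    """
    if confidence >= 0.7:
        return "green"
    if confidence >= 0.4:
        return "yellow"
    return "red"
```

Keeping the mapping separate from the Streamlit rendering code makes the boundary values easy to unit-test without a running UI.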

- README identifies the Module-2 base project, documents the four new AI features (RAG, agentic workflow, reliability eval, guardrails), setup + sample interactions, design tradeoffs, testing summary.
- model_card.md covers intended use, limitations, biases (corpus skew, keyword-retriever gaps), misuse mitigations, testing surprises, and two concrete AI-collaboration examples (one helpful, one flawed).
- assets/system_architecture.md contains the Mermaid source for the architecture diagram (exported to PNG via the Mermaid Live Editor).

https://claude.ai/code/session_01NdDTc7eYWfQ8b1Fmsc1Dtp

JusSil501 and others added 7 commits March 18, 2026 12:22

chore: ignore pawpal.log and stop tracking __pycache__

5c4204d



JusSil501 wants to merge 7 commits into codepath:main from JusSil501:claude/build-ai-system-project-Eb6ui


2 participants
