EvalLab is a responsive Bootstrap-based web UI for evaluating Large Language Model (LLM) outputs.
It provides a clean interface to manage projects, run evaluations, view results, and inspect LLM-as-judge scores.
Goal:
Provide an intuitive and extensible frontend to visualize LLM evaluation results — including execution tests, static analysis, and LLM-as-judge feedback.
Stack:
- HTML5 / CSS3 / JavaScript (ES6)
- Bootstrap 5 for layout and components
- Bootstrap Icons for icons
- Mock JSON data (replaceable with real APIs)
🧩 The app can later be integrated with LangSmith, OpenAI Evals, or custom REST APIs.
eval-lab/
├── index.html          # Main dashboard page
├── styles.css          # Custom styles overriding Bootstrap
├── app.js              # Core JS logic: state mgmt, events, mock data loading
├── mock-data.json      # Sample project/evaluation data
├── assets/
│   ├── logo.svg
│   └── icons/
│       └── *.svg
├── README.md           # You're here
└── /docs/
    └── ui-spec.md      # Detailed UI & component spec (optional)
- Download the CloudWiz logo or use your company logo
- Save it as logo.png in the eval_ui/ directory
- Recommended size: 200x200px or a similar square format
- The logo will automatically appear in the navbar with proper styling
Alternative: If you prefer the robot icon, remove the <img> tag in index.html line 24 and uncomment the icon.
# Navigate to project directory
cd /Users/praveenghuge/go/src/github.com/cloudwizio/agentic/eval_ui
# Start a local server (Python)
python3 -m http.server 8000
# Open browser
open http://localhost:8000
For detailed setup instructions, see HELPER.md.
- ✅ Responsive Dashboard — Works on desktop, tablet, and mobile
- ✅ Collapsible Sidebar — Save screen space (Ctrl+B)
- ✅ Three-Column Layout — Evaluation list, results viewer, metadata
- ✅ Mock Data Integration — Test without backend
- ✅ Tabbed Results — Output, Tests, Logs, LLM Judge scores
- ✅ Keyboard Shortcuts — Power user friendly
- ✅ Accessibility — ARIA labels, keyboard navigation
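The Ctrl+B sidebar shortcut from the list above could be wired up roughly as follows. This is a minimal sketch, not the code shipped in app.js: the `isSidebarToggle` helper, the `#sidebar` selector, and the `collapsed` class are hypothetical names chosen for illustration.

```javascript
// Returns true when a keydown event matches the Ctrl+B sidebar shortcut.
function isSidebarToggle(event) {
  return Boolean(event.ctrlKey) &&
    !event.altKey &&
    !event.shiftKey &&
    String(event.key).toLowerCase() === 'b';
}

// Only attach the listener when a DOM is available (i.e. in the browser).
if (typeof document !== 'undefined') {
  document.addEventListener('keydown', (event) => {
    if (!isSidebarToggle(event)) return;
    event.preventDefault(); // keep the browser from handling Ctrl+B itself
    // Assumption: the sidebar element has id "sidebar" and collapses via a CSS class.
    const sidebar = document.querySelector('#sidebar');
    if (sidebar) sidebar.classList.toggle('collapsed');
  });
}
```

Keeping the key check in a separate pure function makes the shortcut easy to test without a browser.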
The app is designed to work with REST APIs. See app.js for integration points:
GET /api/projects → List all projects
GET /api/evaluations → List evaluations (with filters)
POST /api/evaluations/:id/run → Run evaluation
GET /api/evaluations/:id/results → Test results
GET /api/evaluations/:id/logs → Execution logs
GET /api/evaluations/:id/judge → LLM judge scores
GET /api/models → Available models
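One way to keep these routes in a single place is a small URL-builder object plus a fetch wrapper that rejects on HTTP errors. The sketch below is illustrative only: `endpoints`, `requestJson`, the `/api` base path on the same origin, and the filter keys (such as `status`) are assumptions, not part of app.js or of a documented backend contract.

```javascript
// Assumption: the API is served from the same origin under /api.
const API_BASE = '/api';

// Hypothetical URL builders for the routes listed above.
const endpoints = {
  projects: () => `${API_BASE}/projects`,
  // Filter keys are illustrative; the real query parameters depend on the backend.
  evaluations: (filters = {}) => {
    const query = new URLSearchParams(filters).toString();
    return query ? `${API_BASE}/evaluations?${query}` : `${API_BASE}/evaluations`;
  },
  run: (id) => `${API_BASE}/evaluations/${encodeURIComponent(id)}/run`,
  results: (id) => `${API_BASE}/evaluations/${encodeURIComponent(id)}/results`,
  logs: (id) => `${API_BASE}/evaluations/${encodeURIComponent(id)}/logs`,
  judge: (id) => `${API_BASE}/evaluations/${encodeURIComponent(id)}/judge`,
  models: () => `${API_BASE}/models`,
};

// fetch() resolves even on 4xx/5xx responses, so check res.ok explicitly.
async function requestJson(url, options = {}) {
  const res = await fetch(url, options);
  if (!res.ok) {
    throw new Error(`Request failed: ${res.status} ${res.statusText} (${url})`);
  }
  return res.json(); // assumes the endpoint returns a JSON body
}
```

For example, `requestJson(endpoints.evaluations({ status: 'passed' }))` would request `/api/evaluations?status=passed`, and `requestJson(endpoints.run(id), { method: 'POST' })` would trigger a run, assuming the backend responds with JSON.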
In app.js, replace the mock implementation:
// Before (mock)
async getProjects() {
  const data = await this.loadMockData();
  return data ? data.projects : [];
}
// After (real API)
async getProjects() {
  return fetch('/api/projects').then(res => res.json());
}
- HELPER.md — Local setup and troubleshooting
- mock-data.json — Sample data structure
- Bootstrap 5 Docs — https://getbootstrap.com/docs/5.3/
Edit styles.css:
:root {
  --primary-indigo: #6366f1; /* Your brand color */
  --secondary-slate: #64748b;
}
Edit index.html in the sidebar section:
<li class="nav-item">
  <a class="nav-link" href="#" data-section="analytics">
    <i class="bi bi-graph-up me-2"></i>
    <span class="nav-text">Analytics</span>
  </a>
</li>
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
MIT License - feel free to use this in your projects.
Having issues? Check:
- Browser console (F12) for error messages
- HELPER.md for common issues
- Network tab to verify API calls
Built with ❤️ for the LLM evaluation community