Conversation

baptistecolle (Collaborator)
Changes

  • Added capability to rerun individual benchmarks

    • Modularized the backend architecture to support selective benchmark reruns (a rough sketch of the idea follows this list)
    • Addresses cases where benchmarks produced erroneous results due to outdated dependencies
  • Implemented observability dashboard for LLM Performance Leaderboard

    • Provides monitoring of benchmark execution status
    • Tracks failed benchmark configurations to facilitate debugging
    • Enhances visibility into the overall health of the leaderboard system
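
As a rough illustration of the selective-rerun idea only: the sketch below is not the PR's actual implementation, and all names in it (`BenchmarkConfig`, `run_benchmark`, the results-directory layout) are assumptions rather than the project's API. The core of rerunning a single benchmark is re-executing one configuration and replacing only its result file, leaving every other leaderboard entry untouched.

```python
# Hypothetical sketch of a selective benchmark rerun; names and layout are illustrative.
import json
from dataclasses import dataclass
from pathlib import Path


@dataclass
class BenchmarkConfig:
    model_id: str   # e.g. "meta-llama/Llama-2-7b-hf"
    backend: str    # e.g. "pytorch"
    hardware: str   # e.g. "a10g"


def run_benchmark(config: BenchmarkConfig) -> dict:
    # Placeholder for the actual benchmark runner used by the leaderboard.
    raise NotImplementedError


def rerun_single(config: BenchmarkConfig, results_dir: Path) -> None:
    """Rerun one benchmark configuration and overwrite only its result file."""
    result = run_benchmark(config)
    out_file = (
        results_dir
        / config.hardware
        / config.backend
        / f"{config.model_id.replace('/', '_')}.json"
    )
    out_file.parent.mkdir(parents=True, exist_ok=True)
    out_file.write_text(json.dumps(result, indent=2))


# Usage: rerun only the configuration whose results were produced with
# outdated dependencies, e.g.
# rerun_single(BenchmarkConfig("meta-llama/Llama-2-7b-hf", "pytorch", "a10g"), Path("results"))
```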

Motivation

These changes improve the maintainability and reliability of the LLM Performance Leaderboard by enabling operators to:

  1. Quickly identify and rerun problematic benchmarks (previously, outdated results simply remained in the leaderboard)
  2. Monitor benchmark execution status through a centralized dashboard
  3. Better understand failed configurations in order to identify the root cause of a bug

@baptistecolle baptistecolle added the all_benchmarks [CI] Requires and enables running all benchmark workflows label Feb 6, 2025