Add perf metrics for 2.56.0 #64196

Open
khluu wants to merge 1 commit into master from 2.56.0_perf_metrics


khluu commented Jun 18, 2026

Contributor
REGRESSION 46.02%: single_client_put_gigabytes (THROUGHPUT) regresses from 12.831166033501425 to 6.9262999947537205 in microbenchmark.json
REGRESSION 18.77%: multi_client_put_gigabytes (THROUGHPUT) regresses from 40.8627833341568 to 33.19310764876294 in microbenchmark.json
REGRESSION 4.48%: 1_n_actor_calls_async (THROUGHPUT) regresses from 7370.550357182028 to 7040.378512116067 in microbenchmark.json
REGRESSION 3.91%: 1_1_actor_calls_concurrent (THROUGHPUT) regresses from 5678.956390335692 to 5456.76183608015 in microbenchmark.json
REGRESSION 1.76%: multi_client_tasks_async (THROUGHPUT) regresses from 21137.109008698124 to 20764.825419627614 in microbenchmark.json
REGRESSION 1.73%: 1_n_async_actor_calls_async (THROUGHPUT) regresses from 6907.905927393101 to 6788.316704875271 in microbenchmark.json
REGRESSION 1.61%: 1_1_async_actor_calls_sync (THROUGHPUT) regresses from 1417.1846351735794 to 1394.4268791918364 in microbenchmark.json
REGRESSION 0.93%: placement_group_create/removal (THROUGHPUT) regresses from 657.0579674498558 to 650.9409444317282 in microbenchmark.json
REGRESSION 0.85%: client__1_1_actor_calls_async (THROUGHPUT) regresses from 1066.5581344588645 to 1057.5113866970246 in microbenchmark.json
REGRESSION 0.83%: multi_client_put_calls_Plasma_Store (THROUGHPUT) regresses from 12844.074948208083 to 12736.988267796403 in microbenchmark.json
REGRESSION 0.74%: single_client_tasks_async (THROUGHPUT) regresses from 7096.786888013178 to 7044.008888726476 in microbenchmark.json
REGRESSION 0.65%: 1_1_actor_calls_async (THROUGHPUT) regresses from 8396.874033285541 to 8342.568803508693 in microbenchmark.json
REGRESSION 0.25%: 1_1_actor_calls_sync (THROUGHPUT) regresses from 1879.6455165405484 to 1875.0279275238897 in microbenchmark.json
REGRESSION 127.38%: dashboard_p95_latency_ms (LATENCY) regresses from 241.548 to 549.229 in benchmarks/many_tasks.json
REGRESSION 27.23%: dashboard_p99_latency_ms (LATENCY) regresses from 1879.035 to 2390.781 in benchmarks/many_pgs.json
REGRESSION 22.63%: dashboard_p50_latency_ms (LATENCY) regresses from 5.651 to 6.93 in benchmarks/many_nodes.json
REGRESSION 5.07%: avg_pg_create_time_ms (LATENCY) regresses from 1.6259311876874045 to 1.7084268978985828 in stress_tests/stress_test_placement_group.json
REGRESSION 3.28%: 107374182400_large_object_time (LATENCY) regresses from 24.263247010999976 to 25.059670065000006 in scalability/single_node.json
REGRESSION 2.33%: 10000_get_time (LATENCY) regresses from 15.200454125999997 to 15.554129597 in scalability/single_node.json
REGRESSION 1.73%: dashboard_p99_latency_ms (LATENCY) regresses from 3389.923 to 3448.611 in benchmarks/many_actors.json
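The percentages in the list above are consistent with a relative change against the old value, with the sign flipped by metric type. The following is a minimal sketch of that arithmetic, not the release tooling's actual code: the function name `regression_percent` is hypothetical, and it assumes that lower is worse for THROUGHPUT and higher is worse for LATENCY.

```python
def regression_percent(old: float, new: float, metric_type: str) -> float:
    """Return how much worse `new` is than `old`, as a percentage of `old`.

    Assumption (not taken from the release tooling): a THROUGHPUT metric
    regresses when it drops, a LATENCY metric regresses when it rises.
    """
    if metric_type == "THROUGHPUT":
        change = old - new
    elif metric_type == "LATENCY":
        change = new - old
    else:
        raise ValueError(f"unknown metric type: {metric_type!r}")
    return change / old * 100.0


# Values copied from the first THROUGHPUT and first LATENCY lines above.
put_gigabytes = regression_percent(
    12.831166033501425, 6.9262999947537205, "THROUGHPUT"
)
dashboard_p95 = regression_percent(241.548, 549.229, "LATENCY")

print(f"single_client_put_gigabytes: {put_gigabytes:.2f}%")  # 46.02%
print(f"dashboard_p95_latency_ms:    {dashboard_p95:.2f}%")  # 127.38%
```

A negative return value would mean the metric improved under this convention.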

Signed-off-by: elliot-barn <elliot.barnwell@anyscale.com>

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor


Code Review

This pull request updates several performance benchmark metric files to reflect results for version 2.56.0. While some metrics show improvements, the review highlights three severe regressions: a 46.02% throughput drop in single-client object store write operations, a 127.38% increase in P95 dashboard latency under task load, and a 329% spike in peak memory usage during the many_nodes benchmark. These regressions require immediate investigation before proceeding.
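One way to pick out the largest regressions from the PR description is to parse its `REGRESSION` lines and filter by a cutoff. The sketch below is illustrative only: the regular expression is written against the line format shown in the description, the names `parse_regressions` and `severe` are hypothetical, and the 10% threshold is an arbitrary assumption rather than a project policy.

```python
import re
from typing import List, NamedTuple

# Pattern written against the line format used in the PR description.
LINE_RE = re.compile(
    r"REGRESSION (?P<pct>[\d.]+)%: (?P<name>\S+) \((?P<kind>\w+)\) "
    r"regresses from (?P<old>[\d.]+) to (?P<new>[\d.]+) in (?P<file>\S+)"
)


class Regression(NamedTuple):
    percent: float
    name: str
    kind: str
    old: float
    new: float
    source: str


def parse_regressions(text: str) -> List[Regression]:
    """Extract one Regression record per matching line in `text`."""
    return [
        Regression(
            float(m["pct"]), m["name"], m["kind"],
            float(m["old"]), float(m["new"]), m["file"],
        )
        for m in LINE_RE.finditer(text)
    ]


def severe(regs: List[Regression], threshold_percent: float = 10.0) -> List[Regression]:
    """Keep regressions at or above the (assumed) threshold, worst first."""
    kept = [r for r in regs if r.percent >= threshold_percent]
    return sorted(kept, key=lambda r: r.percent, reverse=True)


# Three lines copied from the PR description.
SAMPLE = """\
REGRESSION 46.02%: single_client_put_gigabytes (THROUGHPUT) regresses from 12.831166033501425 to 6.9262999947537205 in microbenchmark.json
REGRESSION 0.93%: placement_group_create/removal (THROUGHPUT) regresses from 657.0579674498558 to 650.9409444317282 in microbenchmark.json
REGRESSION 127.38%: dashboard_p95_latency_ms (LATENCY) regresses from 241.548 to 549.229 in benchmarks/many_tasks.json
"""

worst = severe(parse_regressions(SAMPLE))
for r in worst:
    print(f"{r.percent:7.2f}%  {r.name}  ({r.source})")
```

The peak-memory change mentioned in the summary does not appear as a `REGRESSION` line in the description, so a filter like this would not surface it.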

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment on lines 263 to +265
  "single_client_put_gigabytes": [
-   12.831166033501425,
-   9.770703025046103
+   6.9262999947537205,
+   4.926803289933447


high

The throughput for single_client_put_gigabytes has regressed significantly by 46.02% (from 12.83 to 6.93). This represents a major performance degradation in single-client object store write operations.

Comment on lines 24 to 27
  "perf_metric_name": "dashboard_p95_latency_ms",
  "perf_metric_type": "LATENCY",
- "perf_metric_value": 241.548
+ "perf_metric_value": 549.229
  },


high

The P95 latency for the dashboard under task load has regressed by 127.38%, increasing from 241.55 ms to 549.23 ms. This indicates a substantial degradation in dashboard responsiveness.

  "_dashboard_test_success": true,
- "_peak_memory": 2.36,
- "_peak_process_memory": "PID\tMEM\tCOMMAND\n3547\t0.6GiB\t/home/ray/anaconda3/lib/python3.10/site-packages/ray/core/src/ray/gcs/gcs_server --log_dir=/tmp/ray/\n2994\t0.32GiB\tvector --watch-config --log-format json --config-yaml /etc/vector/vector.yaml\n5416\t0.31GiB\taide --config=/etc/aide/aide.conf --update --after=report_url=file:/var/lib/aide/dailyaidecheck/arun\n4911\t0.17GiB\tpython distributed/test_many_tasks.py --num-tasks=1000\n3762\t0.17GiB\tray-dashboard-NodeHead-0 (/home/ray/anaconda3/bin/python3.10 -c \"from multiprocessing.spawn import s\n4243\t0.13GiB\tray::DashboardAgent\n3105\t0.11GiB\t/app/infra/dataplane/webterminal/webterminal_sidecar_image.binary.runfiles/python3_9_x86_64-unknown-\n1364\t0.1GiB\t/app/product/go/infra/anyscaled/anyscaled_/anyscaled startv2 --control_plane_url=https://console.any\n3680\t0.09GiB\t/home/ray/anaconda3/bin/python3.10 /home/ray/anaconda3/lib/python3.10/site-packages/ray/dashboard/da\n5117\t0.09GiB\tray::StateAPIGeneratorActor.start",
+ "_peak_memory": 10.13,


high

Peak memory usage has increased significantly from 2.36 GB to 10.13 GB (a 329% increase) during the many_nodes benchmark. This represents a severe memory regression.
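When investigating a peak-memory change like this, the `_peak_process_memory` field can help show which processes held the most memory. The sketch below assumes the format visible in the snippet above: a header row followed by tab-separated `PID`, `MEM`, `COMMAND` rows, with `MEM` given in GiB. The function name `parse_process_memory` is hypothetical, and the sample reuses three rows from the listing shown in the snippet, not a listing from the new run.

```python
from typing import List, Tuple


def parse_process_memory(table: str) -> List[Tuple[int, float, str]]:
    """Split a PID/MEM/COMMAND table into (pid, gib, command) rows.

    Assumes the first line is a header and every MEM value ends in "GiB",
    as in the snippet quoted in this review.
    """
    rows = []
    for line in table.splitlines()[1:]:  # skip the header row
        pid, mem, command = line.split("\t", 2)
        if not mem.endswith("GiB"):
            raise ValueError(f"unexpected memory unit in {mem!r}")
        rows.append((int(pid), float(mem[: -len("GiB")]), command))
    return rows


# Three rows taken from the listing in the snippet; after JSON decoding,
# the \t and \n escapes become real tabs and newlines as written here.
SAMPLE = (
    "PID\tMEM\tCOMMAND\n"
    "3547\t0.6GiB\t/home/ray/anaconda3/lib/python3.10/site-packages/ray/core/src/ray/gcs/gcs_server --log_dir=/tmp/ray/\n"
    "4911\t0.17GiB\tpython distributed/test_many_tasks.py --num-tasks=1000\n"
    "4243\t0.13GiB\tray::DashboardAgent"
)

rows = parse_process_memory(SAMPLE)
top_pid, top_gib, top_command = max(rows, key=lambda row: row[1])
print(f"largest: pid={top_pid} {top_gib} GiB {top_command}")
```

Running the same parse on the old and new listings and comparing them per command would be one way to narrow down where the extra memory went.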

elliot-barn requested a review from edoakes June 18, 2026 00:25
ray-gardener Bot added the core (Issues that should be addressed in Ray Core), release-test (release test), and community-contribution (Contributed by the community) labels Jun 18, 2026

Labels

community-contribution (Contributed by the community), core (Issues that should be addressed in Ray Core), release-test (release test)


3 participants