Add perf metrics for 2.56.0 by khluu · Pull Request #64196 · ray-project/ray

khluu · 2026-06-18T00:19:42Z

REGRESSION 46.02%: single_client_put_gigabytes (THROUGHPUT) regresses from 12.831166033501425 to 6.9262999947537205 in microbenchmark.json
REGRESSION 18.77%: multi_client_put_gigabytes (THROUGHPUT) regresses from 40.8627833341568 to 33.19310764876294 in microbenchmark.json
REGRESSION 4.48%: 1_n_actor_calls_async (THROUGHPUT) regresses from 7370.550357182028 to 7040.378512116067 in microbenchmark.json
REGRESSION 3.91%: 1_1_actor_calls_concurrent (THROUGHPUT) regresses from 5678.956390335692 to 5456.76183608015 in microbenchmark.json
REGRESSION 1.76%: multi_client_tasks_async (THROUGHPUT) regresses from 21137.109008698124 to 20764.825419627614 in microbenchmark.json
REGRESSION 1.73%: 1_n_async_actor_calls_async (THROUGHPUT) regresses from 6907.905927393101 to 6788.316704875271 in microbenchmark.json
REGRESSION 1.61%: 1_1_async_actor_calls_sync (THROUGHPUT) regresses from 1417.1846351735794 to 1394.4268791918364 in microbenchmark.json
REGRESSION 0.93%: placement_group_create/removal (THROUGHPUT) regresses from 657.0579674498558 to 650.9409444317282 in microbenchmark.json
REGRESSION 0.85%: client__1_1_actor_calls_async (THROUGHPUT) regresses from 1066.5581344588645 to 1057.5113866970246 in microbenchmark.json
REGRESSION 0.83%: multi_client_put_calls_Plasma_Store (THROUGHPUT) regresses from 12844.074948208083 to 12736.988267796403 in microbenchmark.json
REGRESSION 0.74%: single_client_tasks_async (THROUGHPUT) regresses from 7096.786888013178 to 7044.008888726476 in microbenchmark.json
REGRESSION 0.65%: 1_1_actor_calls_async (THROUGHPUT) regresses from 8396.874033285541 to 8342.568803508693 in microbenchmark.json
REGRESSION 0.25%: 1_1_actor_calls_sync (THROUGHPUT) regresses from 1879.6455165405484 to 1875.0279275238897 in microbenchmark.json
REGRESSION 127.38%: dashboard_p95_latency_ms (LATENCY) regresses from 241.548 to 549.229 in benchmarks/many_tasks.json
REGRESSION 27.23%: dashboard_p99_latency_ms (LATENCY) regresses from 1879.035 to 2390.781 in benchmarks/many_pgs.json
REGRESSION 22.63%: dashboard_p50_latency_ms (LATENCY) regresses from 5.651 to 6.93 in benchmarks/many_nodes.json
REGRESSION 5.07%: avg_pg_create_time_ms (LATENCY) regresses from 1.6259311876874045 to 1.7084268978985828 in stress_tests/stress_test_placement_group.json
REGRESSION 3.28%: 107374182400_large_object_time (LATENCY) regresses from 24.263247010999976 to 25.059670065000006 in scalability/single_node.json
REGRESSION 2.33%: 10000_get_time (LATENCY) regresses from 15.200454125999997 to 15.554129597 in scalability/single_node.json
REGRESSION 1.73%: dashboard_p99_latency_ms (LATENCY) regresses from 3389.923 to 3448.611 in benchmarks/many_actors.json
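The percentages in the list above are consistent with a relative change against the old value, with the sign convention flipped by metric type. The following is a minimal sketch of that arithmetic, not the release tooling's actual code; the function name `regression_percent` is hypothetical.

```python
def regression_percent(old: float, new: float, metric_type: str) -> float:
    """Return how much worse `new` is than `old`, as a percentage of `old`.

    Assumed convention (inferred from the numbers in the report):
    THROUGHPUT regresses when the value drops, LATENCY when it rises.
    A positive result is a regression, a negative one an improvement.
    """
    if old == 0:
        raise ValueError("old value must be non-zero")
    if metric_type == "THROUGHPUT":
        return (old - new) / old * 100.0
    if metric_type == "LATENCY":
        return (new - old) / old * 100.0
    raise ValueError(f"unknown metric type: {metric_type!r}")


# Values copied from the report above.
put_gb = regression_percent(12.831166033501425, 6.9262999947537205, "THROUGHPUT")
p95 = regression_percent(241.548, 549.229, "LATENCY")
print(f"single_client_put_gigabytes: {put_gb:.2f}%")
print(f"dashboard_p95_latency_ms: {p95:.2f}%")
```

Rounded to two decimals, these reproduce the 46.02% and 127.38% figures quoted in the list.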

Signed-off-by: elliot-barn <elliot.barnwell@anyscale.com>

gemini-code-assist

Code Review

This pull request updates several performance benchmark metric files to reflect results for version 2.56.0. While some metrics show improvements, the review highlights three severe regressions: a 46.02% throughput drop in single-client object store write operations (single_client_put_gigabytes), a 127.38% increase in P95 dashboard latency under task load, and a 329% spike in peak memory usage, apparently in the many_tasks benchmark (the process list in the diff shows distributed/test_many_tasks.py). These regressions require immediate investigation before proceeding.


gemini-code-assist · 2026-06-18T00:20:28Z

    "single_client_put_gigabytes": [
-        12.831166033501425,
-        9.770703025046103
+        6.9262999947537205,
+        4.926803289933447


The throughput for single_client_put_gigabytes has regressed significantly by 46.02% (from 12.83 to 6.93). This represents a major performance degradation in single-client object store write operations.
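As a rough illustration of the comparison in this comment: the diff shows each microbenchmark entry as a two-element list, and the quoted regression uses the first element. The thread does not say what the second element means, so the sketch below ignores it. The helper name `headline_change` is hypothetical.

```python
from typing import Dict, List, Tuple

# Entries copied from the diff hunk above.
old_results: Dict[str, List[float]] = {
    "single_client_put_gigabytes": [12.831166033501425, 9.770703025046103],
}
new_results: Dict[str, List[float]] = {
    "single_client_put_gigabytes": [6.9262999947537205, 4.926803289933447],
}


def headline_change(
    old: Dict[str, List[float]], new: Dict[str, List[float]]
) -> Dict[str, Tuple[float, float, float]]:
    """Map each shared metric name to (old, new, signed percent change).

    Only the first list element is compared; the second element is left
    alone because its meaning is not stated in this thread.
    """
    changes = {}
    for name in sorted(old.keys() & new.keys()):
        old_value, new_value = old[name][0], new[name][0]
        percent = (new_value - old_value) / old_value * 100.0
        changes[name] = (old_value, new_value, percent)
    return changes


for name, (before, after, percent) in headline_change(old_results, new_results).items():
    print(f"{name}: {before:.2f} -> {after:.2f} ({percent:+.2f}%)")
```

A negative percentage here means the value dropped, which is a regression for a throughput metric.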

gemini-code-assist · 2026-06-18T00:20:28Z

            "perf_metric_name": "dashboard_p95_latency_ms",
            "perf_metric_type": "LATENCY",
-            "perf_metric_value": 241.548
+            "perf_metric_value": 549.229
        },


The P95 latency for the dashboard under task load has regressed by 127.38%, increasing from 241.55 ms to 549.23 ms. This indicates a substantial degradation in dashboard responsiveness.
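The entries in this hunk use the fields `perf_metric_name`, `perf_metric_type`, and `perf_metric_value`. A minimal sketch of diffing two lists of such entries follows; it assumes the entries are already loaded into Python lists of dicts, and the helper names `index_metrics` and `changed_metrics` are hypothetical.

```python
from typing import Dict, List, Tuple

Entry = Dict[str, object]


def index_metrics(entries: List[Entry]) -> Dict[str, Tuple[str, float]]:
    """Index entries by metric name as (metric type, value)."""
    return {
        str(e["perf_metric_name"]): (str(e["perf_metric_type"]), float(e["perf_metric_value"]))
        for e in entries
    }


def changed_metrics(
    old_entries: List[Entry], new_entries: List[Entry]
) -> List[Tuple[str, str, float, float]]:
    """Return (name, type, old value, new value) for metrics whose value changed."""
    old_by_name = index_metrics(old_entries)
    changes = []
    for name, (metric_type, new_value) in index_metrics(new_entries).items():
        if name in old_by_name and old_by_name[name][1] != new_value:
            changes.append((name, metric_type, old_by_name[name][1], new_value))
    return changes


# Values copied from the diff hunk above.
old_entries = [{"perf_metric_name": "dashboard_p95_latency_ms",
                "perf_metric_type": "LATENCY",
                "perf_metric_value": 241.548}]
new_entries = [{"perf_metric_name": "dashboard_p95_latency_ms",
                "perf_metric_type": "LATENCY",
                "perf_metric_value": 549.229}]

for name, metric_type, before, after in changed_metrics(old_entries, new_entries):
    print(f"{name} ({metric_type}): {before} -> {after}")
```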

gemini-code-assist · 2026-06-18T00:20:28Z

    "_dashboard_test_success": true,
-    "_peak_memory": 2.36,
-    "_peak_process_memory": "PID\tMEM\tCOMMAND\n3547\t0.6GiB\t/home/ray/anaconda3/lib/python3.10/site-packages/ray/core/src/ray/gcs/gcs_server --log_dir=/tmp/ray/\n2994\t0.32GiB\tvector --watch-config --log-format json --config-yaml /etc/vector/vector.yaml\n5416\t0.31GiB\taide --config=/etc/aide/aide.conf --update --after=report_url=file:/var/lib/aide/dailyaidecheck/arun\n4911\t0.17GiB\tpython distributed/test_many_tasks.py --num-tasks=1000\n3762\t0.17GiB\tray-dashboard-NodeHead-0 (/home/ray/anaconda3/bin/python3.10 -c \"from multiprocessing.spawn import s\n4243\t0.13GiB\tray::DashboardAgent\n3105\t0.11GiB\t/app/infra/dataplane/webterminal/webterminal_sidecar_image.binary.runfiles/python3_9_x86_64-unknown-\n1364\t0.1GiB\t/app/product/go/infra/anyscaled/anyscaled_/anyscaled startv2 --control_plane_url=https://console.any\n3680\t0.09GiB\t/home/ray/anaconda3/bin/python3.10 /home/ray/anaconda3/lib/python3.10/site-packages/ray/dashboard/da\n5117\t0.09GiB\tray::StateAPIGeneratorActor.start",
+    "_peak_memory": 10.13,


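Peak memory usage has increased significantly from 2.36 GB to 10.13 GB (a 329% increase). The removed _peak_process_memory value lists python distributed/test_many_tasks.py --num-tasks=1000, so this result appears to come from the many_tasks benchmark rather than many_nodes. This represents a severe memory regression.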
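To see which processes account for a peak, the `_peak_process_memory` string can be split into rows. Once the JSON is decoded it is a tab-separated table with a `PID`, `MEM`, `COMMAND` header. The sketch below assumes every `MEM` value ends in `GiB`, as in the removed line above; the function name is hypothetical, and the sample holds only the first three rows of that line.

```python
from typing import Dict, List, Union

Row = Dict[str, Union[int, float, str]]


def parse_peak_process_memory(report: str) -> List[Row]:
    """Parse a decoded `_peak_process_memory` string into a list of rows."""
    lines = report.split("\n")
    if lines[0].split("\t") != ["PID", "MEM", "COMMAND"]:
        raise ValueError("unexpected header: " + lines[0])
    rows: List[Row] = []
    for line in lines[1:]:
        if not line:
            continue  # tolerate a trailing newline
        # Split at most twice so tabs inside the command would be preserved.
        pid, mem, command = line.split("\t", 2)
        if not mem.endswith("GiB"):
            raise ValueError("unexpected memory unit: " + mem)
        rows.append({"pid": int(pid), "mem_gib": float(mem[:-3]), "command": command})
    return rows


# First three rows of the removed value, as they appear after JSON decoding.
sample = (
    "PID\tMEM\tCOMMAND\n"
    "3547\t0.6GiB\t/home/ray/anaconda3/lib/python3.10/site-packages/ray/core/src/ray/gcs/gcs_server --log_dir=/tmp/ray/\n"
    "2994\t0.32GiB\tvector --watch-config --log-format json --config-yaml /etc/vector/vector.yaml\n"
    "5416\t0.31GiB\taide --config=/etc/aide/aide.conf --update --after=report_url=file:/var/lib/aide/dailyaidecheck/arun"
)

rows = parse_peak_process_memory(sample)
for row in sorted(rows, key=lambda r: r["mem_gib"], reverse=True):
    print(row["pid"], row["mem_gib"], row["command"][:40])
```

Running the same parser on the old and new values and comparing per-command totals would show which process grew, provided the new value is available in full.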

Update performance metrics for 2.56.0

af52b22

Signed-off-by: elliot-barn <elliot.barnwell@anyscale.com>

gemini-code-assist Bot reviewed Jun 18, 2026


elliot-barn requested a review from edoakes June 18, 2026 00:25

edoakes approved these changes Jun 18, 2026


ray-gardener Bot added the core, release-test, and community-contribution labels Jun 18, 2026

khluu wants to merge 1 commit into master from 2.56.0_perf_metrics

3 participants
