[Feature] TensorDictPrimer with single default_value callable #2732

Open
wants to merge 2 commits into
base: gh/vmoens/84/base

Contributor

@vmoens vmoens commented Jan 30, 2025

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Jan 30, 2025
ghstack-source-id: 172825a4bf036c332c9012e45d070fc0fe348a0d
Pull Request resolved: #2732

pytorch-bot bot commented Jan 30, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2732

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures, 1 Cancelled Job, 7 Unrelated Failures

As of commit 7178f09 with merge base 20a19fe:

NEW FAILURES - The following jobs have failed:

CANCELLED JOB - The following job was cancelled. Please retry:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed label Jan 30, 2025
Contributor Author

vmoens commented Jan 30, 2025

Example script:

from torchrl.envs import TensorDictPrimer, PendulumEnv, StepCounter, TrajCounter
import torch
from tensordict import assert_close
# let's build a primer that outputs the same Pendulum initial state 4 times in a row
env = PendulumEnv()

def val_iterator(N = 4, env=env):
    i = 0
    while True:
        i += 1
        r = env.reset()
        for _ in range(N):
            torch.manual_seed(i)
            yield r

iterator = val_iterator()

def val_generator(iterator=iterator):
    return next(iterator)

print('observation_spec', env.observation_spec)
primer = TensorDictPrimer(primers=env.observation_spec, default_value=val_generator, single_default_value=True)
env = env.append_transform(primer)
env = env.append_transform(StepCounter(max_steps=50))
env = env.append_transform(TrajCounter())
r = env.rollout(1000, break_when_any_done=False)
print(r[0]["th"], r[0]["thdot"])
print(r[50]["th"], r[50]["thdot"])

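# To compute the GAE with an empirical advantage, we first need to collapse the trajectories:
# each group of 4 consecutive trajectories (same initial state) gets a single traj_count value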
r["traj_count"] = r["traj_count"]//4
r["next", "traj_count"] = r["next", "traj_count"]//4

assert_close(r[:50], r[50:100])
assert_close(r[:50], r[100:150])

# Reshape r
print(r.shape)
r = r.reshape(-1, 4, 50)
assert r[0]["traj_count"].unique().numel() == 1
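
The two ideas in the script (a zero-argument callable that serves each initial state N times, and floor division to group trajectory ids) can be sketched without torchrl. This is only an illustration in plain Python: `repeat_each`, `initial_states` and `next_value` are hypothetical names that stand in for `val_iterator`, `env.reset()` and `val_generator`, and nothing here is part of the TensorDictPrimer API.

```python
import itertools


def repeat_each(source, n=4):
    """Yield every item produced by `source` n times in a row."""
    for item in source:
        for _ in range(n):
            yield item


# Hypothetical stand-in for env.reset(): distinct labels instead of tensordicts.
initial_states = (f"state_{k}" for k in itertools.count())
iterator = repeat_each(initial_states, n=4)


def next_value(iterator=iterator):
    # Zero-argument callable, mirroring val_generator in the script above.
    return next(iterator)


# Eight consecutive "resets": the first four share one state, the next four another.
primed = [next_value() for _ in range(8)]

# Floor division by 4 maps trajectory ids 0..7 onto group ids 0,0,0,0,1,1,1,1,
# which is what the `// 4` on "traj_count" does in the script.
traj_ids = list(range(8))
group_ids = [t // 4 for t in traj_ids]

print(primed)
print(group_ids)
```

After the collapse, trajectories that started from the same state share a group id, so they can be reshaped and compared side by side as in the last lines of the script.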

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Feb 3, 2025
ghstack-source-id: b9f7df7bf2abd312dc8de56cac757c4b2975c62c
Pull Request resolved: #2732
@vmoens vmoens added the enhancement label Feb 3, 2025

github-actions bot commented Feb 3, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}24$. Worsened: $\large\color{#d91a1a}2$.

Name Max Mean Ops Ops on Repo HEAD Change
test_simple 0.5370s 0.4572s 2.1873 Ops/s 2.1756 Ops/s $\color{#35bf28}+0.54\%$
test_transformed 1.0255s 0.9428s 1.0607 Ops/s 1.0605 Ops/s $\color{#35bf28}+0.01\%$
test_serial 1.4737s 1.4038s 0.7123 Ops/s 0.7153 Ops/s $\color{#d91a1a}-0.42\%$
test_parallel 1.2885s 1.2108s 0.8259 Ops/s 0.8069 Ops/s $\color{#35bf28}+2.35\%$
test_step_mdp_speed[True-True-True-True-True] 0.6065ms 30.3399μs 32.9599 KOps/s 32.0911 KOps/s $\color{#35bf28}+2.71\%$
test_step_mdp_speed[True-True-True-True-False] 0.1369ms 18.2656μs 54.7476 KOps/s 55.1928 KOps/s $\color{#d91a1a}-0.81\%$
test_step_mdp_speed[True-True-True-False-True] 69.9710μs 17.2174μs 58.0809 KOps/s 56.2887 KOps/s $\color{#35bf28}+3.18\%$
test_step_mdp_speed[True-True-True-False-False] 46.1550μs 10.1210μs 98.8041 KOps/s 96.8864 KOps/s $\color{#35bf28}+1.98\%$
test_step_mdp_speed[True-True-False-True-True] 83.9770μs 31.9906μs 31.2592 KOps/s 29.6831 KOps/s $\textbf{\color{#35bf28}+5.31\%}$
test_step_mdp_speed[True-True-False-True-False] 42.9600μs 19.5746μs 51.0867 KOps/s 49.8312 KOps/s $\color{#35bf28}+2.52\%$
test_step_mdp_speed[True-True-False-False-True] 0.1289ms 19.1172μs 52.3089 KOps/s 50.3952 KOps/s $\color{#35bf28}+3.80\%$
test_step_mdp_speed[True-True-False-False-False] 74.9100μs 11.8323μs 84.5145 KOps/s 81.1043 KOps/s $\color{#35bf28}+4.20\%$
test_step_mdp_speed[True-False-True-True-True] 67.6860μs 33.7509μs 29.6288 KOps/s 28.3283 KOps/s $\color{#35bf28}+4.59\%$
test_step_mdp_speed[True-False-True-True-False] 61.7550μs 21.4804μs 46.5542 KOps/s 46.1230 KOps/s $\color{#35bf28}+0.93\%$
test_step_mdp_speed[True-False-True-False-True] 47.5480μs 18.8502μs 53.0497 KOps/s 50.9696 KOps/s $\color{#35bf28}+4.08\%$
test_step_mdp_speed[True-False-True-False-False] 60.5020μs 11.8922μs 84.0890 KOps/s 81.7890 KOps/s $\color{#35bf28}+2.81\%$
test_step_mdp_speed[True-False-False-True-True] 0.1039ms 35.5348μs 28.1414 KOps/s 26.9406 KOps/s $\color{#35bf28}+4.46\%$
test_step_mdp_speed[True-False-False-True-False] 54.4310μs 23.2764μs 42.9619 KOps/s 42.2686 KOps/s $\color{#35bf28}+1.64\%$
test_step_mdp_speed[True-False-False-False-True] 47.1580μs 20.6272μs 48.4798 KOps/s 46.8684 KOps/s $\color{#35bf28}+3.44\%$
test_step_mdp_speed[True-False-False-False-False] 40.5460μs 13.7519μs 72.7172 KOps/s 71.7470 KOps/s $\color{#35bf28}+1.35\%$
test_step_mdp_speed[False-True-True-True-True] 76.7830μs 33.6989μs 29.6745 KOps/s 28.5031 KOps/s $\color{#35bf28}+4.11\%$
test_step_mdp_speed[False-True-True-True-False] 92.7530μs 21.3216μs 46.9007 KOps/s 45.7379 KOps/s $\color{#35bf28}+2.54\%$
test_step_mdp_speed[False-True-True-False-True] 0.5038ms 22.2471μs 44.9496 KOps/s 44.7209 KOps/s $\color{#35bf28}+0.51\%$
test_step_mdp_speed[False-True-True-False-False] 61.6450μs 13.3816μs 74.7294 KOps/s 73.4990 KOps/s $\color{#35bf28}+1.67\%$
test_step_mdp_speed[False-True-False-True-True] 83.0150μs 35.2569μs 28.3632 KOps/s 27.3044 KOps/s $\color{#35bf28}+3.88\%$
test_step_mdp_speed[False-True-False-True-False] 61.1740μs 23.1782μs 43.1439 KOps/s 42.4958 KOps/s $\color{#35bf28}+1.53\%$
test_step_mdp_speed[False-True-False-False-True] 2.5766ms 23.7565μs 42.0937 KOps/s 41.4829 KOps/s $\color{#35bf28}+1.47\%$
test_step_mdp_speed[False-True-False-False-False] 44.5030μs 15.0758μs 66.3316 KOps/s 65.2148 KOps/s $\color{#35bf28}+1.71\%$
test_step_mdp_speed[False-False-True-True-True] 83.3950μs 37.1674μs 26.9053 KOps/s 26.2121 KOps/s $\color{#35bf28}+2.64\%$
test_step_mdp_speed[False-False-True-True-False] 68.6480μs 24.7612μs 40.3858 KOps/s 39.3113 KOps/s $\color{#35bf28}+2.73\%$
test_step_mdp_speed[False-False-True-False-True] 54.5210μs 23.2405μs 43.0284 KOps/s 41.6086 KOps/s $\color{#35bf28}+3.41\%$
test_step_mdp_speed[False-False-True-False-False] 63.4790μs 15.0751μs 66.3345 KOps/s 64.9892 KOps/s $\color{#35bf28}+2.07\%$
test_step_mdp_speed[False-False-False-True-True] 81.2110μs 38.1602μs 26.2053 KOps/s 25.0647 KOps/s $\color{#35bf28}+4.55\%$
test_step_mdp_speed[False-False-False-True-False] 75.7510μs 26.6050μs 37.5870 KOps/s 37.4793 KOps/s $\color{#35bf28}+0.29\%$
test_step_mdp_speed[False-False-False-False-True] 74.4160μs 25.2264μs 39.6410 KOps/s 39.3019 KOps/s $\color{#35bf28}+0.86\%$
test_step_mdp_speed[False-False-False-False-False] 76.8030μs 16.6656μs 60.0038 KOps/s 58.8961 KOps/s $\color{#35bf28}+1.88\%$
test_values[generalized_advantage_estimate-True-True] 11.4737ms 9.9273ms 100.7326 Ops/s 98.3392 Ops/s $\color{#35bf28}+2.43\%$
test_values[vec_generalized_advantage_estimate-True-True] 26.1351ms 23.7788ms 42.0542 Ops/s 37.8606 Ops/s $\textbf{\color{#35bf28}+11.08\%}$
test_values[td0_return_estimate-False-False] 0.2339ms 0.1766ms 5.6613 KOps/s 5.2885 KOps/s $\textbf{\color{#35bf28}+7.05\%}$
test_values[td1_return_estimate-False-False] 26.2026ms 24.7141ms 40.4627 Ops/s 39.7427 Ops/s $\color{#35bf28}+1.81\%$
test_values[vec_td1_return_estimate-False-False] 25.9435ms 23.8213ms 41.9792 Ops/s 37.4614 Ops/s $\textbf{\color{#35bf28}+12.06\%}$
test_values[td_lambda_return_estimate-True-False] 37.9495ms 35.2126ms 28.3990 Ops/s 27.7186 Ops/s $\color{#35bf28}+2.45\%$
test_values[vec_td_lambda_return_estimate-True-False] 25.8590ms 23.6017ms 42.3699 Ops/s 37.3763 Ops/s $\textbf{\color{#35bf28}+13.36\%}$
test_gae_speed[generalized_advantage_estimate-False-1-512] 10.4607ms 8.6759ms 115.2614 Ops/s 115.3388 Ops/s $\color{#d91a1a}-0.07\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.4118ms 1.9228ms 520.0712 Ops/s 505.7994 Ops/s $\color{#35bf28}+2.82\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.4383ms 0.3586ms 2.7889 KOps/s 2.7420 KOps/s $\color{#35bf28}+1.71\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 40.0483ms 38.2045ms 26.1749 Ops/s 22.8692 Ops/s $\textbf{\color{#35bf28}+14.46\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 6.1073ms 3.4029ms 293.8686 Ops/s 288.0446 Ops/s $\color{#35bf28}+2.02\%$
test_dqn_speed[False-None] 5.8363ms 1.3715ms 729.1438 Ops/s 702.4553 Ops/s $\color{#35bf28}+3.80\%$
test_dqn_speed[False-backward] 1.9628ms 1.7978ms 556.2265 Ops/s 494.2204 Ops/s $\textbf{\color{#35bf28}+12.55\%}$
test_dqn_speed[True-None] 0.6657ms 0.4710ms 2.1230 KOps/s 2.0575 KOps/s $\color{#35bf28}+3.19\%$
test_dqn_speed[True-backward] 0.9710ms 0.8901ms 1.1234 KOps/s 1.0809 KOps/s $\color{#35bf28}+3.93\%$
test_dqn_speed[reduce-overhead-None] 0.6752ms 0.4733ms 2.1130 KOps/s 2.0210 KOps/s $\color{#35bf28}+4.55\%$
test_dqn_speed[reduce-overhead-backward] 0.9090ms 0.8833ms 1.1321 KOps/s 1.0879 KOps/s $\color{#35bf28}+4.06\%$
test_ddpg_speed[False-None] 3.5714ms 2.8134ms 355.4453 Ops/s 340.3457 Ops/s $\color{#35bf28}+4.44\%$
test_ddpg_speed[False-backward] 5.8859ms 4.0719ms 245.5883 Ops/s 244.5467 Ops/s $\color{#35bf28}+0.43\%$
test_ddpg_speed[True-None] 1.7869ms 1.2025ms 831.6014 Ops/s 801.3074 Ops/s $\color{#35bf28}+3.78\%$
test_ddpg_speed[True-backward] 2.1337ms 2.0823ms 480.2426 Ops/s 459.3448 Ops/s $\color{#35bf28}+4.55\%$
test_ddpg_speed[reduce-overhead-None] 1.3736ms 1.2046ms 830.1198 Ops/s 798.5232 Ops/s $\color{#35bf28}+3.96\%$
test_ddpg_speed[reduce-overhead-backward] 2.1442ms 2.0859ms 479.4014 Ops/s 460.2175 Ops/s $\color{#35bf28}+4.17\%$
test_sac_speed[False-None] 8.5441ms 7.8425ms 127.5107 Ops/s 120.6183 Ops/s $\textbf{\color{#35bf28}+5.71\%}$
test_sac_speed[False-backward] 11.0182ms 10.6000ms 94.3393 Ops/s 89.8334 Ops/s $\textbf{\color{#35bf28}+5.02\%}$
test_sac_speed[True-None] 2.7164ms 2.0704ms 482.9942 Ops/s 464.6057 Ops/s $\color{#35bf28}+3.96\%$
test_sac_speed[True-backward] 3.8617ms 3.6568ms 273.4643 Ops/s 257.2688 Ops/s $\textbf{\color{#35bf28}+6.30\%}$
test_sac_speed[reduce-overhead-None] 2.3496ms 2.0719ms 482.6544 Ops/s 461.8493 Ops/s $\color{#35bf28}+4.50\%$
test_sac_speed[reduce-overhead-backward] 3.8584ms 3.8017ms 263.0401 Ops/s 258.3322 Ops/s $\color{#35bf28}+1.82\%$
test_redq_speed[False-None] 14.2609ms 13.2760ms 75.3239 Ops/s 73.7989 Ops/s $\color{#35bf28}+2.07\%$
test_redq_speed[False-backward] 24.4285ms 22.2579ms 44.9279 Ops/s 43.0226 Ops/s $\color{#35bf28}+4.43\%$
test_redq_speed[True-None] 6.1308ms 4.8162ms 207.6310 Ops/s 195.5878 Ops/s $\textbf{\color{#35bf28}+6.16\%}$
test_redq_speed[True-backward] 13.5893ms 12.4635ms 80.2342 Ops/s 75.0209 Ops/s $\textbf{\color{#35bf28}+6.95\%}$
test_redq_speed[reduce-overhead-None] 5.8441ms 4.7420ms 210.8815 Ops/s 190.6437 Ops/s $\textbf{\color{#35bf28}+10.62\%}$
test_redq_speed[reduce-overhead-backward] 13.2904ms 12.5845ms 79.4627 Ops/s 72.5300 Ops/s $\textbf{\color{#35bf28}+9.56\%}$
test_redq_deprec_speed[False-None] 21.8954ms 12.9881ms 76.9935 Ops/s 70.4230 Ops/s $\textbf{\color{#35bf28}+9.33\%}$
test_redq_deprec_speed[False-backward] 21.0991ms 18.3637ms 54.4552 Ops/s 50.2185 Ops/s $\textbf{\color{#35bf28}+8.44\%}$
test_redq_deprec_speed[True-None] 4.3672ms 3.7986ms 263.2554 Ops/s 252.4313 Ops/s $\color{#35bf28}+4.29\%$
test_redq_deprec_speed[True-backward] 9.6157ms 8.8912ms 112.4709 Ops/s 114.6193 Ops/s $\color{#d91a1a}-1.87\%$
test_redq_deprec_speed[reduce-overhead-None] 4.2345ms 3.7932ms 263.6270 Ops/s 256.5345 Ops/s $\color{#35bf28}+2.76\%$
test_redq_deprec_speed[reduce-overhead-backward] 9.2555ms 8.1122ms 123.2704 Ops/s 117.6078 Ops/s $\color{#35bf28}+4.81\%$
test_td3_speed[False-None] 8.3804ms 7.9105ms 126.4144 Ops/s 121.8477 Ops/s $\color{#35bf28}+3.75\%$
test_td3_speed[False-backward] 11.2274ms 10.5886ms 94.4410 Ops/s 92.3180 Ops/s $\color{#35bf28}+2.30\%$
test_td3_speed[True-None] 1.9395ms 1.7545ms 569.9591 Ops/s 541.6924 Ops/s $\textbf{\color{#35bf28}+5.22\%}$
test_td3_speed[True-backward] 3.4295ms 3.3273ms 300.5450 Ops/s 290.7584 Ops/s $\color{#35bf28}+3.37\%$
test_td3_speed[reduce-overhead-None] 1.9398ms 1.7431ms 573.6971 Ops/s 532.9649 Ops/s $\textbf{\color{#35bf28}+7.64\%}$
test_td3_speed[reduce-overhead-backward] 3.3765ms 3.3234ms 300.8944 Ops/s 288.9115 Ops/s $\color{#35bf28}+4.15\%$
test_cql_speed[False-None] 38.2115ms 35.9678ms 27.8027 Ops/s 27.1469 Ops/s $\color{#35bf28}+2.42\%$
test_cql_speed[False-backward] 56.5996ms 46.3481ms 21.5758 Ops/s 20.9351 Ops/s $\color{#35bf28}+3.06\%$
test_cql_speed[True-None] 17.1171ms 15.6022ms 64.0935 Ops/s 61.4701 Ops/s $\color{#35bf28}+4.27\%$
test_cql_speed[True-backward] 23.9265ms 22.4226ms 44.5978 Ops/s 44.6315 Ops/s $\color{#d91a1a}-0.08\%$
test_cql_speed[reduce-overhead-None] 17.2033ms 16.2386ms 61.5816 Ops/s 62.7387 Ops/s $\color{#d91a1a}-1.84\%$
test_cql_speed[reduce-overhead-backward] 24.0505ms 22.9193ms 43.6313 Ops/s 44.1385 Ops/s $\color{#d91a1a}-1.15\%$
test_a2c_speed[False-None] 8.0252ms 7.2446ms 138.0333 Ops/s 137.6785 Ops/s $\color{#35bf28}+0.26\%$
test_a2c_speed[False-backward] 16.0550ms 14.5071ms 68.9315 Ops/s 70.3090 Ops/s $\color{#d91a1a}-1.96\%$
test_a2c_speed[True-None] 4.0301ms 3.6938ms 270.7224 Ops/s 264.0574 Ops/s $\color{#35bf28}+2.52\%$
test_a2c_speed[True-backward] 11.3707ms 10.2580ms 97.4847 Ops/s 99.5490 Ops/s $\color{#d91a1a}-2.07\%$
test_a2c_speed[reduce-overhead-None] 4.3363ms 3.6945ms 270.6738 Ops/s 265.8888 Ops/s $\color{#35bf28}+1.80\%$
test_a2c_speed[reduce-overhead-backward] 11.5562ms 10.4238ms 95.9343 Ops/s 94.2815 Ops/s $\color{#35bf28}+1.75\%$
test_ppo_speed[False-None] 8.3358ms 7.7413ms 129.1771 Ops/s 131.5276 Ops/s $\color{#d91a1a}-1.79\%$
test_ppo_speed[False-backward] 16.1681ms 15.2918ms 65.3946 Ops/s 67.9459 Ops/s $\color{#d91a1a}-3.75\%$
test_ppo_speed[True-None] 4.9752ms 4.1180ms 242.8363 Ops/s 249.6087 Ops/s $\color{#d91a1a}-2.71\%$
test_ppo_speed[True-backward] 10.7724ms 9.7997ms 102.0439 Ops/s 100.6081 Ops/s $\color{#35bf28}+1.43\%$
test_ppo_speed[reduce-overhead-None] 4.5466ms 4.0243ms 248.4880 Ops/s 243.3015 Ops/s $\color{#35bf28}+2.13\%$
test_ppo_speed[reduce-overhead-backward] 10.1620ms 9.7080ms 103.0079 Ops/s 100.6798 Ops/s $\color{#35bf28}+2.31\%$
test_reinforce_speed[False-None] 7.2622ms 6.4215ms 155.7276 Ops/s 150.0031 Ops/s $\color{#35bf28}+3.82\%$
test_reinforce_speed[False-backward] 10.0216ms 9.7093ms 102.9938 Ops/s 102.1383 Ops/s $\color{#35bf28}+0.84\%$
test_reinforce_speed[True-None] 4.4548ms 3.0048ms 332.8022 Ops/s 327.7918 Ops/s $\color{#35bf28}+1.53\%$
test_reinforce_speed[True-backward] 9.6299ms 8.8839ms 112.5634 Ops/s 111.2535 Ops/s $\color{#35bf28}+1.18\%$
test_reinforce_speed[reduce-overhead-None] 4.1229ms 3.0469ms 328.2019 Ops/s 326.1207 Ops/s $\color{#35bf28}+0.64\%$
test_reinforce_speed[reduce-overhead-backward] 9.7664ms 8.8130ms 113.4693 Ops/s 116.7472 Ops/s $\color{#d91a1a}-2.81\%$
test_iql_speed[False-None] 32.8721ms 31.8482ms 31.3990 Ops/s 30.7339 Ops/s $\color{#35bf28}+2.16\%$
test_iql_speed[False-backward] 46.9000ms 44.8536ms 22.2947 Ops/s 21.8098 Ops/s $\color{#35bf28}+2.22\%$
test_iql_speed[True-None] 12.1253ms 11.0298ms 90.6636 Ops/s 89.2252 Ops/s $\color{#35bf28}+1.61\%$
test_iql_speed[True-backward] 23.6948ms 22.0947ms 45.2597 Ops/s 44.4527 Ops/s $\color{#35bf28}+1.82\%$
test_iql_speed[reduce-overhead-None] 11.6895ms 11.0031ms 90.8835 Ops/s 87.5244 Ops/s $\color{#35bf28}+3.84\%$
test_iql_speed[reduce-overhead-backward] 26.6375ms 21.7179ms 46.0449 Ops/s 45.5421 Ops/s $\color{#35bf28}+1.10\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.9870ms 4.7709ms 209.6041 Ops/s 206.7335 Ops/s $\color{#35bf28}+1.39\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.9516ms 0.5156ms 1.9394 KOps/s 1.9240 KOps/s $\color{#35bf28}+0.80\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.8909ms 0.4924ms 2.0309 KOps/s 2.0557 KOps/s $\color{#d91a1a}-1.21\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 7.1169ms 4.5602ms 219.2900 Ops/s 211.4604 Ops/s $\color{#35bf28}+3.70\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.2147ms 0.4985ms 2.0060 KOps/s 1.9705 KOps/s $\color{#35bf28}+1.80\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.7930ms 0.4802ms 2.0826 KOps/s 1.9649 KOps/s $\textbf{\color{#35bf28}+5.99\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 2.3863ms 1.6516ms 605.4654 Ops/s 603.3908 Ops/s $\color{#35bf28}+0.34\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 2.2047ms 1.5601ms 640.9724 Ops/s 640.5607 Ops/s $\color{#35bf28}+0.06\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 7.2301ms 4.7383ms 211.0481 Ops/s 204.0253 Ops/s $\color{#35bf28}+3.44\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.0400ms 0.6442ms 1.5523 KOps/s 1.5297 KOps/s $\color{#35bf28}+1.47\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.9379ms 0.6221ms 1.6075 KOps/s 1.5943 KOps/s $\color{#35bf28}+0.83\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.1341ms 4.5958ms 217.5910 Ops/s 216.3944 Ops/s $\color{#35bf28}+0.55\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.1710ms 0.5137ms 1.9467 KOps/s 1.9747 KOps/s $\color{#d91a1a}-1.41\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.8563ms 0.4876ms 2.0509 KOps/s 2.0209 KOps/s $\color{#35bf28}+1.49\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 4.8572ms 4.4708ms 223.6720 Ops/s 219.7085 Ops/s $\color{#35bf28}+1.80\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.5777ms 0.5011ms 1.9955 KOps/s 2.0057 KOps/s $\color{#d91a1a}-0.51\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.8065ms 0.4858ms 2.0583 KOps/s 2.0992 KOps/s $\color{#d91a1a}-1.95\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 4.8423ms 4.6300ms 215.9830 Ops/s 209.0774 Ops/s $\color{#35bf28}+3.30\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.4239s 1.2324ms 811.4574 Ops/s 1.5304 KOps/s $\textbf{\color{#d91a1a}-46.98\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.9960ms 0.6281ms 1.5922 KOps/s 1.6065 KOps/s $\color{#d91a1a}-0.89\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 5.4046ms 4.1291ms 242.1827 Ops/s 246.8512 Ops/s $\color{#d91a1a}-1.89\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 3.2651ms 2.1626ms 462.4063 Ops/s 399.6599 Ops/s $\textbf{\color{#35bf28}+15.70\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 6.5022ms 1.3460ms 742.9619 Ops/s 738.6173 Ops/s $\color{#35bf28}+0.59\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 6.9669ms 4.2166ms 237.1563 Ops/s 34.7059 Ops/s $\textbf{\color{#35bf28}+583.33\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 0.3652s 9.5798ms 104.3861 Ops/s 405.4902 Ops/s $\textbf{\color{#d91a1a}-74.26\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 5.0470ms 1.2999ms 769.2853 Ops/s 765.8278 Ops/s $\color{#35bf28}+0.45\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 5.5440ms 4.3188ms 231.5438 Ops/s 220.2945 Ops/s $\textbf{\color{#35bf28}+5.11\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 8.0573ms 2.5771ms 388.0331 Ops/s 403.8816 Ops/s $\color{#d91a1a}-3.92\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 4.2526ms 1.4837ms 673.9992 Ops/s 543.7319 Ops/s $\textbf{\color{#35bf28}+23.96\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 11.7642ms 11.3782ms 87.8877 Ops/s 83.0002 Ops/s $\textbf{\color{#35bf28}+5.89\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 15.3066ms 14.1071ms 70.8863 Ops/s 71.1650 Ops/s $\color{#d91a1a}-0.39\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 22.1093ms 20.3026ms 49.2547 Ops/s 47.5573 Ops/s $\color{#35bf28}+3.57\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 15.3719ms 14.2867ms 69.9950 Ops/s 69.9574 Ops/s $\color{#35bf28}+0.05\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 20.8418ms 20.0986ms 49.7548 Ops/s 49.2738 Ops/s $\color{#35bf28}+0.98\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 16.4296ms 15.4354ms 64.7861 Ops/s 64.0706 Ops/s $\color{#35bf28}+1.12\%$
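
For reading the table: the Change column looks consistent with the relative difference between the Ops column and the Ops on Repo HEAD column. That is inferred from the numbers shown here, not from the benchmark tool's source, so treat the following as a rough check. `percent_change` is a hypothetical helper, and the inputs are the rounded values printed in the rows above.

```python
def percent_change(ops, ops_head):
    """Relative difference of `ops` versus `ops_head`, in percent."""
    return (ops / ops_head - 1.0) * 100.0


# (Ops, Ops on Repo HEAD) copied from three rows of the CPU table; both in Ops/s.
rows = {
    "test_simple": (2.1873, 2.1756),
    "test_rb_populate[...ListStorage-SamplerWithoutReplacement-400]": (237.1563, 34.7059),
    "test_rb_populate[...LazyMemmapStorage-SamplerWithoutReplacement-400]": (104.3861, 405.4902),
}

changes = {name: percent_change(ops, head) for name, (ops, head) in rows.items()}

for name, change in changes.items():
    print(f"{name}: {change:+.2f}%")
```

Rows whose Max is far above their Mean (for example the two `test_rb_populate` rows above) probably reflect a few outlier runs, so large swings there deserve a rerun before being read as real regressions or gains.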


github-actions bot commented Feb 3, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}20$. Worsened: $\large\color{#d91a1a}7$.

Name Max Mean Ops Ops on Repo HEAD Change
test_simple 0.8172s 0.7251s 1.3792 Ops/s 1.3833 Ops/s $\color{#d91a1a}-0.30\%$
test_transformed 1.3795s 1.2884s 0.7762 Ops/s 0.7809 Ops/s $\color{#d91a1a}-0.61\%$
test_serial 2.1178s 2.0899s 0.4785 Ops/s 0.4714 Ops/s $\color{#35bf28}+1.50\%$
test_parallel 1.7925s 1.7704s 0.5648 Ops/s 0.5489 Ops/s $\color{#35bf28}+2.91\%$
test_step_mdp_speed[True-True-True-True-True] 0.2518ms 37.5630μs 26.6219 KOps/s 26.2708 KOps/s $\color{#35bf28}+1.34\%$
test_step_mdp_speed[True-True-True-True-False] 0.1834ms 22.1458μs 45.1553 KOps/s 44.9412 KOps/s $\color{#35bf28}+0.48\%$
test_step_mdp_speed[True-True-True-False-True] 0.2034ms 21.1679μs 47.2413 KOps/s 46.7968 KOps/s $\color{#35bf28}+0.95\%$
test_step_mdp_speed[True-True-True-False-False] 0.2016ms 12.3301μs 81.1021 KOps/s 79.8097 KOps/s $\color{#35bf28}+1.62\%$
test_step_mdp_speed[True-True-False-True-True] 0.2334ms 41.3258μs 24.1979 KOps/s 24.9788 KOps/s $\color{#d91a1a}-3.13\%$
test_step_mdp_speed[True-True-False-True-False] 68.3810μs 24.2255μs 41.2788 KOps/s 41.0169 KOps/s $\color{#35bf28}+0.64\%$
test_step_mdp_speed[True-True-False-False-True] 0.1246ms 23.5144μs 42.5272 KOps/s 42.9827 KOps/s $\color{#d91a1a}-1.06\%$
test_step_mdp_speed[True-True-False-False-False] 71.1420μs 14.3546μs 69.6643 KOps/s 68.3085 KOps/s $\color{#35bf28}+1.98\%$
test_step_mdp_speed[True-False-True-True-True] 0.1395ms 42.3615μs 23.6064 KOps/s 23.3460 KOps/s $\color{#35bf28}+1.12\%$
test_step_mdp_speed[True-False-True-True-False] 83.4810μs 26.9222μs 37.1440 KOps/s 37.4513 KOps/s $\color{#d91a1a}-0.82\%$
test_step_mdp_speed[True-False-True-False-True] 49.7810μs 23.4667μs 42.6136 KOps/s 42.6793 KOps/s $\color{#d91a1a}-0.15\%$
test_step_mdp_speed[True-False-True-False-False] 0.1616ms 14.5917μs 68.5319 KOps/s 68.5231 KOps/s $\color{#35bf28}+0.01\%$
test_step_mdp_speed[True-False-False-True-True] 0.1052ms 44.4909μs 22.4765 KOps/s 22.2980 KOps/s $\color{#35bf28}+0.80\%$
test_step_mdp_speed[True-False-False-True-False] 60.5310μs 28.7759μs 34.7513 KOps/s 34.6445 KOps/s $\color{#35bf28}+0.31\%$
test_step_mdp_speed[True-False-False-False-True] 0.1201ms 24.5798μs 40.6838 KOps/s 39.0446 KOps/s $\color{#35bf28}+4.20\%$
test_step_mdp_speed[True-False-False-False-False] 0.1980ms 16.5516μs 60.4170 KOps/s 59.5810 KOps/s $\color{#35bf28}+1.40\%$
test_step_mdp_speed[False-True-True-True-True] 0.2641ms 41.7423μs 23.9565 KOps/s 23.4457 KOps/s $\color{#35bf28}+2.18\%$
test_step_mdp_speed[False-True-True-True-False] 62.4010μs 26.6977μs 37.4564 KOps/s 37.8669 KOps/s $\color{#d91a1a}-1.08\%$
test_step_mdp_speed[False-True-True-False-True] 2.9496ms 27.3905μs 36.5090 KOps/s 36.1520 KOps/s $\color{#35bf28}+0.99\%$
test_step_mdp_speed[False-True-True-False-False] 57.3310μs 16.4441μs 60.8122 KOps/s 61.4903 KOps/s $\color{#d91a1a}-1.10\%$
test_step_mdp_speed[False-True-False-True-True] 0.1471ms 45.5189μs 21.9689 KOps/s 22.4644 KOps/s $\color{#d91a1a}-2.21\%$
test_step_mdp_speed[False-True-False-True-False] 65.4210μs 29.1788μs 34.2715 KOps/s 34.8928 KOps/s $\color{#d91a1a}-1.78\%$
test_step_mdp_speed[False-True-False-False-True] 79.6420μs 29.0778μs 34.3905 KOps/s 34.1168 KOps/s $\color{#35bf28}+0.80\%$
test_step_mdp_speed[False-True-False-False-False] 84.2110μs 17.6944μs 56.5150 KOps/s 53.8394 KOps/s $\color{#35bf28}+4.97\%$
test_step_mdp_speed[False-False-True-True-True] 73.7510μs 46.9310μs 21.3079 KOps/s 21.4017 KOps/s $\color{#d91a1a}-0.44\%$
test_step_mdp_speed[False-False-True-True-False] 66.3020μs 31.3713μs 31.8763 KOps/s 32.1558 KOps/s $\color{#d91a1a}-0.87\%$
test_step_mdp_speed[False-False-True-False-True] 59.2810μs 29.0934μs 34.3720 KOps/s 34.1845 KOps/s $\color{#35bf28}+0.55\%$
test_step_mdp_speed[False-False-True-False-False] 45.3610μs 18.6295μs 53.6784 KOps/s 54.2272 KOps/s $\color{#d91a1a}-1.01\%$
test_step_mdp_speed[False-False-False-True-True] 0.1281ms 49.3231μs 20.2745 KOps/s 20.8053 KOps/s $\color{#d91a1a}-2.55\%$
test_step_mdp_speed[False-False-False-True-False] 59.7010μs 33.4738μs 29.8741 KOps/s 30.0829 KOps/s $\color{#d91a1a}-0.69\%$
test_step_mdp_speed[False-False-False-False-True] 67.7910μs 31.0489μs 32.2073 KOps/s 32.4229 KOps/s $\color{#d91a1a}-0.67\%$
test_step_mdp_speed[False-False-False-False-False] 71.8210μs 20.0905μs 49.7747 KOps/s 49.0153 KOps/s $\color{#35bf28}+1.55\%$
test_values[generalized_advantage_estimate-True-True] 24.7141ms 24.2135ms 41.2993 Ops/s 40.8422 Ops/s $\color{#35bf28}+1.12\%$
test_values[vec_generalized_advantage_estimate-True-True] 0.1043s 2.9762ms 336.0021 Ops/s 333.3675 Ops/s $\color{#35bf28}+0.79\%$
test_values[td0_return_estimate-False-False] 0.1028ms 76.9824μs 12.9900 KOps/s 13.1163 KOps/s $\color{#d91a1a}-0.96\%$
test_values[td1_return_estimate-False-False] 54.4968ms 53.9173ms 18.5469 Ops/s 18.4481 Ops/s $\color{#35bf28}+0.54\%$
test_values[vec_td1_return_estimate-False-False] 1.2649ms 1.0668ms 937.3688 Ops/s 941.2106 Ops/s $\color{#d91a1a}-0.41\%$
test_values[td_lambda_return_estimate-True-False] 85.4329ms 84.4750ms 11.8378 Ops/s 11.6209 Ops/s $\color{#35bf28}+1.87\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.3649ms 1.0664ms 937.7763 Ops/s 940.8227 Ops/s $\color{#d91a1a}-0.32\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 24.6740ms 24.0806ms 41.5273 Ops/s 41.2462 Ops/s $\color{#35bf28}+0.68\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0388ms 0.7368ms 1.3572 KOps/s 1.3627 KOps/s $\color{#d91a1a}-0.40\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.7829ms 0.6533ms 1.5307 KOps/s 1.5351 KOps/s $\color{#d91a1a}-0.28\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.6136ms 1.4662ms 682.0469 Ops/s 679.5105 Ops/s $\color{#35bf28}+0.37\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.8340ms 0.6648ms 1.5041 KOps/s 1.5003 KOps/s $\color{#35bf28}+0.26\%$
test_dqn_speed[False-None] 1.6397ms 1.4601ms 684.8623 Ops/s 684.9954 Ops/s $\color{#d91a1a}-0.02\%$
test_dqn_speed[False-backward] 8.3973ms 2.2174ms 450.9749 Ops/s 476.5799 Ops/s $\textbf{\color{#d91a1a}-5.37\%}$
test_dqn_speed[True-None] 0.7562ms 0.5493ms 1.8206 KOps/s 1.8340 KOps/s $\color{#d91a1a}-0.73\%$
test_dqn_speed[True-backward] 1.2548ms 1.2046ms 830.1764 Ops/s 810.3201 Ops/s $\color{#35bf28}+2.45\%$
test_dqn_speed[reduce-overhead-None] 0.7543ms 0.5600ms 1.7857 KOps/s 1.7965 KOps/s $\color{#d91a1a}-0.60\%$
test_dqn_speed[reduce-overhead-backward] 1.1690ms 1.0795ms 926.3736 Ops/s 947.3650 Ops/s $\color{#d91a1a}-2.22\%$
test_ddpg_speed[False-None] 3.1450ms 2.8125ms 355.5554 Ops/s 359.5885 Ops/s $\color{#d91a1a}-1.12\%$
test_ddpg_speed[False-backward] 4.4272ms 4.1694ms 239.8438 Ops/s 240.1006 Ops/s $\color{#d91a1a}-0.11\%$
test_ddpg_speed[True-None] 1.5469ms 1.3094ms 763.7345 Ops/s 767.1423 Ops/s $\color{#d91a1a}-0.44\%$
test_ddpg_speed[True-backward] 2.5692ms 2.3909ms 418.2466 Ops/s 393.7191 Ops/s $\textbf{\color{#35bf28}+6.23\%}$
test_ddpg_speed[reduce-overhead-None] 1.5145ms 1.3162ms 759.7417 Ops/s 731.7025 Ops/s $\color{#35bf28}+3.83\%$
test_ddpg_speed[reduce-overhead-backward] 2.0280ms 1.8723ms 534.0999 Ops/s 502.1332 Ops/s $\textbf{\color{#35bf28}+6.37\%}$
test_sac_speed[False-None] 8.6181ms 8.1195ms 123.1596 Ops/s 125.9813 Ops/s $\color{#d91a1a}-2.24\%$
test_sac_speed[False-backward] 11.5757ms 10.9945ms 90.9548 Ops/s 89.8603 Ops/s $\color{#35bf28}+1.22\%$
test_sac_speed[True-None] 2.2883ms 1.8158ms 550.7300 Ops/s 552.5517 Ops/s $\color{#d91a1a}-0.33\%$
test_sac_speed[True-backward] 3.9256ms 3.6269ms 275.7141 Ops/s 274.7669 Ops/s $\color{#35bf28}+0.34\%$
test_sac_speed[reduce-overhead-None] 21.1556ms 11.9022ms 84.0180 Ops/s 84.4872 Ops/s $\color{#d91a1a}-0.56\%$
test_sac_speed[reduce-overhead-backward] 1.8046ms 1.6817ms 594.6490 Ops/s 554.7657 Ops/s $\textbf{\color{#35bf28}+7.19\%}$
test_redq_speed[False-None] 7.8111ms 7.3399ms 136.2416 Ops/s 135.3839 Ops/s $\color{#35bf28}+0.63\%$
test_redq_speed[False-backward] 11.7532ms 11.1602ms 89.6042 Ops/s 86.3778 Ops/s $\color{#35bf28}+3.74\%$
test_redq_speed[True-None] 2.4189ms 2.2246ms 449.5189 Ops/s 443.9333 Ops/s $\color{#35bf28}+1.26\%$
test_redq_speed[True-backward] 4.1210ms 3.8979ms 256.5493 Ops/s 243.3884 Ops/s $\textbf{\color{#35bf28}+5.41\%}$
test_redq_speed[reduce-overhead-None] 2.5465ms 2.2436ms 445.7037 Ops/s 433.4613 Ops/s $\color{#35bf28}+2.82\%$
test_redq_speed[reduce-overhead-backward] 4.0168ms 3.8935ms 256.8407 Ops/s 250.7750 Ops/s $\color{#35bf28}+2.42\%$
test_redq_deprec_speed[False-None] 9.3842ms 8.9012ms 112.3445 Ops/s 111.3027 Ops/s $\color{#35bf28}+0.94\%$
test_redq_deprec_speed[False-backward] 12.6072ms 11.8453ms 84.4217 Ops/s 84.2094 Ops/s $\color{#35bf28}+0.25\%$
test_redq_deprec_speed[True-None] 2.9103ms 2.5642ms 389.9912 Ops/s 383.6772 Ops/s $\color{#35bf28}+1.65\%$
test_redq_deprec_speed[True-backward] 4.7006ms 4.4251ms 225.9829 Ops/s 223.6979 Ops/s $\color{#35bf28}+1.02\%$
test_redq_deprec_speed[reduce-overhead-None] 2.8638ms 2.5562ms 391.1987 Ops/s 369.7957 Ops/s $\textbf{\color{#35bf28}+5.79\%}$
test_redq_deprec_speed[reduce-overhead-backward] 4.6062ms 4.3431ms 230.2511 Ops/s 227.2308 Ops/s $\color{#35bf28}+1.33\%$
test_td3_speed[False-None] 8.1256ms 7.8128ms 127.9950 Ops/s 127.9262 Ops/s $\color{#35bf28}+0.05\%$
test_td3_speed[False-backward] 11.0667ms 10.3338ms 96.7699 Ops/s 94.4554 Ops/s $\color{#35bf28}+2.45\%$
test_td3_speed[True-None] 1.6406ms 1.6025ms 624.0120 Ops/s 602.6840 Ops/s $\color{#35bf28}+3.54\%$
test_td3_speed[True-backward] 3.5927ms 3.2570ms 307.0344 Ops/s 304.6199 Ops/s $\color{#35bf28}+0.79\%$
test_td3_speed[reduce-overhead-None] 54.6115ms 25.2670ms 39.5774 Ops/s 37.8291 Ops/s $\color{#35bf28}+4.62\%$
test_td3_speed[reduce-overhead-backward] 1.6565ms 1.4901ms 671.1014 Ops/s 661.3626 Ops/s $\color{#35bf28}+1.47\%$
test_cql_speed[False-None] 16.7505ms 16.3300ms 61.2370 Ops/s 60.7781 Ops/s $\color{#35bf28}+0.76\%$
test_cql_speed[False-backward] 22.7060ms 21.7904ms 45.8918 Ops/s 45.5263 Ops/s $\color{#35bf28}+0.80\%$
test_cql_speed[True-None] 3.6648ms 3.3109ms 302.0297 Ops/s 311.9028 Ops/s $\color{#d91a1a}-3.17\%$
test_cql_speed[True-backward] 5.6981ms 5.3670ms 186.3233 Ops/s 186.8464 Ops/s $\color{#d91a1a}-0.28\%$
test_cql_speed[reduce-overhead-None] 21.6321ms 13.1089ms 76.2839 Ops/s 58.2004 Ops/s $\textbf{\color{#35bf28}+31.07\%}$
test_cql_speed[reduce-overhead-backward] 2.1038ms 1.7843ms 560.4367 Ops/s 545.3497 Ops/s $\color{#35bf28}+2.77\%$
test_a2c_speed[False-None] 3.3551ms 3.1152ms 321.0110 Ops/s 318.2558 Ops/s $\color{#35bf28}+0.87\%$
test_a2c_speed[False-backward] 6.5982ms 6.0284ms 165.8817 Ops/s 166.1581 Ops/s $\color{#d91a1a}-0.17\%$
test_a2c_speed[True-None] 1.4967ms 1.3162ms 759.7673 Ops/s 740.4702 Ops/s $\color{#35bf28}+2.61\%$
test_a2c_speed[True-backward] 3.0204ms 2.8100ms 355.8718 Ops/s 333.7884 Ops/s $\textbf{\color{#35bf28}+6.62\%}$
test_a2c_speed[reduce-overhead-None] 15.5844ms 8.7859ms 113.8185 Ops/s 116.2041 Ops/s $\color{#d91a1a}-2.05\%$
test_a2c_speed[reduce-overhead-backward] 1.5667ms 1.4315ms 698.5524 Ops/s 686.0584 Ops/s $\color{#35bf28}+1.82\%$
test_ppo_speed[False-None] 3.8843ms 3.6180ms 276.3952 Ops/s 276.0148 Ops/s $\color{#35bf28}+0.14\%$
test_ppo_speed[False-backward] 7.0899ms 6.6689ms 149.9505 Ops/s 149.7202 Ops/s $\color{#35bf28}+0.15\%$
test_ppo_speed[True-None] 1.5786ms 1.3659ms 732.1061 Ops/s 714.0477 Ops/s $\color{#35bf28}+2.53\%$
test_ppo_speed[True-backward] 3.1875ms 3.0088ms 332.3594 Ops/s 313.2139 Ops/s $\textbf{\color{#35bf28}+6.11\%}$
test_ppo_speed[reduce-overhead-None] 1.1136ms 0.9408ms 1.0629 KOps/s 1.0423 KOps/s $\color{#35bf28}+1.97\%$
test_ppo_speed[reduce-overhead-backward] 1.4552ms 1.3745ms 727.5246 Ops/s 631.6025 Ops/s $\textbf{\color{#35bf28}+15.19\%}$
test_reinforce_speed[False-None] 2.4729ms 2.2346ms 447.5156 Ops/s 446.9893 Ops/s $\color{#35bf28}+0.12\%$
test_reinforce_speed[False-backward] 3.8468ms 3.2438ms 308.2845 Ops/s 291.2132 Ops/s $\textbf{\color{#35bf28}+5.86\%}$
test_reinforce_speed[True-None] 1.4210ms 1.2569ms 795.5847 Ops/s 782.7913 Ops/s $\color{#35bf28}+1.63\%$
test_reinforce_speed[True-backward] 3.0408ms 2.8326ms 353.0354 Ops/s 335.8739 Ops/s $\textbf{\color{#35bf28}+5.11\%}$
test_reinforce_speed[reduce-overhead-None] 18.4065ms 9.8767ms 101.2487 Ops/s 103.3917 Ops/s $\color{#d91a1a}-2.07\%$
test_reinforce_speed[reduce-overhead-backward] 1.4826ms 1.4339ms 697.4082 Ops/s 613.4721 Ops/s $\textbf{\color{#35bf28}+13.68\%}$
test_iql_speed[False-None] 9.4867ms 9.0494ms 110.5048 Ops/s 109.1239 Ops/s $\color{#35bf28}+1.27\%$
test_iql_speed[False-backward] 12.9601ms 12.6150ms 79.2705 Ops/s 76.0725 Ops/s $\color{#35bf28}+4.20\%$
test_iql_speed[True-None] 2.3691ms 2.1806ms 458.5861 Ops/s 438.2439 Ops/s $\color{#35bf28}+4.64\%$
test_iql_speed[True-backward] 5.0505ms 4.6611ms 214.5425 Ops/s 206.6827 Ops/s $\color{#35bf28}+3.80\%$
test_iql_speed[reduce-overhead-None] 17.7977ms 10.6192ms 94.1689 Ops/s 90.6976 Ops/s $\color{#35bf28}+3.83\%$
test_iql_speed[reduce-overhead-backward] 2.0339ms 1.8574ms 538.3835 Ops/s 509.0294 Ops/s $\textbf{\color{#35bf28}+5.77\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 8.0165ms 6.1513ms 162.5668 Ops/s 160.1391 Ops/s $\color{#35bf28}+1.52\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.6415ms 0.3531ms 2.8321 KOps/s 2.8584 KOps/s $\color{#d91a1a}-0.92\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5971ms 0.3306ms 3.0250 KOps/s 3.0677 KOps/s $\color{#d91a1a}-1.39\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.3196ms 5.8781ms 170.1234 Ops/s 168.8652 Ops/s $\color{#35bf28}+0.75\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.3092ms 0.3658ms 2.7335 KOps/s 3.7622 KOps/s $\textbf{\color{#d91a1a}-27.34\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5834ms 0.3113ms 3.2125 KOps/s 3.0685 KOps/s $\color{#35bf28}+4.69\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.6593ms 1.4323ms 698.1735 Ops/s 778.1935 Ops/s $\textbf{\color{#d91a1a}-10.28\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.6363ms 1.3443ms 743.8922 Ops/s 832.3532 Ops/s $\textbf{\color{#d91a1a}-10.63\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.4484ms 6.0800ms 164.4724 Ops/s 160.6413 Ops/s $\color{#35bf28}+2.38\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.3577ms 0.4861ms 2.0572 KOps/s 2.0989 KOps/s $\color{#d91a1a}-1.99\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7464ms 0.4633ms 2.1586 KOps/s 2.3974 KOps/s $\textbf{\color{#d91a1a}-9.96\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.2079ms 5.9046ms 169.3585 Ops/s 165.8481 Ops/s $\color{#35bf28}+2.12\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.7962ms 0.3201ms 3.1238 KOps/s 2.9462 KOps/s $\textbf{\color{#35bf28}+6.03\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5394ms 0.3050ms 3.2782 KOps/s 3.3805 KOps/s $\color{#d91a1a}-3.03\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.3084ms 5.8605ms 170.6348 Ops/s 167.7409 Ops/s $\color{#35bf28}+1.73\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.6869ms 0.3189ms 3.1358 KOps/s 3.5965 KOps/s $\textbf{\color{#d91a1a}-12.81\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6369ms 0.2821ms 3.5442 KOps/s 3.7547 KOps/s $\textbf{\color{#d91a1a}-5.61\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.4166ms 6.0340ms 165.7276 Ops/s 161.1844 Ops/s $\color{#35bf28}+2.82\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.0432ms 0.4220ms 2.3694 KOps/s 2.2017 KOps/s $\textbf{\color{#35bf28}+7.62\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7034ms 0.4066ms 2.4594 KOps/s 2.2237 KOps/s $\textbf{\color{#35bf28}+10.60\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 6.9955ms 5.4923ms 182.0736 Ops/s 180.0290 Ops/s $\color{#35bf28}+1.14\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 10.1909ms 2.0928ms 477.8382 Ops/s 434.1396 Ops/s $\textbf{\color{#35bf28}+10.07\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 6.8940ms 1.2362ms 808.9145 Ops/s 780.7633 Ops/s $\color{#35bf28}+3.61\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 7.2478ms 5.5726ms 179.4509 Ops/s 180.2471 Ops/s $\color{#d91a1a}-0.44\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 10.4124ms 2.0690ms 483.3339 Ops/s 433.2961 Ops/s $\textbf{\color{#35bf28}+11.55\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 3.3338ms 1.1569ms 864.4057 Ops/s 824.6951 Ops/s $\color{#35bf28}+4.82\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.5381s 16.3806ms 61.0478 Ops/s 30.2359 Ops/s $\textbf{\color{#35bf28}+101.90\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 8.5130ms 2.1034ms 475.4115 Ops/s 449.2521 Ops/s $\textbf{\color{#35bf28}+5.82\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 8.5058ms 1.3488ms 741.3967 Ops/s 743.7767 Ops/s $\color{#d91a1a}-0.32\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 12.8552ms 12.5217ms 79.8613 Ops/s 76.2563 Ops/s $\color{#35bf28}+4.73\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 18.1636ms 16.5517ms 60.4167 Ops/s 58.5825 Ops/s $\color{#35bf28}+3.13\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 18.5164ms 17.4734ms 57.2299 Ops/s 56.1536 Ops/s $\color{#35bf28}+1.92\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 19.2142ms 17.1996ms 58.1408 Ops/s 57.6511 Ops/s $\color{#35bf28}+0.85\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 17.7133ms 17.2850ms 57.8535 Ops/s 55.1135 Ops/s $\color{#35bf28}+4.97\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 19.5861ms 18.2467ms 54.8046 Ops/s 53.4862 Ops/s $\color{#35bf28}+2.46\%$

Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. enhancement New feature or request