Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: Missing metrics when logging hyperparameters on tensorboard #1298

Open
4 tasks done
rogierz opened this issue Jan 26, 2023 · 4 comments · May be fixed by #1299
Open
4 tasks done

[Bug]: Missing metrics when logging hyperparameters on tensorboard #1298

rogierz opened this issue Jan 26, 2023 · 4 comments · May be fixed by #1299
Labels
bug Something isn't working

Comments

@rogierz
Copy link

rogierz commented Jan 26, 2023

🐛 Bug

When I try to log metrics related to some hyperparameters on tensorboard, the values of metrics are not stored.

To Reproduce

from stable_baselines3.common.logger import configure, HParam

tmp_path = "log_bug"
# set up logger
new_logger = configure(tmp_path, ["tensorboard"])
hp = HParam({"hparam": 1.0}, {"missing_metric": 2.0})

new_logger.record("hparams", hp)
new_logger.dump()

Relevant log output / Error message

No response

System Info

OS: Linux-5.15.79.1-microsoft-standard-WSL2-x86_64-with-glibc2.29 #1 SMP Wed Nov 23 01:01:46 UTC 2022
Python: 3.8.10
Stable-Baselines3: 1.6.2
PyTorch: 1.13.1+cu117
GPU Enabled: False
Numpy: 1.24.1
Gym: 0.21.0

Checklist

  • I have checked that there is no similar issue in the repo
  • I have read the documentation
  • I have provided a minimal working example to reproduce the bug
  • I've used the markdown code blocks for both code and stack traces.
@rogierz rogierz added the bug Something isn't working label Jan 26, 2023
@riccardosepe
Copy link

I have the same issue when trying to log hparams metrics

rogierz added a commit to rogierz/stable-baselines3 that referenced this issue Jan 26, 2023
Co-authored-by: Riccardo Sepe <[email protected]>
Co-authored-by: Francesco Scalera <[email protected]>
rogierz added a commit to rogierz/stable-baselines3 that referenced this issue Jan 26, 2023
rogierz added a commit to rogierz/stable-baselines3 that referenced this issue Jan 26, 2023
Co-authored-by: Riccardo Sepe <[email protected]>
Co-authored-by: Francesco Scalera <[email protected]>
@araffin
Copy link
Member

araffin commented Jan 26, 2023

@timothe-chaumont as you did the implementation in #984, could you have a look?

new_logger.dump()

I would expect dump(num_timesteps) there

@rogierz rogierz linked a pull request Jan 26, 2023 that will close this issue
16 tasks
rogierz added a commit to rogierz/stable-baselines3 that referenced this issue Feb 17, 2023
Co-authored-by: Riccardo Sepe <[email protected]>
Co-authored-by: Francesco Scalera <[email protected]>
@timothe-chaumont
Copy link
Contributor

You are right @rogierz, metric values that are passed to HParam through the metric_dict won't be saved. They are supposed to reference metrics that have been logged separately (otherwise they won't be displayed in HPARAMS).

In the documentation, the example mentions:

# define the metrics that will appear in the `HPARAMS` Tensorboard tab by referencing their tag
# Tensorbaord will find & display metrics from the `SCALARS` tab
metric_dict = {
    "rollout/ep_len_mean": 0,
    "train/value_loss": 0.0,
}


So in your example you would need to log your custom metric with

new_logger.record("missing_metric", 2.0)

so that, when referenced in HParam, TensorBoard will find it and add it to the HPARAMS tab:


Screenshot 2023-02-20 at 00 35 21

@kingjin94
Copy link

kingjin94 commented Jan 15, 2024

IMHO I might have an idea where this bug comes from: E.g. in HumanOutputFormat one iterates over all keys to output during logger.dump. But the iteration happens together with key_excluded. It is to note that nowhere is it ensured that a loggers.name_to_value and name_to_exclude are the same length (they are public and one might want to use them to do more specific logging then possible just with log_value/mean)..

I would suggest the following patch @

for (key, value), (_, excluded) in zip(sorted(key_values.items()), sorted(key_excluded.items())):
:

- for (key, value), (_, excluded) in zip(sorted(key_values.items()), sorted(key_excluded.items())):
+ for (key, value) in sorted(key_values.items():
+     excluded = key_excluded.get(key, ('',))

Similar changes are needed, e.g., for the TensorboardOutputFormat @

for (key, value), (_, excluded) in zip(sorted(key_values.items()), sorted(key_excluded.items())):

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants