Bug in catalyst/callbacks/backward.py when grad_clip_fn is set #1445

Closed · 9 of 10 tasks
AleksandrMinin opened this issue Apr 5, 2023 · 1 comment
Labels: bug (Something isn't working) · duplicate (This issue or pull request already exists) · help wanted (Extra attention is needed)

🐛 Bug Report

BackwardCallback in catalyst/callbacks/backward.py raises an AttributeError when grad_clip_fn is set.

How To Reproduce

Steps to reproduce the behavior:

  1. Create a BackwardCallback with a non-empty grad_clip_fn.
  2. Launch runner.train with this callback.
  3. Training fails with the following error:
/python_envs/kaggle-env/lib/python3.8/site-packages/catalyst/callbacks/backward.py:55                                                                                                 
   52 │   │   │   
   53 │   │   │   if self.grad_clip_fn is not None:
   54 │   │   │   │   runner.engine.unscale_gradients()
-->55 │   │   │   │   norm = self.grad_clip_fn(self.model.parameters())
   56 │   │   │   │   if self._log_gradient:
   57 │   │   │   │   │   runner.batch_metrics[f"{self._prefix_gradient}/norm"] = norm
   58                                                                                             

AttributeError: 'BackwardCallback' object has no attribute 'model'

Code sample

import torch
from torch.nn.utils import clip_grad_norm_
from catalyst import dl
from catalyst.core.callback import Callback
from catalyst.engines.torch import CPUEngine, GPUEngine

from src.config import config
from src.base_config import Config
from src.tools import set_global_seed, get_code
from src.dataset import get_loaders
from src.crnn import CRNN
from src.runners import SupervisedOCRRunner

callbacks = [
    dl.CriterionCallback(
        input_key=dict(output="log_probs", output_size="input_lengths"),
        target_key=dict(target="targets", target_len="target_lengths"),
        metric_key="loss",
        criterion_key="ctc_loss_fn",
    ),
    dl.BackwardCallback(
        metric_key="loss",
        grad_clip_fn=clip_grad_norm_,
        grad_clip_params={"max_norm": 0.5, "norm_type": 2},
    ),
]


loaders, infer_loader = get_loaders(config)  
model = CRNN(**config.model_kwargs)

optimizer = config.optimizer(params=model.parameters(), **config.optimizer_kwargs)
scheduler = config.scheduler(optimizer=optimizer, **config.scheduler_kwargs)


if torch.cuda.is_available():
    engine = GPUEngine()
else:
    engine = CPUEngine()

runner = SupervisedOCRRunner(
    input_key="image", 
    target_key="target", 
    output_key="output",
)

criterion = {"ctc_loss_fn": config.ctc_loss}

runner.train(
    model=model,
    engine=engine,
    criterion=criterion,
    optimizer=optimizer,
    scheduler=scheduler,
    loaders=loaders,
    callbacks=callbacks,
    num_epochs=config.n_epochs,
    valid_loader="valid",
    valid_metric=config.valid_metric,
    minimize_valid_metric=config.minimize_metric,
    seed=config.seed,
    verbose=True,
    load_best_on_end=True,
)
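
For reference, the error does not depend on the project-specific src modules above. Below is a minimal sketch of the same setup built only on stock Catalyst and PyTorch pieces; the toy model, data, and CrossEntropyLoss are placeholders of my own, and it assumes Catalyst's SupervisedRunner still auto-registers the criterion and optimizer callbacks when only a BackwardCallback is passed explicitly:

import torch
from torch import nn
from torch.nn.utils import clip_grad_norm_
from torch.utils.data import DataLoader, TensorDataset
from catalyst import dl

# toy data and model, just enough to drive one training epoch
X, y = torch.randn(64, 10), torch.randint(0, 2, (64,))
loaders = {"train": DataLoader(TensorDataset(X, y), batch_size=8)}
model = nn.Linear(10, 2)

runner = dl.SupervisedRunner(
    input_key="features", output_key="logits", target_key="targets", loss_key="loss"
)
runner.train(
    model=model,
    criterion=nn.CrossEntropyLoss(),
    optimizer=torch.optim.Adam(model.parameters()),
    loaders=loaders,
    num_epochs=1,
    callbacks=[
        # grad_clip_fn triggers the self.model access in backward.py
        dl.BackwardCallback(
            metric_key="loss",
            grad_clip_fn=clip_grad_norm_,
            grad_clip_params={"max_norm": 0.5, "norm_type": 2},
        ),
    ],
)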

Expected behavior

BackwardCallback never stores a reference to the model, so self.model does not exist; the model is available on the runner instead. Replacing

norm = self.grad_clip_fn(self.model.parameters())

with

norm = self.grad_clip_fn(runner.model.parameters())

on line 55 of catalyst/callbacks/backward.py removes the error, and training completes successfully.
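
Until the fix lands upstream, one possible workaround is to shim the missing attribute from user code instead of patching the installed package. A minimal sketch (the PatchedBackwardCallback name and the on_batch_start shim are mine, not part of Catalyst):

from catalyst import dl


class PatchedBackwardCallback(dl.BackwardCallback):
    """Workaround sketch: expose runner.model as self.model so the
    self.grad_clip_fn(self.model.parameters()) call in backward.py succeeds."""

    def on_batch_start(self, runner) -> None:
        super().on_batch_start(runner)
        # the attribute backward.py line 55 expects on the callback
        self.model = runner.model

Passing PatchedBackwardCallback(...) in place of dl.BackwardCallback(...) in the callbacks list above should avoid the AttributeError without modifying the library.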

Environment

Catalyst version: 22.04
PyTorch version: 1.13.0+cu117
Is debug build: No
CUDA used to build PyTorch: 11.7
TensorFlow version: N/A
TensorBoard version: 2.9.1

OS: Ubuntu 20.04.3 LTS
GCC version: (Ubuntu 7.5.0-6ubuntu2) 7.5.0
CMake version: version 3.10.3

Python version: 3.8
Is CUDA available: Yes
CUDA runtime version: Could not collect
GPU models and configuration: 
GPU 0: NVIDIA GeForce GTX 1080
GPU 1: NVIDIA GeForce GTX 1080

Nvidia driver version: 470.82.01
cuDNN version: Could not collect

Versions of relevant libraries:
[pip3] catalyst==22.4
[pip3] efficientnet-pytorch==0.7.1
[pip3] numpy==1.23.5
[pip3] pytorch-ignite==0.4.11
[pip3] segmentation-models-pytorch==0.3.2
[pip3] tensorboard==2.9.1
[pip3] tensorboard-data-server==0.6.1
[pip3] tensorboard-plugin-wit==1.8.1
[pip3] tensorboardX==2.5.1
[pip3] torch==1.13.0
[pip3] torchvision==0.14.0
[conda] blas                      1.0                         mkl  
[conda] mkl                       2021.4.0           h06a4308_640  
[conda] mkl-service               2.4.0            py39h7f8727e_0  
[conda] mkl_fft                   1.3.1            py39hd3c417c_0  
[conda] mkl_random                1.2.2            py39h51133e4_0  
[conda] numpy                     1.21.5           py39h6c91a56_3  
[conda] numpy-base                1.21.5           py39ha15fc14_3  
[conda] numpydoc                  1.4.0            py39h06a4308_0

Checklist

  • bug description
  • steps to reproduce
  • expected behavior
  • environment
  • code sample / screenshots

AleksandrMinin added the bug and help wanted labels on Apr 5, 2023.
bagxi added the duplicate label on May 25, 2023.

bagxi (Member) commented on May 25, 2023:

Duplicate of #1444

bagxi marked this as a duplicate of #1444 and closed the issue as completed on May 25, 2023.