GPU Memory Leak (cuda) during Forward Pass #63
-
Training SNNs at the scale of Inception Net using this PyTorch backend approach is extremely expensive, and unfortunately it is one of those things the SNN research community is still trying to understand better. For every time step in the for-loop, a computational graph is constructed, with the gradients of all hidden states stored so that backprop through time (BPTT) works as intended. Another way to think about it: every time step effectively creates another Inception Net and stores it in memory. Potential solutions include performing a backward pass at every time step so the graph can be freed as you go. I am also currently working on several alternatives to BPTT that do not require storing the full computational graph and that approximate BPTT well enough; this might be useful for you. Though even if you were to get around the memory issue, I haven't seen any networks as deep as Inception Net successfully trained using backprop directly on the SNN itself. While vanishing gradients have been addressed in non-spiking networks thanks to good parameter initialization strategies, batch norm, etc., we still don't have equivalent methods that translate perfectly to SNNs. For deep nets, you might have more luck pre-training the ANN and converting it into an SNN: https://snntoolbox.readthedocs.io/en/latest/
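To make the "backward pass at every time step" idea concrete, here is a minimal toy sketch (not from the thread; the update rule, window size `K`, and variable names are placeholders, not snnTorch API) of truncated BPTT: call `backward()` every few steps and `detach()` the hidden state so the graph from earlier steps can be freed instead of accumulating:

```python
import torch

# Toy "membrane potential" update standing in for one SNN layer (hypothetical).
w = torch.randn(4, 4, requires_grad=True)
mem = torch.zeros(4)

losses = []
K = 5  # truncation window (arbitrary choice for illustration)
for t in range(20):
    mem = torch.tanh(mem @ w)          # each step extends the autograd graph
    losses.append(mem.pow(2).mean())
    if (t + 1) % K == 0:
        loss = torch.stack(losses).sum()
        loss.backward()                # backprop through this window only
        losses.clear()
        mem = mem.detach()             # cut the graph so it does not accumulate
```

After each window, `mem` carries the same values but no `grad_fn`, so autograd no longer holds references to the previous window's intermediate tensors and they can be garbage-collected. Gradients in `w.grad` accumulate across windows as usual.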
-
I am exploring the effects of spiking in an inception-based broadcast network. Without spiking, the network runs and trains normally. However, when I add spiking, memory usage drastically increases. I have narrowed the leak down to a loop in the forward pass, which executes the 3 inception modules and concatenates their outputs. Here is my code:
The output is:
Has anyone ever experienced this before? I have been stuck on this problem for many hours now.
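The original code and output are not shown above, but a small sketch of the pattern described (a time-step loop that runs several modules and concatenates their outputs; the `Linear` stand-ins and names here are hypothetical, not the actual network) shows why memory grows: storing each step's output in a list keeps every step's autograd graph alive, whereas detaching stores only the values:

```python
import torch

# Hypothetical stand-ins for the three inception modules described above.
modules = [torch.nn.Linear(8, 8) for _ in range(3)]
x = torch.randn(2, 8)

outs_keep, outs_detached = [], []
for t in range(10):
    y = torch.cat([m(x) for m in modules], dim=1)  # concat module outputs
    outs_keep.append(y)               # retains the whole graph for step t
    outs_detached.append(y.detach())  # values only; graph can be freed
```

Checking `grad_fn` on the stored tensors is a quick way to diagnose this: a non-`None` `grad_fn` means the tensor is still pinning its computational graph (and all activations inside it) in GPU memory.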