Apply optimizer to model weights without data copy (#222)
* WIP optimizer refactor w/ pointers
* WIP optimizer optimization
* Sending data to the optimizer without a copy now works for dense layers
* Get weights and weight gradients as 1-d arrays
* get_params_ptr and get_gradients_ptr for conv1d, conv2d, and locally_connected1d (see the pointer-remapping sketch after this list)
* Define an optimizer instance per layer so that optimizer memory (state) is preserved across layers (see the per-layer sketch after this list)
* Initialization of a network-wide optimizer is no longer needed now that we have switched to per-layer optimizer instances
* Bookkeeping for velocity, rms_gradient, etc.; optimizer tests now pass
* Update optimizer flow for linear2d
* Update optimizer flow for layernorm
* Previous bookkeeping for successive calls to optim % minimize() assumed two calls per batch; this is now generalized to allow any number of calls until size(params) is exhausted (see the index-tracking sketch after this list)
* Remove get_gradients from network, layer, dense, conv1d, conv2d
* Remove optimizer as component to the network class
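
The commits above replace copy-based parameter access with pointer-based access. Below is a minimal sketch of how a getter like get_params_ptr can expose a layer's 2-d weights as a flat 1-d view without copying, using Fortran 2008 rank-remapping pointer assignment. The type layout and names here are simplified assumptions based on the commit titles, not the library's exact code.

```fortran
module dense_layer_sketch
  implicit none
  private
  public :: dense_layer

  type :: dense_layer
    real, allocatable :: weights(:,:)  ! 2-d weight matrix
    real, allocatable :: dw(:,:)       ! matching weight gradients
  contains
    procedure :: get_params_ptr
    procedure :: get_gradients_ptr
  end type dense_layer

contains

  subroutine get_params_ptr(self, w_ptr)
    class(dense_layer), intent(inout), target :: self
    real, pointer, intent(out) :: w_ptr(:)
    ! Rank-remapping pointer assignment (Fortran 2008): view the 2-d
    ! weights as a flat 1-d array without copying. Valid because
    ! allocatable arrays are contiguous.
    w_ptr(1:size(self % weights)) => self % weights
  end subroutine get_params_ptr

  subroutine get_gradients_ptr(self, dw_ptr)
    class(dense_layer), intent(inout), target :: self
    real, pointer, intent(out) :: dw_ptr(:)
    ! Same remapping for the gradients.
    dw_ptr(1:size(self % dw)) => self % dw
  end subroutine get_gradients_ptr

end module dense_layer_sketch
```

The optimizer then updates the weights in place through the returned pointer, so no intermediate 1-d copy of the parameters is ever made.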
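
Next, a hedged sketch of a per-layer optimizer instance, assuming SGD with momentum as the concrete optimizer; the sgd type, the velocity component, and the lazy allocation are illustrative, not the library's exact implementation. Because each layer owns its own instance, state such as velocity or rms_gradient persists across iterations without being shared between layers.

```fortran
module sgd_optimizer_sketch
  implicit none
  private
  public :: sgd

  ! Hypothetical per-layer optimizer: each layer owns one instance,
  ! so the velocity state below belongs to exactly one layer.
  type :: sgd
    real :: learning_rate = 0.01
    real :: momentum = 0.9
    real, allocatable :: velocity(:)
  contains
    procedure :: minimize
  end type sgd

contains

  subroutine minimize(self, param, gradient)
    class(sgd), intent(inout) :: self
    real, intent(inout) :: param(:)  ! 1-d pointer view of the weights
    real, intent(in) :: gradient(:)  ! 1-d view of the gradients
    ! Allocate state lazily, sized to this layer's parameters.
    if (.not. allocated(self % velocity)) &
      allocate(self % velocity(size(param)), source=0.)
    self % velocity = self % momentum * self % velocity &
                    - self % learning_rate * gradient
    ! param aliases the layer's weights, so this updates them in place.
    param = param + self % velocity
  end subroutine minimize

end module sgd_optimizer_sketch
```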
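
Finally, a speculative reconstruction of the generalized minimize() bookkeeping from the commit title: the state array is sized to the layer's total parameter count (allocation at setup is not shown), and a running index consumes one slice per call, wrapping only once size(params) is exhausted. The start_index component and the wrap logic are assumptions.

```fortran
module sgd_chunked_sketch
  implicit none
  private
  public :: sgd

  type :: sgd
    real :: learning_rate = 0.01
    real :: momentum = 0.9
    integer :: start_index = 1
    ! Assumed to be allocated at setup to the layer's total number
    ! of parameters (weights, biases, etc.).
    real, allocatable :: velocity(:)
  contains
    procedure :: minimize
  end type sgd

contains

  subroutine minimize(self, param, gradient)
    ! Each call updates one chunk of the layer's parameters (e.g.
    ! weights on one call, biases on the next), consuming the
    ! matching slice of the persistent state.
    class(sgd), intent(inout) :: self
    real, intent(inout) :: param(:)
    real, intent(in) :: gradient(:)
    integer :: i1, i2
    i1 = self % start_index
    i2 = i1 + size(param) - 1
    self % velocity(i1:i2) = self % momentum * self % velocity(i1:i2) &
                           - self % learning_rate * gradient
    param = param + self % velocity(i1:i2)
    ! Advance and wrap once all parameters have been visited, so any
    ! number of calls per batch works (not just two, as before).
    self % start_index = i2 + 1
    if (self % start_index > size(self % velocity)) self % start_index = 1
  end subroutine minimize

end module sgd_chunked_sketch
```

With this, a layer can call minimize() once for its weights and once for its biases, or in any other split, within a single batch.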