You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
This generates random models and then tests different concurrencies
of batches to check if the output is consistent.
This can detect when e.g. the recurrent cache has been broken,
or anything else which would affect the consistency of the output
when inferencing multiple distinct sequences.
More architectures will be added, but for now this starts with Mamba.
Eventually, consistency of pooled embeddings will also be tested.
The goal is to reduce accidental regressions
by making it easy to quickly test a lot of edge cases
on the supported architectures,
without having to download any model.
0 commit comments