Flex attention support was added through #36643, but encoder-only models still lack this feature.
XLMRoberta and ModernBERT (and EuroBERT in the future) are very common in RAG setups (embedding + reranker).
Allowing them to support arbitrary attention patterns would be useful, as sketched below.
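For concreteness, here is a minimal sketch of the kind of arbitrary pattern flex attention makes easy to express, assuming torch >= 2.5 and independent of any transformers API; the sliding-window predicate and the `WINDOW` constant are purely illustrative:

```python
import torch
from torch.nn.attention.flex_attention import flex_attention, create_block_mask

# Toy dimensions: batch, heads, sequence length, head dim.
B, H, S, D = 1, 8, 128, 64
device = "cuda" if torch.cuda.is_available() else "cpu"
q, k, v = (torch.randn(B, H, S, D, device=device) for _ in range(3))

# Hypothetical pattern: bidirectional sliding-window attention with a
# 16-token window on each side (encoders attend in both directions).
WINDOW = 16

def sliding_window(b, h, q_idx, kv_idx):
    return (q_idx - kv_idx).abs() <= WINDOW

# Build a block-sparse mask from the predicate and run flex attention.
block_mask = create_block_mask(sliding_window, B=None, H=None,
                               Q_LEN=S, KV_LEN=S, device=device)
out = flex_attention(q, k, v, block_mask=block_mask)
print(out.shape)  # torch.Size([1, 8, 128, 64])
```

Any mask expressible as such a predicate (document masking for packed sequences, local/global mixes, etc.) could then be passed to these encoders without changing their modeling code.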
Motivation
Support for arbitrary attention patterns (for example, document masking when packing multiple sequences into one batch) would be useful in both research and production settings.