Check out our latest research:
AutoLibra, a method for automatically inducing agent evaluation metrics from open-ended human feedback.
EgoNormia, a new vision-language benchmark for evaluating how well LLMs and VLMs understand physical social norms from egocentric videos.