Hello,
Thank you for your contributions to the community. We have a few questions regarding your work:
1. Token Count in LLaVA-Next
LLaVA-Next utilizes dynamic resolution with grid configurations (e.g., 1:2, 2:1, 2:2, 1:3, 3:1), resulting in variable token counts rather than a fixed value of 2880. The token count of 2880 appears specific to the 2:2 grid configuration.
Question: Could you clarify the basis for reporting 2880 tokens for LLaVA-Next in your paper?
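For reference, here is a rough sketch of the arithmetic behind this question. It assumes each 336x336 view is encoded by CLIP ViT-L/14 into 24 x 24 = 576 tokens, and that a downscaled base image is encoded alongside the grid tiles. It ignores unpadding and newline tokens, so real counts may differ. The helper name `tokens_before_unpad` is ours, not part of any library.

```python
# Assumption: one 336x336 view -> (336 / 14)^2 = 576 patch tokens (CLIP ViT-L/14).
TOKENS_PER_VIEW = (336 // 14) ** 2


def tokens_before_unpad(rows: int, cols: int, include_base: bool = True) -> int:
    """Visual tokens for a rows x cols grid, before unpadding or newline tokens.

    Hypothetical helper for illustration only.
    """
    views = rows * cols + (1 if include_base else 0)
    return views * TOKENS_PER_VIEW


if __name__ == "__main__":
    for grid in [(1, 2), (2, 1), (2, 2), (1, 3), (3, 1)]:
        print(grid, tokens_before_unpad(*grid))
```

Under these assumptions only the 2:2 grid gives 5 x 576 = 2880 tokens; the other grids listed give fewer.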
2. Token Retention Strategy
Your method operates on individual images. Under LLaVA-Next's dynamic resolution framework:
How is the fixed retained token count of 160 ensured across varying grid configurations?
If the configuration model = visionzip(model, dominant=135, contextual=25) is applied, could the actual retained tokens scale to 160 * n (where n depends on the grid, e.g., 2, 3, 4, 5)?
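To make the second question concrete, the sketch below works out the retained count under the hypothesis that the dominant/contextual budget is applied to each encoded view independently. Whether VisionZip actually behaves this way is exactly what we are asking. `retained_tokens` is a hypothetical helper, not part of the VisionZip API, and counting the base image as a view is also an assumption.

```python
def retained_tokens(rows: int, cols: int, dominant: int = 135,
                    contextual: int = 25, include_base: bool = True) -> int:
    """Retained tokens if the budget applies per encoded view (hypothesis).

    Hypothetical helper for illustration; not part of VisionZip.
    """
    # n = number of views: grid tiles, plus the base image if it is counted.
    n = rows * cols + (1 if include_base else 0)
    return n * (dominant + contextual)


if __name__ == "__main__":
    for grid in [(1, 2), (2, 2), (1, 3)]:
        print(grid, retained_tokens(*grid))
```

Under this hypothesis a 2:2 grid would retain 5 x 160 = 800 tokens rather than 160, which is the scaling we would like to confirm or rule out.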
We greatly appreciate your time and insights!