Granite Docling stopping #16438
Conversation
Commits:
- Branch: GraniteDoclingStopping. Signed-off-by: Gabe Goodhart <[email protected]>
- Branch: GraniteDoclingStopping. Signed-off-by: Gabe Goodhart <[email protected]>
- … prompt. There should not be one, even for the language models. Branch: GraniteDoclingStopping. Signed-off-by: Gabe Goodhart <[email protected]>
You need to update the test as well (where, interestingly, no one caught the additional `\n` at the end):
llama.cpp/tests/test-chat-template.cpp
Lines 214 to 219 in b1afcab
{
    /* .name= */ "ibm-granite/granite-3.0-8b-instruct",
    /* .template_str= */ "{%- if tools %}\n    {{- '<|start_of_role|>available_tools<|end_of_role|>\n' }}\n    {%- for tool in tools %}\n    {{- tool | tojson(indent=4) }}\n    {%- if not loop.last %}\n        {{- '\n\n' }}\n    {%- endif %}\n    {%- endfor %}\n    {{- '<|end_of_text|>\n' }}\n{%- endif %}\n{%- for message in messages %}\n    {%- if message['role'] == 'system' %}\n    {{- '<|start_of_role|>system<|end_of_role|>' + message['content'] + '<|end_of_text|>\n' }}\n    {%- elif message['role'] == 'user' %}\n    {{- '<|start_of_role|>user<|end_of_role|>' + message['content'] + '<|end_of_text|>\n' }}\n    {%- elif message['role'] == 'assistant' %}\n    {{- '<|start_of_role|>assistant<|end_of_role|>' + message['content'] + '<|end_of_text|>\n' }}\n    {%- elif message['role'] == 'assistant_tool_call' %}\n    {{- '<|start_of_role|>assistant<|end_of_role|><|tool_call|>' + message['content'] + '<|end_of_text|>\n' }}\n    {%- elif message['role'] == 'tool_response' %}\n    {{- '<|start_of_role|>tool_response<|end_of_role|>' + message['content'] + '<|end_of_text|>\n' }}\n    {%- endif %}\n    {%- if loop.last and add_generation_prompt %}\n    {{- '<|start_of_role|>assistant<|end_of_role|>' }}\n    {%- endif %}\n{%- endfor %}",
    /* .expected_output= */ "<|start_of_role|>system<|end_of_role|>You are a helpful assistant<|end_of_text|>\n<|start_of_role|>user<|end_of_role|>Hello<|end_of_text|>\n<|start_of_role|>assistant<|end_of_role|>Hi there<|end_of_text|>\n<|start_of_role|>user<|end_of_role|>Who are you<|end_of_text|>\n<|start_of_role|>assistant<|end_of_role|> I am an assistant <|end_of_text|>\n<|start_of_role|>user<|end_of_role|>Another question<|end_of_text|>\n<|start_of_role|>assistant<|end_of_role|>\n",
    /* .expected_output_jinja= */ "<|start_of_role|>system<|end_of_role|>You are a helpful assistant<|end_of_text|>\n<|start_of_role|>user<|end_of_role|>Hello<|end_of_text|>\n<|start_of_role|>assistant<|end_of_role|>Hi there<|end_of_text|>\n<|start_of_role|>user<|end_of_role|>Who are you<|end_of_text|>\n<|start_of_role|>assistant<|end_of_role|> I am an assistant <|end_of_text|>\n<|start_of_role|>user<|end_of_role|>Another question<|end_of_text|>\n<|start_of_role|>assistant<|end_of_role|>",
},
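As an aside (not part of the PR), the difference between the two expected strings above boils down to the trailing `\n` after the generation prompt. A minimal sketch of what the corrected `expected_output_jinja` encodes, with an illustrative helper name and a simplified message shape:

```python
# Illustrative sketch only: this helper is not llama.cpp's code, it just
# mirrors the granite chat-template layout shown in the test above.
def format_granite(messages, add_generation_prompt=True):
    out = []
    for msg in messages:
        # Each turn: <|start_of_role|>ROLE<|end_of_role|>CONTENT<|end_of_text|>\n
        out.append(
            f"<|start_of_role|>{msg['role']}<|end_of_role|>"
            f"{msg['content']}<|end_of_text|>\n"
        )
    if add_generation_prompt:
        # The fix: the assistant generation prompt is NOT followed by "\n",
        # matching .expected_output_jinja above.
        out.append("<|start_of_role|>assistant<|end_of_role|>")
    return "".join(out)

prompt = format_granite([
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "Hello"},
])
```

With the errant `\n` appended (as the legacy `.expected_output` had it), generation begins on a line the model never saw during training, which is one way a model can fail to terminate cleanly.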
The failed CI seems to be unrelated to the current PR.
They are definitely related, see my comment. :)
Ah yeah ok I misread the CI output.
Yikes, thanks! I'll get that fixed asap.
Branch: GraniteDoclingStopping Signed-off-by: Gabe Goodhart <[email protected]>
Description
This is a follow-up to #16110 to fix the issue of the model not terminating correctly. It turned out to be a few lingering bugs in the tokenization:

1. `<fake_token_around_image>` before the first image slice
2. `\n` before the start of the global image instead of a double newline
3. `llama-chat.cpp` had an errant `\n` at the end when adding the assistant generation prompt

I'm pretty sure (3) was the main cause of the problem, but all three are fixed here. For reference, here are the chat templates for the various models that use the granite chat template:
- granite-docling-258M: https://huggingface.co/ibm-granite/granite-docling-258M?chat_template=default#L20
- granite-3.3-8b-instruct: https://huggingface.co/ibm-granite/granite-3.3-8b-instruct?chat_template=default#L60
- granite-3.2-8b-instruct: https://huggingface.co/ibm-granite/granite-3.2-8b-instruct?chat_template=default#L65
- granite-3.1-8b-instruct: https://huggingface.co/ibm-granite/granite-3.1-8b-instruct?chat_template=default#L62
- granite-3.0-8b-instruct: https://huggingface.co/ibm-granite/granite-3.0-8b-instruct?chat_template=default#L33
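The image-marker fixes (1) and (2) above can be sketched in the same spirit. This is an illustrative reconstruction only, not the PR's actual tokenization code; the slice and global token names are placeholders:

```python
# Illustrative sketch only: placeholder token names, not llama.cpp's
# actual multimodal tokenization code.
def image_prefix(slice_tokens, global_token):
    parts = []
    for tok in slice_tokens:
        # Fix (1): <fake_token_around_image> precedes EVERY slice,
        # including the first one.
        parts.append("<fake_token_around_image>")
        parts.append(tok)
    # Fix (2): a double newline, not a single "\n", before the global image.
    parts.append("\n\n")
    parts.append("<fake_token_around_image>")
    parts.append(global_token)
    return "".join(parts)

prefix = image_prefix(["<slice_1>", "<slice_2>"], "<global-img>")
```

Under these assumptions, the buggy behavior would have been a missing marker before `<slice_1>` and a single `\n` before the global image, both of which shift the prompt away from the format the model was trained on.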