[Temporary solution] Support dsv32 performance benchmark #85
base: master
Conversation
Summary of Changes
Hello @wenba0, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request implements a temporary solution to support performance benchmarking for dsv32 by introducing a new message encoding and decoding utility. It defines a structured format for handling system messages, user prompts, assistant responses, and tool calls, including a "thinking mode" for reasoning content. This new encoding mechanism is then integrated into the existing AISTokenizer.
Code Review
This pull request introduces a temporary solution for dsv32 performance benchmarking by adding a new encoding and decoding module for a specific message format. The changes are primarily in the new ais_bench/benchmark/utils/file/encoding_dsv32.py file, which handles message construction, tool calls, and parsing. The AISTokenizer is also updated to use this new encoding logic. My review focuses on improving the robustness, maintainability, and code quality of this new module. I've provided suggestions to address issues such as overly broad exception handling, fragile parsing logic that relies on assert, complex functions that could be refactored, and missing type hints. I've also noted an opportunity to improve flexibility by making a hardcoded configuration in load_tokenizer.py more adaptable.
```python
            break
    return last_user_index


def render_message(index: int, messages: List[Dict[str, Any]], thinking_mode: str) -> str:
```
The function render_message is very long and has high cyclomatic complexity due to handling multiple message roles in a large if/elif/else block. To improve readability and maintainability, consider refactoring it by breaking it down into smaller, role-specific rendering functions (e.g., _render_system_message, _render_user_message, etc.). You could use a dictionary to map roles to their respective rendering functions.
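For instance, the large if/elif block could become a dispatch table that maps each role to its own renderer. The sketch below is illustrative only: the renderer bodies and the simplified message-only signature are placeholders, not the PR's actual code (the real `render_message` also takes `index` and `thinking_mode`):

```python
from typing import Any, Dict

# Hypothetical role-specific renderers; bodies are placeholders.
def _render_system_message(msg: Dict[str, Any]) -> str:
    return f"[system]{msg['content']}"

def _render_user_message(msg: Dict[str, Any]) -> str:
    return f"[user]{msg['content']}"

def _render_assistant_message(msg: Dict[str, Any]) -> str:
    return f"[assistant]{msg['content']}"

# Dispatch table: role name -> rendering function.
_RENDERERS = {
    "system": _render_system_message,
    "user": _render_user_message,
    "assistant": _render_assistant_message,
}

def render_message(msg: Dict[str, Any]) -> str:
    try:
        renderer = _RENDERERS[msg["role"]]
    except KeyError:
        raise ValueError(f"Unsupported role: {msg['role']!r}")
    return renderer(msg)
```

Each branch of the original conditional then shrinks to one small, independently testable function.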
```python
def parse_tool_calls(index: int, text: str):
    tool_calls: List[Dict[str, Any]] = []
    stop_token = None
    tool_calls_end_token = f"</{dsml_token}function_calls>"

    while index < len(text):
        index, _, stop_token = _read_until_stop(index, text, [f"<{dsml_token}invoke", tool_calls_end_token])
        assert _ == ">\n", "Tool call format error"

        if stop_token == tool_calls_end_token:
            break

        assert stop_token is not None, "Missing special token"

        index, tool_name_content, stop_token = _read_until_stop(index, text, [f"<{dsml_token}parameter", f"</{dsml_token}invoke"])

        p_tool_name = re.findall(r'^\s*name="(.*?)">\n$', tool_name_content, flags=re.DOTALL)
        assert len(p_tool_name) == 1, "Tool name format error"
        tool_name = p_tool_name[0]

        tool_args: Dict[str, Tuple[str, str]] = {}
        while stop_token == f"<{dsml_token}parameter":
            index, param_content, stop_token = _read_until_stop(index, text, [f"/{dsml_token}parameter"])

            param_kv = re.findall(r'^ name="(.*?)" string="(true|false)">(.*?)<$', param_content, flags=re.DOTALL)
            assert len(param_kv) == 1, "Parameter format error"
            param_name, string, param_value = param_kv[0]

            assert param_name not in tool_args, "Duplicate parameter name"
            tool_args[param_name] = (param_value, string)

            index, content, stop_token = _read_until_stop(index, text, [f"<{dsml_token}parameter", f"</{dsml_token}invoke"])
            assert content == ">\n", "Parameter format error"

        tool_call = decode_dsml_to_arguments(tool_name=tool_name, tool_args=tool_args)
        tool_calls.append(tool_call)

    return index, stop_token, tool_calls


# NOTE: This function is designed to parse only correctly formatted strings and will not
# attempt to correct malformed output that may be generated by the model.
def parse_message_from_completion_text(text: str, thinking_mode: str):
    summary_content, reasoning_content, tool_calls = "", "", []
    index, stop_token = 0, None
    tool_calls_start_token = f"\n\n<{dsml_token}function_calls"

    is_thinking, is_tool_calling = thinking_mode == "thinking", False

    if is_thinking:
        index, content_delta, stop_token = _read_until_stop(index, text, [thinking_end_token, tool_calls_start_token])
        reasoning_content = content_delta
        assert stop_token == thinking_end_token, "Invalid thinking format"

    index, content_delta, stop_token = _read_until_stop(index, text, [eos_token, tool_calls_start_token])
    summary_content = content_delta
    if stop_token == tool_calls_start_token:
        is_tool_calling = True
    else:
        assert stop_token == eos_token, "Invalid summary format"

    if is_tool_calling:
        index, stop_token, tool_calls = parse_tool_calls(index, text)

        index, tool_ends_text, stop_token = _read_until_stop(index, text, [eos_token])
        assert not tool_ends_text, "Unexpected content after tool calls"

    assert len(text) == index and stop_token in [eos_token, None], "Unexpected content at end"

    for sp_token in [bos_token, eos_token, thinking_start_token, thinking_end_token, dsml_token]:
        assert sp_token not in summary_content and sp_token not in reasoning_content, "Unexpected special token in content"

    return {
        "role": "assistant",
        "content": summary_content,
        "reasoning_content": reasoning_content,
        "tool_calls": tool_calls_to_openai_format(tool_calls)
    }
```
The parsing functions parse_tool_calls and parse_message_from_completion_text rely heavily on assert statements for validating model-generated text. This approach is not robust because assert statements can be disabled when Python is run in optimized mode (with the -O flag), and they raise a generic AssertionError which is not ideal for handling parsing failures. It's better to use explicit checks and raise a ValueError or a custom parsing exception with a descriptive message to handle malformed input gracefully.
```python
def to_json(value: Any) -> str:
    try:
        return json.dumps(value, ensure_ascii=False)
    except:
```
Using a bare except: is overly broad and can mask unexpected errors by catching system-level exceptions like KeyboardInterrupt or SystemExit. It's a best practice to catch more specific exceptions. If the intent is to catch any error during serialization, except Exception: is a safer alternative.
```diff
-    except:
+    except Exception:
```
```python
    except:
        return json.dumps(value, ensure_ascii=True)


def tools_from_openai_format(tools):
```
This function is missing type hints for its arguments and return value. Adding type hints would improve code clarity, make it easier to understand the expected data structures, and enable static analysis tools to catch potential type-related bugs.
```diff
-def tools_from_openai_format(tools):
+def tools_from_openai_format(tools: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
```
```python
def tools_from_openai_format(tools):
    return [tool["function"] for tool in tools]


def tool_calls_from_openai_format(tool_calls):
```
This function is missing type hints for its arguments and return value. Adding type hints would improve code clarity, make it easier to understand the expected data structures, and enable static analysis tools to catch potential type-related bugs.
```diff
-def tool_calls_from_openai_format(tool_calls):
+def tool_calls_from_openai_format(tool_calls: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
```
```python
        for tool_call in tool_calls
    ]


def tool_calls_to_openai_format(tool_calls):
```
This function is missing type hints for its arguments and return value. Adding type hints would improve code clarity, make it easier to understand the expected data structures, and enable static analysis tools to catch potential type-related bugs.
```diff
-def tool_calls_to_openai_format(tool_calls):
+def tool_calls_to_openai_format(tool_calls: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
```
```python
def decode_dsml_to_arguments(tool_name: str, tool_args: Dict[str, Tuple[str, str]]) -> Dict[str, str]:
    def _decode_value(key: str, value: str, string: str):
        if string == "true":
            value = to_json(value)
        return f"{to_json(key)}: {value}"

    tool_args_json = "{" + ", ".join([_decode_value(k, v, string=is_str) for k, (v, is_str) in tool_args.items()]) + "}"
    return dict(name=tool_name, arguments=tool_args_json)
```
Manually constructing a JSON string is fragile and can be error-prone, especially if values contain special characters. It's safer and more readable to build a Python dictionary first and then serialize it to a JSON string using json.dumps(). This ensures correct formatting and escaping.
```diff
-def decode_dsml_to_arguments(tool_name: str, tool_args: Dict[str, Tuple[str, str]]) -> Dict[str, str]:
-    def _decode_value(key: str, value: str, string: str):
-        if string == "true":
-            value = to_json(value)
-        return f"{to_json(key)}: {value}"
-    tool_args_json = "{" + ", ".join([_decode_value(k, v, string=is_str) for k, (v, is_str) in tool_args.items()]) + "}"
-    return dict(name=tool_name, arguments=tool_args_json)
+def decode_dsml_to_arguments(tool_name: str, tool_args: Dict[str, Tuple[str, str]]) -> Dict[str, str]:
+    arguments = {}
+    for key, (value, is_str) in tool_args.items():
+        if is_str == "true":
+            arguments[key] = value
+        else:
+            try:
+                arguments[key] = json.loads(value)
+            except json.JSONDecodeError:
+                arguments[key] = value
+    return {"name": tool_name, "arguments": to_json(arguments)}
```
```python
def find_last_user_index(messages: List[Dict[str, Any]]) -> int:
    last_user_index = -1
    for idx in range(len(messages)-1, -1, -1):
```
```python
        prompt += "\n\n" + thinking_end_token

    elif role == "assistant":
        prev_assistant_idx = index
```
```python
        encode_config = dict(thinking_mode="thinking", drop_thinking=True, add_default_bos_token=True)
        messages = encode_messages(prompt, **encode_config)
```
The encode_config dictionary is hardcoded within the encode method. This reduces flexibility and makes it difficult to change the encoding behavior without modifying the code. Consider passing this configuration as an argument to the encode method or setting it in the AISTokenizer constructor to make it more configurable.
Thanks for your contribution; we appreciate it a lot. The following instructions will make your pull request healthier and help you get feedback more easily. If any items are unclear, don't worry: just open the pull request and ask the maintainers for help.
Motivation
Please describe the motivation of this PR and the goal you want to achieve through this PR.
Modification
Please briefly describe what modification is made in this PR.
BC-breaking (Optional)
Does the modification introduce changes that break the backward compatibility of the downstream repositories?
If so, please describe how it breaks the compatibility and how the downstream projects should modify their code to keep compatibility with this PR.
Use cases (Optional)
If this PR introduces a new feature, it is better to list some use cases here and update the documentation.
Checklist
Before PR:
After PR:
🌟 Useful CI Commands
/gemini review, /gemini summary, /gemini help, /readthedocs build