feat(alias-ds): refactor code execution logic and plan organization #111

Open
SSSuperDan wants to merge 7 commits into agentscope-ai:main from StCarmen:main_ds

Conversation

SSSuperDan (Contributor) commented Jan 22, 2026

📝 PR Type

  • Add new sample
  • Update existing sample
  • Add new test cases
  • Fix test failures
  • Documentation/Configuration update

📚 Description

This PR refactors the internal code execution logic and plan organization in the data science agent.


✅ Checklist

Please complete the following checks before submitting the PR:

  • All sample code has been formatted with pre-commit run --all-files
  • Test coverage has not decreased (if applicable)
  • Related documentation in agentscope-samples has been updated (e.g., README.md)

SSSuperDan requested a review from a team on January 22, 2026 at 11:24
@gemini-code-assist

Summary of Changes

Hello @SSSuperDan, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed.

This pull request introduces a substantial refactoring of the Data Science Agent's operational framework. The primary goal is to enhance the agent's planning capabilities and standardize its code execution workflow by integrating a more structured planning notebook system. This change aims to improve the agent's ability to manage complex tasks, ensure consistent code execution practices, and provide more detailed reporting, ultimately leading to a more reliable and efficient data science agent.

Highlights

  • Refactored Data Science Agent Logic: The internal code execution and plan organization logic within the Data Science Agent has been significantly refactored. This involves transitioning from a custom 'todo list' mechanism to a more robust planning system leveraging agentscope.plan.PlanNotebook.
  • Updated Planning Tools and Protocols: The agent's system workflow prompt has been updated to reflect new planning tools such as create_plan, update_subtask_state, finish_subtask, revise_current_plan, and finish_plan. Explicit protocols for Python code execution, emphasizing writing to file and executing via shell, have also been added.
  • Removal of Legacy Todo List System: The todo_write tool, its associated prompt files (_agent_todo_reminder_prompt.md, _tool_todo_list_prompt.yaml), and related logic within the _data_science_agent.py and ds_agent_utils/__init__.py files have been removed.
  • Enhanced Report Generation: The report generation functionality now produces both Markdown and HTML versions of detailed reports, which are then saved to files, providing more comprehensive output.
  • Dependency and Configuration Updates: The agentscope-runtime dependency is now pinned to 1.0.4, and pydantic and mcp versions have been updated in requirements.txt. New dependencies agentscope[full] and qdrant-client were added. Nginx configuration has also been updated with proxy timeout settings.
  • Workspace Initialization: A new set_workspace_dir function has been introduced to create standard code and data directories within the sandbox environment, streamlining initial setup.
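
The workspace initialization described in the last highlight can be pictured with a small sketch. This is not the PR's `set_workspace_dir`; the function name `make_workspace_dirs` and the choice to take the root as a parameter are assumptions made for illustration. Only the `code` and `data` directory names come from the paths quoted in this review.

```python
import tempfile
from pathlib import Path


def make_workspace_dirs(root: Path) -> dict[str, Path]:
    """Create `code/` and `data/` under `root` and return their paths.

    Hypothetical helper: the real set_workspace_dir runs inside the sandbox
    and may be implemented quite differently.
    """
    dirs = {name: root / name for name in ("code", "data")}
    for path in dirs.values():
        # parents/exist_ok make the call safe to repeat on an existing workspace.
        path.mkdir(parents=True, exist_ok=True)
    return dirs


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        created = make_workspace_dirs(Path(tmp) / "workspace")
        print(sorted(created))
```

Making the helper idempotent means it can run at the start of every session without checking whether the directories already exist.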

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a significant refactoring of the data science agent's internal logic, transitioning from a simple 'todo list' mechanism to a more structured 'plan notebook' approach. This change enhances task management and provides clearer guidelines for the agent's operation. Key updates include the removal of the todo_write tool and its associated prompts, replaced by new planning tools like create_plan, update_subtask_state, finish_subtask, revise_current_plan, and finish_plan. Additionally, explicit 'Code Execution Protocols' have been added to the agent's system prompt, enforcing a safer and more controlled method for executing Python code by requiring writing to file and executing via shell. The run_ipython_cell tool has been removed from the data science agent's direct toolkit, aligning with these new protocols. Dependency versions for pydantic and mcp have been updated, and agentscope[full] and qdrant-client are now explicitly included in the sandbox requirements. Nginx configuration has also been updated with proxy timeout settings for improved stability. Overall, these changes improve the agent's robustness, maintainability, and adherence to best practices for code execution and task management.

Comment on lines +556 to 558
+ "read_file",
  "write_file",
- "run_ipython_cell",
  "run_shell_command",

high

The ds_tool_list in init_ds_toolkit now includes read_file and removes run_ipython_cell. The removal of run_ipython_cell is a critical change that aligns with the new 'Code Execution Protocols' in the system prompt, promoting safer execution by requiring code to be written to a file and executed via run_shell_command.
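
The allowlist idea behind this change can be sketched as follows. This is a toy illustration, not the PR's `init_ds_toolkit` and not the `AliasToolkit` API: `select_tools` and the plain dictionary registry are hypothetical, and the allowlist contains only the three names visible in the snippet above (the real list may hold more).

```python
from typing import Callable, Dict, Sequence

# Names taken from the quoted diff; run_ipython_cell is deliberately absent.
DS_TOOL_ALLOWLIST = ("read_file", "write_file", "run_shell_command")


def select_tools(
    all_tools: Dict[str, Callable[..., object]],
    allowed: Sequence[str],
) -> Dict[str, Callable[..., object]]:
    """Return only the allowed tools, failing loudly if one is not registered."""
    missing = [name for name in allowed if name not in all_tools]
    if missing:
        raise KeyError(f"tools not registered: {missing}")
    return {name: all_tools[name] for name in allowed}


if __name__ == "__main__":
    registry = {
        name: (lambda name=name: name)
        for name in (*DS_TOOL_ALLOWLIST, "run_ipython_cell")
    }
    print(sorted(select_tools(registry, DS_TOOL_ALLOWLIST)))
```

With an explicit allowlist, dropping a tool from the worker is a one-line change, and a misspelled tool name raises an error instead of silently disappearing.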

Comment on lines +39 to +65
## Code Execution Protocols

**For all Python code execution:**

1. **Write code to file first** using `write_file` or similar tools
2. **Execute via shell** using `run_shell_command` with `python <filename>.py`
3. **For large and key results**: Save to file (CSV/JSON/pickle), load in next step. Avoid printing large outputs or passing data via stdout
4. **Never execute code inline or outside this workflow**

**Example:**

Tool: `write_file`
Arguments: {
"file_path": "/workspace/code/analysis.py",
"content": "import pandas as pd\ndf = pd.read_csv('data.csv')\nprint(df.head())"
}

Tool: `run_shell_command`
Arguments: {
"command": "python /workspace/code/analysis.py"
}

Tool: `write_file`
Arguments: {
"file_path": "/workspace/code/step1_aggregate.py",
"content": "import pandas as pd\ndf = pd.read_csv('raw_data.csv')\nprocessed = df.groupby('category').sum()\nprocessed.to_csv('/workspace/data/intermediate_result.csv', index=False)\nprint('✓ Results saved')"
}

high

A new 'Code Execution Protocols' section has been added, explicitly outlining the recommended workflow for executing Python code: write to file, then execute via shell. This is a crucial improvement for security, reproducibility, and maintainability, preventing arbitrary inline code execution and promoting structured development.
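
For readers outside the agent, the same write-then-execute workflow can be reproduced with the standard library alone. The sketch below is an assumption-laden stand-in: it uses `subprocess` and a temporary directory in place of the `write_file` and `run_shell_command` tools, and the script body and file names are invented for the example. It also follows rule 3 by passing the result through a JSON file and printing only a short status line.

```python
import json
import subprocess
import sys
import tempfile
from pathlib import Path

# Illustrative script body; it writes its result to the path given in argv[1].
SCRIPT = """\
import json
import sys

totals = {"a": 3, "b": 4}
with open(sys.argv[1], "w", encoding="utf-8") as fh:
    json.dump(totals, fh)
print("saved")
"""


def write_and_run(workdir: Path) -> tuple[str, dict]:
    """Write the script to disk, run it as a subprocess, then load its result."""
    script_path = workdir / "step1_aggregate.py"
    result_path = workdir / "intermediate_result.json"
    script_path.write_text(SCRIPT, encoding="utf-8")  # step 1: write to file
    completed = subprocess.run(  # step 2: execute via the interpreter
        [sys.executable, str(script_path), str(result_path)],
        capture_output=True,
        text=True,
        check=True,
    )
    # step 3: stdout carries only a status line; the data comes from the file.
    data = json.loads(result_path.read_text(encoding="utf-8"))
    return completed.stdout.strip(), data


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(write_and_run(Path(tmp)))
```

Because the script stays on disk, a failed step can be rerun or inspected later, which is harder when code is executed inline.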

Comment on lines +31 to 36
- **You must use `create_plan` to create a task plan**, especially for multi-step tasks.
- Use `update_subtask_state` to mark subtasks as 'in_progress' when starting them.
- Use `finish_subtask` to mark subtasks as 'done' with specific outcomes upon completion.
- Use `revise_current_plan` to adapt the plan accordingly if new insights or results emerge.
- Use `finish_plan` to finalize the entire task when all subtasks are complete.
- Skipping planning risks missing critical steps—this is unacceptable.

medium

The 'Task Management Rules' section has been completely revised to detail the usage of the new PlanNotebook tools (create_plan, update_subtask_state, finish_subtask, revise_current_plan, finish_plan). This provides clear instructions for the agent on how to manage its tasks effectively.
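
The lifecycle these rules describe can be modeled as a small state machine. The sketch below is a toy and does not use or reproduce the `PlanNotebook` API: `ToyPlan`, `Subtask`, and the `todo` initial state are assumptions, and only the `in_progress` and `done` state names come from the quoted rules.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Subtask:
    name: str
    state: str = "todo"  # assumed initial state, not named in the source
    outcome: str = ""


@dataclass
class ToyPlan:
    """Hypothetical plan tracker; method names loosely mirror the tool names."""

    subtasks: List[Subtask] = field(default_factory=list)
    finished: bool = False

    def start(self, index: int) -> None:
        task = self.subtasks[index]
        if task.state != "todo":
            raise ValueError(f"{task.name!r} is already {task.state}")
        task.state = "in_progress"

    def finish_subtask(self, index: int, outcome: str) -> None:
        task = self.subtasks[index]
        if task.state != "in_progress":
            raise ValueError(f"{task.name!r} must be in_progress first")
        if not outcome.strip():
            raise ValueError("a specific outcome is required")
        task.state, task.outcome = "done", outcome

    def finish_plan(self) -> None:
        pending = [t.name for t in self.subtasks if t.state != "done"]
        if pending:
            raise ValueError(f"unfinished subtasks: {pending}")
        self.finished = True


if __name__ == "__main__":
    plan = ToyPlan([Subtask("load data"), Subtask("summarize")])
    plan.start(0)
    plan.finish_subtask(0, "loaded the input file")
    print([t.state for t in plan.subtasks])
```

Rejecting out-of-order transitions is one way to make the "no skipped steps" rule enforceable rather than advisory.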

Comment on lines +6 to +8
proxy_connect_timeout 120s;
proxy_send_timeout 120s;
proxy_read_timeout 120s;

medium

Adding proxy_connect_timeout, proxy_send_timeout, and proxy_read_timeout to the Nginx configuration is a good operational improvement. These settings help prevent timeouts for long-running requests or slow network conditions, enhancing the stability and reliability of the service.

- return (
- f"{base_prompt}{self._selected_scenario_prompts}\n\n{todo_prompt}"
- )
+ return f"{base_prompt}{self._selected_scenario_prompts}"

medium

The sys_prompt property has been simplified by removing the todo_prompt logic. This is consistent with the removal of the old todo list management system and the introduction of PlanNotebook.
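
A minimal sketch of the simplified property, assuming a holder class that stores the base prompt and the selected scenario prompts as strings. The class name `PromptBuilder`, the `select_scenarios` method, and the way scenario prompts are joined are all hypothetical; only the final f-string mirrors the new return statement quoted above.

```python
class PromptBuilder:
    """Hypothetical holder for a base prompt plus scenario-specific prompts."""

    def __init__(self, base_prompt: str) -> None:
        self._base_prompt = base_prompt
        self._selected_scenario_prompts = ""

    def select_scenarios(self, prompts: list[str]) -> None:
        # Join once here so the property below stays a cheap concatenation.
        self._selected_scenario_prompts = "".join(f"\n\n{p}" for p in prompts)

    @property
    def sys_prompt(self) -> str:
        # No todo reminder is appended: base prompt plus scenario prompts only.
        return f"{self._base_prompt}{self._selected_scenario_prompts}"


if __name__ == "__main__":
    builder = PromptBuilder("You are a data science agent.")
    builder.select_scenarios(["Scenario: exploratory analysis"])
    print(builder.sys_prompt)
```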

Comment on lines +328 to +329
global_toolkit = AliasToolkit(sandbox, add_all=True)
worker_toolkit = init_ds_toolkit(global_toolkit)

medium

The worker_toolkit is now initialized directly by calling init_ds_toolkit(global_toolkit). This refactoring improves modularity by centralizing the data science toolkit setup logic within init_ds_toolkit, making arun_datascience_agent cleaner and more focused.

return response.content[0]["text"]

- async def generate_report(self) -> Tuple[str, str]:
+ async def generate_report(self) -> Tuple[str, str, str]:

medium

The return type hint for generate_report has been updated to Tuple[str, str, str], reflecting the new functionality of returning both markdown and HTML report content.
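
To make the three-string return shape concrete, here is a synchronous sketch under stated assumptions. The PR does not show what each element holds, so the ordering (summary, Markdown, HTML), the function name `build_report`, and the output file names are guesses for illustration. The HTML is produced by a deliberately naive escape-and-wrap step, not by a Markdown renderer.

```python
import html
import tempfile
from pathlib import Path
from typing import Tuple


def build_report(summary: str, body_md: str, out_dir: Path) -> Tuple[str, str, str]:
    """Return (summary, markdown, html) and save both report versions to files."""
    markdown = f"# Report\n\n{body_md}\n"
    # Naive HTML: escape the Markdown text and wrap it in <pre>.
    page = f"<html><body><pre>{html.escape(markdown)}</pre></body></html>"
    out_dir.mkdir(parents=True, exist_ok=True)
    (out_dir / "report.md").write_text(markdown, encoding="utf-8")
    (out_dir / "report.html").write_text(page, encoding="utf-8")
    return summary, markdown, page


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        summary, markdown, page = build_report("done", "x < y", Path(tmp) / "out")
        print(summary, len(markdown), len(page))
```

Callers that unpacked the old two-element tuple would need updating when a third element is added, which is worth checking in the rest of the PR.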

  }
  }
- }
\ No newline at end of file
+ }

medium

Adding a newline at the end of the file is a minor stylistic improvement, adhering to common file formatting conventions.

Comment on lines +4 to +6
+ pydantic==2.12.5
  requests==2.32.3
- mcp==1.9.0
+ mcp==1.25.0

medium

Pinning pydantic to 2.12.5 and bumping mcp from 1.9.0 to 1.25.0 moves the sandbox environment to newer releases of these dependencies, which can bring bug fixes, performance improvements, and new features.

Comment on lines +17 to +18
agentscope[full]==1.0.11
qdrant-client==1.15.1

medium

Explicitly adding agentscope[full]==1.0.11 and qdrant-client==1.15.1 to the sandbox requirements ensures that these critical packages are available within the sandbox environment. This is essential for the agent's functionality, especially after the removal of the install_package utility.
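
Exact pins like these are easy to check mechanically. The sketch below parses `name[extras]==version` lines with a regular expression; it is a simplified illustration that handles only the exact-pin form seen in this PR, not the full requirement syntax, and the names `Pin` and `parse_pin` are hypothetical.

```python
import re
from typing import NamedTuple, Tuple


class Pin(NamedTuple):
    name: str
    extras: Tuple[str, ...]
    version: str


# Matches lines such as "agentscope[full]==1.0.11" or "qdrant-client==1.15.1".
_PIN_RE = re.compile(r"^([A-Za-z0-9._-]+)(?:\[([^\]]*)\])?==(\S+)$")


def parse_pin(line: str) -> Pin:
    """Split an exact-pin requirement line into name, extras, and version."""
    match = _PIN_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not an exact pin: {line!r}")
    name, extras, version = match.groups()
    extras_tuple = tuple(e.strip() for e in extras.split(",") if e.strip()) if extras else ()
    return Pin(name, extras_tuple, version)


if __name__ == "__main__":
    for line in ("agentscope[full]==1.0.11", "qdrant-client==1.15.1"):
        print(parse_pin(line))
```

A check like this could be paired with `importlib.metadata.version` to confirm that the sandbox image actually contains the pinned releases.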

2 participants