Conversation

@CodeBy-HP

Hi! I've really been enjoying the Sidekick project. While working through it, I noticed that recent updates to Gradio (v6.0) and LangGraph (v0.2) introduced some deprecation warnings and syntax changes.

I've updated the codebase to align with these newer standards to ensure it runs smoothly for future students.

Summary of Changes:

  • Gradio 6.0: Removed deprecated parameters (like show_copy_button) and moved CSS to launch().
  • Async/Await: Switched to native async patterns to avoid event loop conflicts.
  • LangGraph: Updated node signatures to accept RunnableConfig (required for v0.2+) and switched to .ainvoke.
  • Cleanup: Improved resource cleanup logic for the browser/tools.
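
The async/await change is the one most likely to trip up students, so here is a minimal stand-in sketch (not the project's actual handlers) of the event-loop conflict it avoids:

```python
import asyncio

async def do_work(message: str) -> str:
    """Stand-in for real async work (LLM call, browser automation)."""
    await asyncio.sleep(0)
    return message.upper()

# Before (hypothetical): a sync handler that spins up its own event loop.
# Calling asyncio.run() while a loop is already running (as inside an
# async web framework) raises RuntimeError.
def handle_sync(message: str) -> str:
    return asyncio.run(do_work(message))

# After: a native async handler that the framework awaits directly,
# so no second event loop is ever created.
async def handle_async(message: str) -> str:
    return await do_work(message)

print(asyncio.run(handle_async("done")))  # DONE
```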

Tested and running successfully! Hope this is helpful.

User Interface

[Screenshot: Sidekick user interface, 2026-01-12]

Copilot AI review requested due to automatic review settings January 12, 2026 08:20

Copilot AI left a comment

Pull request overview

This PR upgrades the Sidekick project to be compatible with Gradio 6.0 and LangGraph v0.2, implementing modern async/await patterns and improving resource management. The changes ensure the codebase runs smoothly with recent library updates while maintaining the core functionality of an AI agent that completes tasks with evaluation feedback.

Changes:

  • Migrated to Gradio 6.0 with updated message formats and CSS handling in launch()
  • Updated LangGraph nodes to accept RunnableConfig parameters and use ainvoke() for v0.2 compatibility
  • Implemented native async/await patterns throughout for better event loop handling
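
For reference, the "messages" history format used by Gradio's `gr.Chatbot(type="messages")`, which the updated handlers pass around, is a list of OpenAI-style role/content dicts; a small sketch (example content is illustrative):

```python
# OpenAI-style message dicts, as consumed by gr.Chatbot(type="messages")
history = [
    {"role": "user", "content": "Find the latest Python release notes"},
    {"role": "assistant", "content": "Here is a summary of the release notes..."},
]

def append_turn(history: list, user_msg: str, assistant_msg: str) -> list:
    """Return a new history list with one user/assistant exchange appended."""
    return history + [
        {"role": "user", "content": user_msg},
        {"role": "assistant", "content": assistant_msg},
    ]

print(len(append_turn(history, "hi", "hello")))  # 4
```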

Reviewed changes

Copilot reviewed 8 out of 9 changed files in this pull request and generated 14 comments.

Summary per file:

  • sidekick_tools.py: Tool integrations with async Playwright initialization, web search, file management, Wikipedia, Python REPL, and push notifications
  • sidekick.py: Core LangGraph agent with async worker/evaluator nodes, RunnableConfig support, and resource cleanup
  • app.py: Gradio 6.0 UI with async event handlers and proper message formatting
  • pyproject.toml: Updated dependencies including Gradio 6.3.0+ and LangGraph 1.0.5+
  • README.md: Documentation covering setup, features, and tech stack
  • .env.example: Environment variable template for Azure OpenAI, Serper, and Pushover APIs
  • .gitignore: Standard Python gitignore with Playwright additions
  • .python-version: Python 3.12 version specification

version = "0.1.0"
description = "Add your description here"
readme = "README.md"
requires-python = ">=3.12"

Copilot AI Jan 12, 2026

The pyproject.toml specifies 'requires-python = ">=3.12"' but the README Tech Stack section mentions 'Python 3.10+ asyncio'. This creates an inconsistency about the minimum Python version. Ensure all documentation and configuration files specify Python 3.12 as the minimum version.

Comment on lines +122 to +136
# Update or add system message
messages = state["messages"]
found_system_message = False

for message in messages:
    if isinstance(message, SystemMessage):
        message.content = system_message
        found_system_message = True
        break

if not found_system_message:
    messages = [SystemMessage(content=system_message)] + messages

# Invoke the LLM with config for proper async context propagation
response = await self.worker_llm_with_tools.ainvoke(messages, config)

Copilot AI Jan 12, 2026

Mutating messages in the state directly can lead to unexpected behavior. Instead of modifying the existing SystemMessage object in place, consider creating a new messages list with the updated SystemMessage to ensure proper state management and avoid side effects.

Suggested change
Before:

# Update or add system message
messages = state["messages"]
found_system_message = False

for message in messages:
    if isinstance(message, SystemMessage):
        message.content = system_message
        found_system_message = True
        break

if not found_system_message:
    messages = [SystemMessage(content=system_message)] + messages

# Invoke the LLM with config for proper async context propagation
response = await self.worker_llm_with_tools.ainvoke(messages, config)

After:

# Update or add system message without mutating existing state messages
original_messages = state["messages"]
found_system_message = False
new_messages: List[Any] = []

for message in original_messages:
    if not found_system_message and isinstance(message, SystemMessage):
        # Replace the first SystemMessage with an updated one
        new_messages.append(SystemMessage(content=system_message))
        found_system_message = True
    else:
        new_messages.append(message)

if not found_system_message:
    # Prepend a new SystemMessage if none existed
    new_messages = [SystemMessage(content=system_message)] + new_messages

# Invoke the LLM with config for proper async context propagation
response = await self.worker_llm_with_tools.ainvoke(new_messages, config)

Comment on lines +281 to +288
print("Cleaning up Sidekick resources...")
try:
    if self.browser:
        await self.browser.close()
    if self.playwright:
        await self.playwright.stop()
except Exception as e:
    print(f"Exception during cleanup: {e}")

Copilot AI Jan 12, 2026

The cleanup method uses a generic print statement instead of the configured logger. For consistency with the rest of the codebase which uses the logging module, consider using logger.info() and logger.error() instead of print statements.
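
A minimal sketch of that suggestion (names taken from the snippet above; the logger name is an assumption):

```python
import logging

logger = logging.getLogger("sidekick")

async def cleanup(browser=None, playwright=None):
    """Cleanup logic from the snippet above, using logging instead of print."""
    logger.info("Cleaning up Sidekick resources...")
    try:
        if browser:
            await browser.close()
        if playwright:
            await playwright.stop()
    except Exception as e:
        logger.error("Exception during cleanup: %s", e)
```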

Comment on lines +9 to +49
    print("Setting up Sidekick...")
    sidekick = Sidekick()
    await sidekick.setup()
    return sidekick


async def process_message(
    message: str,
    success_criteria: str,
    history: Optional[list],
    sidekick: Sidekick
) -> tuple:
    """
    Process user message and return updated chat history.
    Returns: (updated_history, cleared_message_box)
    """
    if not message or not success_criteria:
        return history or [], ""

    updated_history = await sidekick.run_superstep(
        message, success_criteria, history or []
    )
    return updated_history, ""  # Clear the message box after submission


async def reset_conversation(sidekick: Optional[Sidekick]):
    """Reset the conversation and create a new Sidekick instance"""
    if sidekick:
        await sidekick.cleanup()
    new_sidekick = await setup_sidekick()
    return [], "", "", new_sidekick  # chatbot, message, success_criteria, sidekick


async def free_resources(sidekick: Optional[Sidekick]) -> None:
    """Cleanup callback when app closes"""
    if sidekick:
        try:
            await sidekick.cleanup()
            print("Sidekick resources freed")
        except Exception as e:
            print(f"Error during cleanup: {e}")

Copilot AI Jan 12, 2026

The print statements in the app.py file are inconsistent with the logging approach used in sidekick.py and sidekick_tools.py. Consider importing and using the logging module for consistent logging throughout the application.

Comment on lines +163 to +209
async def evaluator(self, state: State, config: RunnableConfig) -> Dict[str, Any]:
    """
    Evaluator node: determines if success criteria met and if more input needed.
    """
    last_response = state["messages"][-1].content

    system_message = """You are an evaluator that assesses task completion by an Assistant.
Determine if the task meets the success criteria based on the assistant's final response.
Also assess whether the user needs to provide more input, clarification, or if the assistant is stuck."""

    user_message = f"""Evaluate this conversation:

{self.format_conversation(state["messages"])}

Task Success Criteria:
{state["success_criteria"]}

Assistant's Final Response:
{last_response}

Decide:
1. Is the success criteria met?
2. Does the user need to provide more input (clarification, stuck, etc.)?
3. Provide constructive feedback.

Note: If the assistant says they wrote a file, assume they did. Give the benefit of the doubt but reject if more work is needed."""

    if state["feedback_on_work"]:
        user_message += f"""

Previous Feedback Given:
{state["feedback_on_work"]}

If the assistant is repeating the same mistakes, mark that user input is required."""

    evaluator_messages = [
        SystemMessage(content=system_message),
        HumanMessage(content=user_message),
    ]

    eval_result = await self.evaluator_llm_with_output.ainvoke(evaluator_messages, config)

    return {
        "feedback_on_work": eval_result.feedback,
        "success_criteria_met": eval_result.success_criteria_met,
        "user_input_needed": eval_result.user_input_needed,
    }

Copilot AI Jan 12, 2026

The evaluator node accepts a RunnableConfig parameter whose only use is being forwarded to ainvoke(). That forwarding is correct, but consider documenting why the parameter is required for LangGraph v0.2 compatibility to help future maintainers understand its purpose.
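
One way to address this would be a docstring along these lines (a hypothetical sketch, not code from the PR; the rationale restates the PR description):

```python
from typing import Any, Dict

async def evaluator(self, state: Dict[str, Any], config) -> Dict[str, Any]:
    """Evaluator node.

    Args:
        state: Current graph state.
        config: RunnableConfig injected by LangGraph. Required in the node
            signature for v0.2+ and forwarded to `.ainvoke(messages, config)`
            so callbacks, tracing, and the async context propagate correctly.
    """
    ...
```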

Comment on lines +72 to +73
logger.error(f"Error initializing file tools: {str(e)}")
return []

Copilot AI Jan 12, 2026

The function returns an empty list if an exception occurs during FileManagementToolkit initialization, but this failure is silently ignored by the caller. This could lead to the agent running without file management capabilities without the user being aware. Consider propagating the exception or logging a warning that's more visible to users.

Suggested change
Before:

logger.error(f"Error initializing file tools: {str(e)}")
return []

After:

# Log the error with full traceback and propagate so callers are aware
logger.error(f"Error initializing file tools: {str(e)}", exc_info=True)
raise

Comment on lines +264 to +268
worker_response = (
    result["messages"][-2].content if len(result["messages"]) > 1 else ""
)
evaluator_feedback = result["feedback_on_work"] or "No feedback"


Copilot AI Jan 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The worker_response extraction logic assumes the second-to-last message contains the worker's response, but this may not always be accurate depending on the conversation flow. If there's only one message or if the message structure is different than expected, this could extract the wrong content or fail. Consider adding validation to ensure the message exists and is of the expected type.

Suggested change
Before:

worker_response = (
    result["messages"][-2].content if len(result["messages"]) > 1 else ""
)
evaluator_feedback = result["feedback_on_work"] or "No feedback"

After:

messages = result.get("messages") or []
worker_response = ""
if isinstance(messages, list) and messages:
    # Prefer the most recent AIMessage as the worker response
    for msg in reversed(messages):
        if isinstance(msg, AIMessage):
            worker_response = getattr(msg, "content", "") or ""
            break
    # Fallback: use the last message's content if no AIMessage was found
    if not worker_response:
        last_msg = messages[-1]
        if hasattr(last_msg, "content"):
            worker_response = last_msg.content or ""
        elif isinstance(last_msg, dict) and "content" in last_msg:
            worker_response = last_msg.get("content") or ""
evaluator_feedback = result.get("feedback_on_work") or "No feedback"

def get_file_tools() -> List[Any]:
    """Get file management tools (read, write, list files)"""
    try:
        toolkit = FileManagementToolkit(root_dir="sandbox")

Copilot AI Jan 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 'sandbox' directory used by FileManagementToolkit is hardcoded but not created if it doesn't exist. This could cause the file tools to fail on first run. Consider either creating the directory during initialization or documenting that users need to create it manually.

Suggested change
Before:

toolkit = FileManagementToolkit(root_dir="sandbox")

After:

sandbox_dir = "sandbox"
os.makedirs(sandbox_dir, exist_ok=True)  # note: requires `import os` at the top of the file
toolkit = FileManagementToolkit(root_dir=sandbox_dir)

@@ -0,0 +1,129 @@
import gradio as gr
from sidekick import Sidekick
import asyncio

Copilot AI Jan 12, 2026

Import of 'asyncio' is not used.

Suggested change (delete this line):

import asyncio

from pydantic import BaseModel, Field
from sidekick_tools import playwright_tools, other_tools
import uuid
import asyncio

Copilot AI Jan 12, 2026

Import of 'asyncio' is not used.

Suggested change (delete this line):

import asyncio

@ed-donner
Owner

Thanks so much. Would you be able to remove the uv.lock, and remove anything not needed, to reduce the amount of review work needed? As it stands, this is 3,500 lines and I won't have capacity to review. Thanks so much!

- Remove uv.lock (users generate their own)
- Remove .python-version (local-only tool)
- Simplify .gitignore to essential entries
- Simplify .env.example for clarity
- Update README with pip-first installation
- Update pyproject.toml with proper project name
@CodeBy-HP
Author

Hi! 👋
Thanks for pointing that out, and apologies for the extra noise earlier. I’ve removed uv.lock and cleaned up anything not needed to significantly reduce the diff. The PR should be much smaller and easier to review now. Thanks for your patience!
