fix(providers): fix GitHub Copilot provider — broken session lifecycle and empty responses#553
Conversation
There was a problem hiding this comment.
Pull request overview
Fixes the GitHub Copilot LLM provider session lifecycle so it can successfully connect to the Copilot CLI, maintain a usable session, and return actual assistant content instead of empty responses.
Changes:
- Persist the Copilot client on the provider struct (instead of a short-lived local), and remove the constructor-time
defer client.Stop(). - Add a 15s startup timeout and proper error handling around
Start()/CreateSession(). - Switch from async
Send()to blockingSendAndWait()and return the resulting event content.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| connectCtx, connectCancel := context.WithTimeout(context.Background(), 15*time.Second) | ||
| defer connectCancel() | ||
| if err := client.Start(connectCtx); err != nil { | ||
| return nil, fmt.Errorf("can't connect to Github Copilot: %w", err) |
There was a problem hiding this comment.
The error message uses "Github" instead of the project’s consistent "GitHub" capitalization. Please update the string to "GitHub Copilot" (and keep wrapping the underlying error).
| return nil, fmt.Errorf("can't connect to Github Copilot: %w", err) | |
| return nil, fmt.Errorf("can't connect to GitHub Copilot: %w", err) |
| func (p *GitHubCopilotProvider) Close() { | ||
| if p.client != nil { | ||
| p.client.Stop() | ||
| } |
There was a problem hiding this comment.
Close() is added, but LLMProvider doesn’t include it and there are no call sites in the repo invoking (*GitHubCopilotProvider).Close(), so the Copilot client connection will never be stopped during normal operation. Consider making providers that own resources implement io.Closer (e.g., Close() error) and have the agent/provider factory/loop call it when shutting down.
| func (p *GitHubCopilotProvider) Close() { | |
| if p.client != nil { | |
| p.client.Stop() | |
| } | |
| func (p *GitHubCopilotProvider) Close() error { | |
| if p.client != nil { | |
| p.client.Stop() | |
| } | |
| return nil |
| fullcontent, _ := json.Marshal(out) | ||
|
|
||
| content, _ := p.session.Send(ctx, copilot.MessageOptions{ | ||
| event, err := p.session.SendAndWait(ctx, copilot.MessageOptions{ | ||
| Prompt: string(fullcontent), | ||
| }) |
There was a problem hiding this comment.
json.Marshal(out) errors are ignored. If marshaling fails, fullcontent will be nil and the prompt becomes an empty string, which can lead to confusing downstream behavior. Please handle the marshal error and return it to the caller.
| @@ -65,13 +81,20 @@ func (p *GitHubCopilotProvider) Chat(ctx context.Context, messages []Message, to | |||
|
|
|||
| fullcontent, _ := json.Marshal(out) | |||
|
|
|||
There was a problem hiding this comment.
Chat() calls p.session.SendAndWait(...) without verifying p.session is non-nil. Given the constructor can currently return session == nil for some modes, add a guard (and a clear error) before using the session to avoid a nil-pointer panic.
| if p.session == nil { | |
| return nil, fmt.Errorf("copilot session is not initialized") | |
| } |
| return nil, fmt.Errorf("copilot error: %w", err) | ||
| } | ||
|
|
||
| if event == nil || event.Data.Content == nil { |
There was a problem hiding this comment.
The nil-check can panic: event.Data may be nil even when event is non-nil, so event.Data.Content will dereference a nil pointer. Please check event == nil || event.Data == nil || event.Data.Content == nil (or equivalent) before accessing .Content.
| if event == nil || event.Data.Content == nil { | |
| if event == nil || event.Data == nil || event.Data.Content == nil { |
…ponse handling - Store client as struct field to prevent premature garbage collection - Remove defer client.Stop() from constructor which was killing the gRPC client immediately after NewGitHubCopilotProvider returned, causing all subsequent Chat() calls to silently fail - Add proper error handling for CreateSession (was ignoring error with _) - Replace session.Send() with session.SendAndWait(): Send() only returns a message UUID, not the response text; SendAndWait() blocks until the assistant is idle and returns the actual SessionEvent with content - Add Close() method for proper client lifecycle management - Add 15s connection timeout to fail fast if CLI is unreachable
a69774f to
9bf6199
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 8 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| case "stdio": | ||
| // todo | ||
| //todo |
There was a problem hiding this comment.
Comment formatting should follow Go conventions with "TODO:" (uppercase with colon) instead of "//todo". See pkg/providers/factory_test.go:292 and :296 for examples of the correct format.
| //todo | |
| // TODO: |
| } | ||
|
|
||
| func NewGitHubCopilotProvider(uri string, connectMode string, model string) (*GitHubCopilotProvider, error) { | ||
|
|
There was a problem hiding this comment.
Extra blank line added. The function should follow the same formatting pattern as other providers in the codebase (e.g., claude_cli_provider.go:18-24, codex_cli_provider.go) where there's no blank line after the function signature.
| } | ||
|
|
||
| func (p *GitHubCopilotProvider) GetDefaultModel() string { | ||
|
|
There was a problem hiding this comment.
Extra blank line added. This should follow the same pattern as other GetDefaultModel methods in the codebase (e.g., claude_cli_provider.go:62-65) where there's no blank line before the return statement.
| Content: content, | ||
| Content: *event.Data.Content, | ||
| }, nil | ||
|
|
There was a problem hiding this comment.
Extra blank line added after the return statement. This should follow the same pattern as other provider Chat methods (e.g., claude_cli_provider.go:59-60) where there's no blank line after the return statement.
| client.Stop() | ||
| return nil, fmt.Errorf("failed to create Copilot session: %w", err) | ||
| } | ||
|
|
There was a problem hiding this comment.
The switch statement has no default case to handle invalid connectMode values. If a user provides an unsupported connectMode (anything other than "stdio" or "grpc"), the function will silently return a provider with nil client and session, causing a nil pointer dereference later. A default case should return an error for unsupported modes.
| default: | |
| return nil, fmt.Errorf("unsupported connect mode %q", connectMode) |
| "fmt" | ||
| "time" | ||
|
|
||
| json "encoding/json" |
There was a problem hiding this comment.
The import alias json "encoding/json" is unconventional. Throughout the codebase, imports use standard naming without aliases. For example, in antigravity_provider.go, claude_cli_provider.go, and other providers, the import is simply "encoding/json" without an alias. This should follow the same convention for consistency.
| json "encoding/json" | |
| "encoding/json" |
| case "stdio": | ||
| // todo | ||
| //todo |
There was a problem hiding this comment.
When connectMode is "stdio", both session and client will be nil, but the constructor returns successfully. This means any subsequent call to Chat() will panic with a nil pointer dereference when calling p.session.SendAndWait(). The "stdio" case should either be implemented or explicitly return an error indicating it's not yet supported.
| func (p *GitHubCopilotProvider) Close() { | ||
| if p.client != nil { | ||
| p.client.Stop() | ||
| } | ||
| } | ||
|
|
There was a problem hiding this comment.
The Close() method is added but never called anywhere in the codebase. The LLMProvider interface (pkg/providers/types.go:22-31) does not include a Close() method, so this won't be invoked automatically. Providers are created once at application startup in cmd/picoclaw/cmd_agent.go:62 and cmd/picoclaw/cmd_gateway.go:47 and are never cleaned up. This means the gRPC client connection will remain open until the process terminates. Either the Close() method should be integrated into the application lifecycle (e.g., with a shutdown handler), or it can be removed if it's not needed given that the provider lives for the entire application lifetime.
| func (p *GitHubCopilotProvider) Close() { | |
| if p.client != nil { | |
| p.client.Stop() | |
| } | |
| } |
- Remove json import alias (use standard encoding/json) - Fix capitalization: 'Github' -> 'GitHub' in error message - Fix //todo -> // TODO: comment formatting - stdio case now returns explicit error (not yet implemented) - Add default case returning error for unsupported connect modes - Close() now returns error to satisfy io.Closer interface - Add nil guard for p.session in Chat() - Handle json.Marshal error instead of ignoring it - Remove extra blank lines for consistency with codebase style
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| type GitHubCopilotProvider struct { | ||
| uri string | ||
| connectMode string // `stdio` or `grpc`` | ||
| connectMode string // `stdio` or `grpc` | ||
|
|
||
| client *copilot.Client | ||
| session *copilot.Session | ||
| } |
There was a problem hiding this comment.
The PR description claims to add a Close() method for proper client lifecycle management, but this method is not present in the implementation. Without a Close() method, the gRPC client connection will remain open for the lifetime of the provider instance, creating a resource leak. While this may be acceptable if provider instances are long-lived and reused, it's inconsistent with the claimed changes in the PR description.
| ) | ||
| connectCtx, connectCancel := context.WithTimeout(context.Background(), 15*time.Second) | ||
| defer connectCancel() | ||
| if err := client.Start(connectCtx); err != nil { |
There was a problem hiding this comment.
When client.Start() fails, the client is not properly cleaned up before returning the error. While it may not be strictly necessary to call client.Stop() on a client that failed to start, for consistency and to ensure proper cleanup of any partially initialized resources, consider calling client.Stop() in the error path at line 37.
| if err := client.Start(connectCtx); err != nil { | |
| if err := client.Start(connectCtx); err != nil { | |
| client.Stop() |
| var err error | ||
| session, err = client.CreateSession(connectCtx, &copilot.SessionConfig{ |
There was a problem hiding this comment.
The same context (connectCtx) with a 15-second timeout is reused for both client.Start() and client.CreateSession(). If client.Start() takes close to 15 seconds, CreateSession() may fail immediately due to timeout. Consider using separate contexts with appropriate timeouts for each operation, or at minimum using context.Background() for CreateSession after Start succeeds.
| var err error | |
| session, err = client.CreateSession(connectCtx, &copilot.SessionConfig{ | |
| sessionCtx, sessionCancel := context.WithTimeout(context.Background(), 15*time.Second) | |
| defer sessionCancel() | |
| var err error | |
| session, err = client.CreateSession(sessionCtx, &copilot.SessionConfig{ |
🐛 Bug Fixes
The GitHub Copilot provider had two critical bugs causing it to always return empty responses.
Bug 1:
defer client.Stop()kills the session immediatelyNewGitHubCopilotProvidercalleddefer client.Stop()inside the constructor — so by the time anyChat()call was made, the gRPC connection was already torn down. Every message silently failed.Bug 2:
session.Send()returns a UUID, not response textSend()is async and only returns a message ID. The actual response text comes via events.SendAndWait()must be used instead — it blocks until the assistant is idle and returns the finalSessionEventcontaining.Data.Content.Changes
defer client.Stop()from constructorclientas a struct field (was a local variable with no cleanup path)session.Send()withsession.SendAndWait()to get actual response textCreateSession(was silently ignoring error with_)Close()method for proper client lifecycle managementType of Change
AI Code Generation
Test Environment