-
Notifications
You must be signed in to change notification settings - Fork 140
CBG-4345 don't panic if role documents can't be read #7816
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Refactor tests in prep for CBG-4345 which changes some return values. - create helper functions - replace assert.True(t, x == y) with assert.Equal(t, x, y) - Create error types to assert on
Co-authored-by: Copilot <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR addresses panic issues when role documents cannot be read by converting functions that previously panicked into error-returning functions. The changes ensure graceful error handling throughout the authentication and channel access system, particularly when role documents are missing or corrupted.
Key changes:
- Modified authentication functions to return errors instead of panicking when role documents can't be read
- Updated all callers to handle the new error returns
- Added proper error logging and user-friendly error responses
Reviewed Changes
Copilot reviewed 22 out of 22 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| auth/principal.go | Updated interface definitions to return errors for channel access methods |
| auth/user.go | Modified user methods to return errors when role loading fails instead of panicking |
| auth/role.go | Updated role methods to return errors for channel access operations |
| auth/user_collection_access.go | Added error handling to collection-specific channel access methods |
| auth/role_collection_access.go | Updated collection role methods to return errors |
| auth/collection_access.go | Modified interface definitions for collection access to include error returns |
| db/crud.go | Updated MakeUserCtx to handle errors from user channel operations |
| db/changes.go | Added error handling throughout changes feed processing |
| db/design_doc.go | Updated view filtering to handle channel access errors |
| db/functions/function.go | Added error handling for channel authorization in functions |
| db/functions/js_function.go | Updated JavaScript function execution to handle user context errors |
| rest/handler.go | Modified audit logging to handle role retrieval errors gracefully |
| rest/bulk_api.go | Updated bulk operations to handle channel access errors properly |
| rest/admin_api.go | Added error handling to principal marshaling |
| rest/diagnostic_api.go | Updated diagnostic API to handle channel access errors |
| rest/user_api_test.go | Updated tests to handle new error returns |
| rest/role_api_test.go | Modified tests to check for errors in channel operations |
| db/util_testing.go | Added error handling to test utilities |
| db/database_test.go | Updated tests to handle new error signatures |
| auth/user_test.go | Modified tests to check for errors in user operations |
| auth/auth_test.go | Updated authentication tests to handle new error returns |
| auth/collection_access_test.go | Added error checking to collection access tests |
| if availableChannels == nil { | ||
| // TODO: CBG-1948 | ||
| panic("no channels for user?") | ||
| base.AssertfCtx(h.ctx(), "User %q has no channels in handleAllDocs", base.UD(h.user.Name())) |
Copilot
AI
Oct 10, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The panic replacement uses base.AssertfCtx followed by returning an error. This is inconsistent - either use the assertion (which may panic in debug builds) or return the error, but not both. Consider removing the assertion and only returning the HTTP error.
| base.AssertfCtx(h.ctx(), "User %q has no channels in handleAllDocs", base.UD(h.user.Name())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do want to assert here, and we still return the same error as a panic but with a better message.
a45805d to
0b831e5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 24 out of 24 changed files in this pull request and generated 2 comments.
| visibleChannels, err = user.InheritedCollectionChannels(base.DefaultScope, base.DefaultCollection) | ||
| hasStarChannel = !visibleChannels.Contains("*") | ||
| if !applyChannelFiltering { | ||
| return // this is an error |
Copilot
AI
Oct 27, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment 'this is an error' is misleading since the function now returns an error value. The early return here should either return an error or the comment should be updated to clarify the intent. Consider returning a proper error if this is truly an error condition.
| return // this is an error | |
| // Channel filtering was expected but not applied; returning error | |
| return result, errors.New("channel filtering not applied when expected") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leaving existing code as is.
|
|
||
| // Subroutine that creates a response row for a document: | ||
| output := make(chan *ChangeEntry, len(explicitDocIds)) | ||
| defer close(output) |
Copilot
AI
Oct 27, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moving the defer close(output) to the top of the function is incorrect. The channel was previously closed after the loop that populates it completes. If an error occurs during iteration and causes an early return, this defer will close the channel before all values have been sent, potentially causing data loss or goroutine issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code uses this channel to signal that this is done, if the channel isn't closed an early return is needed, then this was a latent bug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks fine. Generally I don't like including the text "Error" in warning or info logging, made a few suggestions to avoid that, and a few other error handling comments.
| // base.InfofCtx(user.auth.LogCtx, base.KeyAccess, "User %s role %q = %v", base.UD(user.Name_), base.UD(name), base.UD(role)) | ||
| if err != nil { | ||
| panic(fmt.Sprintf("Error getting user role %q: %v", name, err)) | ||
| return nil, fmt.Errorf("Error getting user role %q: %w", name, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since it's going in logs, I think we want base.UD(name).
| previousChannels = col.user.InheritedCollectionChannels(col.ScopeName, col.Name) | ||
| previousChannels, err = col.user.InheritedCollectionChannels(col.ScopeName, col.Name) | ||
| if err != nil { | ||
| base.WarnfCtx(ctx, "Error getting previous channels for user %q: %v", base.UD(col.user.Name()), err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| base.WarnfCtx(ctx, "Error getting previous channels for user %q: %v", base.UD(col.user.Name()), err) | |
| base.WarnfCtx(ctx, "Unable to retrieve inherited channels for user %q: %v", base.UD(col.user.Name()), err) |
| changedChannels = col.user.InheritedCollectionChannels(col.ScopeName, col.Name).CompareKeys(previousChannels) | ||
| channels, err := col.user.InheritedCollectionChannels(col.ScopeName, col.Name) | ||
| if err != nil { | ||
| base.WarnfCtx(ctx, "Error getting channels for user %q: %v", base.UD(col.user.Name()), err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| base.WarnfCtx(ctx, "Error getting channels for user %q: %v", base.UD(col.user.Name()), err) | |
| base.WarnfCtx(ctx, "Unable to retrieve inherited channels for user %q: %v", base.UD(col.user.Name()), err) |
| var err error | ||
| channelsSince, channelsRemoved, err = col.user.FilterToAvailableCollectionChannels(col.ScopeName, col.Name, chans) | ||
| if err != nil { | ||
| base.WarnfCtx(ctx, "Error filtering to available channels for user %q: %v", base.UD(col.user.Name()), err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| base.WarnfCtx(ctx, "Error filtering to available channels for user %q: %v", base.UD(col.user.Name()), err) | |
| base.WarnfCtx(ctx, "Unable to filter to available channels for user %q: %v", base.UD(col.user.Name()), err) |
| newChannelsSince, _ := col.user.FilterToAvailableCollectionChannels(col.ScopeName, col.Name, chans) | ||
| newChannelsSince, _, err := col.user.FilterToAvailableCollectionChannels(col.ScopeName, col.Name, chans) | ||
| if err != nil { | ||
| base.WarnfCtx(ctx, "Error filtering to available channels for user %q: %v", base.UD(col.user.Name()), err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| base.WarnfCtx(ctx, "Error filtering to available channels for user %q: %v", base.UD(col.user.Name()), err) | |
| base.WarnfCtx(ctx, "Unable to filter to available channels for user %q: %v", base.UD(col.user.Name()), err) |
| if db.user.CanSeeCollectionChannel(db.ScopeName, db.Name, channel) && (removal == nil || removal.Seq > options.Since.Seq) { | ||
| canSee, err := db.user.CanSeeCollectionChannel(db.ScopeName, db.Name, channel) | ||
| if err != nil { | ||
| return nil, err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In other places we're wrapping the underlying error with additional context. I think that makes sense to help identify exactly where the failure occurred during changes processing - any reason to not do so here as well?
| currentChannels := user.InheritedCollectionChannels(dsName.ScopeName(), dsName.CollectionName()) | ||
| currentChannels, err := user.InheritedCollectionChannels(dsName.ScopeName(), dsName.CollectionName()) | ||
| if err != nil { | ||
| return nil, fmt.Errorf("error getting user channels: %w", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| return nil, fmt.Errorf("error getting user channels: %w", err) | |
| return nil, fmt.Errorf("error retrieving inherited user channels: %w", err) |
| } | ||
| roleNames, err := getSGUserRolesForAudit(dbCtx, h.user) | ||
| if err != nil { | ||
| base.InfofCtx(h.ctx(), base.KeyHTTP, "Error getting user roles for audit logging: %v", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| base.InfofCtx(h.ctx(), base.KeyHTTP, "Error getting user roles for audit logging: %v", err) | |
| base.InfofCtx(h.ctx(), base.KeyHTTP, "Unable to retrieve user roles for audit logging, will be omitted from audit entry: %v", err) |
CBG-4345 don't panic if role documents can't be read
If a role can't be read, we want to end the connection because we don't know the right role information to use to use. In the case of BLIP, we should exit with a 503 to indicate to the app to retry. At the point we have a
auth.Userobject, we have authenticated and that's probably OK, but there could be an error reading the role document (SDK error, etc).I changed the handler code to exit with the auth error if we can't read the role document from the user. This is a behavioral change, but I think is the correct one to do, we don't know information about the user. An alternative would be to log an info or error. This is not that dissimilar to the behavior we have now (GetRole would panic, the HTTP handler would exit 500).
This code will need careful review to make sure that the functions and goroutines will terminate correctly.
Pre-review checklist
fmt.Print,log.Print, ...)base.UD(docID),base.MD(dbName))docs/apiIntegration Tests
GSI=true,xattrs=truehttps://jenkins.sgwdev.com/job/SyncGatewayIntegration/148/