-
Notifications
You must be signed in to change notification settings - Fork 1.3k
Conversation
type SymbolUsagesFilter { | ||
# TODO: There needs to be a way to say 'current repo only' or any repos. | ||
# Allow file filter | ||
# Allow kind filter | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: Document connections with Chromium code search, IntelliJ, and Linear's filtering API.
10aaae0
to
e50bb3b
Compare
""" | ||
Either one of 'symbol' or 'range' must be provided, but this isn't enforced at the GraphQL | ||
layer due to the lack of support for input unions. | ||
https://github.com/graphql/graphql-wg/blob/main/rfcs/InputUnion.md | ||
""" | ||
usages( | ||
symbol: LookupSCIPSymbol, | ||
range: LookupRange, | ||
filter: UsagesFilter, | ||
first: Int, | ||
after: String, | ||
): UsageConnection! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Anton's Q: Why not split this into RangeUsageConnection and SymbolUsageConnection? This will cause more complexity in the outputs because of input dependence.
Varun's A: The output needs to have some complexity anyways because of needing to support search-based/syntactic/precise and some information will not be present depending on the data source. (Anton's idea: In GraphQL, you'd typically use union types for results so that caller can check the type and cast.) Also, depending on whether the symbol is file-local, repo-local or cross-repo, we do actually need different kinds of range information.
3618cfe
to
1c05c11
Compare
@@ -47,6 +47,64 @@ extend type GitBlob { | |||
Experimental: This API is likely to change in the future. | |||
""" | |||
symbolInfo(line: Int!, character: Int!): SymbolInfo | |||
|
|||
codeGraphData: CodeGraphData |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we pass in a ProvenanceComparator
as a parameter here? Should we pass anything else?
""" | ||
Invariants: | ||
1. forall sym in symbols. | ||
(sym.provenance, sym.dataSource) == (scipDocument.provenance, scipDocument.dataSource) | ||
""" | ||
type CodeGraphData { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This invariant is a bit annoying; should we remove the duplication the (provenance, dataSource)
pair of fields? Right now, we have Provenance on SymbolInformation
so that you can check it after invoking usagesBySymbol
|
||
input OccurrencesFilter { | ||
rangeFilter: RangeFilter | ||
rolesFilter: RolesFilter |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: Add documentation here describing how this can be used for the symbols side bar so that the symbols sidebar is consistent with the navigation in the file. E.g. you can have a precise symbols sidebar if you have precise code intel instead of having to use type:symbol
search in the file.
line: Int! | ||
line: Int | ||
|
||
""" | ||
The character (not byte) of the start line on which the symbol occurs (zero-based, inclusive). | ||
""" | ||
character: Int! | ||
character: Int |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: Revert these changes, we don't care about this API anymore
input RepositoryFilter { | ||
name: StringComparator! | ||
} | ||
|
||
input PathFilter { | ||
name: StringComparator! | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: Should we remove the !
here?
type UsageConnection { | ||
nodes: [Usage!]! | ||
pageInfo: PageInfo! | ||
} | ||
|
||
type UsageRange { | ||
repo: String! | ||
revision: String! | ||
range: Range! | ||
} | ||
|
||
type Usage { | ||
symbol: SymbolInformation! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: Think through the non-null cases more carefully, especially for UsageConnection
1c05c11
to
d5b2438
Compare
d5b2438
to
1949594
Compare
manually generating bindings using | ||
https://github.com/sourcegraph/scip/blob/main/scip.proto | ||
""" | ||
data: String! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder: would it be more natural to use proto's canonical JSON encoding for a GraphQL API?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the downside is then there is temptation to not use the SCIP bindings and just use the deserialized JSON object directly. Also it doesn't look like the bindings support JSON serialization. So nevermind, ignore me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- We don't want to update our GraphQL API for every change to SCIP, because most of our indexers may not adopt the new features right away. Some features also add a significant amount of complexity (e.g. this one being proposed which exposes full type structures Add support for
SymbolInformation.signature
scip#231), and I don't think we should expose that in our GraphQL API. Exposing the raw SCIP allows a customer to upload updated indexes using the new SCIP feature and implement custom functionality. - Protobuf allows a client to parse the data in a streaming manner, which can allow for optimizations like interning-while-parsing (e.g. for symbol names), as well as just skipping needless input, without allocating a lot of memory.
Some of the important Document data is already exposed by the symbols
and occurrences
fields in the API today. So 'light' clients can make use of it directly. 'Heavy' clients can take on a bit more complexity by parsing the Protobuf for full access to all the details.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, makes sense. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By shoving this into a string field, you lose the ability to do a subselect, which kind of defeats GraphQL. That said, if we really need a proto representation here, I don't think that is blocking.
However, from an APIs perspective, it feels a little cleaner to return JSON.
If having to support streaming decoding here, I think I have another concern:
Is this gonna be a huge string sent over? Base64 encoding adds another bit of overhead to the size of the proto, and you need to unmarshal the whole JSON GQL response to then grab this string to then streaming decode it.
Our backend has to build up the whole response in memory, and the clients load it into memory entirely. Is that a concern?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The thing here is that we want clients to use the SCIP proto-bindings for interacting with this field
manually generating bindings using | ||
https://github.com/sourcegraph/scip/blob/main/scip.proto | ||
""" | ||
data: String! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By shoving this into a string field, you lose the ability to do a subselect, which kind of defeats GraphQL. That said, if we really need a proto representation here, I don't think that is blocking.
However, from an APIs perspective, it feels a little cleaner to return JSON.
If having to support streaming decoding here, I think I have another concern:
Is this gonna be a huge string sent over? Base64 encoding adds another bit of overhead to the size of the proto, and you need to unmarshal the whole JSON GQL response to then grab this string to then streaming decode it.
Our backend has to build up the whole response in memory, and the clients load it into memory entirely. Is that a concern?
or linesAfter is negative or exceeds the number of available lines, | ||
the value is interpreted as until the start/end of the file. | ||
""" | ||
surroundingBlobContent(surroundingLines: SurroundingLines = {linesBefore: 0, linesAfter: 0}): String! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
0
meaning "no value given" is a very go-ey thing, I think you don't need a default value here, since surroundingLines: SurroundingLines
already indicates nullable and the backend can then decide how to represent that in the given programming language
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here, 0
doesn't mean "no value given", it literally means 0
because that makes sense as a default -- only the single line from the blob will be returned in the default case. I wrote the 0 explicitly here as it serves as documentation for an API client for what the behavior will be if they omit this parameter. Another potential default would be INT32_MAX
(i.e. get the full blob contents), but we're deliberately picking 0
here, not INT32_MAX
.
Thanks everyone for the feedback and questions @eseliger, @mmanela, @camdencheek and @keynmol. I think I've addressed most of other people's comments. Overall, API shape-wise, I think we're close to where we want to be. The main TODO left is the API for the hover doc. If we think about it broadly, I think we really want to think about "get me information about this symbol" (or if a symbol is not available, such as in the case of search-based, then a range), which may include:
The first field is already part of I'm going to close this PR for now. I will tag people individually for subsequent PRs on an as-needed basis, especially if there is a big divergence from what has been discussed here. |
Part of #62148
TODO:
Customizable grouping?Let client handle this for now.Test plan
n/a