Redact query string values #13114

Open
wants to merge 3 commits into
base: main
from

Conversation

jeanbisutti
Member

@jeanbisutti jeanbisutti commented Jan 27, 2025

@jeanbisutti jeanbisutti requested a review from a team as a code owner January 27, 2025 14:02
@jeanbisutti jeanbisutti marked this pull request as draft January 27, 2025 14:02
@jeanbisutti jeanbisutti force-pushed the redact-query-string-values branch 2 times, most recently from aba2207 to 608bc78 Compare January 27, 2025 14:43
arguments("https://github.com#[email protected]", "https://github.com#[email protected]"),
arguments("user1:[email protected]", "user1:[email protected]"),
arguments("https://github.com@", "https://github.com@"),
arguments(
Member Author

This test was (probably inadvertently) removed in #9925.

The test is added again with new test cases.

@jeanbisutti jeanbisutti marked this pull request as ready for review January 27, 2025 15:18
@jeanbisutti jeanbisutti force-pushed the redact-query-string-values branch from 608bc78 to d8d8dc3 Compare January 27, 2025 16:45
@jeanbisutti jeanbisutti force-pushed the redact-query-string-values branch from d8d8dc3 to c95b161 Compare January 27, 2025 16:57
jeanbisutti and others added 2 commits January 29, 2025 14:08
…tion/api/semconv/http/HttpClientAttributesExtractor.java

Co-authored-by: Steve Rao <[email protected]>
Comment on lines +75 to +77
redactSensitiveParameters =
Boolean.getBoolean(
"otel.instrumentation.http.client.experimental.redact-sensitive-parameters");
Contributor

This does not align with our usual way of handling configuration options. It only supports system properties while we usually also accept environment variables. Perhaps it would be better to handle it the same way as other http client experimental options in

so that it could also be used from Spring Boot configuration. Since the HttpClientAttributesExtractor and its builder are stable, introducing experimental options will require some trickery with internal classes.
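
To illustrate the gap described in this comment: `Boolean.getBoolean` only consults JVM system properties, so an environment variable is never seen. The sketch below is plain Java and not the project's configuration API. `ExperimentalFlagLookup`, `toEnvVarName`, and `getBoolean` are hypothetical names, and the name mapping (upper-case, with `.` and `-` replaced by `_`) is assumed to follow the usual OpenTelemetry convention for environment variables.

```java
import java.util.Locale;
import java.util.function.Function;

// Hypothetical helper, for illustration only; not part of the PR or the project API.
final class ExperimentalFlagLookup {

  // Assumed convention: "otel.foo.bar-baz" maps to "OTEL_FOO_BAR_BAZ".
  static String toEnvVarName(String propertyName) {
    return propertyName.toUpperCase(Locale.ROOT).replace('.', '_').replace('-', '_');
  }

  // Checks the system property first, then falls back to the environment variable.
  // The two lookups are passed in so the logic can be exercised without touching the JVM state.
  static boolean getBoolean(
      String propertyName,
      Function<String, String> systemProperties,
      Function<String, String> environment) {
    String value = systemProperties.apply(propertyName);
    if (value == null) {
      value = environment.apply(toEnvVarName(propertyName));
    }
    // parseBoolean(null) is false, so an unset flag stays disabled.
    return Boolean.parseBoolean(value);
  }

  public static void main(String[] args) {
    boolean redact =
        getBoolean(
            "otel.instrumentation.http.client.experimental.redact-sensitive-parameters",
            System::getProperty,
            System::getenv);
    System.out.println("redactSensitiveParameters = " + redact);
  }
}
```

A lookup like this still would not be visible to Spring Boot configuration, which is why the comment suggests routing the option through the same mechanism as the other experimental HTTP client options.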

Comment on lines +166 to +168

int questionMarkIndex = urlpart.indexOf('?');

Contributor

Suggested change
int questionMarkIndex = urlpart.indexOf('?');
int questionMarkIndex = urlpart.indexOf('?');

@@ -141,8 +151,69 @@ private static String stripSensitiveData(@Nullable String url) {
}

if (atIndex == -1 || atIndex == len - 1) {
return url;
return redactSensitiveParameters ? redactUrlParameters(url) : url;
Contributor

have you considered structuring it like

String stripSensitiveData(@Nullable String url) {
  if (url == null || url.isEmpty()) {
    return url;
  }
  url = redactUserInfo(url);
  url = redactQueryParameters(url);
  return url;
}

then you will not need to duplicate redactSensitiveParameters ? redactUrlParameters(url) : url;
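
A fuller sketch of that two-step structure might look like the following. It is not the code from this PR: the class name `UrlRedactionSketch`, the bodies of both helpers, the `REDACTED` placeholder, and the hard-coded parameter list are assumptions (the list is taken from my reading of the semantic conventions for `url.full`). The experimental flag check is omitted for brevity, and parameter names are matched exactly, with no percent-decoding.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch only; names and details are assumptions, not the PR's implementation.
final class UrlRedactionSketch {

  // Assumed list of sensitive query parameter names (exact, case-sensitive match).
  private static final Set<String> SENSITIVE =
      new HashSet<>(Arrays.asList("AWSAccessKeyId", "Signature", "sig", "X-Goog-Signature"));

  static String stripSensitiveData(String url) {
    if (url == null || url.isEmpty()) {
      return url;
    }
    url = redactUserInfo(url);
    url = redactQueryParameters(url);
    return url;
  }

  // Replaces "user:password" in "scheme://user:password@host" with a placeholder.
  static String redactUserInfo(String url) {
    int schemeEnd = url.indexOf("://");
    if (schemeEnd == -1) {
      return url;
    }
    int authorityStart = schemeEnd + 3;
    int authorityEnd = url.length();
    for (int i = authorityStart; i < url.length(); i++) {
      char c = url.charAt(i);
      if (c == '/' || c == '?' || c == '#') {
        authorityEnd = i;
        break;
      }
    }
    int atIndex = url.lastIndexOf('@', authorityEnd - 1);
    // No user info, or a trailing '@' with nothing after it: leave the URL untouched.
    if (atIndex < authorityStart || atIndex == url.length() - 1) {
      return url;
    }
    return url.substring(0, authorityStart) + "REDACTED:REDACTED" + url.substring(atIndex);
  }

  // Replaces the values of sensitive query parameters, keeping names and the fragment.
  static String redactQueryParameters(String url) {
    int hashIndex = url.indexOf('#');
    int questionMarkIndex = url.indexOf('?');
    // A '?' that only appears inside the fragment does not start a query string.
    if (questionMarkIndex == -1 || (hashIndex != -1 && hashIndex < questionMarkIndex)) {
      return url;
    }
    int queryEnd = hashIndex == -1 ? url.length() : hashIndex;
    String[] pairs = url.substring(questionMarkIndex + 1, queryEnd).split("&", -1);
    StringBuilder sb = new StringBuilder(url.length());
    sb.append(url, 0, questionMarkIndex + 1);
    for (int i = 0; i < pairs.length; i++) {
      if (i > 0) {
        sb.append('&');
      }
      String pair = pairs[i];
      int eq = pair.indexOf('=');
      if (eq != -1 && SENSITIVE.contains(pair.substring(0, eq))) {
        sb.append(pair, 0, eq + 1).append("REDACTED");
      } else {
        sb.append(pair);
      }
    }
    sb.append(url, queryEnd, url.length());
    return sb.toString();
  }

  public static void main(String[] args) {
    System.out.println(stripSensitiveData("https://user:pw@host/path?sig=abc&x=1#frag"));
  }
}
```

With this shape, each helper returns its input unchanged when it has nothing to redact, so the early-return branches no longer need to repeat the query-parameter call.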

return urlpart.substring(0, questionMarkIndex) + "?" + urlPartAfterQuestionMark;
}

private static boolean containsParamToRedact(String urlpart) {
Contributor

did you consider that parameter names and values in url may be percent encoded?

Member Author

The parsing relies on the presence of the ?, =, & and # characters. All of these are reserved characters in percent-encoding: https://en.wikipedia.org/wiki/Percent-encoding

Contributor

You have misunderstood. The problem isn't in these characters but rather in the names of the query parameters. For example, sig could be encoded as %73%69%67. I'm not claiming that such encoding would be common or sensible, just that it is acceptable to the web server.
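
The point can be checked with a few lines of Java. This is only a demonstration of the mismatch, not a proposed change. `PercentEncodedNameDemo` and its methods are hypothetical names. Note also that `java.net.URLDecoder` implements form decoding (it turns `+` into a space), so it is used here purely to show what a server would see for this one name.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class: shows that an exact string match misses an encoded name.
final class PercentEncodedNameDemo {

  static String decode(String raw) {
    try {
      return URLDecoder.decode(raw, "UTF-8");
    } catch (UnsupportedEncodingException e) {
      // UTF-8 is always available on the JVM, so this should not happen.
      throw new IllegalStateException(e);
    }
  }

  // Matching on the raw text, as exact matching without decoding would do.
  static boolean matchesRaw(String rawName) {
    return "sig".equals(rawName);
  }

  // Matching after percent-decoding the name first.
  static boolean matchesDecoded(String rawName) {
    return "sig".equals(decode(rawName));
  }

  public static void main(String[] args) {
    String encoded = "%73%69%67";
    System.out.println(decode(encoded));
    System.out.println(matchesRaw(encoded));
    System.out.println(matchesDecoded(encoded));
  }
}
```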

Member

interesting, I suspect it's ok to ignore a query param encoded as %73%69%67, the goal of this masking isn't to account for malicious usage, but for valid usage of specific client libraries

@jeanbisutti can you open a semconv issue asking for this clarification?

Contributor

I also think it is fine to not bother with this since what is used by the http client is set by the application author. If needed we can always make the redacted parameter list configurable.

Member Author

Member

thanks @jeanbisutti, I think we can proceed with the behavior in this PR (no percent encoding, exact case matching) and revisit after getting any semconv clarification
