Skip to content

WIP - OIDC keycloak implementation alpha cut#93

Open
ntai-arxiv wants to merge 46 commits intodevelopfrom
ntai/keycloak-take-2
Open

WIP - OIDC keycloak implementation alpha cut#93
ntai-arxiv wants to merge 46 commits intodevelopfrom
ntai/keycloak-take-2

Conversation

@ntai-arxiv
Copy link

No description provided.

I believe the missing ingredients was the CLASSIC_SESSION_HASH and thus the cookie between this and other services did not agree.
Safe-guard not having jwt secret (it's a disaster without it.)
… switch to turn this off.)

Username needs casefolging.

Add "login" to Makefile.
Add logout-callback endpoint. It's NOOP and until we need it, we may set it to none on keycloak. It is "Backchannel logout URL"
legacy_auth_provider.py to include the tracking cookie.
# Update the password for each fetched user_id
email: str
for user_id, email in user_ids:
if "cornell.edu" in email or "arxiv.org" in email:

Check failure

Code scanning / CodeQL

Incomplete URL substring sanitization

The string [cornell.edu](1) may be at an arbitrary position in the sanitized URL.

Copilot Autofix

AI over 1 year ago

To fix the problem, we need to ensure that the email addresses are checked more robustly. Instead of using a substring match, we should parse the email addresses and check the domain part explicitly. This can be done using the email.utils module to parse the email address and then checking the domain part.

  • Parse the email address to extract the domain part.
  • Check if the domain part matches "cornell.edu" or "arxiv.org".
  • Update the code in the hack_creds function to implement this change.
Suggested changeset 1
legacy_auth_provider/tools/mass_reset_password.py

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/legacy_auth_provider/tools/mass_reset_password.py b/legacy_auth_provider/tools/mass_reset_password.py
--- a/legacy_auth_provider/tools/mass_reset_password.py
+++ b/legacy_auth_provider/tools/mass_reset_password.py
@@ -42,3 +42,4 @@
         for user_id, email in user_ids:
-            if "cornell.edu" in email or "arxiv.org" in email:
+            domain = email.split('@')[-1]
+            if domain == "cornell.edu" or domain == "arxiv.org":
                 continue
EOF
@@ -42,3 +42,4 @@
for user_id, email in user_ids:
if "cornell.edu" in email or "arxiv.org" in email:
domain = email.split('@')[-1]
if domain == "cornell.edu" or domain == "arxiv.org":
continue
Copilot is powered by AI and may make mistakes. Always verify output.
# Update the password for each fetched user_id
email: str
for user_id, email in user_ids:
if "cornell.edu" in email or "arxiv.org" in email:

Check failure

Code scanning / CodeQL

Incomplete URL substring sanitization

The string [arxiv.org](1) may be at an arbitrary position in the sanitized URL.

Copilot Autofix

AI over 1 year ago

To fix the problem, we need to ensure that the email addresses are properly validated to belong to the allowed domains (cornell.edu and arxiv.org). Instead of using a simple substring check, we should parse the email addresses and verify the domain part explicitly.

  • We will use the email.utils module to parse the email addresses and extract the domain part.
  • We will then check if the domain part matches the allowed domains (cornell.edu and arxiv.org).
Suggested changeset 1
legacy_auth_provider/tools/mass_reset_password.py

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/legacy_auth_provider/tools/mass_reset_password.py b/legacy_auth_provider/tools/mass_reset_password.py
--- a/legacy_auth_provider/tools/mass_reset_password.py
+++ b/legacy_auth_provider/tools/mass_reset_password.py
@@ -42,3 +42,5 @@
         for user_id, email in user_ids:
-            if "cornell.edu" in email or "arxiv.org" in email:
+            from email.utils import parseaddr
+            domain = parseaddr(email)[1].split('@')[-1]
+            if domain == "cornell.edu" or domain == "arxiv.org":
                 continue
EOF
@@ -42,3 +42,5 @@
for user_id, email in user_ids:
if "cornell.edu" in email or "arxiv.org" in email:
from email.utils import parseaddr
domain = parseaddr(email)[1].split('@')[-1]
if domain == "cornell.edu" or domain == "arxiv.org":
continue
Copilot is powered by AI and may make mistakes. Always verify output.
if "cornell.edu" in email or "arxiv.org" in email:
continue
new_password = generate_random_password()
print(f"{email},{new_password}", file=creds_file)

Check failure

Code scanning / CodeQL

Clear-text logging of sensitive information

This expression logs [sensitive data (password)](1) as clear text.

Copilot Autofix

AI over 1 year ago

To fix the problem, we should avoid logging sensitive information such as passwords in clear text. Instead, we can log a placeholder or mask the sensitive data. In this case, we can log the email and a masked version of the password (e.g., only showing the first and last character of the password with the rest replaced by asterisks).

  • Modify the code to mask the password before logging it.
  • Ensure that the functionality of the code remains unchanged, i.e., the passwords are still updated in the database correctly.
Suggested changeset 1
legacy_auth_provider/tools/mass_reset_password.py

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/legacy_auth_provider/tools/mass_reset_password.py b/legacy_auth_provider/tools/mass_reset_password.py
--- a/legacy_auth_provider/tools/mass_reset_password.py
+++ b/legacy_auth_provider/tools/mass_reset_password.py
@@ -45,3 +45,4 @@
             new_password = generate_random_password()
-            print(f"{email},{new_password}", file=creds_file)
+            masked_password = new_password[0] + '*' * (len(new_password) - 2) + new_password[-1]
+            print(f"{email},{masked_password}", file=creds_file)
             cursor.execute(
EOF
@@ -45,3 +45,4 @@
new_password = generate_random_password()
print(f"{email},{new_password}", file=creds_file)
masked_password = new_password[0] + '*' * (len(new_password) - 2) + new_password[-1]
print(f"{email},{masked_password}", file=creds_file)
cursor.execute(
Copilot is powered by AI and may make mistakes. Always verify output.
Comment on lines +257 to +258
response.set_cookie(session_cookie_key, token, max_age=cookie_max_age,
domain=domain, path="/", secure=secure, samesite=samesite)

Check warning

Code scanning / CodeQL

Failure to use secure cookies

Cookie is added without the HttpOnly attribute properly set.
Comment on lines +260 to +261
response.set_cookie(session_cookie_key, "", max_age=0,
domain=domain, path="/", secure=secure, samesite=samesite)

Check warning

Code scanning / CodeQL

Failure to use secure cookies

Cookie is added without the HttpOnly attribute properly set.
Comment on lines +265 to +266
response.set_cookie(classic_cookie_key, tapir_cookie, max_age=cookie_max_age,
domain=domain, path="/", secure=secure, samesite=samesite)

Check warning

Code scanning / CodeQL

Failure to use secure cookies

Cookie is added without the HttpOnly attribute properly set.
Comment on lines +269 to +270
response.set_cookie(classic_cookie_key, '', max_age=0,
domain=domain, path="/", secure=secure, samesite=samesite)

Check warning

Code scanning / CodeQL

Failure to use secure cookies

Cookie is added without the HttpOnly attribute properly set.

if tapir_cookie:
logger.debug('%s=%s',classic_cookie_key, tapir_cookie)
response.set_cookie(classic_cookie_key, tapir_cookie, max_age=cookie_max_age,

Check warning

Code scanning / CodeQL

Construction of a cookie using user-supplied input

Cookie is constructed from a [user-supplied input](1).
Include the access token in the URL so the web app can see it.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant