Skip to content

Conversation

@shaidar
Copy link
Contributor

@shaidar shaidar commented Oct 21, 2025

Description (What does it do?)

Setup dagster pipeline for instructor onboarding using our existing access-forge process to add users with associated roles.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR implements a Dagster pipeline for instructor onboarding by automating the process of extracting user email addresses from course role data and submitting them to the access-forge GitHub repository for access management.

Key Changes:

  • Creates a GitHub API client resource for Dagster pipelines with Vault-based authentication
  • Implements two new assets: one to generate a CSV of unique user emails from course roles, and another to create GitHub pull requests with this data
  • Integrates the new assets into the existing edxorg daily job schedule

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
packages/ol-orchestrate-lib/src/ol_orchestrate/resources/github.py Adds a new GitHub API client factory resource that retrieves credentials from Vault
dg_projects/edxorg/edxorg/definitions.py Registers the GitHub resource and new instructor onboarding assets with the Dagster pipeline
dg_projects/edxorg/edxorg/assets/instructor_onboarding.py Implements assets for extracting user emails and creating PRs to the access-forge repository

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

f"pipeline.\n\n"
f"- Action: {action} file\n"
f"- File: {file_path}\n"
f"- Users: {instructor_onboarding_user_list.count(chr(10))} entries"
Copy link

Copilot AI Oct 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The user count calculation is incorrect. This counts newline characters in the CSV string, which includes the header row. The actual user count should be instructor_onboarding_user_list.count(chr(10)) - 1 to exclude the header, or better yet, pass the count from the previous asset's metadata.

Suggested change
f"- Users: {instructor_onboarding_user_list.count(chr(10))} entries"
f"- Users: {instructor_onboarding_user_list.count(chr(10)) - 1} entries"

Copilot uses AI. Check for mistakes.
default="pipelines/github/api", description="Path to GitHub API token in Vault"
)

_client: Github | None = PrivateAttr(default=None)
Copy link

Copilot AI Oct 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cached client instance in _client is not thread-safe. If multiple Dagster ops/assets access this resource concurrently, they could encounter race conditions during client initialization or reuse a client with stale state. Consider using a lock or removing the cache to create a new client on each call.

Copilot uses AI. Check for mistakes.
@shaidar shaidar force-pushed the sar/dg_instructor_onboarding branch from 2ea91ce to a78ea6d Compare October 23, 2025 19:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant