Add downstream tests #6

Closed
2 changes: 1 addition & 1 deletion .github/workflows/cron.yaml
@@ -9,7 +9,7 @@ on:
    # currently set to 5:00 UTC and takes ~12 hours
    - cron: "15 18 * * *"
  workflow_dispatch: {}

jobs:
  setup:
    runs-on: ubuntu-latest
25 changes: 25 additions & 0 deletions .github/workflows/pr.yaml
@@ -0,0 +1,25 @@
name: pr

on:
  push:
    branches:
      - "main"
  pull_request:
    branches:
      - "main"
  workflow_dispatch: # allows you to trigger manually

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  check-style:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - uses: actions/setup-python@v3
      - uses: pre-commit/[email protected]
15 changes: 15 additions & 0 deletions .pre-commit-config.yaml
@@ -16,3 +16,18 @@ repos:
      ^cpp/cmake/thirdparty/patches/.*|
      ^python/cudf/cudf/tests/data/subword_tokenizer_data/.*
    )
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.9.3
    hooks:
      - id: ruff
        args: ["--fix"]
      - id: ruff-format
  - repo: https://github.com/shellcheck-py/shellcheck-py
    rev: v0.10.0.1
    hooks:
      - id: shellcheck
        args: ["--severity=warning"]
        files: ^scripts/

default_language_version:
  python: python3
2 changes: 2 additions & 0 deletions README.md
@@ -2,6 +2,8 @@

This repository contains the scripts to run Dask's `gpu`-marked tests on a schedule.

In addition, we run some light downstream tests as an early-warning check for breakage in downstream packages like cuDF, dask-cuDF, and cuML.

## Version Policy

The primary goal here is to quickly identify breakages in tests defined in `dask/dask` and `dask/distributed`, so we'll use the latest `main` from each of those.
14 changes: 14 additions & 0 deletions downstream/test_downstream.py
@@ -0,0 +1,14 @@
def test_import_cudf():
    import cudf  # noqa: F401


def test_import_dask_cudf():
    import dask_cudf  # noqa: F401

Do we only want to test importing each RAPIDS package? If we are planning to move away from testing @main in branches of all repos, as in rapidsai/rapids-dask-dependency#85, I feel we should be running all the Dask-related pytests of cudf, cuml & dask_cuda in this repo so we can catch real dask upstream failures rather than just ImportErrors.

Contributor Author:
Just had a chat with @rjzamora on this topic, but I think that's the big outstanding question here. I think we want two things:

  1. Regular day-to-day workflow shouldn't be vulnerable to upstream changes on main breaking CI
  2. Regressions should be caught early by running tests against main, otherwise CI (and potentially our users' code) will break when the upstream package is released

These two goals are somewhat in tension.

Other projects I've worked on have managed this by having multiple environments, one of which used dev versions of relevant upstream packages. The upstream environment was allowed to fail but would be monitored for breakage. I'm not sure if that's a realistic option for libraries like cudf.
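
For illustration only, a rough sketch of what such an allowed-to-fail job could look like as a GitHub Actions workflow (the job name, package set, and test command are hypothetical, not anything this PR adds):

```yaml
jobs:
  upstream-dev:
    runs-on: ubuntu-latest
    # Failures are visible in the logs/UI but don't fail the workflow run.
    continue-on-error: true
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v3
      - name: Install development versions of upstream packages
        run: |
          python -m pip install --upgrade pip pytest
          python -m pip install \
            "git+https://github.com/dask/dask@main" \
            "git+https://github.com/dask/distributed@main"
      - name: Run tests
        run: pytest -v
```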

Assuming that's not an option, we can try to get some exposure to dev upstream versions here in dask-upstream-testing. We can do that either by

  1. writing our own tests here (which I've started). But that's code to maintain, and wouldn't get as thorough test coverage.
  2. trying to run the downstream tests in an environment with upstream dev. That gets a little complicated (we're installing downstream libraries from wheels rather than source so we don't have tests unless they're shipped with the built distribution) but it's possible with a bit of work.

> 2. trying to run the downstream tests in an environment with upstream dev. That gets a little complicated (we're installing downstream libraries from wheels rather than source so we don't have tests unless they're shipped with the built distribution) but it's possible with a bit of work.

I'm +1 to this approach. We can git clone the repositories and run the pytests with dask@main & the latest nightly version of each RAPIDS library. Until that happens, I don't think we should merge rapidsai/rapids-dask-dependency#85.
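
A very rough sketch of what that could look like as a step in the test workflow, building on what scripts/run.sh already does (the cu12 wheel names, the cudf clone, and the dask_cudf test path are assumptions, and we'd still need to sync the clone to the commit the nightly wheel was built from):

```yaml
      # (a step inside an existing GPU test job)
      - name: Run dask_cudf tests with dask@main and nightly wheels (sketch)
        run: |
          # Nightly RAPIDS wheels, mirroring scripts/run.sh
          python -m pip install \
            --extra-index-url=https://pypi.anaconda.org/rapidsai-wheels-nightly/simple \
            "cudf-cu12" "dask-cudf-cu12" pytest
          # Development Dask on top of the nightlies
          python -m pip install \
            "git+https://github.com/dask/dask@main" \
            "git+https://github.com/dask/distributed@main"
          # The wheels may not ship tests, so take them from a source checkout
          git clone --depth=1 https://github.com/rapidsai/cudf
          python -m pytest -v cudf/python/dask_cudf/dask_cudf/tests
```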

Member:
I wonder if we could add an upstream_dev branch to rapids-dask-dependency, and GPU-CI could add a new (optional) matrix item to build/test against this in the necessary downstream repos (dask-cudf/dask-cuda/cuml/raft)? Otherwise, we will indeed need to find a way to include downstream tests within dask-upstream-testing.
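
Loosely sketched, the optional matrix entry might look something like this in a downstream repo's workflow (the matrix key, its values, and the use of continue-on-error are all hypothetical):

```yaml
    strategy:
      fail-fast: false
      matrix:
        # "upstream-dev" would resolve Dask from the proposed
        # rapids-dask-dependency upstream_dev branch instead of the release pins
        dask-pinning: ["stable", "upstream-dev"]
    # Let the optional entry fail without failing the workflow
    continue-on-error: ${{ matrix.dask-pinning == 'upstream-dev' }}
    steps:
      # ... existing build steps, swapping in the upstream_dev pinning
      #     when matrix.dask-pinning == 'upstream-dev' ...
      # ... existing test steps ...
```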

We have test.yaml that runs nightly: https://github.com/rapidsai/cudf/actions/workflows/test.yaml

We could just modify each repo's jobs to run the nightly tests with main dask and achieve what we want without increasing GPU-CI usage at all.

Contributor Author:
One note: https://github.com/rapidsai/dask-upstream-testing/actions/runs/13292220048 did pick up the failure that spurred all this, because dask/dask does have some tests that exercise dask_cudf.


IMO, if the downstream libraries already have nightly runs then it'd be simpler to ensure that the nightly environment tests against dask main. Then we won't have to mess with cloning source repositories, syncing commits up with the installed version, installing test dependencies, and picking out the right subset of tests to run.
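
If the nightly jobs already build a working environment, the override could be as small as one extra step (sketch; the --no-deps flag is only there to avoid disturbing the already-resolved RAPIDS pins):

```yaml
      - name: Override pinned Dask with main for the nightly run (sketch)
        run: |
          python -m pip install --upgrade --no-deps \
            "git+https://github.com/dask/dask@main" \
            "git+https://github.com/dask/distributed@main"
```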

If we changed the nightlies to pick up main and the PRs to run with a stable version, that would help us drop this repository too.

Contributor Author:
I think we'll still need this repo to run the GPU tests defined in the dask and distributed repos (which aren't in the rapidsai org, and so I think can't use our runners).

But we wouldn't need these import tests.



def test_import_cuml():
    import cuml  # noqa: F401


def test_dask_cuda():
    import dask_cuda  # noqa: F401
1 change: 1 addition & 0 deletions scripts/run.sh
@@ -19,6 +19,7 @@ pip install --extra-index-url=https://pypi.anaconda.org/rapidsai-wheels-nightly/
"cudf-${RAPIDS_PY_CUDA_SUFFIX}" \
"dask-cudf-${RAPIDS_PY_CUDA_SUFFIX}" \
"ucx-py-${RAPIDS_PY_CUDA_SUFFIX}" \
"cuml-${RAPIDS_PY_CUDA_SUFFIX}" \
"scipy" \
"dask-cuda"

17 changes: 12 additions & 5 deletions scripts/test.sh
@@ -2,18 +2,25 @@
# SPDX-FileCopyrightText: Copyright (c) 2023-2025, NVIDIA CORPORATION & AFFILIATES.

echo "[testing dask]"
pushd dask
pushd dask || exit 1
pytest dask -v -m gpu
dask_status=$?
popd
popd || exit 1

echo "[testing distributed]"
pushd distributed
pushd distributed || exit 1
pytest distributed -v -m gpu --runslow
distributed_status=$?
popd
popd || exit 1

if [ $dask_status -ne 0 ] || [ $distributed_status -ne 0 ]; then
echo "[testing downstream]"

pushd downstream || exit 1
pytest -v .
downstream_status=$?
popd || exit 1

if [ $dask_status -ne 0 ] || [ $distributed_status -ne 0 ] || [ $downstream_status -ne 0 ] ; then
echo "Tests faild"
exit 1
fi