5 changes: 5 additions & 0 deletions features/README.md
@@ -0,0 +1,5 @@
# features

This directory contains feature descriptions written in
[Gherkin](https://cucumber.io/docs/gherkin/). Please note that we should have
one feature per file.
22 changes: 22 additions & 0 deletions features/code_coverage.feature
@@ -0,0 +1,22 @@
Feature: Code coverage

Our workflow should be able to report code coverage to external
services. (For testing, we'll just be sure we can integrate with
CodeCov.)

Scenario: Report default coverage
# Note that this is probably NOT what most users will want. Imagine
# that our runner, because it is on GPU, runs more code paths than
# the basic runs, and runs less frequently. This means that PRs (not
# using our runner) will see a spurious decrease in coverage.
Given a workflow that uses CodeCov for coverage
When I run the workflow
Then coverage should successfully be updated on CodeCov

Scenario: Report coverage with CodeCov flags
# Using CodeCov flags may help solve the problem mentioned in the
# default coverage scenario, but we should play with it a bit to
# determine a recommended practice. (Out of scope for MVP.)
Given a workflow that uses CodeCov flags for coverage
When I run the workflow
Then the correct flag should be updated on CodeCov
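For reference, a CodeCov flags upload step might look like the following sketch (the `gpu` flag name and coverage file path are illustrative assumptions, not settled choices):

```yaml
# Hypothetical sketch: upload coverage under a dedicated "gpu" flag so
# that self-hosted GPU runs are tracked separately from ordinary CI
# coverage, rather than moving the default coverage number.
- name: Upload coverage to CodeCov
  uses: codecov/codecov-action@v4
  with:
    files: coverage.xml
    flags: gpu
```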
12 changes: 12 additions & 0 deletions features/external_users.feature
@@ -0,0 +1,12 @@
Feature: Allow external contributors to use resources

...

# NOTE: this is essentially the same as a scenario from the run_pr
# feature; might not ever fill it in
#Scenario: Authorized user permits a PR from unauthorized user to run

Scenario: Adding a new authorized user
Given an unauthorized user who should become authorized
When I give the user committer access to the repository
Then the user should have the ability to launch self-hosted workflows
12 changes: 12 additions & 0 deletions features/hard_kill.feature
@@ -0,0 +1,12 @@
Feature: Hard kill a runaway workflow job

A user should be able to kill a running job, and that should also
terminate the associated instance.

Scenario: Manual kill
Given a long-running workflow
And I am logged in as an authorized user
And the workflow is running
When I kill the workflow using the GitHub UI
Then the workflow should stop
And the instance should terminate
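One way the "kill also terminates the instance" behavior could be wired is a cleanup job that runs even on cancellation; the job names and instance-id plumbing below are assumptions, not a settled design:

```yaml
# Hypothetical sketch: a cleanup job that runs even when the main job
# is cancelled from the GitHub UI, and terminates the EC2 instance the
# workflow started. Job wiring and the instance-id output are assumed.
cleanup:
  needs: [start-runner, benchmark]
  if: always()
  runs-on: ubuntu-latest
  steps:
    - name: Terminate the self-hosted instance
      run: |
        aws ec2 terminate-instances \
          --instance-ids "${{ needs.start-runner.outputs.instance-id }}"
```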
20 changes: 20 additions & 0 deletions features/physical_cost.feature
@@ -0,0 +1,20 @@
Feature: Track physical cost of running

The amount of time that has been used (or ideally, the actual cost
incurred) should be easily accessible.
[Possible mechanisms: (1) Refer to AWS billing info; (2) use an API to
extract stuff from AWS billing / CloudTrail; (3) have some custom
cloud-independent approach -- probably (1) or (2)]

# TODO: having trouble with this one because I feel like it depends
# on the specific mechanism

# WIP: I think this is the generic form of this information;
# the mechanism for tracking the cost is not specified here.
Scenario: When I run a test, I can see how much it costs
Scenario: When I run a test, I can see how much it costs
Given I have a test that runs for X amount of time
And I have a cost of Y per unit time
And I have a mechanism for tracking the cost
When I run the test
Then I receive a calculated cost of running the test
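As a point of reference for mechanism (2), AWS Cost Explorer can be queried from a workflow step; the date range and granularity below are illustrative placeholders:

```yaml
# Hypothetical sketch of mechanism (2): query AWS Cost Explorer for
# cost incurred over a date range. Dates shown are placeholders; a real
# step would compute them from the run's start and end times.
- name: Report cost
  run: |
    aws ce get-cost-and-usage \
      --time-period Start=2024-01-01,End=2024-01-02 \
      --granularity DAILY \
      --metrics UnblendedCost
```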
40 changes: 40 additions & 0 deletions features/prevent_abuse.feature
@@ -0,0 +1,40 @@
Feature: Safeguards to prevent abuse of self-hosted runners

Compute resources should be protected from use outside of intended runs,
whether due to accidental triggering or intentional abuse by malicious
actors. This includes preventing forks from accessing our resources and
preventing runs on untrusted PRs.

Scenario: Forks should not be able to use our runners
# This should be guaranteed by the fact that secrets don't propagate
# to forks.
Given a fork of a repository with a self-hosted workflow
When the fork owner tries to run (within fork) using workflow dispatch
Then the workflow should give an error due to authorization
And the workflow should fail to start instances on AWS

Scenario: Pull requests from first-time contributors should not start runners
# With default repo settings, first-time contributors should require
# approval to run CI at all.
Given a fork of a repository with a self-hosted workflow
And the fork owner has not previously contributed to the repository
And the fork owner has changed our workflow to run on PRs
When the fork owner creates a pull request to our repository
Then the workflow should give an error due to authorization
And the workflow should fail to start instances on AWS

Scenario: Pull requests from previous contributors should not start runners
# With default repo settings, an external contributor who has
# previously contributed no longer requires approval for CI to run.
# However, this should be guaranteed because PRs from forks don't
# have access to secrets.
Given a fork of a repository with a self-hosted workflow
And the fork owner has previously contributed to the repository
And the fork owner has changed our workflow to run on PRs
When the fork owner creates a pull request to our repository
Then the workflow should give an error due to authorization
And the workflow should fail to start instances on AWS

# Non-tested scenario: AWS tokens (as secrets) should not leak in PRs
# from forks because forks don't see secrets. (Leaking AWS tokens is
# a different attack vector from the ones described above.)
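In addition to the secrets-don't-propagate guarantee, a job-level guard condition could be added as defense in depth; the event and repository check below are assumptions (the repository name is a placeholder):

```yaml
# Hypothetical sketch: restrict the runner-starting job to explicitly
# dispatched runs from this repository (not a fork). The condition is
# an assumption; the primary safeguard remains that secrets (e.g. AWS
# credentials) are never available to forks at all.
start-runner:
  if: github.event_name == 'workflow_dispatch' && github.repository == 'our-org/our-repo'
  runs-on: ubuntu-latest
```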
18 changes: 18 additions & 0 deletions features/quickstart.feature
@@ -0,0 +1,18 @@
Feature: Quickstart guide

There should be a quick and easy way to set up workflows, and a simple
demo workflow.

# TODO: There should be a scenario here about documentation, maybe? or
# is that another feature? Up-to-date getting started documentation.

Scenario: Easy set-up for first-time users
Given I have AWS credentials
And I have not previously set up AWS infra for this tool
When I use the quickstart command
Then I should have a working workflow

Scenario: Up-to-date documentation
Given I have the latest version of the tool
When I look at the documentation
Then I should see up-to-date and tested information
10 changes: 10 additions & 0 deletions features/reproducible_env.feature
@@ -0,0 +1,10 @@
Feature: Reproducible workflow environment

Within a version of our tool and a specific cloud machine image, the
starting environment for all workflows should be the same.

Scenario: Reproducible workflow environment
Given a fixed version of our tool and of a cloud machine image
When I start the workflow
Then the versions of important libraries should be as expected
And the versions of important software tools should be as expected
17 changes: 17 additions & 0 deletions features/retrieve_results.feature
@@ -0,0 +1,17 @@
Feature: Retrieve results of a benchmarking run

A user may generate data during a run that they want to save somewhere
long-term. This will require that the user explicitly store that data
somewhere; in this feature, we test that we can store it.

Scenario: Store results to an S3 bucket
Given a workflow that intends to upload a file to an S3 bucket
When I run the workflow
Then the file should be uploaded to the S3 bucket

Scenario: Store results to Dropbox
# we do a separate test for Dropbox just to ensure that there's
# nothing special happening because S3 and EC2 are both AWS
Given a workflow that intends to upload a file to Dropbox
When I run the workflow
Then the file should be uploaded to Dropbox
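The explicit-upload step for the S3 scenario could be as simple as the sketch below; the bucket name and file path are placeholders:

```yaml
# Hypothetical sketch: an explicit upload step at the end of a run.
# Bucket name and results path are placeholder assumptions.
- name: Upload results to S3
  run: aws s3 cp results/benchmarks.json s3://example-results-bucket/
```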
26 changes: 26 additions & 0 deletions features/run_manual.feature
@@ -0,0 +1,26 @@
Feature: Manual runs of the workflow

A user should be able to manually launch a workflow from the web UI.
[Mechanism: workflow_dispatch and run workflow]

Scenario: Authorized users should see the run workflow button
Given I have a workflow generated with our tool
And I am logged in as an authorized user
When I load the workflow's page
Then I should see the Run Workflow button

Scenario: Unauthorized users should not see the run workflow button
Given I have a workflow generated with our tool
And I am logged in as an unauthorized user
When I load the workflow's page
Then I should not see the Run Workflow button

Scenario: Pressing the Run Workflow button should run the workflow
Given I have a workflow generated with our tool
And I am logged in as an authorized user
When I load the workflow's page
And I press the Run Workflow button
Then the workflow should complete a manual run
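For reference, GitHub shows the Run Workflow button only for workflows that declare a `workflow_dispatch` trigger:

```yaml
# Workflows become manually runnable from the web UI by declaring the
# workflow_dispatch trigger.
on:
  workflow_dispatch:
```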
11 changes: 11 additions & 0 deletions features/run_matrix.feature
@@ -0,0 +1,11 @@
Feature: Run a matrix build

A user should be able to run a full build matrix (ideally in parallel).

Scenario: Run a matrix
Given a workflow that involves a complicated matrix
When I run the workflow
Then all builds in the matrix should complete
# maybe this too:
# And an instance should be launched for each job
# And all jobs should run on different instances
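A matrix of the kind this scenario describes might be declared as in the sketch below; the matrix dimensions and runner labels are illustrative assumptions:

```yaml
# Hypothetical sketch: a two-dimensional matrix, with each job routed to
# a self-hosted runner. Dimension names and label values are placeholders.
jobs:
  benchmark:
    strategy:
      matrix:
        python-version: ["3.10", "3.11"]
        gpu: [single, multi]
    runs-on: [self-hosted, "${{ matrix.gpu }}"]
    steps:
      - run: ./run_benchmarks.sh --python "${{ matrix.python-version }}"
```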
Contributor comment:

This is kind of inherent to the architecture we seek to design. Not sure if we need to add this.

Member Author reply:

The two "And" statements here are the test that we ran all jobs in parallel. If we want running the matrix in parallel to be a requirement, then we should probably test the requirement. The initial "Then" statement would also be satisfied if the matrix was run serially.

It might be better Gherkin to combine the two "And" statements into a single "And all matrix jobs should run in parallel"? There's a trade-off between making the statement represent less code (better for the developer) and making the statement's purpose clearer to readers (better for the client).

14 changes: 14 additions & 0 deletions features/run_pr.feature
@@ -0,0 +1,14 @@
Feature: Run on pull requests

A user should be able to run a workflow on self-hosted runners prior to
merging a pull request. NOTE: This will *not* use the normal
pull_request trigger for workflows. Instead, this will be a
workflow_dispatch caused by some external decision. This is because we
don't expect to want to run expensive CI on every commit, but rather
when an admin chooses to.

Scenario: Choose to run a workflow on a PR
Given I have a workflow generated with our tool
And a pull request is open against that repository
When I [trigger the workflow to run on the PR] (how? TBD)
Then the workflow runs on our runner using code in the PR
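One possible shape for the TBD trigger is a `workflow_dispatch` input naming the PR, checked out via its merge ref; this is an assumption, not a decided mechanism:

```yaml
# Hypothetical sketch (trigger mechanism is TBD in the feature above):
# an admin dispatches the workflow with a PR number, and the job checks
# out that PR's merge ref.
on:
  workflow_dispatch:
    inputs:
      pr-number:
        description: "Pull request number to test"
        required: true

jobs:
  test-pr:
    runs-on: self-hosted
    steps:
      - uses: actions/checkout@v4
        with:
          ref: refs/pull/${{ inputs.pr-number }}/merge
```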
9 changes: 9 additions & 0 deletions features/run_scheduled.feature
@@ -0,0 +1,9 @@
Feature: Scheduled runs of the workflow

A user should be able to schedule recurring runs of a workflow.

Scenario: A scheduled run should run
Given I have a workflow generated with our tool
When I wait until after the scheduled run time
Then the workflow should have completed a scheduled run
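Scheduled runs use GitHub's cron-style trigger; the daily 03:00 UTC schedule shown is only an example:

```yaml
# Scheduled runs are declared with a cron trigger (times are UTC).
# The schedule shown here is an illustrative example.
on:
  schedule:
    - cron: "0 3 * * *"
```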

63 changes: 63 additions & 0 deletions features/select_platform.feature
@@ -0,0 +1,63 @@
Feature: Select platform to run on

A user should be able to select the hardware that suits the needs of
their run.

Scenario: Running with large memory
Given a workflow that requires and requests a large-memory host
When I run the workflow
Then it should run on the appropriate large-memory host

Scenario: Running with a single CUDA GPU
Given a workflow that requires and requests a single CUDA GPU
When I run the workflow
Then it should run on hardware with a GPU
And my software should be able to interact with the CUDA drivers

Scenario: Running with multiple GPUs
Given a workflow that requires and requests multiple GPUs
When I run the workflow
Then it should run on hardware with multiple GPUs
And my software should be able to interact with all requested GPUs

Scenario: Running with smaller hardware
Given a workflow that requests lower-cost hardware
When I run the workflow
Then it should run on the appropriate hardware

Scenario: Running with preemptible instances
Given a workflow that can run on preemptible hosts
When I run the workflow
Then it should run on a preemptible host
# NOTE: anything about continuing from preemption is the
# responsibility of the workflow writer

Scenario: A run on a preemptible instance is preempted
Given a workflow that can run on preemptible hosts
And the workflow is running
When the workflow is preempted
Then the workflow should be retried (up to a specified retry limit)

Scenario: True failures should not be retried on preemptible instances
Given a workflow that can run on preemptible hosts
And the workflow is running
When the workflow fails
Then the workflow should not be retried

# NOTE: This is not an MVP requirement
#Scenario: Running with a ROCm stack
# Given a workflow that requires a ROCm stack
# When I run the workflow
# Then it should run on hardware with the appropriate ROCm stack

Scenario: Running with an inference stack with various hardware
Given a workflow that requires an inference stack
When I run the workflow
Then it should run on hardware with the appropriate inference stack
And my software should be able to interact with the inference stack

Scenario: Running a small ML training run
Given a workflow that requires an inference stack
And the workflow is a small ML training run
When I run the workflow
Then it should run on hardware with the appropriate inference stack
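Hardware selection of the kind these scenarios describe is commonly expressed through self-hosted runner labels; the label names below are placeholder assumptions:

```yaml
# Hypothetical sketch: route a job to specific hardware via self-hosted
# runner labels. Label names ("large-memory") are placeholders for
# whatever labeling scheme the tool settles on.
jobs:
  large-memory-job:
    runs-on: [self-hosted, large-memory]
    steps:
      - run: ./run_benchmarks.sh --suite big-memory
```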
12 changes: 12 additions & 0 deletions features/set_gpu_mode.feature
@@ -0,0 +1,12 @@
Feature: Workflow should be able to set the GPU compute mode

A given workflow should be able to use different GPU compute modes
(e.g., EXCLUSIVE_PROCESS).
[Mechanism: This might be either via machine selection or by setting
mode in the workflow]

Scenario: Run in EXCLUSIVE_PROCESS
Given a workflow that should run with EXCLUSIVE_PROCESS set
When I run the workflow
Then my main process should take the GPU
And any other process should error if it tries to use the GPU
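The "set mode in the workflow" mechanism could be a step like the sketch below (GPU index 0 and the use of sudo are assumptions; changing compute mode requires root):

```yaml
# Hypothetical sketch of the in-workflow mechanism: switch GPU 0 to
# EXCLUSIVE_PROCESS before the run so only one process can hold the GPU.
- name: Set GPU compute mode
  run: sudo nvidia-smi -i 0 -c EXCLUSIVE_PROCESS
```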