Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

draft: adding FIXME/XFAIL issue linter to pre commit hook #681

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

dorotat-nv
Copy link
Collaborator

Description

We want to ensure that all issues marked to be fixed in the codebase have corresponding github issue created ie

  • pytest.mark.xfail has associated github issue link
  • FIXME comments have associated issue links

This promotes good practices such as
@pytest.mark.xfail(reason="Known issue #123")

FIXME: Need better error handling #456

Implemented this simple and lightweight linter scripted employed with pre commit hook, which can also be run from the command line

Type of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Refactor
  • Documentation update
  • Other (please describe):

CI Pipeline Configuration

Configure CI behavior by applying the relevant labels:

Note

By default, the notebooks validation tests are skipped unless explicitly enabled.

Usage

# or pre-commit run --all-files
python internal/scripts/lint_xfail_fixme.py

Scanning files from: /Users/dorotat/experiments/bionemo-framework
  Line 170: FIXME in .github/workflows/unit-tests.yml:170 missing required issue link
    > # TODO: figure out way of cleaning up working directory (requires sudo or for us to fix file ownership from release container)

  Line 13: FIXME in .pre-commit-config.yaml:13 missing required issue link
    > # 1. Attempt to automatically fix any lint issues.

  Line 37: FIXME in .pre-commit-config.yaml:37 missing required issue link
    > - id: lint-xfail-fixme

  Line 38: FIXME in .pre-commit-config.yaml:38 missing required issue link
    > name: Lint XFAIL and FIXME markers

  Line 184: FIXME in Dockerfile:184 missing required issue link
    > # FIXME the following result in unstable training curves even if they are faster

  Line 240: FIXME in Dockerfile:240 missing required issue link
    > # FIXME the following results in unstable training curves even if faster.

  Line 109: FIXME in internal/infra-bionemo/tests/test_infra_bionemo/test_license_check.py:109 missing required issue link
    > # valid w/o license header, but automatic fix works

  Line 207: FIXME in internal/infra-bionemo/tests/test_infra_bionemo/test_license_check.py:207 missing required issue link
    > # can fix if there are no invalid files

  Line 139: FIXME in sub-packages/bionemo-esm2/src/bionemo/esm2/scripts/finetune_esm2.py:139 missing required issue link
    > resume_if_exists (bool): attempt to resume if the checkpoint exists [FIXME @skothenhill this doesn't work yet]

  Line 530: FIXME in sub-packages/bionemo-esm2/src/bionemo/esm2/scripts/finetune_esm2.py:530 missing required issue link
    > # FIXME (@skothenhill) figure out how checkpointing and resumption should work with the new nemo trainer

  Line 129: FIXME in sub-packages/bionemo-esm2/src/bionemo/esm2/scripts/train_esm2.py:129 missing required issue link
    > resume_if_exists (bool): attempt to resume if the checkpoint exists [FIXME @skothenhill this doesn't work yet]

  Line 467: FIXME in sub-packages/bionemo-esm2/src/bionemo/esm2/scripts/train_esm2.py:467 missing required issue link
    > # FIXME (@skothenhill) figure out how checkpointing and resumption should work with the new nemo trainer

  Line 343: FIXME in sub-packages/bionemo-example_model/src/bionemo/example_model/lightning/lightning_basic.py:343 missing required issue link
    > # FIXME add an assertion that the user is not trying to do tensor parallelism since this doesn't use

  Line 132: FIXME in sub-packages/bionemo-geneformer/src/bionemo/geneformer/scripts/train_geneformer.py:132 missing required issue link
    > resume_if_exists (bool): attempt to resume if the checkpoint exists [FIXME @skothenhill this doesn't work yet]

  Line 298: FIXME in sub-packages/bionemo-geneformer/src/bionemo/geneformer/scripts/train_geneformer.py:298 missing required issue link
    > bias_dropout_fusion=True,  # TODO fix the recompilation issue, but for now it's faster even with recompilations

  Line 405: FIXME in sub-packages/bionemo-geneformer/src/bionemo/geneformer/scripts/train_geneformer.py:405 missing required issue link
    > # FIXME (@skothenhill) figure out how checkpointing and resumption should work with the new nemo trainer

  Line 63: FIXME in sub-packages/bionemo-geneformer/tests/bionemo/geneformer/test_stop_and_go.py:63 missing required issue link
    > # FIXME: for now the test doesn't work unless dropout is inactivated because the megatron rng state is not saved

  Line 103: FIXME in sub-packages/bionemo-llm/src/bionemo/llm/model/biobert/model.py:103 missing required issue link
    > "activation_func",  # FIXME hack: update the ESM2 checkpoint with the updated activation function and don't override

  Line 119: FIXME in sub-packages/bionemo-llm/src/bionemo/llm/train.py:119 missing required issue link
    > overlap_param_gather=False,  # TODO waiting for NeMo fix

  Line 27: XFAIL in sub-packages/bionemo-esm2/tests/bionemo/esm2/model/test_convert.py:27 missing required issue link
    > # pytestmark = pytest.mark.xfail(

  Line 157: XFAIL in sub-packages/bionemo-llm/tests/bionemo/llm/model/test_loss.py:157 missing required issue link
    > @pytest.mark.xfail(reason="tensor_parallel.vocab_parallel_cross_entropy modifies input token_logits")

  Line 168: XFAIL in sub-packages/bionemo-llm/tests/bionemo/llm/test_lightning.py:168 missing required issue link
    > @pytest.mark.xfail(reason="MegatronStrategy no longer has '_get_loss_reduction' attribute")

  Line 311: XFAIL in sub-packages/bionemo-noodles/tests/bionemo/noodles/test_nvfaidx.py:311 missing required issue link
    > @pytest.mark.xfail(reason="This is a known failure mode for pyfaidx that we are trying to prevent with nvfaidx.")

Pre-submit Checklist

  • I have tested these changes locally
  • I have updated the documentation accordingly
  • I have added/updated tests as needed
  • All existing tests pass successfully

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
SKIP_CI Completely skips the CI pipeline
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant