feat: add above_threshold() #184

Merged
merged 20 commits into from
May 21, 2025
1 change: 1 addition & 0 deletions docs/_quarto.yml
Original file line number Diff line number Diff line change
@@ -178,6 +178,7 @@ quartodoc:
- name: Validate.all_passed
- name: Validate.assert_passing
- name: Validate.assert_below_threshold
- name: Validate.above_threshold
- name: Validate.n
- name: Validate.n_passed
- name: Validate.n_failed
1 change: 1 addition & 0 deletions pointblank/_utils.py
@@ -515,6 +515,7 @@ def _get_api_text() -> str:
"Validate.all_passed",
"Validate.assert_passing",
"Validate.assert_below_threshold",
"Validate.above_threshold",
"Validate.n",
"Validate.n_passed",
"Validate.n_failed",
114 changes: 109 additions & 5 deletions pointblank/data/api-docs.txt
@@ -7621,7 +7621,7 @@ assert_passing(self) -> 'None'
```


- assert_below_threshold(self, level: 'str' = 'warning', i: 'int' = None, message: 'str' = None) -> 'None'
+ assert_below_threshold(self, level: 'str' = 'warning', i: 'int | None' = None, message: 'str | None' = None) -> 'None'

Raise an `AssertionError` if validation steps exceed a specified threshold level.

@@ -7716,15 +7716,119 @@ assert_below_threshold(self, level: 'str' = 'warning', i: 'int' = None, message:

See Also
--------
- - [`warning()`](`pointblank.Validate.warning`): Get the 'warning' status for each validation
+ - [`warning()`](`pointblank.Validate.warning`): get the 'warning' status for each validation
step
- - [`error()`](`pointblank.Validate.error`): Get the 'error' status for each validation step
- - [`critical()`](`pointblank.Validate.critical`): Get the 'critical' status for each
+ - [`error()`](`pointblank.Validate.error`): get the 'error' status for each validation step
+ - [`critical()`](`pointblank.Validate.critical`): get the 'critical' status for each
validation step
- - [`assert_passing()`](`pointblank.Validate.assert_passing`): Assert all validations pass
+ - [`assert_passing()`](`pointblank.Validate.assert_passing`): assert all validations pass
completely


above_threshold(self, level: 'str' = 'warning', i: 'int | list[int] | None' = None) -> 'bool'

Check if any validation steps exceed a specified threshold level.

The `above_threshold()` method checks whether validation steps exceed a given threshold
level. This provides a non-exception-based alternative to
[`assert_below_threshold()`](`pointblank.Validate.assert_below_threshold`) for conditional
workflow control based on validation results.

This method is useful in scenarios where you want to check if any validation steps failed
beyond a certain threshold without raising an exception, allowing for more flexible
programmatic responses to validation issues.

Parameters
----------
level
The threshold level to check against. Valid options are: `"warning"` (the least severe
threshold level), `"error"` (the middle severity threshold level), and `"critical"` (the
most severe threshold level). The default is `"warning"`.
i
Specific validation step number(s) to check. If a single integer, checks only that step.
If a list of integers, checks all specified steps. If `None` (the default), checks all
validation steps. Step numbers are 1-based (first step is `1`, not `0`).

Returns
-------
bool
`True` if any of the specified validation steps exceed the given threshold level,
`False` otherwise.

Raises
------
ValueError
If an invalid threshold level is provided.

Examples
--------
Below are some examples of how to use the `above_threshold()` method. They use a small Polars
DataFrame, `tbl`, with a single column (`values`) holding the integers 1, 2, 3, 4, 5, 0, and -1.

Then a validation plan will be created with thresholds (`warning=0.1`, `error=0.2`,
`critical=0.3`). After interrogating, we display the validation report table:

```python
import pointblank as pb

validation = (
    pb.Validate(data=tbl, thresholds=(0.1, 0.2, 0.3))
    .col_vals_gt(columns="values", value=0)
    .col_vals_lt(columns="values", value=10)
    .col_vals_between(columns="values", left=0, right=5)
    .interrogate()
)

validation
```

Let's check if any steps exceed the 'warning' threshold with the `above_threshold()` method.
A message will be printed if that's the case:

```python
if validation.above_threshold(level="warning"):
    print("Some steps have exceeded the warning threshold")
```

To check only steps 2 and 3 against the 'error' threshold, use the `i=` argument:

```python
if validation.above_threshold(level="error", i=[2, 3]):
    print("Steps 2 and/or 3 have exceeded the error threshold")
```
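
The outcome of the two checks above can be worked out by hand. The following sketch does not
call pointblank; it recomputes the fraction of failing values per step in plain Python and
compares it against the thresholds. It assumes that `col_vals_between()` includes both bounds
and that a level counts as exceeded when the failing fraction is at or above its threshold. The
helper names `failing_fraction()` and `exceeded()` are made up for illustration.

```python
# Hand computation of the failing fraction per step for the example data.
# This is an illustration in plain Python, not pointblank's own logic.
values = [1, 2, 3, 4, 5, 0, -1]
thresholds = {"warning": 0.1, "error": 0.2, "critical": 0.3}

# One predicate per validation step, keyed by 1-based step number.
# Assumption: col_vals_between() treats both bounds as inclusive.
steps = {
    1: lambda v: v > 0,        # col_vals_gt(value=0)
    2: lambda v: v < 10,       # col_vals_lt(value=10)
    3: lambda v: 0 <= v <= 5,  # col_vals_between(left=0, right=5)
}


def failing_fraction(predicate):
    """Fraction of values that do not satisfy the predicate."""
    failed = sum(1 for v in values if not predicate(v))
    return failed / len(values)


fractions = {step: failing_fraction(pred) for step, pred in steps.items()}


def exceeded(level, i=None):
    """Assumption: a level is exceeded when failing fraction >= threshold."""
    selected = list(steps) if i is None else i
    return any(fractions[step] >= thresholds[level] for step in selected)


for step, frac in fractions.items():
    print(f"step {step}: {frac:.3f} failing")

print("warning, all steps:", exceeded("warning"))
print("error, steps 2 and 3:", exceeded("error", i=[2, 3]))
```

Under those assumptions, step 1 fails 2 of 7 values (about 0.286), which is past 'warning' and
'error' but short of 'critical'. Step 3 fails 1 of 7 (about 0.143), which is past 'warning'
only. The 'warning' check should therefore be expected to print its message, while the 'error'
check restricted to steps 2 and 3 should not.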

You can use this in a workflow to conditionally trigger processes. Here's a snippet of how
you might use this in a function:

```python
def process_data(validation_obj):
    # Only continue processing if validation passes critical thresholds
    if not validation_obj.above_threshold(level="critical"):
        # Continue with processing
        print("Data meets critical quality thresholds, proceeding...")
        return True
    else:
        # Log failure and stop processing
        print("Data fails critical quality checks, aborting...")
        return False
```

Note that this is just a suggestion for how to implement conditional workflow processes. You
should adapt this pattern to your specific requirements, which might include different
threshold levels, custom logging mechanisms, or integration with your organization's data
pipelines and notification systems.
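
As one possible adaptation, the check can be repeated from the most severe level down to find
the highest level that was exceeded, and that result mapped to an action. This is only a sketch:
`most_severe_level()`, `ACTIONS`, and `StubValidation` are hypothetical names, and the stub
stands in for an interrogated validation object so the snippet runs without any data.

```python
# Levels ordered from most to least severe.
LEVELS_BY_SEVERITY = ("critical", "error", "warning")

# Hypothetical mapping from the highest exceeded level to a pipeline action.
ACTIONS = {
    None: "proceed",
    "warning": "proceed and log",
    "error": "quarantine batch",
    "critical": "abort",
}


def most_severe_level(validation_obj, i=None):
    """Return the most severe level exceeded, or None if none was."""
    for level in LEVELS_BY_SEVERITY:
        if validation_obj.above_threshold(level=level, i=i):
            return level
    return None


class StubValidation:
    """Hypothetical stand-in exposing only above_threshold(), for demonstration."""

    def __init__(self, exceeded_levels):
        self._exceeded = set(exceeded_levels)

    def above_threshold(self, level="warning", i=None):
        # The stub ignores `i` and reports a fixed set of exceeded levels.
        return level in self._exceeded


stub = StubValidation({"warning", "error"})
level = most_severe_level(stub)
print(level, "->", ACTIONS[level])
```

Checking from the most severe level first means the function can return as soon as one check
comes back `True`.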

See Also
--------
- [`assert_below_threshold()`](`pointblank.Validate.assert_below_threshold`): a similar
method that raises an exception if thresholds are exceeded
- [`warning()`](`pointblank.Validate.warning`): get the 'warning' status for each validation
step
- [`error()`](`pointblank.Validate.error`): get the 'error' status for each validation step
- [`critical()`](`pointblank.Validate.critical`): get the 'critical' status for each
validation step


n(self, i: 'int | list[int] | None' = None, scalar: 'bool' = False) -> 'dict[int, int] | int'

Provides a dictionary of the number of test units for each validation step.
149 changes: 144 additions & 5 deletions pointblank/validate.py
@@ -8831,7 +8831,7 @@ def assert_passing(self) -> None:
raise AssertionError(msg)

def assert_below_threshold(
- self, level: str = "warning", i: int = None, message: str = None
+ self, level: str = "warning", i: int | None = None, message: str | None = None
) -> None:
"""
Raise an `AssertionError` if validation steps exceed a specified threshold level.
@@ -8940,12 +8940,12 @@ def assert_below_threshold(

See Also
--------
- - [`warning()`](`pointblank.Validate.warning`): Get the 'warning' status for each validation
+ - [`warning()`](`pointblank.Validate.warning`): get the 'warning' status for each validation
step
- - [`error()`](`pointblank.Validate.error`): Get the 'error' status for each validation step
- - [`critical()`](`pointblank.Validate.critical`): Get the 'critical' status for each
+ - [`error()`](`pointblank.Validate.error`): get the 'error' status for each validation step
+ - [`critical()`](`pointblank.Validate.critical`): get the 'critical' status for each
validation step
- - [`assert_passing()`](`pointblank.Validate.assert_passing`): Assert all validations pass
+ - [`assert_passing()`](`pointblank.Validate.assert_passing`): assert all validations pass
completely
"""
# Check if validation has been interrogated
@@ -8991,6 +8991,145 @@ def assert_below_threshold(
)
raise AssertionError(msg)

def above_threshold(self, level: str = "warning", i: int | list[int] | None = None) -> bool:
"""
Check if any validation steps exceed a specified threshold level.

The `above_threshold()` method checks whether validation steps exceed a given threshold
level. This provides a non-exception-based alternative to
[`assert_below_threshold()`](`pointblank.Validate.assert_below_threshold`) for conditional
workflow control based on validation results.

This method is useful in scenarios where you want to check if any validation steps failed
beyond a certain threshold without raising an exception, allowing for more flexible
programmatic responses to validation issues.

Parameters
----------
level
The threshold level to check against. Valid options are: `"warning"` (the least severe
threshold level), `"error"` (the middle severity threshold level), and `"critical"` (the
most severe threshold level). The default is `"warning"`.
i
Specific validation step number(s) to check. If a single integer, checks only that step.
If a list of integers, checks all specified steps. If `None` (the default), checks all
validation steps. Step numbers are 1-based (first step is `1`, not `0`).

Returns
-------
bool
`True` if any of the specified validation steps exceed the given threshold level,
`False` otherwise.

Raises
------
ValueError
If an invalid threshold level is provided.

Examples
--------
```{python}
#| echo: false
#| output: false
import pointblank as pb
pb.config(report_incl_header=False, report_incl_footer=False, preview_incl_header=False)
```
Below are some examples of how to use the `above_threshold()` method. First, we'll create a
simple Polars DataFrame with a single column (`values`).

```{python}
import polars as pl

tbl = pl.DataFrame({
    "values": [1, 2, 3, 4, 5, 0, -1]
})
```

Then a validation plan will be created with thresholds (`warning=0.1`, `error=0.2`,
`critical=0.3`). After interrogating, we display the validation report table:

```{python}
import pointblank as pb

validation = (
    pb.Validate(data=tbl, thresholds=(0.1, 0.2, 0.3))
    .col_vals_gt(columns="values", value=0)
    .col_vals_lt(columns="values", value=10)
    .col_vals_between(columns="values", left=0, right=5)
    .interrogate()
)

validation
```

Let's check if any steps exceed the 'warning' threshold with the `above_threshold()` method.
A message will be printed if that's the case:

```{python}
if validation.above_threshold(level="warning"):
    print("Some steps have exceeded the warning threshold")
```

To check only steps 2 and 3 against the 'error' threshold, use the `i=` argument:

```{python}
if validation.above_threshold(level="error", i=[2, 3]):
    print("Steps 2 and/or 3 have exceeded the error threshold")
```

You can use this in a workflow to conditionally trigger processes. Here's a snippet of how
you might use this in a function:

```python
def process_data(validation_obj):
    # Only continue processing if validation passes critical thresholds
    if not validation_obj.above_threshold(level="critical"):
        # Continue with processing
        print("Data meets critical quality thresholds, proceeding...")
        return True
    else:
        # Log failure and stop processing
        print("Data fails critical quality checks, aborting...")
        return False
```

Note that this is just a suggestion for how to implement conditional workflow processes. You
should adapt this pattern to your specific requirements, which might include different
threshold levels, custom logging mechanisms, or integration with your organization's data
pipelines and notification systems.

See Also
--------
- [`assert_below_threshold()`](`pointblank.Validate.assert_below_threshold`): a similar
method that raises an exception if thresholds are exceeded
- [`warning()`](`pointblank.Validate.warning`): get the 'warning' status for each validation
step
- [`error()`](`pointblank.Validate.error`): get the 'error' status for each validation step
- [`critical()`](`pointblank.Validate.critical`): get the 'critical' status for each
validation step
"""
# Ensure validation has been run
if not hasattr(self, "time_start") or self.time_start is None:
    return False

# Validate the level parameter
level = level.lower()
if level not in ["warning", "error", "critical"]:
    raise ValueError(
        f"Invalid threshold level: {level}. Must be one of 'warning', 'error', or 'critical'."
    )

# Get the threshold status using the appropriate method
if level == "warning":
    status = self.warning(i=i)
elif level == "error":
    status = self.error(i=i)
elif level == "critical":
    status = self.critical(i=i)

# Return True if any steps exceeded the threshold
return any(status.values())
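
The control flow of the method body above has a few edge cases that are easy to miss: an
object that has not been interrogated yields `False`, the level is lowercased before it is
checked, and an unknown level raises `ValueError`. The sketch below mirrors that flow with a
stand-in class so the cases can be exercised without pointblank. `FakeValidate` and its
`_status()` helper are hypothetical and are not part of the library.

```python
class FakeValidate:
    """Hypothetical stand-in that mirrors the control flow, not pointblank's class."""

    def __init__(self, statuses=None):
        # statuses maps level -> {step number: bool}; None means "not interrogated".
        self.time_start = None if statuses is None else "interrogated"
        self._statuses = statuses or {}

    def _status(self, level, i):
        # Stand-in for the warning()/error()/critical() lookups.
        per_step = self._statuses[level]
        if i is None:
            return per_step
        wanted = [i] if isinstance(i, int) else i
        return {step: per_step[step] for step in wanted}

    def above_threshold(self, level="warning", i=None):
        # Not interrogated yet: report False rather than raising.
        if self.time_start is None:
            return False
        # Levels are matched case-insensitively.
        level = level.lower()
        if level not in ("warning", "error", "critical"):
            raise ValueError(f"Invalid threshold level: {level}.")
        return any(self._status(level, i).values())


fake = FakeValidate(
    {
        "warning": {1: True, 2: False},
        "error": {1: False, 2: False},
        "critical": {1: False, 2: False},
    }
)

print(FakeValidate().above_threshold())      # not interrogated
print(fake.above_threshold("WARNING"))       # mixed case is accepted
print(fake.above_threshold("warning", i=2))  # single step number
```

Because the interrogation check comes first, an invalid level passed to an object that has not
been interrogated returns `False` instead of raising; callers who rely on the `ValueError`
should interrogate first.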

def n(self, i: int | list[int] | None = None, scalar: bool = False) -> dict[int, int] | int:
"""
Provides a dictionary of the number of test units for each validation step.