Add UNURAN sampling#71

Open
wrdxwrdxwrdx wants to merge 5 commits into main from distribution/sampling

@wrdxwrdxwrdx wrdxwrdxwrdx commented Feb 21, 2026

UNU.RAN integration

Start with the notebooks and scripts under examples/; they showcase end-to-end scenarios for the new sampling strategy.

After git clone, you now need to run:
git submodule update --init --remote --recursive

The unuran submodule is currently cloned from its dev branch. Please review the PR.

WARNING

Added dpdf to the registry, because many methods require it to work. In the future, it should be implemented in the registry more properly.

Platforms

  • Supported: Linux, macOS, and Windows🙏. Windows requires MSVC.

Added

  1. CFFI build + UNU.RAN bindings.
  2. DefaultSamplingStrategy and DefaultSampler, which sample based on a distribution and a method config.
  3. Tests covering CFFI build, callbacks, initialization, and orchestration.

Characteristics & methods

  • Continuous (PDF/CDF/PPF) -> methods AROU, TDR, HINV, PINV, NINV.
  • Discrete (PMF/CDF) -> DGT.
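
The mapping above can be sketched as a small lookup table. This is only an illustration: `Kind` is a stand-in enum and `candidate_methods` is a hypothetical helper, not the PR's API. Only the method names come from the list above.

```python
from enum import Enum


class Kind(Enum):
    """Stand-in for the project's distribution kind (assumed shape)."""

    CONTINUOUS = "continuous"
    DISCRETE = "discrete"


# Method names exactly as listed in the PR description.
METHODS_BY_KIND: dict[Kind, tuple[str, ...]] = {
    Kind.CONTINUOUS: ("AROU", "TDR", "HINV", "PINV", "NINV"),
    Kind.DISCRETE: ("DGT",),
}


def candidate_methods(kind: Kind) -> tuple[str, ...]:
    """Return the UNU.RAN method names the description lists for this kind."""
    return METHODS_BY_KIND[kind]
```

Which of the candidates is actually usable still depends on the characteristics a given distribution provides; that selection logic is not shown here.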

@wrdxwrdxwrdx wrdxwrdxwrdx force-pushed the distribution/sampling branch 2 times, most recently from a1de7e9 to d8ff3f0 Compare February 22, 2026 23:15
@wrdxwrdxwrdx wrdxwrdxwrdx marked this pull request as ready for review February 22, 2026 23:57

@LeonidElkin LeonidElkin left a comment

Reviewed only core/* except for _unuran_sampler

Comment on lines 1 to -15
@@ -9,10 +8,9 @@ jobs:
     strategy:
       fail-fast: false
       matrix:
-        os: [ ubuntu-latest, macos-latest, windows-latest ]
-        python-version: [ "3.12", "3.13" ]
+        os: [ubuntu-latest, macos-latest, windows-latest]
+        python-version: ["3.12", "3.13"]
     runs-on: ${{ matrix.os }}

Some tool probably auto-formatted this. Formatting-only diffs like this shouldn't be in the PR.

I think this and the other pictures could be generated inside the Jupyter notebook with matplotlib. There is no need to store them directly in our repository.

"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.3"
}

You probably executed this notebook with a different kernel version or something similar, so the metadata was rewritten. This diff shouldn't have been committed, because it doesn't affect the repository in any way.

Comment on lines +10 to +17
_unuran_cffi_module: ModuleType | None = None
try:
    _unuran_cffi_module = import_module("pysatl_core.sampling.unuran.bindings._unuran_cffi")
except ModuleNotFoundError:  # pragma: no cover - optional binary module
    try:
        _unuran_cffi_module = import_module("_unuran_cffi")
    except ModuleNotFoundError:  # pragma: no cover - optional binary module
        _unuran_cffi_module = None

You could use something like this to avoid the unnecessary nesting. It also seems more natural to me.

_unuran_cffi_module = None

for name in (
    "pysatl_core.sampling.unuran.bindings._unuran_cffi",
    "_unuran_cffi",
):
    try:
        _unuran_cffi_module = import_module(name)
        break
    except ModuleNotFoundError:  # pragma: no cover - optional binary module
        pass

Comment on lines +124 to +128
if _unuran_cffi is None:
    raise RuntimeError(
        "UNURAN CFFI bindings are not available. "
        "Please build them via `python "
        "poetry build"

The message just breaks off. Please finish it.

When True, the last sampler and distribution are cached and reused
if the same distribution object is used in subsequent calls.
"""
self._default_config = default_config or UnuranMethodConfig() # TODO config not default

What do you mean by "default config"? Shouldn't we just accept a config and, if none was passed, use the default value, which is simply UnuranMethodConfig()?

After looking at this file in detail, I've come to the conclusion that what we call a strategy is not actually a strategy. The sampler itself could simply be called a strategy, and that would make more sense. It seems to me that this abstraction should be removed: neither this file nor the abstraction as a whole is needed.

6. Initializes the UNURAN generator
"""
self.distr = distr
self.config = config or UnuranMethodConfig()

@LeonidElkin LeonidElkin Feb 23, 2026

If this means the config can be changed at runtime and everything will keep working, then OK. If not, it is better to make it read-only via a property.

Comment on lines +216 to +217
def _cleanup(self) -> None:
    cleanup_unuran_resources(self)

I think it's better to do this properly, with a class method.

I think it's better to make this a static method of DefaultUnuranSampler.

@LeonidElkin LeonidElkin added the API: Design, ALG: Sampling (New algorithms for sampling or improvements of existing ones), CI: tooling, and CI: tests labels Feb 23, 2026

@LeonidElkin LeonidElkin left a comment

Implement tests as packages with __init__.py

f"Unsupported distribution type: {distr_type}. "
"Only Euclidean distribution types are supported."
)
self._is_continuous = distr_type.kind == Kind.CONTINUOUS

Let's store it as a kind, not as a bool

Comment on lines +65 to +68
if self._is_continuous:
    self._unuran_distr = self._lib.unur_distr_cont_new()
else:
    self._unuran_distr = self._lib.unur_distr_discr_new()

Replace it with a match statement (Python's equivalent of switch/case).

Comment on lines +196 to +213
if hasattr(self, "_unuran_gen") and self._unuran_gen is not None:
    if self._unuran_gen != self._ffi.NULL:
        with contextlib.suppress(Exception):
            self._lib.unur_free(self._unuran_gen)
            gen_freed = True
    self._unuran_gen = None

if hasattr(self, "_unuran_par") and self._unuran_par is not None:
    if not gen_freed and self._unuran_par != self._ffi.NULL:
        with contextlib.suppress(Exception):
            self._lib.unur_par_free(self._unuran_par)
    self._unuran_par = None

if hasattr(self, "_unuran_distr") and self._unuran_distr is not None:
    if self._unuran_distr != self._ffi.NULL:
        with contextlib.suppress(Exception):
            self._lib.unur_distr_free(self._unuran_distr)
    self._unuran_distr = None

There is no need for hasattr if you initialize the attribute to None and this is neither a classmethod nor a staticmethod.
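
A minimal sketch of that pattern. `FakeLib` is a hypothetical stand-in for the CFFI library object that only records calls, and `Sampler` is not the PR's class.

```python
class FakeLib:
    """Hypothetical stand-in for the CFFI library object; records free calls."""

    def __init__(self) -> None:
        self.freed: list[str] = []

    def unur_free(self, gen: object) -> None:
        self.freed.append("gen")


class Sampler:
    def __init__(self, lib: FakeLib) -> None:
        self._lib = lib
        # Declared up front, so cleanup never needs hasattr().
        self._unuran_gen: object | None = None

    def init_generator(self) -> None:
        self._unuran_gen = object()

    def cleanup(self) -> None:
        if self._unuran_gen is not None:
            self._lib.unur_free(self._unuran_gen)
            self._unuran_gen = None
```

This only holds if the attributes are assigned before anything in `__init__` can raise; if cleanup is also reachable from `__del__` on a partially constructed object, the assignments need to come first.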

Comment on lines +238 to +244
for characteristic in available_chars:
    if characteristic in self._distr.analytical_computations:
        distr_characteristic[characteristic] = cast(
            "Method[Any, Any]", self._distr.analytical_computations[characteristic]
        )
    else:
        distr_characteristic[characteristic] = self._distr.query_method(characteristic)

query_method already checks whether the characteristic is analytical.
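
If that is the case, the loop collapses to one call per characteristic. The sketch below uses a hypothetical `FakeDistribution` whose `query_method` is assumed to prefer analytical computations; the real class's resolution logic is not reproduced.

```python
from collections.abc import Callable

Func = Callable[[float], float]


class FakeDistribution:
    """Hypothetical stand-in: query_method prefers an analytical computation."""

    def __init__(self, analytical: dict[str, Func]) -> None:
        self.analytical_computations = analytical

    def query_method(self, name: str) -> Func:
        if name in self.analytical_computations:
            return self.analytical_computations[name]
        # Placeholder for whatever non-analytical resolution the real class does.
        return lambda x: float("nan")


def collect(distr: FakeDistribution, names: list[str]) -> dict[str, Func]:
    # One call per characteristic; no separate analytical branch is needed.
    return {name: distr.query_method(name) for name in names}
```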

characteristics: Mapping[CharacteristicName, Method[Any, Any]],
):
self._unuran_distr = unuran_distr
self._is_continuous = is_continuous

Use Kind here, not bool.

Comment on lines +34 to +107
def determine_domain_from_support(self) -> tuple[int, int | None] | None:
    """
    Determine domain boundaries from distribution support if available.

    Returns
    -------
    tuple[int, int | None] or None
        Domain as (left, right) if support is bounded, or (left, None) if only
        left boundary is known, or None if support is unavailable/unbounded.
    """
    support = self._distr.support
    if support is None:
        return None

    if isinstance(support, ExplicitTableDiscreteSupport):
        points = support.points
        if points.size == 0:
            return None
        left = int(np.floor(points[0]))
        right = int(np.ceil(points[-1]))
        return (left, right)

    if isinstance(support, IntegerLatticeDiscreteSupport):
        first = support.first()
        last = support.last()

        if first is not None and last is not None:
            return (first, last)

        if first is not None:
            return (first, None)

        return None

    # Fallback: try to call first() and last() methods
    # if not IntegerLatticeDiscreteSupport or ExplicitTableDiscreteSupport
    first_callable = getattr(support, "first", None)
    last_callable = getattr(support, "last", None)

    if callable(first_callable) and callable(last_callable):
        try:
            first = first_callable()
            last = last_callable()
            if first is not None and last is not None:
                return (int(first), int(last))
        except TypeError:
            pass

    return None

def _get_continuous_support_bounds(self) -> tuple[float, float] | None:
    support = getattr(self._distr, "support", None)
    if support is None:
        return None

    left = getattr(support, "left", None)
    right = getattr(support, "right", None)

    if isinstance(left, int | float) and isinstance(right, int | float):
        return float(left), float(right)

    # Fallback to callable accessors if available (e.g., support.first()/last())
    left_fn = getattr(support, "first", None)
    right_fn = getattr(support, "last", None)
    try:
        left_val = float(left_fn()) if callable(left_fn) else None
        right_val = float(right_fn()) if callable(right_fn) else None
    except TypeError:
        left_val = right_val = None

    if left_val is not None and right_val is not None:
        return left_val, right_val

    return None

Let's name these functions consistently (something like determine_discrete_domain and determine_continuous_domain) and make them both public, not private.

Comment on lines +31 to +32
def __init__(self, disr: Distribution) -> None:
    self._distr = disr

typos, ahhhrrr

Comment on lines +48 to +54
if isinstance(support, ExplicitTableDiscreteSupport):
    points = support.points
    if points.size == 0:
        return None
    left = int(np.floor(points[0]))
    right = int(np.ceil(points[-1]))
    return (left, right)

We need to fall back to our default sampler if the table contains even one non-integer value.

Comment on lines +68 to +80
# Fallback: try to call first() and last() methods
# if not IntegerLatticeDiscreteSupport or ExplicitTableDiscreteSupport
first_callable = getattr(support, "first", None)
last_callable = getattr(support, "last", None)

if callable(first_callable) and callable(last_callable):
    try:
        first = first_callable()
        last = last_callable()
        if first is not None and last is not None:
            return (int(first), int(last))
    except TypeError:
        pass

Remove

return None
left = int(np.floor(points[0]))
right = int(np.ceil(points[-1]))
return (left, right)

Remove the redundant parentheses here and in other places.

@LeonidElkin

Rename vendor to subprojects

@wrdxwrdxwrdx wrdxwrdxwrdx force-pushed the distribution/sampling branch from 81c9dc0 to 1478930 Compare March 2, 2026 17:52
@wrdxwrdxwrdx wrdxwrdxwrdx force-pushed the distribution/sampling branch from 1478930 to cd12314 Compare March 2, 2026 18:05
@wrdxwrdxwrdx wrdxwrdxwrdx requested a review from LeonidElkin March 2, 2026 18:39
@wrdxwrdxwrdx wrdxwrdxwrdx force-pushed the distribution/sampling branch from cd12314 to 5c22a36 Compare March 2, 2026 18:48
_CDEF_FILE = Path(__file__).with_name("cffi_unuran.h")
UNURAN_CDEF: Final = _CDEF_FILE.read_text()

LOGGER = logging.getLogger(__name__)

Where can I see the log if something goes wrong?
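
For context, with the standard logging module a logger created via `logging.getLogger(__name__)` propagates its records to ancestor loggers, and without any configuration only WARNING and above reach stderr. A sketch of how an application could surface the records, assuming the module logger lives under the `pysatl_core` namespace (the child logger name and the message below are made up for illustration):

```python
import io
import logging

# Assumption: the package's loggers are named "pysatl_core.<...>".
library_logger = logging.getLogger("pysatl_core")

stream = io.StringIO()  # stand-in for sys.stderr or a log file
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))
library_logger.addHandler(handler)
library_logger.setLevel(logging.DEBUG)

# A child logger, as logging.getLogger(__name__) would create inside the package.
logging.getLogger("pysatl_core.sampling.unuran").debug("building bindings")
captured = stream.getvalue()
```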

if self._cleaned_up:
    return

self._cleaned_up = True

I think it's better to move it to the bottom of the method for safety
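
A sketch of the suggested ordering, using a hypothetical `Resource` class with a `_free` step that can fail once. It is not the PR's cleanup code.

```python
class Resource:
    """Hypothetical sampler-like object with idempotent cleanup."""

    def __init__(self) -> None:
        self._cleaned_up = False
        self.free_calls = 0
        self.fail_once = False

    def _free(self) -> None:
        if self.fail_once:
            self.fail_once = False
            raise RuntimeError("free failed")
        self.free_calls += 1

    def cleanup(self) -> None:
        if self._cleaned_up:
            return
        self._free()
        # Set only after the resources were released; a failed attempt
        # leaves the flag False so cleanup can be retried.
        self._cleaned_up = True
```

The trade-off is that setting the flag first guards against re-entrant calls, while setting it last avoids marking an object as cleaned up when freeing did not complete.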

Comment on lines +388 to +391
@property
def _config(self) -> UnuranMethodConfig:
    """The configuration for this sampler (read-only)."""
    return self.__config

I think we could actually remove this method. The strategy already gives us a way to reach the config, and this code isn't used anywhere else.

We could also make self.__config just self._config.

UnuranMethodConfig,
)

from .unuran_sampler import DefaultUnuranSampler

Use the full import path.

Comment on lines +123 to +130
def __init__(
    self,
    distr: Distribution,
    config: UnuranMethodConfig | None = None,
    **override_options: Any,
) -> None:
    """Initialize the sampler for ``distr`` with optional configuration overrides."""
    ...

There is no need for __init__ in a Protocol. Correct me if I'm wrong.
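
A sketch of a protocol that describes only the behaviour callers rely on. `SamplerProtocol`, `ConstantSampler`, and the `sample` signature are hypothetical and do not mirror the PR's interfaces.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SamplerProtocol(Protocol):
    """Hypothetical protocol: only the behaviour callers rely on, no __init__."""

    def sample(self, n: int) -> list[float]: ...


class ConstantSampler:
    # The constructor signature is free to differ between implementations.
    def __init__(self, value: float) -> None:
        self._value = value

    def sample(self, n: int) -> list[float]:
        return [self._value] * n


def draw(sampler: SamplerProtocol, n: int) -> list[float]:
    return sampler.sample(n)
```

Code that receives an already constructed sampler never calls the constructor, so leaving `__init__` out of the protocol keeps implementations free to take whatever arguments they need.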

Add code like this to every __init__.py in tests:

"""
PySATL Core
===========

Core framework for probabilistic distributions providing type definitions,
distribution abstractions, characteristic computation graphs, and parametric
family management.
"""

__author__ = "YOUR NAME"
__copyright__ = "Copyright (c) 2025 PySATL project"
__license__ = "SPDX-License-Identifier: MIT"

Labels

ALG: Sampling (New algorithms for sampling or improvements of existing ones), API: Design, CI: tests, CI: tooling

2 participants