Frontier Transfer Certifier

frontier-transfer-certifier is an alpha research prototype: a Python package and CLI for deciding whether a small-scale agent intervention has enough replayable evidence to justify a larger frontier trial.

This package provides full theory-registry coverage for the paper and implements selected conservative executable pathways for frontier-transfer certification. Formal items not backed by dedicated compilers are represented as schema-only artifacts, fail-closed diagnostics, or impossibility diagnostics. The package is not a formal proof assistant and not a complete implementation of every theorem.

The tool is deliberately conservative. It is not a predictor that says "the small model improved, so the frontier model will improve." It is an audit compiler that asks a narrower question:

Given replayable inputs, explicit descriptors, evaluation anchors, provenance tags, and lower-bound assumptions, is there still a positive conservative margin after all transport, protocol, cost, and confidence penalties are subtracted?

Implemented compiler paths return one of two outcomes:

  • certificate: the lower margin is positive under the supplied finite evidence and required external evidence has replay hashes, so the intervention can be treated as a candidate for a frontier trial.
  • abstention: some required evidence, descriptor coordinate, raw-log synthesis, semantic translation proof, evaluation anchor, or provenance tag is missing or invalid, so the tool refuses to make a positive claim.

This implementation follows:

Takahashi, K. (2026). Small-to-Frontier Transfer Theory for Agentic AI. Zenodo. https://doi.org/10.5281/zenodo.19619480

Scope And Limitations

  • Positive certificates are emitted only by implemented compiler paths.
  • Registry coverage means every formal paper item is represented. It does not mean every theorem has a dedicated executable compiler.
  • Schema-only, diagnostic-only, and impossibility-diagnostic registry entries cannot support positive certification by themselves.
  • External or imported evidence consumed by a positive certificate must be linked to field-level replay evidence with valid SHA-256 digests. External evidence without replay hashes weakens replayability and causes abstention.
  • Trusted-core and source-translation support is prototype-level unless checker digests, witness hashes, source certificate hashes, and translation witness hashes are supplied.
  • Confidence support is limited to fixed-horizon bounded-mean intervals and explicit manifest slack. Anytime-valid auditing and global alpha accounting are represented conservatively, not fully implemented.
  • Large portfolio fallback uses deterministic density ordering without claiming an approximation guarantee.
  • The package is designed to fail closed when assumptions are absent, unsupported, or under-specified.

Not Reproduced In Alpha

This alpha release is not a complete computational reproduction of the paper. The following parts are represented in the registry and documentation, but are not implemented as general positive certifiers:

  • mechanically checked proofs of the paper's theorems;
  • automatic discovery of hidden causal structure or missing descriptors from arbitrary unstructured logs;
  • open-world raw-log synthesis beyond finite manifest-declared event algebras, channels, statistics, and novelty gates;
  • replay retrieval or verification of external artifacts beyond user-supplied SHA-256 digests;
  • trusted-core kernel verification beyond digest and witness-hash checks;
  • source-translation proof checking beyond typed metadata, source-certificate hashes, translation-witness hashes, and positive source margins;
  • anytime-valid confidence sequences and global alpha accounting;
  • approximation guarantees for large portfolio allocation unless future compilers verify the required monotonicity or submodularity assumptions;
  • external predictive validity of semantic units beyond the finite packet, packing, anchor-loss, and translation-loss calculations supplied in the manifest.

These limitations are intentional trust boundaries. When a positive compiler path cannot represent and check the required assumptions, the output is a reason-coded abstention rather than a weakened certificate.

What The Tool Does

At the core, the package compiles a manifest into a lower-bound ledger:

G_lower - H_upper - C_upper
  - transport_penalty
  - protocol_penalty
  - confidence_slack
  > 0

where:

  • G is the direct lower-bounded gain.
  • H is the upper-bounded harmful or adverse load.
  • C is the upper-bounded residual cost.
  • The remaining terms are explicit conservative penalties.
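The ledger above can be sketched in a few lines. This is an illustrative reimplementation of the inequality, not the package's internal code; the function names and example values are hypothetical.

```python
# Hypothetical sketch of the lower-bound ledger; names and values are
# illustrative, not the package's actual schema.
def margin_lower(g_lower: float, h_upper: float, c_upper: float,
                 transport_penalty: float, protocol_penalty: float,
                 confidence_slack: float) -> float:
    """Conservative margin: lower-bounded gain minus every upper-bounded
    load, cost, and explicit penalty."""
    return (g_lower - h_upper - c_upper
            - transport_penalty - protocol_penalty - confidence_slack)

def certifies(margin: float) -> bool:
    # Fail closed: only a strictly positive margin supports a certificate.
    return margin > 0

m = margin_lower(1.0, 0.3, 0.2, 0.1, 0.08, 0.05)
print(round(m, 3), certifies(m))  # 0.27 True
```

Because every term is bounded in the unfavorable direction, a positive result is conservative: any slack in the bounds only shrinks the reported margin.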

The input is a YAML or JSON manifest. The output is deterministic JSON and a stable SHA-256 replay hash. The canonical replay hash is computed over the normalized manifest model using sorted-key JSON, deterministic separators, and no NaN or infinity. Raw file-byte hashing is available only with ftc replay --raw.
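The canonical-hash recipe above (sorted keys, deterministic separators, no NaN or infinity) can be illustrated with the standard library. This is a sketch of the hashing step only; the package's actual normalization of the manifest model happens before this point.

```python
# Sketch of canonical replay hashing: sorted-key JSON with deterministic
# separators, NaN/infinity rejected, hashed with SHA-256. Illustrative only.
import hashlib
import json

def canonical_replay_hash(manifest: dict) -> str:
    payload = json.dumps(manifest, sort_keys=True,
                         separators=(",", ":"), allow_nan=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

# Key order in the source file does not affect the canonical hash.
print(canonical_replay_hash({"b": 2, "a": 1})
      == canonical_replay_hash({"a": 1, "b": 2}))  # True
```

Hashing the normalized model rather than raw file bytes is what lets semantically identical YAML and JSON manifests replay to the same digest; ftc replay --raw opts back into byte-level hashing.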

Five-Minute Quickstart

uv sync
uv run ftc profile
uv run ftc theorems
uv run ftc theory coverage
uv run ftc validate examples/split_inference_minimal.yaml
uv run ftc certify examples/split_inference_minimal.yaml

Expected certificate summary:

output_type                 certificate
certificate_mode            layered_validated
margin_lower                0.266
accepted_for_frontier_trial True

Write the certificate to disk:

uv run ftc certify examples/split_inference_minimal.yaml --out out/certificate.json

Choose A Usage Profile

Use only the layers you need. The layers are modular and can be adopted incrementally.

uv run ftc profile minimal
uv run ftc profile standard
uv run ftc profile advanced
uv run ftc profile full
  • minimal (level 0): you only need schema validation, replay hashes, and abstention readiness checks. Main commands: ftc validate, ftc replay, ftc schema export.
  • standard (level 1): you have explicit component bounds or split-inference observables for one intervention. Main command: ftc certify.
  • advanced (level 3): you need generated raw-log evidence, descriptor repair, or budgeted portfolio allocation. Main commands: ftc synthesize, ftc descriptor, ftc allocate.
  • full (level 4): you need all currently implemented alpha compiler paths: raw-log synthesis, descriptor-language growth, allocation, and semantic translation checks. Main commands: ftc validate, ftc synthesize, ftc descriptor, ftc certify, ftc allocate, ftc semantic.

CLI Examples

Validate and hash a manifest:

uv run ftc validate examples/split_inference_minimal.yaml
uv run ftc replay examples/split_inference_minimal.yaml
uv run ftc replay examples/split_inference_minimal.yaml --raw

List theorem IDs accepted by the implemented compiler scopes:

uv run ftc theorems
uv run ftc theorems transport
uv run ftc theory list
uv run ftc theory show thm:certificate-sound
uv run ftc theory coverage --json

Compile a single-intervention certificate:

uv run ftc certify examples/split_inference_minimal.yaml

Compile stage-one raw-log open-world synthesis:

uv run ftc synthesize examples/raw_log_open_world_minimal.yaml
uv run ftc certify examples/raw_log_open_world_minimal.yaml

Select a descriptor-language scaffold:

uv run ftc descriptor examples/descriptor_growth_minimal.yaml

Allocate a certified portfolio:

uv run ftc allocate examples/allocation_minimal.yaml

Compile semantic and source-translation checks:

uv run ftc semantic examples/semantic_translation_minimal.yaml

Export JSON Schema:

uv run ftc schema export transport_certificate_core --out out/transport.schema.json
uv run ftc schema export raw_log_synthesis --out out/raw_log_synthesis.schema.json
uv run ftc schema export descriptor_language_growth --out out/descriptor_growth.schema.json
uv run ftc schema export audit_policy_and_allocation --out out/allocation.schema.json
uv run ftc schema export semantic_and_translation_manifest --out out/semantic.schema.json

Create example manifests:

uv run ftc init split-inference --out scratch/split.yaml
uv run ftc init raw-log-open-world --out scratch/raw.yaml
uv run ftc init descriptor-growth --out scratch/descriptor.yaml
uv run ftc init allocation --out scratch/allocation.yaml
uv run ftc init semantic-translation --out scratch/semantic.yaml

Python API

from frontier_transfer_certifier import (
    compile_allocation,
    compile_certificate,
    compile_descriptor_growth,
    compile_raw_synthesis,
    compile_semantic_translation,
    get_profile,
)

profile = get_profile("standard")
print(profile.commands)

certificate = compile_certificate("examples/split_inference_minimal.yaml")
print(certificate.output_type)

stage_one = compile_raw_synthesis("examples/raw_log_open_world_minimal.yaml")
print(stage_one.output_type)

descriptor = compile_descriptor_growth("examples/descriptor_growth_minimal.yaml")
print(descriptor.selected_scaffold_id)

allocation = compile_allocation("examples/allocation_minimal.yaml")
print(allocation.selected_candidates)

semantic = compile_semantic_translation("examples/semantic_translation_minimal.yaml")
print(semantic.semantic_lower_index)

Manifest Requirements

The package never silently invents missing theory objects. A positive certificate requires compatible provenance for every consumed field.

If a positive certificate consumes imported evidence, each consumed imported field must have a field_evidence_ref pointing to a replay_ref with a valid 64-character lowercase SHA-256 digest. Example manifests use deterministic example digests; replace them with real evidence hashes before relying on a certificate outside demonstrations.
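The digest shape rule can be checked mechanically. The following is a minimal sketch of that check, not the package's validator; the function name is hypothetical.

```python
# Minimal sketch of the digest rule stated above: a valid replay digest is
# exactly 64 lowercase hexadecimal characters (a SHA-256 hex digest).
import re

_DIGEST_RE = re.compile(r"^[0-9a-f]{64}$")

def is_valid_replay_digest(digest: str) -> bool:
    return _DIGEST_RE.fullmatch(digest) is not None

print(is_valid_replay_digest("a" * 64))  # True
print(is_valid_replay_digest("A" * 64))  # False: uppercase is rejected
```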

All core schemas reject unknown fields by default. Measurement, bound, penalty, probability, budget, weight, cost, margin, overlap, drift, and confidence fields must be finite; NaN and infinity are invalid. Harmful-cost, penalty, overhead, tail, drift, curvature, and slope upper bounds are nonnegative unless a field is explicitly documented as a signed lower-bound quantity.

Supported provenance tags:

  • raw_generated: produced by a finite compiler in this package, such as raw-log synthesis or descriptor-language growth.
  • imported_and_validated: imported from another source and validated in this framework.
  • declared_and_validated: declared bookkeeping objects such as confidence budgets, split partitions, or deterministic tie rules.
  • blocked: forbidden for downstream positive claims.

Important output modes:

  • native_raw_log_complete: all substantive consumed evidence is raw-generated by finite subcompilers implemented by this package or validated bookkeeping.
  • layered_validated: at least one substantive consumed evidence object is imported and validated.
  • abstention: no positive claim is made.

The included examples are intentionally conservative. Split-inference and finite-context manifests are layered_validated because their source-theory objects are supplied as validated inputs. A field should be tagged raw_generated only when it is emitted by a finite replayable subcompiler that this package can identify, such as raw_log_synthesis or descriptor_language_growth.
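The relationship between provenance tags and output modes can be sketched as a simple decision rule. This is a hypothetical illustration of the definitions above, not the package's actual mode-selection logic.

```python
# Hypothetical sketch: derive an output mode from the provenance tags of the
# substantive consumed evidence, following the definitions above.
def output_mode(tags: list[str]) -> str:
    if not tags or "blocked" in tags:
        return "abstention"            # fail closed on missing or forbidden evidence
    if any(t not in ("raw_generated", "imported_and_validated",
                     "declared_and_validated") for t in tags):
        return "abstention"            # unknown tags never support a positive claim
    if "imported_and_validated" in tags:
        return "layered_validated"     # at least one imported, validated object
    return "native_raw_log_complete"   # only raw-generated or validated bookkeeping

print(output_mode(["raw_generated", "declared_and_validated"]))  # native_raw_log_complete
print(output_mode(["raw_generated", "imported_and_validated"]))  # layered_validated
print(output_mode(["blocked"]))                                  # abstention
```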

The theorem-dependency list is also checked. A manifest cannot name arbitrary paper theorem IDs and receive a positive result unless the current compiler scope implements or operationalizes those dependencies. ftc validate performs this strict check by default; pass --schema-only when you only want Pydantic schema validation.
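The strict dependency check amounts to a subset test against the implemented scope. The sketch below is illustrative; the scope contents are made-up placeholders except thm:certificate-sound, which appears in the CLI examples above.

```python
# Sketch of the strict theorem-dependency check: a manifest's declared
# dependencies must be covered by the implemented compiler scope, otherwise
# the result is a reason-coded abstention. Scope contents are illustrative.
IMPLEMENTED_SCOPE = {"thm:certificate-sound", "thm:transport-penalty"}

def check_dependencies(required: set[str]) -> tuple[bool, set[str]]:
    missing = required - IMPLEMENTED_SCOPE
    return (not missing, missing)

ok, missing = check_dependencies({"thm:certificate-sound", "thm:open-world"})
print(ok, sorted(missing))  # False ['thm:open-world']
```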

Included Examples

  • examples/split_inference_minimal.yaml: split-inference worked example.
  • examples/finite_context_minimal.yaml: explicit bounded component evidence.
  • examples/raw_log_open_world_minimal.yaml: event-local raw logs compiled into stage-one bounds.
  • examples/descriptor_growth_minimal.yaml: descriptor scaffold selection.
  • examples/allocation_minimal.yaml: singleton, pairwise, coordination, resource, factor, and dense residual allocation packets.
  • examples/semantic_translation_minimal.yaml: semantic packing with trusted source-translation checks.

Security And Data Handling

  • The CLI performs no network communication by default.
  • Raw private logs should not be copied into the repository. Store external references and replay hashes in manifests instead.
  • .env, credentials, private logs, local caches, out/, and data/private/ are ignored by Git.
  • CLI output avoids printing secret-like values.
  • CI is configured for tests, linting, typing, and secret-scan regression checks.

Alpha Release Trust Boundary

Registry coverage and implementation coverage are different. The registry is pinned to a recorded paper-source snapshot and covers every parsed formal item. Only selected registry entries are implemented by positive certifier code. All other entries are represented by schema-only or diagnostic artifacts that fail closed.

Trusted-core and source-translation claims require concrete witness hashes and checker digests before they can contribute to a positive semantic output. Unsupported theorem dependencies fail validation or produce abstention. The project does not claim production readiness, formal verification, or complete implementation of the paper.

License

Apache License 2.0. This repository intentionally does not include a NOTICE file.

About

Fail-closed certification toolkit for small-to-frontier transfer in agentic AI, with typed manifests, replayable evidence, and theory-registry coverage.
