-
Notifications
You must be signed in to change notification settings - Fork 0
Add extensible prover implementation #103
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
class Plugin(ABC): | ||
@abstractmethod | ||
def dist(self) -> Dist: ... | ||
|
||
def inits(self) -> Mapping[str, Init]: | ||
return {} | ||
|
||
def shows(self) -> Mapping[str, Show]: | ||
return {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now, a plugin defines three things:
- K definition to be used.
- Ways for loading specifications (optional).
- Ways for printing proof nodes (optional).
from .api import Config, Init, Show | ||
|
||
|
||
def init_from_claims(config: Config, spec_file: Path, claim_id: str) -> APRProof: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All plugins implicitly support loading claims from K files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I notice that everywhere we have claim_id
as a str
, but not a str | None
. Will this enable the case where a user wants to either (i) run the only claimthat is in th efile, or (ii) run all th eclaims in a given file?
return proof | ||
|
||
|
||
def show_pretty_term(config: Config, term: KInner) -> str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly, all plugins support printing proof nodes using kore-print
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we using kore_print
? I think we should use Formatter
or PrettyPrinter
instead, does kore_print
corrcetly handle thinsg like variables?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does
kore_print
corrcetly handle thinsg like variables?
Yes, kore_print
handles symbolic terms as well.
Why are we using
kore_print
? I think we should useFormatter
orPrettyPrinter
instead
We could use Formatter
for sure. An advantage of kore_print
is that it handles an extra (non-trivial) case for parenthesization. Also, although I do not have measurements, for large terms, kore_print
is probably significantly faster as it is implemented in C++.
Should we switch to Formatter
as the default Show
implementation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No if kore_print
works on symbolic values we should be good.
class KRiscVPlugin(Plugin): | ||
def dist(self) -> Dist: | ||
from pyk.kdist import kdist | ||
|
||
return Dist( | ||
haskell_dir=kdist.get('riscv-semantics.haskell'), | ||
llvm_lib_dir=kdist.get('riscv-semantics.llvm-lib'), | ||
source_dirs=(kdist.get('riscv-semantics.source'),), | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A minimal plugin for riscv-semantics
: it only defines the definitions to be used.
[tool.poetry.plugins.kprovex] | ||
riscv = "kriscv.symtools:KRiscVPlugin" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The prover plugin is registered.
from .api import Init, Plugin, Show | ||
|
||
|
||
def create_prover(plugin_id: str, proof_dir: str | Path, *, bug_report: BugReport | None = None) -> KProveX: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A prover for a registered plugin can be instantiated by referring to the plugin by name:
prover = create_prover('riscv', proof_dir='.proofs')
return _ID_PATTERN.fullmatch(s) is not None | ||
|
||
|
||
PLUGINS: Final = _load_plugins() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When this module is imported, all plugins are loaded. (Lazy loading could be improved so that each plugin is loaded only when it is first referred to.)
|
||
@final | ||
@dataclass | ||
class KProveX: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The prover implementation: currently supports a minimal set of useful features including show
, view
, advance+proof
, prune
, etc.
After finalizing the interfaces I'll add docstrings for the public API. |
def init_proof( | ||
self, | ||
spec_file: str | Path, | ||
claim_id: str, | ||
*, | ||
init_id: str | None = None, | ||
exist_ok: bool = False, | ||
) -> str: | ||
spec_file = Path(spec_file) | ||
init = self._load_init(init_id=init_id) | ||
proof = init(config=self.config, spec_file=spec_file, claim_id=claim_id) | ||
if not exist_ok and APRProof.proof_data_exists(proof.id, self.proof_dir): | ||
raise ValueError(f'Proof with id already exists: {proof.id}') | ||
|
||
proof.write_proof_data() | ||
return proof.id |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we makes it more flexible? For example, when I'm working on the zekvm-harness
, I starts from a concrete configuration and constructs a claim by myself:
# concrete config
elf_file = build_elf(test_id, load_template, build_config)
symdata = {resolve_symbol(elf_file, f'OP{i}'): (32, f'W{i}') for i in range(arg_count)}
# make symbolic initial state
init_config = _init_config(symdata, build_config, elf_file, tools(build_config.target))
# construct claim
kclaim = cterm_build_claim(test_id.upper(), init_config, _final_config(symtool))
proof = APRProof.from_claim(symtool.kprove.definition, kclaim[0], {}, symtool.proof_dir)
BTW, must we have a final state? For my experience, I always start with a totally abstract state to explore the state transition graph because there are always problems at the begining. Additionally, we can eliminate the time to implies the final state
if I know the step it might reach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My expectation for this class to to simplify the following test:
# Given
symtool = symtools(f'{build_config.target}-haskell', f'{build_config.target}-lib', 'zkevm-semantics.source')
if APRProof.proof_data_exists(test_id.upper(), symtool.proof_dir):
proof = APRProof.read_proof_data(proof_dir=symtool.proof_dir, id=test_id.upper())
else:
elf_file = build_elf(test_id, load_template, build_config)
symdata = {resolve_symbol(elf_file, f'OP{i}'): (32, f'W{i}') for i in range(arg_count)}
init_config = _init_config(symdata, build_config, elf_file, tools(build_config.target))
kclaim = cterm_build_claim(test_id.upper(), init_config, _final_config(symtool))
proof = APRProof.from_claim(symtool.kprove.definition, kclaim[0], {}, symtool.proof_dir)
# When
with symtool.explore(id=test_id.upper()) as kcfg_explore:
prover = APRProver(
kcfg_explore=kcfg_explore,
execute_depth=DEPTH,
# optimize_kcfg=True,
)
prover.advance_proof(proof, max_iterations=MAX_ITERATIONS)
proof_show = APRProofShow(symtool.kprove)
show_result = '\n'.join(proof_show.show(proof, [node.id for node in proof.kcfg.nodes]))
(symtool.proof_dir / f'{test_id.upper()}-proof-result.txt').write_text(show_result)
into something like
kprovex = create_prover(build_config.target, proof_dir, bug_report = bug_report)
elf_file = build_elf(test_id, load_template, build_config)
symdata = {resolve_symbol(elf_file, f'OP{i}'): (32, f'W{i}') for i in range(arg_count)}
init_config = _init_config(symdata, build_config, elf_file, tools(build_config.target))
# I can change the reinit to True to restart a proof
proof_id = kprovex.init_proof_from_cterm(test_id.upper(), init_config, _final_config(symtool), exist_ok = True, reinit = False)
# I might prune a node if it is too far away from the error point for easy investigation.
# kprovex.prune_node(proof_id, 12) # I can change the number to continue execution from the node before it.
# I need optimize_kcfg to make the kcfg small
krpovex.advance_proof(proof_id, max_depth = 100, max_iterations = 100, optimize_kcfg = True)
# It can be helpful if I can also minimize the proof whenever I want. It can be better if the minimization generate a new proof.
# kprovex.minimize_proof()
# it can be better if I can see the difference between particular nodes and eliminate specific cells that I don't care.
(symtool.proof_dir / f'{test_id.upper()}-proof-result.txt').write_text(kprovex.show(proof_id))
I think this PR is good enough to use , and I can provide further modification if I need.
This PR introduces a module
kprovex
that defines an extensible prover based onAPRPRover
. It has the the following submodules:kprovex.api
: the plugin API definition. A plugin provides the K definition to the prover, as well as functions for loading and printing proofs of that definition.kprovex._default
: semantics-agnostic defaults for loading and printing proofs.kprovex._loader
: plugin loader.kprovex._kprovex
: prover implementation.Furthermore, it adds a simple plugin implementation for
riscv-semantics
.The advantage of this architecture is that
kprovex
can be upstreamed topyk
without breaking it or theriscv-semantics
prover. (Naturally, imports still need to be adjusted.)