Skip to content

Conversation

@hbierlee
Copy link
Contributor

By adding encoding variables to CSE, e.g. for the direct encoding for x which encodes b == ( x == 2 ), adding cse[x==2] = b ensures b is used whenever we see the expression x==2.

In the following test, we see how this improves the encoding of (x == 0) | (x == 2) for x in 0..2:

old: [(BV1) or (BV2), sum([⟦x == 0⟧, ⟦x == 1⟧, ⟦x == 2⟧]) == 1, (BV1) -> (⟦x == 0⟧), (~BV1) -> (sum([0, 1, 2, -3] * (⟦x == 0⟧, ⟦x == 1⟧, ⟦x == 2⟧, BV3)) <= -1), (~BV1) -> (sum([0, 1, 2, -1] * (⟦x == 0⟧, ⟦x == 1⟧, ⟦x == 2⟧, BV3)) >= 0), sum([1, -1] * (BV1, ~BV3)) <= 0, (BV2) -> (⟦x == 2⟧), (~BV2) -> (sum([0, 1, 2, -1] * (⟦x == 0⟧, ⟦x == 1⟧, ⟦x == 2⟧, BV4)) <= 1), (~BV2) -> (sum([0, 1, 2, -3] * (⟦x == 0⟧, ⟦x == 1⟧, ⟦x == 2⟧, BV4)) >= 0), sum([1, -1] * (BV2, ~BV4)) <= 0]
new: [(⟦x == 0⟧) or (⟦x == 2⟧), sum([⟦x == 0⟧, ⟦x == 1⟧, ⟦x == 2⟧]) == 1]

However, this does mean integer variables need to be collected and encoded before the other transformations. This may be an encoding inefficiency, plus it does mean we cannot inspect the constraint that holds the variable (an inequality constraint normally uses order encoding, an (dis)equality the direct encoding).

Another problem is that the int2bool transformation requires boilerplate across solvers. This should be solved in a separate PR, I think. This optimization is now only added to pindakaas, but should also happen for the other solvers which use int2bool.

@tias
Copy link
Collaborator

tias commented Nov 6, 2025

int2bool should definitely get the csemap, always...

I'm not sure I like the printing change... it makes a variable look like a special thing... while its just a boolean variable; if only because those types of brackets cannot be written by a user... you could also just do BV(x==i) as name?

finaly, is adding cse to int2bool not sufficient? what happens if you just do that, and do not declare the boolean variables upfront?

(I'm all for improving the encoding output btw!)

@tias
Copy link
Collaborator

tias commented Nov 13, 2025

I think the problem is in linearize, and that it should be fixed there.

What you do now is that you first create the literals, and then flatten will not generate any b -> (x=v) or b = (x=v) constraints. But you have to do this for all intvars upfront with the domain encoding.

I think we can 'detect' it instead, by splitting linearize into two parts:

  • Part 1: everything now, except every time you encounter b -> (x=v) or b = (x=v) or -b -> (x!=v) or 1b1 + 2b2 + ... = x; you store those Booleans in a map[x][v]=b.
  • Part 2: for all integers x that have only one map[x][v]=b, you do the standard b = (x=v) Big-M encoding.
    For all integers x that have more than one value in the map, you create the full domain decomposition sum_v b_v = 1, sum_v v*b_v = x where you reuse seen b_v's and create new b_v's for the others.

Optional in part 2: only add sum_v b_v = 1 and walk over your linearized constraints, substituting x by sum_v v*b_v; store the intvarmap in the solver so it can reconstruct the value after solve.

If you do it this way, then it works for linearize, pb and sat all in once...

To access the ivarmap of a solver (yes, it would have to be added to each solver calling linearize, that is fine) you can make it an optional argument of the linearize transformation.

@IgnaceBleukx what do you think?

@IgnaceBleukx
Copy link
Collaborator

Yes, I think indeed the core issue lies in linearize here and we should make the encoding variables there for categorical variables.

I quite like the two steps, not sure if it should become part of linearize or become it's own transformation? Probably linearize is fine

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants