Split expectation and strategy computations #98

Draft
JeffersonYeh wants to merge 2 commits into qvalue from decouple-strategy

@JeffersonYeh
Collaborator

No description provided.

@JeffersonYeh JeffersonYeh requested a review from Zinoex March 12, 2026 17:58
Owner

@Zinoex Zinoex left a comment

Just small comments. The only really important one is regarding views.

prop = system_property(spec),
)

# 2. extract strategy and compute V(s) = max_a Q(s, a)
Owner

Suggested change
- # 2. extract strategy and compute V(s) = max_a Q(s, a)
+ # 2. extract strategy and compute V'(s) = max_a Q(s, a)
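
As a rough illustration of this step, here is a minimal sketch of extracting a greedy strategy and V'(s) from a Q-table. The function name `greedy_strategy`, the 2-D `(action, state)` layout, and the example values are assumptions for illustration only, not the code in this PR.

```julia
# Hypothetical helper: for each state s, pick the best action and record V'(s).
# Q is assumed to be laid out as (action, state), so each column holds one state's action values.
function greedy_strategy(Q::AbstractMatrix; maximize::Bool = true)
    pick = maximize ? findmax : findmin
    Vprime = Vector{eltype(Q)}(undef, size(Q, 2))
    strategy = Vector{Int}(undef, size(Q, 2))
    for s in axes(Q, 2)
        # findmax/findmin return (value, index of the first optimum)
        Vprime[s], strategy[s] = pick(@view(Q[:, s]))
    end
    return Vprime, strategy
end

Q = [0.2 0.9 0.4;
     0.7 0.1 0.4]
Vprime, strategy = greedy_strategy(Q)   # Vprime == [0.7, 0.9, 0.4], strategy == [2, 1, 1]
```

Ties are broken toward the lowest action index, because `findmax` and `findmin` return the first optimum they encounter.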

ismaximize(spec),
)

# 3. post process
Owner

Suggested change
- # 3. post process
+ # 3. post process to compute V(s) = g(s, V'(s)) where the definition of g depends on the objective
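
To make the role of g concrete, here is a small sketch with two possible post-processing functions. Both `g_reach` and `g_reward`, along with the target set, rewards, and discount factor, are hypothetical examples chosen for illustration; the objectives actually supported in this PR may define g differently.

```julia
# Hypothetical g for a reachability objective:
# target states are pinned to 1, all other states keep V'(s).
g_reach(s, v, targets) = s in targets ? one(v) : v

# Hypothetical g for a discounted reward objective:
# V(s) = r(s) + γ * V'(s).
g_reward(s, v, rewards, γ) = rewards[s] + γ * v

Vprime = [0.7, 0.9, 0.4]

V_reach = [g_reach(s, Vprime[s], Set([3])) for s in eachindex(Vprime)]
V_reward = [g_reward(s, Vprime[s], [1.0, 0.0, 2.0], 0.5) for s in eachindex(Vprime)]
```

Keeping g separate from the max over actions means the strategy extraction step does not need to know which objective is being solved.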

maximize,
) where {R <: Real}

#TODO: can be threaded?
Owner

Leave it be for now - we can always change it later.

) where {R <: Real}

#TODO: can be threaded?
for jₛ in CartesianIndices(source_shape(model))
Owner

Suggested change
- for jₛ in CartesianIndices(source_shape(model))
+ @inbounds for jₛ in CartesianIndices(source_shape(model))
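
`@inbounds` skips bounds checks, so it is only safe when the loop indices are known to be valid for every array touched in the body. A minimal sketch of that pattern follows; the function `state_maxima!` and its up-front shape check are assumptions for illustration, not part of this PR.

```julia
# Hypothetical example: validate shapes once, then iterate without bounds checks.
function state_maxima!(Vres::AbstractVector, Q::AbstractMatrix)
    # Check up front that every index used in the loop is valid for both arrays.
    axes(Q, 2) == axes(Vres, 1) ||
        throw(DimensionMismatch("Q and Vres disagree on the number of states"))
    @inbounds for s in axes(Q, 2)
        Vres[s] = maximum(@view(Q[:, s]))
    end
    return Vres
end

Q = [0.2 0.9 0.4;
     0.7 0.1 0.4]
Vres = state_maxima!(zeros(3), Q)
```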


Vres[jₛ] = extract_strategy!(
strategy_cache,
Q[:, jₛ],
Owner

Let's try to avoid expensive copies (expensive here means many allocations, not large ones). Previously, this allocated and copied for every state.

Suggested change
- Q[:, jₛ],
+ @view(Q[:, jₛ]),
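
For context, `Q[:, j]` on the right-hand side of an expression creates a new array, while `@view(Q[:, j])` wraps the existing memory. The sketch below compares the two in a per-column loop; the function names and the array size are made up for illustration, and the exact byte counts depend on the Julia version.

```julia
# Slicing: Q[:, j] allocates a fresh copy of the column on every iteration.
function colsum_copy(Q)
    total = zero(eltype(Q))
    for j in axes(Q, 2)
        total += sum(Q[:, j])
    end
    return total
end

# View: @view(Q[:, j]) refers to the column in place, with no per-column copy.
function colsum_view(Q)
    total = zero(eltype(Q))
    for j in axes(Q, 2)
        total += sum(@view(Q[:, j]))
    end
    return total
end

Q = rand(4, 1_000)
colsum_copy(Q); colsum_view(Q)   # warm up so compilation is not measured

bytes_copy = @allocated colsum_copy(Q)
bytes_view = @allocated colsum_view(Q)
println("copy: ", bytes_copy, " bytes, view: ", bytes_view, " bytes")
```

Each copy here is tiny (four elements), but there is one per state, which is the "many allocations" cost the comment refers to.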

# interleaved concat gives shape: (a1, a2) , (s1, s2) => (a1, s1, a2, s2)
dim = (action_values(mp)..., state_values(mp)...)
# concat gives shape: (a1, a2) , (s1, s2) => (a1, a2, s1, s2)
# (a, s) to access s more frequently due to column major
Owner

Suggested change
- # (a, s) to access s more frequently due to column major
+ # (a, s) to access a more frequently due to column major
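
Julia arrays are column-major, so the first index varies fastest in memory. With the concatenated shape `(a1, a2, s1, s2)`, all action values for a fixed state are therefore contiguous. The sketch below checks this with `strides`; the concrete shapes are hypothetical stand-ins for `action_values(mp)` and `state_values(mp)`.

```julia
action_shape = (2, 3)   # hypothetical (a1, a2)
state_shape = (4, 5)    # hypothetical (s1, s2)

# Concatenation: (a1, a2), (s1, s2) => (a1, a2, s1, s2)
dim = (action_shape..., state_shape...)
Q = zeros(dim)

# Element stride per dimension: action dimensions have the smallest strides.
action_state_strides = strides(Q)   # (1, 2, 6, 30)

# Stepping an action index moves by 1 or 2 elements;
# stepping a state index jumps over a whole block of 2 * 3 = 6 action values.
li = LinearIndices(Q)
next_action = li[2, 1, 1, 1]   # 2
next_state = li[1, 1, 2, 1]    # 7
```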
