I am trying to implement an autoregressive policy network. For example,
for each action, I need to sample a target object, then use the target object's embedding to sample the action type, and finally output the action parameters.
But it looks like this is impossible to achieve, since `_build_mlp_extractor` in `MaskableActorCriticPolicy` is expected to return only an action embedding and a value embedding.
Is there a good way to achieve this?
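To make the sampling order concrete, here is a minimal PyTorch sketch of the kind of head I have in mind. It is standalone and not wired into `MaskableActorCriticPolicy`; the class name `AutoregressiveHead`, the layer sizes, the Gaussian parameter distribution, and the `object_mask` argument are all assumptions made up for illustration.

```python
import torch
import torch.nn as nn
from torch.distributions import Categorical, Normal


class AutoregressiveHead(nn.Module):
    """Hypothetical head: target object -> action type -> action parameters."""

    def __init__(self, latent_dim=64, n_objects=5, n_types=3, param_dim=2, embed_dim=16):
        super().__init__()
        self.object_logits = nn.Linear(latent_dim, n_objects)
        self.object_embed = nn.Embedding(n_objects, embed_dim)
        self.type_logits = nn.Linear(latent_dim + embed_dim, n_types)
        self.type_embed = nn.Embedding(n_types, embed_dim)
        self.param_mean = nn.Linear(latent_dim + 2 * embed_dim, param_dim)
        self.param_log_std = nn.Parameter(torch.zeros(param_dim))

    def forward(self, latent, object_mask=None):
        # Step 1: sample the target object (optionally masking invalid ones).
        logits = self.object_logits(latent)
        if object_mask is not None:
            # Large negative logit gives masked objects zero probability.
            logits = logits.masked_fill(~object_mask, -1e8)
        obj_dist = Categorical(logits=logits)
        obj = obj_dist.sample()

        # Step 2: sample the action type, conditioned on the object embedding.
        h = torch.cat([latent, self.object_embed(obj)], dim=-1)
        type_dist = Categorical(logits=self.type_logits(h))
        act_type = type_dist.sample()

        # Step 3: sample the action parameters, conditioned on both choices.
        h = torch.cat([h, self.type_embed(act_type)], dim=-1)
        param_dist = Normal(self.param_mean(h), self.param_log_std.exp())
        params = param_dist.sample()

        # The joint log-probability is the sum over the three sampling steps.
        log_prob = (
            obj_dist.log_prob(obj)
            + type_dist.log_prob(act_type)
            + param_dist.log_prob(params).sum(dim=-1)
        )
        return obj, act_type, params, log_prob
```

The difficulty is that each step needs the previously sampled sub-action as input, whereas the extractor only produces one action latent and one value latent up front.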
Example figure:
Thanks!
Checklist
I have checked that there is no similar issue in the repo