Fix: Default value of cosine_min_value_wrong parameter
#305
As the parameter names imply, `min_value` should be smaller than `max_value`. The original default value of `cosine_min_value_wrong` violates this and is inconsistent with the correct default value in the `get_cosine_scaled_reward` function in rewards.py.

In `get_cosine_scaled_reward`, `min_value` and `max_value` are swapped when the answer is wrong. The term `max_value - min_value` in the formula is therefore negative with the correct default value, but positive with the current default value, so the shorter a wrong answer is, the higher its score.

This contradicts the statement "Longer incorrect solutions are penalized less than shorter ones." in the `get_cosine_scaled_reward` function comment.
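The effect of the sign flip can be checked with a minimal sketch. It assumes the reward is a cosine interpolation of the form `min_value + 0.5 * (max_value - min_value) * (1 + cos(pi * progress))`, applied after the swap for wrong answers. The helper name `wrong_answer_reward` and the numeric values are illustrative assumptions, not copied from the repository.

```python
import math


def wrong_answer_reward(gen_len, max_len, min_value_wrong, max_value_wrong):
    """Hypothetical cosine-scaled reward for an incorrect answer."""
    progress = gen_len / max_len
    cosine = math.cos(progress * math.pi)
    # Swap min/max for incorrect answers, as the description above states.
    min_value = max_value_wrong
    max_value = min_value_wrong
    return min_value + 0.5 * (max_value - min_value) * (1.0 + cosine)


MAX_LEN = 1000  # illustrative maximum completion length

# Consistent bounds (min < max): max_value - min_value is negative after the swap.
short_ok = wrong_answer_reward(0, MAX_LEN, min_value_wrong=-1.0, max_value_wrong=-0.5)
long_ok = wrong_answer_reward(MAX_LEN, MAX_LEN, min_value_wrong=-1.0, max_value_wrong=-0.5)

# Inverted bounds (min > max): max_value - min_value is positive after the swap.
short_bad = wrong_answer_reward(0, MAX_LEN, min_value_wrong=0.0, max_value_wrong=-0.5)
long_bad = wrong_answer_reward(MAX_LEN, MAX_LEN, min_value_wrong=0.0, max_value_wrong=-0.5)

print("consistent bounds:", short_ok, long_ok)
print("inverted bounds:  ", short_bad, long_bad)
```

With the consistent bounds, the short wrong answer scores lower than the long one, which matches the function comment. With the inverted bounds the ordering reverses, which is the behavior this fix addresses.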