
Commit afb6a35

Merge pull request openai#282 from liuliuOD/fix/technique_to_improve_reliability
[Fix] typo in techniques_to_improve_reliability.md
2 parents db5dcd8 + 965befa

1 file changed: +3 −3 lines changed

techniques_to_improve_reliability.md

Lines changed: 3 additions & 3 deletions
@@ -88,7 +88,7 @@ Solution:
 (c) Unknown; there is not enough information to determine whether Colonel Mustard was in the observatory with the candlestick
 ```
 
-Although clues 3 and 5 establish that Colonel Mustard was the only person in the observatory and that the person in the observatory had the candlestick, the models fails to combine them into a correct answer of (a) Yes.
+Although clues 3 and 5 establish that Colonel Mustard was the only person in the observatory and that the person in the observatory had the candlestick, the model fails to combine them into a correct answer of (a) Yes.
 
 However, instead of asking for the answer directly, we can split the task into three pieces:
 
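The hunk above ends just before the document lists its three pieces, so they are not shown here; the decomposition below (extract clues, combine them, answer) is an illustrative guess, with the model call stubbed out so the control flow runs on its own:

```python
# Minimal sketch of splitting one reasoning question into three sequential
# prompts. `ask_model` stands in for a real LLM call (e.g., a completions API);
# it is stubbed with canned replies here so the sketch is self-contained.

CANNED = {
    "clues": ("Clue 3: Colonel Mustard was the only person in the observatory. "
              "Clue 5: The person in the observatory had the candlestick."),
    "combine": "Colonel Mustard was in the observatory with the candlestick.",
    "answer": "(a) Yes",
}

def ask_model(prompt: str) -> str:
    """Stubbed model call: dispatch on the prompt's final tag line."""
    tag = prompt.strip().splitlines()[-1].rstrip(":")
    return CANNED.get(tag, "")

def solve(question: str) -> str:
    # Piece 1: extract only the clues relevant to the question.
    clues = ask_model(f"List the clues relevant to: {question}\nclues:")
    # Piece 2: combine the extracted clues into an implication.
    implication = ask_model(f"Combine these clues:\n{clues}\ncombine:")
    # Piece 3: answer the original question given the implication.
    return ask_model(f"{implication}\n{question}\nanswer:")

print(solve("Was Colonel Mustard in the observatory with the candlestick?"))
# prints "(a) Yes" with the canned replies above
```

Each intermediate answer is fed into the next prompt, so the final call only has to read off a conclusion rather than chain all the clues at once.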
@@ -274,15 +274,15 @@ To learn more, read the [full paper](https://arxiv.org/abs/2201.11903).
 
 #### Implications
 
-One advantage of the few-shot example-based approach relative to the `Let's think step by step` technique is that you can more easily specify the format, length, and style of reasoning that you want the model to perform before landing on its final answer. This can be be particularly helpful in cases where the model isn't initially reasoning in the right way or depth.
+One advantage of the few-shot example-based approach relative to the `Let's think step by step` technique is that you can more easily specify the format, length, and style of reasoning that you want the model to perform before landing on its final answer. This can be particularly helpful in cases where the model isn't initially reasoning in the right way or depth.
 
 ### Fine-tuned
 
 #### Method
 
 In general, to eke out maximum performance on a task, you'll need to fine-tune a custom model. However, fine-tuning a model using explanations may take thousands of example explanations, which are costly to write.
 
-In 2022, Eric Zelikman and Yuhuai Wu et al. published a clever procedure for using a few-shot prompt to generate a dataset of explanations that could be used to fine-tune a model. The idea is to use a few-shot prompt to generate candidate explanations, and only keep the explanations that produce the correct answer. Then, to get additional explanations for some of the incorrect answers, retry the the few-shot prompt but with correct answers given as part of the question. The authors called their procedure STaR (Self-taught Reasoner):
+In 2022, Eric Zelikman and Yuhuai Wu et al. published a clever procedure for using a few-shot prompt to generate a dataset of explanations that could be used to fine-tune a model. The idea is to use a few-shot prompt to generate candidate explanations, and only keep the explanations that produce the correct answer. Then, to get additional explanations for some of the incorrect answers, retry the few-shot prompt but with correct answers given as part of the question. The authors called their procedure STaR (Self-taught Reasoner):
 
 [![STaR procedure](images/star_fig1.png)
 <br>Source: *STaR: Bootstrapping Reasoning With Reasoning* by Eric Zelikman and Yujuai Wu et al. (2022)](https://arxiv.org/abs/2203.14465)
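The STaR loop that the corrected paragraph describes (generate a candidate explanation, keep it only if its answer is correct, otherwise retry with the correct answer given as a hint) can be sketched roughly as follows; the model call is stubbed out, and all names here are illustrative rather than taken from the paper's code:

```python
# Rough sketch of the STaR dataset-building loop. `generate_explanation`
# stands in for a few-shot prompted LLM call; it is stubbed so the loop
# runs stand-alone.
import random

def generate_explanation(question, hint=None):
    """Stub for a few-shot prompted model; returns (explanation, answer)."""
    if hint is not None:
        # "Rationalization" step: with the correct answer shown in the
        # question, assume the stub can justify it.
        return f"Because ... therefore {hint}.", hint
    # Without a hint, the stub sometimes reasons its way to a wrong answer.
    answer = random.choice(["(a)", "(b)"])
    return f"Because ... therefore {answer}.", answer

def star_dataset(labeled_questions):
    """Keep only explanations whose answer matches the known label."""
    dataset = []
    for question, correct in labeled_questions:
        explanation, answer = generate_explanation(question)
        if answer != correct:
            # Retry with the correct answer given as part of the question.
            explanation, answer = generate_explanation(question, hint=correct)
        if answer == correct:
            dataset.append((question, explanation))  # kept for fine-tuning
    return dataset

examples = [("Q1?", "(a)"), ("Q2?", "(b)")]
print(len(star_dataset(examples)))  # both survive via the hinted retry -> 2
```

The filtering-by-correct-answer step is what lets the resulting explanation dataset be used for fine-tuning without hand-writing thousands of explanations.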

0 commit comments
