Commit f1e13cf

Misc updates (openai#1022)
1 parent 2c441ab commit f1e13cf

67 files changed (+99936 / -99927 lines)

articles/how_to_work_with_large_language_models.md

Lines changed: 23 additions & 24 deletions
@@ -6,14 +6,14 @@

The magic of large language models is that by being trained to minimize this prediction error over vast quantities of text, the models end up learning concepts useful for these predictions. For example, they learn:

-* how to spell
-* how grammar works
-* how to paraphrase
-* how to answer questions
-* how to hold a conversation
-* how to write in many languages
-* how to code
-* etc.
+- how to spell
+- how grammar works
+- how to paraphrase
+- how to answer questions
+- how to hold a conversation
+- how to write in many languages
+- how to code
+- etc.

They do this by “reading” a large amount of existing text, learning how words tend to appear in context with other words, and using what they have learned to predict the next most likely word that might appear in response to a user request, and each subsequent word after that.

@@ -25,12 +25,12 @@ Of all the inputs to a large language model, by far the most influential is the

Large language models can be prompted to produce output in a few ways:

-* **Instruction**: Tell the model what you want
-* **Completion**: Induce the model to complete the beginning of what you want
-* **Scenario**: Give the model a situation to play out
-* **Demonstration**: Show the model what you want, with either:
-  * A few examples in the prompt
-  * Many hundreds or thousands of examples in a fine-tuning training dataset
+- **Instruction**: Tell the model what you want
+- **Completion**: Induce the model to complete the beginning of what you want
+- **Scenario**: Give the model a situation to play out
+- **Demonstration**: Show the model what you want, with either:
+  - A few examples in the prompt
+  - Many hundreds or thousands of examples in a fine-tuning training dataset

An example of each is shown below.

@@ -77,6 +77,7 @@ Output:
Giving the model a scenario to follow or role to play out can be helpful for complex queries or when seeking imaginative responses. When using a hypothetical prompt, you set up a situation, problem, or story, and then ask the model to respond as if it were a character in that scenario or an expert on the topic.

Example scenario prompt:
+
```text
Your role is to extract the name of the author from any given text
@@ -141,24 +142,22 @@ Large language models aren't only great at text - they can be great at code too.

GPT-4 powers [numerous innovative products][OpenAI Customer Stories], including:

-* [GitHub Copilot] (autocompletes code in Visual Studio and other IDEs)
-* [Replit](https://replit.com/) (can complete, explain, edit and generate code)
-* [Cursor](https://cursor.sh/) (build software faster in an editor designed for pair-programming with AI)
+- [GitHub Copilot] (autocompletes code in Visual Studio and other IDEs)
+- [Replit](https://replit.com/) (can complete, explain, edit and generate code)
+- [Cursor](https://cursor.sh/) (build software faster in an editor designed for pair-programming with AI)

-GPT-4 is more advanced than previous models like `text-davinci-002`. But, to get the best out of GPT-4 for coding tasks, it's still important to give clear and specific instructions. As a result, designing good prompts can take more care.
+GPT-4 is more advanced than previous models like `gpt-3.5-turbo-instruct`. But to get the best out of GPT-4 for coding tasks, it's still important to give clear and specific instructions. As a result, designing good prompts can take more care.

### More prompt advice

For more prompt examples, visit [OpenAI Examples][OpenAI Examples].

In general, the input prompt is the best lever for improving model outputs. You can try tricks like:

-* **Be more specific** E.g., if you want the output to be a comma separated list, ask it to return a comma separated list. If you want it to say "I don't know" when it doesn't know the answer, tell it 'Say "I don't know" if you do not know the answer.' The more specific your instructions, the better the model can respond.
-* **Provide Context**: Help the model understand the bigger picture of your request. This could be background information, examples/demonstrations of what you want or explaining the purpose of your task.
-* **Ask the model to answer as if it was an expert.** Explicitly asking the model to produce high quality output or output as if it was written by an expert can induce the model to give higher quality answers that it thinks an expert would write. Phrases like "Explain in detail" or "Describe step-by-step" can be effective.
-* **Prompt the model to write down the series of steps explaining its reasoning.** If understanding the 'why' behind an answer is important, prompt the model to include its reasoning. This can be done by simply adding a line like "[Let's think step by step](https://arxiv.org/abs/2205.11916)" before each answer.
-
-
+- **Be more specific.** E.g., if you want the output to be a comma-separated list, ask it to return a comma-separated list. If you want it to say "I don't know" when it doesn't know the answer, tell it 'Say "I don't know" if you do not know the answer.' The more specific your instructions, the better the model can respond.
+- **Provide context.** Help the model understand the bigger picture of your request. This could be background information, examples/demonstrations of what you want, or an explanation of the purpose of your task.
+- **Ask the model to answer as if it were an expert.** Explicitly asking the model to produce high-quality output, or output as if it were written by an expert, can induce the model to give the higher-quality answers it thinks an expert would write. Phrases like "Explain in detail" or "Describe step-by-step" can be effective.
+- **Prompt the model to write down the series of steps explaining its reasoning.** If understanding the 'why' behind an answer is important, prompt the model to include its reasoning. This can be done by simply adding a line like "[Let's think step by step](https://arxiv.org/abs/2205.11916)" before each answer.

[Fine Tuning Docs]: https://platform.openai.com/docs/guides/fine-tuning
[OpenAI Customer Stories]: https://openai.com/customer-stories
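Taken together, these tips translate directly into an API call. Below is a minimal sketch using the OpenAI Python SDK, assuming a chat-capable model such as `gpt-4`; the prompt and the code under review are illustrative, not part of this commit:

```python
# Hypothetical example: combining the four prompt tips in one request.
# The model choice and the code under review are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

prompt = (
    "You are an expert Python reviewer.\n"                   # expert framing
    "Context: this function should parse ISO 8601 dates.\n"  # provide context
    "List any bugs as a comma separated list, and say "      # be specific
    '"I don\'t know" if you cannot find any.\n'
    "Let's think step by step.\n\n"                          # elicit reasoning
    "def parse(d): return d.split('-')"
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```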

articles/techniques_to_improve_reliability.md

Lines changed: 17 additions & 17 deletions
@@ -14,25 +14,25 @@ If you were asked to multiply 13 by 17, would the answer pop immediately into yo

Similarly, if you give GPT-3 a task that's too complex to do in the time it takes to calculate its next token, it may confabulate an incorrect guess. Yet, akin to humans, that doesn't necessarily mean the model is incapable of the task. With some time and space to reason things out, the model still may be able to answer reliably.

-As an example, if you ask `text-davinci-002` the following math problem about juggling balls, it answers incorrectly:
+As an example, if you ask `gpt-3.5-turbo-instruct` the following math problem about juggling balls, it answers incorrectly:

-```text-davinci-002
+```gpt-3.5-turbo-instruct
Q: A juggler has 16 balls. Half of the balls are golf balls and half of the golf balls are blue. How many blue golf balls are there?
A:
```

-```text-davinci-002
+```gpt-3.5-turbo-instruct
There are 8 blue golf balls.
```

Does this mean that GPT-3 cannot do simple math problems? No; in fact, it turns out that by prompting the model with `Let's think step by step`, the model solves the problem reliably:

-```text-davinci-002
+```gpt-3.5-turbo-instruct
Q: A juggler has 16 balls. Half of the balls are golf balls and half of the golf balls are blue. How many blue golf balls are there?
A: Let's think step by step.
```

-```text-davinci-002
+```gpt-3.5-turbo-instruct
There are 16 balls in total.
Half of the balls are golf balls.
That means that there are 8 golf balls.
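To reproduce this comparison, here is a minimal sketch against the completions endpoint that `gpt-3.5-turbo-instruct` serves; the `max_tokens` and `temperature` settings are assumptions:

```python
# Minimal sketch: the juggling prompt with the "Let's think step by step"
# trigger appended, sent to the completions endpoint.
from openai import OpenAI

client = OpenAI()

prompt = (
    "Q: A juggler has 16 balls. Half of the balls are golf balls and half "
    "of the golf balls are blue. How many blue golf balls are there?\n"
    "A: Let's think step by step."
)

response = client.completions.create(
    model="gpt-3.5-turbo-instruct",
    prompt=prompt,
    max_tokens=200,  # illustrative; enough room for the reasoning steps
    temperature=0,   # reduce sampling variance for arithmetic
)
print(response.choices[0].text)
```

With `temperature=0`, the call is repeatable, which makes it easy to compare the bare prompt against the step-by-step variant.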
@@ -64,9 +64,9 @@ The rest of this article shares techniques for improving reliability of large la

One way to give a model more time and space to think is to break tasks into simpler pieces.

-As an example, consider a task where we ask the model a multiple-choice question about some text - in this case, a game of Clue. When asked directly, `text-davinci-002` isn't able to put clues 3 & 5 together, and answers incorrectly:
+As an example, consider a task where we ask the model a multiple-choice question about some text - in this case, a game of Clue. When asked directly, `gpt-3.5-turbo-instruct` isn't able to put clues 3 & 5 together, and answers incorrectly:

-```text-davinci-002
+```gpt-3.5-turbo-instruct
Use the following clues to answer the following multiple-choice question.

Clues:

@@ -84,7 +84,7 @@ Question: Was Colonel Mustard in the observatory with the candlestick?
Solution:
```

-```text-davinci-002
+```gpt-3.5-turbo-instruct
(c) Unknown; there is not enough information to determine whether Colonel Mustard was in the observatory with the candlestick
```

@@ -96,7 +96,7 @@ However, instead of asking for the answer directly, we can split the task into t
- Second, combine the relevant clues to reason out the answer to the question
- Third, write the final answer: either (a), (b), or (c)

-```text-davinci-002
+```gpt-3.5-turbo-instruct
Use the following clues to answer the following multiple-choice question, using the following procedure:
(1) First, go through the clues one by one and consider whether the clue is potentially relevant
(2) Second, combine the relevant clues to reason out the answer to the question

@@ -118,7 +118,7 @@ Solution:
(1) First, go through the clues one by one and consider whether the clue is potentially relevant:
```

-```text-davinci-002
+```gpt-3.5-turbo-instruct
Clue 1: Miss Scarlett was the only person in the lounge. This clue is not relevant.
Clue 2: The person with the pipe was in the kitchen. This clue is not relevant.
Clue 3: Colonel Mustard was the only person in the observatory. This clue is relevant.
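The three-step procedure lends itself to a reusable wrapper. A sketch, assuming the OpenAI Python SDK; the helper name `solve` and the clue numbering format are illustrative:

```python
# Sketch of a reusable helper that wraps any clue set and question in the
# three-step procedure shown in the prompt above.
from openai import OpenAI

client = OpenAI()

PROCEDURE = (
    "(1) First, go through the clues one by one and consider whether the "
    "clue is potentially relevant\n"
    "(2) Second, combine the relevant clues to reason out the answer to "
    "the question\n"
    "(3) Third, write the final answer: either (a), (b), or (c)\n"
)

def solve(clues: list[str], question: str) -> str:
    prompt = (
        "Use the following clues to answer the following multiple-choice "
        "question, using the following procedure:\n"
        f"{PROCEDURE}\n"
        "Clues:\n" + "\n".join(f"{i}. {c}" for i, c in enumerate(clues, 1)) +
        f"\n\nQuestion: {question}\n\n"
        "Solution:\n"
        "(1) First, go through the clues one by one and consider whether "
        "the clue is potentially relevant:"
    )
    response = client.completions.create(
        model="gpt-3.5-turbo-instruct", prompt=prompt,
        max_tokens=500, temperature=0,
    )
    return response.choices[0].text
```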
@@ -136,9 +136,9 @@ By giving the model more time and space to think, and guiding it along a reasoni

Another benefit of splitting complex instructions into smaller subtasks is that it can help keep the model focused on each subtask.

-For example, if we ask `text-davinci-002` to summarize a text in its original language, the model can lapse back into English:
+For example, if we ask `gpt-3.5-turbo-instruct` to summarize a text in its original language, the model can lapse back into English:

-```text-davinci-002
+```gpt-3.5-turbo-instruct
Summarize the text using the original language of the text. The summary should be one sentence long.

Text:

@@ -149,13 +149,13 @@ La estadística (la forma femenina del término alemán Statistik, derivado a su
Summary:
```

-```text-davinci-002
+```gpt-3.5-turbo-instruct
The text explains that statistics is a science that studies the variability, collection, organization, analysis, interpretation, and presentation of data, as well as the random process that generates them following the laws of probability.
```

However, if we first ask the model to identify the language of the text, and then summarize the text, it becomes more reliable:

-```text-davinci-002
+```gpt-3.5-turbo-instruct
First, identify the language of the text. Second, summarize the text using the original language of the text. The summary should be one sentence long.

Text:

@@ -166,7 +166,7 @@ La estadística (la forma femenina del término alemán Statistik, derivado a su
Language:
```

-```text-davinci-002
+```gpt-3.5-turbo-instruct
Spanish

La estadística es una ciencia que estudia la variabilidad, colección, organización, análisis, interpretación, y presentación de los datos, así como el proceso aleatorio que los genera siguiendo las leyes de la probabilidad.
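The prompt above performs both steps inside a single completion. As a variant, the split can be made explicit across two API calls, feeding the detected language into the second prompt; a sketch, with illustrative wording:

```python
# Sketch: making the identify-then-summarize split explicit as two calls.
from openai import OpenAI

client = OpenAI()

def complete(prompt: str) -> str:
    response = client.completions.create(
        model="gpt-3.5-turbo-instruct", prompt=prompt,
        max_tokens=200, temperature=0,
    )
    return response.choices[0].text.strip()

text = "La estadística es una ciencia..."  # placeholder for the full passage

# Step 1: identify the language of the text
language = complete(
    f"Identify the language of the following text.\n\nText:\n{text}\n\nLanguage:"
)

# Step 2: summarize the text in that language
summary = complete(
    f"Summarize the following text in {language}. "
    f"The summary should be one sentence long.\n\nText:\n{text}\n\nSummary:"
)
print(summary)
```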
@@ -203,7 +203,7 @@ To learn more, read the [full paper](https://arxiv.org/abs/2205.11916).

If you apply this technique to your own tasks, don't be afraid to experiment with customizing the instruction. `Let's think step by step` is rather generic, so you may find better performance with instructions that hew to a stricter format customized to your use case. For example, you can try more structured variants like `First, think step by step about why X might be true. Second, think step by step about why Y might be true. Third, think step by step about whether X or Y makes more sense.`. And you can even give the model an example format to help keep it on track, e.g.:

-```text-davinci-002
+```gpt-3.5-turbo-instruct
Using the IRS guidance below, answer the following questions using this format:
(1) For each criterion, determine whether it is met by the vehicle purchase
- {Criterion} Let's think step by step. {explanation} {yes or no, or if the question does not apply then N/A}.

@@ -229,7 +229,7 @@ Solution:
- Does the vehicle have at least four wheels? Let's think step by step.
```

-```text-davinci-002
+```gpt-3.5-turbo-instruct
The Toyota Prius Prime has four wheels, so the answer is yes.
- Does the vehicle weigh less than 14,000 pounds? Let's think step by step. The Toyota Prius Prime weighs less than 14,000 pounds, so the answer is yes.
- Does the vehicle draw energy from a battery with at least 4 kilowatt hours that may be recharged from an external source? Let's think step by step. The Toyota Prius Prime has a battery with at least 4 kilowatt hours that may be recharged from an external source, so the answer is yes.
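Scaffolds like this are easiest to reuse when kept as a template. A small sketch; the `{guidance}` and `{vehicle}` fields are assumptions, and the elided IRS guidance is left elided:

```python
# Sketch: keeping the customized step-by-step scaffold as a reusable template.
# Doubled braces survive .format() as literal {Criterion}-style placeholders.
TEMPLATE = """Using the IRS guidance below, answer the following questions using this format:
(1) For each criterion, determine whether it is met by the vehicle purchase
- {{Criterion}} Let's think step by step. {{explanation}} {{yes or no, or if the question does not apply then N/A}}.

IRS guidance:
{guidance}

Vehicle: {vehicle}

Solution:
- Does the vehicle have at least four wheels? Let's think step by step."""

prompt = TEMPLATE.format(guidance="...", vehicle="Toyota Prius Prime")
```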

articles/text_comparison_examples.md

Lines changed: 10 additions & 10 deletions
@@ -8,8 +8,8 @@ Embeddings can be used for semantic search, recommendations, cluster analysis, n

For more information, read OpenAI's blog post announcements:

-* [Introducing Text and Code Embeddings (Jan 2022)](https://openai.com/blog/introducing-text-and-code-embeddings/)
-* [New and Improved Embedding Model (Dec 2022)](https://openai.com/blog/new-and-improved-embedding-model/)
+- [Introducing Text and Code Embeddings (Jan 2022)](https://openai.com/blog/introducing-text-and-code-embeddings/)
+- [New and Improved Embedding Model (Dec 2022)](https://openai.com/blog/new-and-improved-embedding-model/)

For comparison with other embedding models, see [Massive Text Embedding Benchmark (MTEB) Leaderboard](https://huggingface.co/spaces/mteb/leaderboard)

@@ -19,14 +19,14 @@ Embeddings can be used for search either by themselves or as a feature in a larg

The simplest way to use embeddings for search is as follows:

-* Before the search (precompute):
-  * Split your text corpus into chunks smaller than the token limit (8,191 tokens for `text-embedding-ada-002`)
-  * Embed each chunk of text
-  * Store those embeddings in your own database or in a vector search provider like [Pinecone](https://www.pinecone.io), [Weaviate](https://weaviate.io) or [Qdrant](https://qdrant.tech)
-* At the time of the search (live compute):
-  * Embed the search query
-  * Find the closest embeddings in your database
-  * Return the top results
+- Before the search (precompute):
+  - Split your text corpus into chunks smaller than the token limit (8,191 tokens for `text-embedding-3-small`)
+  - Embed each chunk of text
+  - Store those embeddings in your own database or in a vector search provider like [Pinecone](https://www.pinecone.io), [Weaviate](https://weaviate.io) or [Qdrant](https://qdrant.tech)
+- At the time of the search (live compute):
+  - Embed the search query
+  - Find the closest embeddings in your database
+  - Return the top results

An example of how to use embeddings for search is shown in [Semantic_text_search_using_embeddings.ipynb](../examples/Semantic_text_search_using_embeddings.ipynb).
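A minimal sketch of the precompute/live-compute split described above, using the `text-embedding-3-small` model named in this diff and cosine similarity for ranking; the corpus and query are illustrative:

```python
# Sketch of embeddings-based search: embed chunks ahead of time, then embed
# the query and rank chunks by cosine similarity.
import numpy as np
from openai import OpenAI

client = OpenAI()

def embed(texts: list[str]) -> np.ndarray:
    response = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([item.embedding for item in response.data])

# Before the search (precompute)
chunks = ["Embeddings measure how similar two texts are.", "GPT-4 can write code."]
chunk_vectors = embed(chunks)

# At the time of the search (live compute)
query_vector = embed(["How do I compare two pieces of text?"])[0]
scores = chunk_vectors @ query_vector / (
    np.linalg.norm(chunk_vectors, axis=1) * np.linalg.norm(query_vector)
)
for i in np.argsort(-scores):  # top results first
    print(f"{scores[i]:.3f}  {chunks[i]}")
```

At corpus scale, the ranking step would move into one of the vector stores listed above rather than in-memory numpy.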

examples/Classification_using_embeddings.ipynb

Lines changed: 11 additions & 13 deletions
Large diffs are not rendered by default.
