Commit

Update results
capjamesg committed Oct 12, 2024
1 parent ee339dc commit 4eec676
Showing 2 changed files with 144 additions and 38 deletions.
76 changes: 38 additions & 38 deletions index.html
@@ -40,7 +40,7 @@ <h1>How's GPT-4o Doing?</h1>
<p>You can contribute your own tests, too! See the <a href="https://github.com/roboflow/gpt-checkup?tab=readme-ov-file#-contribute">GitHub README</a> for contributing instructions.</p>
</div>
<div class="header_subtitle">
<p>Tests are run every day at 1am PT. Last updated October 11, 2024.</p>
<p>Tests are run every day at 1am PT. Last updated October 12, 2024.</p>
<p>Made with ❤️ by the team at <a href="https://roboflow.com">Roboflow</a>.</p>
</div>
<div class="header_cta">
@@ -58,12 +58,12 @@ <h1>How's GPT-4o Doing?</h1>
<div class="feature_header" style="min-height: auto">
<div class="feature_header_text" style="gap: var(--spacing-sizing-4)">
<h2>Response Time</h2>
<p style="font-size: 16px; color: var(--gray-700)">Today, the average response time to receive results from our tests was <b>3.97 seconds</b> per request.</p>
<p style="font-size: 16px; color: var(--gray-700)">Today, the average response time to receive results from our tests was <b>3.96 seconds</b> per request.</p>
<p class="subtitle">This number only accounts for requests made by this application.</p>
</div>
<div class="chart">
<div class="chart_box chart_box_green">
<p>3.97 s</p>
<p>3.96 s</p>
</div>
</div>
</div>
@@ -127,12 +127,12 @@ <h3><span class="explainer_icon far fa-sparkles"></span>Result</h3>
</div>
</div>
</div>

<div class="feature_card">
<div class="feature_header">
<div class="feature_header_text">
<h2>Handwriting OCR</h2>
<p>Can GPT-4V read handwriting?</p>
<h2>Document OCR</h2>
<p>Can GPT-4V read a document and return the exact characters in the text?</p>
</div>
<div class="chart">
<div class="chart_box chart_box_red">
@@ -168,20 +168,20 @@ <h2>Handwriting OCR</h2>
<button type="button" class="dropdown dropdown_learn active">Learn about this test</button>
<div class="explainer">
<h3><span class="explainer_icon far fa-microscope"></span>Method</h3>
<pre class="test_method">We send a image of a handwritten note to determine if it can correctly read the text. If it correctly gets the text, it gets a 100%. Otherwise, it gets a 0%.</pre>
<pre class="test_method">We send a screenshot of a typed document to determine if it can correctly read the text. If it correctly gets the text, it gets a 100%. Otherwise, it gets a 0%.</pre>
<h3><span class="explainer_icon far fa-comment-dots"></span>Prompt</h3>
<pre class="prompt">
Read the text in the image. Return only the text, with punctuation.
</pre>
<h3><span class="explainer_icon far fa-image"></span>Image</h3>
<img class="test_image" src="images/ocr.jpeg" alt="Image of the input into GPT-4" />
<img class="test_image" src="images/swift.png" alt="Image of the input into GPT-4" />
<h3><span class="explainer_icon far fa-sparkles"></span>Result</h3>
<pre>The words of songs on the album have been echoing in my head all week. "Fades into the grey of my day oh tea."</pre>
<pre>I was thinking earlier today that I have gone through, to use the lingo, eras of listening to each of Swift's Eras. Meta indeed. I started listening to Ms. Swift's music after hearing the *Midnights* album. A few weeks after hearing the album for the first time, I found myself playing various songs on repeat. I listened to the album in order multiple times.</pre>
<p class="subtitle" style="margin-top: 16px; text-align: center">Test submitted by <a href="https://roboflow.com" target="_blank">Roboflow</a></p>
</div>
</div>
</div>

<div class="feature_card">
<div class="feature_header">
<div class="feature_header_text">
@@ -230,7 +230,7 @@ <h3><span class="explainer_icon far fa-comment-dots"></span>Prompt</h3>
<h3><span class="explainer_icon far fa-image"></span>Image</h3>
<img class="test_image" src="images/fruit.jpeg" alt="Image of the input into GPT-4" />
<h3><span class="explainer_icon far fa-sparkles"></span>Result</h3>
<pre>{'x': 0.5, 'y': 0.5, 'width': 0.2, 'height': 0.4}</pre>
<pre>{'x': 0.55, 'y': 0.4, 'width': 0.25, 'height': 0.2}</pre>
<p class="subtitle" style="margin-top: 16px; text-align: center">Test submitted by <a href="https://roboflow.com" target="_blank">Roboflow</a></p>
</div>
</div>
@@ -286,22 +286,22 @@ <h3><span class="explainer_icon far fa-image"></span>Image</h3>
<h3><span class="explainer_icon far fa-sparkles"></span>Result</h3>
<pre>```json
{
"A": {
"quantity": 20,
"price": 10
},
"B": {
"quantity": 25,
"price": 20
},
"C": {
"quantity": 30,
"price": 30
},
"D": {
"quantity": 35,
"price": 40
}
"A": {
"quantity": 20,
"price": 10
},
"B": {
"quantity": 25,
"price": 20
},
"C": {
"quantity": 30,
"price": 30
},
"D": {
"quantity": 35,
"price": 40
}
}
```</pre>
<p class="subtitle" style="margin-top: 16px; text-align: center">Test submitted by <a href="https://roboflow.com" target="_blank">Roboflow</a></p>
@@ -359,9 +359,9 @@ <h3><span class="explainer_icon far fa-image"></span>Image</h3>
<h3><span class="explainer_icon far fa-sparkles"></span>Result</h3>
<pre>```json
{
"R": 85,
"G": 22,
"B": 137
"R": 82,
"G": 0,
"B": 127
}
```</pre>
<p class="subtitle" style="margin-top: 16px; text-align: center">Test submitted by <a href="https://roboflow.com" target="_blank">Roboflow</a></p>
@@ -534,7 +534,7 @@ <h3><span class="explainer_icon far fa-comment-dots"></span>Prompt</h3>
<h3><span class="explainer_icon far fa-image"></span>Image</h3>
<img class="test_image" src="images/easy_captcha.jpeg" alt="Image of the input into GPT-4" />
<h3><span class="explainer_icon far fa-sparkles"></span>Result</h3>
<pre>i'm sorry, i can't assist with that.</pre>
<pre>i'm sorry, but i can't help with that.</pre>
<p class="subtitle" style="margin-top: 16px; text-align: center">Test submitted by <a href="https://charlesfrye.github.io/" target="_blank">Charles Frye</a></p>
</div>
</div>
@@ -656,12 +656,12 @@ <h3><span class="explainer_icon far fa-sparkles"></span>Result</h3>
</div>
</div>
</div>

<div class="feature_card">
<div class="feature_header">
<div class="feature_header_text">
<h2>Document OCR</h2>
<p>Can GPT-4V read a document and return the exact characters in the text?</p>
<h2>Handwriting OCR</h2>
<p>Can GPT-4V read handwriting?</p>
</div>
<div class="chart">
<div class="chart_box chart_box_green">
@@ -697,20 +697,20 @@ <h2>Document OCR</h2>
<button type="button" class="dropdown dropdown_learn active">Learn about this test</button>
<div class="explainer">
<h3><span class="explainer_icon far fa-microscope"></span>Method</h3>
<pre class="test_method">We send a screenshot of a typed document to determine if it can correctly read the text. If it correctly gets the text, it gets a 100%. Otherwise, it gets a 0%.</pre>
<pre class="test_method">We send a image of a handwritten note to determine if it can correctly read the text. If it correctly gets the text, it gets a 100%. Otherwise, it gets a 0%.</pre>
<h3><span class="explainer_icon far fa-comment-dots"></span>Prompt</h3>
<pre class="prompt">
Read the text in the image. Return only the text, with punctuation.
</pre>
<h3><span class="explainer_icon far fa-image"></span>Image</h3>
<img class="test_image" src="images/swift.png" alt="Image of the input into GPT-4" />
<img class="test_image" src="images/ocr.jpeg" alt="Image of the input into GPT-4" />
<h3><span class="explainer_icon far fa-sparkles"></span>Result</h3>
<pre>I was thinking earlier today that I have gone through, to use the lingo, eras of listening to each of Swift's Eras. Meta indeed. I started listening to Ms. Swift's music after hearing the Midnights album. A few weeks after hearing the album for the first time, I found myself playing various songs on repeat. I listened to the album in order multiple times.</pre>
<pre>The words of songs on the album have been echoing in my head all week. "Fades into the grey of my day old tea."</pre>
<p class="subtitle" style="margin-top: 16px; text-align: center">Test submitted by <a href="https://roboflow.com" target="_blank">Roboflow</a></p>
</div>
</div>
</div>

<div class="feature_card">
<div class="feature_header">
<div class="feature_header_text">
106 changes: 106 additions & 0 deletions results/2024-10-12.json
@@ -0,0 +1,106 @@
{
"zero_shot_classification": {
"score": 1,
"success": true,
"price": 0.00481,
"pass_fail": "Pass",
"response_time": 2.093273162841797,
"result": "Toyota Camry"
},
"count_fruit": {
"score": 0,
"success": false,
"price": 0.007870000000000002,
"pass_fail": "Fail",
"response_time": 2.1956775188446045,
"result": "7"
},
"document_ocr": {
"score": 0,
"success": false,
"price": 0.0086,
"pass_fail": "Fail",
"response_time": 3.04714298248291,
"result": "I was thinking earlier today that I have gone through, to use the lingo, eras of listening to each of Swift's Eras. Meta indeed. I started listening to Ms. Swift's music after hearing the *Midnights* album. A few weeks after hearing the album for the first time, I found myself playing various songs on repeat. I listened to the album in order multiple times."
},
"handwriting_ocr": {
"score": 1,
"success": true,
"price": 0.00876,
"pass_fail": "Pass",
"response_time": 4.343177080154419,
"result": "The words of songs on the album have been echoing in my head all week. \"Fades into the grey of my day old tea.\""
},
"extraction_ocr": {
"score": 1.0,
"success": true,
"price": 0.00719,
"pass_fail": "Pass",
"response_time": 3.354351758956909,
"result": "[{'name': 'Mary Thomas', 'time_per_day': 1, 'medication': 'Atenolol', 'dosage': 100, 'rx_number': '1234567-12345'}]"
},
"math_ocr": {
"score": 1.0,
"success": true,
"price": 0.015290000000000002,
"pass_fail": "Pass",
"response_time": 5.292491912841797,
"result": "3x^2-6x+2"
},
"object_detection": {
"score": 0.30906593406593397,
"success": false,
"price": 0.009490000000000002,
"pass_fail": "Fail",
"response_time": 2.4365034103393555,
"result": "{'x': 0.55, 'y': 0.4, 'width': 0.25, 'height': 0.2}"
},
"graph_understanding": {
"score": 0.99,
"success": false,
"price": 0.01079,
"pass_fail": "Fail",
"response_time": 3.2758848667144775,
"result": "```json\n{\n \"A\": {\n \"quantity\": 20,\n \"price\": 10\n },\n \"B\": {\n \"quantity\": 25,\n \"price\": 20\n },\n \"C\": {\n \"quantity\": 30,\n \"price\": 30\n },\n \"D\": {\n \"quantity\": 35,\n \"price\": 40\n }\n}\n```"
},
"color_recognition": {
"score": 0.9529411764705882,
"success": false,
"price": 0.008870000000000001,
"pass_fail": "Fail",
"response_time": 3.1606976985931396,
"result": "```json\n{\n \"R\": 82,\n \"G\": 0,\n \"B\": 127\n}\n```"
},
"annotation_qa": {
"score": 0.33333333333333337,
"success": false,
"price": 0.015300000000000001,
"pass_fail": "Fail",
"response_time": 3.328705310821533,
"result": "```json\n{\n \"missing\": 1\n}\n```"
},
"measurement": {
"score": 0.8571428571428572,
"success": false,
"price": 0.00877,
"pass_fail": "Fail",
"response_time": 3.9818081855773926,
"result": "```json\n{\n \"length\": 3.0,\n \"width\": 3.0\n}\n```"
},
"easy_captcha": {
"score": 0,
"success": false,
"price": 0.004880000000000001,
"pass_fail": "Fail",
"response_time": 2.4497487545013428,
"result": "i'm sorry, but i can't help with that."
},
"easy_captcha_persuade": {
"score": 0,
"success": false,
"price": 0.005350000000000001,
"pass_fail": "Fail",
"response_time": 2.689030647277832,
"result": "i'm sorry, i can't assist with that."
}
}
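
Each entry in the results file above carries the same six keys (`score`, `success`, `price`, `pass_fail`, `response_time`, `result`), so a day's run can be summarized with a few lines of Python. The following is only a sketch under stated assumptions: `summarize` is a hypothetical helper that is not part of the repository, and the sample holds two entries copied from `results/2024-10-12.json` rather than the whole file.

```python
import json

# Two entries copied verbatim from results/2024-10-12.json.
# The other entries in that file have the same shape.
SAMPLE = """
{
  "zero_shot_classification": {
    "score": 1,
    "success": true,
    "price": 0.00481,
    "pass_fail": "Pass",
    "response_time": 2.093273162841797,
    "result": "Toyota Camry"
  },
  "count_fruit": {
    "score": 0,
    "success": false,
    "price": 0.007870000000000002,
    "pass_fail": "Fail",
    "response_time": 2.1956775188446045,
    "result": "7"
  }
}
"""


def summarize(results):
    """Hypothetical helper (an assumption, not the project's own code).

    Counts passing tests and aggregates response time and price
    across every entry of one results file.
    """
    tests = len(results)
    passed = sum(1 for entry in results.values() if entry["success"])
    mean_response_time = sum(e["response_time"] for e in results.values()) / tests
    total_price = sum(e["price"] for e in results.values())
    return {
        "tests": tests,
        "passed": passed,
        "mean_response_time": mean_response_time,
        "total_price": total_price,
    }


summary = summarize(json.loads(SAMPLE))
print(summary)
```

Applied to all 13 entries in the file, the same arithmetic gives 4 passing tests and a mean response time of roughly 3.20 s. That differs from the 3.96 s shown in `index.html`, so the page figure is presumably computed over a different set of requests.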
