Update results
capjamesg committed Aug 5, 2024
1 parent 4021ab1 commit 4396ad0
Showing 2 changed files with 132 additions and 17 deletions.
43 changes: 26 additions & 17 deletions index.html
@@ -40,7 +40,7 @@ <h1>How's GPT-4o Doing?</h1>
<p>You can contribute your own tests, too! See the <a href="https://github.com/roboflow/gpt-checkup?tab=readme-ov-file#-contribute">GitHub README</a> for contributing instructions.</p>
</div>
<div class="header_subtitle">
-<p>Tests are run every day at 1am PT. Last updated August 04, 2024.</p>
+<p>Tests are run every day at 1am PT. Last updated August 05, 2024.</p>
<p>Made with ❤️ by the team at <a href="https://roboflow.com">Roboflow</a>.</p>
</div>
<div class="header_cta">
@@ -176,7 +176,7 @@ <h3><span class="explainer_icon far fa-comment-dots"></span>Prompt</h3>
<h3><span class="explainer_icon far fa-image"></span>Image</h3>
<img class="test_image" src="images/ocr.jpeg" alt="Image of the input into GPT-4" />
<h3><span class="explainer_icon far fa-sparkles"></span>Result</h3>
-<pre>"The words of songs on the album have been echoing in my head all week. "Fades into the grey of my day old tea.""</pre>
+<pre>"The words of songs on the album have been echoing in my head all week. 'Fades into the grey of my day old tea.'"</pre>
<p class="subtitle" style="margin-top: 16px; text-align: center">Test submitted by <a href="https://roboflow.com" target="_blank">Roboflow</a></p>
</div>
</div>
@@ -230,7 +230,7 @@ <h3><span class="explainer_icon far fa-comment-dots"></span>Prompt</h3>
<h3><span class="explainer_icon far fa-image"></span>Image</h3>
<img class="test_image" src="images/fruit.jpeg" alt="Image of the input into GPT-4" />
<h3><span class="explainer_icon far fa-sparkles"></span>Result</h3>
-<pre>{'x': 0.45, 'y': 0.35, 'width': 0.2, 'height': 0.25}</pre>
+<pre>{'x': 0.52, 'y': 0.35, 'width': 0.24, 'height': 0.4}</pre>
<p class="subtitle" style="margin-top: 16px; text-align: center">Test submitted by <a href="https://roboflow.com" target="_blank">Roboflow</a></p>
</div>
</div>
@@ -287,19 +287,19 @@ <h3><span class="explainer_icon far fa-sparkles"></span>Result</h3>
<pre>```json
{
"A": {
-"quantity": 15,
-"price": 12
+"quantity": 20,
+"price": 10
},
"B": {
-"quantity": 20,
-"price": 24
+"quantity": 25,
+"price": 20
},
"C": {
-"quantity": 25,
+"quantity": 30,
"price": 30
},
"D": {
-"quantity": 30,
+"quantity": 33,
"price": 40
}
}
@@ -358,9 +358,9 @@ <h3><span class="explainer_icon far fa-image"></span>Image</h3>
<img class="test_image" src="images/color.png" alt="Image of the input into GPT-4" />
<h3><span class="explainer_icon far fa-sparkles"></span>Result</h3>
<pre>Failed to produce a valid JSON output: {
-"R": 79,
-"G": 0,
-"B": 128
+"R": 79,
+"G": 0,
+"B": 128
}</pre>
<p class="subtitle" style="margin-top: 16px; text-align: center">Test submitted by <a href="https://roboflow.com" target="_blank">Roboflow</a></p>
</div>
@@ -401,7 +401,7 @@ <h2>Annotation Quality Assurance</h2>
</div>
</div>
<p class="result_text">Of the last 7 tests, conducted daily, this test has passed <b>0%</b> of the time.</p>
-<p class="request_price"><i class="far fa-coins"></i>Today's request cost $0.017</p>
+<p class="request_price"><i class="far fa-coins"></i>Today's request cost $0.02</p>
</div>
<div class="explainer_dropdown">
<button type="button" class="dropdown dropdown_learn active">Learn about this test</button>
@@ -415,13 +415,22 @@ <h3><span class="explainer_icon far fa-comment-dots"></span>Prompt</h3>
<h3><span class="explainer_icon far fa-image"></span>Image</h3>
<img class="test_image" src="images/annotationqa.jpeg" alt="Image of the input into GPT-4" />
<h3><span class="explainer_icon far fa-sparkles"></span>Result</h3>
-<pre>I see that the cars in the image are labeled with red bounding boxes, but there are a few annotated cars missing. Specifically, the white car on the right side of the image closest to the camera is not annotated, and there is another car on the far-left side that is partially visible but also not annotated.
+<pre>Based on the image provided, the following annotations are made for cars:
+
+1. The black vehicle on the left side of the image.
+2. The small vehicle in the middle-left part of the image.
+3. The vehicle slightly to the right of the middle of the image, next to the black vehicle.
+4. The vehicle in front of the vehicle slightly to the right of the middle.
+5. The vehicle behind the white vehicle on the right-hand side.
+6. The farthest vehicle in the image.
+
+However, there is one vehicle that appears to be missing a red bounding box, which is the large white car on the right side of the image.

-Here's the JSON with the number of missing annotations:
+Therefore, there is one missing annotation for the cars in the image.

```json
{
-"missing": 2
+"missing": 1
}
```</pre>
<p class="subtitle" style="margin-top: 16px; text-align: center">Test submitted by <a href="https://roboflow.com" target="_blank">Roboflow</a></p>
@@ -653,7 +662,7 @@ <h3><span class="explainer_icon far fa-comment-dots"></span>Prompt</h3>
<h3><span class="explainer_icon far fa-image"></span>Image</h3>
<img class="test_image" src="images/prescription.png" alt="Image of the input into GPT-4" />
<h3><span class="explainer_icon far fa-sparkles"></span>Result</h3>
-<pre>[{'name': 'Mary Thomas', 'time_per_day': 1, 'medication': 'Atenolol', 'dosage': 100, 'rx_number': '1234567-12345'}]</pre>
+<pre>[{'name': 'MARY THOMAS', 'time_per_day': 1, 'medication': 'ATENOLOL', 'dosage': 100, 'rx_number': '1234567-12345'}]</pre>
<p class="subtitle" style="margin-top: 16px; text-align: center">Test submitted by <a href="https://roboflow.com" target="_blank">Roboflow</a></p>
</div>
</div>
106 changes: 106 additions & 0 deletions results/2024-08-05.json
@@ -0,0 +1,106 @@
{
"zero_shot_classification": {
"score": 1,
"success": true,
"price": 0.00481,
"pass_fail": "Pass",
"response_time": 2.2228715419769287,
"result": "Toyota Camry"
},
"count_fruit": {
"score": 0,
"success": false,
"price": 0.007870000000000002,
"pass_fail": "Fail",
"response_time": 2.370004653930664,
"result": "8"
},
"document_ocr": {
"score": 1,
"success": true,
"price": 0.008539999999999999,
"pass_fail": "Pass",
"response_time": 3.3213083744049072,
"result": "I was thinking earlier today that I have gone through, to use the lingo, eras of listening to each of Swift's Eras. Meta indeed. I started listening to Ms. Swift's music after hearing the Midnights album. A few weeks after hearing the album for the first time, I found myself playing various songs on repeat. I listened to the album in order multiple times."
},
"handwriting_ocr": {
"score": 0,
"success": false,
"price": 0.00876,
"pass_fail": "Fail",
"response_time": 5.598870515823364,
"result": "\"The words of songs on the album have been echoing in my head all week. 'Fades into the grey of my day old tea.'\""
},
"extraction_ocr": {
"score": 1.0,
"success": true,
"price": 0.007220000000000001,
"pass_fail": "Pass",
"response_time": 3.098576545715332,
"result": "[{'name': 'MARY THOMAS', 'time_per_day': 1, 'medication': 'ATENOLOL', 'dosage': 100, 'rx_number': '1234567-12345'}]"
},
"math_ocr": {
"score": 1.0,
"success": true,
"price": 0.015290000000000002,
"pass_fail": "Pass",
"response_time": 3.4283571243286133,
"result": "3x^2-6x+2"
},
"object_detection": {
"score": 0.4806007509386734,
"success": false,
"price": 0.009490000000000002,
"pass_fail": "Fail",
"response_time": 4.682239294052124,
"result": "{'x': 0.52, 'y': 0.35, 'width': 0.24, 'height': 0.4}"
},
"graph_understanding": {
"score": 0.99,
"success": false,
"price": 0.01079,
"pass_fail": "Fail",
"response_time": 2.607328414916992,
"result": "```json\n{\n \"A\": {\n \"quantity\": 20,\n \"price\": 10\n },\n \"B\": {\n \"quantity\": 25,\n \"price\": 20\n },\n \"C\": {\n \"quantity\": 30,\n \"price\": 30\n },\n \"D\": {\n \"quantity\": 33,\n \"price\": 40\n }\n}\n```"
},
"color_recognition": {
"score": 0,
"success": false,
"price": 0.00875,
"pass_fail": "Fail",
"response_time": 2.211211919784546,
"result": "Failed to produce a valid JSON output: {\n \"R\": 79,\n \"G\": 0,\n \"B\": 128\n}"
},
"annotation_qa": {
"score": 0.33333333333333337,
"success": false,
"price": 0.01974,
"pass_fail": "Fail",
"response_time": 4.946108102798462,
"result": "Based on the image provided, the following annotations are made for cars:\n\n1. The black vehicle on the left side of the image.\n2. The small vehicle in the middle-left part of the image.\n3. The vehicle slightly to the right of the middle of the image, next to the black vehicle.\n4. The vehicle in front of the vehicle slightly to the right of the middle.\n5. The vehicle behind the white vehicle on the right-hand side.\n6. The farthest vehicle in the image.\n\nHowever, there is one vehicle that appears to be missing a red bounding box, which is the large white car on the right side of the image. \n\nTherefore, there is one missing annotation for the cars in the image.\n\n```json\n{\n \"missing\": 1\n}\n```"
},
"measurement": {
"score": 0.8571428571428572,
"success": false,
"price": 0.00877,
"pass_fail": "Fail",
"response_time": 5.730761289596558,
"result": "```json\n{\n \"length\": 3.0,\n \"width\": 3.0\n}\n```"
},
"easy_captcha": {
"score": 1,
"success": true,
"price": 0.004790000000000001,
"pass_fail": "Pass",
"response_time": 1.4070158004760742,
"result": "charybdis indubitable"
},
"easy_captcha_persuade": {
"score": 1,
"success": true,
"price": 0.00529,
"pass_fail": "Pass",
"response_time": 1.3826255798339844,
"result": "charybdis indubitable"
}
}
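
Each entry in the results file above carries `success`, `price`, and `pass_fail` fields, so a day's run can be tallied with a few lines of Python. The sketch below is illustrative only and is not part of this commit: `summarize` is a hypothetical helper, and the sample holds three entries copied from the file with their `response_time` and `result` fields left out for brevity.

```python
import json

# Three entries copied from results/2024-08-05.json (the full file has 13).
# The response_time and result fields are omitted here for brevity.
SAMPLE = json.loads("""
{
  "zero_shot_classification": {"score": 1, "success": true, "price": 0.00481, "pass_fail": "Pass"},
  "count_fruit": {"score": 0, "success": false, "price": 0.007870000000000002, "pass_fail": "Fail"},
  "annotation_qa": {"score": 0.33333333333333337, "success": false, "price": 0.01974, "pass_fail": "Fail"}
}
""")


def summarize(results):
    """Hypothetical helper: count passing tests and total the request prices."""
    passed = sum(1 for entry in results.values() if entry["success"])
    total_cost = sum(entry["price"] for entry in results.values())
    return {"passed": passed, "total": len(results), "cost": total_cost}


summary = summarize(SAMPLE)
print(f"{summary['passed']}/{summary['total']} passed, ${summary['cost']:.5f} spent")
```

To run it against the whole day, the sample could be swapped for `json.load(open("results/2024-08-05.json"))`, assuming the script is started from the repository root.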
