
Add GPT-4V sample code and images #66


Open · wants to merge 13 commits into main
192 changes: 192 additions & 0 deletions Basic_Samples/GPT-4V/RAG_chatcompletions_example_restapi.ipynb

Large diffs are not rendered by default.

86 changes: 86 additions & 0 deletions Basic_Samples/GPT-4V/README.md
@@ -0,0 +1,86 @@

# Introduction

This repository contains a collection of Jupyter notebooks demonstrating how to use the GPT-4V API for Chat Completions via the REST API, covering a range of image and video scenarios. These examples provide practical guidance and accelerators for developers integrating GPT-4V functionality into their applications.

## Contents
| Notebook | Description | Type |
|----------|-------------|-------|
| [Basic Image in GPT-4V](basic_chatcompletions_example_restapi.ipynb) | Processing a single image input with GPT-4V. | Image |
| [Handling Multiple Images in GPT-4V](mutiple_images_chatcompletions_example_restapi.ipynb) | Managing multiple image inputs in GPT-4V. | Image |
| [Enhancing GPT-4V with RAG and Custom Data](RAG_chatcompletions_example_restapi.ipynb) | Enhancing capabilities by bringing custom data to augment image inputs in GPT-4V. | Image |
| [Enhancing GPT-4V with Grounding Techniques](enhancement_grounding_chatcompletions_example_restapi.ipynb) | Applying grounding techniques to image inputs in GPT-4V. | Image |
| [Enhancing GPT-4V with OCR Technique](enhancement_OCR_chatcompletions_example_restapi.ipynb) | Incorporating Optical Character Recognition (OCR) with image inputs in GPT-4V. | Image |
| [Basic Video QnA in GPT-4V](video_chatcompletions_example_restapi.ipynb) | Conducting Q&A with video inputs in GPT-4V. | Video |
| [Video Chunk Processing Sequentially in GPT-4V](video_chunk_chatcompletions_example_restapi.ipynb) | Sequential processing of video chunks in GPT-4V. | Video |


## Installation
Install all Python modules and packages listed in the requirements.txt file with the command below.

```bash
pip install -r requirements.txt
```

### Microsoft Azure Endpoints
To use the REST API with Microsoft Azure endpoints, set the configuration values such as GPT-4V_DEPLOYMENT_NAME, OPENAI_API_BASE, and OPENAI_API_VERSION in the _config.json_ file.

```json
{
    "GPT-4V_DEPLOYMENT_NAME": "<GPT-4V Deployment Name>",
    "OPENAI_API_BASE": "https://<Your Azure Resource Name>.openai.azure.com",
    "OPENAI_API_VERSION": "<OpenAI API Version>",

    "VISION_API_ENDPOINT": "https://<Your Azure Vision Resource Name>.cognitiveservices.azure.com",

    "AZURE_SEARCH_SERVICE_ENDPOINT": "https://<Your Azure Search Resource Name>.search.windows.net",
    "AZURE_SEARCH_INDEX_NAME": "<Your Azure Search Index Name>",

    "VIDEO_SAS_URL": "<Your Azure Blob Storage SAS URL>",
    "VIDEO_INDEX_NAME": "<Your Azure Video Index Name>",
    "VIDEO_INDEX_ID": "<Your Azure Video Index ID>"
}
```
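
The notebooks read these values at runtime. As a minimal sketch (assuming a `config.json` in the working directory shaped like the template above), loading the configuration in Python looks roughly like this:

```python
import json

# Load deployment and endpoint settings from config.json
# (sketch only; the key names match the template above).
with open("config.json", "r") as f:
    config = json.load(f)

deployment_name = config["GPT-4V_DEPLOYMENT_NAME"]
api_base = config["OPENAI_API_BASE"]
api_version = config["OPENAI_API_VERSION"]
```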

### Getting started
- Add "OPENAI_API_KEY", "VISION_API_KEY", and "AZURE_SEARCH_QUERY_KEY" (optional) as environment variable names, with \<Your API Key Value\>, \<Your VISION Key Value\>, and \<Your SEARCH Query Key Value\> (optional) as their values.
<br>
You can get the OPENAI_API_KEY, VISION_API_KEY, and AZURE_SEARCH_QUERY_KEY values from the Azure Portal. Go to https://portal.azure.com, find your resource, and under "Resource Management" -> "Keys and Endpoints" look for one of the "Keys" values.
<br>

WINDOWS Users:

    setx OPENAI_API_KEY "REPLACE_WITH_YOUR_KEY_VALUE_HERE"
    setx VISION_API_KEY "REPLACE_WITH_YOUR_KEY_VALUE_HERE"
    setx AZURE_SEARCH_QUERY_KEY "REPLACE_WITH_YOUR_KEY_VALUE_HERE"

MACOS/LINUX Users:

    export OPENAI_API_KEY="REPLACE_WITH_YOUR_KEY_VALUE_HERE"
    export VISION_API_KEY="REPLACE_WITH_YOUR_KEY_VALUE_HERE"
    export AZURE_SEARCH_QUERY_KEY="REPLACE_WITH_YOUR_KEY_VALUE_HERE"

- To find your "OPENAI_API_BASE", "VISION_API_ENDPOINT", and "AZURE_SEARCH_SERVICE_ENDPOINT", go to https://portal.azure.com, find your resource, and under "Resource Management" -> "Keys and Endpoints" look for the "Endpoint" value. A minimal request sketch using these keys and endpoints follows below.
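
As a rough sketch of how these keys and endpoints come together (the exact payload shape can vary between preview API versions, and `sample.jpg` below is only a placeholder image path, not a file shipped with these samples), a basic image Chat Completions request against the Azure OpenAI REST API might look like this:

```python
import base64
import json
import os

import requests

# Secrets come from environment variables; endpoints come from config.json.
# (Sketch only; error handling is omitted.)
api_key = os.environ["OPENAI_API_KEY"]
with open("config.json", "r") as f:
    config = json.load(f)

url = (
    f"{config['OPENAI_API_BASE']}/openai/deployments/"
    f"{config['GPT-4V_DEPLOYMENT_NAME']}/chat/completions"
    f"?api-version={config['OPENAI_API_VERSION']}"
)

# Encode a local image as a base64 data URL (placeholder path).
with open("sample.jpg", "rb") as image_file:
    encoded_image = base64.b64encode(image_file.read()).decode("utf-8")

payload = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image."},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/jpeg;base64,{encoded_image}"},
                },
            ],
        },
    ],
    "max_tokens": 300,
}

response = requests.post(
    url,
    headers={"api-key": api_key, "Content-Type": "application/json"},
    json=payload,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```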

Learn more about the Azure OpenAI Service REST API [here](https://learn.microsoft.com/en-us/azure/cognitive-services/openai/reference).


## Requirements
Python 3.8+ <br>
Jupyter Notebook 6.5.2


## Usage

Each notebook is self-contained and includes instructions specific to its scenario. Simply open a notebook in Jupyter and follow the steps outlined within it.

## Shared Functions

For convenience, commonly used functions across these notebooks are consolidated in [shared_functions.ipynb](shared_functions.ipynb). Import these functions in any notebook as needed.
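
For example, a cell like the following (a sketch assuming IPython's `%run` magic, which can execute `.ipynb` files; the exact mechanism used inside each sample notebook may differ) makes the shared helpers available in the current kernel:

```python
# Run shared_functions.ipynb so its helper functions are defined
# in this notebook's kernel before they are called.
%run shared_functions.ipynb
```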


## Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft
trademarks or logos is subject to and must follow
[Microsoft's Trademark & Brand Guidelines](https://www.microsoft.com/en-us/legal/intellectualproperty/trademarks/usage/general).
Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship.
Any use of third-party trademarks or logos is subject to those third parties' policies.
124 changes: 124 additions & 0 deletions Basic_Samples/GPT-4V/basic_chatcompletions_example_restapi.ipynb

Large diffs are not rendered by default.

14 changes: 14 additions & 0 deletions Basic_Samples/GPT-4V/config.json
@@ -0,0 +1,14 @@
{
    "GPT-4V_DEPLOYMENT_NAME": "<GPT-4V Deployment Name>",
    "OPENAI_API_BASE": "https://<Your Azure Resource Name>.openai.azure.com",
    "OPENAI_API_VERSION": "<OpenAI API Version>",

Review comment: we should document where to find this configuration setting. For example, this link has a list of supported versions: https://learn.microsoft.com/en-us/azure/ai-services/openai/reference

Review comment: We should also just put a default API version here, based on what we know we'll be using at announce. There's no reason to force someone to go figure this out themselves.

Author reply:
> we should document where to find this configuration setting. For example this link has a list of supported versions: https://learn.microsoft.com/en-us/azure/ai-services/openai/reference

We documented this in the README file.


"VISION_API_ENDPOINT": "https://<Your Azure Vision Resource Name>.cognitiveservices.azure.com",

"AZURE_SEARCH_SERVICE_ENDPOINT": "https://<Your Azure Search Resource Name>.search.windows.net",
"AZURE_SEARCH_INDEX_NAME": "<Your Azure Search Index Name>",

"VIDEO_SAS_URL": "<Your Azure Blob Storage SAS URL>",
"VIDEO_INDEX_NAME": "<Your Azure Video Index Name>",
"VIDEO_INDEX_ID": "<Your Azure Video Index ID>"
}


9 changes: 9 additions & 0 deletions Basic_Samples/GPT-4V/requirements.txt
@@ -0,0 +1,9 @@
# Packages needed to run the notebook samples
jupyter

# Other packages needed to run the notebook samples
requests
pillow
matplotlib
moviepy
nbformat