New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Sign up for GitHub

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jump to bottom

Migrate to OpenAI v1.0 #3

Merged

a0x8o merged 69 commits into alexxx-db:master from georgia-tech-db:master

Nov 22, 2023

a0x8o commented Nov 20, 2023

Remove empty evadb.db file.
Move github test into long intergration test so they are run on the
Clean up test_relational_api.py
Update link in the github data source documentation
Fix doc links
Merge remote-tracking branch 'origin/roadmap' into staging
checkpoint
checkpoint
Merge remote-tracking branch 'origin/fix_staging' into staging
docs: Update README.md
docs: Update README.md
Add the validation score and training time for create_function in XGB…
Add test_eva_db to gitignore (georgia-tech-db#1336)
Fix georgia-tech-db#1333 dependency and CMD in DockerFile (georgia-te…
CREATE INDEX IF NOT EXISTS is broken. (georgia-tech-db#1337)
Support semicolon and escaped strings in lark (georgia-tech-db#1339)
feat: third party app support in EVADB (georgia-tech-db#1033)
[WIP] Improving error handling messages for Custom Functions (georgia…
logging an error message for invalid files while loading (georgia-tec…
Verified that issue georgia-tech-db#1067 is resolved and added docume…
Add train scores for ludwig in the create function handler. (georgia-…
Make Logic Operators Case Insensitve (georgia-tech-db#1352)
Job scheduler implementation (georgia-tech-db#1308)
Fix python3.8 failing testcases due to type hint (georgia-tech-db#1364)
Add feedback for forecasting (georgia-tech-db#1258)
Adding changes for Flaml Sklearn integration (georgia-tech-db#1361)
Migrate ChatGPT function to openai v1.0 (georgia-tech-db#1368 )

xzdandy and others added 30 commits

October 4, 2023 02:48


          Remove empty evadb.db file.

920219c


          Move github test into long intergration test so they are run on the

2b5ba6a

circle ci.


          Clean up test_relational_api.py

45af91e


          Update link in the github data source documentation

9d2227e


          Fix doc links

c322ada


          Bump Version to v0.3.9+dev (#1304)

89e4888

Bump Version to v0.3.9+dev

---------

Co-authored-by: Jiashen Cao <[email protected]>
Co-authored-by: Gaurav Tarlok Kakkar <[email protected]>


          Fix current issues with forecasting (#1283)

e19f13d

This PR aims to solve the following issues:

- [x] Throwing error when non-numeric characters are in the data
(partially fixes #1243)
- [x] Math domain error with `statsforecast`.
- [x] Fix GPU support for `neuralforecast`.
- ~Neuralforecast support for directly using batched data.~
- ~Auto frequency determination ( #1279).~

Will create separate PRs for the last two points.

---------

Co-authored-by: Andy Xu <[email protected]>


          Configuration Manager Redesign (#1272)

4640d8f

1. Removed `config.yml` file. Users can directly use `SET` command. 
2. Moved `OPENAI_KEY` to `OPENAI_API_KEY`

---------

Co-authored-by: hershd23 <[email protected]>
Co-authored-by: Andy Xu <[email protected]>


          Fix: minor typo (#1307)

a6fdd6a


          SnowFlake Integration for EvaDB (#1289)

c637a71

Please suggest if this feature needs more test cases

---------

Co-authored-by: Lohith K S <[email protected]>


          add documentation

0f2d1d4


          Starting the changes for XGBoost classification integration.

e026460


          Adding doc changes

0ab5ff8


          Added note

3c25029


          Minor Fix

d25b228


          Merge branch 'db_doc_note' of github.com:Chitti-Ankith/eva into Chitt…

ce64231

…i-Ankith-db_doc_note


          Make the config related changes

81ab7a3


          checkpoint

d340548


          Skip long running model train test.

c6970bd


          Linter

d8f4a1b


          Fix typo

3f0d829


          Starting the changes for XGBoost classification integration. (#1305)

703dc94

We shall add XGBoost classification support in EVADB.

---------

Co-authored-by: Jineet Desai <[email protected]>
Co-authored-by: Andy Xu <[email protected]>


          Add roadmap page to toc

6af77d2


          Merge branch 'staging' into roadmap

482097e


          Convert nested join in Vector Queries to Pandas Merge. (#1298)

f420faa

Profiling on Vector Scan showed that we are spending a lot of time in
the post-processing logic doing a Nested Join. This is an initial commit
to change that into a Join using Pandas. Change showed ~50% improvement
in Similarity Queries.


          checkpoint


          docs: added survey link


          refactor: code refactor (#1319)

af696d6

Moved Function Expression Binder to a separate file


          Add Milvus integration for vector create and search (#1269)

71b9aca

Integrated Milvus vector store into EvaDB. Added a `MilvusVectorStore`
class and Milvus type for query parsing and execution.
Below are environment values for the use of the Milvus index:

* `MILVUS_URI` is the URI of the Milvus instance (which would be
http://localhost:19530 when running locally). **This value is required**
* `MILVUS_USER` is the name of the user for the Milvus instance.
* `MILVUS_PASSWORD` is the password of the user for the Milvus instance.
* `MILVUS_DB_NAME` is the name of the database to be used. This will
default to the `default` database if not provided.
* `MILVUS_TOKEN` is the authorization token for the Milvus instance.

---------

Co-authored-by: Andy Xu <[email protected]>


          Adding colab notebook for XGBoost Regression and Classification (#1316)

7c44d98

Example notebook added for XGBoost regression and classification.

---------

Co-authored-by: Jineet Desai <[email protected]>

xzdandy and others added 29 commits

October 28, 2023 19:04


          Add etc

f8c9c5f


          Add link

339e7e0


          Add details on optimization

4ca964e


          Add automl framework

c57e36e


          Move around links

affa3cf


          Merge remote-tracking branch 'origin/roadmap' into staging

919cd5b


          checkpoint

8dc9283


          checkpoint

2ed8448


          Merge remote-tracking branch 'origin/fix_staging' into staging

8eb9e6b


          docs: Update README.md

1c58afe


          docs: Update README.md

52ff444


          Add the validation score and training time for create_function in XGB…

f409057

…oost (#1327)

Let us show the validation score and training time for the XGBoost
AutoML model trained. This shall give us fair enough idea on how the
model trained on the training data set.

---------

Co-authored-by: Jineet Desai <[email protected]>


          Add test_eva_db to gitignore (#1336)

21aee56

The environment created in the setup instructions in the documentation
calls the environment `test_eva_db`


          Fix #1333 dependency and CMD in DockerFile (#1335)

f56ef82

Update the DockerFile in order to resolve dependency issues along with
fixing the invalid CMD that was previously passed in.


          CREATE INDEX IF NOT EXISTS is broken. (#1337)

c296318

This PR fixes an issue in CREATE INDEX IF NOT EXISTS command wherein if
'IF NOT EXISTS' is passed, we had an unreferenced variable issue. Added
Unit Tests to check the correctness of both the cases.

Also reverted the index changes while merging dataframes after vector
scan, as it's failing for some cases where indexes can be undefined.


          Support semicolon and escaped strings in lark (#1339)

Support semi-colons in string literals for queries of the form:
```
"""SELECT ChatGPT("Here's a; question", "This is the context") FROM TAIPAI;"""
```

Also support string escape to run ChatGPT queries more easily:
```
"""SELECT ChatGPT('Here\\'s a question', 'This is the context') FROM TAIPAI;"""
```


          feat: third party app support in EVADB (#1033)

bd344ed

This PR introduces a generic interface to support 3rd party apps in
EVADB. As an example. the template for integrating slack has been added.
In a subsequent PR the integration with slack will be completed.

---------

Co-authored-by: Gaurav Tarlok Kakkar <[email protected]>
Co-authored-by: Joy Arulraj <[email protected]>
Co-authored-by: Joy Arulraj <[email protected]>
Co-authored-by: Kaushik Ravichandran <[email protected]>


          [WIP] Improving error handling messages for Custom Functions (#1330)

35b772b

Added separate error handling for ModuleNotFoundError and
FileNotFoundError
	modified:   evadb/utils/generic_utils.py


          logging an error message for invalid files while loading (#1334)

64219f1

Issue - [721](#721)

Currently, we abort the entire process when the load executor encounters
a corrupted file.


          checkpoint

bb45db4


          Verified that issue #1067 is resolved and added documentation for loa…

7a49bec

…d pdf functionality. (#1343)

Issue #1067 about not being able to load pdf files, was verified to be
working with evadb documentation pdf and a new page for loading pdf is
added to the documentation.
<img width="1310" alt="Screenshot 2023-11-07 at 1 33 01 AM"
src="https://github.com/georgia-tech-db/evadb/assets/32676813/af2fa40b-c8c1-4f3d-b93f-98d0bf278a5b">

Co-authored-by: Lohith K S <[email protected]>


          Add train scores for ludwig in the create function handler. (#1342)

1c78b22

In the previous commit, we added the changes for displaying the train
scores and train times for XGBoost. We plan to add similar changes to
Ludwig integration as well.

---------

Co-authored-by: Jineet Desai <[email protected]>
Co-authored-by: Andy Xu <[email protected]>


          Make Logic Operators Case Insensitve (#1352)

995f7fe

- Fix the following queries:

```
SELECT * FROM postgres_data.home_rentals where neighborhood='downtown' and number_of_rooms=2;
```

- Improve the error message: Instead of throwing arbitrary mask error,
now we raise `Unsupported Logical Operator: ...`.


          Job scheduler implementation (#1308)

1fbb74f

This PR adds support for creating and dropping jobs in evadb based on
this [task](#1248).

1.  Jobs can be created using the create job query:

   
> CREATE JOB {job_name} AS {
>             {job_queries; ...}
>     }
>     START {start_time}
>     END {end_time}
>     EVERY {repeat_period} {repeat_unit}

2. Created jobs can be dropped using:

> DROP JOB {job_name}

3. The scheduled jobs will only be triggered if the job scheduler
process is started explicitly using:

> EvaDBConnection.start_jobs()

4. The job scheduler process can be stopped using:

> EvaDBConnection.stop_jobs()

---------

Co-authored-by: Gaurav Tarlok Kakkar <[email protected]>


          Fix python3.8 failing testcases due to type hint (#1364)

2575f4f


          Add feedback for forecasting (#1258)

69b39b8

Provide feedback when `Forecasting` UDF is called in the following ways:

- [x] Reporting confidence intervals
- [x] Returning a metric for the forecasting performance.
- [x] Providing suggestions in simple special cases, such as during Flat
predictions.

Eg:
```sql
SELECT HomeForecast();
```

```
SUGGESTION: Predictions are flat. Consider using LIBRARY 'neuralforecast' for more accrate predictions.
```

Partially fixes #1257 and #1243.

---------

Co-authored-by: Andy Xu <[email protected]>


          Adding changes for Flaml Sklearn integration (#1361)

0c25a44

Flaml provides support for Sklearn models like Random Forests, KNN,
Extra Trees Regressor, and Logistic Regression with regularization. We
plan to integrate these ML models into EVADB.
Link for Flaml documentation:
https://microsoft.github.io/FLAML/docs/Use-Cases/Task-Oriented-AutoML

---------

Co-authored-by: Jineet Desai <[email protected]>


          Migrate ChatGPT function to openai v1.0 (#1368)

5aaa447

Migrate ChatGPT function to openai v1.0.

The test is skipped in circleCI because we must supply the
`OPENAI_API_KEY`. The test passes on local machine.

- [x] Upgrade ChatGPT function.
- [x] Upgrade Dall-e function.
- [x] Update unit test cases.
- [x] Verify that notebooks work correctly.


          v0.3.9 - new release (#1367)

8a8fc46

a0x8o merged commit 8a8fc46 into alexxx-db:master

1 check passed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet