forked from georgia-tech-db/evadb
-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Migrate to OpenAI v1.0 #3
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Bump Version to v0.3.9+dev --------- Co-authored-by: Jiashen Cao <[email protected]> Co-authored-by: Gaurav Tarlok Kakkar <[email protected]>
This PR aims to solve the following issues: - [x] Throwing error when non-numeric characters are in the data (partially fixes #1243) - [x] Math domain error with `statsforecast`. - [x] Fix GPU support for `neuralforecast`. - ~Neuralforecast support for directly using batched data.~ - ~Auto frequency determination ( #1279).~ Will create separate PRs for the last two points. --------- Co-authored-by: Andy Xu <[email protected]>
1. Removed `config.yml` file. Users can directly use `SET` command. 2. Moved `OPENAI_KEY` to `OPENAI_API_KEY` --------- Co-authored-by: hershd23 <[email protected]> Co-authored-by: Andy Xu <[email protected]>
Please suggest if this feature needs more test cases --------- Co-authored-by: Lohith K S <[email protected]>
…i-Ankith-db_doc_note
We shall add XGBoost classification support in EVADB. --------- Co-authored-by: Jineet Desai <[email protected]> Co-authored-by: Andy Xu <[email protected]>
Profiling on Vector Scan showed that we are spending a lot of time in the post-processing logic doing a Nested Join. This is an initial commit to change that into a Join using Pandas. Change showed ~50% improvement in Similarity Queries.
Moved Function Expression Binder to a separate file
Integrated Milvus vector store into EvaDB. Added a `MilvusVectorStore` class and Milvus type for query parsing and execution. Below are environment values for the use of the Milvus index: * `MILVUS_URI` is the URI of the Milvus instance (which would be http://localhost:19530 when running locally). **This value is required** * `MILVUS_USER` is the name of the user for the Milvus instance. * `MILVUS_PASSWORD` is the password of the user for the Milvus instance. * `MILVUS_DB_NAME` is the name of the database to be used. This will default to the `default` database if not provided. * `MILVUS_TOKEN` is the authorization token for the Milvus instance. --------- Co-authored-by: Andy Xu <[email protected]>
Example notebook added for XGBoost regression and classification. --------- Co-authored-by: Jineet Desai <[email protected]>
…oost (#1327) Let us show the validation score and training time for the XGBoost AutoML model trained. This shall give us fair enough idea on how the model trained on the training data set. --------- Co-authored-by: Jineet Desai <[email protected]>
The environment created in the setup instructions in the documentation calls the environment `test_eva_db`
This PR fixes an issue in CREATE INDEX IF NOT EXISTS command wherein if 'IF NOT EXISTS' is passed, we had an unreferenced variable issue. Added Unit Tests to check the correctness of both the cases. Also reverted the index changes while merging dataframes after vector scan, as it's failing for some cases where indexes can be undefined.
Support semi-colons in string literals for queries of the form: ``` """SELECT ChatGPT("Here's a; question", "This is the context") FROM TAIPAI;""" ``` Also support string escape to run ChatGPT queries more easily: ``` """SELECT ChatGPT('Here\\'s a question', 'This is the context') FROM TAIPAI;""" ```
This PR introduces a generic interface to support 3rd party apps in EVADB. As an example. the template for integrating slack has been added. In a subsequent PR the integration with slack will be completed. --------- Co-authored-by: Gaurav Tarlok Kakkar <[email protected]> Co-authored-by: Joy Arulraj <[email protected]> Co-authored-by: Joy Arulraj <[email protected]> Co-authored-by: Kaushik Ravichandran <[email protected]>
Added separate error handling for ModuleNotFoundError and FileNotFoundError modified: evadb/utils/generic_utils.py
Issue - [721](#721) Currently, we abort the entire process when the load executor encounters a corrupted file.
…d pdf functionality. (#1343) Issue #1067 about not being able to load pdf files, was verified to be working with evadb documentation pdf and a new page for loading pdf is added to the documentation. <img width="1310" alt="Screenshot 2023-11-07 at 1 33 01 AM" src="https://github.com/georgia-tech-db/evadb/assets/32676813/af2fa40b-c8c1-4f3d-b93f-98d0bf278a5b"> Co-authored-by: Lohith K S <[email protected]>
In the previous commit, we added the changes for displaying the train scores and train times for XGBoost. We plan to add similar changes to Ludwig integration as well. --------- Co-authored-by: Jineet Desai <[email protected]> Co-authored-by: Andy Xu <[email protected]>
- Fix the following queries: ``` SELECT * FROM postgres_data.home_rentals where neighborhood='downtown' and number_of_rooms=2; ``` - Improve the error message: Instead of throwing arbitrary mask error, now we raise `Unsupported Logical Operator: ...`.
This PR adds support for creating and dropping jobs in evadb based on this [task](#1248). 1. Jobs can be created using the create job query: > CREATE JOB {job_name} AS { > {job_queries; ...} > } > START {start_time} > END {end_time} > EVERY {repeat_period} {repeat_unit} 2. Created jobs can be dropped using: > DROP JOB {job_name} 3. The scheduled jobs will only be triggered if the job scheduler process is started explicitly using: > EvaDBConnection.start_jobs() 4. The job scheduler process can be stopped using: > EvaDBConnection.stop_jobs() --------- Co-authored-by: Gaurav Tarlok Kakkar <[email protected]>
Provide feedback when `Forecasting` UDF is called in the following ways: - [x] Reporting confidence intervals - [x] Returning a metric for the forecasting performance. - [x] Providing suggestions in simple special cases, such as during Flat predictions. Eg: ```sql SELECT HomeForecast(); ``` ``` SUGGESTION: Predictions are flat. Consider using LIBRARY 'neuralforecast' for more accrate predictions. ``` Partially fixes #1257 and #1243. --------- Co-authored-by: Andy Xu <[email protected]>
Flaml provides support for Sklearn models like Random Forests, KNN, Extra Trees Regressor, and Logistic Regression with regularization. We plan to integrate these ML models into EVADB. Link for Flaml documentation: https://microsoft.github.io/FLAML/docs/Use-Cases/Task-Oriented-AutoML --------- Co-authored-by: Jineet Desai <[email protected]>
Migrate ChatGPT function to openai v1.0. The test is skipped in circleCI because we must supply the `OPENAI_API_KEY`. The test passes on local machine. - [x] Upgrade ChatGPT function. - [x] Upgrade Dall-e function. - [x] Update unit test cases. - [x] Verify that notebooks work correctly.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Remove empty evadb.db file.
Move github test into long intergration test so they are run on the
Clean up test_relational_api.py
Update link in the github data source documentation
Fix doc links
Merge remote-tracking branch 'origin/roadmap' into staging
checkpoint
checkpoint
Merge remote-tracking branch 'origin/fix_staging' into staging
docs: Update README.md
docs: Update README.md
Add the validation score and training time for create_function in XGB…
Add test_eva_db to gitignore (georgia-tech-db#1336)
Fix georgia-tech-db#1333 dependency and CMD in DockerFile (georgia-te…
CREATE INDEX IF NOT EXISTS is broken. (georgia-tech-db#1337)
Support semicolon and escaped strings in lark (georgia-tech-db#1339)
feat: third party app support in EVADB (georgia-tech-db#1033)
[WIP] Improving error handling messages for Custom Functions (georgia…
logging an error message for invalid files while loading (georgia-tec…
Verified that issue georgia-tech-db#1067 is resolved and added docume…
Add train scores for ludwig in the create function handler. (georgia-…
Make Logic Operators Case Insensitve (georgia-tech-db#1352)
Job scheduler implementation (georgia-tech-db#1308)
Fix python3.8 failing testcases due to type hint (georgia-tech-db#1364)
Add feedback for forecasting (georgia-tech-db#1258)
Adding changes for Flaml Sklearn integration (georgia-tech-db#1361)
Migrate ChatGPT function to openai v1.0 (georgia-tech-db#1368)