tests: add LLM based test generation tool #59626

Open
wants to merge 1 commit into
base: master

YangKeao
Member

What problem does this PR solve?

Issue Number: close #59625

Problem Summary:

What changed and how does it work?

  1. This PR introduces an LLM-based test generation tool that lets developers easily generate and run test cases for any feature. The code includes examples for the expression, DML, and CTE features. So far it has only found bugs in expressions, which is roughly what I expected (TiDB is good!).
  2. The PR also checks in all of the generated test cases, which gives developers a record of the current status of the test SQL.

See https://github.com/YangKeao/tidb/blob/add-llm-test/tests/llmtest/README.md for usage instructions.

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No need to test
    • I checked and no code files have been changed.

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

@ti-chi-bot ti-chi-bot bot added release-note-none Denotes a PR that doesn't merit a release note. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Feb 18, 2025
@YangKeao YangKeao force-pushed the add-llm-test branch 3 times, most recently from b167ea9 to edd17d4 Compare February 18, 2025 18:00
@YangKeao
Member Author

/retest

@YangKeao YangKeao force-pushed the add-llm-test branch 2 times, most recently from cca5df7 to f20e3ba Compare February 18, 2025 18:59
codecov bot commented Feb 18, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 73.3852%. Comparing base (fdf33cf) to head (f7349e7).

@@               Coverage Diff                @@
##             master     #59626        +/-   ##
================================================
+ Coverage   73.0024%   73.3852%   +0.3827%     
================================================
  Files          1697       1697                
  Lines        468869     468909        +40     
================================================
+ Hits         342286     344110      +1824     
+ Misses       105529     103740      -1789     
- Partials      21054      21059         +5     
Flag Coverage Δ
integration 42.7856% <ø> (?)
unit 72.1736% <ø> (-0.0289%) ⬇️

Components Coverage Δ
dumpling 52.6910% <ø> (ø)
parser ∅ <ø> (∅)
br 45.0963% <ø> (+0.0366%) ⬆️


func createGenerateCmd() *cobra.Command {
var (
openaiToken string
Member

Can I use another LLM, such as DeepSeek?

Member Author

Sure. The DeepSeek API is compatible with OpenAI's. In fact, I used DeepSeek to run this tool on my own machine.

I've run this tool with OpenAI gpt-4o, Claude Sonnet, and deepseek-r1. DeepSeek R1 gave the best results: nearly all of the SQL it generated was valid, while the SQL generated by the other models had some grammar errors 🤦 and did not handle escape characters well. (I haven't tested OpenAI o1 or newer models, though.)
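
To illustrate what "compatible with OpenAI" means in practice, here is a minimal sketch (not the code in this PR) that builds a chat-completions request against a configurable base URL. The helper name `newChatRequest`, the placeholder token, and the prompt are hypothetical; the DeepSeek base URL and the `deepseek-reasoner` model name are assumptions that should be checked against the provider's documentation.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"strings"
)

// chatMessage and chatRequest model only the minimal fields of an
// OpenAI-style chat-completions request body.
type chatMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type chatRequest struct {
	Model    string        `json:"model"`
	Messages []chatMessage `json:"messages"`
}

// newChatRequest (hypothetical helper) builds, but does not send, a POST to
// <baseURL>/chat/completions. Switching providers only changes baseURL,
// token, and model; the request shape stays the same.
func newChatRequest(baseURL, token, model, prompt string) (*http.Request, error) {
	body, err := json.Marshal(chatRequest{
		Model:    model,
		Messages: []chatMessage{{Role: "user", Content: prompt}},
	})
	if err != nil {
		return nil, err
	}
	url := strings.TrimRight(baseURL, "/") + "/chat/completions"
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+token)
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	// Assumed values: base URL and model name for DeepSeek R1.
	req, err := newChatRequest(
		"https://api.deepseek.com",
		"sk-placeholder",
		"deepseek-reasoner",
		"Generate one SQL query that tests a string function.",
	)
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
}
```

Passing `https://api.openai.com/v1` as the base URL with an OpenAI model name would target OpenAI instead; sending the request (for example with `http.DefaultClient.Do(req)`) and parsing the response are left out of this sketch.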

@@ -0,0 +1,218 @@
{
"delete": [
Member Author

Actually, I'm not sure whether it's a good idea to commit all of the generated test cases to the repo 🤔.

@YangKeao YangKeao force-pushed the add-llm-test branch 3 times, most recently from a03170b to 0d0045e Compare February 19, 2025 06:03
Member

@bb7133 bb7133 left a comment

LGTM, but please get CI passing.

ti-chi-bot bot commented Feb 19, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bb7133

@ti-chi-bot ti-chi-bot bot added approved needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels Feb 19, 2025
ti-chi-bot bot commented Feb 19, 2025

[LGTM Timeline notifier]

Timeline:

  • 2025-02-19 19:55:54 UTC: ☑️ agreed by bb7133.

@YangKeao
Member Author

/retest

@YangKeao YangKeao force-pushed the add-llm-test branch 2 times, most recently from 26a1b2c to 9d6cce0 Compare February 20, 2025 12:13
@YangKeao
Member Author

/retest

Contributor

Can we generate this file on the fly? If so, there's no need to commit it to the repo; it's quite large 😓

Member Author

The output is not stable, so the tool produces different queries on each run.

If someone wants to generate queries that haven't been tested yet (and that don't duplicate known issues), this history will be helpful: we could tell the LLM to avoid the existing queries.
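
As a rough sketch of how the checked-in history could be used this way (an illustration only, not the tool's implementation), the stored queries can be listed in the prompt and also used to filter out repeats afterwards. The names `normalize`, `buildAvoidPrompt`, and `filterNew`, as well as the prompt wording, are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// normalize makes trivially different spellings of a query compare equal:
// it trims a trailing semicolon and collapses runs of whitespace.
// It is deliberately crude and case-sensitive, so string literals are
// never conflated.
func normalize(q string) string {
	q = strings.TrimSuffix(strings.TrimSpace(q), ";")
	return strings.Join(strings.Fields(q), " ")
}

// buildAvoidPrompt (hypothetical) asks the model for new queries for one
// feature and lists the existing queries it should not repeat.
func buildAvoidPrompt(feature string, history []string) string {
	var b strings.Builder
	fmt.Fprintf(&b, "Generate new SQL test queries for the %s feature.\n", feature)
	b.WriteString("Do not repeat any of these existing queries:\n")
	for _, q := range history {
		b.WriteString("- " + normalize(q) + "\n")
	}
	return b.String()
}

// filterNew (hypothetical) drops candidates that already appear in the
// history, or earlier in the candidate list, after normalization. This is
// a safety net because the model may ignore the instruction in the prompt.
func filterNew(history, candidates []string) []string {
	seen := make(map[string]bool, len(history))
	for _, q := range history {
		seen[normalize(q)] = true
	}
	var fresh []string
	for _, c := range candidates {
		key := normalize(c)
		if !seen[key] {
			seen[key] = true
			fresh = append(fresh, c)
		}
	}
	return fresh
}

func main() {
	history := []string{"SELECT 1;", "DELETE FROM t WHERE a = 1"}
	candidates := []string{
		"SELECT   1 ;",               // already in the history
		"DELETE FROM t WHERE a = 2",  // new
		"DELETE FROM t WHERE a = 2;", // repeat of the previous candidate
	}
	fmt.Print(buildAvoidPrompt("delete", history))
	fmt.Println(filterNew(history, candidates))
}
```

Filtering after generation matters because the prompt is only a request; exact-text matching will still miss queries that are semantically equivalent but written differently.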

Successfully merging this pull request may close these issues.

Merge the LLM based test generation to the tidb code base
4 participants