tests: add LLM based test generation tool #59626

YangKeao · 2025-02-18T17:54:55Z

What problem does this PR solve?

Issue Number: close #59625

Problem Summary:

What changed and how does it work?

This PR introduces an LLM-based test generation tool that lets developers easily generate and run test cases for any feature. The code includes examples for the expression, DML, and CTE features. So far it has only found bugs in expressions (which is roughly what I expected. TiDB is good!).
The PR also commits all generated test cases, which helps developers keep a record of the SQL that has already been tested.

Refer to https://github.com/YangKeao/tidb/blob/add-llm-test/tests/llmtest/README.md for usage instructions.

Check List

Tests

Unit test
Integration test
Manual test (add detailed scripts or steps below)
No need to test
- I checked and no code files have been changed.

Side effects

Performance regression: Consumes more CPU
Performance regression: Consumes more Memory
Breaking backward compatibility

Documentation

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

YangKeao · 2025-02-18T18:11:35Z

/retest

codecov · 2025-02-18T19:22:03Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 73.3852%. Comparing base (fdf33cf) to head (f7349e7).

Additional details and impacted files

@@               Coverage Diff                @@
##             master     #59626        +/-   ##
================================================
+ Coverage   73.0024%   73.3852%   +0.3827%     
================================================
  Files          1697       1697                
  Lines        468869     468909        +40     
================================================
+ Hits         342286     344110      +1824     
+ Misses       105529     103740      -1789     
- Partials      21054      21059         +5

Flag	Coverage Δ
integration	`42.7856% <ø> (?)`
unit	`72.1736% <ø> (-0.0289%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components	Coverage Δ
dumpling	`52.6910% <ø> (ø)`
parser	`∅ <ø> (∅)`
br	`45.0963% <ø> (+0.0366%)`	⬆️

hawkingrei · 2025-02-19T03:04:25Z

tests/llmtest/main.go

+
+func createGenerateCmd() *cobra.Command {
+	var (
+		openaiToken         string


Can I use other LLMs, such as DeepSeek?

YangKeao · 2025-02-18T17:58:16Z

tests/llmtest/testdata/dml.json

@@ -0,0 +1,218 @@
+{
+  "delete": [


Actually, I'm not sure whether it's a good idea to commit all generated test cases to the repo 🤔.

YangKeao · 2025-02-19T04:28:21Z

tests/llmtest/main.go

+
+func createGenerateCmd() *cobra.Command {
+	var (
+		openaiToken         string


Sure. The DeepSeek API is compatible with OpenAI's. In fact, I used DeepSeek to run this tool on my own machine.

I've run this tool with OpenAI gpt-4o, Claude Sonnet, and deepseek-r1. DeepSeek R1 gave the best results: nearly every SQL statement it generated is valid, while the SQL generated by the other models has some grammar errors 🤦 and doesn't handle escape characters well. (I haven't tested gpt-o1 or newer models, though.)
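To illustrate what "compatible with OpenAI" means in practice, here is a minimal sketch of building a chat completion request where only the base URL and model name differ between providers. It is not taken from `tests/llmtest`; the function `newChatRequest`, its parameters, the placeholder token, and the base URLs and model names in `main` are assumptions made for this example, so check each provider's documentation before relying on them.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"strings"
)

// chatMessage and chatRequest mirror the basic request body of an
// OpenAI-style chat completions endpoint.
type chatMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type chatRequest struct {
	Model    string        `json:"model"`
	Messages []chatMessage `json:"messages"`
}

// newChatRequest builds (but does not send) a chat completion request.
// All parameters are hypothetical names chosen for this sketch.
func newChatRequest(baseURL, token, model, prompt string) (*http.Request, error) {
	body, err := json.Marshal(chatRequest{
		Model:    model,
		Messages: []chatMessage{{Role: "user", Content: prompt}},
	})
	if err != nil {
		return nil, err
	}
	// Tolerate a trailing slash on the configured base URL.
	url := strings.TrimRight(baseURL, "/") + "/chat/completions"
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+token)
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	// Assumed base URLs and model names; only these two values change
	// when switching providers.
	providers := []struct{ baseURL, model string }{
		{"https://api.openai.com/v1", "gpt-4o"},
		{"https://api.deepseek.com", "deepseek-reasoner"},
	}
	for _, p := range providers {
		req, err := newChatRequest(p.baseURL, "sk-placeholder", p.model, "Generate one SQL query.")
		if err != nil {
			panic(err)
		}
		fmt.Println(req.Method, req.URL.String())
	}
}
```

Keeping the base URL as a parameter rather than a constant is what makes it possible to swap providers without touching the request-building code.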

bb7133

LGTM, but please try to get CI to pass.

ti-chi-bot · 2025-02-19T19:55:53Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bb7133

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [bb7133]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

ti-chi-bot · 2025-02-19T19:55:54Z

[LGTM Timeline notifier]

Timeline:

2025-02-19 19:55:54.44326402 +0000 UTC m=+1077596.839486067: ☑️ agreed by bb7133.

YangKeao · 2025-02-20T10:49:55Z

/retest

YangKeao · 2025-02-20T11:14:18Z

/retest

YangKeao · 2025-02-20T13:33:22Z

/retest

Signed-off-by: Yang Keao <[email protected]>

D3Hunter · 2025-02-21T04:04:26Z

tests/llmtest/testdata/expression.json

Can we generate this file on the fly? If so, there's no need to commit it to the repo; it's quite large 😓

The output is not stable, so it gives different queries each time.

If someone wants to generate queries that haven't been tested yet (and don't duplicate known issues), this history will be helpful: we could tell the LLM to avoid existing queries.
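The idea of reusing the committed history could be sketched roughly as follows: list the existing queries in the prompt, then filter whatever comes back against the same history, since the model may still repeat itself. This is not the tool's actual implementation; `buildPrompt`, `filterNew`, `normalize`, and the prompt wording are all hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// normalize makes two queries comparable by lowercasing them and
// collapsing whitespace. This is a deliberate simplification: it also
// alters string literals, so it can treat distinct queries as equal.
func normalize(q string) string {
	return strings.ToLower(strings.Join(strings.Fields(q), " "))
}

// buildPrompt asks for n new queries for a feature and lists the
// already-recorded queries the model should avoid (hypothetical wording).
func buildPrompt(feature string, n int, history []string) string {
	var b strings.Builder
	fmt.Fprintf(&b, "Generate %d SQL queries that test the %s feature.\n", n, feature)
	if len(history) > 0 {
		b.WriteString("Do not repeat any of these existing queries:\n")
		for _, q := range history {
			b.WriteString("- " + q + "\n")
		}
	}
	return b.String()
}

// filterNew drops generated queries that already appear in history or
// that repeat within the generated batch itself.
func filterNew(generated, history []string) []string {
	seen := make(map[string]bool, len(history))
	for _, q := range history {
		seen[normalize(q)] = true
	}
	var fresh []string
	for _, q := range generated {
		key := normalize(q)
		if !seen[key] {
			seen[key] = true
			fresh = append(fresh, q)
		}
	}
	return fresh
}

func main() {
	history := []string{"SELECT 1 + 1"}
	fmt.Print(buildPrompt("expression", 5, history))

	generated := []string{"select 1  +  1", "SELECT CONCAT('a', 'b')"}
	fmt.Println(filterNew(generated, history))
}
```

The post-filter matters because the prompt instruction is only a request; a large history may also exceed the model's context window, in which case only a subset could be listed.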

ti-chi-bot bot added release-note-none Denotes a PR that doesn't merit a release note. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Feb 18, 2025

YangKeao force-pushed the add-llm-test branch 3 times, most recently from b167ea9 to edd17d4 Compare February 18, 2025 18:00

YangKeao force-pushed the add-llm-test branch 2 times, most recently from cca5df7 to f20e3ba Compare February 18, 2025 18:59

hawkingrei reviewed Feb 19, 2025

YangKeao force-pushed the add-llm-test branch from f20e3ba to 20a72c8 Compare February 19, 2025 04:21

YangKeao commented Feb 19, 2025

YangKeao force-pushed the add-llm-test branch 3 times, most recently from a03170b to 0d0045e Compare February 19, 2025 06:03

bb7133 approved these changes Feb 19, 2025

ti-chi-bot bot added approved needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels Feb 19, 2025

YangKeao force-pushed the add-llm-test branch 2 times, most recently from 26a1b2c to 9d6cce0 Compare February 20, 2025 12:13

add LLM based test generation tool

f7349e7

Signed-off-by: Yang Keao <[email protected]>

YangKeao force-pushed the add-llm-test branch from 9d6cce0 to f7349e7 Compare February 20, 2025 13:59

D3Hunter reviewed Feb 21, 2025
