unistore crash on data import #59651

Open
dveeden opened this issue Feb 19, 2025 · 8 comments
Labels: component/unistore, impact/crash, crash/fatal, type/bug (The issue is confirmed as a bug.)

Comments

Contributor

dveeden commented Feb 19, 2025

Bug Report

panic: runtime error: index out of range [-1]

goroutine 385 [running]:
github.com/pingcap/tidb/pkg/store/mockstore/unistore/lockstore.(*arena).get(0xc0c4262460?, 0xda8d7a?, 0x3a0b2ff?)
	/workspace/source/tidb/pkg/store/mockstore/unistore/lockstore/arena.go:79 +0x210
github.com/pingcap/tidb/pkg/store/mockstore/unistore/lockstore.(*MemStore).newNode(0xc001210060, 0xc001192000, {0xc0c427a06c, 0x13, 0xc001106e58?}, {0xc1122b6000, 0xda8d4f, 0xc001d41340?}, 0x1)
	/workspace/source/tidb/pkg/store/mockstore/unistore/lockstore/lockstore.go:366 +0x38e
github.com/pingcap/tidb/pkg/store/mockstore/unistore/lockstore.(*MemStore).PutWithHint(0xc001210060, {0xc0c427a06c, 0x13, 0x13}, {0xc1122b6000, 0xda8d4f, 0xda8d4f}, 0xc001106e58?)
	/workspace/source/tidb/pkg/store/mockstore/unistore/lockstore/lockstore.go:316 +0x245
github.com/pingcap/tidb/pkg/store/mockstore/unistore/tikv.writeLockWorker.run({0xc0011a6540?, 0xc00104d480?})
	/workspace/source/tidb/pkg/store/mockstore/unistore/tikv/write.go:170 +0x431
created by github.com/pingcap/tidb/pkg/store/mockstore/unistore/tikv.(*dbWriter).Open in goroutine 1
	/workspace/source/tidb/pkg/store/mockstore/unistore/tikv/write.go:207 +0x147

1. Minimal reproduce step (Required)

Load a SQL file created with mysqldump: mysql -e "source some_file.sql"

Config:

port = 3306
path = "/var/tidb"
txn-entry-size-limit = 16777216
allow-expression-index = true

2. What did you expect to see? (Required)

3. What did you see instead (Required)

4. What is your TiDB version? (Required)

v8.5.1 with unistore

dveeden added the impact/crash, crash/fatal, and type/bug labels Feb 19, 2025
Contributor Author

dveeden commented Feb 19, 2025

Note: we don't have a reliable way to reproduce this yet.

Contributor Author

dveeden commented Feb 21, 2025

Managed to reproduce this:

cfg/tidb.toml:

[performance]
txn-entry-size-limit = 125829120

podman run -p 3307:4000 -v /tmp/cfg:/cfg:Z -it pingcap/tidb:v8.5.1 -config /cfg/tidb.toml

create table t1(id int primary key auto_increment, b longtext);
INSERT INTO t1(b) VALUES(REPEAT('x',10000000));

Contributor Author

dveeden commented Feb 21, 2025

This doesn't need a container:

./bin/tidb-server -config <(echo -en "[performance]\ntxn-entry-size-limit = 125829120\n")
$ mysql -h 127.0.0.1 -P 4000 -u root test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2097156
Server version: 8.0.11-TiDB-v8.5.1 TiDB Server (Apache License 2.0) Community Edition, MySQL 8.0 compatible

Copyright (c) 2000, 2025, Oracle and/or its affiliates.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql-8.0.11-TiDB-v8.5.1> INSERT INTO t1(b) VALUES(REPEAT('x',10000000));
ERROR 2013 (HY000): Lost connection to MySQL server during query
No connection. Trying to reconnect...
ERROR 2003 (HY000): Can't connect to MySQL server on '127.0.0.1:4000' (111)
ERROR: 
Can't connect to the server

mysql-not_connected> 
[2025/02/21 15:04:18.732 +01:00] [INFO] [2pc.go:693] ["[BIG_TXN]"] [session=2097156] ["key sample"=74800000000000006e5f728000000000007531] [size=10000033] [keys=1] [puts=1] [dels=0] [locks=0] [checks=0] [txnStartTS=456169005703233536]
panic: runtime error: index out of range [-1]

goroutine 44 [running]:
github.com/pingcap/tidb/pkg/store/mockstore/unistore/lockstore.(*arena).get(0xc01b16e9b0?, 0x9896f4?, 0x202aca5?)
	/home/dvaneeden/dev/pingcap/tidb/pkg/store/mockstore/unistore/lockstore/arena.go:79 +0x210
github.com/pingcap/tidb/pkg/store/mockstore/unistore/lockstore.(*MemStore).newNode(0xc0017b8060, 0xc001bae000, {0xc01b16a06c, 0x13, 0x13?}, {0xc01df9e000, 0x9896c9, 0xc001e988c0?}, 0x1)
	/home/dvaneeden/dev/pingcap/tidb/pkg/store/mockstore/unistore/lockstore/lockstore.go:366 +0x38e
github.com/pingcap/tidb/pkg/store/mockstore/unistore/lockstore.(*MemStore).PutWithHint(0xc0017b8060, {0xc01b16a06c, 0x13, 0x13}, {0xc01df9e000, 0x9896c9, 0x9896c9}, 0x3?)
	/home/dvaneeden/dev/pingcap/tidb/pkg/store/mockstore/unistore/lockstore/lockstore.go:316 +0x245
github.com/pingcap/tidb/pkg/store/mockstore/unistore/tikv.writeLockWorker.run({0xc000f9c380?, 0xc0008673c0?})
	/home/dvaneeden/dev/pingcap/tidb/pkg/store/mockstore/unistore/tikv/write.go:170 +0x431
created by github.com/pingcap/tidb/pkg/store/mockstore/unistore/tikv.(*dbWriter).Open in goroutine 1
	/home/dvaneeden/dev/pingcap/tidb/pkg/store/mockstore/unistore/tikv/write.go:207 +0x147

Contributor Author

dveeden commented Feb 21, 2025

Contributor Author

dveeden commented Feb 21, 2025

With some extra debug logging added:

diff --git a/pkg/store/mockstore/unistore/lockstore/arena.go b/pkg/store/mockstore/unistore/lockstore/arena.go
index bf02e5020e..649316047e 100644
--- a/pkg/store/mockstore/unistore/lockstore/arena.go
+++ b/pkg/store/mockstore/unistore/lockstore/arena.go
@@ -76,6 +76,7 @@ func (a *arena) get(addr arenaAddr, size int) []byte {
        if addr.blockIdx() >= len(a.blocks) {
                log.S().Fatalf("arena.get out of range. len(blocks)=%v, addr.blockIdx()=%v, addr.blockOffset()=%v, size=%v", len(a.blocks), addr.blockIdx(), addr.blockOffset(), size)
        }
+       log.S().Infof("arena.get: len(blocks)=%v, addr.blockIdx()=%v, addr.blockOffset()=%v, size=%v", len(a.blocks), addr.blockIdx(), addr.blockOffset(), size)
        return a.blocks[addr.blockIdx()].get(addr.blockOffset(), size)
 }
 
[2025/02/21 15:22:15.832 +01:00] [INFO] [arena.go:79] ["arena.get: len(blocks)=1, addr.blockIdx()=0, addr.blockOffset()=144, size=16"]
[2025/02/21 15:22:15.832 +01:00] [INFO] [arena.go:79] ["arena.get: len(blocks)=1, addr.blockIdx()=0, addr.blockOffset()=144, size=59"]
[2025/02/21 15:22:15.832 +01:00] [INFO] [arena.go:79] ["arena.get: len(blocks)=1, addr.blockIdx()=0, addr.blockOffset()=144, size=16"]
[2025/02/21 15:22:15.832 +01:00] [INFO] [arena.go:79] ["arena.get: len(blocks)=1, addr.blockIdx()=0, addr.blockOffset()=144, size=59"]
[2025/02/21 15:22:15.844 +01:00] [INFO] [2pc.go:693] ["[BIG_TXN]"] [session=2097154] ["key sample"=74800000000000006e5f728000000000015f91] [size=10000033] [keys=1] [puts=1] [dels=0] [locks=0] [checks=0] [txnStartTS=456169288061681664]
[2025/02/21 15:22:15.849 +01:00] [INFO] [arena.go:79] ["arena.get: len(blocks)=2, addr.blockIdx()=-1, addr.blockOffset()=0, size=10000132"]
panic: runtime error: index out of range [-1]

goroutine 83 [running]:
github.com/pingcap/tidb/pkg/store/mockstore/unistore/lockstore.(*arena).get(0xc01339e0f0, 0x0, 0x989704)
	/home/dvaneeden/dev/pingcap/tidb/pkg/store/mockstore/unistore/lockstore/arena.go:80 +0x393
github.com/pingcap/tidb/pkg/store/mockstore/unistore/lockstore.(*MemStore).newNode(0xc00121a150, 0xc0010a6000, {0xc0133ab06c, 0x13, 0x13?}, {0xc016c1e000, 0x9896c9, 0xc00119fc00?}, 0x3)
	/home/dvaneeden/dev/pingcap/tidb/pkg/store/mockstore/unistore/lockstore/lockstore.go:366 +0x38e
github.com/pingcap/tidb/pkg/store/mockstore/unistore/lockstore.(*MemStore).PutWithHint(0xc00121a150, {0xc0133ab06c, 0x13, 0x13}, {0xc016c1e000, 0x9896c9, 0x9896c9}, 0x0?)
	/home/dvaneeden/dev/pingcap/tidb/pkg/store/mockstore/unistore/lockstore/lockstore.go:316 +0x245
github.com/pingcap/tidb/pkg/store/mockstore/unistore/tikv.writeLockWorker.run({0xc001242460?, 0xc0011f8040?})
	/home/dvaneeden/dev/pingcap/tidb/pkg/store/mockstore/unistore/tikv/write.go:170 +0x431
created by github.com/pingcap/tidb/pkg/store/mockstore/unistore/tikv.(*dbWriter).Open in goroutine 1
	/home/dvaneeden/dev/pingcap/tidb/pkg/store/mockstore/unistore/tikv/write.go:207 +0x147

Contributor Author

dveeden commented Feb 21, 2025

[2025/02/21 15:29:24.286 +01:00] [INFO] [2pc.go:693] ["[BIG_TXN]"] [session=2097154] ["key sample"=74800000000000006e5f72800000000002bf22] [size=10000033] [keys=1] [puts=1] [dels=0] [locks=0] [checks=0] [txnStartTS=456169400372822016]
> [Breakpoint 1] github.com/pingcap/tidb/pkg/store/mockstore/unistore/lockstore.(*arena).get() ./pkg/store/mockstore/unistore/lockstore/arena.go:80 (hits goroutine(419):1 total:1) (PC: 0x4db507b)
    75:	func (a *arena) get(addr arenaAddr, size int) []byte {
    76:		if addr.blockIdx() >= len(a.blocks) {
    77:			log.S().Fatalf("arena.get out of range. len(blocks)=%v, addr.blockIdx()=%v, addr.blockOffset()=%v, size=%v", len(a.blocks), addr.blockIdx(), addr.blockOffset(), size)
    78:		}
    79:		if addr.blockIdx() < 0 {
=>  80:			panic("blockidx negative")
    81:		}
    82:		return a.blocks[addr.blockIdx()].get(addr.blockOffset(), size)
    83:	}
    84:	
    85:	func (a *arena) alloc(size int) arenaAddr {
(dlv) p a
("*github.com/pingcap/tidb/pkg/store/mockstore/unistore/lockstore.arena")(0xc0132e0140)
*github.com/pingcap/tidb/pkg/store/mockstore/unistore/lockstore.arena {
	blockSize: 8388608,
	blocks: []*github.com/pingcap/tidb/pkg/store/mockstore/unistore/lockstore.arenaBlock len: 2, cap: 2, [
		*(*"github.com/pingcap/tidb/pkg/store/mockstore/unistore/lockstore.arenaBlock")(0xc00180c120),
		*(*"github.com/pingcap/tidb/pkg/store/mockstore/unistore/lockstore.arenaBlock")(0xc0132e8300),
	],
	writableQueue: []int len: 0, cap: 1, [],
	pendingBlocks: []github.com/pingcap/tidb/pkg/store/mockstore/unistore/lockstore.pendingBlock len: 0, cap: 0, nil,}
(dlv) p addr
nullArenaAddr (0)

So it looks like something calls get on an addr that is null?
(I added the check for < 0 so that I could easily set a breakpoint.)
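To illustrate why a null address shows up as block index -1, here is a minimal sketch. It assumes the address packs blockIdx+1 into the high 32 bits and the block offset into the low 32 bits. That layout is an assumption chosen to be consistent with the log lines above (addr 0 gives blockIdx -1 and blockOffset 0); it is not copied from arena.go, and newArenaAddr is a hypothetical helper.

```go
package main

import "fmt"

// arenaAddr is a hypothetical packed address: the high 32 bits hold
// blockIdx+1 and the low 32 bits hold the offset inside the block.
// This layout is an assumption, not the actual unistore definition.
type arenaAddr uint64

// nullArenaAddr is the zero value, as printed by the debugger above.
const nullArenaAddr arenaAddr = 0

// newArenaAddr (hypothetical helper) packs a block index and an offset.
func newArenaAddr(blockIdx int, blockOffset uint32) arenaAddr {
	return arenaAddr(uint64(blockIdx+1)<<32 | uint64(blockOffset))
}

// blockIdx undoes the +1 bias, so the zero address decodes to -1.
func (a arenaAddr) blockIdx() int { return int(a>>32) - 1 }

// blockOffset returns the low 32 bits.
func (a arenaAddr) blockOffset() uint32 { return uint32(a) }

func main() {
	valid := newArenaAddr(0, 144)
	fmt.Println(valid.blockIdx(), valid.blockOffset()) // 0 144

	// The null address passes an upper-bound check (-1 < len(blocks))
	// but is not a valid slice index.
	fmt.Println(nullArenaAddr.blockIdx(), nullArenaAddr.blockOffset()) // -1 0
}
```

Under this assumption, a check of the form blockIdx >= len(blocks) cannot catch the null address, which would explain why the existing Fatalf guard is skipped and the slice index panics instead.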

Contributor

mjonss commented Feb 21, 2025

There is no direct way to fix this in Unistore in a small PR, so I would suggest not allowing transactions larger than 8 MB when the current engine is Unistore.

The issue is that every row must fit into a single arena block.

We could of course increase the block size, but what should a reasonable limit be in that case?

Another test:

func TestInsertLargeRow(t *testing.T) {
	store := testkit.CreateMockStore(t)
	tk := testkit.NewTestKit(t, store)
	tk.MustExec("use test")
	tk.MustExec("create table t (id int primary key, b longtext)")
	tk.MustExec("set tidb_txn_entry_size_limit = 1<<23")
	// The unistore arena block size is 8 MB (8388608 bytes). A row cannot span
	// multiple arena blocks, so unistore cannot handle rows larger than that.
	tk.MustExec("insert into t values (1, REPEAT('t',8388493))")
}

Contributor Author

dveeden commented Feb 22, 2025

@mjonss could we make it return an error instead of crash?

2 participants