Description
I am currently debugging this code, which tries to delete all the keys in a map (to empty it while retaining bucket storage):
https://github.com/corazawaf/coraza/blob/v3/dev/internal/corazawaf/rulegroup.go#L105
(For what follows, note that I tried copying the keys into a slice before deleting, in case this was an iterator-plus-delete problem, but the behavior was the same, so I simplified back to plain map iteration.)
I noticed a memory leak, and after adding the check if len(transformationCache) > 0 { panic } right after the deletion, I could verify that the map wasn't getting cleared with TinyGo wasm. It works fine with Go.
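The clear-then-check pattern described above can be sketched roughly as follows. This is only an illustration, not the coraza code: cacheKey, cacheValue, and clearMap are hypothetical names, and the field layouts are assumptions chosen for the example.

```go
package main

import "fmt"

// cacheKey and cacheValue are hypothetical stand-ins for the real
// key and value types of the transformation cache.
type cacheKey struct {
	a, b int
}

type cacheValue struct {
	s1, s2, s3 string
}

// clearMap deletes every key while ranging over the map. The Go spec
// allows deleting entries during iteration, and this keeps the map's
// existing storage instead of allocating a new map.
func clearMap(m map[cacheKey]cacheValue) {
	for k := range m {
		delete(m, k)
	}
}

func main() {
	m := map[cacheKey]cacheValue{}
	for i := 0; i < 10; i++ {
		m[cacheKey{a: i, b: i * 2}] = cacheValue{s1: "x"}
	}

	clearMap(m)

	// Same kind of guard as in the report: any leftover entry means
	// the deletion did not actually empty the map.
	if len(m) > 0 {
		panic("map not cleared")
	}
	fmt.Println("entries after clear:", len(m))
}
```

With the standard Go toolchain this prints "entries after clear: 0"; the report is that the equivalent guard fires under TinyGo wasm.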
I have added this instrumentation to hashmap.go
https://github.com/anuraaga/tinygo/compare/custom-gc...anuraaga:tinygo:hashmapdelete-debug?expand=1
I also shrunk the test down to a repro similar to the one from last time:
https://github.com/corazawaf/coraza-proxy-wasm/compare/main...anuraaga:coraza-1225?expand=1#diff-173fbfd8d8844658344b121461b4290d0a85230caae9825240705df8130e8b75R33
~/go/bin/tinygo build -scheduler=none -target=wasi -opt=2 -o mapbench.wasm
The debug output looks something like this
hashmapBinaryGet 0x0001019c 0x0000ffd0 16 1666840535
could find key
hashmapBinaryDelete 0x0001019c 0x0000ffb8 16 1666840535
delete could find key 0x0001019c 0x0000ffb8 1666840535
hashmapBinaryGet 0x0001019c 0x0000ffd0 16 1666840535
couldn't find key
The key address for get is 0x0000ffd0 and for delete is 0x0000ffb8. That said, the hash is the same in this example, so it is able to clear the map. With the same instrumentation on the original coraza code, however, the hash values were also different. I'm not sure why I couldn't reproduce the hash value difference here, but either way, the key is in a local variable k, of which there is only one, so intuitively those addresses should be the same, and the difference is unexpected.
One odd thing I found while making the repro: the value struct seems to need to be larger than two strings to exhibit the behavior. With three fields, get and delete see different addresses, while with two fields they are the same.
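A rough sketch of that two-field versus three-field comparison is below. It is not the linked repro: the key type and the names twoFields, threeFields, leftoverTwo, and leftoverThree are all hypothetical. Each helper does a lookup followed by a delete on the same loop variable k, mirroring the get/delete sequence in the debug output, and returns how many entries survive.

```go
package main

import "fmt"

// key is a hypothetical struct key; the real key type may differ.
type key struct {
	a, b, c, d int32
}

// twoFields is a value the size of two strings.
type twoFields struct {
	s1, s2 string
}

// threeFields is a value the size of three strings, the case where
// the report sees different key addresses for get and delete.
type threeFields struct {
	s1, s2, s3 string
}

// leftoverTwo fills a map with n entries, then looks up and deletes
// each key during iteration, returning the number of entries left.
func leftoverTwo(n int) int {
	m := make(map[key]twoFields, n)
	for i := 0; i < n; i++ {
		m[key{a: int32(i)}] = twoFields{}
	}
	for k := range m {
		if _, ok := m[k]; ok { // get on k
			delete(m, k) // delete on the same k
		}
	}
	return len(m)
}

// leftoverThree is identical except for the larger value type.
func leftoverThree(n int) int {
	m := make(map[key]threeFields, n)
	for i := 0; i < n; i++ {
		m[key{a: int32(i)}] = threeFields{}
	}
	for k := range m {
		if _, ok := m[k]; ok {
			delete(m, k)
		}
	}
	return len(m)
}

func main() {
	fmt.Println("two fields left:", leftoverTwo(100))
	fmt.Println("three fields left:", leftoverThree(100))
}
```

With the standard Go toolchain both counts are 0. Whether this exact sketch triggers the problem under TinyGo is untested here; it only shows the shape of the comparison.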
I looked through the code in hashmap.go and map.go (compiler) and couldn't see anything suspicious. The code paths for get/lookup and delete look basically the same in both, but the difference does cause our real-world code to fail, with the map not being cleared. With the code looking the same, perhaps the issue is in IR reduction?
Note that the simple-ish test case approach above is also applied to the real-world code here (which is where I observed the address and key value difference):
https://github.com/corazawaf/coraza/compare/v3/dev...anuraaga:scratch?expand=1
The output looks like this (we can see the different hash values)
[2022-12-28 04:04:56.210312][25][info][wasm] [source/extensions/common/wasm/context.cc:1170] wasm log coraza-filter my_vm_id: hashmapBinaryGet 0x09887b70 0x0000fb98 16 2748313615
[2022-12-28 04:04:56.210312][25][info][wasm] [source/extensions/common/wasm/context.cc:1170] wasm log coraza-filter my_vm_id: could find key
[2022-12-28 04:04:56.210312][25][info][wasm] [source/extensions/common/wasm/context.cc:1170] wasm log coraza-filter my_vm_id: hashmapBinaryDelete 0x09887b70 0x0000fbf8 16 3003228291
[2022-12-28 04:04:56.210313][25][info][wasm] [source/extensions/common/wasm/context.cc:1170] wasm log coraza-filter my_vm_id: delete couldn't find key 0x09887b70
[2022-12-28 04:04:56.210313][25][info][wasm] [source/extensions/common/wasm/context.cc:1170] wasm log coraza-filter my_vm_id: hashmapBinaryGet 0x09887b70 0x0000fb98 16 2748313615
[2022-12-28 04:04:56.210314][25][info][wasm] [source/extensions/common/wasm/context.cc:1170] wasm log coraza-filter my_vm_id: could find key