secp256k1: Harden const time field normalization. #2258
rstaudt2
left a comment
This looks good to me, nicely done. I don't see any issues with the logic and also compared the benchmark to what is currently in master.
I was also looking into the possibility of a dynamic approach to detecting non-constant-time functions in Go and didn't find anything that could quickly be applied, but the approach outlined in this paper is interesting: https://courses.csail.mit.edu/6.857/2015/files/dove-vasiliev.pdf
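As a rough illustration of what a dynamic check could look like, here is a minimal sketch that times one function on two fixed input classes and compares the samples with Welch's t-statistic, in the spirit of tools such as dudect. It is not taken from the linked paper or from this PR. The names `welchT`, `meanVar`, `sample`, and `sink` are hypothetical, and the measurement loop is deliberately simplified: a serious harness would interleave the classes randomly and control for compiler optimizations and system noise.

```go
package main

import (
	"fmt"
	"math"
	"time"
)

// sink keeps the compiler from discarding the measured calls (hypothetical helper).
var sink uint32

// meanVar returns the sample mean and unbiased sample variance of xs.
// It assumes len(xs) >= 2.
func meanVar(xs []float64) (mean, variance float64) {
	for _, x := range xs {
		mean += x
	}
	mean /= float64(len(xs))
	for _, x := range xs {
		d := x - mean
		variance += d * d
	}
	variance /= float64(len(xs) - 1)
	return mean, variance
}

// welchT returns Welch's t-statistic for two samples. A large absolute value
// suggests the two timing distributions differ.
func welchT(a, b []float64) float64 {
	ma, va := meanVar(a)
	mb, vb := meanVar(b)
	denom := math.Sqrt(va/float64(len(a)) + vb/float64(len(b)))
	if denom == 0 && ma == mb {
		return 0 // identical constant samples: no measurable difference
	}
	return (ma - mb) / denom
}

// sample collects n timings (in nanoseconds) of reps calls to f(input).
func sample(f func(uint32) uint32, input uint32, n, reps int) []float64 {
	out := make([]float64, n)
	for i := range out {
		start := time.Now()
		for j := 0; j < reps; j++ {
			sink = f(input)
		}
		out[i] = float64(time.Since(start).Nanoseconds())
	}
	return out
}

func main() {
	// Function under test: a branchless equality check against 2^26 - 1.
	eq := func(v uint32) uint32 { return uint32((uint64(v^67108863) - 1) >> 63) }

	equal := sample(eq, 67108863, 200, 1000) // class 1: input equals the mask
	differ := sample(eq, 12345, 200, 1000)   // class 2: input differs
	fmt.Printf("t = %.2f\n", welchT(equal, differ))
}
```

A statistic near zero is only weak evidence of constant-time behavior on the machine and compiler used, so this kind of check complements inspecting the generated assembly rather than replacing it.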
I updated the PR and commit descriptions to include the benchmark statistics across 5 runs of both the previous and new code.
@rstaudt2 Thanks for the review. It is indeed challenging to prove constant time in general. The way I personally go about it is by looking at the assembly output to identify any DDTVs (data-dependent timing variations), but a more dynamic approach would certainly be useful.
Wow. That btcd PR has been sitting there since 2017...
For me, it looks like the bench has slowed from master. Are my commands correct?
$ git checkout master
Switched to branch 'master'
$ go test -bench=BenchmarkFieldNormalize -benchtime 20s > old.txt
$ git checkout 2258
Switched to branch '2258'
$ go test -bench=BenchmarkFieldNormalize -benchtime 20s > new.txt
$ benchstat old.txt new.txt
name              old time/op  new time/op  delta
FieldNormalize-8  18.5ns ± 0%  19.2ns ± 0%  ~       (p=1.000 n=1+1)
With -count 5 and without setting -benchtime:
name              old time/op  new time/op  delta
FieldNormalize-8  19.3ns ± 1%  20.1ns ± 1%  +4.35%  (p=0.008 n=5+5)
That
For reference, here is the relevant assembly from both: You can see the old code is doing comparisons via
Old:
CMPL CX, $4194303
SETEQ DL
MOVL R8, R14
ANDL DI, R8
ANDL R9, R8
ANDL R10, R8
ANDL R11, R8
ANDL R12, R8
ANDL R13, R8
ANDL $67108863, R8
MOVBLZX DL, DX
ANDL $1, DX
CMPL R8, $67108863
MOVL $0, R8
CMOVLEQ DX, R8
ANDL $67108863, SI
ANDL $67108863, BX
LEAL 977(SI), DX
SHRL $26, DX
LEAL (DX)(BX*1), DX
LEAL 64(DX), DX
ANDL $1, R8
CMPL DX, $67108863
MOVL $0, DX
CMOVLHI R8, DX
MOVL CX, R8
SHRL $22, CX
MOVL DX, R15
ORL $1, DX
TESTL CX, CX
New:
XCHGL AX, AX
MOVL R13, DX
ANDL R12, R13
ANDL R11, R13
ANDL R10, R13
ANDL R9, R13
ANDL R8, R13
ANDL DI, R13
ANDL $67108863, R13
MOVL CX, R14
XORL $4194303, CX
DECQ CX
SHRQ $63, CX
XORL $67108863, R13
DECQ R13
SHRQ $63, R13
ANDL R13, CX
LEAL 977(SI), R13
SHRL $26, R13
LEAL (R13)(BX*1), R13
LEAL 64(R13), R13
ADDQ $-67108863, R13
NEGQ R13
SHRQ $63, R13
ANDL CX, R13
MOVL R14, CX
SHRL $22, R14
ORL R14, R13
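To make the new listing easier to follow, here is a minimal Go sketch of branchless primitives that would compile to the XOR/DEC/SHR and ADD/NEG/SHR sequences shown above, along with the combined check they feed. It is reconstructed from the assembly and from the secp256k1 prime p = 2^256 - 2^32 - 977, not copied from the PR. The names `constantTimeEq`, `constantTimeGreater`, `needsReduction`, and the constants are assumptions for illustration, and the sketch assumes a field value held as ten 26-bit words (22 bits in the top word) whose carries have already been propagated.

```go
package main

import "fmt"

const (
	fieldBaseMask = (1 << 26) - 1 // 67108863, mask for the lower nine 26-bit words
	fieldMSBMask  = (1 << 22) - 1 // 4194303, mask for the 22-bit top word
)

// constantTimeEq returns 1 if a == b and 0 otherwise without branching.
// a^b is zero only when the values match, so subtracting 1 in 64 bits wraps
// around and sets bit 63 (compare XORL, DECQ, SHRQ $63 in the new listing).
func constantTimeEq(a, b uint32) uint32 {
	return uint32((uint64(a^b) - 1) >> 63)
}

// constantTimeGreater returns 1 if a > b and 0 otherwise without branching.
// b-a computed in 64 bits is negative, and so has bit 63 set, only when a > b
// (compare ADDQ $-67108863, NEGQ, SHRQ $63 in the new listing).
func constantTimeGreater(a, b uint32) uint32 {
	return uint32((uint64(b) - uint64(a)) >> 63)
}

// needsReduction returns a nonzero value when the ten-word value t is greater
// than or equal to the prime, or when the top word overflows 22 bits.
func needsReduction(t *[10]uint32) uint32 {
	// Words 2..8 of the prime are all ones, so AND them together first.
	mid := t[2] & t[3] & t[4] & t[5] & t[6] & t[7] & t[8] & fieldBaseMask

	m := constantTimeEq(t[9], fieldMSBMask)
	m &= constantTimeEq(mid, fieldBaseMask)
	// Adding 977 to word 0 and 64 to word 1 overflows 26 bits exactly when
	// the low two words are at or above the low two words of the prime.
	m &= constantTimeGreater(t[1]+64+((t[0]+977)>>26), fieldBaseMask)
	m |= t[9] >> 22
	return m
}

func main() {
	// The prime itself in 26-bit words, least significant word first.
	p := [10]uint32{0x3fffc2f, 0x3ffffbf, 0x3ffffff, 0x3ffffff, 0x3ffffff,
		0x3ffffff, 0x3ffffff, 0x3ffffff, 0x3ffffff, 0x3fffff}
	pMinusOne := p
	pMinusOne[0]--

	fmt.Println(needsReduction(&p), needsReduction(&pMinusOne))
}
```

Writing the source without branches does not by itself guarantee branch-free machine code, which is why checking the compiler output, as done above, remains necessary.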
This updates the field normalization code to better secure against the possibility of non-constant time operations due to branch prediction and adds several tests to ensure the new logic is sound.
The following benchmark results show that any difference from the previous implementation is within the margin of error and not statistically significant, so the change has no measurable performance impact.
name            old time/op  new time/op  delta
----------------------------------------------------------------------
FieldNormalize  22.1ns ± 1%  22.1ns ± 1%  ~     (p=0.873 n=5+5)
This is primarily based on the work of @bmperrea in btcsuite/btcd#1084, which in turn stems from questions originally raised by @stevenroose. However, this differs in that it uses a slightly faster implementation that reverses the comparison logic to reduce the number of primitives needed, uses internal functions for constant time comparison, and adds more complete tests for all of the possible combinations.