LMS multi-tree signing crashes after first subtree is exhausted. #1966

Open
cr-marcstevens opened this issue Oct 29, 2024 · 1 comment
@cr-marcstevens
Contributor

While benchmarking stateful signatures, the only two enabled LMS multi-tree algorithms (LMS_SHA256_H5_W8_H5_W8 and LMS_SHA256_H10_W4_H5_W8) both crash during the signing benchmark loop.

The crash appears only when the benchmark duration is long enough to exhaust the first subtree.
For example, with LMS_SHA256_H10_W4_H5_W8, generating 31 signatures succeeds, but the run crashes when generating 32 signatures.
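
The boundary between 31 and 32 signatures is consistent with the size of the bottom tree. A minimal sketch of that arithmetic, assuming the parameter name lists the top tree first (H10) and the bottom tree second (H5), and using hypothetical helper names that are not part of liboqs:

```c
#include <stdint.h>

/* Number of one-time signatures available from a single Merkle tree
 * of the given height (one signature per leaf). */
static uint64_t lms_tree_capacity(unsigned height) {
    return (uint64_t)1 << height;
}

/* Total signatures for a two-level HSS key: every leaf of the top tree
 * signs the root of one bottom tree. */
static uint64_t hss_two_level_capacity(unsigned top_height, unsigned bottom_height) {
    return lms_tree_capacity(top_height) * lms_tree_capacity(bottom_height);
}
```

Under that assumption, `lms_tree_capacity(5)` is 32 and `hss_two_level_capacity(10, 5)` is 32768, so the failure shows up where the first height-5 subtree runs out of leaves, long before the key as a whole is exhausted.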

Used command lines & output:

$ mkdir build; cd build
$ cmake -G "Unix Makefiles" -DOQS_ENABLE_SIG_STFL_XMSS=ON -DOQS_ENABLE_SIG_STFL_LMS=ON -DOQS_HAZARDOUS_EXPERIMENTAL_ENABLE_SIG_STFL_KEY_SIG_GEN=ON -DOQS_DIST_BUILD=OFF -DCMAKE_C_COMPILER=clang -DCMAKE_BUILD_TYPE=Debug -DUSE_SANITIZER=Address ..
$ make -j40

$ tests/speed_sig_stfl -d 49 -i LMS_SHA256_H10_W4_H5_W8
Configuration info
==================
Target platform:  x86_64-Linux-6.5.6-300.fc39.x86_64
Compiler:         clang (17.0.6 (Fedora 17.0.6-1.fc39))
Compile options:  [-march=native;-Wa,--noexecstack;-g3;-fno-omit-frame-pointer;-fno-optimize-sibling-calls;-fsanitize-address-use-after-scope;-fsanitize=address;-Wbad-function-cast;-Wcast-qual;-Wnarrowing;-Wconversion]
OQS version:      0.11.1-dev
Git commit:       71324732640cf4451cab425ac5d80ed9fd41a3af (+ local modifications)
OpenSSL enabled:  Yes (OpenSSL 3.1.1 30 May 2023)
AES:              NI
SHA-2:            OpenSSL
SHA-3:            C
OQS build flags:  OQS_LIBJADE_BUILD USE_SANITIZER=Address OQS_OPT_TARGET=auto CMAKE_BUILD_TYPE=Debug
CPU exts compile-time:  ADX AES AVX AVX2 AVX512 BMI1 BMI2 PCLMULQDQ POPCNT SSE SSE2 SSE3

Speed test
==========
Started at 2024-10-29 10:18:53
Operation                            | Iterations | Total time (s) | Time (us): mean | pop. stdev | CPU cycles: mean          | pop. stdev
------------------------------------ | ----------:| --------------:| ---------------:| ----------:| -------------------------:| ----------:
LMS_SHA256_H10_W4_H5_W8              |            |                |                 |            |                           |
keypair                              |          1 |          1.159 |     1158788.000 |      0.000 |                2896982282 |          0
sign                                 |         31 |         49.238 |     1588318.581 |  84141.762 |                3832268891 |  715795960
verify                               |      10136 |         49.002 |        4834.500 |   4113.029 |                  12086165 |   10282590
public key bytes: 60, secret key bytes: 64, signature bytes: 3860
Ended at 2024-10-29 10:20:34

$ tests/speed_sig_stfl -d 50 -i LMS_SHA256_H10_W4_H5_W8
Configuration info
==================
Target platform:  x86_64-Linux-6.5.6-300.fc39.x86_64
Compiler:         clang (17.0.6 (Fedora 17.0.6-1.fc39))
Compile options:  [-march=native;-Wa,--noexecstack;-g3;-fno-omit-frame-pointer;-fno-optimize-sibling-calls;-fsanitize-address-use-after-scope;-fsanitize=address;-Wbad-function-cast;-Wcast-qual;-Wnarrowing;-Wconversion]
OQS version:      0.11.1-dev
Git commit:       71324732640cf4451cab425ac5d80ed9fd41a3af (+ local modifications)
OpenSSL enabled:  Yes (OpenSSL 3.1.1 30 May 2023)
AES:              NI
SHA-2:            OpenSSL
SHA-3:            C
OQS build flags:  OQS_LIBJADE_BUILD USE_SANITIZER=Address OQS_OPT_TARGET=auto CMAKE_BUILD_TYPE=Debug
CPU exts compile-time:  ADX AES AVX AVX2 AVX512 BMI1 BMI2 PCLMULQDQ POPCNT SSE SSE2 SSE3

Speed test
==========
Started at 2024-10-29 10:13:07
Operation                            | Iterations | Total time (s) | Time (us): mean | pop. stdev | CPU cycles: mean          | pop. stdev
------------------------------------ | ----------:| --------------:| ---------------:| ----------:| -------------------------:| ----------:
LMS_SHA256_H10_W4_H5_W8              |            |                |                 |            |                           |
keypair                              |          1 |          1.159 |     1159033.000 |      0.000 |                2897595644 |          0
AddressSanitizer:DEADLYSIGNAL
=================================================================
==2504729==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000018 (pc 0x000000989c80 bp 0x7ffe3e40f800 sp 0x7ffe3e40f160 T0)
==2504729==The signal is caused by a READ memory access.
==2504729==Hint: address points to the zero page.
    #0 0x989c80 in OQS_LMS_NAMESPACE_hss_generate_signature /export/scratch1/home/stevens/GIT/pqc_benchmarking/liboqs/src/sig_stfl/lms/external/hss_sign.c:652:44
    #1 0x98d295 in OQS_LMS_NAMESPACE_hss_sign_init /export/scratch1/home/stevens/GIT/pqc_benchmarking/liboqs/src/sig_stfl/lms/external/hss_sign_inc.c:82:20
    #2 0x8487fc in oqs_sig_stfl_lms_sign /export/scratch1/home/stevens/GIT/pqc_benchmarking/liboqs/src/sig_stfl/lms/sig_stfl_lms_functions.c:591:8
    #3 0x8480fe in OQS_SIG_STFL_alg_lms_sign /export/scratch1/home/stevens/GIT/pqc_benchmarking/liboqs/src/sig_stfl/lms/sig_stfl_lms_functions.c:92:6
    #4 0x50b8d1 in OQS_SIG_STFL_sign /export/scratch1/home/stevens/GIT/pqc_benchmarking/liboqs/src/sig_stfl/sig_stfl.c:1024:42
    #5 0x5084e0 in sig_speed_wrapper /export/scratch1/home/stevens/GIT/pqc_benchmarking/liboqs/tests/speed_sig_stfl.c:114:3
    #6 0x50734b in main /export/scratch1/home/stevens/GIT/pqc_benchmarking/liboqs/tests/speed_sig_stfl.c:249:8
    #7 0x7f8aafc46149 in __libc_start_call_main (/lib64/libc.so.6+0x28149) (BuildId: 788cdd41a15985bf8e0a48d213a46e07d58822df)
    #8 0x7f8aafc4620a in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x2820a) (BuildId: 788cdd41a15985bf8e0a48d213a46e07d58822df)
    #9 0x42a4c4 in _start (/export/scratch1/home/stevens/GIT/pqc_benchmarking/liboqs/build/tests/speed_sig_stfl+0x42a4c4) (BuildId: 454b0b10237bae477b14dc9bfa56067b6a6cee78)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV /export/scratch1/home/stevens/GIT/pqc_benchmarking/liboqs/src/sig_stfl/lms/external/hss_sign.c:652:44 in OQS_LMS_NAMESPACE_hss_generate_signature
==2504729==ABORTING
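
A faulting READ at 0x18 in the zero page is what one would typically see when a struct member is read through a NULL pointer, with 0x18 being the member's offset. The sketch below only illustrates that arithmetic. The struct layout is hypothetical and is not taken from hss_sign.c, so it does not show which pointer is NULL at line 652.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout for illustration only: three 8-byte fields
 * precede `leaf_index`, which places it at byte offset 0x18. */
struct example_tree {
    uint64_t level;       /* offset 0x00 */
    uint64_t node_count;  /* offset 0x08 */
    uint64_t max_index;   /* offset 0x10 */
    uint64_t leaf_index;  /* offset 0x18 */
};

/* Address that a read of base->leaf_index would touch. This is pure
 * arithmetic; nothing is dereferenced. */
static uintptr_t field_address(uintptr_t base) {
    return base + offsetof(struct example_tree, leaf_index);
}
```

With a NULL base, `field_address(0)` evaluates to 0x18, matching the shape of the address in the report above.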

Note that the local changes to the repo are minimal and do not affect this bug:

  1. Enable additional LMS algorithms
  2. Add a break statement inside the keygen benchmark loop so that only one key is generated before the signing benchmark starts. I used this to rule out repeated keygen calls as the trigger: the bug still appears when keygen is called only once. I kept the change because skipping the long keygen benchmark speeds up debugging.

$ git diff
diff --git a/src/oqsconfig.h.cmake b/src/oqsconfig.h.cmake
index dae1baba..1e841f13 100644
--- a/src/oqsconfig.h.cmake
+++ b/src/oqsconfig.h.cmake
@@ -304,3 +304,7 @@
 #cmakedefine OQS_ALLOW_STFL_KEY_AND_SIG_GEN 1
 #cmakedefine OQS_ALLOW_XMSS_KEY_AND_SIG_GEN 1
 #cmakedefine OQS_ALLOW_LMS_KEY_AND_SIG_GEN 1
+#cmakedefine OQS_ENABLE_SIG_STFL_lms_sha256_h20_w1 1
+#cmakedefine OQS_ENABLE_SIG_STFL_lms_sha256_h20_w2 1
+#cmakedefine OQS_ENABLE_SIG_STFL_lms_sha256_h20_w4 1
+#cmakedefine OQS_ENABLE_SIG_STFL_lms_sha256_h20_w8 1
diff --git a/tests/speed_sig_stfl.c b/tests/speed_sig_stfl.c
index ac09ca7b..ac57655c 100644
--- a/tests/speed_sig_stfl.c
+++ b/tests/speed_sig_stfl.c
@@ -105,6 +105,7 @@ static OQS_STATUS sig_speed_wrapper(const char *method_name, uint64_t duration,
                                printf("keygen error. Exiting.\n");
                                exit(-1);
                        }
+                       break;
                        secret_key = reset_secret_key(sig, secret_key);
                })
                // benchmark sign: need to generate new secret key after available signatures have been exhausted

Also note that the allocated virtual memory usage of LMS is huge:

top - 10:04:41 up 145 days, 15:38,  3 users,  load average: 0.11, 0.39, 0.83
Tasks: 823 total,   2 running, 820 sleeping,   1 stopped,   0 zombie
%Cpu(s):  1.3 us,  0.0 sy,  0.0 ni, 98.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem : 1546757.+total, 1476665.+free,  16781.3 used,  61316.5 buff/cache
MiB Swap:  16384.0 total,  16384.0 free,      0.0 used. 1529976.+avail Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND                       
2504709 stevens   20   0   20.0t 469120   6400 R  99.7   0.0   0:05.79 speed_sig_stfl                
@ashman-p
Contributor

@cr-marcstevens I will look into this.
