Bug: coredump at llama_grammar_accept_token() #1297

@alex1284B

Description

What happened?

llama-server + Mistral Vibe: after some amount of work, llama-server crashes with a core dump. After restarting llama-server and replaying the same context, it crashes again at the same point.

(gdb)  bt
#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:44
#1  __pthread_kill_internal (signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:78
#2  __GI___pthread_kill (threadid=<optimized out>, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3  0x0000772d5904527e in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4  0x0000772d590288ff in __GI_abort () at ./stdlib/abort.c:79
#5  0x0000772d594a5ff5 in __gnu_cxx::__verbose_terminate_handler () at ../../../../src/libstdc++-v3/libsupc++/vterminate.cc:95
#6  0x0000772d594bb0da in __cxxabiv1::__terminate (handler=<optimized out>) at ../../../../src/libstdc++-v3/libsupc++/eh_terminate.cc:48
#7  0x0000772d594a5a55 in std::terminate () at ../../../../src/libstdc++-v3/libsupc++/eh_terminate.cc:58
#8  0x0000772d594bb391 in __cxxabiv1::__cxa_throw (obj=<optimized out>, tinfo=0x5cc26c0bf790 <typeinfo for std::runtime_error@GLIBCXX_3.4>, dest=0x772d594d2150 <std::runtime_error::~runtime_error()>)
    at ../../../../src/libstdc++-v3/libsupc++/eh_throw.cc:98
#9  0x00005cc26ae7aef5 in llama_grammar_accept_token(llama_grammar&, int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) [clone .cold] ()
#10 0x00005cc26b20a624 in llama_grammar_accept_impl(llama_grammar&, llama_vocab const*, llama_sampling const*, int) ()
#11 0x00005cc26b09175a in common_sampler_accept(llama_sampling_context*, llama_context*, int, bool) ()
#12 0x00005cc26af7af74 in server_context::process_batch_tokens(int&) ()
#13 0x00005cc26af7c280 in server_context::update_slots() ()
#14 0x00005cc26af1be67 in server_queue::start_loop() ()
#15 0x00005cc26ae943ee in main ()

Name and Version

./work/ik_llama.cpp/build/bin/llama-server --version
version: 4191 (1fdbc0d)
built with cc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0 for x86_64-linux-gnu

What operating system are you seeing the problem on?

Linux

Relevant log output

INFO [                    main] HTTP server listening | tid="131036636012736" timestamp=1771667791 n_threads_http="23" port="8080" hostname="0.0.0.0"
INFO [              slots_idle] all slots are idle | tid="131036636012736" timestamp=1771667791
======== Prompt cache: cache size: 0, n_keep: 0, n_discarded_prompt: 0, cache_ram_n_min: 0, f_keep: 0.00, cache_ram_similarity: 0.50
INFO [   launch_slot_with_task] slot is processing task | tid="131036636012736" timestamp=1771667847 id_slot=0 id_task=0
======== Cache: cache_size = 0, n_past0 =  0, n_past1 =  0, n_past_prompt1 = 0,  n_past2 =  0, n_past_prompt2 =  0
INFO [    batch_pending_prompt] kv cache rm [p0, end) | tid="131036636012736" timestamp=1771667847 id_slot=0 id_task=0 p0=0
INFO [    batch_pending_prompt] kv cache rm [p0, end) | tid="131036636012736" timestamp=1771667851 id_slot=0 id_task=0 p0=2048
INFO [    batch_pending_prompt] kv cache rm [p0, end) | tid="131036636012736" timestamp=1771667858 id_slot=0 id_task=0 p0=4096
INFO [    batch_pending_prompt] kv cache rm [p0, end) | tid="131036636012736" timestamp=1771667866 id_slot=0 id_task=0 p0=6144
INFO [    batch_pending_prompt] kv cache rm [p0, end) | tid="131036636012736" timestamp=1771667875 id_slot=0 id_task=0 p0=8192
INFO [    batch_pending_prompt] kv cache rm [p0, end) | tid="131036636012736" timestamp=1771667887 id_slot=0 id_task=0 p0=10240
INFO [    batch_pending_prompt] kv cache rm [p0, end) | tid="131036636012736" timestamp=1771667900 id_slot=0 id_task=0 p0=12288
INFO [    batch_pending_prompt] kv cache rm [p0, end) | tid="131036636012736" timestamp=1771667915 id_slot=0 id_task=0 p0=14336
INFO [           release_slots] slot released | tid="131036636012736" timestamp=1771667933 id_slot=0 id_task=0 n_ctx=32768 n_past=14840 n_system_tokens=0 n_cache_tokens=14840 truncated=false
slot print_timing: id  0 | task -1 |                                             
prompt eval time =   70315.84 ms / 14586 tokens (    4.82 ms per token,   207.44 tokens per second)
       eval time =   15744.81 ms /   255 tokens (   61.74 ms per token,    16.20 tokens per second)
      total time =   86060.65 ms / 14841 tokens                                  
INFO [              slots_idle] all slots are idle | tid="131036636012736" timestamp=1771667933
INFO [      log_server_request] request | tid="131015620921024" timestamp=1771667933 remote_addr="127.0.0.1" remote_port=38840 status=200 method="POST" path="/v1/chat/completions" params={}
======== Prompt cache: cache size: 14840, n_keep: 0, n_discarded_prompt: 0, cache_ram_n_min: 0, f_keep: 1.00, cache_ram_similarity: 0.50
INFO [   launch_slot_with_task] slot is processing task | tid="131036636012736" timestamp=1771667933 id_slot=0 id_task=263
======== Cache: cache_size = 14840, n_past0 =  14765, n_past1 =  14765, n_past_prompt1 = 14765,  n_past2 =  14767, n_past_prompt2 =  14766
Common part does not match fully                                                 
cache : parameter=todos>                                                         
[                                                                                
  {                                                                              
    "id": "1",                                                                   
    "content": "xxxxxxxx xxxxx xxxxx                                        
prompt: parameter=todos>                                                         
[{'id': '1', 'content': 'xxxxxxxx xxxxx xxxxx xxxx           
INFO [    batch_pending_prompt] kv cache rm [p0, end) | tid="131036636012736" timestamp=1771667933 id_slot=0 id_task=263 p0=14765
terminate called after throwing an instance of 'std::runtime_error'              
  what():  Unexpected empty grammar stack after accepting piece: =search (96598) 
Aborted (core dumped)
