Eval bug: llama-cpp-deepseek-r1.jinja template will miss the <think> tag #12107

Open
Sherlock-Holo opened this issue Feb 28, 2025 · 1 comment


Name and Version

ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
Device 0: Tesla T4, compute capability 7.5, VMM: yes
version: 4790 (438a839)

Operating systems

Linux

GGML backends

CUDA

Hardware

Tesla T4

Models

GGUF deepseek-r1:14b, downloaded from Ollama

Problem description & steps to reproduce

/root/git/llama.cpp/build/bin/llama-server \
    -m /usr/share/ollama/.ollama/models/blobs/sha256-6e9f90f02bb3b39b59e81916e8cfce9deb45aeaeb9a54a5be4414486b907dc1e \
    -ngl 99 \
    --temp 0.6 \
    --no-webui \
    --host 0.0.0.0 \
    --port 20004 \
    --jinja \
    -fa \
    --chat-template-file /root/git/llama.cpp/models/templates/llama-cpp-deepseek-r1.jinja

This command uses llama-cpp-deepseek-r1.jinja as the chat template. In streaming mode, the output content is missing the opening <think> tag, while the closing </think> tag is still present. Removing the flag --chat-template-file /root/git/llama.cpp/models/templates/llama-cpp-deepseek-r1.jinja makes the problem go away.
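Until the server behaviour is settled, a client can tolerate the missing opening tag. The following is a minimal sketch of such a workaround, not something provided by llama.cpp: the function name split_reasoning is hypothetical, and it assumes the full streamed text has already been concatenated into one string.

```python
OPEN_TAG = "<think>"
CLOSE_TAG = "</think>"


def split_reasoning(text: str) -> tuple[str, str]:
    """Split model output into (reasoning, answer).

    Tolerates a missing opening tag: everything before the first closing
    tag is treated as reasoning even when "<think>" was never streamed.
    """
    head, sep, tail = text.partition(CLOSE_TAG)
    if not sep:
        # No closing tag at all: treat the whole text as the answer.
        return "", text.strip()
    # Drop the opening tag if the server did send it.
    reasoning = head.lstrip().removeprefix(OPEN_TAG).strip()
    return reasoning, tail.strip()


# Hypothetical sample text shaped like the output described above.
reasoning, answer = split_reasoning("step one\n</think>\n\nfinal answer")
print(repr(reasoning), repr(answer))
```

Because the split keys on the closing tag only, the same function works whether or not the opening tag is present.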

First Bad Commit

No response

Relevant log output

/root/git/llama.cpp/build/bin/llama-server \
    -m /usr/share/ollama/.ollama/models/blobs/sha256-6e9f90f02bb3b39b59e81916e8cfce9deb45aeaeb9a54a5be4414486b907dc1e \
    -ngl 99 \
    --temp 0.6 \
    --no-webui \
    --host 0.0.0.0 \
    --port 20004 \
    --jinja \
    -fa \
    --chat-template-file /root/git/llama.cpp/models/templates/llama-cpp-deepseek-r1.jinja

ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
  Device 0: Tesla T4, compute capability 7.5, VMM: yes

system info: n_threads = 20, n_threads_batch = 20, total_threads = 20

system_info: n_threads = 20 (n_threads_batch = 20) / 20 | CUDA : ARCHS = 750 | USE_GRAPHS = 1 | PEER_MAX_BATCH_SIZE = 128 | CPU : SSE3 = 1 | SSSE3 = 1 | AVX = 1 | AVX2 = 1 | F16C = 1 | FMA = 1 | AVX512 = 1 | AVX512_VNNI = 1 | LLAMAFILE = 1 | OPENMP = 1 | AARCH64_REPACK = 1 |

Web UI is disabled
main: HTTP server is listening, hostname: 0.0.0.0, port: 20004, http threads: 19
main: loading model
srv    load_model: loading model '/usr/share/ollama/.ollama/models/blobs/sha256-6e9f90f02bb3b39b59e81916e8cfce9deb45aeaeb9a54a5be4414486b907dc1e'
llama_model_load_from_file_impl: using device CUDA0 (Tesla T4) - 14814 MiB free
llama_model_loader: loaded meta data with 26 key-value pairs and 579 tensors from /usr/share/ollama/.ollama/models/blobs/sha256-6e9f90f02bb3b39b59e81916e8cfce9deb45aeaeb9a54a5be4414486b907dc1e (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = qwen2
llama_model_loader: - kv   1:                               general.type str              = model
llama_model_loader: - kv   2:                               general.name str              = DeepSeek R1 Distill Qwen 14B
llama_model_loader: - kv   3:                           general.basename str              = DeepSeek-R1-Distill-Qwen
llama_model_loader: - kv   4:                         general.size_label str              = 14B
llama_model_loader: - kv   5:                          qwen2.block_count u32              = 48
llama_model_loader: - kv   6:                       qwen2.context_length u32              = 131072
llama_model_loader: - kv   7:                     qwen2.embedding_length u32              = 5120
llama_model_loader: - kv   8:                  qwen2.feed_forward_length u32              = 13824
llama_model_loader: - kv   9:                 qwen2.attention.head_count u32              = 40
llama_model_loader: - kv  10:              qwen2.attention.head_count_kv u32              = 8
llama_model_loader: - kv  11:                       qwen2.rope.freq_base f32              = 1000000.000000
llama_model_loader: - kv  12:     qwen2.attention.layer_norm_rms_epsilon f32              = 0.000010
llama_model_loader: - kv  13:                          general.file_type u32              = 15
llama_model_loader: - kv  14:                       tokenizer.ggml.model str              = gpt2
llama_model_loader: - kv  15:                         tokenizer.ggml.pre str              = qwen2
llama_model_loader: - kv  16:                      tokenizer.ggml.tokens arr[str,152064]  = ["!", "\"", "#", "$", "%", "&", "'", ...
llama_model_loader: - kv  17:                  tokenizer.ggml.token_type arr[i32,152064]  = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv  18:                      tokenizer.ggml.merges arr[str,151387]  = ["Ġ Ġ", "ĠĠ ĠĠ", "i n", "Ġ t",...
llama_model_loader: - kv  19:                tokenizer.ggml.bos_token_id u32              = 151646
llama_model_loader: - kv  20:                tokenizer.ggml.eos_token_id u32              = 151643
llama_model_loader: - kv  21:            tokenizer.ggml.padding_token_id u32              = 151643
llama_model_loader: - kv  22:               tokenizer.ggml.add_bos_token bool             = true
llama_model_loader: - kv  23:               tokenizer.ggml.add_eos_token bool             = false
llama_model_loader: - kv  24:                    tokenizer.chat_template str              = {% if not add_generation_prompt is de...
llama_model_loader: - kv  25:               general.quantization_version u32              = 2
llama_model_loader: - type  f32:  241 tensors
llama_model_loader: - type q4_K:  289 tensors
llama_model_loader: - type q6_K:   49 tensors
print_info: file format = GGUF V3 (latest)
print_info: file type   = Q4_K - Medium
print_info: file size   = 8.37 GiB (4.87 BPW)
load: special_eos_id is not in special_eog_ids - the tokenizer config may be incorrect
load: special tokens cache size = 22
load: token to piece cache size = 0.9310 MB
print_info: arch             = qwen2
print_info: vocab_only       = 0
print_info: n_ctx_train      = 131072
print_info: n_embd           = 5120
print_info: n_layer          = 48
print_info: n_head           = 40
print_info: n_head_kv        = 8
print_info: n_rot            = 128
print_info: n_swa            = 0
print_info: n_embd_head_k    = 128
print_info: n_embd_head_v    = 128
print_info: n_gqa            = 5
print_info: n_embd_k_gqa     = 1024
print_info: n_embd_v_gqa     = 1024
print_info: f_norm_eps       = 0.0e+00
print_info: f_norm_rms_eps   = 1.0e-05
print_info: f_clamp_kqv      = 0.0e+00
print_info: f_max_alibi_bias = 0.0e+00
print_info: f_logit_scale    = 0.0e+00
print_info: n_ff             = 13824
print_info: n_expert         = 0
print_info: n_expert_used    = 0
print_info: causal attn      = 1
print_info: pooling type     = 0
print_info: rope type        = 2
print_info: rope scaling     = linear
print_info: freq_base_train  = 1000000.0
print_info: freq_scale_train = 1
print_info: n_ctx_orig_yarn  = 131072
print_info: rope_finetuned   = unknown
print_info: ssm_d_conv       = 0
print_info: ssm_d_inner      = 0
print_info: ssm_d_state      = 0
print_info: ssm_dt_rank      = 0
print_info: ssm_dt_b_c_rms   = 0
print_info: model type       = 14B
print_info: model params     = 14.77 B
print_info: general.name     = DeepSeek R1 Distill Qwen 14B
print_info: vocab type       = BPE
print_info: n_vocab          = 152064
print_info: n_merges         = 151387
print_info: BOS token        = 151646 '<|begin▁of▁sentence|>'
print_info: EOS token        = 151643 '<|end▁of▁sentence|>'
print_info: EOT token        = 151643 '<|end▁of▁sentence|>'
print_info: PAD token        = 151643 '<|end▁of▁sentence|>'
print_info: LF token         = 198 'Ċ'
print_info: FIM PRE token    = 151659 '<|fim_prefix|>'
print_info: FIM SUF token    = 151661 '<|fim_suffix|>'
print_info: FIM MID token    = 151660 '<|fim_middle|>'
print_info: FIM PAD token    = 151662 '<|fim_pad|>'
print_info: FIM REP token    = 151663 '<|repo_name|>'
print_info: FIM SEP token    = 151664 '<|file_sep|>'
print_info: EOG token        = 151643 '<|end▁of▁sentence|>'
print_info: EOG token        = 151662 '<|fim_pad|>'
print_info: EOG token        = 151663 '<|repo_name|>'
print_info: EOG token        = 151664 '<|file_sep|>'
print_info: max token length = 256
load_tensors: loading model tensors, this can take a while... (mmap = true)
load_tensors: offloading 48 repeating layers to GPU
load_tensors: offloading output layer to GPU
load_tensors: offloaded 49/49 layers to GPU
load_tensors:   CPU_Mapped model buffer size =   417.66 MiB
load_tensors:        CUDA0 model buffer size =  8148.38 MiB
...........................................................................................
llama_init_from_model: n_seq_max     = 1
llama_init_from_model: n_ctx         = 4096
llama_init_from_model: n_ctx_per_seq = 4096
llama_init_from_model: n_batch       = 2048
llama_init_from_model: n_ubatch      = 512
llama_init_from_model: flash_attn    = 1
llama_init_from_model: freq_base     = 1000000.0
llama_init_from_model: freq_scale    = 1
llama_init_from_model: n_ctx_per_seq (4096) < n_ctx_train (131072) -- the full capacity of the model will not be utilized
llama_kv_cache_init: kv_size = 4096, offload = 1, type_k = 'f16', type_v = 'f16', n_layer = 48, can_shift = 1
llama_kv_cache_init:      CUDA0 KV buffer size =   768.00 MiB
llama_init_from_model: KV self size  =  768.00 MiB, K (f16):  384.00 MiB, V (f16):  384.00 MiB
llama_init_from_model:  CUDA_Host  output buffer size =     0.58 MiB
llama_init_from_model:      CUDA0 compute buffer size =   307.00 MiB
llama_init_from_model:  CUDA_Host compute buffer size =    18.01 MiB
llama_init_from_model: graph nodes  = 1495
llama_init_from_model: graph splits = 2
common_init_from_params: setting dry_penalty_last_n to ctx_size = 4096
common_init_from_params: warming up the model with an empty run - please wait ... (--no-warmup to disable)
srv          init: initializing slots, n_slots = 1
slot         init: id  0 | task -1 | new slot n_ctx_slot = 4096
main: model loaded
main: chat template, chat_template: {%- if not add_generation_prompt is defined -%}
    {%- set add_generation_prompt = false -%}
{%- endif -%}
{%- set ns = namespace(is_first=false, is_tool_outputs=false, is_output_first=true, system_prompt='') -%}
{%- for message in messages -%}
    {%- if message['role'] == 'system' -%}
        {%- set ns.system_prompt = message['content'] -%}
    {%- endif -%}
{%- endfor -%}
{{bos_token}}
{%- if tools %}
You can call any of the following function tools to satisfy the user's requests: {{tools | map(attribute='function') | tojson(indent=2)}}

Example function tool call syntax:

<|tool▁calls▁begin|><|tool▁call▁begin|>function<|tool▁sep|>example_function_name

{
  "arg1": "some_value"
  ...
}

<|tool▁call▁end|><|tool▁calls▁end|>

{% endif -%}
{{ns.system_prompt}}
{%- macro flush_tool_outputs() -%}
    {%- if ns.is_tool_outputs -%}
        {{- '<|tool▁outputs▁end|><|end▁of▁sentence|>' -}}
        {%- set ns.is_tool_outputs = false -%}
    {%- endif -%}
{%- endmacro -%}
{{- flush_tool_outputs() -}}
{%- for message in messages -%}
    {%- if message['role'] != 'tool' -%}
        {{- flush_tool_outputs() -}}
    {%- endif -%}
    {%- if message['role'] == 'user' -%}
        {{- '<|User|>' + message['content'] + '<|end▁of▁sentence|>' -}}
    {%- endif -%}
    {%- if message['role'] == 'assistant' and message['content'] is none -%}
        {{- '<|Assistant|><|tool▁calls▁begin|>' -}}
        {%- set ns.is_first = true -%}
        {%- for tc in message['tool_calls'] -%}
            {%- if ns.is_first -%}
                {%- set ns.is_first = false -%}
            {%- else -%}
                {{- '\n' -}}
            {%- endif -%}
            {%- set tool_name = tc['function']['name'] -%}
            {%- set tool_args = tc['function']['arguments'] -%}
            {{- '<|tool▁call▁begin|>' + tc['type'] + '<|tool▁sep|>' + tool_name + '\n' + '' + '\n' + tool_args + '\n' + '' + '<|tool▁call▁end|>' -}}
        {%- endfor -%}
        {{- '<|tool▁calls▁end|><|end▁of▁sentence|>' -}}
    {%- endif -%}
    {%- if message['role'] == 'assistant' and message['content'] is  not none -%}
        {{- flush_tool_outputs() -}}
        {%- set content = message['content'] -%}
        {%- if '</think>' in content -%}
            {%- set content = content.split('</think>')[-1] -%}
        {%- endif -%}
        {{- '<|Assistant|>' + content + '<|end▁of▁sentence|>' -}}
    {%- endif -%}
    {%- if message['role'] == 'tool' -%}
        {%- set ns.is_tool_outputs = true -%}
        {%- if ns.is_output_first -%}
            {{- '<|tool▁outputs▁begin|>' -}}
            {%- set ns.is_output_first = false -%}
        {%- endif -%}
        {{- '\n<|tool▁output▁begin|>' + message['content'] + '<|tool▁output▁end|>' -}}
    {%- endif -%}
{%- endfor -%}
{{- flush_tool_outputs() -}}
{%- if add_generation_prompt and not ns.is_tool_outputs -%}
    {{- '<|Assistant|><think>\n' -}}
{%- endif -%}, example_format: 'You are a helpful assistant<|User|>Hello<|end▁of▁sentence|><|Assistant|>Hi there<|end▁of▁sentence|><|User|>How are you?<|end▁of▁sentence|><|Assistant|><think>
'
main: server is listening on http://0.0.0.0:20004 - starting the main loop
srv  update_slots: all slots are idle
srv  params_from_: Chat format: DeepSeek R1 (extract reasoning)
slot launch_slot_: id  0 | task 0 | processing task
slot update_slots: id  0 | task 0 | new prompt, n_ctx_slot = 4096, n_keep = 0, n_prompt_tokens = 12
slot update_slots: id  0 | task 0 | kv cache rm [0, end)
slot update_slots: id  0 | task 0 | prompt processing progress, n_past = 12, n_tokens = 12, progress = 1.000000
slot update_slots: id  0 | task 0 | prompt done, n_past = 12, n_tokens = 12
slot      release: id  0 | task 0 | stop processing: n_past = 84, truncated = 0
slot print_timing: id  0 | task 0 |
prompt eval time =     128.41 ms /    12 tokens (   10.70 ms per token,    93.45 tokens per second)
       eval time =    3607.62 ms /    73 tokens (   49.42 ms per token,    20.23 tokens per second)
      total time =    3736.03 ms /    85 tokens
srv  update_slots: all slots are idle
srv  log_server_r: request: POST /v1/chat/completions 10.30.200.19 200
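The template printed in the log above ends its generation prompt with '<|Assistant|><think>\n', so the opening tag is part of the prompt rather than of the generated text. One plausible reading of the symptom is that the server streams only what the model generates after the prompt. The sketch below illustrates that reading for a single user turn; fake_model is a hypothetical stand-in for the model, render_prompt reproduces only the last steps of the template, and the BOS token and system prompt are left out for brevity.

```python
def render_prompt(user_message: str) -> str:
    """Mimic the last steps of the logged template for a single user turn."""
    return (
        "<|User|>" + user_message + "<|end▁of▁sentence|>"
        + "<|Assistant|><think>\n"  # generation prompt taken from the template
    )


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for the model: returns only the continuation."""
    return "some reasoning\n</think>\n\nsome answer"


prompt = render_prompt("Hello")
completion = fake_model(prompt)

# The opening tag lives in the prompt ...
print(prompt.endswith("<think>\n"))
# ... so the generated text does not contain it ...
print("<think>" in completion)
# ... while the closing tag is generated by the model.
print("</think>" in completion)
```

Under this reading, a fix would have to re-emit the opening tag on the server side, or the client would have to assume it.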

The following request was sent (the prompt asks the model to introduce itself in great detail):

http POST http://gpu-dev:20004/v1/chat/completions \
    model=deepseek-r1-14b-cpp \
    messages:='[{"role": "user", "content": "非常详细地介绍一下你自己?"}]' \
    stream:=true

The streamed response was:

HTTP/1.1 200 OK
Access-Control-Allow-Origin:
Content-Type: text/event-stream
Keep-Alive: timeout=5, max=100
Server: llama.cpp
Transfer-Encoding: chunked

data: {
    "choices": [
        {
            "delta": {
                "content": "您好"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734646,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "!"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734646,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "我是"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734646,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "由"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734646,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "中国的"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734646,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "深度"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734646,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "求"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734646,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "索"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734646,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "("
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734646,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "Deep"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734646,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "Seek"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734646,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": ")"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734646,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "公司"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734646,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "开发"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734647,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "的"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734647,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "智能"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734647,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "助手"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734647,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "Deep"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734647,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "Seek"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734647,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "-R"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734647,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "1"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734647,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "。"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734647,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "如"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734647,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "您"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734647,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "有任何"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734647,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "任何"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734647,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "问题"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734647,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": ","
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734647,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "我会"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734647,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "尽"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734647,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "我"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734647,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "所能"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734647,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "为您提供"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734647,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "帮助"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734648,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "。\n"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734648,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "</think>"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734648,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "\n\n"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734648,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "您好"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734648,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "!"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734648,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "我是"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734648,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "由"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734648,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "中国的"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734648,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "深度"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734648,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "求"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734648,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "索"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734648,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "("
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734648,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "Deep"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734648,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "Seek"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734648,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": ")"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734648,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "公司"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734648,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "开发"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734648,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "的"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734648,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "智能"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734648,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "助手"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734649,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "Deep"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734649,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "Seek"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734649,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "-R"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734649,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "1"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734649,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "。"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734649,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "如"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734649,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "您"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734649,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "有任何"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734649,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "任何"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734649,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "问题"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734649,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": ","
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734649,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "我会"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734649,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "尽"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734649,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "我"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734649,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "所能"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734649,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "为您提供"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734649,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "帮助"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734649,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": "。"
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734649,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {
                "content": ""
            },
            "finish_reason": null,
            "index": 0
        }
    ],
    "created": 1740734649,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392"
}

data: {
    "choices": [
        {
            "delta": {},
            "finish_reason": "stop",
            "index": 0
        }
    ],
    "created": 1740734649,
    "id": "chatcmpl-huCXGDWAI250H1Sid6SfwT2F6QcrjN8p",
    "model": "deepseek-r1-14b-cpp",
    "object": "chat.completion.chunk",
    "system_fingerprint": "b4790-438a8392",
    "timings": {
        "predicted_ms": 3607.618,
        "predicted_n": 73,
        "predicted_per_second": 20.234958357564466,
        "predicted_per_token_ms": 49.41942465753424,
        "prompt_ms": 128.408,
        "prompt_n": 12,
        "prompt_per_second": 93.45212136315494,
        "prompt_per_token_ms": 10.700666666666665
    },
    "usage": {
        "completion_tokens": 73,
        "prompt_tokens": 12,
        "total_tokens": 85
    }
}

data: [DONE]
@Sherlock-Holo
Author

If I remove the --chat-template-file /root/git/llama.cpp/models/templates/llama-cpp-deepseek-r1.jinja flag, the problem goes away.
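
For anyone who wants to check a saved stream for this symptom, here is a rough sketch of how it could be done. It assumes the stream was dumped to text in the same shape as the log above (events separated by blank lines, each starting with `data:`). The function names `join_stream_content` and `has_unmatched_think_close` are made up for this example and are not part of llama.cpp.

```python
import json


def join_stream_content(sse_text: str) -> str:
    """Concatenate every delta.content value from a dumped SSE stream.

    Assumes events are separated by blank lines and each event is
    `data: <json>` (pretty-printed or single-line), ending with `data: [DONE]`.
    """
    parts = []
    for event in sse_text.split("\n\n"):
        event = event.strip()
        if not event.startswith("data:"):
            continue  # skip anything that is not a data event
        payload = event[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts)


def has_unmatched_think_close(text: str) -> bool:
    """Return True if </think> appears without a <think> before it."""
    close_at = text.find("</think>")
    if close_at == -1:
        return False
    open_at = text.find("<think>")
    return open_at == -1 or open_at > close_at
```

Running `has_unmatched_think_close(join_stream_content(dump))` on the dump taken with and without the --chat-template-file flag should show whether only the opening tag is missing.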
