Misc. bug: Error calculating KV cache positions in llama-server #12160

Open

Clauszy opened this issue Mar 3, 2025 · 0 comments

Clauszy commented Mar 3, 2025

Name and Version

llama-server --cache-reuse 1 ...

Operating systems

No response

Which llama.cpp modules do you know to be affected?

llama-server

Command line

Problem description & steps to reproduce

Bug in cache reuse: when the llama_kv_cache_seq_rm function is used, the positions of the tokens after head_c have already been offset by the preceding KV shift. If head_c is not adjusted to account for that shift, subsequent removals operate on stale positions and delete valid tokens. Here is a step-by-step walkthrough (see the code sketch after the list, which reproduces this trace):

  1. Initial KV Cache State:

    Cache Tokens:    a b c d e f g h j
    Cell Positions:  0 1 2 3 4 5 6 7 8
    New Tokens:      a b e f h j
                     0 1 - - - -
    
  2. First Operation:

    • head_p is set to 2, and head_c is also set to 2.
    • The matching token 'e' is found at cell 4, so head_c is updated to 4; 'e f' match, so n_match is set to 2.
    • kv_shift is set to head_p - head_c = 2 - 4 = -2.
    • Tokens with positions in [head_p, head_c) = [2, 4) (tokens 'c', 'd') are removed:

      Cache Tokens:    a b c d e f g h j
      Cell Positions:  0 1 - - 4 5 6 7 8

    • The positions of the remaining tokens from head_c onward are updated by adding kv_shift (-2):

      Cache Tokens:    a b c d e f g h j
      Cell Positions:  0 1 - - 2 3 4 5 6

    • head_p is updated to head_p + n_match (2 + 2 = 4).
    • head_c is updated to head_c + n_match (4 + 2 = 6).
  3. Second Operation:

    • head_p is 4, and head_c is 6.

    • The matching token 'h' is found at cell 7, so head_c is updated to 7; 'h j' match, so n_match is set to 2.

    • Tokens with positions in [head_p, head_c) = [4, 7) are removed. Because of the earlier shift, positions 4, 5, 6 are now occupied by the tokens 'g', 'h', 'j', so all three are removed:

      Cache Tokens:    a b c d e f g h j
      Cell Positions:  0 1 - - 2 3 - - -

    • After this operation, the valid tokens 'h' and 'j' are removed along with 'g', because their positions were already offset by the first shift; only 'g' should have been removed.

This demonstrates how improper handling of kv_shift and head_c updates can lead to the unintended removal of valid tokens in the KV cache.
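To make the failure mechanical, here is a minimal, self-contained C++ sketch that simulates the reuse loop on the example above. Cell, seq_rm, and seq_add are illustrative stand-ins for the KV cells and for llama_kv_cache_seq_rm/llama_kv_cache_seq_add, not llama.cpp's actual code:

```cpp
// Simulates the cache-reuse matching loop and reproduces the trace above.
// All names here are illustrative stand-ins, not llama.cpp's real code.
#include <cstdio>
#include <vector>

struct Cell {
    char token;
    int  pos;         // position currently stored in the cell
    bool live = true; // false once the cell has been cleared
};

// Stand-in for llama_kv_cache_seq_rm: clears cells whose position is in [p0, p1).
static void seq_rm(std::vector<Cell> & cache, int p0, int p1) {
    for (auto & c : cache) {
        if (c.live && c.pos >= p0 && c.pos < p1) {
            std::printf("  seq_rm clears '%c' (pos %d)\n", c.token, c.pos);
            c.live = false;
        }
    }
}

// Stand-in for the unbounded shift: adds delta to every position >= p0.
static void seq_add(std::vector<Cell> & cache, int p0, int delta) {
    for (auto & c : cache) {
        if (c.live && c.pos >= p0) {
            c.pos += delta;
        }
    }
}

int main() {
    const char * toks = "abcdefghj";
    std::vector<Cell> cache;
    for (int i = 0; toks[i] != '\0'; ++i) {
        cache.push_back({toks[i], i});
    }

    const std::vector<char> prompt = {'a','b','e','f','h','j'};

    // 'a b' already match the new prompt, so both heads start at 2.
    size_t head_p = 2;
    size_t head_c = 2;

    while (head_c < cache.size() && head_p < prompt.size()) {
        size_t n_match = 0;
        while (head_c + n_match < cache.size()  &&
               head_p + n_match < prompt.size() &&
               cache[head_c + n_match].token == prompt[head_p + n_match]) {
            n_match++;
        }

        if (n_match >= 1) { // --cache-reuse 1
            const int kv_shift = (int) head_p - (int) head_c;
            std::printf("op: head_p=%zu head_c=%zu n_match=%zu kv_shift=%d\n",
                        head_p, head_c, n_match, kv_shift);

            // BUG: head_p/head_c are token *indices*, but after the first
            // seq_add the cells' stored positions no longer equal their
            // indices, so this position range hits already-shifted cells.
            seq_rm (cache, (int) head_p, (int) head_c);
            seq_add(cache, (int) head_c, kv_shift);

            head_c += n_match;
            head_p += n_match;
        } else {
            head_c += 1;
        }
    }
    return 0;
}
```

Running it, the second operation prints seq_rm clearing 'g' (pos 4), 'h' (pos 5), and 'j' (pos 6): the matched tokens 'h' and 'j' are lost because the first shift already moved them into the removal range.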

First Bad Commit

No response

Relevant log output

Clauszy added a commit to Clauszy/llama.cpp that referenced this issue Mar 3, 2025
The first KV shift offsets the positions of all tokens after head_c.
When llama_kv_cache_seq_rm is called next with head_c, it removes valid tokens because their positions have already been offset.
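For illustration, here is a hedged sketch of one way to avoid the stale range, reusing the stand-ins from the sketch above (the referenced commit may differ in detail): bound the shift to the matched chunk [head_c, head_c + n_match) instead of shifting the whole tail, so cells beyond the match keep their original positions and head_c remains valid on the next iteration. Note that the real llama_kv_cache_seq_add already accepts a [p0, p1) position range.

```cpp
// Bounded stand-in for llama_kv_cache_seq_add: shifts only positions in [p0, p1).
static void seq_add_range(std::vector<Cell> & cache, int p0, int p1, int delta) {
    for (auto & c : cache) {
        if (c.live && c.pos >= p0 && c.pos < p1) {
            c.pos += delta;
        }
    }
}

// In the reuse loop, replace the unbounded shift with a bounded one:
//     seq_rm       (cache, (int) head_p, (int) head_c);
//     seq_add_range(cache, (int) head_c, (int) (head_c + n_match), kv_shift);
//
// With the example above, the first operation then moves only 'e f' to
// positions 2..3 while 'g h j' stay at 6..8, so the second seq_rm(4, 7)
// clears only 'g', and 'h j' are shifted to positions 4..5 afterwards.
```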