@essos-bot
Contributor

Motivation

No. 52: add unit tests for the functional module fastdeploy/model_executor/guided_decoding/ernie_tokenizer.py

Modifications

No. 52: add unit tests for the functional module fastdeploy/model_executor/guided_decoding/ernie_tokenizer.py

Usage or Command

Not needed.

Accuracy Tests

None.

Checklist

  • Add at least one tag in the PR title.
    • Tag list: [FDConfig], [APIServer], [Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]
    • You can add new tags based on the PR content, but the semantics must be clear.
  • Format your code and run pre-commit before committing.
  • Add unit tests. If there are none, explain why in this PR.
  • Provide accuracy results.
  • If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

essos-bot and others added 2 commits November 15, 2025 14:17
Add test_ernie_tokenizer.py with unit tests covering:
- Tokenizer initialization with default and custom parameters
- Vocabulary size and token conversion methods
- Tokenization and decoding functionality
- Special tokens handling and sequence building
- Vocabulary saving and serialization
- Edge cases (empty text, Unicode, large vocab)
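The edge-case and conversion checks listed above can be sketched roughly as follows. This is only an illustration under stated assumptions: `SketchTokenizer` is a hypothetical wrapper invented for this example, not the class in ernie_tokenizer.py, and the backend is a `MagicMock` standing in for a SentencePiece-style processor, so the `encode`/`decode` calls here are assumed names on the mock rather than the PR's actual code.

```python
import unittest
from unittest.mock import MagicMock


class SketchTokenizer:
    """Hypothetical wrapper for illustration; not the FastDeploy class."""

    def __init__(self, sp_model):
        self.sp_model = sp_model

    def tokenize(self, text):
        # Empty input short-circuits, so the backend is never called.
        if not text:
            return []
        return self.sp_model.encode(text, out_type=str)

    def convert_tokens_to_string(self, tokens):
        return self.sp_model.decode(tokens)


class SketchTokenizerTest(unittest.TestCase):
    def setUp(self):
        # A fresh mock per test keeps call counts independent.
        self.sp = MagicMock()
        self.tok = SketchTokenizer(self.sp)

    def test_empty_text(self):
        self.assertEqual(self.tok.tokenize(""), [])
        self.sp.encode.assert_not_called()

    def test_unicode_text(self):
        self.sp.encode.return_value = ["▁你好", "▁世界"]
        self.assertEqual(self.tok.tokenize("你好 世界"), ["▁你好", "▁世界"])
        self.sp.encode.assert_called_once_with("你好 世界", out_type=str)

    def test_decode(self):
        self.sp.decode.return_value = "hello world"
        tokens = ["▁hello", "▁world"]
        self.assertEqual(self.tok.convert_tokens_to_string(tokens), "hello world")


def run_sketch_tests():
    """Run the sketch suite and return (tests run, all passed)."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SketchTokenizerTest)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.testsRun, result.wasSuccessful()
```

Mocking the backend keeps such tests independent of any model file on disk, which is one plausible reason the commits in this PR discuss mock expectations at all.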

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
@paddle-bot

paddle-bot bot commented Nov 15, 2025

Thanks for your contribution!

@paddle-bot paddle-bot bot added the contributor External developers label Nov 15, 2025
essos-bot and others added 3 commits November 15, 2025 14:33
This file appears to be a duplicate or misplaced; remove it to clean up the test structure.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
- Remove unused imports (numpy, various protocol classes)
- Fix unused variables by prefixing with underscore
- Format code with black and isort
- Address flake8 linting issues

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
- Update test_vocab_size_property to allow multiple get_piece_size calls
- Modify test_get_vocab to handle additional tokens beyond base vocab size
- Remove the incorrect IdToPiece mock expectation, since convert_ids_to_tokens uses a different implementation
- Fix assertion errors: get_piece_size was called 3 times rather than once, the vocab size was 1004 rather than 1000, and the IdToPiece call count did not match
- Ensure tests pass in both local and CI environments with proper mock behavior
- All tests now pass pre-commit checks (black, isort, flake8, ruff)
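The kind of mock-assertion relaxation described in this commit might look like the sketch below. It is an assumption-laden illustration, not the PR's test code: `ToyTokenizer` and `make_mock_sp` are hypothetical names, and the toy class happens to call `get_piece_size` twice, whereas the commit message reports three calls in the real implementation. Only `get_piece_size` and `IdToPiece` are names taken from the commit message.

```python
from unittest.mock import MagicMock


class ToyTokenizer:
    """Hypothetical stand-in for illustration; not the FastDeploy class."""

    def __init__(self, sp_model, added_tokens=()):
        self.sp_model = sp_model
        self.added_tokens = list(added_tokens)

    @property
    def vocab_size(self):
        # Each property access hits the backend again.
        return self.sp_model.get_piece_size()

    def get_vocab(self):
        vocab = {self.sp_model.IdToPiece(i): i for i in range(self.vocab_size)}
        base = self.vocab_size
        # Added tokens get ids after the base vocabulary.
        for offset, token in enumerate(self.added_tokens):
            vocab[token] = base + offset
        return vocab


def make_mock_sp(size):
    """Build a mock SentencePiece-style backend with `size` pieces."""
    sp = MagicMock()
    sp.get_piece_size.return_value = size
    sp.IdToPiece.side_effect = lambda i: f"piece_{i}"
    return sp


def check_vocab(base_size=1000, extra=("<a>", "<b>", "<c>", "<d>")):
    sp = make_mock_sp(base_size)
    tok = ToyTokenizer(sp, added_tokens=extra)
    vocab = tok.get_vocab()
    # Tolerant checks: do not pin the exact number of backend calls,
    # and allow the vocabulary to exceed the base size.
    assert sp.get_piece_size.call_count >= 1
    assert len(vocab) >= base_size
    return len(vocab), sp.get_piece_size.call_count
```

Asserting `call_count >= 1` instead of `assert_called_once()` keeps the test valid if the implementation reads the vocabulary size more than once, and comparing with `>=` accommodates tokens added on top of the base vocabulary.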

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>