@wsttiger
Collaborator

Add comprehensive configuration support for TensorRT decoder in the real-time decoding system.

Add TensorRT Decoder Configuration Support for Real-Time Decoding

Summary

This PR adds comprehensive configuration support for the TensorRT decoder in the real-time decoding system, enabling users to configure TRT decoders through the standard decoder_config interface with full YAML serialization support.

Motivation

The TensorRT decoder previously lacked integration with the real-time decoding configuration system. This made it impossible to configure TRT decoders using YAML files or the standard decoder_config API that other decoders (nv-qldpc, sliding_window, etc.) support.

Changes

1. Core Configuration Struct

Added trt_decoder_config struct in decoding_config.h with five parameters:

  • onnx_load_path (string): Path to ONNX model file (builds TRT engine)
  • engine_load_path (string): Path to pre-built TensorRT engine (faster startup)
  • engine_save_path (string): Path to save built engine for reuse
  • precision (string): Inference precision mode (fp16, bf16, int8, fp8, tf32, best)
  • memory_workspace (size_t): Workspace memory size in bytes (default: 1GB)
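To make the shape of the five fields concrete, here is a minimal Python stand-in. It is not the actual `trt_decoder_config` binding: the class name `TrtDecoderConfigSketch`, the `validate` helper, and the choice to reject unknown precision strings are assumptions made only for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Precision modes as listed in this PR description.
KNOWN_PRECISIONS = {"fp16", "bf16", "int8", "fp8", "tf32", "best"}


@dataclass
class TrtDecoderConfigSketch:
    """Hypothetical stand-in for trt_decoder_config (not the real binding)."""

    onnx_load_path: Optional[str] = None    # ONNX model used to build an engine
    engine_load_path: Optional[str] = None  # pre-built TensorRT engine
    engine_save_path: Optional[str] = None  # where to save a built engine
    precision: Optional[str] = None         # one of KNOWN_PRECISIONS
    memory_workspace: Optional[int] = None  # workspace size in bytes

    def validate(self) -> None:
        # Assumption: rejecting unknown precision strings is a choice of this
        # sketch, not documented behavior of the real decoder.
        if self.precision is not None and self.precision not in KNOWN_PRECISIONS:
            raise ValueError(f"unknown precision: {self.precision!r}")


cfg = TrtDecoderConfigSketch(engine_load_path="/path/to/model.engine",
                             precision="fp16")
cfg.validate()
```

Unset fields stay `None` in this sketch, which matches the default-value behavior described in the tests section.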

2. Integration with Decoder Config System

  • Added trt_decoder_config to the decoder_config variant
  • Implemented to_heterogeneous_map() and from_heterogeneous_map() methods
  • Added YAML serialization support via MappingTraits<trt_decoder_config>
  • Integrated with set_decoder_custom_args_from_heterogeneous_map() for type dispatch
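The map conversions can be pictured with a small Python sketch. The real `to_heterogeneous_map()` and `from_heterogeneous_map()` are C++ methods, so everything below (the `TrtConfigSketch` class, the `to_map` and `from_map` helpers, omitting unset fields, and rejecting unknown keys) is an assumption used to illustrate a roundtrip, not the actual implementation.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict, Optional


@dataclass
class TrtConfigSketch:
    """Hypothetical, trimmed-down stand-in with three of the five fields."""

    engine_load_path: Optional[str] = None
    precision: Optional[str] = None
    memory_workspace: Optional[int] = None


def to_map(cfg: TrtConfigSketch) -> Dict[str, Any]:
    # Assumption: only fields that have been set are emitted.
    return {
        f.name: getattr(cfg, f.name)
        for f in fields(cfg)
        if getattr(cfg, f.name) is not None
    }


def from_map(m: Dict[str, Any]) -> TrtConfigSketch:
    # Assumption: keys that do not name a field are treated as errors.
    known = {f.name for f in fields(TrtConfigSketch)}
    unknown = set(m) - known
    if unknown:
        raise KeyError(f"unknown keys: {sorted(unknown)}")
    return TrtConfigSketch(**m)


original = TrtConfigSketch(precision="fp16", memory_workspace=1 << 30)
as_map = to_map(original)
restored = from_map(as_map)
```

A roundtrip through the map should give back an equal config, which is the same property the YAML roundtrip test checks one layer up.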

3. Python Bindings

  • Exposed trt_decoder_config to Python via pybind11
  • All fields are accessible as properties with proper type checking
  • Exported as cudaq_qec.trt_decoder_config()

4. Comprehensive Tests

Added 3 test functions covering:

  • Default value initialization (all fields default to None)
  • Field operations: set, get, clear (parameterized across all 5 fields)
  • YAML roundtrip: serialize to YAML and deserialize back
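The set/get/clear pattern that is parameterized across the five fields might look roughly like the sketch below. It uses a `SimpleNamespace` stand-in so it runs without `cudaq_qec`; the `make_config` and `check_set_get_clear` helpers, the example values, and the assumption that a field is cleared by assigning `None` are all hypothetical. With the real bindings, `make_config` would presumably return `qec.trt_decoder_config()` and the loop would be a `pytest.mark.parametrize` decorator.

```python
from types import SimpleNamespace

# One (field name, example value) pair per config field.
FIELD_CASES = [
    ("onnx_load_path", "/path/to/model.onnx"),
    ("engine_load_path", "/path/to/model.engine"),
    ("engine_save_path", "/path/to/saved.engine"),
    ("precision", "fp16"),
    ("memory_workspace", 1 << 30),
]


def make_config():
    # Hypothetical stand-in: every field starts out as None.
    return SimpleNamespace(**{name: None for name, _ in FIELD_CASES})


def check_set_get_clear(name, value):
    cfg = make_config()
    assert getattr(cfg, name) is None    # default
    setattr(cfg, name, value)
    assert getattr(cfg, name) == value   # set, then get
    setattr(cfg, name, None)             # assumption: None clears the field
    assert getattr(cfg, name) is None


# Stands in for pytest parametrization over the five fields.
for field_name, field_value in FIELD_CASES:
    check_set_get_clear(field_name, field_value)
```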

Usage Example

Python:
import cudaq_qec as qec

# Create TRT decoder config
trt = qec.trt_decoder_config()
trt.engine_load_path = "/path/to/model.engine"
trt.precision = "fp16"

# Use with decoder_config
dc = qec.decoder_config()
dc.type = "trt_decoder"
dc.set_decoder_custom_args(trt)

# YAML roundtrip
yaml_str = dc.to_yaml_str()
dc2 = qec.decoder_config.from_yaml_str(yaml_str)

C++:
using namespace cudaq::qec::decoding::config;

// Create TRT decoder config
trt_decoder_config trt;
trt.engine_load_path = "/path/to/model.engine";
trt.precision = "fp16";

// Use with decoder_config
decoder_config dc;
dc.type = "trt_decoder";
dc.decoder_custom_args = trt;

// YAML roundtrip
auto yaml_str = dc.to_yaml_str();
auto dc2 = decoder_config::from_yaml_str(yaml_str);

Testing

All tests pass: ✅ 34/34 in test_decoding_config.py

Test breakdown:

  • 7 new TRT decoder config tests (3 functions, parameterized to 7 test cases)
  • 27 existing tests continue to pass
  • Coverage includes: initialization, field operations, type validation, YAML serialization

Run tests with:
pytest libs/qec/python/tests/test_decoding_config.py -k "trt_decoder" -v

Files Changed

Add comprehensive configuration support for TensorRT decoder in the
real-time decoding system.

Changes:
- Add trt_decoder_config struct with 5 parameters:
  - onnx_load_path: Path to ONNX model file
  - engine_load_path: Path to pre-built TensorRT engine
  - engine_save_path: Path to save built engine
  - precision: Inference precision mode (fp16, bf16, int8, fp8, etc.)
  - memory_workspace: Workspace memory size in bytes
- Add to decoder_config variant and integration with existing config system
- Implement to_heterogeneous_map() and from_heterogeneous_map() conversions
- Add YAML serialization support with MappingTraits template
- Add Python bindings exposing trt_decoder_config to cudaq_qec module
- Add comprehensive test suite (3 test functions, 7 test cases total):
  - Test default values initialization
  - Test field set/get operations (parameterized for all 5 fields)
  - Test YAML roundtrip serialization/deserialization

All tests pass (34/34 in test_decoding_config.py).

Files modified:
- libs/qec/include/cudaq/qec/realtime/decoding_config.h (+25, -1)
- libs/qec/lib/realtime/config.cpp (+40)
- libs/qec/python/bindings/py_decoding_config.cpp (+18)
- libs/qec/python/cudaq_qec/__init__.py (+1)
- libs/qec/python/tests/test_decoding_config.py (+78)

Signed-off-by: Scott Thornton <[email protected]>