Hi,
With javacpp-pytorch, we are developing large language models rewritten in Java (e.g. LLaMA, DeepSeek) so they can run on multiple GPUs. TensorRT and Triton do not suit that use case, so we are looking for model-split inference tools that support model, data, and pipeline parallelism (MP/DP/PP). DeepSpeed would fit, but it is written in Python, so we searched for other tools and found:
NVIDIA TensorRT-LLM: https://github.com/NVIDIA/TensorRT-LLM
TorchPipe: https://github.com/torchpipe/torchpipe/tree/main
Both have their core kernel code written in C++ and expose a Python API.
Thanks,