Could we bring the whole TensorRT-LLM and TensorPipe projects to JavaCPP for large language model GPU inference with PyTorch? #1568

mullerhai · 2025-01-11T03:05:36Z

Hi,
With javacpp-pytorch, we are rewriting large language models such as LLaMA and DeepSeek in Java to run on multiple GPUs. TensorRT and Triton are not suited to that, so we want to use inference tools that support model-parallel (MP), data-parallel (DP), and pipeline-parallel (PP) model splitting, like DeepSpeed. However, DeepSpeed is written in Python, so we looked for other tools. So far we have found
NVIDIA TensorRT-LLM https://github.com/NVIDIA/TensorRT-LLM, and
TensorPipe https://github.com/torchpipe/torchpipe/tree/main
In both, the core kernel code is written in C++ and exposed through a Python API.

Thanks,

mullerhai · 2025-01-11T03:52:31Z

And OpenAI Triton: https://github.com/triton-lang/triton

saudet added the enhancement and help wanted labels on Jan 11, 2025
