Repositories list
34 repositories
ZhiLight (Public): A highly optimized LLM inference acceleration engine for Llama and its variants.
- TLLM_QMM strips out the quantized kernel implementations from NVIDIA's TensorRT-LLM, removing the NVInfer dependency and exposing an easy-to-use PyTorch module. We modified the dequantization and weight preprocessing to align with popular quantization algorithms such as AWQ and GPTQ, and combined them with new FP8 quantization.
norm (Public)
griffith (Public): A React-based web video player
rucene (Public)
Matisse (Public): 🎆 A well-designed local image and video selector for Android
zetta-client-go (Public)
zetta-proto (Public)
SERank (Public): An efficient and effective learning-to-rank algorithm that mines information across ranking candidates. This repository contains the TensorFlow implementation of the SERank model. The code is based on TF-Ranking.
chaika (Public)
promate (Public): Graphite on VictoriaMetrics
- Fast implementation of BERT inference directly on NVIDIA (CUDA, cuBLAS) and Intel MKL
mirror (Public)
kids (Public)
zetta-client-java (Public)
cmdb (Public)
presto-connectors (Public)
tache (Public)
SugarAdapter (Public)
RxLifecycle (Public)
AndroidGodEye (Public)
hive (Public)
zhihu-rxjava-meetup (Public)
protobuf (Public)
phabricator (Public)
libphutil (Public)
arcanist (Public)