- Stanford University
- Plovdiv, Bulgaria
- https://www.linkedin.com/in/radostin-cholakov-bb4422146/
- @radi_cho
Stars
Official PyTorch implementation for "Large Language Diffusion Models"
KernelBench: Can LLMs Write GPU Kernels? - Benchmark with Torch -> CUDA problems
A library for advanced large language model reasoning
Delta-CoMe can achieve near-lossless 1-bit compression; accepted at NeurIPS 2024
[NeurIPS 2024] BLAST: Block Level Adaptive Structured Matrix for Efficient Deep Neural Network Inference
The Truth Is In There: Improving Reasoning in Language Models with Layer-Selective Rank Reduction
Compressing Large Language Models using Low Precision and Low Rank Decomposition
Free to use landing pages for SaaS developers, freelancers, agencies and businesses
Public repository for "The Surprising Effectiveness of Test-Time Training for Abstract Reasoning"
Domain Specific Language for the Abstraction and Reasoning Corpus
Diffusion on syntax trees for program synthesis
Official JAX implementation of Learning to (Learn at Test Time): RNNs with Expressive Hidden States
Universal LLM Deployment Engine with ML Compilation
Fast Matrix Multiplications for Lookup Table-Quantized LLMs
A collection of guides and examples for the Gemma open models from Google.
Course "Practical Introduction to Machine Learning with Python" at Sofia University
The official implementation of Self-Play Fine-Tuning (SPIN)
Official PyTorch repository for Extreme Compression of Large Language Models via Additive Quantization https://arxiv.org/pdf/2401.06118.pdf and PV-Tuning: Beyond Straight-Through Estimation for Ext…
Smoothly Manage Multiple LLMs (OpenAI, Anthropic, Azure) and Image Models (Dall-E, SDXL), Speed Up Responses, and Ensure Non-Stop Reliability.
Fast K-Medoids clustering in Python with FasterPAM