- Triton: writing custom Triton kernels for better performance; working on some larger kernel projects
- CUDA: GPU architecture fundamentals for a deeper understanding of kernels and Triton
- Deep Learning: computer vision, NLP, etc. : )
- Languages: Python, CUDA, C++
- Frameworks & Libraries: PyTorch, pandas, Matplotlib, Triton, mpi4py
- Tools & Platforms: GitHub, Docker, Vercel, Neovim, VS Code, Jupyter Notebook, AWS
- Machine Learning: proficient in statistical analysis, predictive modeling (regression, decision trees, random forests), and gradient-based methods (CatBoost, SGD), with a strong focus on optimization and accuracy.
- GPU Sanghathan: small-scale distributed training of sequential deep learning models, built on NumPy and MPI.
- CUDA kernels: writing CUDA kernels from scratch, from vec_add up to flash attention, plus model implementations from scratch.
- Flash Attention: implementation of flash attention in Triton.
- PaliGemma: implemented Google's PaliGemma vision-language model from scratch, following the paper.
- Transformer: implemented the Transformer language model from scratch, following the original paper.
- Mixture of Experts: Mixture of Experts (MoE) model with a focus on efficient routing and expert utilization.
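The top-k routing idea behind the MoE project can be sketched in plain NumPy. This is an illustrative sketch, not the project's actual code; the function name, shapes, and the renormalize-over-selected-experts choice are all assumptions:

```python
import numpy as np

def moe_forward(x, gate_w, experts, k=2):
    """Route each token to its top-k experts and mix their outputs.

    x: (tokens, d) inputs; gate_w: (d, n_experts) gating weights;
    experts: list of callables mapping a (d,) vector to a (d,) vector.
    """
    logits = x @ gate_w                              # (tokens, n_experts)
    # softmax over experts, stabilized by subtracting the row max
    probs = np.exp(logits - logits.max(-1, keepdims=True))
    probs /= probs.sum(-1, keepdims=True)
    topk = np.argsort(probs, -1)[:, -k:]             # top-k expert indices per token
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        sel = topk[t]
        w = probs[t, sel]
        w = w / w.sum()                              # renormalize over the chosen experts
        for e, wt in zip(sel, w):
            out[t] += wt * experts[e](x[t])          # weighted mix of expert outputs
    return out
```

Real MoE layers batch tokens per expert instead of looping token by token; the loop here just keeps the routing logic readable.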
Triton/CUDA kernels in my free time : )
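For flavor, here is the online-softmax trick that flash attention builds on, sketched in NumPy for a single query vector. This is a sketch of the idea, not the Triton kernel; the function names and the single-query restriction are assumptions:

```python
import numpy as np

def naive_attention(q, K, V):
    """Reference: full softmax over all scores at once."""
    s = K @ q
    p = np.exp(s - s.max())
    p /= p.sum()
    return p @ V

def online_attention(q, K, V, block=4):
    """Process keys/values in blocks, keeping a running max and
    normalizer so the full softmax is never materialized."""
    m = -np.inf                  # running max of scores seen so far
    l = 0.0                      # running softmax normalizer
    acc = np.zeros(V.shape[1])   # running weighted sum of values
    for i in range(0, K.shape[0], block):
        s = K[i:i + block] @ q
        m_new = max(m, s.max())
        scale = np.exp(m - m_new)          # rescale old state to the new max
        p = np.exp(s - m_new)
        l = l * scale + p.sum()
        acc = acc * scale + p @ V[i:i + block]
        m = m_new
    return acc / l
```

The two functions agree exactly; the blocked version is what lets flash attention keep everything in fast on-chip memory instead of writing the attention matrix to HBM.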