train-llm-from-scratch
Developed by FareedKhan-dev
This project is a complete, from-scratch implementation of a Transformer language model in pure PyTorch. Based on the 'Attention Is All You Need' architecture, it lets users pretrain LLMs ranging from millions to billions of parameters. It includes a fully hand-written post-training and alignment pipeline covering SFT, reward modeling, PPO, DPO, and GRPO, without relying on high-level wrappers. It also includes a Streamlit UI for training, evaluation, and chatting.
- Pure PyTorch Transformer implementation from scratch
- Full modern alignment suite (SFT, RM, PPO, DPO, GRPO)
- Scalable from 13M to billion-parameter LLMs
- Multi-GPU training via DDP, with bf16 precision support
- Built-in Streamlit UI for training and interaction
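
To give a feel for how a model scales from millions to billions of parameters, here is a minimal sketch of a parameter estimate for a decoder-only Transformer. It is not taken from the project: the function name `estimate_params` and the example configurations are hypothetical. It assumes a standard block (four attention projections plus a two-layer MLP) and ignores biases, normalization weights, and positional embeddings.

```python
def estimate_params(vocab_size, d_model, n_layers, d_ff=None, tie_embeddings=True):
    """Rough weight count for a decoder-only Transformer.

    Assumption: each block has Q, K, V, and output projections plus a
    two-layer MLP. Biases, norm weights, and positional embeddings are ignored.
    """
    if d_ff is None:
        d_ff = 4 * d_model  # common default, assumed here

    embedding = vocab_size * d_model        # token embedding table
    attention = 4 * d_model * d_model       # Q, K, V, and output projections
    mlp = 2 * d_model * d_ff                # up and down projections
    total = embedding + n_layers * (attention + mlp)

    if not tie_embeddings:
        total += vocab_size * d_model       # separate output (LM head) matrix
    return total


if __name__ == "__main__":
    # Hypothetical configurations, chosen only to illustrate the scaling.
    configs = [
        {"vocab_size": 50257, "d_model": 128, "n_layers": 4},
        {"vocab_size": 50257, "d_model": 768, "n_layers": 12},
        {"vocab_size": 50257, "d_model": 2048, "n_layers": 24},
    ]
    for cfg in configs:
        print(cfg, f"~{estimate_params(**cfg) / 1e6:.1f}M parameters")
```

At small sizes the embedding table dominates the count, while the per-layer term (roughly `12 * d_model**2` with the default `d_ff`) takes over as width and depth grow.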