train-llm-from-scratch

Developed by FareedKhan-dev

ABOUT

This project is a complete, from-scratch implementation of a Transformer language model in pure PyTorch. Based on the 'Attention Is All You Need' architecture, it lets users pretrain LLMs ranging from millions to billions of parameters. It also includes a hand-written post-training and alignment pipeline (SFT, reward modeling, PPO, DPO, and GRPO) that does not rely on high-level wrappers, plus a Streamlit UI for training, evaluation, and chat.
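As an illustration of one alignment objective named above, here is a minimal sketch of the DPO loss for a single preference pair. It is written in plain Python on scalar log-probabilities so it runs without dependencies; the project itself works in PyTorch on batched tensors. The function name `dpo_loss`, its argument names, and the default `beta=0.1` are assumptions for this example and are not taken from the repository.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) pair of sequence log-probabilities.

    Hypothetical helper for illustration; inputs are summed token
    log-probabilities under the trainable policy and the frozen reference.
    """
    # How much more the policy likes each response than the reference does.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp

    # Scaled preference margin: positive when the policy favors "chosen".
    margin = beta * (chosen_logratio - rejected_logratio)

    # loss = -log(sigmoid(margin)) = log(1 + exp(-margin)),
    # computed in a numerically stable way for either sign of the margin.
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Usage: the loss equals log(2) when the policy matches the reference,
# and drops as the policy shifts probability toward the chosen response.
baseline = dpo_loss(-2.0, -2.0, -2.0, -2.0)
improved = dpo_loss(-1.0, -3.0, -2.0, -2.0)
```

In a tensor implementation the same quantity is usually computed with a log-sigmoid over the batch and then averaged.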

CAPABILITIES

Pure PyTorch Transformer implementation from scratch
Full modern alignment suite (SFT, RM, PPO, DPO, GRPO)
Scales from 13M-parameter models to billion-parameter LLMs
Multi-GPU training via DDP, with bf16 precision support
Built-in Streamlit UI for training and interaction
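
To make the model-size range in the list above more concrete, the following sketch estimates the parameter count of a decoder-only Transformer from a few hyperparameters. It is a rough estimate under stated assumptions (no bias terms in the linear layers, two LayerNorms per block, an MLP with hidden width `d_ff`, optional learned positional embeddings, optionally tied input and output embeddings). The function name `estimate_params` and the example configuration are hypothetical and do not reflect the project's actual configs.

```python
def estimate_params(vocab_size, d_model, n_layers, d_ff=None,
                    context_len=0, tie_embeddings=True):
    """Approximate parameter count of a decoder-only Transformer.

    Assumptions (illustrative only): linear layers have no biases,
    each block has two LayerNorms with weight and bias, and d_ff
    defaults to 4 * d_model.
    """
    if d_ff is None:
        d_ff = 4 * d_model

    # Token embeddings plus optional learned positional embeddings.
    embed = vocab_size * d_model + context_len * d_model

    attn = 4 * d_model * d_model   # Q, K, V, and output projections
    mlp = 2 * d_model * d_ff       # up- and down-projection
    norms = 2 * 2 * d_model        # two LayerNorms, weight + bias each
    per_layer = attn + mlp + norms

    final_norm = 2 * d_model
    # A separate output head is only counted when embeddings are untied.
    head = 0 if tie_embeddings else vocab_size * d_model

    return embed + n_layers * per_layer + final_norm + head


# Usage with a small hypothetical configuration.
small = estimate_params(vocab_size=1000, d_model=64, n_layers=2)
print(f"{small:,} parameters")
```

Since attention and MLP weights grow with the square of `d_model`, widening the model raises the count much faster than adding layers, which is the usual lever for moving from millions to billions of parameters.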

SUPPORTED PLATFORMS

Desktop, Web

EXTERNAL RESOURCES

GitHub Repository