train-llm-from-scratch

Developed by FareedKhan-dev

This project is a complete, from-scratch implementation of a Transformer language model in pure PyTorch. Built on the architecture from 'Attention Is All You Need', it lets users pretrain LLMs ranging from millions to billions of parameters. It also provides a fully hand-written post-training and alignment pipeline covering SFT, reward modeling, PPO, DPO, and GRPO, without relying on high-level wrappers, along with a Streamlit UI for training, evaluation, and chatting.

  • Pure PyTorch Transformer implementation from scratch
  • Full modern alignment suite (SFT, RM, PPO, DPO, GRPO)
  • Scalable from 13M to billion-parameter LLMs
  • Multi-GPU training via DDP, with bf16 support
  • Built-in Streamlit UI for training and interaction
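
To give a sense of what a hand-written alignment step involves, here is a minimal sketch of the DPO loss for a single preference pair. It is written in plain Python for readability and is not taken from the project's code: the function name `dpo_loss`, its argument names, and the default `beta=0.1` are all assumptions made for illustration. The inputs are summed log-probabilities of the chosen and rejected responses under the policy being trained and under a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair (hypothetical helper, not the project's API)."""
    # How much more the policy likes each response than the reference does.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp

    # Scaled preference margin; positive when the policy favors the chosen response.
    margin = beta * (chosen_logratio - rejected_logratio)

    # -log(sigmoid(margin)) computed as softplus(-margin) in a numerically stable form.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# Policy identical to the reference: the margin is zero.
neutral = dpo_loss(-1.0, -1.0, -1.0, -1.0)

# Policy has shifted probability toward the chosen response.
improved = dpo_loss(-1.0, -3.0, -2.0, -2.0)

print(neutral, improved)
```

When the policy matches the reference the loss equals log 2, and it falls as the policy assigns relatively more probability to the chosen response. A real training loop would compute the same quantity on batched tensors so that gradients flow back into the policy.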