@marc.kaz · Marc Kaz
Saved 2026-06-03 · Posted 2026-05-29 · Status: New
This repo proves you don’t need a datacenter or millions of dollars to train real LLMs.
Ships with:
• Full GPT-style model training from scratch
• Tokenization, architecture & training loop
• Advanced techniques that run on consumer hardware
• Distributed-training techniques adapted to a single GPU
Everything is open source and explained step-by-step.
👉 https://github.com/FareedKhan-dev/train-llm-from-scratch
Who’s training their own billion-parameter model today? Drop a 🔥
Comments (15)
Yeah, and the output it generates is complete garbage - worse than GPT-2. Reminds me of Markov-chain gibberish from 15 years ago.
Why do people who don't know shit talk about AI and LLMs?
Been training ML models since 2017. Let me tell you something: scripts don't matter if your hardware and dataset are limited. Once you solve those two, then you can decide which scripts are best.
Idgi, I've been doing this in Claude.
Kinda wanna touch on this: it isn't new, it's been around since 2017-2019, and it's just his own version of the transformer. You're still heavily limited by hardware, and this isn't a hardware workaround. You're not gonna be able to train a good billion-parameter model on your 3090 or 4090 (keyword: good), so it's kinda overhyped with your wording, but still impressive for the dude.
Takes 300+ days
🔥
But.......
Bro, don't yap nonsense. This repo is about a small GPT-style transformer built to explain the fundamentals, not to train a 1-2B-parameter model. I've made a similar project repo on GitHub, built from scratch, and it covers the architecture, tokenization, self-attention, next-token prediction, training, and the theory behind how GPT-like models work. If anyone wants to learn the internals rather than just use APIs, feel free to check it out.
You still need like a 3090
🔥
🔥
🔥
🔥
Is this ragebait? Please be ragebait