LLM Wiki
A personal knowledge base for machine learning, large language models, and adjacent topics — inspired by Karpathy's LLM Wiki, but broader in scope.
Concepts explained clearly, with code, math, and references.
Browse by Category
- Fundamentals: Core concepts of machine learning and neural networks (2 articles)
- Architectures: Transformers, RNNs, CNNs, and beyond (2 articles)
- Training: Training techniques, optimization, and fine-tuning strategies (1 article)
- Applications: Real-world applications, RAG, agents, and tooling (1 article)
Recent Articles
- Applications
Retrieval-Augmented Generation
Combining retrieval systems with generative models to produce accurate, grounded responses that go beyond the training data.
- Training
Fine-tuning LLMs
Adapting pre-trained LLMs to specific tasks: instruction tuning, RLHF, LoRA, and QLoRA.
- Architectures
Attention Mechanisms
A deep dive into attention mechanisms: scaled dot-product attention, cross-attention, and FlashAttention.
- Architectures
Transformer Architecture
The "Attention Is All You Need" paper revolutionized NLP. This article explains the encoder-decoder structure, multi-head attention, and positional encodings.