Favorite Papers
Attention Is All You Need
#01
Vaswani et al.
NeurIPS
The death of RNNs and the birth of the parallelizable sequence model.
Language Models are Few-Shot Learners
#02
Brown et al.
NeurIPS
GPT-3 and the realization that scale is a quality of its own.
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
#03
Wei et al.
NeurIPS
A simple prompting trick that unlocked emergent logical capabilities.
Training language models to follow instructions with human feedback
#04
Ouyang et al.
NeurIPS
The core mechanics of RLHF; aligning predicted text with human intent.
Constitutional AI: Harmlessness from AI Feedback
#05
Bai et al.
Anthropic
Scalable oversight: using a model to supervise another model's safety.
Computing Machinery and Intelligence
#06
Alan Turing
Mind
The original question: 'Can machines think?' and the imitation game.
A Mathematical Theory of Communication
#07
Claude Shannon
The Bell System Technical Journal
Information entropy defined. The bedrock of every bit we transmit.
Mastering the game of Go with deep neural networks and tree search
#08
Silver et al.
Nature
AlphaGo and the triumph of MCTS paired with deep reinforcement learning.
Scaling Laws for Neural Language Models
#09
Kaplan et al.
OpenAI
The empirical predictability of loss as model size, data, and compute grow.
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
#10
Rafailov et al.
NeurIPS
Removing the explicit reward model and RL loop from RLHF for simpler alignment.
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
#11
Devlin et al.
NAACL
The masked language modeling revolution for context-aware embeddings.
Deep Residual Learning for Image Recognition
#12
He et al.
CVPR
ResNets and the identity mapping that enabled training networks with 1000+ layers.
Generative Adversarial Nets
#13
Goodfellow et al.
NeurIPS
The zero-sum game that redefined synthetic data generation.
LoRA: Low-Rank Adaptation of Large Language Models
#14
Hu et al.
ICLR
Fine-tuning models with billions of parameters by training only small low-rank update matrices.
Training Compute-Optimal Large Language Models
#15
Hoffmann et al.
DeepMind
Chinchilla: challenging the assumption that bigger is always better; data matters.
Tree of Thoughts: Deliberate Problem Solving with Large Language Models
#16
Yao et al.
NeurIPS
Enabling models to branch, look ahead, and backtrack during reasoning.
Voyager: An Open-Ended Embodied Agent with Large Language Models
#17
Wang et al.
ICLR
Agents that learn continuously in Minecraft through a code-based skill library.
Self-Instruct: Aligning Language Models with Self-Generated Instructions
#18
Wang et al.
ACL
Bootstrapping an instruction-tuned model from a raw base model.
LLaMA: Open and Efficient Foundation Language Models
#19
Touvron et al.
Meta AI
Democratizing state-of-the-art performance for the open-source community.
Zero-Shot Text-to-Image Generation
#20
Ramesh et al.
ICML
DALL·E: bridging the gap between conceptual text and high-fidelity visuals.
Deep Reinforcement Learning from Human Preferences
#21
Christiano et al.
NeurIPS
Teaching a model to backflip using only human preferences.
A General Reinforcement Learning Algorithm that Masters Chess, Shogi, and Go through Self-Play
#22
Silver et al.
Science
AlphaZero and the endgame of superhuman generalized game intelligence.
The Second Law of Thermodynamics
#23
Clausius / Kelvin
Historical
Entropy never decreases in an isolated system, hence the eventual heat death. My favorite physics constraint.
The Bitter Lesson
#24
Rich Sutton
Essay
The hard truth: compute-heavy methods eventually crush human-engineered cleverness.
ImageNet Classification with Deep Convolutional Neural Networks
#25
Krizhevsky et al.
NeurIPS
AlexNet: The Big Bang of the modern deep learning era.