Andrej Karpathy on the Dawn of Software 3.0: Programming with Natural Language and AI Agents
Lecture · Y Combinator · 111,724 views · Jun 19, 2025
Exploring how large language models are transforming software development into a new era where English prompts become code and AI acts as a partially autonomous agent.
Key Points from Andrej Karpathy's AI Startup School Keynote
- Software is undergoing a fundamental shift into "Software 3.0," where large language models (LLMs) become programmable computers using natural language prompts.
- LLMs combine traits of utilities, fabs, and operating systems, representing a new computing era akin to the 1960s mainframe time-sharing model.
- These models behave like "people spirits," stochastic simulations with superhuman knowledge but cognitive quirks and fallibility.
- The future of software involves building partially autonomous applications with human-in-the-loop verification, using GUIs to audit AI-generated outputs efficiently.
- Programming in English democratizes software development, enabling "vibe coding" where anyone can create software with AI assistance.
- The ecosystem is evolving with tools like Cursor and Perplexity that orchestrate multiple LLM calls and provide autonomy sliders to control AI assistance levels.
- AI agents are emerging as a new class of digital information consumers and manipulators, requiring new infrastructure and documentation formats optimized for LLMs.
- Karpathy emphasizes cautious progress with AI autonomy, advocating for partial autonomy products rather than fully autonomous agents, likening the approach to an "Iron Man suit" augmentation.
- The talk highlights the vast opportunities and challenges ahead as developers rewrite software for this new paradigm.
Highlighted Clips
Introduction to Software 3.0
Karpathy introduces the concept of Software 3.0, where natural language prompts program large language models, marking a new era in software development.
LLMs as Utilities, Fabs, and Operating Systems
Explains how LLMs share characteristics with utilities, semiconductor fabs, and operating systems, drawing analogies to 1960s computing models.
The Psychology of LLMs: People Spirits
LLMs are described as stochastic simulations of people with encyclopedic knowledge but cognitive limitations and hallucinations.
Building Partially Autonomous Applications
Discusses the design of LLM-powered apps like Cursor and Perplexity that balance AI autonomy with human oversight using GUIs and autonomy sliders.
Introduction to Software Evolution and Software 3.0
Andrej Karpathy opens by emphasizing the fundamental shifts in software development, noting that software has changed twice rapidly in recent years after decades of relative stability. He introduces the concept of Software 3.0, where large language models (LLMs) represent a new kind of computer that is programmed using natural language, specifically English.
"Software is changing quite fundamentally again... LLMs are a new kind of computer, and you program them in English."
He contrasts the traditional software (Software 1.0) — explicit code written by humans — with Software 2.0, where neural networks are trained rather than explicitly coded. Now, with Software 3.0, the programming interface is natural language, making programming more accessible and fundamentally different.
Key points:
- Software 1.0: Traditional code written by programmers.
- Software 2.0: Neural networks trained via data and optimization.
- Software 3.0: LLMs programmed by natural language prompts.
- Programming in English is a revolutionary shift.
- The software ecosystem is rapidly evolving, requiring fluency in all paradigms.
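The three paradigms can be put side by side on one toy task. The following is an illustrative sketch, not code from the talk; the sentiment task, word lists, and prompt wording are all invented for contrast:

```python
# Sketch: the same task (sentiment labeling) in each paradigm.

# Software 1.0: explicit rules written by a programmer.
def sentiment_1_0(text: str) -> str:
    negative_words = {"bad", "terrible", "awful"}
    words = set(text.lower().split())
    return "negative" if words & negative_words else "positive"

# Software 2.0: behavior lives in learned weights (a toy linear
# model stands in for a real trained neural network).
weights = {"bad": -1.0, "terrible": -1.5, "good": 1.0, "great": 1.5}

def sentiment_2_0(text: str) -> str:
    score = sum(weights.get(w, 0.0) for w in text.lower().split())
    return "negative" if score < 0 else "positive"

# Software 3.0: the "program" is an English prompt for an LLM.
def sentiment_3_0_prompt(text: str) -> str:
    return (
        "Classify the sentiment of the following review as "
        f"'positive' or 'negative'.\n\nReview: {text}\nSentiment:"
    )
```

In 1.0 the programmer writes the logic; in 2.0 the logic is encoded in weights found by optimization; in 3.0 the prompt itself is the program, executed by an LLM.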
LLMs as Utilities, Fabs, and Operating Systems
Karpathy draws analogies between LLMs and historical computing infrastructure. He explains that LLMs share properties with utilities (like electricity), fabs (semiconductor fabrication plants), and operating systems.
"LLM labs... spend capex to train the LLMs... and then there's opex to serve that intelligence over APIs... we demand low latency, high uptime, consistent quality."
He highlights that LLMs are currently centralized and accessed via cloud APIs, similar to how electricity is distributed over a grid. Users can switch between LLM providers much as they might switch electricity suppliers, though because LLMs are software rather than physical infrastructure, multiple providers can coexist and compete directly.
Karpathy also compares LLMs to operating systems, where the LLM acts as a CPU, context windows as memory, and the entire system orchestrates compute and memory for problem-solving.
"LLMs are complicated operating systems... the LLM is a new kind of a computer... context windows are kind of like the memory."
He notes that this era resembles the 1960s in computing, where expensive centralized computers were accessed via time-sharing, and personal computing had not yet emerged.
Key points:
- LLMs are like utilities: centralized, metered, and reliable services.
- They require massive capital investment (capex) akin to fabs.
- LLMs function as operating systems with complex software ecosystems.
- The current model is centralized cloud-based time-sharing.
- Personal LLM computing is nascent but emerging.
- The ecosystem includes closed-source and open-source players, similar to Windows/macOS vs. Linux.
The Psychology of LLMs: "People Spirits"
Karpathy introduces a vivid metaphor: LLMs as "people spirits," stochastic simulations of human-like behavior generated by autoregressive transformers trained on vast human text data.
"LLMs = 'people spirits', stochastic simulations of people... they have a kind of emergent psychology."
He discusses their strengths and weaknesses:
- Encyclopedic knowledge and memory far beyond any individual.
- Superhuman capabilities in some problem-solving domains.
- Cognitive deficits such as hallucinations, factual errors, and inconsistent self-knowledge.
- Jagged intelligence: excelling in some areas while making bizarre mistakes in others.
- Lack of long-term memory consolidation (they suffer from "anterograde amnesia"), meaning they don't learn or improve over time without retraining.
- Vulnerability to prompt injection and security risks.
Karpathy stresses the importance of understanding these traits to work effectively with LLMs.
Key points:
- LLMs simulate human-like thought but are fallible.
- They have vast knowledge but can hallucinate or err.
- They lack persistent memory and self-awareness.
- Security and prompt injection are real concerns.
- Effective collaboration requires managing their strengths and weaknesses.
Opportunities: Partial Autonomy and LLM Apps
Shifting to practical applications, Karpathy highlights the rise of partially autonomous applications powered by LLMs, using Cursor (an AI coding assistant) as a prime example.
"Many of you use Cursor... it has a traditional interface for manual work plus LLM integration for bigger chunks."
He outlines key features of successful LLM apps:
- LLMs handle complex context management.
- Orchestration of multiple LLM calls (e.g., embeddings, chat models).
- Application-specific GUIs that allow humans to audit and verify AI outputs visually.
- An "autonomy slider" that lets users control how much autonomy the AI has, from simple suggestions to full autonomous actions.
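These features can be outlined in a few lines of code. This is a hypothetical sketch, not Cursor's actual architecture; the `llm()` stub, autonomy levels, and `assist()` function are all invented for illustration:

```python
# Sketch of an LLM app that orchestrates model calls and exposes
# an "autonomy slider" gating how much the AI applies on its own.
from enum import IntEnum

def llm(prompt: str) -> str:
    """Stub for a chat-model call; a real app would call an API here."""
    return f"<model output for: {prompt[:30]}...>"

class Autonomy(IntEnum):
    SUGGEST = 1    # propose an edit; the human applies it
    EDIT_FILE = 2  # apply a single-file change for human review
    EDIT_REPO = 3  # change multiple files, still shown as a diff

def assist(task: str, context: list[str], level: Autonomy) -> dict:
    # One orchestration step: gather relevant context, call the
    # model, and gate how much is auto-applied by the autonomy level.
    relevant = [c for c in context if any(w in c for w in task.split())]
    draft = llm(f"Task: {task}\nContext: {relevant}")
    return {
        "draft": draft,
        "auto_apply": level >= Autonomy.EDIT_FILE,
        "scope": "repo" if level == Autonomy.EDIT_REPO else "file",
    }
```

The slider is just a gate on how much of the model's output is applied without a human in the loop; at every level the draft is still surfaced for verification.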
Karpathy stresses the importance of fast generation-verification loops and keeping the AI "on a leash" so that its output stays small enough for a human to verify, avoiding sweeping or unsafe changes.
"We have to keep the AI on the leash... it's not useful to get a diff of 10,000 lines of code all at once."
He compares this to his experience with Tesla's autopilot, where autonomy gradually increased but humans remained in the loop for safety.
Key points:
- Partial autonomy is the practical path forward.
- GUIs are essential for human oversight and fast verification.
- Autonomy sliders allow flexible control over AI assistance.
- Human supervision remains critical due to AI fallibility.
- Incremental, small changes are safer and more manageable.
Programming in English and the Rise of "Vibe Coding"
Karpathy celebrates the accessibility of programming via natural language, coining the term vibe coding to describe the intuitive, exploratory style enabled by LLMs.
"Everyone is a programmer because everyone speaks natural language like English."
He shares personal anecdotes about building apps without deep knowledge of traditional programming languages, enabled by LLMs. For example, he created an iOS app and a menu image generator app (MenuGen) quickly using natural language prompts.
"I can't actually program in Swift but I was able to build a super basic app in a day."
However, he notes that while generating code is easier, the surrounding infrastructure (authentication, deployment, payments) remains complex and time-consuming.
"The code was the easy part... the devops stuff was really hard and slow."
Key points:
- Natural language programming lowers barriers to software creation.
- Vibe coding enables rapid prototyping and experimentation.
- Infrastructure and deployment remain bottlenecks.
- The future may see more tools to automate these surrounding tasks.
Building for Agents: The New Consumers of Digital Information
Karpathy introduces the idea that LLMs and AI agents are a new class of digital consumers and manipulators, alongside humans (via GUIs) and programs (via APIs).
"Agents are humanlike computers... 'people spirits' on the internet that need to interact with software infrastructure."
He proposes new conventions like llms.txt files (analogous to robots.txt) to communicate with LLMs about website content and behavior, making it easier for agents to understand and interact with digital resources.
"We can just directly speak to the LLM... a huge amount of documentation is currently written for people, not LLMs."
Karpathy highlights early adopters such as Vercel and Stripe, which provide documentation in LLM-friendly Markdown and replace ambiguous instructions ("click here") with actionable commands (e.g., curl commands) that agents can execute.
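To make the idea concrete, here is a hypothetical sketch of an agent-oriented documentation file in the style Karpathy describes; all domain names, endpoints, and commands are invented placeholders, not from the talk:

```markdown
# ExampleDocs: a guide written for LLM agents, not humans

## Creating an API key
Instead of "click the Settings button", give an executable command:

    curl -X POST https://api.example.com/v1/keys \
         -H "Authorization: Bearer $API_TOKEN"

## Full documentation
Plain-Markdown mirrors of the human-oriented docs:
- https://example.com/docs/quickstart.md
- https://example.com/docs/api-reference.md
```

The point is the register: declarative, machine-actionable, and free of GUI-only instructions an agent cannot follow.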
He also praises tools that transform GitHub repositories into LLM-readable formats, enabling agents to understand and query codebases effectively.
Key points:
- LLMs require new infrastructure and documentation formats.
- Markdown and explicit commands improve LLM comprehension.
- Agents will increasingly automate interactions with digital systems.
- Tools exist to convert human-oriented data into LLM-friendly formats.
- Meeting LLMs halfway accelerates adoption and utility.
Conclusion: A New Era of Software and Collaboration
Karpathy closes by reflecting on the unprecedented opportunity to rewrite vast amounts of software in this new paradigm.
"An amazing time to get into the industry... LLMs are like utilities, fabs, and operating systems all at once."
He reiterates the importance of understanding LLMs as fallible "people spirits" and adapting software infrastructure accordingly. He envisions a future where autonomy sliders allow gradual increases in AI independence, akin to an Iron Man suit that augments human capabilities.
"We're going to take the slider from left to right... less Iron Man robots, more Iron Man suits."
He expresses excitement about building this future together.
Key points:
- Software 3.0 is a major shift requiring new skills and mindsets.
- LLMs are powerful but imperfect collaborators.
- Infrastructure and tooling must evolve to support LLMs and agents.
- Autonomy will increase gradually with human oversight.
- The future blends augmentation and autonomy in software.
This detailed breakdown captures Andrej Karpathy’s narrative style and key messages, illustrating the profound transformation underway in software development driven by large language models and natural language programming.
Key Questions
What is Software 3.0?
Software 3.0 is the new era of software development where large language models are programmed using natural language prompts, effectively making English the programming language.