
Two leading AI researchers have proposed “experiential learning” as the next phase in artificial intelligence. Their theory is detailed in “The Era of Experience,” an excerpt from the forthcoming book “Designing an Intelligence” from MIT Press. David Silver and Richard S. Sutton describe their next-generation AI agents as the path to “superhuman intelligence.”
“In key domains such as mathematics, coding, and science, the knowledge extracted from human data is rapidly approaching a limit,” Silver and Sutton wrote.
Plus, generative AI can’t invent useful things or find “valuable new insights … beyond the current boundaries of human understanding.”
Who are these AI researchers?
Computer scientist David Silver was a key developer behind AlphaGo, the pivotal Go-playing program that defeated world champion Lee Sedol in 2016.
Richard S. Sutton, a prominent figure in reinforcement learning, developed several foundational algorithms for the field. In his 2019 essay "The Bitter Lesson," he argued that computer scientists should build in "meta-methods" — techniques that enable AI to learn from the "arbitrary, intrinsically complex, outside world" — rather than relying solely on structured human knowledge.
Redefining AI’s development into three distinct eras
Silver and Sutton divide the last decade of AI development into three eras. Under this model:
- AlphaGo and other systems trained in simulated environments belong to the Era of Simulation.
- GPT-3 marked the beginning of the Era of Human Data.
- The Era of Experience began in 2024 with AlphaProof, a reinforcement learning-based AI system developed by Google DeepMind.
They point out that AlphaProof reached silver-medal standard at the 2024 International Mathematical Olympiad through a reinforcement learning algorithm using "continual interaction with a formal proving system." Instead of teaching the model math directly, they trained it to seek the rewards that solving math problems would provide.
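Reinforcement learning of this kind optimizes for a reward signal rather than imitating labeled examples. The toy Q-learning loop below is an illustration of that idea only, not DeepMind's AlphaProof setup; the environment, state count, and hyperparameters are invented for the sketch. An agent on a five-step number line learns to walk to the goal purely from rewards, without ever being shown a "correct" move:

```python
import random

# Toy sketch (not DeepMind's actual system): an agent on a 5-state number
# line learns, purely from a reward signal, to walk right to the goal.
N_STATES = 5          # states 0..4; reaching state 4 yields reward 1
ACTIONS = [-1, +1]    # step left or step right
EPSILON = 0.1         # exploration rate
ALPHA = 0.5           # learning rate
GAMMA = 0.9           # discount factor

# Q-table: estimated future reward for each (state, action) pair
q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

random.seed(0)
for episode in range(500):
    state = 0
    while state != N_STATES - 1:
        # Explore occasionally; otherwise take the best-known action
        if random.random() < EPSILON:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])
        next_state = min(max(state + action, 0), N_STATES - 1)
        reward = 1.0 if next_state == N_STATES - 1 else 0.0
        # Q-learning update: adjust the estimate from experience
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
        state = next_state

# After training, the learned policy prefers +1 (move right) in every state
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)}
print(policy)
```

The point of the sketch is the shape of the feedback: the agent is never told which action is correct, only which outcomes are rewarded, and the useful behavior emerges from trial and error.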
The authors suggest AI learning could be reinforced by the world itself, whether through a world model simulation or by utilizing data such as profit, exam results, or energy consumption.
“This data must be generated in a way that continually improves as the agent becomes stronger; any static procedure for synthetically generating data will quickly become outstripped,” they wrote.
Future AI agents will pursue long-term goals
AI agents in the Era of Experience will differ from today's models in several ways:
- They will be able to persist long-term on “ambitious goals.”
- They will draw passively from their environment as well as directly from human input.
- They will be motivated by “their experience of the environment,” not “human judgement.”
- They will plan and reason about their own experiences, which are grounded in the environment rather than in their human users.
The AI they envision extends beyond "directly answering a user's question" to pursue long-term goals. Current AI models, by contrast, can at most remember users' preferences and carry context from earlier in a conversation into their answers.
The authors acknowledge the risks as well: job displacement, safety concerns when humans have fewer opportunities to intervene in an agent's actions, and future AI systems that are difficult to interpret.