LW - Open Problems in AIXI Agent Foundations by Cole Wyeth
Archived series ("Inactive feed" status)
When? This feed was archived on November 22, 2024 12:25.
Why? Inactive feed status. Our servers were unable to retrieve a valid podcast feed for a sustained period.
What now? You might be able to find a more up-to-date version using the search function. This series will no longer be checked for updates. If you believe this to be in error, please check if the publisher's feed link below is valid and contact support to request the feed be restored or if you have any other concerns about this.
I believe that the theoretical foundations of the AIXI agent and its variations are a surprisingly neglected and high-leverage area of agent foundations research. Though discussion of AIXI is pretty ubiquitous in A.I. safety spaces, underscoring AIXI's usefulness as a model of superintelligence, this discussion is usually limited to poorly justified verbal claims about its behavior, which are sometimes questionable or wrong. This includes, in my opinion, a serious exaggeration of AIXI's flaws.
For instance, in a recent post I proposed a simple extension of AIXI off-policy that seems to solve the anvil problem in practice - in fact, in my opinion it has never been convincingly argued that the anvil problem would occur for an AIXI approximation. The perception that AIXI fails as an embedded agent seems to be one of the reasons it is often dismissed with a cursory link to some informal discussion.
However, I think AIXI research provides a more concrete and justified model of superintelligence than most subfields of agent foundations [1]. In particular, a Bayesian superintelligence must optimize some utility function using a rich prior, requiring at least structural similarity to AIXI. I think a precise understanding of how to represent this utility function may be a necessary part of any alignment scheme on pain of wireheading.
And this will likely come down to understanding some variant of AIXI, at least if my central load-bearing claim is true: The most direct route to understanding real superintelligent systems is by analyzing agents similar to AIXI. Though AIXI itself is not a perfect model of embedded superintelligence, it is perhaps the simplest member of a family of models rich enough to elucidate the necessary problems and exhibit the important structure.
Just as the Riemann integral is an important precursor of Lebesgue integration, despite qualitative differences, it would make no sense to throw AIXI out and start anew without rigorously understanding the limits of the model. And there are already variants of AIXI that surpass some of those limits, such as the reflective version that can represent other agents as powerful as itself.
This matters because the theoretical underpinnings of AIXI are still very spotty and contain many tractable open problems. In this document, I will collect several of them that I find most important - and in many cases am actively pursuing as part of my PhD research advised by Ming Li and Marcus Hutter.
The AIXI (~= "universal artificial intelligence") research community is small enough that I am willing to post many of the directions I think are important publicly; in exchange I would appreciate a heads-up from anyone who reads a problem on this list and decides to work on it, so that we don't duplicate efforts (I am also open to collaborating).
The list is particularly tilted towards those problems with clear, tractable relevance to alignment OR philosophical relevance to human rationality. Naturally, most problems are mathematical. Particularly where they intersect recursion theory, these problems may have solutions in the mathematical literature that I am not aware of (keep in mind that I am a lowly second-year PhD student). Expect a scattering of experimental problems to be interspersed as well.
To save time, I will assume that the reader has a copy of Jan Leike's PhD thesis on hand. In my opinion, he has made much of the existing foundational progress since Marcus Hutter invented the model.
Also, I will sometimes refer to the two foundational books on AIXI as UAI = Universal Artificial Intelligence and Intro to UAI = An Introduction to Universal Artificial Intelligence, and the canonical textbook on algorithmic information theory Intro to K = An...
2447 episodes
All episodes
No new episodes will be published here. To keep listening to the EAF & LW, listen to this episode for instructions. 0:33
LW - Augmenting Statistical Models with Natural Language Parameters by jsteinhardt 16:41
LW - Glitch Token Catalog - (Almost) a Full Clear by Lao Mein 2:50:10
LW - Investigating an insurance-for-AI startup by L Rudolf L 26:00
LW - Applications of Chaos: Saying No (with Hastings Greer) by Elizabeth 3:39
LW - Work with me on agent foundations: independent fellowship by Alex Altair 6:20
LW - o1-preview is pretty good at doing ML on an unknown dataset by Håvard Tveit Ihle 3:14
EA - The Best Argument is not a Simple English Yud Essay by Jonathan Bostock 6:35
LW - Interested in Cognitive Bootcamp? by Raemon 2:05
LW - Laziness death spirals by PatrickDFarley 13:04
LW - We Don't Know Our Own Values, but Reward Bridges The Is-Ought Gap by johnswentworth 7:41
LW - AI #82: The Governor Ponders by Zvi 43:47
LW - Which LessWrong/Alignment topics would you like to be tutored in? [Poll] by Ruby 2:03
EA - What Would You Ask The Archbishop of Canterbury? by JDBauman 0:43
LW - [Intuitive self-models] 1. Preliminaries by Steven Byrnes 39:21