LW - Open Problems in AIXI Agent Foundations by Cole Wyeth
I believe that the theoretical foundations of the AIXI agent and its variations are a surprisingly neglected and high-leverage approach to agent foundations research. Though discussion of AIXI is ubiquitous in AI safety spaces, underscoring AIXI's usefulness as a model of superintelligence, that discussion is usually limited to poorly justified verbal claims about its behavior, which are sometimes questionable or wrong. This includes, in my opinion, a serious exaggeration of AIXI's flaws.
For instance, in a recent post I proposed a simple off-policy extension of AIXI that seems to solve the anvil problem in practice - the worry that AIXI, modeling itself as separate from its environment, would fail to predict its own destruction (say, from dropping an anvil on the computer running it). In fact, in my opinion it has never been convincingly argued that the anvil problem would occur for an AIXI approximation. The perception that AIXI fails as an embedded agent seems to be one of the reasons it is often dismissed with a cursory link to some informal discussion.
However, I think AIXI research provides a more concrete and justified model of superintelligence than most subfields of agent foundations [1]. In particular, a Bayesian superintelligence must optimize some utility function using a rich prior, requiring at least structural similarity to AIXI. I think a precise understanding of how to represent this utility function may be a necessary part of any alignment scheme on pain of wireheading.
And this will likely come down to understanding some variant of AIXI, at least if my central load-bearing claim is true: the most direct route to understanding real superintelligent systems is by analyzing agents similar to AIXI. Though AIXI itself is not a perfect model of embedded superintelligence, it is perhaps the simplest member of a family of models rich enough to elucidate the necessary problems and exhibit the important structure.
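For concreteness, here is the standard expectimax formulation of AIXI's action selection from Hutter's UAI book (a sketch; notation and the treatment of the horizon m vary slightly across the literature):

\[
a_t \;=\; \arg\max_{a_t} \sum_{o_t r_t} \cdots \max_{a_m} \sum_{o_m r_m} \big[ r_t + \cdots + r_m \big] \sum_{\nu \in \mathcal{M}} 2^{-K(\nu)}\, \nu(o_1 r_1 \ldots o_m r_m \mid a_1 \ldots a_m),
\]

where \(\mathcal{M}\) is the class of lower semicomputable chronological semimeasures, \(K(\nu)\) is (roughly) the length of the shortest program computing \(\nu\), and \(m\) is the horizon. The "structural similarity" claim above is that any Bayesian superintelligence shares this shape: an expected-utility maximizer over a rich prior mixture of environments.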
Just as the Riemann integral is an important precursor of Lebesgue integration, despite qualitative differences, it would make no sense to throw AIXI out and start anew without rigorously understanding the limits of the model. And there are already variants of AIXI that surpass some of those limits, such as the reflective version that can represent other agents as powerful as itself.
This matters because the theoretical underpinnings of AIXI are still very spotty and contain many tractable open problems. In this document, I will collect several of them that I find most important - and in many cases am actively pursuing as part of my PhD research advised by Ming Li and Marcus Hutter.
The AIXI (~= "universal artificial intelligence") research community is small enough that I am willing to post many of the directions I think are important publicly; in exchange, I would appreciate a heads-up from anyone who reads a problem on this list and decides to work on it, so that we don't duplicate efforts (I am also open to collaboration).
The list is particularly tilted towards problems with clear, tractable relevance to alignment OR philosophical relevance to human rationality. Naturally, most problems are mathematical. Particularly where they intersect recursion theory, these problems may have solutions in the mathematical literature that I am not aware of (keep in mind that I am a lowly second-year PhD student). Expect a scattering of experimental problems to be interspersed as well.
To save time, I will assume that the reader has a copy of Jan Leike's PhD thesis on hand. In my opinion, he is responsible for much of the foundational progress made since Marcus Hutter invented the model.
Also, I will sometimes refer to the two foundational books on AIXI as UAI = Universal Artificial Intelligence and Intro to UAI = An Introduction to Universal Artificial Intelligence, and to the canonical textbook on algorithmic information theory as Intro to K = An Introduction to Kolmogorov Complexity and Its Applications.