Tegmark's Mathematical Universe Hypothesis without Simplicity
2025-05-18
Epistemic Status: Extraordinarily speculative, might fall apart the moment someone with more knowledge of computational complexity theory reads this. Treat this as a math fan-fic.
Max Tegmark's M.U.H. is pretty great. Quoting a synopsis from ACX:
> Tegmark's hypothesis says: all possible mathematical objects exist.
>
> Consider a mathematical object like a cellular automaton - a set of simple rules that creates complex behavior. The most famous is Conway’s Game of Life; the second most famous is the universe. After all, the universe is a starting condition (the Big Bang) and a set of simple rules determining how the starting condition evolves over time (the laws of physics).
>
> Some mathematical objects contain conscious observers. Conway’s Life might be like this: it’s Turing complete, so if a computer can be conscious then you can get consciousness in Life. If you built a supercomputer and had it run the version of Life with the conscious being, then you would be “simulating” the being, and bringing it into existence. There would be something it was like to be that being; it would have thoughts and experiences and so on.
>
> Tegmark argues this is also true if you don’t build the supercomputer and run it. The fact that the version of Life with the conscious being exists in possibility-space is enough for the being to in fact be experiencing it.
He continues (apologies for the long quote):
> By existing, you are a random draw from the set of possible conscious beings. You can’t make a random draw from an infinite set, but the accepted solution is some kind of measure weighted by simplicity. So even though every possible mathematical object exists, simpler ones exist more. Most conscious beings exist in very simple universes, ones that (like Life) are just a few short rules which produce surprisingly complex behavior.
I don't actually have a problem with a simplicity-weighting. It does introduce a degree of arbitrariness, as you need to define some basis of computation in which to measure your Kolmogorov complexity, but:
- The choice of a basis is a lot less arbitrary than any `python` vs `haskell` vs `language-that-describes-westeros-in-the-bit-zero,-and-everything-else-in-a-bijilion-bits` comparison might suggest, since there exist some very elegant bases of computation that have the vibe of being fundamental. (I'm partial to binary lambda calculus; there are others. Loose criteria: "would an alien civ reinvent this?")
- You can typically emulate any one basis of computation using a very small number of bits in any other, so if you're looking to compare the $K$-complexities of universes described by a rather large number of bits, the exact basis you choose doesn't matter that much.
That is to say, if you're interested in comparing the complexities of systems $A$ and $B$ in the language `Binary Turing Machine`, where $A$ has a very concise representation in `Forth` and $B$ a very concise representation in `Brainfuck`, you can just prepend a `Forth` interpreter to the front of $A$ and a `Brainfuck` interpreter to the front of $B$, and proceed with only a small constant $K$-penalty compared to measuring each within a more natural basis (the bound is written out just below). Ignoring pathological counterexamples (looking at you, Malbolge), this seems to be a general property of computation: concise emulators between reasonable languages seem to exist.
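This is just the invariance theorem for Kolmogorov complexity, stated loosely (my gloss of the standard bound, not a quote from anywhere):

$$K_{L_1}(x) \le K_{L_2}(x) + c_{L_1 \to L_2}$$

where $c_{L_1 \to L_2}$ is the length of an interpreter for $L_2$ written in $L_1$: a constant that depends on the pair of languages, but not on the program $x$ being measured.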
Even still, it's an inelegant corner of what's otherwise the best metaphysics I've ever seen.
Probably the Correct Solution:
Personally, I fully expect we'll develop something like a Jeffreys prior for Kolmogorov complexity - invariant under a change in coordinates from one basis of computation to another - and the problem will evaporate.[^1]
Something Less Correct, More Fun:
Here's where it gets a bit silly.
What if we could draw from an infinite set? Selecting a random item from $\{1, 2, 3, 1, 2, 3, 1, 2, 3, \ldots\}$ will yield you one of $1$, $2$, or $3$ with equal probability.
Slight problem: sets can't contain duplicate items. The fix is easy enough:

$$\{1,\ 2,\ 3,\ 1.1,\ 2.1,\ 3.1,\ 1.01,\ 2.01,\ 3.01,\ \ldots\}$$

Taking a random draw $x$ from this set, and defining some small $\epsilon$, you can be arbitrarily confident that $|x - n| < \epsilon$ for some $n \in \{1, 2, 3\}$. I'm comfortable saying this means our random draw is equal to $1$, $2$, or $3$ with probability $\frac{1}{3}$.[^2]
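A quick sanity check of that claim (a minimal sketch; the nudge-by-powers-of-ten construction and the truncation depth are illustrative choices of mine):

```python
import random
from fractions import Fraction

# A finite truncation of the de-duplicated set: 1, 2, 3, plus copies
# nudged by 10^-1, 10^-2, ..., 10^-depth. Fractions keep every element
# exactly distinct, while they still cluster around 1, 2, and 3.
def truncated_set(depth):
    base = [Fraction(n) for n in (1, 2, 3)]
    nudged = [Fraction(n) + Fraction(1, 10 ** d)
              for n in (1, 2, 3) for d in range(1, depth + 1)]
    return base + nudged

elements = truncated_set(depth=50)
draws = [round(random.choice(elements)) for _ in range(100_000)]
for n in (1, 2, 3):
    print(n, draws.count(n) / len(draws))  # each hovers around 1/3
```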
Imagine, then, that it just so happened that a non-vanishing fraction of arbitrarily complex programs had very simple automata encoded in their limiting behaviour. As an intuition pump, the program
```python
s = r'''
TFJ9^M;`cB6jB7VHKWhSjCL0j]H<A1V480Z\A>>L
B\SHKD@QIaClgm7SOJ9gX2PY0PFPE`BW;^I5\A0f
7Rg[?6aI<JD_o0mc7c5J4dfgUUkB^e?\QA3=@AWd
02@4mP0ZJ18dJR\:mZ7Cn9m>hQm\O8PLTd^>NB:9
FBXj^;[iHh]6S=FZDY_G2[62AnVOFBg9c_YlB3oW
5ZXi=?X2a9e7b`^KDP=L:@oL`f;;:lS80eR@M^Z^
abWEB@7FK;\je;fKgKJK3AQdAb34gB_JIDFWj^l[
n\Y^cT[_67X;iBJOUj\CHKae8VUc0eJHJ6m\O1FE
Q`=M_<UX3DlZ5?nV>fSE=;A5i[;L[O\]VjQSh9ia
LmFFGH5Fh3IQ\n7^<^0XiJcE9YC\g[RC6Y?ok:Vn
jQ4Xc`gij=SP41CIQV=8ZK^ABEmTII_7H0oPGP_@
RjC<X?8K8X=BXl3jl3GAj14S]UT:b?@PkQBDh[EU
\^;<:O^b3PHI9BhbN;BfXM8ehVj_9QE>Hnl3P`FN
;0<2j1?XO_g6BY=ghTH7kG3aieA^Ej<gH[1C4I91
.........
k;]LmfQan?ObXag`2EbTS7k3f9LDOTYlEL94^3@5
T?]JFi_h`UPHCiX^G:VLjmR8RE1R>Q6lb=]]\>3V
^2c2=H^lT^JS;gdTBHElDLF]H@[G9jlYeCI[=XN`
VejdOH?MPj2XSi3=dXo\laJ4cj7Q4`2\:J_a\BKA
546na<PH\WJFF5V:IcAY^V2PTbAMST3oiMZ7hmJX
\2\ndX6>XHV3@m]ZM5keAYeQoDfRbDbGF`c24L>`
7d5iblc@376H==617a@K1Io^=akZYX7US;feTiJ;
G:LgRQ6Po<AA?^LcWRR1D0\@ml>T=Flo>Y^l0?Md
PHS^51f:iS65D>k<A4o\Cd_QblDNI?UM>?nKEPeC
QKb7jF\=[Tl;fNCk_TZma[EU4\YMX__3TRPO0K3=
A:_DV\[16IRO8VI<@CC^ol[6o:n^m5;ah;F1PdMI
^Fj2@DC7kF0moL8WfQAVm9[@]ieMJ=E;eF4gZbK4
O3jdSD=o]\H=XgTRP?gkVE5D0GFQB?9M@Z7]E5e0
'''
x = 0
for c in s.replace('\n', ''):
    x = (x << 6) + (ord(c) - ord('0'))  # decode the blob as base-64 digits

while True:
    if x & 1 == 0:
        x = x // 2       # even: halve
    else:
        x = 3 * x + 1    # odd: triple and add one (the Collatz map)
```
can have an arbitrarily high $K$-complexity (depending on what we insert for the `...`), but after a finite preamble it's conjectured to end up in the $4 \to 2 \to 1$ loop forever.[^3]
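Every seed anyone has ever tested does collapse. A minimal empirical check (the step budget and the random 2000-bit seed are arbitrary choices of mine):

```python
import random

def steps_to_trivial_loop(x, max_steps=10**6):
    """Iterate the Collatz map until we reach 1 (and hence the
    4 -> 2 -> 1 loop), or give up after max_steps."""
    for steps in range(max_steps):
        if x == 1:
            return steps  # from here it's 1 -> 4 -> 2 -> 1 forever
        x = x // 2 if x % 2 == 0 else 3 * x + 1
    return None  # undecided within budget; the conjecture is still open

seed = random.getrandbits(2000) | 1   # a ~600-digit odd number
print(steps_to_trivial_loop(seed))    # typically collapses in ~10^4 steps
```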
This is cheating in a number of ways, though eliding the complexity hidden in Python's big-int implementation might not be one of them: there's no reason a simple conceptual language can't natively support Nats. Of greater concern:
- The limiting system we end up with isn't Turing complete.
- This doesn't really capture the essence of "limiting behaviour of an infinitely complex program". It implies we would require that infinitely complex programs eventually settle into some simple regime...
Instead of a program running for a very long time and eventually settling into some simpler terminal state, it might be more useful to think of an incomprehensibly complex program transitioning between broad regimes of behaviour as it goes about its incomprehensibly complex work, with the "limiting behaviour" as the state-machine that describes these macro regime-transitions.[^4]
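A minimal sketch of what "extract the macro state-machine" could mean operationally. The SHA-256 iteration standing in for an incomprehensibly complex program, and the mod-4 coarse-graining, are arbitrary choices of mine; for most such choices you recover pure noise, which is rather the point - the hope is that some coarse-grainings of some programs yield small, structured machines.

```python
from collections import Counter
import hashlib

# Stand-in for an incomprehensibly complex program: iterate SHA-256.
# The micro-dynamics are opaque; we only watch a coarse projection.
state = b"seed"
trace = []
for _ in range(20_000):
    state = hashlib.sha256(state).digest()
    trace.append(state[0] % 4)  # crude coarse-graining into 4 "regimes"

# The candidate macro object: empirical regime-transition counts.
transitions = Counter(zip(trace, trace[1:]))
for (a, b), n in sorted(transitions.items()):
    print(f"regime {a} -> regime {b}: {n}")
```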
However you choose to encode a simple machine inside the operations of an unfathomably complex one, to beings existing in a universe emulated by the simpler machine, it would appear as if their universe followed simple rules.
Thoughts on this:
- Doesn't this just shift the problem up a level, so we now need a simplicity prior to decide which simple programs are encoded more or less frequently in some properties of the operations of infinitely complex programs?

Yes! But the nature of this distribution now becomes a matter of empirics, rather than a first-order feature of our metaphysics.
- Wouldn't we expect most observers to be fleeting Boltzmann brains, existing in the very complex machines directly, instead of in our higher-level encoded simple machines?

Maybe not. Any being that exists directly in one particular infinitely complex program takes up $0$ percent of the set of all programs. It's $\frac{1}{\infty}$. By contrast, the behaviour of `Rule 110` seeded with `...0001000...` (see the sketch after this list) would exist, encoded in some way or another, over a non-vanishing fraction of the total program set. If there's no way to tell which infinitely complex program is emulating your simple universe, we might as well say your existence is spread over the entire non-vanishing fraction, in a simulation-capture-ish way.
-
A basis shift - changing the underlying computation language - changes the distribution of what programs are encoded in few bits, and what takes many bits to describe. Wouldn't such a basis shift change the distribution of our encoded programs, too, meaning we still need to make a choice about the basis of computation?
Maybe, yeah. It's possible that our idea falls apart here. But the set of infinitely complex programs defined under some computational basis is a really weird object; I wouldn't rule out the possibility that the distribution of encoded programs in the limit could be invariant to the basis you use to construct the infinitely complex program-set.
Or not. We're just having fun here.
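For reference, the object from the Boltzmann-brain answer above - a minimal sketch of `Rule 110` grown from the single-`1` seed, with a finite zero-padded window standing in for the infinite quiescent tape:

```python
# Rule 110, seeded with ...0001000... Zeros beyond the window edges
# stand in for the infinite quiescent background.
RULE, WIDTH, STEPS = 110, 64, 30

row = [0] * WIDTH
row[WIDTH // 2] = 1
for _ in range(STEPS):
    print("".join(".#"[c] for c in row))
    padded = [0] + row + [0]  # (left, centre, right) for cell i is padded[i:i+3]
    row = [(RULE >> (4 * padded[i] + 2 * padded[i + 1] + padded[i + 2])) & 1
           for i in range(WIDTH)]
```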
Let me know what you think.
Erratum
After talking with my friend Asha, I'm pretty sure this falls apart. We can probably always change the order in which we construct the set of all programs, and arrive at a different distribution of embedded programs. So while we're not privileging program simplicity, we are still privileging some sampling order, which is basically the same thing.
I retreat to the original claim: we really need a basis-invariant Jeffreys-style prior for Kolmogorov complexity.
[^1]: You could probably approximate this basis-invariant Kolmogorov complexity by taking some finite subset of reasonable-sounding bases, trying your hardest to find the Kolmogorov complexity of your target program in each, and then averaging over the results, weighted per language by the inverse of the basis-invariant Kolmogorov complexity of that language's emulator (defined such that - within any language - the language's own emulator is the empty string).

    Is it a problem that this is recursive? Not really. You can start with equal weights, and iterate a large number of rounds until you stabilise towards the fixed point (we're very much hoping that there exists a fixed point here).

    Unfortunately, taking the naïve limit of this process doesn't produce a recipe for the universe's weighting function. Better researchers (or maybe myself (if I ever get around to conquering the prerequisite material), or some superintelligence - bored after killing us all) will need to figure this one out the hard way.
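    A toy version of this recursive weighting, as a sketch (the numbers in the cross-emulator complexity matrix are invented for illustration):

    ```python
    # K[i][j]: a made-up stand-in for the complexity, in bits, of
    # language j's emulator written in language i. The diagonal is 0
    # because a language's own emulator is the empty string.
    K = [[0, 200, 500],
         [180, 0, 450],
         [600, 550, 0]]
    n = len(K)

    w = [1.0] * n  # start with equal weights per language
    for _ in range(100):
        total = sum(w)
        # weighted-average complexity of each language's emulator
        k_inv = [sum(w[i] * K[i][j] for i in range(n)) / total
                 for j in range(n)]
        w_new = [1.0 / k for k in k_inv]
        if max(abs(a - b) for a, b in zip(w, w_new)) < 1e-12:
            break  # fixed point reached (for this toy matrix, one exists)
        w = w_new

    print([round(1 / wi) for wi in w])  # stabilised emulator complexities
    ```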
[^2]: If you're uncomfortable with this assertion, or with the idea of sampling from an infinite set in the first place - don't worry. The "math" is about to get much more sketchy, making this concern relatively minor in comparison.
[^4]: Your CPU renders this page, executing an incomprehensibly complex dance over hundreds of thousands of cycles. Pick a random angle to dimensionally reduce along, and the progression instead looks like the workings of some tiny finite state machine, moving with a stuttering slow clock in fits and bursts (or, more often, as random noise).