Large Numbers, Part 4: Epsilon and Phi
Wednesday, June 22, 2011
Direct continuation of: Large Numbers, Part 3: Functions and Ordinals
In the first post of this series, I built a basic understanding of magnitude through the use of exponentiation and Knuth up-arrows that was then thrown out the window in the second post when we started going through Graham’s Number and Conway chained-arrows. Then, in the third post, I built a new system of notating large numbers based on ordinals, and showed how it matched almost all of the large notations built earlier, and had even more potential for growth.
This system of ordinal notation is based on things called ordinal numbers, which are different kinds of infinite sets. For example, ω, the smallest infinite ordinal, is also the set that contains every finite number, no matter how large: [0, 1, 2, 3, 4, 5, 6, ..., 8091380, 8091381, ..., 3↑↑↑↑3, ..., G, ...].
By the end of that post, I built a system of notation based on the following twelve rules:
- f0(x) = x+1
- fa(x) = fa-1x(x)
- fω(x) = fx(x)
- fωa(x) = fω(a-1)+x(x)
- fωa+b(x) = fωa+(b-1)x(x)
- fω^2(x) = fωx(x)
- fω^2+b(x) = fω^2+(b-1)x(x)
- f(ω^2)a(x) = f(ω^2)*(a-1) + ωx(x)
- f(ω^2)a+b(x) = f(ω^2)a+(b-1)x(x)
- f(ω^2)a+ωb(x) = f(ω^2)a + ωb + x(x)
- f(ω^2)a+ωb+c(x) = f(ω^2)a + ωb + c-1x(x)
- fω^3(x) = f(ω^2)x(x)
Now, I’m going to expand this system.
From ω^3 to ω^ω
Its quite obvious how to build up the ordinal system by this point. You can take any starting point, such as where we left off fω^3(x) and then build on that using what we already know: first, use basic function recursion, such as fω^33(x) = fω^3(fω^3(fω^3(x))). Then use addition to diagonalize that recursion, such as fω^3+1(x) = fω^3x(x).
Then use ω to diagonalize the levels of addition, such as fω^3+ω(x) = fω^3+x(x). Then repeat that whole process over again, getting to fω^3+ω2(x) = fω^3+ω+x(x). Then again, getting to fω^3+ω3(x) = fω^3+ω2+x(x). Then diagonalize the amount of times you repeat this process, getting to fω^3+ω^2(x) = fω^3+ωx(x).
Then start over, building addition, then multiplication all over again, and you get to fω^3+(ω^2)*2(x) = fω^3+ω^2+ωx(x). Repeating this larger process again gets you to fω^3+(ω^2)*3(x) = fω^3+(ω^2)*2+ωx(x). From here, it should be fairly intuitive, and diagonalizing this larger process will get you to f(ω^3)*2(x) = fω^3+(ω^2)*x(x). Once you diagonalize this yet even larger process, you finally get to fω^4(x) = f(ω^3)*x(x).
Now we can keep building, getting to fω^4+ω^3(x) and f(ω^4)*2(x) and finally to fω^5(x). Eventually this whole process can be additionally diagonalized to fω^ω(x), which is fω^x(x).
Note that each increase in the exponent is an increasingly large jump, since you have to add all the exponents before it. For example, going from fω^2(x) to fω^3(x) is not as large as going from fω^6(x) to fω^7(x), because in order to get to fω^6(x), you must first get to things like f(ω^5)*91+(ω^4)*43+(ω^3)*166+(ω^2)*12+ω+4209(x).
Also note that like all other ordinals, ω^ω is a set that contains ω^n for every finite n: [0, ω, ω^2, ω^3, ω^4, ω^5, ω^6, ..., ω^8091380, ω^8091381, ..., ω^(3↑↑↑↑3), ..., ω^G, ..., ω^(fω^ω(x)), ...]. Yes, because fω^ω(x) is a finite number, ω^(fω^ω(x)) is actually less than ω^ω.
Now we can encompass everything we’ve just said in a slightly different set of twelve rules:
- f0(x) = x+1
- fa(x) = fa-1x(x)
- fω(x) = fx(x)
- fωa(x) = fω(a-1)+x(x)
- fωa+b(x) = fωa+(b-1)x(x)
- fω^2(x) = fωx(x)
- fω^a(x) = fω^(a-1) * x(x)
- fω^p+b(x) = fω^p(b-1)x(x)
- f(ω^p)a(x) = f(ω^p)*(a-1) + (ω^(p-1))*x(x)
- f(ω^p)a+b(x) = f(ω^p)a+(b-1)x(x)
- f(ω^p)a+ωb(x) = f(ω^p)a + ωb + x(x)
- f(ω^p)a+ωb+c(x) = f(ω^p)a + ωb + c-1x(x)
As an example, fω^ω(3) = fω^3(3) = f(ω^2)*3(3) = f(ω^2)*2+ω3(3) = f(ω^2)*2+ω2+3(3) = f(ω^2)*2+ω2+23(3) = f(ω^2)*2+ω2+2(f(ω^2)*2+ω2+2(f(ω^2)*2+ω2+2(3))) = …
Another Magnitude Check
So now we have vastly outshadowed ω^3, we can again revisit some of the large non-ordinal based number notation systems and compare them. When we last left off, a→bc was approximately f(ω^2)*b(a). What does this mean for C(a)?
Well since C(a) = a→aa, this implies that C(a) ≈ f(ω^2)*a(a).
[[[THIS PART OF THE ESSAY REMOVED DUE TO TECHNICAL DIFFICULTIES. I HOPE TO FIX IT SHORTLY.]]]
So it turns out what we can notate with fω^5(a) is actually larger the extension of the extension of Conway chained-arrow notation. That’s fairly impressive.
From ω^ω to Epsilon Naught
So when we have ω^ω, where do we go next? Well, we get to ω^ω+1, then ω^ω+ω, then ω^ω+ω^2, then (ω^ω)*2, then (ω^ω)*ω, which is ω^(ω+1); skipping a lot of steps along the way. We then can continue as normal going to ω^(ω+2) and ω^(ω+3) and then eventually ω^(ω2).
One of the advantages of ordinal notation, in addition to how large the numbers can be made, is how intuitive the system is. If I tell you that eventually we get to ω^(ω^2), you know and can define exactly the steps to get there, without having to write out each individual step (and there are many).
Eventually we will get to ω^(ω^ω) and things get a bit more complicated, because there are things like ω^(ω^(ω+1)) and ω^(ω^ω+1) which look similar but are not — ω^(ω^ω+1) is on its way to become ω^((ω^ω)*2) whereas ω^(ω^(ω+1)) is on its way to become ω^(ω^(ω2)), which is significantly larger.
Also remember that something like ω^(ω^(ω2)) is indeed a set, containing an infinite amount of ordinals in this pattern: [ω^(ω^ω), ω^(ω^(ω+1)), ω^(ω^(ω+2)), , ω^(ω^(ω+3)), ..., , ω^(ω^(ω+4209)), ...].
Moreover, each added exponent is a dramatic shift and gets more dramatic — ω^(ω^ω) to ω^(ω^(ω^ω)) is not nearly as dramatic as ω^(ω^(ω^(ω^(ω^ω)))) to ω^(ω^(ω^(ω^(ω^(ω^ω))))).
Eventually the stack of exponents gets large enough that we want to make an fO(x) where the O is an ordinal that takes x and defines the number of stacks. This O is an actual ordinal referred to as epsilon naught, or ε0. I will write this as ε(0) in order to avoid multi-layered subscripts for at least a bit longer.
ε0 is the set [ω, ω^ω, ω^(ω^ω), ω^(ω^(ω^ω)), ω^(ω^(ω^(ω^ω))), ...] and fε(0)(x) selects the “x”th value from this set and replaces ε(0) with it. Therefore:
fε(0)(2) = fω^ω(2) = fω^2(2) = …
fε(0)(3) = fω^(ω^ω)(3) = fω^(ω^3)(2) = …
fε(0)(5) = fω^(ω^(ω^(ω^ω)))(3) = …
The General Definition of Ordinal Function
Now it’s time to combine the fact that ordinals are sets and make this very easy to notate, rather than remembering some fifteen rules. We know that every ordinal is a set, and that every fO(x) selects the “x”th member of that ordinal O and replaces O with it. Therefore, we can go off these four rules and define every possible ordinal, including ordinals much bigger than we’ve currently defined:
- f(a) = a+1
- fa(x) = fa-1x(x)
- fO(x) = fD(O)(x)
- fO+a(x) = fO+a-1x(x)
…where a is any finite number, O is any infinite ordinal, and D(O) represents taking O and replacing it with the “x”th member of the set defined by O.
For example, something like fε(0)+ω^4+ω^3+ω(x) = fD(ε(0)+ω^4+ω^3+ω)(20) = fε(0)+ω^4+ω^3+20(20) = fε(0)+ω^4+ω^3+1920(20). (If you are of keen interest to this subject of making large numbers, it should be noted that this number is much larger than anything that has been defined by the infamous anything in Johnathan Bower’s two-dimensional notation.)
From Epsilon Naught to Zeta Naught
But how do we get to ε(1) (also written as ε1 and pronounced epsilon one) from ε(0)? We actually have a lot of journeying to do. Again with the adding, getting to things like ε(0)+1, ε(0)+ω^15, and ε(0)+ω^(ω^ω). Eventually you get to ε(0)+ε(0), which is equivalent to ε(0)*2. But this is not enough, because from here we need to build to ε(0)*ω and then to things like ε(0)*(ω^3) and ε(0)*(ω^(ω^(ω^ω))). Eventually we get to ε(0)*ε(0), which is equivalent to ε(0)^2.
From here, yet more building is needed. We must get to things like ε(0)^22 and ε(0)^(ω^ω). Eventually we’ll get to ε(0)^ε(0). After this, you keep building again and again until you get stacks of arbitrary height, like ε(0)^ε(0)^ε(0)^ε(0) or ε(0)^ε(0)^ε(0)^ε(0)^ε(0)^ε(0)^ε(0). This is what ε(1) represents — ε(1) is to ε(0) as ε(0) was to ω. ε(1) is the set [ε(0), ε(0)^ε(0), ε(0)^ε(0)^ε(0), ε(0)^ε(0)^ε(0)^ε(0), ...].
But this isn’t nearly the limit of our building abilities, as we can get to yet larger ordinals, such as ε(2) or ε(3). ε(2) is to ε(1) as ε(0) was to ω, just as ε(3) is to ε(2) as ε(0) was to ω. It all builds up the same way — to get to ε(3), you must first get to ε(2)^ε(2)^ε(2)^ε(2)^ε(2)^ε(2) and beyond, and in order to do that you must first get to ε(2)^ε(2)^ε(2)^ε(2)^ε(2)^ε(1)^ε(1)^ε(1)^ε(1)^ε(1)^ε(1) and beyond. Therefore, with this stacking, jumping between ε(4) and ε(5) is, predictably, a larger jump than between ε(2) and ε(3).
Eventually you can just use ε(ω) with our general definition and know that fε(ω)(x) = fε(x)(x), and the fun starts all over again, with us getting to things like ε(ω+1) and ε(ω2) and even ε(ε(0)). Using the general definition, even fε(ε(0))(x) is easy — fε(ε(0))(3) = fε(ω^ω^ω)(3) = fε(ω^ω^3)(3) = fε(ω^(ω^2+3))(3) = …
Now all of this fun continues to an arbitrary amount of nested “ε(0)”s, getting things like ε(ε(ε(ε(ε(ε(0)))))). This gives rise to an ordinal referred to as zeta naught and written as ζ(0) or, more commonly, ζ0. ζ(0) is the set [ε(0), ε(ε(0)), ε(ε(ε(0))), ε(ε(ε(ε(0)))), ...].
To my knowledge, fζ(0)(1000) is larger than any number I’ve ever encountered that hasn’t also been defined using ordinals (or hydras, which is something ordinal-like and complicated that I don’t really need to get into). It is larger than anything in this pdf file, which spends 75 pages continually defining larger and larger numbers. It exceeds anything Bowers ever came up with, including multi-dimensional exploding arrays and what not. This is saying something for the power of ordinal notation.
The Veblen “Phi” Function
With a general definition that works for any ordinal and an intuitive method of expressing definitions, we have the luxury of skipping lots of intermediary steps. The first thing to keep in mind is that ζ(x) works pretty much like ε(x). In order to get to ζ(1), you must first get to ζ(0)^ζ(0)^ζ(0) and beyond, and in order to get to there, you must first get to ζ(0)^ζ(0)^ε(999)^ε(999) and beyond. Eventually you get to the set [ζ(0), ζ(0)^ζ(0), ζ(0)^ζ(0)^ζ(0), ζ(0)^ζ(0)^ζ(0)^ζ(0), ...], which is ε(ζ(0)+1). Now this might raise a bit of eyebrows — what is ε(ζ(0)+1) and when do we get to ζ(1)? Well, we know that because of fixed points, ε(ζ(0)) = ζ(0). But ε(ζ(0)+1) is different because of the “+1″, which means we’re asking to apply ε() to ζ(0) itself.
This gives us the basis to get to ζ(1), which is the set [ζ(0)+1, ε(ζ(0)+1), ε(ε(ζ(0)+1)), ε(ε(ε(ζ(0)+1), ...], which you see matches up intuitively with ζ(0). Then you’ll find that you can now get to ζ(x) for nearly any x by following the same steps: something like ζ(42) would be the set [ζ(41)+1, ε(ζ(41)+1), ε(ε(ζ(41)+1)), ε(ε(ε(ζ(41)+1), ...].
When you get to ζ(ζ(ζ(ζ(0)))) and beyond, you eventually get to an ordinal O that is the set [ζ(0), ζ(ζ(0)), ζ(ζ(ζ(0))), ζ(ζ(ζ(ζ(0)))), ...]. This ordinal O could have a symbol or something, but instead of keeping track of an infinite amount of symbols, it is best we make a function for this. Thus, the Phi Function, also called the Veblen Functions, were created: φ(3, 0) represents that O we just found; φ(2, 0) represents ζ(0); φ(1, 0) represents ε(0); and φ(0, 1) represents ω.
Fixed Points
To explain φ(a, b) fully requires introducing the concept of fixed points, another idea that only works because, while infinite ordinals can be used to describe/represent/notate finite numbers, the ordinals themselves are infinite. And remember while ordinals are sets, they are also the smallest number that is not contained in their set — for example the smallest number not in the set of all finite numbers [1, 2, 3, 4, 5, ..., 1932103819, ...] is infinity, which we write as ω.
Now, fixed points: a function g(x) has a fixed point at x when “x = g(x)” is true. For example, “g(x) = 9+x” has its first fixed point at ω because “ω = g(ω) = 9+ω”, due to the mysterious properties of infinite numbers. ω is just the first (smallest) fixed point, however — “ω+1″, “ω+2″, and “ω+3″ are all the next fixed points, of which this g(x) has an infinite amount.
If g(x) were something else, like “g(x) = ω^x”, the fixed point would be different — in this case, the first (smallest) fixed point is ε(0), since ω^ε(0) = ε(0). The second (next smallest) fixed point is ε(1). The “x+1″th fixed point is ε(x).
φ(a, b)
φ(a, b) represents these fixed points — namely φ(a, b) is the “b+1″th fixed point of the function φ(a, 0). φ(a+1, 0) is the set [φ(a,0), φ(a,φ(a,0)), φ(a,φ(a,φ(a,0))), ...].
Something like fφ(4, 0)(3) = fφ(3, φ(3, φ(3, 0)))(3) = fφ(3, φ(3, φ(2, φ(2, φ(2, 0)))))(3) = fφ(3, φ(3, φ(2, φ(2, φ(1, φ(1, φ(1, 0)))))))(3) = fφ(3, φ(3, φ(2, φ(2, φ(1, φ(1, φ(0, φ(0, φ(0, 0)))))))))(3) = …
Keep in mind that exact same equation is the same as fφ(4, 0)(3) = fφ(3, φ(3, φ(3, 0)))(3) = fφ(3, φ(3, ζ(ζ(ζ(0)))))(3) = fφ(3, φ(3, ζ(ζ(ε(ε(ε(0)))))))(3) = fφ(3, φ(3, ζ(ζ(ε(ε(ω^ω))))))(3) = …
Eventually we’ll have something like fφ(ω, 0)(12) = fφ(12, 0)(12) = …
φ(1, 0, 0)
And now for our final stop for this installment of the “Large Numbers” series, we arrive upon φ(1, 0, 0), which is called the Feferman–Schütte ordinal, and is also written as Γ0 or Γ(0), with Γ(x) being φ(1, 0, x). But how do we get here?
φ(1, 0, 0) is the continuation of φ(a, 0) for larger and larger a, just like φ(a, b) was the continuation of b for larger and larger b. φ(1, 0, 0) is the set [φ(1,0), φ(φ(1,0),0), φ(φ(φ(1,0),0),0), φ(φ(φ(φ(1,0),0),0),0), ...], which is a pretty big jump from where we were. We have now left ζ(0) solidly in the dust, and I’m pretty sure it is physically impossible to wrap your head around the magnitude of these numbers.
…And we’re truly just getting started.
Continued in: Large Numbers, Part 5: SVO, LVO, and Beyond
Liked this Essay?
- You can get more Greatplay.net by looking at these categories: All, Mathematics.
- Or perhaps you'd be interested in a complete table of contents of all essays?
- You could also subscribing to the RSS feed, or use the sidebar to subscribe for email updates!
- Or you could follow me on Twitter or like me on Facebook
- If you feel particularly participatory, I'd love to hear from you. Feel free to leave a comment.



I’d glad to see that you decided to redo your large numbers essays. You have one of the best explanations of ordinal notation that I have ever seen. However I did notice a couple of mistakes. φ(0,n)=ω^n, so φ(0,0)=ω^0=1, not ω. φ(0,1)=ω^1=ω. Also, ordinals are not the largest number that is contained in their set; they are the smallest number not contained in the set. So for the set of all finite numbers in your example the ordinal would be the least infinite ordinal, which is of course ω. I think you underestimate Bowers’ notation, too. Some of his higher numbers have ordinals equal to or larger than anything in this part of your essay. For example, {10,1000,3} & 10 is equivalent to f_ζ(0)(1000). The Big Boowa, which is {3,3,3 / 2}, has an associated ordinal of Γ(0)+2. I haven’t figured out the ordinals for Bowers’ very largest numbers, because dealing with ordinals bigger than Γ(0) gives me a headache! That’s why I’m eagerly awaiting your next essay.
Wow, thanks.
Thanks. You’re right on all of those, so changes made.
I haven’t looked at it in that much depth, but I’m very skeptical it gets near Γ(0). I was basing my statement after this analysis.
They give me a headache too, but I’ll do my best. I’ll be heading into a discussion of ordinal collapsing functions next, it looks like.
ζ(1) is not [ζ(0),ζ(0)^ζ(0)...] it is [ζ(0)+1,ε(ζ(0)+1),ε(ε(ζ(0)+1)),...]
[ζ(0),ζ(0)^ζ(0)...] = ε(ζ(0)+1)?
Thomas is right with his correction. Peter’s reply is also correct, but it doesn’t negate what Thomas is saying.
More annoying corrections:
Same comments on the ten rules as last post. Same issues with the second set of ten rules – rule 8 is wrong, and you need additional rules for more polynomials of length greater than 2.
In “From w^3 to w^w”,
f_{ω^3+(ω^2)*2}(x) = f_{ω^3+ω^2+ω+x}(x) should be
f_{ω^3+(ω^2)*2}(x) = f_{ω^3+ω^2+ω*x}(x).
Similarly, f_{ω^3+(ω^2)*3}(x) = f_{ω^3+(ω^2)*2+ω+x}(x) should be
f_{ω^3+(ω^2)*3}(x) = f_{ω^3+(ω^2)*2+ω*x}(x).
The final line,
As an example, f_{ω^ω}(3) = f_{ω^3}(3) = f_{(ω^2)*3}(3) = f_{(ω^2)*2+3}(3) = f(ω^2)*2+23(3) = f(ω^2)*2+2(f(ω^2)*2+2(f(ω^2)*2+2(3))) = …
should be
As an example, f_{ω^ω}(3) = f_{ω^3}(3) = f_{(ω^2)*3}(3) = f_{(ω^2)*2+ω*3}(3) = f_{(ω^2)*2+ω*2+3}(3) = f_{(ω^2)*2+ω*2+2}^3(3) = f_{(ω^2)*2+ω*2+2}(f_{(ω^2)*2+ω*2+2}(f_{(ω^2)*2+ω*2+2}(3))) = …
In “Another Magnitude Check”, you probably want to show that f_{ω^3}(x) is about x ->_x x, so that f_{ω^4}(x) grows much faster. Or something like that.
In “From ω^ω to Epsilon Naught”, the line
ε_0 is the set [ω, ω^ω, ω^(ω^ω), ω^(ω^(ω^ω)), ω^(ω^(ω^(ω^ω))), ...], is not technically correct; ε_0 is the limit of the sequence ω, ω^ω, ω^(ω^ω), ω^(ω^(ω^ω)), ω^(ω^(ω^(ω^ω))), …, or alternatively, it is the union of ω, ω^ω, ω^(ω^ω), ω^(ω^(ω^ω)), ω^(ω^(ω^(ω^ω))), …. The sequence ω, ω^ω, ω^(ω^ω), ω^(ω^(ω^ω)), ω^(ω^(ω^(ω^ω))), … is the fundamental sequence of ε_0.
Regarding using D(O) for the xth member of the fundamental sequence of O: I prefer the notation O[x], since it is a function of x after all. But whatever you prefer.
Under “The Veblen “Phi” Function”,
O is actually φ(3,0), φ(2, 0) represents ζ(0); φ(1, 0) represents ε(0). You are correct that φ(0, 1) represents ω.
Under “Fixed Points”,
g(x) = x+9 actually has no fixed points; it always maps a to a+9, which is 9 greater. In think you are looking for the function g(x) = 9+x, which indeed has fixed points at all a >= ω.
Under “φ(a, b)”,
Same issue with φ(a+1, 0) is the set [φ(a,0), φ(a,φ(a,0)), φ(a,φ(a,φ(a,0))), ...]. Call it the limit instead.
The two expressions fφ(3, φ(3, φ(2, φ(2, φ(1, φ(1, φ(0, φ(0, φ(0, 0)))))))))(3) and fφ(3, φ(3, ζ(ζ(ε(ε(ω^ω^ω))))))(3) are not actually the same, since φ(0, φ(0, φ(0, 0))) = ω^ω, not ω^ω^ω. So you have to shift one sequence up or the other one down.
Anyway, let me say good job on a lucid explanation of ordinal hierarchies!
Fixed, though I want a better explanation of why we can take ε(ζ(0)+1), and what that “+1″ means.
~
It’s better than being wrong.
I believe I made all of your corrections, except for the following…
~
I’m not sure what you meant by that, but I think I have an idea for this section that I need to add. My LaTeX math renderer failed me here.
~
I thought that ordinals were also sets of all ordinals less than itself, for instance ω could be either the smallest infinite number or the set of all finite numbers. I’d just like this cleared up before I go to the work of changing set to fundamental sequence in a whole bunch of places.
~
I’ll keep it as is for now.
~
Thanks!
I think I can clear up the confusion over ε_0 being the limit of ω, ω^ω, ω^(ω^ω)… rather than the set {ω, ω^ω, ω^(ω^ω)…} As you correctly state an ordinal is the set of all ordinals less then itself. But {ω, ω^ω, ω^(ω^ω)…} doesn’t contain ALL the ordinals less than ε_0, for example, it doesn’t contain any finite ordinals. That’s why the limit definition is correct.