Large Numbers, Part 3: Functions and Ordinals

Monday, May 16, 2011

Direct continuation of: Large Numbers, Part 2: Graham and Conway

In the first post of this series, we built a basic understanding of magnitude through the use of exponentiation and Knuth up-arrows that was then thrown out the window in the second post when we started going through Graham’s Number and Conway chained-arrows.

Now we’re going to start all over again, and build an even larger number. But to do this, we’re going to start at the most basic unit — the function.

 

The Basics of Growing a Function

A function is something like “y = f(x)” where x is an input and y is an output — it turns every input into a unique output. A function like f(x) = x+1 will take every input, add one to it, and turn that into the output. It does this for every x, ever.

Our first function will be just that: f(x) = x+1, and we’re only interested in integers 0 and larger. Therefore, the first output will be 0+1 = 1, the second will be 1+1 = 2, the third will be 2+1 = 3, etc. Continuting this, we get 4, 5, 6, 7, 8, 9, …

 

Functional Powers

Obviously, this function alone isn’t going to get us anywhere. So we need to do yet another step. The next step is to plug the function back into itself, so we look for f(f(x)), not f(x). This means that every input will be (x+1)+1, or x+2. Then we can repeat this again and try for f(f(f(x))), and then try for f(f(f(f(x)))), and onward.

The amount of nested functions can be written in something called functional powers, written with a superscript to denote how many copies of a function there will be.

f^a(b) = \underbrace{f(f(f(\cdots(f(b)))\cdots)))}_{\text{a nested functions}}

 

The First Diagonalization Function

Then we can make a function and call it f[1](x) that is fx(x), or f(f(f(…f(f(x)))…))) with x nested functions.

f[1](x) produces values that look like this: f(1), f(f(2)), f(f(f(3))), f(f(f(f(4)))), f(f(f(f(f(5))))), f(f(f(f(f(f(6)))))), …

Or, once evaluated: 2, 4, 6, 8, 10, 12, …

This function has changed from adding one to doubling the function. f1(x) = x + x = 2x

 

This function is also the first thing of what we call a diagonalization function. Diagonalization functions take the “x”th value from the “x”th function in a previous series of functions. “f(x), f(f(x)), f(f(f(x))), …” is a series of functions, with each function individually having an infinite amount of values.

By combining them into a diagonalization function (a process called diagonalization), we get a new function that will grow faster than all the functions in the series.

 

How Diagonalization Works

So all those words were just said… but how do they make sense? What is a diagonalization? Well, in order to diagonalize, you first make an infinite list of functions, such as “f(x), f(f(x)), f(f(f(x))), f(f(f(f(x)))), …”. Then you take the 1st value from the 1st function, f(1) = 2; the 2nd value from the 2nd function, f(f(2)) = 4; the 3rd value from the 3rd function, f(f(f(3))) = 6; and so on, constructing a new function by taking all of the values on the diagonal. We will call this f1(x):

 

Growing “Faster”

So what does it mean for f1(x) to grow faster? Since f1000(2) = 1002 and f1(2) = 4, why is f1(2) considered the “larger” function? Well, it all has to do with this thing called growth rate.

The growth rate is a measure of how much the output of a function changes as the input changes. In the example of f(x), as x is increased by 1, f(x) is increased by 1; f(a+1) = f(a) + 1. This means the function has a growth rate of “1″.

Generally, if you have a function g(x) and a function h(x), if g(x) is the “faster function”, then “g(a) – g(b)” will be greater than “h(a) – h(b)” for more values of a and b than it is false, provided “a > b”. In layman’s terms, the “faster function” is the one that, in the end, produces the larger numbers.

For example, consider “g(x) = 1000x” and “h(x) = x^x”. g(x) starts out growing larger, since the difference between g(0) and g(1) is 1000, whereas the difference between h(0) and h(1) is only 1. However, in the end, h(x) grows larger, since the difference between g(1000) and g(1001) is 1000, whereas the difference between h(1000) and h(1001) is about 10^3003.

h(x) = x^x in blue eventually grows much faster than g(x) = 1000x in purple.

 

Many More Diagonalizations

So now that we have f1(x), we can look for f1(f1(x)), and then f1(f1(f1(x))), and then f14(x), and then we can diagonalize over that and find a brand new function f2(x), which equals f1x(x).

f2(x) produces the values “f1(1), f1(f1(2)), f1(f1(f1(3))), f1(f1(f1(f1(4)))), f1(f1(f1(f1(f1(5))))), …”; or “2, 8, 24, 32, 80, …”.

f2(x) = f1x(x) = x * 2^x

But it gets better. Just like we have f1(x) = fx(x) and f2(x) = f1x(x) we can have an f3(x) = f2x(x) and a f4(x) = f3x(x)…

All the way to a rule for infinitely many of these functions: fk(x) = fk-1x(x)

So we can ask ourselves: “what’s f4209(685)?”

Easy. Follow the rule:

f_{4209}(685) = f_{4208}^{685}(685) = \underbrace{f_{4208}(f_{4208}(f_{4208}(\cdots(685)))\cdots)}_{685 \text{ nested functions}}

And while it doesn’t match up exactly, fk(x) will be on the same level of huge as 10→x→k using Conway Chained Arrows.

 

Following the Recursion

Now, we should also be able to notice a pattern with each increase in the subscript:

f(x) = x, add one.
f1(x) = x, add one, x times = x + x = 2x
f2(x) = (x, add one, x times), x times = x doubled, x times = x * 2^x
f3(x) = ((x, add one, x times), x times), x times = (x doubled, x times), x times = (x * 2^x) * 2^(x * 2^x) ≈ x↑↑x
f4(x) = … ≈ x↑↑↑x
f5(x) = … ≈ x↑↑↑↑x

fn(x) = … ≈ hyper(x, x, n) ≈ x→x→(n-2)

 

Diagonalizing the Diagonalizations

Where do we go from here? Well, the next step is to take a diganonalization of all the functions, and create a function from “f1(1), f2(2), f3(3), f4(4), …” that grows faster than any of the infinite functions in that series. But what do we call it?

We call it fω(x).

What is that? The little funny ω is the Greek letter Omega. But why? Turns out that ω is actually an ordinal number, something a bit complicated, but not too bad. Hold on to your stomach, and let’s explain.

 

The Most Basic Ordinal

ω is a set, specifically the set of all positive integers. ω = [0, 1, 2, 3, 4, 5, 6, ..., 8091380, 8091381, ..., 3↑↑↑↑3, ..., G, ...]; every number that is a positive integer is in there, no matter how large. There are an infinite number of numbers contained in ω. ω itself has a value, namely infinity — the smallest number larger than all finite numbers.

Don’t worry about the infinity bit. For now, the way we define fω(x) is to find the “x”th number in the ω set, and replace the “ω” symbol with that entry. More simply, fω(x) = fx(x).

We now have our next step. And we can do this for any ordinal, and there are more to come — if there is an ordinal in the subscript, find the “x”th entry in the set the ordinal represents, and replace that ordinal with it. Don’t worry — we’ll explain in more detail as we go on.

 

Adding to Ordinals

Now we can do something a bit more complicated with another rule. We define fω+1(x). It looks weird, and “ω+1″ is not an addition problem we can do. Instead, this is simply a notation for this, which we’ve seen before:

f_{\omega + 1}(x) = f_{\omega}^{x}(x) = \underbrace{f_{\omega}(f_{\omega}(f_{\omega}(\cdots(x)))\cdots)}_{x \text{ nested functions}}

 

And we can now do that for any number:
f_{\omega + n}(x) = f_{\omega + n-1}^{x}(x) = \underbrace{f_{\omega + n-1}(f_{\omega + n-1}(f_{\omega + n-1}(\cdots(x)))\cdots)}_{x \text{ nested functions}}

 

For example:
f_{\omega + 4209}(3) = f_{\omega + 4208}^3(3) = f_{\omega + 4208}(f_{\omega + 4208}(f_{\omega + 4208}(3))) =
f_{\omega + 4208}(f_{\omega + 4208}(f_{\omega + 4207}^3(3))) =
f_{\omega + 4208}(f_{\omega + 4208}(f_{\omega + 4207}(f_{\omega + 4207}(f_{\omega + 4207}(3))))) = \cdots

 

Comparing to Conway

Even the tiny fω+1(2) ends up making large numbers, generating f8(8), which is roughly 8↑↑↑↑↑↑↑↑8.

However, our other notations made big numbers too. While fω+1(10) is big, it’s still small compared to what Conway notation can do. fω+1(10) is about 10→10→10→2 and fω+2(10) is about 10→10→10→3. fω+10(10) is about 10→10→10→11, and we can see an obvious pattern.

However, if we keep on going with ordinals, we can go even larger than even 10→10→10→1010. It will just take a little bit of work to get there…

 

Multiplying Ordinals

So what happens if we do to ω what ω did to the natural numbers? What if we made a function that was the diagonalization of “fω+1(1), fω+2(2), fω+3(3), fω+4(4), …”.

It’s time to introduce yet another ordinal we’ll be working with: ω2.

fω2(x) = fω+x(x)

 

Now, ω2 itself is a set that contains ω + every positive integer: [ω, ω+1, ω+2, ω+3, ω+4, ..., ω+1000000, ω+1000001, ω+1000002, ..., ω+(10→10→10), ...]

You can figure this out because ω2 is not just written like “ω2″ to be silly… it is actually “ω*2″, or “ω+ω”. You then replace the rightmost ω with x when calculating fω2(x). Note that “2ω” is not the same, since that’s 2*2*2*2*2*…*2 an infinite amount of times, or just infinity, or ω.

It’s weird, but 2ω = ω. If that hurts your head, we can just ignore it. It’s not really that critical to understanding how these functions work.

 

Adding to the Multiplying

Just like how you can add to ω, creating “ω+1″ and a “ω+2″, you can add to ω2. And then that “ω2+1, ω2+2, ω2+3, ω2+4, …” can all be diagonalized in ω3. And then that pattern continues for ω times any number, ωn. That’s a lot of work we just did in one step, and we now have massively larger numbers. Let’s see what we just did.

Every ωn is a set [ω(n-1), ω(n-1) + 1, ω(n-1) + 2, ω(n-1) + 3, ω(n-1) + 4, ...]

ω67, for example is a set [ω66, ω66 + 1, ω66 + 2, ω66 + 3, ...].

And just like before, we select the “x”th member from the set, and replace ωn with that member.

 

Here’s everything we’ve learned in five easy rules, and the first one doesn’t even count. Remember these:

  1. f0(x) = x+1
  2. fa(x) = fa-1x(x)
  3. fω(x) = fx(x)
  4. fωa(x) = fω(a-1)+x(x)
  5. fωa+b(x) = fωa+(b-1)x(x)

 

So using our new knowledge, what is fω14+2(3)?

fω14+2(3) = fω14+13(3) = fω14+1(fω14+1(fω14+1(3))) = fω14+1(fω14+1(fω143(3))) = fω14+1(fω14+1(fω14(fω14(fω14(3))))) = fω14+1(fω14+1(fω14(fω14(fω13+3(3))))) = fω14+1(fω14+1(fω14(fω14(fω13+23(3))))) = …

That’s probably good enough — it will be some huge number. Time to move on.

 

Squaring Ordinals

Whoa. First we were adding them, ω+n. Then when we got to ω+ω, we made that ω2. Then we continued multiplying, going to ωn. Now what? Well, eventually we go to ωω, which we just write as ω^2. ω^2 is where the number in the function actually defines the n for ωn.

More specifically, ω^2 is another infinite set, containing all of the ωn’s: [ω, ω2, ω3, ω4, ω5, ω6, ω7, ..., ω9009, ...]

We then can make a function to select the “x”th value, and evaluate: fω^2(x) = fω*ω(x) = fωx(x).

 

Tacking on Continuations

And one of the reasons ordinals are powerful for making large numbers is that you can extend fω^2(x) through all the means we’ve already made…

fω^2+1(x) = fω^2x(x)
fω^2+ω(x) = fω^2+x(x)
fω^2+ω+1(x) = fω^2+ωx(x)
fω^2+ω2(x) = fω^2+ω+x(x)
fω^2+ω3(x) = fω^2+ω2+x(x)
f(ω^2)*2(x) = fω^2+ω^2(x) = fω^2+ωx(x)
f(ω^2)2+1(x) = f(ω^2)2x(x)

And you can probably see that we’re heading toward ω^3, a set [ω^2, (ω^2)2, (ω^2)3, (ω^2)4, (ω^2)5, ...], where fω^3(x) = f(ω^2)x(x).

 

Magnitude Comparsion

So as we surmised in Part 1, magnitude is going to be intuitively impossible to understand at this scale. However, we can still compare it to the large notations that were built in part 2.

Where would we find Graham’s Number, for example?

Well Graham’s Number itself is a recursive function. If g(x) = 3→3→x = 3↑↑…↑↑3 (with x Knuth up-arrows), then Graham’s Number would be g64(4). Since g(x) is approximately fx(x) = fω(x), that means Graham’s Number would be approximately fω64(4), which is smaller than fω+1(64).

 

Extended Conway Chained-Arrow Notation

Where would we find something like the basic extended arrow notation, say 10→210, which equals 10→10→10→10→10→10→10→10→10→10→10?

Well, we were able to get this pattern:
fω(10) ≈ 10→10→10
fω2(10) ≈ 10→10→10→10
fω3(10) ≈ 10→10→10→10→10
fω4(10) ≈ 10→10→10→10→10→10

So we find 10→210 being approximately fω10(10), or fω^2(10). What about something more advanced, like 10→1010?

Well, this same pattern helps a little:
fω^2+ω(10) ≈ 10→210→210
fω^2+ω2(10) ≈ 10→210→210→210
fω^2+ω3(10) ≈ 10→210→210→210→210
fω^2+ω4(10) ≈ 10→210→210→210→210→210

f(ω^2)2(10) ≈ 10→310

And then again:
f(ω^2)2+ω(10) ≈ 10→310→310
f(ω^2)2+ω2(10) ≈ 10→310→310→310
f(ω^2)2+ω3(10) ≈ 10→310→310→310→310

f(ω^2)3(10) ≈ 10→410

It doesn’t take much work to realize you’ll eventually find 10→1010 at approximately f(ω^2)10(10), or fω^3(10).

And that’s as far as we get with ω^3. The C(a) function remains undefeated. …For now. In the next post, we’ll move past ω^3 and defeat it.

 

For now, here’s all twelve rules we’ve learned so far, in one reference location:

  1. f0(x) = x+1
  2. fa(x) = fa-1x(x)
  3. fω(x) = fx(x)
  4. fωa(x) = fω(a-1)+x(x)
  5. fωa+b(x) = fωa+(b-1)x(x)
  6. fω^2(x) = fωx(x)
  7. fω^2+b(x) = fω^2+(b-1)x(x)
  8. f(ω^2)a(x) = f(ω^2)*(a-1) + ωx(x)
  9. f(ω^2)a+b(x) = f(ω^2)a+(b-1)x(x)
  10. f(ω^2)a+ωb(x) = f(ω^2)a + ωb + x(x)
  11. f(ω^2)a+ωb+c(x) = f(ω^2)a + ωb + c-1x(x)
  12. fω^3(x) = f(ω^2)x(x)

Continued in: Large Numbers, Part 4: Epsilon and Phi

Be Sociable, Share!

 

Liked this Essay?

3 Comments (RSS)

Read them below and add one yourself.

  1. Jesse says:

    Wow..

    You explained this with great clarity. Now I’m disappointed that I have to wait to read the next installment.

  2. Deedlit says:

    More pedantic corrections.

    Under “How Diagonalization Works”, the first row should be x, the second row f(x), the third f(f(x)), etc., instead of f(x), f(f(x)), f(f(f(x)))…

    Under “Comparing to Conway”, you probably meant f_{w+10}(10) = 10 -> 10 -> 10 -> 10 instead of f_{w+n}(10). Although, f_{w+1}(10) is actually closer to 10 -> 10 -> 10 -> 2, f_{w+2}(10) is actually closer to 10 -> 10 -> 10 -> 3, etc.

    Regarding the 10 rules:
    f(x) should probably be f_0(x) to be consistent with your definition. (Otherwise, rule 2 doesn’t define f_1(x) = f^x(x) )
    Rule 8 should be f_{(ω^2)a}(x) = f_{(ω^2)*(a-1) + ω*x}(x), not f_{(ω^2)*(a-1) + x}(x)
    Also, you need two additional rules for f_{(ω^2)a + ω*b}(x) and f_{(ω^2)a + ω*b + c}(x).

  3. More pedantic corrections.

    No worries, I welcome them to keep my work correct.

    ~

    Under “How Diagonalization Works”, the first row should be x, the second row f(x), the third f(f(x)), etc., instead of f(x), f(f(x)), f(f(f(x)))…

    I’m mostly disputing this because I don’t want to redraw the image, but what difference does it make?

    ~

    Under “Comparing to Conway”, you probably meant f_{w+10}(10) = 10 -> 10 -> 10 -> 10 instead of f_{w+n}(10). Although, f_{w+1}(10) is actually closer to 10 -> 10 -> 10 -> 2, f_{w+2}(10) is actually closer to 10 -> 10 -> 10 -> 3, etc.

    Fixed.

    ~

    f(x) should probably be f_0(x) to be consistent with your definition. (Otherwise, rule 2 doesn’t define f_1(x) = f^x(x)

    Fixed.

    ~

    Rule 8 should be f_{(ω^2)a}(x) = f_{(ω^2)*(a-1) + ω*x}(x), not f_{(ω^2)*(a-1) + x}(x)

    Fixed.

    ~

    Also, you need two additional rules for f_{(ω^2)a + ω*b}(x) and f_{(ω^2)a + ω*b + c}(x).

    Added.

Leave a Reply

Comment HTML: You can use HTML in comments. I reccomend <blockquote>Quote</blockquote> for quoting what others have said. <b>Text</b> is for bold, <i>Text</i> is for italic, and <a href="url">text</a> is for making links.