S[α] for strings of ordinals

08 Mar 2020

Update 2020/03/10: Added complexity analysis for S.substr(α, N)

In a previous blog post, I defined for each ordinal $\alpha$ a string $S_\alpha$ (made of the characters for the empty set, comma, opening brace and closing brace) that enumerates the elements of $\alpha$ . I gave simple formulas to calculate the length $L_\alpha$ of this string.

My colleague Ioanna was a bit disappointed that I didn’t provide a script for calculating the infinite $S_\alpha$ strings. Obviously, the complexity would be “ $\Omega(\omega)$ ” but it is still possible to evaluate the string at a given position: Given $\beta < L_\alpha$ , what is the character of $S_\alpha$ at position $\beta$ ?

Since the initial segments of the strings are compatible, another way to express this is by introducing the class $S = \bigcup_{\alpha \in {\mathrm{Ord}}} S_\alpha$ corresponding to a giant string enumerating the class $\mathrm{Ord}$ of all ordinals. Given an ordinal $\alpha$ , what is $S[\alpha]$ ?

A small generalization of this S.charAt(α) operation is S.substr(α, N) calculating the substring of length $N$ starting at $\alpha$ .

Example

$S_{2}$ is "∅,{∅}", so $S[0]$ is the character ∅, $S[1]$ is a comma, $S[2]$ is an opening brace and $S[4]$ is a closing brace.
$S_{\omega+1}$ is made of an $\omega$ -concatenation of finite strings (the character ∅, a comma, $S_1$ surrounded by braces, a comma, $S_2$ surrounded by braces, a comma, $S_3$ surrounded by braces, a comma, etc.), followed by a comma, an opening brace, the same $\omega$ -concatenation of finite strings and finally a closing brace. So $S[\omega]$ is a comma, $S[\omega+1]$ is an opening brace, $S[\omega+2]$ is the character "∅" and $S[\omega2]$ is a closing brace.
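The finite part of this example can be reproduced by brute force. The sketch below assumes that $S_0$ is empty, $S_1$ is the single character ∅ and $S_{m+1}$ is $S_m$ , a comma, then $S_m$ surrounded by braces; the function name `buildS` is only illustrative.

```javascript
// Brute-force construction of the finite string S_n.
// Assumption of this sketch: S_0 = "", S_1 = "∅" and
// S_{m+1} = S_m + ",{" + S_m + "}" for m >= 1.
function buildS(n) {
  if (n === 0) return "";
  let s = "∅";
  for (let m = 1; m < n; m++) {
    s = s + ",{" + s + "}";
  }
  return s;
}

console.log(buildS(2)); // ∅,{∅}
console.log(buildS(3)); // ∅,{∅},{∅,{∅}}
```

The length roughly doubles at each step, so this is only usable for small $n$ , which is why the rest of the post evaluates single positions instead.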

In the previous example, we have basically analyzed the string $S_{\alpha+1}$ at a given successor ordinal, splitting it into two copies of $S_\alpha$ , comma and braces. This suggests some easy values of $S$ :

Lemma

For any $n < \omega$ and $\beta \geq 1$ , $S[\alpha]$ is:

A comma if $\alpha$ can be written $2^n - 3$ or $\omega^\beta 2^n + n$
An opening brace if $\alpha$ can be written $2^n - 2$ or $\omega^\beta 2^n + n + 1$
A closing brace if $\alpha$ can be written $2^{n+1} - 4$ or $\omega^\beta 2^{n+1} + n$

Proof: For any ordinal $\alpha$ , by viewing the string $S_{\alpha+1}$ as a concatenation of $S_\alpha$ , a comma, an opening brace, $S_\alpha$ and a closing brace, we deduce that:

${S\left[L_{\alpha}\right]}$ is a comma.
${S\left[L_{\alpha} + 1 \right]}$ is an opening brace.
$L_{\alpha+1}$ is a successor ordinal and ${S\left[L_{\alpha+1} - 1 \right]}$ is a closing brace.

The lemma follows immediately from the calculation of string lengths performed in the previous blog post. □
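For finite ordinals, the lengths can be rederived directly from this decomposition. Assuming $L_1 = 1$ (as $S_1$ is the single character ∅), the concatenation gives a recurrence whose solution is:

$L_{m+1} = {2 L_m + 3} \implies L_m = {2^{m+1} - 3} \text{ for } 1 \leq m < \omega$

So $S[L_m] = S[2^{m+1} - 3]$ is a comma, $S[L_m + 1] = S[2^{m+1} - 2]$ is an opening brace and $S[L_{m+1} - 1] = S[2^{m+2} - 4]$ is a closing brace, which are the finite positions of the lemma with $n = m + 1 \geq 2$ . For instance $m = 1$ gives the positions 1, 2 and 4 of the example above.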

Warning: The rest of the blog post gives the solution to this puzzle, so you might want to have fun solving it yourself first and then go back checking my proposed solution later 😉...

More generally, the proof of the lemma can be extended: if we find $n$ such that ${2^n - 1} \leq \alpha \leq {2^{n+1} - 5}$ then $\delta = {\alpha - {(2^n - 1)}} \leq {2^n - 4}$ is the index of ${S[\alpha]}$ in the second copy of $S_{n-1}$ inside $S_n$ and so ${S[\alpha]} = {S[\delta]}$ .

Details will be provided in the theorem below but one can already write a simple recursive JavaScript program to evaluate $S$ at finite ordinals:

Script for $\alpha < \omega$

[Interactive form: enter a finite $\alpha$ to display the character of $S$ at position $\alpha$ .]
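A minimal sketch of such a recursive evaluation could look as follows. It is not the script of the original page: the names `floorLog2` and `charAtFinite` are made up for this illustration, and positions are passed as BigInt values so that large finite ordinals stay exact.

```javascript
// floor(log2(x)) for a positive BigInt x: length of its binary expansion minus one.
function floorLog2(x) {
  return BigInt(x.toString(2).length - 1);
}

// Character of S at a finite position alpha (a non-negative BigInt).
function charAtFinite(alpha) {
  if (alpha === 0n) return "∅";
  const l = floorLog2(alpha + 3n);
  const p = 1n << l; // 2^l, so that p - 3 <= alpha <= 2p - 4
  if (alpha === p - 3n) return ",";
  if (alpha === p - 2n) return "{";
  if (alpha === 2n * p - 4n) return "}";
  // Otherwise alpha lies in the second copy: shift it back by 2^l - 1.
  return charAtFinite(alpha - (p - 1n));
}

// Reading the first five positions gives back S_2.
let s = "";
for (let i = 0n; i < 5n; i++) s += charAtFinite(i);
console.log(s); // ∅,{∅}
```

Each recursive call strictly decreases the value of $\left\lfloor {\log_2{(\alpha+3)}} \right\rfloor$ , so the recursion depth stays logarithmic in $\alpha$ .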

The following intermediate step will be helpful to evaluate $S$ at an infinite ordinal $\alpha$ :

Proposition

Let $\alpha$ be infinite and $\beta = \log_{\omega}(\alpha) \geq 1$ . Let $1 \leq q < \omega$ and $0 \leq \rho < \omega^\beta$ be the quotient and remainder of the Euclidean division of $\alpha$ by $\omega^\beta$ . Let’s define:

$n = \begin{cases} \left\lfloor {\log_2(q)} \right\rfloor & \text{ if } q \text{ is not a power of 2 or this value is } \leq \rho \\ \left\lfloor {\log_2(q)} \right\rfloor - 1 & \text{ otherwise.} \end{cases}$

Then $S[\alpha]$ is:

A comma if $\alpha = {\omega^{\beta} 2^n + n}$
An opening brace if $\alpha = {\omega^{\beta} 2^n + n + 1}$
A closing brace if $\alpha = {\omega^{\beta} 2^{n+1} + n}$
$S[\delta]$ otherwise where $\delta$ is the unique ordinal such that $\alpha = {\omega^\beta 2^n + n + 2 + \delta}$ and $\delta < {\omega^{\beta} 2^n + n}$ .

Proof: First by construction we have $\alpha = { {\omega^\beta q } + \rho}$ .

In the first case of the definition of $n$ , $2^n \leq q < 2^{n+1}$ so we always have $\alpha < {\omega^\beta {(q+1)}} \leq \omega^\beta 2^{n+1} \leq L_{\omega{\beta} + n + 1}$ . If additionally $q$ is not a power of 2 then $2^n < q$ and so $\alpha \geq {\omega^{\beta}q} \geq {\omega^{\beta}{(2^n+1)}} \geq L_{\omega{\beta} + n}$ . Otherwise $q = 2^n$ and $n \leq \rho$ and so again $\alpha \geq L_{\omega{\beta} + n}$ .

In the second case of the definition of $n$ , $\left\lfloor {\log_2(q)} \right\rfloor \geq \rho + 1$ so $n \geq \rho \geq 0$ . $q$ is a power of 2 and more precisely $q = 2^{n+1} > 2^n$ so we deduce the same way as in the previous case that $\alpha \geq L_{\omega{\beta} + n}$ . Moreover, $\rho < n + 1$ so again $\alpha = {\omega^\beta 2^{n+1} + \rho} < L_{\omega{\beta} + n + 1}$ .

We can thus view $S_{\omega \beta + n + 1}$ as a concatenation of $S_{\omega \beta+n}$ , a comma, an opening brace, $S_{\omega \beta+n}$ and a closing brace. We assume that ${ {\omega^{\beta} {2^n}} + n + 2} \leq \alpha < {\omega^{\beta} 2^{n+1} + n}$ as the three other cases are handled by the lemma. Then $\delta$ is well-defined and is actually the index of $S[\alpha]$ in the second copy of $S_{\omega \beta + n}$ so ${S[\alpha]} = S{[\delta]}$ . □
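As a worked illustration of the proposition (the ordinal below is chosen for this sketch and is not taken from the statement above), take $\alpha = \omega 3 + 1$ . Then $\beta = 1$ , $q = 3$ and $\rho = 1$ . Since $q$ is not a power of 2, $n = \left\lfloor {\log_2(3)} \right\rfloor = 1$ . $\alpha$ is none of $\omega 2 + 1$ , $\omega 2 + 2$ or $\omega 4 + 1$ , so we are in the fourth bullet:

$\omega 3 + 1 = {\omega 2 + 1 + 2 + \delta} \text{ with } \delta = {\omega + 1} < {\omega 2 + 1}$

because $3 + \omega = \omega$ . Hence $S[\omega 3 + 1] = S[\omega + 1]$ , which is an opening brace by the example at the beginning of the post.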

We are now ready to give a nice way to evaluate $S$ at any ordinal $\alpha$ :

Theorem

$S[\alpha]$ can be calculated inductively as follows:

If $\alpha = 0$ , $S[\alpha]$ is the character "∅".
If $0 < \alpha < \omega$ then ${2^{\left\lfloor {\log_2{(\alpha+3)}} \right\rfloor} - 3} \leq \alpha \leq {2^{\left\lfloor {\log_2{(\alpha+3)}} \right\rfloor + 1} - 4}$ and $S[\alpha]$ is equal to:
- A comma if $\alpha = {2^{\left\lfloor {\log_2{(\alpha+3)}} \right\rfloor} - 3}$
- An opening brace if $\alpha = {2^{\left\lfloor {\log_2{(\alpha+3)}} \right\rfloor} - 2}$
- A closing brace if $\alpha = {2^{\left\lfloor {\log_2{(\alpha+3)}} \right\rfloor + 1}} - 4$
- $S[\delta]$ otherwise where $\delta = {\alpha - \left( 2^{\left\lfloor {\log_2{(\alpha+3)}} \right\rfloor} - 1 \right)} \leq {2^{\left\lfloor {\log_2{(\alpha+3)}} \right\rfloor} - 4} \leq \alpha - 3$ .
  Moreover $\delta$ compares against $\alpha$ as follows: $\frac{\delta}{\alpha} \leq \frac{1}{2} + \frac{1}{2^{\left\lfloor {\log_2{(\alpha+3)}} \right\rfloor + 1}} \leq \frac{3}{4}$ and ${\left\lfloor {\log_2{(\delta+3)}} \right\rfloor} < {\left\lfloor {\log_2{(\alpha+3)}} \right\rfloor}$
If $\alpha$ is infinite, let $0 < c_1 < \omega$ be the coefficient in its Cantor normal form corresponding to the smallest infinite term and $c_0 < \omega$ be its finite term if there is one, or zero otherwise. Let $k$ be the exponent corresponding to the smallest nonzero term in the binary decomposition of $c_1$ . Then $S[\alpha]$ is equal to:
- A closing brace if $c_0 \leq k - 1$ .
- A comma if $c_0 = k$ .
- An opening brace if $c_0 = k + 1$ .
- $S[{c_0 - {(k+2)}}]$ if $c_0 \geq k + 2$ .

Proof:

The case $\alpha = 0$ is clear. Suppose $0 < \alpha < \omega$ and let $l = {\left\lfloor {\log_2{(\alpha+3)}} \right\rfloor}$ . By definition $2^l \leq \alpha + 3 < 2^{l+1}$ so $2^l - 3 \leq \alpha \leq 2^{l+1} - 4$ . Let’s consider the case $2^l - 1 \leq \alpha \leq 2^{l+1} - 5$ as the three other cases are already known from the Lemma. $S_{l}$ is made of the concatenation of $S_{l-1}$ , a comma and $S_{l-1}$ surrounded by braces, where $S_{l-1}$ has length $2^l - 3$ . $\delta$ is actually the index of $S[\alpha]$ in the second copy of $S_{l-1}$ so ${S[\alpha]} = S{[\delta]}$ . The first inequality is straightforward:

$\delta = {\alpha - {(2^l -1)}} \leq {2^{l+1} - 5 - {(2^l -1)}} = {2^l - 4} = {2^l - 1 - 3}\leq \alpha - 3$

Moreover, since $\alpha \leq 2^{l+1} - 5$ , we have:

$\frac{\delta}{\alpha} \leq {1 - \frac{2^l - 1}{2^{l+1}-5}} \leq {1 - \frac{2^l - 1}{2^{l+1}}} = {\frac{1}{2} + \frac{1}{2^{l+1}}}$

and so the second inequality follows from the fact that $l \geq 1$ . Finally, the third inequality comes from:

$2^{\left\lfloor {\log_2{(\delta+3)}} \right\rfloor} \leq {\delta + 3} \leq {2^l - 4 + 3} < 2^l$

Let’s consider the case of an infinite $\alpha$ . For some $N \geq 1$ , $\beta_N > \beta_{N-1} > \dots > \beta_1 \geq 1$ and $c_0 < \omega$ and $0 < c_1, c_2, \dots, c_N < \omega$ , we can write its Cantor normal form as:

$\alpha = { \omega^{\beta_N} c_N} + { \omega^{\beta_{N-1}} c_{N-1} } + \dots + { \omega^{\beta_{1}} c_{1} } + c_0$

With the notation of the proposition, we have $\beta = \beta_N$ , $q = c_N$ and

$\rho = { \omega^{\beta_{N-1}} c_{N-1} } + \dots + { \omega^{\beta_{1}} c_{1} } + c_0$

If $N \geq 2$ , then $\rho$ is infinite and so $n = \left\lfloor {\log_2(q)} \right\rfloor$ and we are in the fourth bullet of the proposition. Moreover $q \geq 2^n$ and we can write:

$\alpha = { \omega^{\beta} 2^n} + {\omega^{\beta} \left(q-2^n\right)} + \rho = { \omega^{\beta} 2^n} + n + 2 + \delta$

where $\delta = {\omega^{\beta} \left(q-2^n\right)} + \rho$ is infinite and so cancels out the $n + 2$ term. It follows that $S{[\alpha]} = S\left[ {\omega^{\beta} \left(q-2^n\right)} + \rho \right]$ . Essentially, we have just removed from $c_{N}$ its term of highest exponent in its binary decomposition!

By repeated application of the proposition, we can remove each binary digit of the $c_i$ for $i$ going from $N$ to $2$ . We then arrive at $i = 1$ :

${S[\alpha]} = S\left[ \omega^{\beta_1} c_1 + c_0\right]$

With the notation of the proposition, we now have $\beta = \beta_1$ , $q = c_1$ and $\rho = c_0$ . If the binary decomposition of $c_1$ has more than one nonzero digit then $q$ is not a power of 2. So although $\rho$ is now finite, we are still in the first case of the proposition and $\delta$ remains infinite. So we can remove all but the last nonzero digit of $c_1$ by repeated application of the proposition:

${S[\alpha]} = S\left[ \omega^{\beta_1} 2^k + c_0\right]$

where $k$ is the exponent corresponding to the smallest nonzero term in the binary decomposition of $c_1$ .

Using the lemma, $S[\alpha]$ is a comma if $k = c_0$ , an opening brace if $k = c_0 - 1$ and a closing brace if $k = c_0 + 1$ .

If $k \leq c_0 - 2$ then $k = \left\lfloor {\log_2(2^k)} \right\rfloor \leq c_0$ and writing

${\omega^{\beta_1} 2^k + c_0} = \omega^{\beta_1} 2^k + k + 2 + \left(c_0 - {(k + 2)}\right)$

we deduce from the proposition that ${S[\alpha]} = {S[{c_0 - {(k+2)}}]}$ .

Finally, if $k \geq c_0 + 2$ then $k = \left\lfloor {\log_2(2^k)} \right\rfloor > c_0$ and writing

${\omega^{\beta_1} 2^k + c_0} = {\omega^{\beta_1} 2^{k-1}} + k - 1 + 2 + {\omega^{\beta_1} 2^{k-1}} + c_0$

we deduce from the proposition that

${S[\alpha]} = S\left[ \omega^{\beta_1} 2^{k-1} + c_0\right]$

We have essentially decremented $k$ and we can repeat this until we reach the case $k = c_0 + 1$ for which we already said that the character is a closing brace. □

As an application of this theorem, here are a few simple exercises:

Exercise 1

$S\left[2^{72} + 2\right]$ is the empty set character.
$S\left[\omega^{72}\right]$ is a comma.
$S\left[\omega^{72} 72 + 72\right]$ is a closing brace.
$S\left[\omega^{72} 72 + \omega^{42} 42 + 12\right]$ is an opening brace.

Exercise 2

The evaluation of $S$ at

Limit ordinals is either a comma or a closing brace.
Epsilon numbers is a comma.
Indecomposable ordinals is a comma.

Corollary 1: Time complexity

Let $\alpha$ be an ordinal. Let $c_1 < \omega$ be the coefficient in its Cantor normal form corresponding to the smallest infinite term if there is one, or zero otherwise. Let $c_0 < \omega$ be its finite term if there is one, or zero otherwise. Then $S[\alpha]$ can be evaluated in:

$O\left( \log_2{(c_0+2)}^2 + \log_2{(c_1+2)} \right)$ elementary arithmetic operations and comparisons on integers.
$O\left( \log_2{(c_0+2)}\right)$ elementary operations if integers are represented in binary and leading/trailing zero counting and bit shifts are elementary operations.

Proof: First note that the “+ 2” is just a workaround for the edge cases $c_0 = 0$ or $c_1 = 0$ .

For the infinite case $c_1 > 0$ we need to calculate $k$ , which amounts to a find-first-set operation. A naive implementation takes $O(\log_2{(c_1+2)})$ steps by browsing the binary digits of $c_1$ to find the first nonzero one, for example by calculating the remainder modulo increasing powers of 2. Each iteration requires only $O(1)$ elementary integer operations %, * and comparisons. Then returning the result or moving to the finite case requires $O(1)$ integer operations +, − and comparisons.

For the finite case $c_1 = 0$ and $\alpha = c_0$ , first notice that we only require $O\left( \log_2{(c_0+2)}\right)$ recursive calls given the inequality:

${\left\lfloor {\log_2{(\delta+3)}} \right\rfloor} < {\left\lfloor {\log_2{(\alpha+3)}} \right\rfloor}$

The case $\alpha = 0$ only requires one comparison. For the case $\alpha > 0$ , we need to calculate $l = {\left\lfloor {\log_2{(\alpha+3)}} \right\rfloor}$ , which is the integer rounding of the binary logarithm, or even just $2^l$ . As above, we can provide a naive implementation by calculating the quotient by increasing powers of 2 in $O(l)$ comparisons and elementary integer operations /, *. Then returning the result or moving to a $\delta < \alpha$ requires only $O(1)$ integer operations and comparisons. In total, the complexity is $O\left( \log_2{(c_0+2)}^2\right)$ .

Finally, this can be simplified if one calculates $k$ and $l$ by a simple leading/trailing zero counting (or similar) and $2^l$ by a bit shift. □
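The simplification can be sketched in JavaScript with `Math.clz32` (count leading zeros of a 32-bit integer) and bit shifts. The function name `charAt32` and the restriction to coefficients below $2^{31} - 3$ are assumptions of this sketch, made so that every value fits the 32-bit operators.

```javascript
// Sketch for 0 <= c1 and 0 <= c0 < 2^31 - 3, given as plain numbers.
// c1 = 0 means that the ordinal is finite and equal to c0.
function charAt32(c1, c0) {
  let a = c0;
  if (c1 > 0) {
    // k = index of the lowest set bit of c1 (trailing zero count).
    const k = 31 - Math.clz32(c1 & -c1);
    if (c0 < k) return "}";
    if (c0 === k) return ",";
    if (c0 === k + 1) return "{";
    a = c0 - (k + 2); // move to the finite case
  }
  while (a > 0) {
    // p = 2^l with l = floor(log2(a + 3)), via leading zero count and shift.
    const p = 1 << (31 - Math.clz32(a + 3));
    if (a === p - 3) return ",";
    if (a === p - 2) return "{";
    if (a === 2 * p - 4) return "}";
    a -= p - 1;
  }
  return "∅";
}

console.log(charAt32(1, 1)); // {  (position ω + 1)
```

The loop replaces the recursion of the theorem, and each iteration uses a constant number of operations, in line with the second bound of the corollary.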

Script based on Cantor normal form

[Interactive form: enter the coefficients $c_1$ and $c_0$ of the Cantor normal form of $\alpha$ corresponding to the smallest infinite term and to the finite term, to display the character of $S$ at position $\alpha$ .]
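A possible sketch of such a script, following the theorem, is given below. It is not the original page's code: the function names are illustrative, the coefficients are passed as BigInt values, and `c1 = 0n` is used here as a convention for a finite ordinal.

```javascript
// floor(log2(x)) for a positive BigInt x.
function floorLog2(x) {
  return BigInt(x.toString(2).length - 1);
}

// Character of S at a finite position (non-negative BigInt).
function charAtFinite(alpha) {
  if (alpha === 0n) return "∅";
  const p = 1n << floorLog2(alpha + 3n); // 2^l
  if (alpha === p - 3n) return ",";
  if (alpha === p - 2n) return "{";
  if (alpha === 2n * p - 4n) return "}";
  return charAtFinite(alpha - (p - 1n));
}

// Exponent of the smallest nonzero binary digit of a positive BigInt.
function trailingZeros(x) {
  let k = 0n;
  while ((x & 1n) === 0n) {
    x >>= 1n;
    k++;
  }
  return k;
}

// S[α] where c1 is the coefficient of the smallest infinite term of α
// (0n if α is finite) and c0 is its finite term.
function charAtCantor(c1, c0) {
  if (c1 === 0n) return charAtFinite(c0);
  const k = trailingZeros(c1);
  if (c0 < k) return "}";
  if (c0 === k) return ",";
  if (c0 === k + 1n) return "{";
  return charAtFinite(c0 - (k + 2n));
}

console.log(charAtCantor(1n, 0n)); // ,  (position ω)
console.log(charAtCantor(2n, 0n)); // }  (position ω2)
```

Only $c_1$ and $c_0$ are needed as inputs because, as shown in the proof of the theorem, all the other terms of the Cantor normal form are removed without changing the character.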

Once we have an algorithm for S.charAt(α), it is easy to get an algorithm of $N$ times that complexity for S.substr(α, N) calculating the substring of length $N$ starting at position $\alpha$ by repeated calls to S.charAt(α). Let’s analyze a bit more carefully how we can make this recursive and more efficient:

Corollary 2: Algorithm for S.substr(α, N)

Let $\alpha$ be an ordinal and $N < \omega$ . Then S.substr(α, N) can be calculated as follows:

If $N = 0$ then it is the empty string.
If $\alpha = 0$ and $N = 1$ then it is the character "∅".
If $0 < \alpha < \omega$ , let $l = \left\lfloor {\log_2{(\alpha+N+2)}} \right\rfloor$ . We have $2^l-3 \leq {\alpha + N - 1} \leq 2^{l+1} - 4$ and the result is obtained by concatenating the following strings:
1. If $\alpha \leq 2^l - 4$ , the substring at offset $\alpha$ and length $2^l - 3 - \alpha$ .
2. If $\alpha \leq {2^l - 3}$ , a comma.
3. If $\alpha \leq {2^l - 2} \leq {\alpha + N - 1}$ , an opening brace.
4. If ${\alpha + N - 1} \geq {2^l - 1}$ , the substring at offset $\delta = \mathrm{max}{(0, \alpha - {(2^l - 1)})}$ and length $1 + \mathrm{min}{({\alpha + N - 1}, {2^{l+1} - 5})} - \mathrm{max}{(\alpha, {(2^l - 1)})}$ .
5. If ${\alpha + N - 1} = {2^{l+1} - 4}$ , a closing brace.
If $\alpha$ is infinite, let $0 < c_1 < \omega$ be the coefficient in its Cantor normal form corresponding to the smallest infinite term and $c_0 < \omega$ be its finite term if there is one, or zero otherwise. Let $k$ be the exponent corresponding to the smallest nonzero term in the binary decomposition of $c_1$ . Then the result is obtained by concatenating the following strings:
1. If $c_0 \leq k - 1$ , ${\mathrm{min}{(N, k - c_0)}}$ closing braces.
2. If $c_0 \leq k \leq c_0 + N - 1$ , a comma.
3. If $c_0 \leq k + 1 \leq c_0 + N - 1$ , an opening brace.
4. If $c_0 + N - 1 \geq k + 2$ , the substring at offset $\delta = \mathrm{max}{({0, {c_0 - {(k + 2)}}})}$ and length $c_0 + N - {\mathrm{max}{(c_0, k + 2)}}$ .

Moreover, this only adds $O(N)$ compared to the complexity of evaluating at a single offset.

Proof: The algorithm is just a direct application of the Theorem. For the case where $\alpha \geq \omega$ , the only change is that we add at most $N$ characters before moving to the finite case. The case $\alpha < \omega$ is essentially a divide-and-conquer algorithm and we have a relation of the form:

${T(L)} = 2{T\left(\frac{L}{2}\right)} + {f(L)}$

where $L = 2^l = {\Theta{(\alpha + N)}}$ and $f{(L)}$ is $O(1)$ or $O(\log_2{L})$ depending on available operations, but in any case $O{(\sqrt{L})}$ . So from the master theorem, ${T{(L)}} = {O(L)} = {O(\alpha+N)}$ . In general, this bound is not as good as repeating $N$ calls to S.charAt!

However, we note that if we assume that $\alpha > N - 4$ then ${\alpha + 2 + N} < {2{(\alpha+3)}}$ and so

$\log_2\left(\alpha + 2 + N\right) < {1 + \log_2{(\alpha+3)}}$

We can easily discard the edge case where these start/end offsets point to a brace surrounding the right substring of the iterative step and so we get:

$\left\lfloor {\log_2{(\alpha+N+2)}} \right\rfloor = \left\lfloor {\log_2{(\alpha+3)}} \right\rfloor$

which means that the left substring is just empty and so the complexity is not changed compared to S.charAt. Finally, when $\alpha \leq N - 4$ the previous bound tells us that the steps are done in $O(N)$ . □

Script for S.substr(α, N)
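A minimal sketch of the algorithm of Corollary 2 is given below. It is not the original page's script: the function names are illustrative, inputs are plain numbers assumed to satisfy $c_0 + N < 2^{31} - 3$ , and `c1 = 0` is used as a convention for a finite ordinal. The finite case is applied whenever $\alpha + N - 1 > 0$ , which also covers $\alpha = 0$ with $N > 1$ .

```javascript
// Substring of S of length N starting at the finite position alpha.
function substrFinite(alpha, N) {
  if (N === 0) return "";
  const last = alpha + N - 1; // position of the last requested character
  if (last === 0) return "∅";
  const p = 1 << (31 - Math.clz32(last + 3)); // 2^l, p - 3 <= last <= 2p - 4
  let out = "";
  if (alpha <= p - 4) out += substrFinite(alpha, p - 3 - alpha);
  if (alpha <= p - 3) out += ",";
  if (alpha <= p - 2 && p - 2 <= last) out += "{";
  if (last >= p - 1) {
    const start = Math.max(alpha, p - 1);
    const end = Math.min(last, 2 * p - 5);
    out += substrFinite(start - (p - 1), end - start + 1);
  }
  if (last === 2 * p - 4) out += "}";
  return out;
}

// Same for α with Cantor coefficients c1 (smallest infinite term) and c0.
function substrCantor(c1, c0, N) {
  if (c1 === 0) return substrFinite(c0, N);
  if (N === 0) return "";
  const k = 31 - Math.clz32(c1 & -c1); // lowest set bit of c1
  const last = c0 + N - 1;
  let out = "";
  if (c0 <= k - 1) out += "}".repeat(Math.min(N, k - c0));
  if (c0 <= k && k <= last) out += ",";
  if (c0 <= k + 1 && k + 1 <= last) out += "{";
  if (last >= k + 2) {
    const start = Math.max(c0, k + 2);
    out += substrFinite(start - (k + 2), c0 + N - start);
  }
  return out;
}

console.log(substrFinite(0, 5)); // ∅,{∅}
console.log(substrCantor(1, 0, 4)); // ,{∅,  (positions ω to ω + 3)
```

When the requested range does not reach the first copy, the left recursive call is skipped, which is the situation analyzed at the end of the proof.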