So every once in a while I play around with the Ackermann Function. It is a brutally recursive algorithm that grows very, very fast. Ackermann(3,3) is trivial even on old hardware. Ackermann(4,3) in its straightforward form can take longer than the lifespan of the universe on typical hardware, and will likely outlast the best hardware.

So, I like to try to optimize it. One trick I picked up from this medium post Computing Ackermann function in Ruby by Aliser Zakir is to directly compute smaller values. Basically, there are simple, entirely non recursive algorithms for m=0 through m=2. This will cut out a lot of the recursion and return many values with simple addition and multiplication.

Using this brought ackermann(4,2) back in several seconds in SML/NJ, and with a switch to C++ and GMP, into the realm of tens of milliseconds. But while an impressive speedup from my previous “never going to happen ever”, I wanted to try pushing it further.

Direct computation of m=4 requires exponentian, the problem being that this is itself a recursive process. Straightforward exponentian algorithms are a problem because they are themselves recursive(though less brutally so than Ackermann) which limits the potential speedup. But here we only need powers of 2, which lead to my initial attempt of using a bit shift to emulate that.

This didn’t work, so I dug further into the GMP documentation and tried its power function. This got me to just under 1 millisecond.

Then I was thinking, did I do anything wrong in the bit shift? I did. I used long integers. What if I did the bitshift with unsigned long? So, back to my code, I implemented this.

Got a fast result, but it was wrong. I was overflowing.

I’ve tried getting to (4,3), but this optimized version returns -2 and a less optimized version either segfaults or locks up my computer, depending on how much stack space I give it. I’m pretty sure that I’d need to directly compute m=4 to get anywhere past (4,2).

So, back to the GMP power function and here’s my code. Ackermann(4,2) in just under a millisecond. There are some other optimizations on the page I took this idea from that i might try at some point.

```
include <gmpxx.h>
include <iostream>
include <chrono>
mpz_class mpz_pow_two(long int n)
{
mpz_t r;
mpz_init(r);
mpz_ui_pow_ui(r, 2, n);
mpz_class result(r);
return result;
}
mpz_class ackermann(mpz_class m, mpz_class n)
{
if(m == 0){
return n+1;
}else if(m == 1){
return n+2;
}else if(m == 2){
return 2 * n + 3;
}else if(m == 3){
return mpz_pow_two(n.get_ui() + 3) - 3;
}else if(n==0){
return ackermann(m-1, 1);
}else{
return ackermann(m-1, ackermann(m, n-1));
}
}
int main(){
auto begin = std::chrono::high_resolution_clock::now();
mpz_class result = ackermann(4,2);
auto end = std::chrono::high_resolution_clock::now();
auto elapsed = std::chrono::duration_cast(end-begin);
std::cout << "Answer: " << result << '\n';
std::cout << "Time: " << elapsed.count() << '\n';
return 0;
}
```