I understand why a non-standard, compiler-specific int128 was not used (besides being non-portable, the point of the article is to walk through an implementation), but why use
> using u64 = unsigned long long;
? Although in practice this is _usually_ an unsigned 64-bit integer, the C++ Standard does not technically guarantee it; all it says is that the type needs to be _at least_ 64 bits wide. [0]
I would use `std::uint64_t`, which guarantees a type of exactly that width, provided the platform supports it. [1]
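For illustration, this is the whole change I have in mind (the alias name `u64` is the article's; everything else is standard `<cstdint>`):

```cpp
#include <cstdint>

// std::uint64_t is exactly 64 bits with no padding bits, whereas
// unsigned long long is only required to be *at least* 64 bits.
// On a platform without a fixed-width 64-bit type this alias simply
// fails to compile, which beats silently getting a wider type.
using u64 = std::uint64_t;
```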
Re: "Multiplication: regrouping our u64 digits"
I am aware that more advanced and faster algorithms exist, but I wonder if something simple like Karatsuba's algorithm [2], which uses 3 multiplications instead of 4, could be a quick win for performance over the naive method used in the article. Though since it was mentioned that the compiler-specific unsigned 128-bit integers' multiplication closely resembles the one implemented in the article, I suppose there must be a reason for that method to be used instead, or something I missed that makes Karatsuba unsuitable here.
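To make the trade-off concrete, here is a rough sketch of what the 3-multiplication trick looks like one level down, for a 64x64 -> 128-bit product built from 32-bit halves (the `u128` struct and function name are mine, not the article's; the same structure, and the same carry headaches, would apply at the article's u64-digit level):

```cpp
#include <cstdint>

struct u128 { std::uint64_t lo, hi; };  // hypothetical result type

// Karatsuba-style 64x64 -> 128-bit multiply: three 32x32 -> 64-bit
// products instead of four. a = a1*2^32 + a0, b = b1*2^32 + b0.
u128 mul_karatsuba(std::uint64_t a, std::uint64_t b) {
    const std::uint64_t M = 0xFFFFFFFFu;
    const std::uint64_t a0 = a & M, a1 = a >> 32;
    const std::uint64_t b0 = b & M, b1 = b >> 32;

    const std::uint64_t z0 = a0 * b0;  // multiply #1
    const std::uint64_t z2 = a1 * b1;  // multiply #2

    // z1 = a0*b1 + a1*b0 = (a0+a1)*(b0+b1) - z0 - z2. The catch: the
    // sums are 33-bit values, so their product can reach 66 bits and
    // must itself be tracked as a (hi, lo) pair.
    const std::uint64_t sa = a0 + a1, sb = b0 + b1;
    const std::uint64_t p = (sa & M) * (sb & M);  // multiply #3
    std::uint64_t hi = (sa >> 32) & (sb >> 32);   // 2^64 term (0 or 1)
    const std::uint64_t mid = ((sa >> 32) ? (sb & M) : 0)
                            + ((sb >> 32) ? (sa & M) : 0);
    std::uint64_t lo = p + (mid << 32);
    hi += (mid >> 32) + (lo < p);                 // carry from the add
    hi -= (lo < z0); lo -= z0;                    // (hi:lo) -= z0
    hi -= (lo < z2); lo -= z2;                    // (hi:lo) -= z2, now z1

    // result = z2*2^64 + z1*2^32 + z0
    u128 r;
    r.lo = z0 + (lo << 32);
    r.hi = z2 + (lo >> 32) + (hi << 32) + (r.lo < z0);
    return r;
}
```

The saved multiplication is bought with a dozen or so extra adds, shifts, and compares, and a 64x64 multiply is only a few cycles on modern hardware, so I would not be surprised if this loses to the naive 4-multiplication version at this width; Karatsuba tends to pay off only for much larger operands.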
Speaking of which, I would be interested to see how all these operations fare against the compiler-specific implementations (as well as how the different compilers compare with each other) [3]. The article only briefly mentions that its multiplication method is similar to that of the builtin `__uint128_t` [4], but does not go into detail or discuss similarities/differences for the other arithmetic operations.
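In the meantime, a quick way to eyeball the codegen on Compiler Explorer, for anyone curious (GCC/Clang only, since `__uint128_t` is their extension; MSVC exposes intrinsics like `_umul128` from `<intrin.h>` instead):

```cpp
#include <cstdint>

// High 64 bits of a 64x64 product via the builtin. On x86-64, GCC and
// Clang typically lower this to a single mul/mulx instruction, which
// is the baseline any hand-rolled version is competing against.
std::uint64_t mulhi_builtin(std::uint64_t a, std::uint64_t b) {
    return static_cast<std::uint64_t>(
        (static_cast<__uint128_t>(a) * b) >> 64);
}
```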
[0] https://en.cppreference.com/w/cpp/language/types.html (The official standard must be purchased, which is why I did not reference it directly; the relevant wording is in [basic.fundamental].)
[1] https://en.cppreference.com/w/cpp/types/integer.html
[2] https://en.wikipedia.org/wiki/Karatsuba_algorithm
[3] I suppose I could see for myself using godbolt, but I would like to see some commentary/discussion on this.
[4] And did not state for which compiler, though `__uint128_t` is the GCC/Clang spelling; MSVC does not provide a 128-bit integer type at all, only intrinsics such as `_umul128`.