6 Comments
skybrian

I'm reminded of random number generators. Designing them seems to be a dark art, but one thing they try to do is to maximize the period. Eventually every finite state machine must repeat, but by having a long period, they put that off as long as possible.

From: https://www.pcg-random.org/rng-basics.html

> A pseudo-random number generator is like a big codebook. In fact, a 32-bit generator with a period of 2^32 is like a codebook that is 16 GB in size, which is larger than the compressed download for the whole of Wikipedia. But a 32-bit generator with a period of 2^64 is like 64 EiB (exbibytes), or about six times a mid-1999 estimate of the information content of all human knowledge. How big would an actual book of 2^64 random numbers be? It would be so large, it would require the entire biomass of planet earth to make!

> One difference from a spy's codebook is that in a typical random generator everyone has the book. Thus, whereas spies would keep their codebook secret and could allow anyone to know the page number, random number generators do the opposite—the sequence of the generator (the codebook in our analogy) is known, but if we want to avoid prediction, no one must know where in that sequence we are looking. If the sequence is large enough (and there is no discernible pattern to the numbers), it will be infeasible to figure out what numbers are coming next.
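To make the codebook picture concrete, here's a toy sketch of my own (not PCG, just the simplest possible linear congruential generator with a deliberately tiny modulus), showing that the whole "book" is finite and the sequence must eventually cycle:

```python
# Toy linear congruential generator (LCG): x_{k+1} = (a*x_k + c) mod m.
# With m = 2**8 the whole "codebook" has at most 256 entries, so the output
# sequence has to start repeating once every state has been visited.
def lcg(seed, a=141, c=3, m=2**8):
    x = seed
    while True:
        x = (a * x + c) % m
        yield x

seen = {}
for step, value in enumerate(lcg(seed=42)):
    if value in seen:
        print(f"period = {step - seen[value]} out of at most {2**8} states")
        break
    seen[value] = step
```

These particular constants give the full 256-step period; real generators do the same thing with a state space so large that the cycle can never be exhausted in practice.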

Also, if you have a large enough state space and a clever design, you can set up a random number generator to generate Shakespeare:

From: https://www.pcg-random.org/party-tricks.html

> An argument for k-dimensional equidistribution goes like this: Suppose you bought a lottery ticket every week—how would you feel if you discovered that the clerk at the store was handing you a fake ticket and pocketing the money because, at 259 million to one, you were never going to win anyway. You might, rightly, feel cheated. Thus as unlikely as any particular k-tuple might be, we ought to have a real chance, however remote, of producing it.

> An argument against providing k-dimensional equidistribution (for large k) is that infinitesimal probabilities aren't worth worrying about. You probably aren't going to win the lottery, and your monkeys won't write Shakespeare.

> But what if rather than blindly search for Shakespeare, we manufacture a generator that is just on the cusp of producing it. [...]

Jacob Bayless

There's a close connection! Many pseudorandom number generators are built using chaotic maps (which is what the pure-math people call chaotic systems). They tend to be discrete-time systems which means that they can get away with fewer than 3 dimensions (because the state vector can "teleport" without crossing itself) but they repeat eventually because the state vector has finite resolution.
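As a toy illustration of that (my own sketch, not any production design): quantize the logistic map, a standard chaotic map, to a handful of bits and it becomes a finite state machine whose orbit quickly falls onto a short cycle.

```python
# Logistic map x -> 4x(1-x), quantized to `bits` of resolution. Quantization
# turns it into a finite state machine, so every orbit is eventually periodic,
# usually with a period far shorter than the 2**bits available states.
def quantized_logistic(state, bits=16):
    m = 2 ** bits
    x = state / m                 # integer state -> point in [0, 1)
    x = 4.0 * x * (1.0 - x)       # one chaotic step
    return int(x * m) % m         # back to an integer state

state, seen = 12345, {}
for step in range(1 << 20):
    if state in seen:
        print(f"entered a cycle of length {step - seen[state]} "
              f"after {seen[state]} steps")
        break
    seen[state] = step
    state = quantized_logistic(state)
```

That short cycle is exactly why naive chaotic-map PRNGs are tricky: the design effort goes into making the effective period enormous.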

It's definitely a dark art, and not only do you want the period to be as long as possible, you also want to make sure that every finite-length sub-sequence reoccurs as often as possible. Otherwise, an attacker could try to learn the state vector and guess the next random number by recording the previous k samples, and matching that pattern to some unique index, as you mentioned.
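Here's a minimal sketch of that attack on a toy generator (my own example; real attacks on real generators are more involved). Because a raw LCG's output is its entire state, even a short run of samples pins down a unique position in the public sequence:

```python
# Toy prediction "attack": the generator design (the codebook) is public, so
# matching observed samples against it reveals where in the sequence we are.
def lcg_sequence(seed, a=141, c=3, m=2**8):
    out, x = [], seed
    for _ in range(m):                    # one full period of the toy LCG
        x = (a * x + c) % m
        out.append(x)
    return out

book = lcg_sequence(seed=0)               # attacker precomputes the sequence
observed = book[100:103]                  # victim leaks k = 3 consecutive outputs
matches = [i for i in range(len(book))
           if book[i:i + 3] == observed]  # match the pattern to an index
assert len(matches) == 1                  # a raw LCG leaks its full state, so
i = matches[0]                            # even one sample would have sufficed
print("predicted next output:", book[(i + 3) % len(book)])
```

This is why generators that want to resist prediction keep part of the state hidden behind a mixer.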

RNGs don't really have control inputs as far as I'm aware, so there's not much we can do to apply feedback to guide their future evolution into some desirable state (and that's probably a good thing...), but it's definitely an interesting application of chaos theory!

This suggests another interesting question -- with a chaotic system, we may struggle to predict the future state due to finite measurement precision of the present state. But what if we can supplement that with imprecise measurements of several previous states? Sensitivity to initial conditions goes both ways (for positive and negative time), meaning that other states within measurement error of the present state would have very different past histories than the trajectory we've observed; therefore we can rule them out as possibilities, and be more certain about the true state. If the system has "cryptographically-secure" chaos, that may not help much, but perhaps there are practical chaotic systems where we can radically improve our predictive power this way. I'm not sure.
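Here's a rough numerical sketch of that idea (my own construction; I'm using Arnold's cat map as the stand-in chaotic system because it's invertible, so "running a candidate state backwards" is well defined). Candidate present states are kept only if their back-iterates stay within measurement error of each past observation, and the candidate set shrinks rapidly:

```python
import numpy as np

# Arnold's cat map on the unit torus: a standard invertible chaotic map,
# standing in for "a chaotic system we can only measure imprecisely".
A     = np.array([[2, 1], [1, 1]])     # forward step (taken mod 1)
A_inv = np.array([[1, -1], [-1, 2]])   # exact inverse (det A = 1)

def torus_dist(p, q):
    """Per-coordinate distance on the torus (handles wrap-around)."""
    d = np.abs(p - q) % 1.0
    return np.minimum(d, 1.0 - d)

rng, eps, k = np.random.default_rng(1), 1e-3, 6   # precision and history length

# Hidden true trajectory ending at the "present", plus imprecise measurements.
traj = [np.array([0.2137, 0.5813])]
for _ in range(k):
    traj.append((A @ traj[-1]) % 1.0)
obs = [p + rng.uniform(-eps, eps, 2) for p in traj]   # obs[-1] is "now"

# Candidate present states: everything consistent with the present measurement.
present = obs[-1] + rng.uniform(-eps, eps, size=(200_000, 2))
back = present.copy()
print("consistent with the present measurement alone:", len(present))

# Keep only candidates whose back-iterates also agree with past measurements.
for j in range(1, k + 1):
    back = (back @ A_inv.T) % 1.0
    keep = np.all(torus_dist(back, obs[-1 - j]) <= eps, axis=1)
    present, back = present[keep], back[keep]
    print(f"...and with the measurement {j} steps ago: {len(present)}")
```

The surviving set shrinks along the direction that expands backwards in time, which is the "very different past histories" effect; as noted, a system with "cryptographically-secure" chaos would presumably be designed so that this kind of filtering doesn't help.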

Looking for Shakespeare in a PRNG also reminds me of looking for sequences within the digits of pi. I suppose some programs that calculate pi's digits can be formulated as a recurrence relationship, so maybe there's a family of pi-calculating chaotic maps...?
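Half-seriously, one "pi-emitting chaotic map" already exists (my own framing): the base-16 shift map x -> frac(16x) is a textbook chaotic map, and seeded with the fractional part of pi it spits out pi's hexadecimal digits. It doesn't compute pi from scratch - you have to feed it pi to high precision - and finite resolution kills it after a few dozen steps, which is the same limitation as above. A sketch, assuming the mpmath library for the extra precision:

```python
from mpmath import mp, floor

mp.dps = 60                      # ~60 decimal digits of working precision
x = +mp.pi - 3                   # fractional part of pi

# The base-16 shift map x -> frac(16*x). Each step multiplies the error by 16,
# so with 60 decimal digits only the first ~50 hex digits are trustworthy.
digits = []
for _ in range(40):
    x = 16 * x
    d = int(floor(x))
    digits.append("0123456789ABCDEF"[d])
    x = x - d

print("pi = 3." + "".join(digits) + "... (hex)")
```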

skybrian

In this case, the trick is that PCG has an “extended” generation scheme with a large state space that consists of many linear congruential generators. Each of these can be set up to reach a particular value in a given number of steps, which, when combined, output a particular value.
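Here's a stripped-down version of the trick for a single LCG (a sketch of mine - the real PCG extension scheme combines many such generators plus a permutation on top). An LCG step is invertible mod m, so you can choose the value you want at step n and walk backwards to find the seed that gets you there:

```python
# One LCG step and its inverse: x_{k+1} = (a*x_k + c) mod m, and since a is
# odd (hence invertible mod 2**64), x_k = a^{-1} * (x_{k+1} - c) mod m.
a, c, m = 6364136223846793005, 1442695040888963407, 2**64   # Knuth/MMIX constants
a_inv = pow(a, -1, m)                                       # Python 3.8+

def step(x):   return (a * x + c) % m
def unstep(x): return (a_inv * (x - c)) % m

target, n = 0xDEADBEEF, 1000     # the value we want the state to hit at step n
seed = target
for _ in range(n):
    seed = unstep(seed)          # walk backwards n steps from the target

x = seed                         # sanity check: forward n steps hits the target
for _ in range(n):
    x = step(x)
assert x == target
print(f"seed {seed:#x} reaches {target:#x} at step {n}")
```

Do that independently for each sub-generator and you can arrange for a chosen block of output - Shakespeare included - to appear at a chosen point in the stream.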

Unlike some generators, PCG doesn’t reveal all of its internal state in the random number sequence - there is a mixer function. So this predictability isn’t as bad as it might seem.

There are also “counter-based” random number generators where the internal state is trivially predictable (it’s a simple counter), but the mixer function is complicated and difficult to reverse. A cryptographically secure random number generator can be counter-based if the mixer is good enough - a secure hash. These are too slow for some purposes, though.
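A minimal sketch of the counter-based idea (my own toy, not any standardized construction like a NIST DRBG or Philox): the state is literally a counter, and all of the unpredictability lives in the one-way mixer, here SHA-256 from Python's hashlib.

```python
import hashlib

def counter_rng(key: bytes, counter: int) -> int:
    """64-bit output: the state (the counter) is trivially predictable, but
    going from outputs back to the key, or to outputs you haven't seen yet,
    means defeating the hash."""
    block = hashlib.sha256(key + counter.to_bytes(16, "big")).digest()
    return int.from_bytes(block[:8], "big")

key = b"some secret key material"   # hypothetical key; must stay private
for counter in range(5):            # any position in the stream is O(1) away
    print(counter, hex(counter_rng(key, counter)))
```

The hash per output is also why these can be slow, and the counter is why you get cheap random access into the stream.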

Regarding the repetition of subsequences, the goal seems to be to have them repeat as often as they normally would by chance for a true random number generator. However, there’s a limit to how much of the period you can use - another reason why having a large enough period is important.

Patrick M Brennan

God may not play dice, but she does play billiards.

Fantastic article, I'm going to come back and savor it.

Jeff Houlahan

Just published the first instalment of a paper on chaotic dynamics at https://sportsscienceandfiction.substack.com/p/science-chaotic-dynamics. You can download the full manuscript at www.jeffhoulahanecologist.com

Jasper

> You may be wondering, what if a 2D trajectory follows a space-filling curve? The ultimate solution to the game of snake, it’s an infinite curve that packs in a finite area and never needs to self-intersect.

I'm not a pro mathematician, but I think that I have an answer to this question. A "space filling curve", defined as a continuous surjection from R^1 (or some interval on it) to R^2 (or some connected open subset of it), cannot be injective. A continuous bijection from a compact interval onto a Hausdorff space (such as a subset of the plane) is a homeomorphism, i.e. an equivalence of topological spaces. But lines and planes have different topological properties and thus are not topologically equivalent. For example, you can divide a line into two disconnected components by removing a single point, but you cannot do the same thing with the plane. Since the map cannot be injective, multiple points on the line map to the same point in the plane, and the curve must self-intersect.
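For completeness, the argument compresses to a few lines (stated for a closed interval, which is where the homeomorphism step is cleanest):

```latex
\textbf{Claim.} There is no continuous bijection $f\colon [0,1]\to U$ onto a
connected open set $U\subseteq\mathbb{R}^2$.

\textbf{Sketch.} $[0,1]$ is compact and $U$ is Hausdorff, so such an $f$ would
be a homeomorphism. Pick $t\in(0,1)$: then $[0,1]\setminus\{t\}$ is
disconnected, while $U\setminus\{f(t)\}$ remains connected (deleting a single
point cannot disconnect a connected open subset of the plane). A homeomorphism
carries $[0,1]\setminus\{t\}$ onto $U\setminus\{f(t)\}$ and preserves
connectedness, a contradiction. Hence a space filling curve must
self-intersect.
```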

There is a class of curves known as Osgood curves that fill a set of positive area without self-intersecting, but they are not space filling curves because they do not cover any open subset of the plane. An Osgood trajectory could be dense, that is, pass arbitrarily close to any given point, but it will not pass through almost every point.

A more fundamental issue with space filling curves is that we expect natural processes to be both continuous and differentiable, especially if we want to describe them with differential equations! Space filling curves are fractal and cannot be differentiable everywhere. So a differential equation describing a space filling curve or an Osgood curve must be filled with singularities. Calling such a system "second order" would be a bit misleading, since the behaviour would be dominated by an infinite number of singularities.

Thank you for the thought provoking article!