Thursday, September 19, 2024

Random Numbers

Random numbers are important for computers. Aside from making games like Solitaire more interesting, the use of randomness in generating passwords and encrypting data is critical to security.

Until fairly recently, cpu’s had no direct way to generate random numbers. Intel’s Pentium III introduced a hardware random number generator that uses thermal noise “to generate high-quality random and nondeterministic numbers” , but prior to that systems that needed good random numbers had to relay on add-on boards or other external input.

The problem, of course, is that the whole point of computer programming is predicability: given the same input, a program will always follow the same path to completion. That makes it impossible to generate random numbers with an algorithm.

For all we know, it may be impossible, period. While thermal noise from a semiconductor certainly seems random to us, if we really understood the physics, it might not be at all. If string theory is real, then the arrangement and orientation of these “strings” might affect their vibration, which in turn affects the behavior of electrons, etc. It’s quite possible that the universe is deterministic – we really aren’t in any position to even guess right now.

The real question is, does it matter? And the answer is both “yes, it does” and “no, it doesn’t”, depending on why you asked. On the one hand, we have apparently non-deterministic events like how many visitors this web site will have today. That’s an apparently very random number, determined by perhaps hundreds of conflicting factors such as news events, advent of security challenges, the weather, Google rankings: well beyond any ability to calculate. Yet, it’s pretty easy for me to guess within a few hundred what that number will be because in large enough populations, the random behavior of individuals is unimportant: statistics and other indicators can tell us what the general group behaviour will be. From this point of view, it doesn’t matter if individuals are non-deterministic because group behavior is not.

On the other side, things that really aren’t random at all can be quite random enough for a particular need. For example, if you had to “randomly” select from a group of 60 items using a computer program, a very simple way to do it would be to use seconds from the current time: if your user runs the program at 23 seconds past the minute, the program selects item 23. This isn’t random at all; it’s very deterministic. If there was any reason for your users to care about which of the sixy items were selected, they could find ways to “cheat” your program fairly easily. But if it isn’t important, the selection may in fact be quite random enough for your needs. The underlying method isn’t random, but it appears to be for the expected use, so the algorithm works. Apparent randomness from a very non-random source.

That is the problem with computer generated “random” numbers: they may not be very random at all. Almost all computer languages have some method of selecting a random number, but in fact it’s not really random. This Perl script will generate the same results every time it is run:

If you saw that run once, you’d think it was giving you random numbers. Run it again and you get the same ten numbers. The numbers will not change no matter how many times you run it, so obviously they aren’t random at all. That’s because it is really only pseudorandom. The argument given to srand seeds the random number generator, and it will always generate the same numbers in the same order for the same starting seed. However if we seed “srand” with something more random (or more apparently random), we get better results:

This has every appearance of generating random numbers. Yet, a cryptographer would warn you not to rely on that for encrypting bank accounts, because just like our simplistic selection from the sixty items, the seed actually isn’t random and given sufficient incentive, that flaw could be very costly.

So how do you get more random? Linux and other operating systems may provide a “/dev/random” and “/dev/urandom”. Don’t assume you know how these work: Mac OS X and Linux both have these, but the man pages are quite different. If your use is casual (you aren’t protecting Fort Knox), the differences are probably unimportant, but “man 4 random” is interesting reading anyway.

Perl has a “TrulyRandom” module that uses interrupt timing discrepancies to get random numbers; but do we know that the discrepancies are truly non-deterministic? Like the thermal activity of a semiconductor, it might actually be predictable. With sufficient knowledge and computing power, any of these things might be as trivial as our “random second” implementation.

Many systems have the PRNG library – many versions of SSH use that. If you’ve ever generated a GPG key, you probably remember this:

We need to generate a lot of random bytes. It is a good idea to perform some other action (type on the keyboard, move the mouse, utilize the disks) during the prime generation; this gives the random number generator a better chance to gain enough entropy.

These unpredicable actions on your part were used to seed the random number generation.

Much attention is paid to the importance of avoiding predicability in random numbers, and yet as we have seen from the most recent ssh exploits, the random numbers used in encryption are only a very small part of the total picture. It was other weaknesses in the ssh code that were exploited, and the encrypted keys were not important at all. The thickest deadbolts on your front door are useless if the back door is wide open.

A.P. Lawrence provides SCO Unix and Linux consulting services http://www.pcunix.com

Related Articles

LEAVE A REPLY

Please enter your comment!
Please enter your name here

Latest Articles