Even with all of the cryptologic and cryptographic technology that has existed in the world for the past 60 years, we still don't really know what encryption is good for or how to use it -- or, more importantly, why it's important. Maybe it's time for people and coders to actually start practicing how to use it, like any other skill.
"What are the important problems in cryptography?" I'm asking myself (and you, gentle reader) this particular question because it exposes what I think is a disconnect between cryptographic practice and cryptographic need. (That, and I got to reading, a few nights back, on the experiences of a software engineer at Bell Labs.)
There are two use cases that people generally consider:
1) The corporate and governmental need to ensure that business is transacted securely, that they are not subject to some annoying hack/attack that could cause issues. In this case, a single strongly-bound identity seems to make sense. I've expressed earlier in this blog why I think that's a myth.
2) The "Chinese Dissident" use-case. In this case, a victim of human-rights violations wishes to communicate with other dissidents without the government (which is perpetrating the human-rights violations) knowing about it, or at least without the government being able to identify who is sending what, or even what is being sent. In this use-case, a strongly-bound identity is neither acceptable nor desirable. (But then again, we left the cryptographic standards up to the governments and the corporations which flourish under government. They never imagined that the second use-case could possibly be appropriate; "dissidents usually are less tech-savvy and don't have access to the same type of equipment" seems to be the view that pervades that mindset. So they never created a mechanism in their standards to support it.)
But these are the boundary cases. These are the edges of what people do -- in our (Western) consumerist culture, we're taught to look at ourselves as 'consumers' -- that is, people who don't have anything to do except move money from one corporation to another, and then consume what is purchased so that it isn't there anymore. We're taught that 'disposable' is okay, because "we're consuming its usefulness". We're taught (especially in computing) that computers become obsolete and useless after 2 years. We're taught to want the latest and greatest... and don't worry about the messy details of life, the things we're losing, the rights and thoughts we're losing.
And in the US, we're finding that the assumptions that we've been making about our civil liberties aren't accurate. "I don't have anything to hide" -- but we still get uneasy when we find that our President has authorized massive-scale wiretapping against the citizens of his own country. Maybe that "Chinese Dissident" idea isn't so bad after all...
But I digress, and I'm sorry for slipping into the political realm for a moment there. The models above don't take into account "nicknames". They don't take into account anything that an individual may wish to do (such as assign a credential that is only meaningful to that individual, or that individual's organization). The models don't take into account the ways that people live their lives.
So, the first important problem in the field of cryptography: How to integrate cryptography into our lives so that it's invisible, how to integrate it in such a way that consumers retain as much advantage as corporations, and how to integrate it in such a way that privacy and security can be maintained.
PGP was a good first step in this direction, but it suffered from several problems -- it wasn't easy to use, the "web of trust" idea was well-thought-out but horribly implemented, there wasn't a way to keep keyrings separate for separate reasons, it relied on everyone wanting a strongly-bound identity, it allowed arbitrary names to be added to a key just by someone signing it, and it relied on single keys for multiple purposes. There wasn't a real notion of "Identity", and so people filled the gap with real names -- even though it was possible for other identity systems to exist.
And then, when the DH/ElGamal patents expired, the incompatible PGP 5 was put out, which was the main reason that PGP stopped being a viable contender.
Thus, the second important problem in the field of cryptography: How to get disparate algorithms to function together even in the face of the inability to use one of them, or in the event of compromise of one of the algorithms.
We don't have anything that is easy to use. We don't have anything that is transparent. We don't have anything that reflects usage -- if I'm talking to my friend across the network, I want the same level of privacy as if I'm talking to him face to face. I want the same level of security in my knowledge that it's him I'm talking to as if I'm talking to him face to face. And I want to be able to call him what I know him as -- even if his real name's David, but he hates it and prefers to be called Andy instead, I want to have that be possible in everything that I get from him, without having to know his legal first name at all.
So, the third important problem in the field of cryptography: How to selectively blind data in a handshake that allows for blinded data to be unblinded later on.
Now, these are not the same as the important problems in the field of cryptology, which are pretty well known: algorithm development, algorithm attack resistance, algorithm mode selection, knowing what each algorithm can guarantee and what it can't, random number generation and usage, and knowing how all the parts fit together into a cohesive whole cryptosystem.
(I'm making the distinction between cryptography and cryptology here along the same lines that the NSA makes that distinction. They are, without a doubt, experts in everything cryptologic and, within their peculiarly limited worldview, cryptographic. I must acknowledge that I stand on the shoulders of giants, here -- not just the NSA stuff that has been made publicly available, but also the individuals in the open cryptographic and cryptologic communities whose papers I've read or had references to. My own thoughts on what is important may not mesh with theirs directly, but at this point I see a set of needs that isn't being met, anywhere or anyhow. But, I thank everyone I've ever learned from, even if your name isn't on the list: from Bruce Schneier (Applied Cryptography) and Phil Zimmermann (PGP, PGPfone, Zfone) to Dr. Steve Henson of the OpenSSL project, as well as Philipp Gühring of the CAcert project. And to Wolf Armstrong, who got me thinking about implementation details of what the routing metric really is supposed to mean. To Keith Pepin, for inspiring long drawn-out debates about what theoretical attacks can be made. And to everyone who's listened to me rant and kept their peace while I blather on about esoterica. If I've missed anyone, I'm sorry, I'll likely remember you later. :) )
In cryptography, there are three major aspects to random numbers that people need to look at: How are they generated? How are they used? And, perhaps most importantly, how are they kept secret?
Modern operating systems have sources of entropy that they can use and replenish as the system works (perhaps even by 'thrashing' the system a bit: allocating a large amount of memory and writing to it, memory-mapping files and writing a pseudorandom amount of pseudorandom data to them, and measuring how long the hard drive's read head takes to move back and forth, which varies with the turbulent airflow inside the disk enclosure). However, as we move toward solid-state systems, this becomes less effective, and there isn't as much entropy to be found in such operations.
In the case of a system that does precisely the same operations every time it's turned on, it makes sense to save the last bit of entropy that it had available, to 'seed' the randomness the next time the system awakens. Note that this is not foolproof: if the seed is read in and not replaced before the next random number is generated for an application, then a power failure could cause the same seed to be loaded in successive runs... which could be disastrous. So the order is: load the seed, do various other things to gather some new entropy, save a fresh seed, and only then run the application that requires random numbers, provided there's enough entropy available by then to do so.
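Here's a rough sketch of that ordering in Python. Everything in it is my own illustration rather than how any particular operating system does it: the function name `rotate_seed`, the file layout, and especially the "fresh" input, which is only a timer and a process ID so that the example runs. Those two values are nowhere near adequate entropy; a real system would stir in whatever it actually gathered.

```python
import hashlib
import os
import time


def rotate_seed(path):
    """Load the saved seed, persist a replacement, then return material to use."""
    try:
        with open(path, "rb") as f:
            old_seed = f.read()
    except FileNotFoundError:
        old_seed = b""  # first boot: nothing has been saved yet

    # "Do various other things": a stand-in for real entropy gathering.
    # ASSUMPTION: a timer and a PID are NOT adequate entropy; they only
    # make this sketch runnable.
    fresh = time.perf_counter_ns().to_bytes(8, "big") + os.getpid().to_bytes(4, "big")
    pool = hashlib.sha256(old_seed + fresh).digest()

    # Save the replacement seed BEFORE any random number is handed out,
    # so a power failure after this point cannot replay the old seed.
    next_seed = hashlib.sha256(b"saved-seed" + pool).digest()
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as f:
        f.write(next_seed)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp_path, path)  # atomic swap of the seed file

    # The value handed to the application differs from what is on disk.
    return hashlib.sha256(b"working-seed" + pool).digest()
```

The two different prefixes are there so that the seed sitting on disk and the seed in use are never the same bytes; reading the file afterwards doesn't tell you what the running system started from.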
/dev/random is your friend. This is a distillation of true entropy, and will block reads from it if it doesn't have enough entropy distilled.
/dev/urandom is not your friend. This is a pseudorandom source, and thus deterministic. Once fed a seed, it outputs a long-period pseudorandom stream of whatever length you ask for, even when it doesn't have enough entropy to make the output as secure as the application calls for. (The details of the random number generator are also left up to the implementor, which means that it's often difficult or impossible to determine how it works except in open-source operating systems.)
Now, we're supposed to be able to gather entropy from the timings of various things, such as keypresses. There isn't really any good way to measure the entropy in what is typed, since languages have specific patterns and patterns can be exploited... but there is a good way to measure the entropy in when the keys are hit. So, if someone's sitting at the console, it's very likely that they can create enough entropy just by banging on the keys.
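To make the "when, not what" idea concrete, here's a small Python sketch. The function name `mix_key_timings` and the choice to keep only the low 16 bits of each gap are my own assumptions for illustration; it also makes no attempt to estimate how much entropy the timings actually contain, which a real collector would have to do.

```python
import hashlib


def mix_key_timings(timestamps_ns, pool=b""):
    """Fold the gaps between keypress timestamps into a hash-based pool.

    Only the timing of each keypress is used, never which key was hit.
    """
    h = hashlib.sha256(pool)
    for earlier, later in zip(timestamps_ns, timestamps_ns[1:]):
        delta = later - earlier
        # Keep the low-order bits, where human timing jitter lives;
        # the high-order bits are mostly predictable typing rhythm.
        h.update((delta & 0xFFFF).to_bytes(2, "big"))
    return h.digest()
```

In use, the timestamps would come from something like `time.perf_counter_ns()` recorded at each key event, and the returned digest would be fed into the next call as `pool`.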
So. We have entropy, we have random numbers. Now, just what do we DO with them?
This varies from algorithm to algorithm. RSA key generation takes random numbers, sets both the high and the low bits to 1 (the top bit to make sure that the key is actually as long as it's supposed to be, and the bottom bit to make it odd and thus have a chance to be prime), and runs various tests for primality. If the random bits generated don't correspond to a prime number, they're incremented -- often going through about 50 to 200 iterations before finding a number that is probably prime, meaning that it passes the probabilistic tests. (Since factoring large numbers is the hard problem that RSA relies on, if we knew how to factor these numbers easily, we wouldn't be using RSA.) However: Once the primes are known, and the keys derived from them, just leaving them in memory would be a Bad Idea[tm].
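The prime search described above looks roughly like this in Python. This is a sketch, not what any particular RSA library does: the names `is_probable_prime` and `random_prime` are mine, the test is a plain Miller-Rabin with an assumed 40 rounds, the candidate is stepped by 2 so that it stays odd, and the random bits come from Python's `secrets` module purely for brevity (it draws from the operating system's generator).

```python
import secrets


def is_probable_prime(n, rounds=40):
    """Miller-Rabin test: False means composite, True means probably prime."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37):
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2**r with d odd.
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for _ in range(rounds):
        a = secrets.randbelow(n - 3) + 2  # witness in [2, n - 2]
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a proves that n is composite
    return True


def random_prime(bits):
    """Draw random bits, force the top and bottom bits to 1, step until prime."""
    while True:
        candidate = secrets.randbits(bits)
        candidate |= (1 << (bits - 1)) | 1  # full length, and odd
        while candidate.bit_length() == bits:
            if is_probable_prime(candidate):
                return candidate
            candidate += 2  # stay odd
        # Stepped past the requested length: draw fresh bits and retry.
```

Calling `random_prime(1024)` twice would give the two primes for a 2048-bit modulus; a production generator would also sieve out small factors before running the expensive test.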
And one other thing: don't feed the random numbers you pulled out back into the seeding function. Random numbers should be used one time and one time ONLY, and as soon as their function is complete, they should be either zeroed or hashed in-place or something else to make it impossible to derive them.
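A minimal sketch of the "use once, then zero" habit in Python follows. The names `wipe` and `single_use` are hypothetical, and the guarantee is weak: Python can only overwrite mutable buffers such as `bytearray`, and the interpreter may have made copies that this never touches. In C the same idea would be an explicit, non-optimizable memory clear.

```python
from contextlib import contextmanager


def wipe(buf):
    """Overwrite a mutable buffer with zeros, in place."""
    for i in range(len(buf)):
        buf[i] = 0


@contextmanager
def single_use(buf):
    """Hand out buf for exactly one use, then zero it even if the caller raises."""
    try:
        yield buf
    finally:
        wipe(buf)
```

Usage would look like `with single_use(nonce) as n: ...`, where `nonce` is a `bytearray` filled from the entropy source; once the block exits, `nonce` holds only zeros.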
If the same random number is used more than once, it exposes information about what it is, the same way that using a one-time pad more than once exposes information about what it is. Which is why it must be kept secret... which is why it must be kept safe... and which is why it must be burned.
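The one-time pad comparison is easy to show. In this Python sketch (the messages and the helper `xor_bytes` are made up for the demonstration), the same pad encrypts two messages, and XORing the two ciphertexts cancels the pad completely, leaving the XOR of the two plaintexts for anyone who is listening.

```python
import secrets


def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


pad = secrets.token_bytes(16)      # the "random number" that gets reused
p1 = b"attack at dawn!!"
p2 = b"retreat at noon!"

c1 = xor_bytes(p1, pad)            # first use: fine
c2 = xor_bytes(p2, pad)            # second use: the mistake

# The eavesdropper never sees the pad, only c1 and c2.
leak = xor_bytes(c1, c2)
assert leak == xor_bytes(p1, p2)   # the pad has dropped out entirely

# Knowing or guessing one message now reveals the other, and the pad too.
assert xor_bytes(leak, p1) == p2
assert xor_bytes(c1, p1) == pad
```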
There's undoubtedly more to this than I'm putting in -- I'm writing this on very little sleep, and I'll likely look at it and figure out where and how it needs to be fleshed out -- but it's important to recognize that all three of the questions asked at the beginning must be answered for any cryptosystem to remain secure.