Perfect Secrecy and the One Time Pad (OTP)



I have explained unconditional (or information-theoretical) security in my previous post (see Unconditional vs. Conditional Security). As I have mentioned there, we can refer to unconditional security in the context of various cryptographic primitives, among which the encryption schemes (see Symmetric vs. Asymmetric Encryption).

An encryption scheme that is information-theoretically secure provides perfect secrecy (see, e.g., [1]), because the ciphertext perfectly hides the plaintext. In other words, the adversary has the same probability to correctly indicate the message m regardless if he/she knows the corresponding ciphertext c. Hence, the knowledge of the ciphertext gives no new information about the plaintext. We ignore the length of the message - which, of course, is exposed - and assume that all the possible messages are equally long.

More rigorously, the a-posteriori probability Pr[M=m/C=c] to guess the plaintext (a-posteriori in the sense that the adversary knows the ciphertext c) equals the a-priori probability Pr[M=m] to guess m (a-priory in the sense that the adversary does not know the ciphertext c). For rigurous definitions I recommend a good book (such as, e.g., [2] that I have used for years as the course book for the undergraduate Cryptography course at the University of Bucharest, and which it is, together with D.Boneh's course [3] the base for many of the definitions in my posts). 


Perfect secrecy is the best an encryption scheme can aim for regarding security. The well-known cryptosystem that targets and achieves perfect secrecy is the One Time Pad (OTP)
OTP defines encryption and decryption as a single operation on bits, the exclusive or (or XOR). More precisely, a message m is encrypted using a cryptographic key k to a ciphertext c = m XOR k. Decryption is similar, a ciphertext c decrypts to a message m = c XOR k.



XOR is a binary operation that indicates whether the two bits given as input are equal. If yes, it returns 0; if not, it returns 1. In fact, XOR is the addition modulo 2 (see XOR's truth table in the picture). For encryption, the two inputs are the key k and the plaintext m, and the result is the ciphertext c; for decryption the two inputs are the key k and the ciphertext c, and the result is the plaintext m
OTP is correct (see Symmetric vs. Asymmetric Encryption), i.e. for all possible plaintexts and all possible keys, the ciphertext correctly decrypts to the plaintext: c XOR k = (m XOR k) XOR k = m XOR (k XOR k) = m XOR 0^nm, where 0^n is a binary string of zeros of length n, and n is the length of the plaintext (Note: I have used the associativity of XOR and the fact that for any bit b, it holds that b XOR b = 0). 

Let's take an example to see how OTP works; let m = 10111001 and = 01101100.  The ciphertext is c  = m XOR k = 10111001 XOR 01101100 = 11010101. At decryption, given c = 11010101 and k 01101100, recover m = c XOR k = 11010101 XOR 01101100 = 10111001. 


Using XOR (i.e., addition modulo 2) comes natural in the era of digital communications, but it is not the only way of defining OTPIn fact, OTP makes use of modular addition in general, not necessarily modulo 2. An example is using modular addition modulo 26 when working on the 26 letters alphabet (i.e., the plaintext, the ciphertext, and the cryptographic key are all strings of letters, where A = 0, B = 1, C = 2, ... , Z = 25). Let's take another example to see how OTP works in this case; let m = PAGES and k = GFNOM. The ciphertext is cm + k (mod 26) = PAGES + GFNOM (mod 26) = VFTSE. At decryption, given c = VFTSE and k GFNOM, recover m = c + k (mod 26) = VFTSE + GFNOM (mod 26) = PAGES.

How and by who the OTP has been firstly and consequently described, its variants and the similar or related ciphers (read about, e.g., Frank Millerthe Vernam Cipher) is a subject on its own. In general, the history of science is an interesting aspect to look into, but I will skip it here. If you are interested, refer to other readings (e.g., [4] - Cryptologia is a journal for history-like papers in general or the books referred to in my first post). Further in this post, I will only refer to OTP defined by the binary XOR.



In order to preserve perfect secrecy, the key must be chosen uniformly at random (i.e., all keys are equally probable), must be of the same length as the plaintext, and must never be reused (neither entirely or partially).

If the key is chosen uniformly at random, then the truth table shows that a given bit can encrypt to both 0 and 1 with the same probability (50%, or equivalently 1/2, a constant you will find quite often in my posts): XOR 0 = and m XOR 1 = NOT m. Similarly, starting with a 1-bit ciphertext c, one cannot guess the plaintext (again, with a higher probability than 1/2) because m can equally take both valuesif k = 0, then m = c and if k = 1, then m = NOT c. In other words, for any given ciphertext c and any message m, there is a possible k = m XOR c that decrypts c to m. Because such a key always exists, and because all the keys are equally probable, then knowing c brings no advantage in finding m. Hence, perfect secrecy holds. To give a concrete example, think of an adversary who eavesdrops on c, the encryption of a boolean answer (1 = Yes and 0 = No). Let's suppose c = 1; then, for k = 0, m = 1 (Yes) and for k = 1, m = 0 (No). As can take both values with the same probability, the adversary obtains no advantage in finding m by knowing c. 

A nice exercise given in [2], which I always discuss with my students, refers to the the all-zero key (i.e., k = 0^n, a binary string of n zeros). The all-zero key keeps the plaintext unencrypted: c = m XOR k = m and 0^n = m. The role of encryption is to conceal the plaintext; hence, this behavior can be perceived as a problem. In general, a cryptographic key that misbehaves in a way is called a weak keyWeak keys are naturally undesired, and their absence is usually a design goal. It could be the case that they are acceptable, e.g., if they are just a few, publicly and clearly documented to avoid usage. An example of a cryptosystem with weak (and semi-weak) keys is the Data Encryption Standard (DES), a now old block cipher withdrawn for years. 

Back to our problem, one could consider excluding the all-zero key from OTP, i.e., change OTP so that the all-zero key is never in use. What happens in this case? Does OTP remain perfectly secure? Intuitively, one may say yes, as this would eliminate a weakness. In fact, the correct answer is no; if OTP never accepts the all-zero key, then OTP loses its perfect secrecy. It is because the adversary learns something about m by knowing c: that m is not c. And this defies the definition of perfect secrecy. However, it might still sound strange, so I will take an example (in fact, the same as before) to clarify this. Let's say that OTP is used in an encrypted communication to protect boolean answers to sensitive questions. In Romanian, yes is DA, and no is NU. Now, think that one eavesdrops on the communication link and reads DA. If the all-zero key is never in use, then he/she learns that the answer is NU (the ciphertext cannot equal the plaintext; this was the "vulnerability" we have eliminated by not using the all-zero key). Thus, an adversary would imediately break OTP in this case. But if the all-zero string remains a possible key, then the ciphertext DA can stand for both DA (when k = 0^n) and NU (when k = DA XOR NU), so the adversary can say nothing about the answer.

OTP is easy to define and understand. It is a logical gate, so it accepts a hardware implementation and thus is fast. Despite these advantages, OTP  has a big problem: it needs long keys to encrypt long messages. Another nice exercise, this time taken from [5], asks to discuss the real-life implications of storing a 1GB one-time key. Think about it in all three states of the information: generationtransmission, and storage (see Crypt(?)). (Note: I consider [5] a good book for beginners: it is easier to read than many other books, and it contains enough information; slides, videos, and problem sets related to the book are available online on the book website)





Under these circumstances, it might come natural to ask for a perfectly secure encryption scheme with shorter keys. Mathematics prove this to be impossible: all perfectly-secure encryption schemes must use keys (at least as) long the length of the plaintext. The proof is simple, and I will leave it as an exercise. As a consequence, OTP is the best we can aim for regarding perfect secrecy, and there is no need to search for a better encryption system. Do not spend time on this if you are a researcher 😃.

Finally, let's see why a key cannot be (even partially) reused. Let m_1m_2 (m_3, ...) be two (or more) plaintexts encrypted with OTP using the same cryptographic key k.


If the adversary finds, by any way, a pair (m_1c_1), then he/she can immediately compute the key k = m_1 XOR c_1. Obviously, once the adversary finds k, any encryption using this key is compromised. I don't discuss how an adversary can find such a pair, or why such an assumption is plausible. This scenario can correspond to several attack types I will come back to in later posts; these attacks include the well-known Chosen-Plaintext Attack (CPA) - the adversary has the right to choose the value of the plaintext m_1 to get a valid ciphertext c_1 - and Chosen-Ciphertext Attack (CCA) -  the adversary maintains its ability from CPA and, in addition, he/she can choose the value of a ciphertext c_1 to get its decryption m_1.

Even if the adversary is weaker and can only observe the encryptions c_1, c_2 (c_3, ...), he/she still can find a relation between the plaintexts that are encrypted using the same key:  c_1 XOR c_2 = (m_1 XOR k) XOR (m_2 XOR k) = m_1 XOR m_2. The adversary finds something about the plaintexts, so perfect secrecy fails. The attack is a ciphertext-only attack, because the adversary can only access the ciphertexts. In particular, think that the adversary eavesdrops the same ciphertext twice: c_1 = c_2. In this case, the adversary finds that m_1 and m_2 are equal. OTP that reuses the same key becomes deterministic: there is no randomness at encryption, so a repeated message m encrypts to the same cDeterministic encryption is unsecure and cannot stand CPA or CCA attacks.

To conclude, OTP is the best cryptosystem one can aim for regarding perfect secrecy, but it cannot be used (at large scale) because of its limitations. However, one must be familiar with the OTP, its strengths and shortcomings, as it appears quite often - in one form or another - in other constructions or systems (e.g., I will explain the relation to stream ciphers in a later post).

[1] Van Tilborg, H. C., & Jajodia, S. (Eds.). (2014). Encyclopedia of cryptography and security. Springer Science & Business Media. Available at: https://www.researchgate.net/profile/Krzysztof-Kryszczuk/publication/230674947_Springer_Encyclopedia_of_Cryptography_and_Security 
[2] Kats, J., Lindell, Y. Introduction to Modern Cryptography, CRC Press. More info: http://www.cs.umd.edu/~jkatz/imc.html
[3] Boneh, D. Cryptography I. Available at: https://www.coursera.org/learn/crypto 
[4] Bellovin, S. M. (2011). Frank Miller: Inventor of the one-time pad. Cryptologia, 35(3), 203-222.
[5] Paar, C., & Pelzl, J. (2009). Understanding cryptography: a textbook for students and practitioners. Springer Science & Business Media. 


Comments

Popular posts from this blog

Unconditional vs. Conditional Security

Principles