Cryptanalysis Of Hash Functions And HMAC/NMAC

Posted on:2008-04-09

Degree:Doctor

Type:Dissertation

Country:China

Candidate:H B Yu

Full Text:PDF

GTID:1118360212494456

Subject:Information security

Abstract/Summary:

PDF Full Text Request

Cryptographic hash functions play a fundamental role in modern cryptography. They are used to compress messages of arbitrary length to fixed length hash values which are also called hash codes, hash results, message digests or digital fingerprints. A primary motivation for cryptographic hash functions is that they serve as compact representative images of input messages, which they can uniquely identify. For example, the MD5 hash code "72c067b50c90837c95186d019f686700" can be looked on as the "fingerprint" to represent this thesis title uniquely. Changing the title, even by a single letter, will change most of the digits in the hash code.Hash functions occur as components in various cryptographic applications (e.g. protection of pass-phrases, protocols for payment, broadcast authentication etc.) where usually their property as a computational one-way function is used. Hash functions are frequently used in digital signature schemes to compress large messages for processing by public-key cryptosystems such as RSA. So the study of the hash functions is of great significance in the cryptanalysis field.In addition to these the two basic properties - compression and ease of computation - cryptographic hash functions needs to satisfy the following three security properties:Preimage resistance: for any pre-specified output y, it is computationally infeasible to find an input x such that h(x) = y.Second-preimage resistance: for any input x, it is computationally infeasible to find another input x' such that h(x) = h(x') Collision resistance: it is computationally infeasible to find any two distinct inputs x, x' with the same output, i.e., h(x) = h(x').Hash functions can be partitioned as: hash functions based on block ciphers and dedicated hash functions; In 1990, Rivest introduced the the first dedicated hash function MD4[39], designed using basic arithmetic and boolean operations. After MD4, several dedicated hash functions were proposed, including, in rough chronological order, MD5[40], HAVAL[56], RIPEMD[38], RIPEMD-160[13], SHA-l[20], SHA-2[21], etc. Although these hash functions are more complex than MD4, they all follow the same design philosophy and have a similar structure. So they are called the MD4-family. MD5 and SHA-1 are currently the crucial technology that electronic signatures and many other password securities use throughout the international community. They are widely used in banking, securities, and e-commerce. SHA-1 has been recognized as the cornerstone for modern Internet security.The first analysis of MD4 and MD5 were made by Merkle (unpublished), den Boer and Bosselaers [11, 12], and by Vaudenay [42]. Then Dobbertin developed a new general technique to attack MD4-family hash functions. In 1996, Dobbertin presented a successful attack on MD4 which find a collision with complexity 2²²[14]. In 1998, Doberrtin[18] showed that the first two (out of the total three) rounds of MD4 are not one-way. In the rump session of Eurocrypt'96, Dobbertin [15]presented a semi free-start collision of MD5 which consists of two different messages with a chosen initial value.In 1997, Xiaoyun Wang first used an innovative method of bit tracing to show how SHA-0 could be broken in theory[45]. In 1998, Wang improved this result in conjunction with the message modification technique. In the same year, Chahaud and Joux[24] also published the theoretical analysis result of SHA-0 that used the ordinary differential analysis method. The bit tracing method was public in 2005 and it laid the theoretical foundation for the breaking of most MD4-family hash functions, including MD5 and SHA-1.In past three years, there has been an extensive focus upon design and analysis of hash functions within the cryptographic community. This has been largely catalyzed by some novel work included within this dissertation, first published in [52, 55, 54]. The attacks used to find the "breakthrough" collisions presented at the rump session of Crypto'04 by Wang et. al[49]. In that conference, Wang announced the presence of collisions within for MD4, MD5, HAVAL-128 and RIPEMD that was regarded as the most significant cryptanalytic developments of the year. At the same time, Biham gave an attack on 40-step SHA-1[6] and Joux improved upon the results of Biham and Chen[5] by presenting practical collisions in SHA-0 [24].The most important result is the break of MD5 by Xiaoyun Wang and Hongbo Yu. This break highlights serious potential risks in the fields of information and network security and lead to many practical consequences. In 2005, Lenstra et. al used our MD5 hash collisions to construct two X.509 certificates that contain identical signatures and that differ only in the public keys[28, 27]. Later, they improved this result and presented two MD5 based X.509 certificates with identical signatures but different public keys and different Distinguished Name fields[33]. Lucks and Magnus constructed the MD5 "Target Collision" for special file formats such as Postcript[29]. Gebhardt et. al extended the file format to PDF, TIFF and Word 97[22]. The work "Finding collisions by SHA-1" by Wang and Yu is another breakthrough in the hash functions history [51]. Recently Wang et. al improved this attack to find a hash collision with a estimated work of 263 operations. The result is plainly within the realm of feasibility for an attacker well equipped with computer resources. In response to the SHA-1 vulnerability by Wang et. al, NIST plans to host a public hash competition that is similar to the successful Advanced Encryption Standard (AES) development and selection process[34].In this thesis, we provide new attacks on MD4, 4- and 5-Pass HAVAL, and HMAC/NMAC based on HAVAL. It involves six chapters.In Chapter 1, we introduce some definitions and properties for hash functions and describe the structure of MD4-family hash functions. We give descriptions of our main targets of analysis in this thesis, being MD4 and HAVAL.In Chapter 2, we describe modular differential attack on hash functions in detail. This method, also called bit tracing method in Chinese, was presented early in 1997 [45] by Xiaoyun Wang, and formalized at Eurocrypt'05. The modular differential is a precise differential that unlike most differential attacks [9], uses integer modular subtraction in conjunction with exclusive-or as a measure of difference. There are four steps in attacking a hash function using the modular differential. The first step is to select an appropriate message difference, which determines the success probability of the attack. The key step of the modular differential attack is to select a feasible differential path according to the selected message difference. This difficult step requires needs intelligent analysis, sophisticated technique, lots of patience and good luck. The third phase is to derive the sufficient conditions that guarantee the feasibility of the differential path. In the process of searching for differential paths, the chaining variable conditions can be determined. A feasible differential path implies that all the chaining variable conditions deduced from the path do not contradict each other. The last step is the message modification which forces the modified messages to satisfy additional sufficient conditions.In Chapter 3, we present the collision attack for the full 4- and 5-Pass HAVAL. Previous results on HAVAL suggested only practical collision attacks for 3-pass HAVAL. For 4-Pass HAVAL, we describe a two-block collision with complexity 2³⁶. For 4-pass HAVAL, we describe a practical attack for finding two-block collisions with 2³⁶ computations. Using the more complex message modification technique, the complexity can be improved to 2³⁰. In addition, we show that collisions for 5-pass HAVAL can be found with about 2¹²³ computations, which is the first attack more efficient than the birthday attack. This is the best results on the analysis of 4 and 5-Pass HAVAL. A 4-Pass HAVAL collision example is given in Table 1.In Chapter 4, we give further analysis on the second-preimage attack on MD4. Ideally, it needs about 2¹²⁸ operations to find a second-preimage corresponding a given message. This is equivalent in complexity to the brute-force approach. However for some messages, there exists a more efficient attack than the brute force attack to find their second-preimages, and these messages are called "weak messages". In Eurocrypt'05[43], Wang et. al gave a fast attack for MD4 using the modular difference method. The differential path used in their attack also can be used to find the second-preimage of MD4 with a little advantage than the brute-force attack, and it is the first theoretical attack to the second-primage on the full MD4. At the same time, Wang pointed out that it is possible to find a better differential path for the second-preimage attack of MD4. Motivated by Wang's suggestion, in this chapter we find another new collision differential path. Using it, we have proved that any random message is a weak message with probability 2^-56 which has achieved a practical range of break. Furthermore, we can convert any random message into a weak message with only 227 MD4 computations, such that the original message is very close to the resulting weak message with a hamming distance of about 44. In addition, we give a result that shows weak messages can be used for distinguishing, forgery and key recovery attacks on MD4-based MACs. The attacks on MD4-based HMAC and NMAC developed by Contini et. al[8] and Kim et. al[26] respectively both used the differential path given in this chapter, published by us previously.In Chapter 5, we present a new attack for building multi-collisions and multi-near-collisions for hash functions directly instead of combining multi-collisions from the pairwise collisions. Finding a K (multi)-collision means to find K different messages that map to the same hash value. The multi-near-collisions is a generalization for the pairwise near-collisions presented by Biham and Chen[4]. The basic idea of the new attack method is to use t different differential paths to cause a 2^t-collision of the compression function, and each different path can be found by the modular differential method. For an ideal hash function, finding a K-collision requires about O(2^((k-1)/k)n hash computations where n is the length of the hash value. However, using our new attack method, we can find a 4-collision for MD4 compression function with complexity 2¹⁹. For the 3-Pass HAVAL, we can find a 4-collision with complexity 2²⁸ and a 8-near-collision with complexity 2⁹. Our multi-collision examples are given in Tables 2 and 3.In chapter 6, we analyze the HMAC based on full 3-Pass and 4-Pass HAVAL. A message authentication code (MAC) can be viewed as a hash functions that takes two functionally distinct inputs, a message and a secret key, and produces a fixed-size output. HMAC is a standardized hash-based MAC algorithm that is widely used as a MAC algorithm and as a pseudorandom function generator [1]. We use two type of distinguishers to analyze the HMAC: differential distinguishers and rectangle distinguishers. In [26], Kim et. al use the rectangle distinguisher to distinguish the HMAC based on 3-Pass and reduced 4-Pass HAVAL. In our analysis it turns out that differential distinguishers are better suited than rectangle distinguishers to HMAC based on HAVAL. Nevertheless, to facilitate comparison, we give our results for both types. For the differential distinguisher, we introduce two new collision differential paths for 3 and 4-Pass HAVAL, with probabilities is 2^-96.26 and 2^-121.27 respectively. Using our new differential paths, we can execute the distinguishing, forgery and key recovery attack on HMAC based on 3 and 4-Pass HAVAL. In the case that the output lengths both for inner and outer functions are 256 bits, our forgery attack is of greater advantage than the birthday attack. For the rectangle distinguisher, we also give two near-collision differential paths for 3 and 4-Pass HAVAL with probabilities 2^-64.5, and 2^-95.5 respectively, these being better than that of Kim et. al's. Because NMAC is a generalized version of HMAC and they have similar structure, all of our attacks on HMAC are applicable to NMAC. The main results of our attack are shown in Table 4.In conclusion, this thesis summarizes the large body of new work that we have performed on breaking hash functions such as MD5 and SHA-1. Previous to this work, it was not possible to get real collisions on these hash functions. Consequently we have shown that these hash functions are no longer safe to use. This work has been acknowledged in the cryptographic community, and precipitated NIST to launch a new hash function competition [34].

Keywords/Search Tags:

hash function, HMAC, NMAC, collision attack, second-preimage attack, multi-collision attack

PDF Full Text Request

Related items

1	New Techniques For Symmetric-key Cryptanalysis
2	Cryptanalysis Of SM3Cryptographic Hash Algorithm
3	Research On The Attack Methods Of The Hash Functions
4	Research Of Attack Methods On Hash Function BLAKE
5	Research Of Attack Technology Of Hash Function MD5
6	Research On Cryptographic Hash Functions
7	Cryptanalysis Of The Hash Functions RIPEMD-128 And HMAC-MD4
8	Analysis And Research Of SHA-0 Collision Attack
9	The Situation Of The Hash Function Structures And A New Design
10	Cryptanalysis Of Hash Function And Block Ciphers