
Commun Nonlinear Sci Numer Simulat 15 (2010) 2254–2261


Short communication

A chaos-based hash function with both modification detection and localization capabilities

Di Xiao a,b,*, Frank Y. Shih b, Xiaofeng Liao a

a College of Computer Science and Engineering, Chongqing University, Chongqing 400044, China
b Computer Vision Laboratory, College of Computing Sciences, New Jersey Institute of Technology, Newark, NJ 07102, USA

Article info

Article history: Received 21 February 2009; received in revised form 2 June 2009; accepted 12 October 2009; available online 6 November 2009.

Keywords: Hash function; Chaos; Modification detection and localization; Parallel

Abstract

Recently, a variety of chaos-based hash functions have been proposed. Nevertheless, none of them can realize modification localization. In this paper, a hash function with both modification detection and localization capabilities is proposed, which can also support the parallel processing mode. By using the mechanism of changeable-parameter and self-synchronization, the keystream can establish a close relation with the algorithm key, the content, and the order of each message unit. Theoretical analysis and computer simulation indicate that the proposed algorithm can satisfy the performance requirements of hash functions. © 2009 Elsevier B.V. All rights reserved.

1. Introduction

It is promising to incorporate chaos into cryptography [1–4] because chaotic characteristics satisfy requirements analogous to those of a cryptosystem. Hashing, one of the core primitives of cryptography, is a basic technique for information security [5,6]. Up to now, a variety of chaos-based hash functions have been proposed [7–12]. However, these algorithms share a common limitation: they can only verify whether a modification has occurred, but cannot locate where it occurs. Besides, their iterative hash structures are all sequential: the processing of the current message unit cannot start until the previous one has been processed. These limitations restrict their applications.

In this paper, we propose an algorithm for both modification detection and localization. Its structure supports the parallel processing mode. The mechanisms of changeable-parameter and self-synchronization are used to meet the requirements of hash functions.

The rest of this paper is organized as follows. The proposed algorithm is presented in Section 2. Performance analysis is given in Section 3. Conclusions are drawn in Section 4.

2. Proposed algorithm

We adopt the Piecewise Linear Chaotic Map (PWLCM) in the proposed algorithm. It is defined as:

* Corresponding author. Address: College of Computer Science and Engineering, Chongqing University, Chongqing 400044, China. Tel.: +86 23 8633 3521; fax: +86 23 6510 3199. E-mail address: xiaodi_cqu@hotmail.com (D. Xiao).
1007-5704/$ - see front matter © 2009 Elsevier B.V. All rights reserved. doi:10.1016/j.cnsns.2009.10.012


$$
X(t+1) = F_P(X(t)) =
\begin{cases}
X(t)/P, & 0 \le X(t) < P,\\
(X(t) - P)/(0.5 - P), & P \le X(t) < 0.5,\\
(1 - X(t) - P)/(0.5 - P), & 0.5 \le X(t) < 1 - P,\\
(1 - X(t))/P, & 1 - P \le X(t) \le 1,
\end{cases}
\qquad (1)
$$

where $X(t) \in [0, 1]$ and $P \in (0, 0.5)$ denote the iteration trajectory value and the current iteration parameter of PWLCM, respectively.

2.1. Algorithm description

2.1.1. Padding, division, and pre-processing

A look-up table of 16 × 16 blocks is built in advance, as shown in Fig. 1. The original message M is padded so that its length is a multiple of 256 characters (2048 bits). Let m be the length of the original message M in bits; the padding bits $(100\ldots0)_2$ of length n, with $(m + n) \bmod 2048 = 2048 - 64 = 1984$ and $1 \le n \le 2048$, are appended. The remaining 64 bits denote the length of the original message M. If m is greater than $2^{64}$, then $m \bmod 2^{64}$ is used.

After padding, M consists of s divisions of 256 characters (2048 bits) each, i.e., $M = (M_1, M_2, \ldots, M_s)$. For each division, its 256 characters are inserted into the 256 blocks of the look-up table, respectively. After all the divisions are processed, there are s characters within each block of the look-up table. Without loss of generality, let the ith block of the look-up table hold the character array $(c_1, c_2, \ldots, c_s)$. The pre-processing of the ith block $(i = 1, 2, \ldots, 256)$ in the look-up table is described as follows:

(i) We convert the character array $(c_1, c_2, \ldots, c_s)$ into ASCII numbers, which are then mapped to a number array $(C_1, C_2, \ldots, C_s)$ by a linear transform, where each element is a number in [0, 1] and the length s is the number of characters in the ith block of the look-up table.

(ii) We perform the iteration process of PWLCM as follows:

1st: $P_1 = (C_1 + P_0)/4 \in (0, 0.5)$; $X_1 = F_{P_1}(X_0) \in (0, 1)$;
2nd–sth: $P_k = (C_k + X_{k-1})/4 \in (0, 0.5)$; $X_k = F_{P_k}(X_{k-1}) \in (0, 1)$;
(s+1)th: $P_{s+1} = (C_s + X_s)/4 \in (0, 0.5)$; $X_{s+1} = F_{P_{s+1}}(X_s) \in (0, 1)$;
(s+2)th–2sth: $P_k = (C_{2s-k+1} + X_{k-1})/4 \in (0, 0.5)$; $X_k = F_{P_k}(X_{k-1}) \in (0, 1)$.

Here, the initial $X_0 \in (0, 1)$ and $P_0 \in (0, 1)$ of PWLCM are used as the algorithm key. If a certain iteration value $X_k$ is equal to 0 or 1, an extra iteration is carried out. Such extra iterations occur very rarely, according to the properties of chaos.

(iii) The $X_{2s}$ obtained is used as the representative value of the ith block $(i = 1, 2, \ldots, 256)$ in the look-up table.
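The padding, division, and pre-processing steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the linear transform is assumed to be c/255, the 64-bit length field is assumed to be big-endian, the message is taken as bytes (so the padding bits form whole bytes), and the extra iteration for $X_k \in \{0, 1\}$ is left out. All function names are hypothetical.

```python
def pwlcm(x: float, p: float) -> float:
    """One PWLCM step as in Eq. (1); expects x in [0, 1] and p in (0, 0.5)."""
    if x < p:
        return x / p
    if x < 0.5:
        return (x - p) / (0.5 - p)
    if x < 1.0 - p:
        return (1.0 - x - p) / (0.5 - p)
    return (1.0 - x) / p


def pad_message(msg: bytes) -> bytes:
    """Append bits 100...0 and a 64-bit length so the total is a multiple of 2048 bits."""
    m = 8 * len(msg)                   # original length in bits
    n = (1984 - m) % 2048 or 2048      # 1 <= n <= 2048 and (m + n) % 2048 == 1984
    pad = b"\x80" + b"\x00" * (n // 8 - 1)   # n is a multiple of 8 for byte input
    # Assumption: the length field is stored big-endian.
    return msg + pad + (m % 2**64).to_bytes(8, "big")


def block_arrays(padded: bytes) -> list[list[int]]:
    """Character j of every 256-character division goes into block j of the table."""
    return [list(padded[j::256]) for j in range(256)]


def block_representative(chars: list[int], x0: float, p0: float) -> float:
    """Run the 2s PWLCM iterations of step (ii) and return X_2s."""
    c = [ch / 255.0 for ch in chars]   # assumed linear transform into [0, 1]
    seq = c + c[::-1]                  # C_1..C_s, then C_s, C_{s-1}..C_1
    x = x0
    for k, ck in enumerate(seq, start=1):
        # The first parameter uses the key P_0, later ones use the previous X.
        p = (ck + (p0 if k == 1 else x)) / 4.0
        # The extra iteration for x equal to 0 or 1 is omitted in this sketch.
        x = pwlcm(x, p)
    return x
```

For the 572-character test message used in Section 3, `pad_message` yields 768 bytes, so each of the 256 blocks holds s = 3 characters.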

2.1.2. Grouping and processing in a parallel mode

The 256 blocks in the look-up table are arranged into 16 horizontal groups and 16 vertical groups for further processing. As shown in Fig. 1, the groups are:

Horizontal direction: H_Group 1: Blocks 1, ..., 16; H_Group 2: Blocks 17, ..., 32; ...; H_Group 16: Blocks 241, ..., 256.
Vertical direction: V_Group 1: Blocks 1, 17, ..., 241; V_Group 2: Blocks 2, 18, ..., 242; ...; V_Group 16: Blocks 16, 32, ..., 256.

Without loss of generality, let $CB_{i1}, CB_{i2}, \ldots, CB_{i16} \in (0, 1)$ be the representative values of the blocks in the ith horizontal group. The processing of the ith horizontal group $(i = 1, 2, \ldots, 16)$ is described below.

Fig. 1. Look-up table.


(i) The iteration process of PWLCM is given as follows:

1st: $P_1 = (CB_{i1} + P_0 + i/16)/6 \in (0, 0.5)$; $X_1 = F_{P_1}(X_0) \in (0, 1)$, where the initial $X_0 \in (0, 1)$ and $P_0 \in (0, 1)$ of PWLCM are the secret key of the algorithm, and i is the order of each horizontal group.
2nd–16th: $P_k = (CB_{ik} + X_{k-1})/4 \in (0, 0.5)$; $X_k = F_{P_k}(X_{k-1}) \in (0, 1)$.
17th: $P_{17} = (CB_{i16} + X_{16})/4 \in (0, 0.5)$; $X_{17} = F_{P_{17}}(X_{16}) \in (0, 1)$.
18th–32nd: $P_k = (CB_{i(32-k+1)} + X_{k-1})/4 \in (0, 0.5)$; $X_k = F_{P_k}(X_{k-1}) \in (0, 1)$.
33rd–34th: $X_k = F_{P_{32}}(X_{k-1}) \in (0, 1)$.

(ii) We transform the iteration values $X_{32}, X_{33}, X_{34}$ to binary format, extract 40, 40, and 48 bits after the binary point, respectively, and juxtapose them from left to right to get a 128-bit $DHK_i$, the keystream of the ith horizontal group $(i = 1, 2, \ldots, 16)$. At the same time, we extract 4 bits after the binary point from the binary format of $X_{32}$ to get a 4-bit $LHK_i$ $(i = 1, 2, \ldots, 16)$.

Similar operations are applied to the jth vertical group $(j = 1, 2, \ldots, 16)$. Therefore, we also obtain a 128-bit $DVK_j$, the keystream of the jth vertical group, and a 4-bit $LVK_j$ $(j = 1, 2, \ldots, 16)$.

Note that the 16 horizontal groups and the 16 vertical groups of the look-up table can each be processed in parallel. Therefore, the proposed algorithm is efficient. Also note that the keystream generated in each group is controlled by the algorithm key as well as by the order and the content of the current group.

2.1.3. Obtaining the detection hash and localization hash values

The final 256-bit hash value of the message M is jointly composed of DHASH and LHASH, which are used to accomplish detection and localization, respectively. The 128-bit detection hash value is obtained by $DHASH = DHK_1 \oplus \cdots \oplus DHK_{16} \oplus DVK_1 \oplus \cdots \oplus DVK_{16}$, where $\oplus$ denotes the XOR (Exclusive-OR) operation. At the same time, all 4-bit $LHK_i$ $(i = 1, 2, \ldots, 16)$ and 4-bit $LVK_j$ $(j = 1, 2, \ldots, 16)$ are juxtaposed from left to right to get the 128-bit localization hash value LHASH.

2.2. Characteristics of the algorithm construction

2.2.1. Both modification detection and localization capabilities

The proposed algorithm has the capabilities of modification detection and localization. We first verify whether there is any modification of the pending message by computing the new detection hash value and comparing it with the former one. If the comparison indicates that some modification exists, we then localize it by using the localization hash value. This kind of function is very useful for information authentication or communications with resource constraints, and cannot be provided by other kinds of hash algorithms.

2.2.2. Parallel mode

The parallel mode of the proposed algorithm shows in two aspects. First, the pre-processing of the 256 blocks of the look-up table can be performed in parallel. Second, from the structural point of view, the processing of the 16 horizontal groups and the 16 vertical groups can be performed in parallel (see Fig. 1). In fact, the processing of the 16 blocks within each horizontal or vertical group can also be adapted to a parallel mode.

2.2.3. Changeable-parameter and self-synchronizing keystream

In steps (1) and (2), the message units at different positions cause the chaotic parameters to change dynamically during the iteration. In step (2), the iteration of PWLCM is also related to the order of each message group. Perturbation is thus introduced in a simple way to avoid the dynamic degradation of chaos. On the other hand, a self-synchronizing stream is realized to ensure that the generated keystream is closely related to the algorithm key as well as to the content and the order of each message group. The mechanism of both changeable-parameter and self-synchronization provides the foundation for the security of the proposed algorithm.
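The group processing of Section 2.1.2 and the hash composition of Section 2.1.3 can be sketched as follows. This is an illustration under assumptions rather than the authors' code: the 4 bits of $LHK_i$ are taken to be the first 4 bits after the binary point of $X_{32}$, vertical groups are assumed to use their order j exactly as horizontal groups use i, the 256 representative values are assumed to be stored row by row, and the extra iteration for values 0 or 1 is omitted. The names `frac_bits`, `group_keys`, and `hash_from_table` are hypothetical.

```python
from functools import reduce


def pwlcm(x: float, p: float) -> float:
    """One PWLCM step as in Eq. (1); expects x in [0, 1] and p in (0, 0.5)."""
    if x < p:
        return x / p
    if x < 0.5:
        return (x - p) / (0.5 - p)
    if x < 1.0 - p:
        return (1.0 - x - p) / (0.5 - p)
    return (1.0 - x) / p


def frac_bits(x: float, n: int) -> int:
    """First n bits after the binary point of x (scaling by 2**n is exact for doubles)."""
    return int(x * (1 << n)) & ((1 << n) - 1)


def group_keys(cb: list[float], order: int, x0: float, p0: float) -> tuple[int, int]:
    """Return (128-bit detection keystream, 4-bit localization key) for one group."""
    seq = cb + cb[::-1]            # CB_1..CB_16, then CB_16, CB_15..CB_1
    x = x0
    for k, c in enumerate(seq, start=1):
        # Only the first parameter mixes in the key P_0 and the group order.
        p = (c + p0 + order / 16) / 6 if k == 1 else (c + x) / 4
        x = pwlcm(x, p)
    x32 = x
    x33 = pwlcm(x32, p)            # iterations 33 and 34 reuse P_32
    x34 = pwlcm(x33, p)
    dk = (frac_bits(x32, 40) << 88) | (frac_bits(x33, 40) << 48) | frac_bits(x34, 48)
    return dk, frac_bits(x32, 4)   # assumption: the first 4 fractional bits of X_32


def hash_from_table(rep: list[float], x0: float, p0: float) -> tuple[int, int]:
    """Compute (DHASH, LHASH) from 256 block representative values in row-major order."""
    h = [group_keys(rep[16 * i:16 * i + 16], i + 1, x0, p0) for i in range(16)]
    v = [group_keys(rep[j::16], j + 1, x0, p0) for j in range(16)]
    dhash = reduce(lambda a, b: a ^ b, (dk for dk, _ in h + v))
    lhash = 0
    for _, lk in h + v:            # LHK_1..LHK_16 followed by LVK_1..LVK_16
        lhash = (lhash << 4) | lk
    return dhash, lhash
```

Because each call to `group_keys` depends only on its own 16 values, the key, and the group order, the 32 calls could be dispatched to separate workers, which is the parallelism described in Section 2.2.2.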

3. Performance analysis

We implement the proposed algorithm for performance analysis using the Piecewise Linear Chaotic Map (PWLCM) in Eq. (1). The initial $X_0 = 0.232323$ and $P_0 = 0.858485$ are set as the algorithm key. A paragraph of the message is randomly chosen as:

"As a ubiquitous phenomenon in nature, chaos is a kind of deterministic random-like process generated by nonlinear dynamic systems. The properties of chaotic cryptography includes: sensitivity to tiny changes in initial conditions and parameters, randomlike behavior, unstable periodic orbits with long periods and desired diffusion and confusion properties, etc. Furthermore, benefiting from the deterministic property, the chaotic system is easy to be simulated on the computer. Unique merits of chaos bring much promise of application in the information security field."


3.1. Detection hash performance

In this section, we focus on the performance of the 128-bit detection hash value.

3.1.1. Sensitivity of hash value to the message and the secret key

In order to evaluate the sensitivity of the hash value with respect to the message and the secret key, we perform hash simulation experiments under the following seven conditions:

C1: The original message;
C2: Change the first character "A" to "B";
C3: Change the word "unstable" to "anstable";
C4: Change the full stop at the end of the message to a comma;
C5: Exchange the message blocks in the 1st message group $M_1$ with the corresponding message blocks in the 2nd message group $M_2$, respectively: "As a ubiquitous " with "phenomenon in na", "e behavior, unst" with "able periodic or", and "ch promise of ap" with "plication in the";
C6: Change the secret key $X_0$ from 0.232323 to 0.2323230000000001;
C7: Change the secret key $P_0$ from 0.858485 to 0.858485000000001.

The corresponding hash values in hexadecimal format are obtained as follows:

C1: 58196491612A7E56D6E7516B012D2529
C2: 580BDDEAE5730A88421E5A5375C16043
C3: 28BBDC5395CB941E90DC8448F6147A5F
C4: D1B7BC19F4D3AD0F6B01298EB687D797
C5: E62DF1C64672D8198154D35D1923B7CF
C6: E81515BD73C0097EB71A4FAF37279EE9
C7: 2102035BD7D3A3AA2D9241284BCE166C

In Fig. 2, the hexadecimal detection hash values of the message are uniformly distributed. The graphical display of binary hash values under different conditions is shown in Fig. 3. Simulation results show that the proposed algorithm is very sensitive: any slight change in the message or key causes huge changes in the final detection hash value.

Fig. 2. Distribution of the hash values in hexadecimal format.

Fig. 3. Hash values under different conditions.

3.1.2. Statistical analysis of diffusion and confusion

In order to hide message redundancy, diffusion and confusion are two general principles of hash function design. The following test is conducted. The detection hash value of the original message is generated. Then a bit in the message is randomly selected and toggled to get a new detection hash value. The two hash values are compared and the number of changed bits is counted as $B_i$. This test is performed J times, and the corresponding distribution of the changed bit number is shown in Fig. 4, where J = 2048. We use the following four statistics, where N is the number of tests (N = J):

Mean changed bit number: $\bar{B} = \frac{1}{N} \sum_{i=1}^{N} B_i$.

Mean changed probability: $P = (\bar{B}/128) \times 100\%$.

$$\Delta B = \sqrt{\frac{1}{N-1} \sum_{i=1}^{N} (B_i - \bar{B})^2},$$

$$\Delta P = \sqrt{\frac{1}{N-1} \sum_{i=1}^{N} (B_i/128 - P)^2} \times 100\%.$$

The results of the tests using J = 256, 512, 1024, and 2048 are listed in Table 1. Both the mean changed bit number $\bar{B}$ and the mean changed probability P are very close to the ideal values of 64 bits and 50%, respectively. This indicates that the proposed algorithm has a very strong capability for diffusion and confusion. Moreover, since $\Delta B$ and $\Delta P$ are very small, this capability is very stable.
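The four statistics can be computed directly from a list of changed-bit counts. The following is a small sketch; the function name `diffusion_stats` and the dictionary keys are hypothetical, and the input list stands for the $B_i$ values that a real experiment would produce.

```python
from math import sqrt


def diffusion_stats(changed_bits: list[int], hash_bits: int = 128) -> dict[str, float]:
    """Mean and spread of the changed-bit counts B_i over N tests."""
    n = len(changed_bits)
    mean_b = sum(changed_bits) / n
    mean_p = mean_b / hash_bits                      # as a fraction, not a percentage
    delta_b = sqrt(sum((b - mean_b) ** 2 for b in changed_bits) / (n - 1))
    delta_p = sqrt(sum((b / hash_bits - mean_p) ** 2 for b in changed_bits) / (n - 1))
    return {
        "mean_B": mean_b,
        "mean_P_percent": 100 * mean_p,
        "delta_B": delta_b,
        "delta_P_percent": 100 * delta_p,
    }


stats = diffusion_stats([60, 64, 68])   # toy input, not data from the paper
```

Since $\Delta P$ is $\Delta B$ rescaled by 1/128 and expressed as a percentage, the two spread measures always carry the same information.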

Fig. 4. Distribution of changed bit number.

Table 1. Statistics of the number of changed bits $B_i$.

|  | J = 256 | J = 512 | J = 1024 | J = 2048 | Mean |
|---|---|---|---|---|---|
| $\bar{B}$ | 64.5430 | 64.3184 | 63.7266 | 64.1777 | 64.1914 |
| P (%) | 50.42 | 50.25 | 49.79 | 50.14 | 50.15 |
| $\Delta B$ | 5.6359 | 5.7108 | 5.7475 | 5.5859 | 5.6700 |
| $\Delta P$ (%) | 4.40 | 4.46 | 4.49 | 4.36 | 4.405 |

3.1.3. Analysis of collision resistance

The mechanism of both changeable-parameter and self-synchronization expedites the avalanche effect. The following test is conducted. The detection hash value of the original message is generated and stored in ASCII format. Then a bit in the message is randomly selected and toggled. A new detection hash value is generated and stored in ASCII format. The two hash values are compared, and the number of ASCII characters with the same value at the same location in the hash value, namely the number of hits, is counted. Moreover, the absolute difference of the two hash values is calculated using the formula $d = \sum_{i=1}^{N} |t(e_i) - t(e'_i)|$, where $e_i$ and $e'_i$ are the ith ASCII characters of the original and the new hash value, respectively, and the function t converts the entries to their equivalent decimal values.

This collision test is performed 2048 times using the secret key $X_0 = 0.232323$ and $P_0 = 0.858485$. The number of 0-hits is 1932, the number of 1-hits is 114, and the number of 2-hits is 2. The maximum, mean, and minimum values of d as well as the mean per character are listed in Table 2. A plot of the distribution of the number of hits is given in Fig. 5. Note that the maximum number of equal characters is only 2, so the degree of collision is very low.
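The two collision measures are simple per-character comparisons. The sketch below assumes each 128-bit hash is held as 16 bytes, one per ASCII character; the function name `collision_metrics` is hypothetical.

```python
def collision_metrics(h1: bytes, h2: bytes) -> tuple[int, int]:
    """Return (hits, d): equal characters at equal positions, and summed absolute difference."""
    hits = sum(a == b for a, b in zip(h1, h2))
    d = sum(abs(a - b) for a, b in zip(h1, h2))
    return hits, d


# Illustration with the C1 and C2 hash values listed in Section 3.1.1.
c1 = bytes.fromhex("58196491612A7E56D6E7516B012D2529")
c2 = bytes.fromhex("580BDDEAE5730A88421E5A5375C16043")
hits, d = collision_metrics(c1, c2)
```

Note that C2 differs from C1 by a character change rather than a single toggled bit, so this pair only illustrates the computation and is not one of the 2048 trials.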

Table 2. Absolute differences of two hash values.

| Maximum | Minimum | Mean | Mean/character |
|---|---|---|---|
| 2224 | 573 | 1401.1 | 87.5625 |

Fig. 5. Distribution of the number of ASCII characters with the same value at the same location in the hash value.

3.2. Localization hash performance

In this section, we focus on the performance of the 128-bit localization hash part. In the proposed algorithm, the 256 blocks of the look-up table are assigned to 16 horizontal groups and 16 vertical groups, as shown in Fig. 1. Each horizontal or vertical group generates the corresponding 4-bit $LHK_i$ $(i = 1, 2, \ldots, 16)$ or $LVK_j$ $(j = 1, 2, \ldots, 16)$, which together constitute the final 128-bit localization hash value LHASH. Note that each block belongs to one particular horizontal group and one particular vertical group at the same time. Therefore, any tiny modification in one block (at the lowest level, in one character of the original message held in that block) will very likely change the corresponding $LHK_i$ and $LVK_j$. Theoretically, if a particular modification occurs, the probability of a miss in one direction is $1/2^4$, and in both directions $1/2^8$, which are quite small.

The following modification localization test is conducted. The message is the same as before, with a length of 572 characters. The random tiny modifications are listed in Table 3. The comparison between the new and former localization hash values indicates that there are modifications within H_Group 1, H_Group 9, H_Group 16, V_Group 1, V_Group 6, V_Group 8, and V_Group 16. The possibly modified blocks lie in the intersections of these groups, namely within 12 blocks: all 4 modified blocks have been correctly located, but 8 innocent blocks have also been flagged as false alarms. The remaining 244 blocks are modification-free. From another perspective, it is more convenient to identify the modification-free area. When only a few modifications are scattered in the message, this kind of localization is meaningful.

Theoretical analysis and simulation results indicate that the proposed algorithm can realize modification localization, although some limitations remain, such as miss probability and false alarms. Finding more suitable solutions to overcome them is our future work, for example, by extending the length of the 4-bit $LHK_i$ and $LVK_j$, increasing the number of groupings, or introducing a random number into the grouping.

3.3. Analysis of speed

For speed comparison among different algorithms, the numbers of multiplicative operations required for each ASCII character (8 bits) of the message during the hash process are listed in Table 4.
Since each multiplicative operation consumes much more time than each additive operation, this kind of comparison is objective, regardless of the implementation platform. By this measure, the proposed algorithm is the fastest of those compared.

The complexity of the proposed algorithm is calculated as follows. Each group has 16 blocks, and there are s characters in each block. First, the pre-processing of each character in each block needs 4 multiplicative operations. Therefore, all 16 blocks in each group need 64s multiplicative operations. Second, the processing of all 16 blocks in each group needs 4 × 16 + 2 = 66 multiplicative operations. In all, the number of multiplicative operations required for each character in the proposed algorithm is (64s + 66)/(16s) = 4 + 33/(8s). As the number of characters in the message increases, the number of multiplicative operations required for each character approaches 4 from above. Furthermore, since the proposed algorithm supports the parallel mode, its efficiency advantage is pronounced, especially compared with other hash algorithms in the sequential mode.

3.4. Security of key

In the proposed algorithm, the chaotic sensitivity to tiny changes in initial conditions and parameters as well as the mechanism of both changeable-parameter and self-synchronization are fully utilized. There exists a complicated, nonlinear, and sensitive dependence among the message, the hash value, and the secret key. Therefore, the algorithm is immune to key recovery attacks.

To investigate the key space size, we conduct the following evaluations. Let the tiny change of the initial $X_0$ be at least $10^{-16}$. For example, when $X_0$ is changed from 0.232323 to 0.2323230000000001, the corresponding number of changed bits in the detection hash value is around 64. However, if the tiny change of $X_0$ is set to $10^{-17}$, no hash bit changes. Therefore, the sensitivity to $X_0$ is considered to be $10^{-16}$. Similarly, the sensitivity to the initial $P_0$ can be considered to be $10^{-15}$.
Considering $X_0 \in (0, 1)$ and $P_0 \in (0, 1)$, it can be derived that the size of the key space is approximately $2^{103}$, which is large enough to resist brute-force attacks.
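The key-space figure can be checked with a short calculation, assuming that the number of distinguishable keys is the number of steps of size $10^{-16}$ for $X_0$ times the number of steps of size $10^{-15}$ for $P_0$ over the unit interval:

```latex
\[
\frac{1}{10^{-16}} \times \frac{1}{10^{-15}} = 10^{31},
\qquad
\log_2 10^{31} = 31 \log_2 10 \approx 31 \times 3.3219 \approx 102.98 ,
\]
```

so the number of distinguishable keys is about $2^{103}$ under this counting.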

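Table 3. Pending modifications.

| Modification content | Character order | Block order | Group order (horizontal) | Group order (vertical) |
|---|---|---|---|---|
| A to B | 1 | 1 | H_Group 1 | V_Group 1 |
| u to v | 6 | 6 | H_Group 1 | V_Group 6 |
| p to q | 256 | 256 | H_Group 16 | V_Group 16 |
| m to n | 372 | 136 | H_Group 9 | V_Group 8 |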
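The localization step for Table 3 can be sketched as an intersection of flagged groups. The sketch assumes that LHASH is held as a 128-bit integer with $LHK_1$ in the leftmost 4 bits and $LVK_{16}$ in the rightmost, and that every modified group actually changes its 4-bit key (no miss). The function names are hypothetical.

```python
def groups_of(block: int) -> tuple[int, int]:
    """(H_Group, V_Group) of a 1-based block index in the 16 x 16 look-up table."""
    return (block - 1) // 16 + 1, (block - 1) % 16 + 1


def flagged_groups(lhash_old: int, lhash_new: int) -> tuple[set[int], set[int]]:
    """Groups whose 4-bit key differs between two 128-bit localization hashes."""
    diff = lhash_old ^ lhash_new
    nibbles = [(diff >> (4 * (31 - k))) & 0xF for k in range(32)]   # left to right
    h = {k + 1 for k in range(16) if nibbles[k]}
    v = {k - 15 for k in range(16, 32) if nibbles[k]}
    return h, v


def candidate_blocks(flagged_h: set[int], flagged_v: set[int]) -> list[int]:
    """Blocks at the intersections of flagged horizontal and vertical groups."""
    return sorted(16 * (h - 1) + v for h in flagged_h for v in flagged_v)


# Block orders from Table 3; assumes no group's key change is missed.
modified = [1, 6, 256, 136]
flagged_h = {groups_of(b)[0] for b in modified}
flagged_v = {groups_of(b)[1] for b in modified}
candidates = candidate_blocks(flagged_h, flagged_v)
```

With three flagged horizontal groups and four flagged vertical groups, the candidate list has 3 × 4 = 12 blocks, which contains the 4 modified blocks plus 8 false alarms, as reported in Section 3.2.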

Table 4. Required multiplicative operations for each character in different algorithms.

|  | Ref. [5] | Ref. [6] | Ref. [7] | This paper |
|---|---|---|---|---|
| Multiplication |  | 11.7 | 32 (by software) / 8 (by hardware) | 4 + 33/(8s) (a) |

(a) s represents the number of 256-character (2048-bit) divisions of the message after padding, namely the number of characters within each block.
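The per-character cost can be evaluated for a concrete message length as follows. This is a sketch with hypothetical function names; it assumes 8-bit characters and the padding rule of Section 2.1.1, and uses exact fractions to avoid rounding.

```python
from fractions import Fraction


def padded_divisions(num_chars: int) -> int:
    """Number s of 2048-bit divisions after padding a message of 8-bit characters."""
    m = 8 * num_chars
    n = (1984 - m) % 2048 or 2048      # padding length, 1 <= n <= 2048
    return (m + n + 64) // 2048


def mults_per_char(s: int) -> Fraction:
    """(64s + 66) multiplications per group, spread over the 16s characters of the group."""
    return Fraction(64 * s + 66, 16 * s)


s = padded_divisions(572)              # the 572-character test message
cost = mults_per_char(s)
```

For the 572-character test message this gives s = 3 and 43/8 = 5.375 multiplications per character; the overhead term 33/(8s) shrinks as the message grows.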


3.5. Implementation and flexibility

In the proposed algorithm, double-precision floating-point arithmetic is involved. As long as the IEEE 754 floating-point standard [13] and the same order of operations are implemented on two computing platforms, the hash values of a message with the same key produced on both platforms will be the same, regardless of operating system or programming language.

By simply modifying the way $X_{32}, X_{33}, X_{34}$ are processed in step (2), the length of the final hash value can be adjusted. Compared to conventional hash algorithms such as MD5, with its fixed 128-bit length, the proposed algorithm can be better adapted to actual demand.

4. Conclusion

In this paper, a chaos-based hash function supporting parallel processing is proposed. Compared to other existing hash functions, the proposed algorithm can not only detect a modification but also locate it. Besides, its efficiency advantage is pronounced. Theoretical analysis and computer simulation indicate that it can satisfy the performance requirements of hash functions. It is concluded that the proposed algorithm is simple, efficient, practicable, and reliable.

Acknowledgements

Our sincere thanks go to the anonymous reviewers for their valuable comments. The work described here was supported by the National Natural Science Foundation of China (Grant Nos. 60703035, 60973114), the Program for New Century Excellent Talents in University of China (Grant No. NCET-08-0603), and the Natural Science Foundation Project of CQ CSTC (Grant Nos. 2008BB2193, 2009BA2024, 2009BB2208).

References
[1] Dachselt F, Schwarz W. Chaos and cryptography. IEEE Trans Circuits Syst I 2001;48:1498–509.
[2] Fridrich J. Symmetric cipher based on two-dimensional chaotic maps. Int J Bifurcat Chaos 1998;8:1259–84.
[3] Xiang T, Wong KW, Liao XF. An improved chaotic cryptosystem with external key. Commun Nonlinear Sci Numer Simul 2008;13:1879–87.
[4] Wang XY, Yu Q. A block encryption algorithm based on dynamic sequences of multiple chaotic systems. Commun Nonlinear Sci Numer Simul 2009;14:574–81.
[5] Schneier B. Applied cryptography: protocols, algorithms, and source code in C. 2nd ed. New York: Wiley; 1996.
[6] Stinson DR. Cryptography: theory and practice. Boca Raton, FL: CRC Press; 1995.
[7] Wong KW. A combined chaotic cryptographic and hashing scheme. Phys Lett A 2003;307:292–8.
[8] Xiao D, Liao XF, Wong KW. Improving the security of a dynamic look-up table based chaotic cryptosystem. IEEE Trans Circuits Syst II 2006;53:502–6.
[9] Xiao D, Liao XF, Deng SJ. One-way Hash function construction based on the chaotic map with changeable parameter. Chaos Solitons Fract 2005;24:65–71.
[10] Yi X. Hash function based on chaotic tent maps. IEEE Trans Circuits Syst II 2005;52:354–7.
[11] Zhang JS, Wang XM, Zhang WF. Chaotic keyed hash function based on feedforward–feedback nonlinear digital filter. Phys Lett A 2007;362:439–48.
[12] Lian SG, Sun JS, Wang ZQ. Secure hash function based on neural network. Neurocomputing 2006;69:2346–50.
[13] Goldberg D, Priest D. What every computer scientist should know about floating-point arithmetic. ACM Comput Surv 1991;23:5–48.
