L14: Steganography
! An encrypted file is visibly trying to hide something
2 Steganography (contd.)
3 An Example
PERSHINGSAILSFROMNYJUNEI
4 Another Example
5 One More Example
[Figure panels: "Original file", "LSB extraction", "Stego file"]
Source: http://www.petitcolas.net/fabien/steganography/mp3stego/
6 Understanding File Formats
! Before you can perform steganography, you must understand how data is encoded in a file
  ! The file format
  ! Then identify how the data can be overwritten without any perceptible change
! Standard bitmap file formats
  ! GIF, JPEG, TIFF, BMP
! Standard audio file formats
  ! WAV, MP3
! Commonplace formats raise the least suspicion

7 Steganography Methods
! Can only hide a certain amount of information
! Data insertion
  ! Insert data at redundant places of the original file
    ! Data after the EOI marker in a JPEG file is ignored; hide there
  ! Insert data into padded bytes
    ! Many file formats employ word alignment
    ! Recall how you used to ignore data while reading MFT records
  ! Insert data in ways that will not be revealed in typical viewing
    ! Comments in an HTML file are not shown by a browser
    ! An image as a frame in a video file
    ! 72 frames-per-second in WMV; you won't even see the image come and go
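The JPEG case of data insertion can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the "cover" in the usage note is a toy byte string rather than a real image, and the 4-byte length trailer is a framing choice made for this sketch (not part of the JPEG format) so the payload can be found again without scanning for markers.

```python
import struct

JPEG_EOI = b"\xff\xd9"  # End Of Image marker that closes a JPEG stream


def append_after_eoi(cover: bytes, payload: bytes) -> bytes:
    """Append payload after the EOI marker (hypothetical helper).

    A 4-byte big-endian length is added at the very end; that trailer is an
    assumption of this sketch so that extraction knows how much to read.
    """
    if not cover.endswith(JPEG_EOI):
        raise ValueError("cover does not end with an EOI marker")
    return cover + payload + struct.pack(">I", len(payload))


def extract_trailing(stego: bytes) -> bytes:
    """Recover the payload written by append_after_eoi."""
    (length,) = struct.unpack(">I", stego[-4:])
    return stego[-4 - length:-4]
```

For example, with a toy cover `b"\xff\xd8" + b"\x00" * 4 + b"\xff\xd9"`, `extract_trailing(append_after_eoi(cover, b"secret"))` returns `b"secret"`, and the original bytes up to the EOI marker are untouched. Note that the file grows by the payload size, which is why insertion is easy to spot by file length alone.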
8 Web Page Source
[Figure: web page source, annotated "The correct SHA1 sums"]

9 Steganography Methods (contd.)
! Data substitution
  ! Substitute data in the byte stream of the cover file
    ! Pixels, palettes, audio sampling points, …
  ! Done in a fixed manner
  ! Substitute data in a pseudo-random manner
    ! Use a random number generator to decide the order of bytes to be substituted
    ! Random number seed can act like a password to retrieve hidden data
    ! The same random number generator with the same seed will give the same sequence of numbers
! Hidden data is typically encrypted
  ! You still have to decrypt the data after extracting it
  ! Without decryption, the extracted hidden data itself will look random
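The idea of a seed acting like a password can be sketched as follows. This is a minimal illustration, not a method taken from the slides: the function names are hypothetical, the payload is written one bit per cover byte into the least significant bit, and Python's `random` module is used only because it is deterministic for a given seed (it is not a cryptographic generator).

```python
import random


def _positions(seed: int, cover_len: int, n_bits: int) -> list:
    """Pseudo-random, non-repeating byte positions derived from the seed."""
    if n_bits > cover_len:
        raise ValueError("payload does not fit in the cover")
    return random.Random(seed).sample(range(cover_len), n_bits)


def embed(cover: bytes, payload: bytes, seed: int) -> bytes:
    """Write payload bits into the LSBs of seed-selected cover bytes."""
    bits = [(byte >> (7 - i)) & 1 for byte in payload for i in range(8)]
    stego = bytearray(cover)
    for pos, bit in zip(_positions(seed, len(cover), len(bits)), bits):
        stego[pos] = (stego[pos] & 0xFE) | bit
    return bytes(stego)


def extract(stego: bytes, n_bytes: int, seed: int) -> bytes:
    """Re-create the same positions from the seed and read the bits back."""
    bits = [stego[p] & 1 for p in _positions(seed, len(stego), n_bytes * 8)]
    out = bytearray()
    for i in range(0, len(bits), 8):
        value = 0
        for bit in bits[i:i + 8]:
            value = (value << 1) | bit
        out.append(value)
    return bytes(out)
```

With `cover = bytes(range(64))`, calling `extract(embed(cover, b"HI", 1234), 2, 1234)` returns `b"HI"`. Extracting with a different seed reads a different set of positions, so the result is generally not the hidden message.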
10 LSB Substitution
! Changing the least significant bit (LSB) will result in ±1 in the value of the bit string
  ! Smallest possible change
  ! 1111011 = 123; 1111010 = 122
! LSB substitution
  ! Overwrite LSBs of byte streams in the cover file with bits from the data to hide
  ! Sometimes you may have to substitute at the end of two or more bytes
  ! Substituting 2 LSBs will allow more information to be packed in the cover file

11 LSB Substitution (contd.)
LSB substitution of 11001010 (at the byte level)
FF       05       D2       6A       2B       FC       91       CA
11111111 00000101 11010010 01101010 00101011 11111100 10010001 11001010
! Substitution does not mean every LSB will change
  ! Some will coincide with what is already in there
  ! Change to the cover file will be much lower!
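Byte-level LSB substitution can be sketched as below, using the eight bytes shown on the slide. The function names are hypothetical, and the bit order (most significant message bit goes into the first byte) is an assumption of this sketch. The text does not say whether those eight bytes are the cover or the result; either way, their LSBs already spell 11001010.

```python
def embed_byte(cover: bytes, value: int) -> bytes:
    """Hide one byte in the LSBs of the first eight cover bytes."""
    if len(cover) < 8:
        raise ValueError("need at least eight cover bytes")
    out = bytearray(cover)
    for i in range(8):
        bit = (value >> (7 - i)) & 1       # message bits, MSB first
        out[i] = (out[i] & 0xFE) | bit     # clear the LSB, then set it
    return bytes(out)


def extract_byte(stego: bytes) -> int:
    """Read the hidden byte back from the first eight LSBs."""
    value = 0
    for b in stego[:8]:
        value = (value << 1) | (b & 1)
    return value


slide_bytes = bytes.fromhex("FF05D26A2BFC91CA")
```

Here `extract_byte(slide_bytes)` gives `0b11001010`, and `embed_byte(slide_bytes, 0b11001010)` returns the bytes unchanged, which illustrates the point that substitution does not change every LSB. Embedding `0x00` instead changes only the four bytes whose LSB was 1.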
12 LSB Substitution in Graphics Files
! Graphics file formats store color information corresponding to each pixel of the image
  ! Some are stored in a straightforward manner; e.g. BMP
  ! Some need to undergo elaborate decoding; e.g. JPEG
! Each pixel of an image can be made up of four channels
  ! 1 byte for each channel (32 bits per pixel)
  ! Channels: Alpha - Red - Green - Blue
00000000 11001110 11001100 00101000
Alpha    Red      Green    Blue
Pixel: 00 CE CC 28
! Some formats do not store the alpha channel (24 bits per pixel)

13 LSB Substitution in Graphics Files
! Replace the LSB in each channel with bits from the message to hide
Original: 00 CE CC 28 00 95 37 35
          00 37 60 92 00 FF 00 00
Hide the word HI
ASCII: 01001000 01001001
LSB:   0  1  0  0  1  0  0  0
Stego: 00 CF CC 28 01 94 36 34
LSB:   0  1  0  0  1  0  0  1
Stego: 00 37 60 92 01 FE 00 01
! How many bytes can you hide in this manner in an M-by-N image?
! Using two LSBs would allow us to hide 1 byte per pixel
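The "HI" example and the capacity question can be checked with a short sketch. The function names are hypothetical, and it assumes, as the example appears to, that all four channels (including alpha) each carry one message bit, in order, most significant bit first.

```python
def embed_text(pixels: bytes, text: str) -> bytes:
    """Replace the LSB of each channel byte with successive message bits."""
    bits = [(byte >> (7 - i)) & 1
            for byte in text.encode("ascii") for i in range(8)]
    if len(bits) > len(pixels):
        raise ValueError("message does not fit in the pixel data")
    out = bytearray(pixels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit
    return bytes(out)


def capacity_bytes(width: int, height: int,
                   channels: int = 4, lsbs_per_channel: int = 1) -> int:
    """Whole bytes that fit when each channel donates its lowest bits."""
    return width * height * channels * lsbs_per_channel // 8


# The four ARGB pixels from the example (16 channel bytes).
pixels = bytes.fromhex("00CECC2800953735" "0037609200FF0000")
stego = embed_text(pixels, "HI")
print(stego.hex(" ").upper())
# 00 CF CC 28 01 94 36 34 00 37 60 92 01 FE 00 01
```

The eighth byte, 0x35, comes out as 0x34 because the last bit of "H" (01001000) is 0. Under the same assumptions an M-by-N image with four channels holds M*N/2 bytes with one LSB per channel and M*N bytes (one byte per pixel) with two LSBs per channel, for instance `capacity_bytes(640, 480)` is 153600 and `capacity_bytes(640, 480, lsbs_per_channel=2)` is 307200.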
! Digital audio data is created from analog sources based on
  ! A sampling rate: how many times will you look at the analog signal per second?
    ! Each time you look (and record), you create a sample point
  ! A bit depth: how many bits will you use to represent the signal in each sample point?
    ! 16-bit CD audio, 24-bit HD audio, …
! LSB substitution is applied in every sample point of the audio file
  ! 1 second of 16-bit audio sampled at 44.1 kHz: how many bytes can you hide?
    ! 44.1 kHz = 44100 samples per second
  ! Changes in lower bit depth audio will be noticeable

! Analyzing files to detect if data is hidden in them through the use of steganography methods
! Look for statistical inaccuracies in the files
  ! Look for inaccuracies in the way data is packed
  ! Example: compressed data should have been better compressed
! It's an evolving field and much remains to be done!
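For the question about one second of 16-bit audio at 44.1 kHz, a rough sketch of LSB substitution on audio samples follows. It assumes mono, signed 16-bit little-endian PCM (the layout commonly found in WAV data chunks) with any file header already stripped; the function names are hypothetical. With one bit per sample, 44100 samples give 44100 bits, i.e. 5512 whole bytes.

```python
import struct


def embed_in_pcm16(pcm: bytes, payload: bytes) -> bytes:
    """Put one payload bit into the LSB of each 16-bit sample."""
    n = len(pcm) // 2
    samples = list(struct.unpack("<%dh" % n, pcm[:2 * n]))
    bits = [(byte >> (7 - i)) & 1 for byte in payload for i in range(8)]
    if len(bits) > n:
        raise ValueError("payload needs more samples than available")
    for i, bit in enumerate(bits):
        samples[i] = (samples[i] & ~1) | bit   # sample changes by at most 1
    return struct.pack("<%dh" % n, *samples) + pcm[2 * n:]


def extract_from_pcm16(pcm: bytes, n_bytes: int) -> bytes:
    """Read n_bytes back from the LSBs of the first 8 * n_bytes samples."""
    samples = struct.unpack("<%dh" % (n_bytes * 8), pcm[:n_bytes * 16])
    out = bytearray()
    for i in range(0, len(samples), 8):
        value = 0
        for s in samples[i:i + 8]:
            value = (value << 1) | (s & 1)
        out.append(value)
    return bytes(out)


# One second of mono audio at 44100 samples per second (toy signal).
one_second = struct.pack("<44100h", *([1000, -1000] * 22050))
capacity = (len(one_second) // 2) // 8   # whole bytes at one bit per sample
```

Here `capacity` is 5512. A change of 1 in a 16-bit sample is tiny relative to the range of 65536 values, which is consistent with the remark that the same change is more noticeable at lower bit depths.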
16 Simple Steganalysis Methods
! Compare suspect file to good or bad file versions
  ! Not always possible since the original need not be available
! Compare hash values
  ! Can be done if data is stored in well-known files
  ! Also possible if file embeds unmodified hash information
! Mathematical calculations
  ! More reliable and accurate

17 Enhanced LSB Method
! Useful when hidden data generates a pattern that the original does not
! Enhanced LSB
  ! Assume that data is hidden using LSB substitution
  ! For every stream (a pixel/a sample possibly subjected to substitution), extract the LSB
    ! For a 24-bit image, that would be the LSBs in the red, green and blue color channels
  ! Modify the data so that only these LSBs define the stream
    ! If the LSBs from a pixel are 0, 1 and 1, then the color channels become red=0x00, green=0x01 and blue=0x01 respectively
  ! Since there is hardly any noticeable difference between a color value of 0x00 and 0x01, enhance all those channels that have a value of 0x01
    ! Enhance means making the value the highest possible, i.e. 0xFF here
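The enhancement step amounts to mapping every channel byte to 0x00 or 0xFF according to its LSB. A minimal sketch, assuming the input is the raw channel bytes of a 24-bit image with headers and row padding already removed (the function name is hypothetical):

```python
def enhance_lsb(channel_bytes: bytes) -> bytes:
    """Map each byte to 0xFF if its LSB is 1, else 0x00."""
    return bytes(0xFF if b & 1 else 0x00 for b in channel_bytes)


# A pixel whose red, green and blue LSBs are 0, 1 and 1.
pixel = bytes([0x28, 0xCF, 0x35])
enhanced = enhance_lsb(pixel)   # red=0x00, green=0xFF, blue=0xFF
```

Writing the enhanced bytes back in place of the original pixel data and viewing the image makes any structure in the LSB plane visible to the eye.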
! You will definitely get a noisy file
! But are there differentiating patterns in the noise itself?
[Figure labels: "Enhanced LSB"; "18KB of hidden text"]

! Enhanced LSB will not work if hidden data is random
  ! Can easily happen if you encrypt the data before hiding
! Chi-square test
  ! Assume that something is very likely to happen
    ! Random data has approximately the same number of zeros and ones
  ! Say what really happened was somewhat different than what was assumed
  ! This statistical test will tell you if the observed discrepancy is due to chance or other underlying factors
20 Computing Chi-Square
[Slide text partly lost; surviving fragments: "Number of times you …" and "… include row 3, and so on"]
$V = \sum_i \frac{(x_i - y_i)^2}{y_i}$
Probability that data is random = Pr(X > V), where X follows a chi-square distribution

21 A Clean Image
[Plot; x-axis: row number in image]
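The statistic V and the probability Pr(X > V) can be sketched for the simplest case mentioned in the slides: counting zeros and ones among the LSBs and comparing them with the equal split expected of random data. Several things here are assumptions rather than the lecture's exact procedure: the two categories (zeros and ones), the resulting single degree of freedom, the cumulative row-by-row evaluation (guessed from the "include row 3, and so on" fragment), and the function names. For one degree of freedom, Pr(X > V) equals erfc(sqrt(V / 2)), which avoids a dependency on a statistics library.

```python
import math


def chi_square_lsb(data: bytes) -> tuple:
    """Return (V, Pr(X > V)) for the zero/one counts of the LSBs in data."""
    n = len(data)
    if n == 0:
        raise ValueError("no data")
    ones = sum(b & 1 for b in data)
    zeros = n - ones
    expected = n / 2                      # y_i: equal split if LSBs are random
    v = ((zeros - expected) ** 2 / expected
         + (ones - expected) ** 2 / expected)
    p = math.erfc(math.sqrt(v / 2))       # chi-square tail, 1 degree of freedom
    return v, p


def cumulative_row_probabilities(rows: list) -> list:
    """Probability after rows 1..k for each k (assumed cumulative scheme)."""
    seen = bytearray()
    result = []
    for row in rows:
        seen.extend(row)
        result.append(chi_square_lsb(bytes(seen))[1])
    return result
```

A perfectly balanced LSB stream gives V = 0 and a probability of 1.0, while 100 bytes whose LSBs are all 0 give V = 100 and a probability close to 0. Published chi-square attacks typically test pairs of sample values rather than raw bit counts, so treat this as an illustration of the formula only.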
22 Same Image With Random Hidden Data
23 Image With Hidden Image
[Plots for slides 22 and 23; y-axis: probability that data is random; x-axis: row number in image]
24 Few Points to Remember
! Subjecting a stego file to lossy compression will result in loss of the hidden data
  ! Do not perform steganography on a BMP and then save as JPEG
! Clever steganography is about not changing the statistical properties of the original file while hiding data
! LSB substitution is a very simple method
  ! Easy to detect
! There are more interesting ones out there

25 References
! Kessler, G. C., An Overview of Steganography for the Computer Forensics Examiner, Forensic Science Communications, July 2004 (http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.90.8113)
! http://www.guillermito2.net/stegano/tools/index.html
! http://www.snotmonkey.com/work/school/405/methods.html
! Ch 10: B. Nelson, A. Phillips and C. Steuart, Guide to Computer Forensics and Investigations. ISBN: 978-1-435-49883-9