
Hiding Images using AI — Deep Steganography

Harshvardhan Gupta
Feb 12, 2018

Deep Learning keeps producing genuinely new capabilities. From areas like Style Transfer to
Unsupervised Translation, it is constantly pushing the boundaries of what computers can do. Interestingly, we
have not yet reached an upper bound, and new papers with strong results appear very often.
This post discusses one such recent paper — Deep Steganography.
This post is based on the NIPS 2017 paper Hiding Images in Plain Sight: Deep Steganography.
At the end of this article, I will provide links to my TensorFlow implementation, and a demo that
you can access on the web.
What is Steganography?

Steganography is the practice of hiding one piece of data inside another. An example would be to
hide an image inside another image. The key difference between cryptography and steganography
is that in steganography, the carrier image looks unchanged, and therefore will not be scrutinised or
analysed by intermediaries.

Figure 1.0: Example of Steganography for Images


Figure 1.0 shows a general steganography framework. It consists of two inputs: a Secret Image and
a Cover Image. The Secret Image is the image you want to hide. The Cover Image is the image that
should ‘cover’ the secret image. These two inputs are passed through some Hiding Algorithm to
generate the Output Image. The output should look exactly like the cover image, but upon applying
a Revealing Algorithm, it will yield the secret image.
Thus, to an unsuspecting eye, the output will look like an ordinary image, but it also
contains a secret image.
Problems with Existing Methods
Methods for hiding images inside other images already exist, but they have a few problems.

1. They are very easy to decode, because the way information is encoded is fixed.
2. The amount of information that can be hidden is generally small. Hiding an image of the same
size as the cover will probably lose a fair amount of information.
3. In the case of images, the algorithms don’t exploit the structure of images. They don’t use the
patterns found in natural images.
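Problem 1 is easy to see with the classic least-significant-bit (LSB) technique, which the paper contrasts against. Here is a minimal numpy sketch (not the paper's method): the secret's top 4 bits are stored in the cover's low 4 bits. Anyone who knows the fixed scheme can extract the secret, and only half of each pixel's precision survives.

```python
import numpy as np

def lsb_hide(cover, secret, bits=4):
    """Store the top `bits` bits of `secret` in the low bits of `cover`.
    Both inputs are uint8 arrays of the same shape."""
    high_mask = (0xFF >> bits) << bits          # e.g. 0b11110000 for bits=4
    return (cover & high_mask) | (secret >> (8 - bits))

def lsb_reveal(container, bits=4):
    """Recover an approximation of the secret from the container's low bits."""
    return (container & ((1 << bits) - 1)) << (8 - bits)

rng = np.random.default_rng(0)
cover = rng.integers(0, 256, size=(8, 8, 3), dtype=np.uint8)
secret = rng.integers(0, 256, size=(8, 8, 3), dtype=np.uint8)

container = lsb_hide(cover, secret)
revealed = lsb_reveal(container)

# The container keeps the cover's high bits, and the revealed image
# matches the secret's top 4 bits exactly -- but the low 4 bits are lost.
assert np.all((container >> 4) == (cover >> 4))
assert np.all((revealed >> 4) == (secret >> 4))
```

Because the encoding rule is the same for every image, a statistical test over the low bits (which is what tools like StegExpose do) can flag such containers.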

The Solution — A Neural Network

Convolutional Neural Networks have been shown to learn structures that correspond to logical features.
These features increase their level of abstraction as we go deeper into the network. Using a ConvNet
can address all the problems mentioned above. Firstly, the ConvNet will have a good model of the
patterns of natural images, and will be able to decide which areas are redundant, so that
more pixels can be hidden there. By saving space on redundant areas, the amount of hidden
information can be increased. And because the architecture and the weights can be randomised, the exact
way in which the network hides the information cannot be known to anybody who doesn’t have
the weights.

The Architecture

The entire network architecture is surprisingly similar to autoencoders. In general, autoencoders
are trained to reproduce their input after a series of transformations. By doing this, they learn about the
features of the input distribution.

In this case, the architecture is slightly different. Instead of merely reproducing one image, the
network has to hide an image as well as reproduce another image.
Figure 2.0: Network Architecture
The whole framework consists of three parts: the Prepare Network, the Hide Network, and the
Reveal Network.

The Prep Network takes in the secret image and ‘prepares’ it. The Hide Network takes in the output
of the Prep Network as well as the cover image. These two inputs are first concatenated along the
channels axis. The Hide Network outputs an image — the container — which is the image
that contains the secret but looks like the cover.
In order to get the secret image back, the container needs to be passed to the Reveal Network,
which outputs an image that looks like the secret.
The actual architecture of each of the three networks is roughly similar, and there is a lot of room for
experimentation. I used 4 convolutions with (3x3), (4x4), and (5x5) kernels on the input (50 feature
maps each) before concatenating. Then I did another 3 convolutions on the concatenated feature maps.
After that, I did a 1x1 convolution to produce 3 channels. You can read the actual details in the
implementation code and the diagram in my repo.
The Network Losses

The loss is fairly straightforward. It is:

L(c, c′, s, s′) = ‖c − c′‖² + β·‖s − s′‖²

Figure 3.0: Network Loss

where c is the input cover image, c′ is the produced covered (container) image, s is the input secret
image, and s′ is the revealed secret image.
The loss is the standard MSE between the actual cover image and the produced covered image,
plus β times the MSE between the actual secret image and the revealed secret image. Beta is a
hyperparameter that controls how strongly the secret's reconstruction is weighted. Thus the loss
optimizes for the following statement:
“The covered image should look very close to the cover image, and when revealed, the revealed
image should look very close to the secret image.”
Since the function is differentiable, the entire network can be trained end to end.
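The loss is simple enough to write down directly. A minimal numpy version (the synthetic inputs below are just for illustration, using the β=0.75 value from my experiments):

```python
import numpy as np

def steg_loss(cover, covered, secret, revealed, beta=0.75):
    """L = MSE(c, c') + beta * MSE(s, s'), the combined network loss."""
    cover_err = np.mean((cover - covered) ** 2)
    secret_err = np.mean((secret - revealed) ** 2)
    return cover_err + beta * secret_err

rng = np.random.default_rng(0)
c = rng.random((16, 16, 3)).astype(np.float64)
s = rng.random((16, 16, 3)).astype(np.float64)
c_out = c + 0.01   # nearly perfect cover reconstruction (error 0.01/pixel)
s_out = s + 0.10   # noisier secret recovery (error 0.10/pixel)

loss = steg_loss(c, c_out, s, s_out, beta=0.75)
# 0.01**2 + 0.75 * 0.10**2, i.e. approximately 0.0076
```

A larger β pushes the network to reconstruct the secret more faithfully, at the cost of the container drifting further from the cover.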
Results

The paper reports results that are substantially better than existing methods. There is a tool called
StegExpose, which can detect whether or not an image has something hidden in it. Images produced
by existing methods are fairly easy to flag as tampered; however, this method is able to fool
StegExpose.

Figure 4.1: Results from the Paper


Figure 4.2: Results from my Implementation(with β=0.75)
Why is this Useful?

You may wonder what the point of hiding images is. Apart from its uses by intelligence services,
it has a use that I find more appealing — digital copyrighting. A copyright image can be hidden inside
another image. If that image is wrongfully stolen, the original author can reveal the hidden copyright
to prove ownership. By using systems that make this mark hard to detect or remove, it becomes harder
to steal digital media and get away with it.

Conclusion

We looked at a very new method for improving the state of the art in steganography. This opens
up a lot of new possibilities. The same approach could possibly be applied to other media such as
audio and video.
Also, hiding smaller secrets in larger covers would allow the prep + hide networks to achieve
even higher-quality results.
Call to Action

If you liked this article, hold the 👏🏻 for as long as you want to. If you want me to talk about
some cool Deep Learning concepts, feel free to comment.
Additional Resources

 TensorFlow Implementation
 Browser Demo of the Paper (warning: it takes 3 to 4 minutes to load). It is slow because it runs
entirely in the browser. If you are a front-end developer and would like to improve the demo, send
a pull request.

 StegExpose
