
A Visual Information Encryption Algorithm for Video Conference

Qiuzhen Li, Yingchun Shen, Lixia Li


Wuhan Digital Engineering Institute
Wuhan, China
qiuzhen_li@foxmail.com, syc_lq@yahoo.com.cn, lixiali709@yahoo.com.cn


Abstract--A typical video conference system must satisfy basic
requirements such as real-time operation and security. In the past
few years, many investigators have concentrated on developing
efficient video encryption algorithms, which fall mainly into three
categories: complete encryption, selective encryption, and
entropy coding based encryption. All three kinds of algorithms
have shortcomings. For example, complete encryption is safe but
computationally expensive; selective encryption usually destroys
the statistical characteristics of the DCT coefficients and thus
reduces compression efficiency; and entropy coding based
encryption suffers from the problem that some encrypted codeword
indices may have no corresponding codeword, since the codeword
index set of the coding table is only a subset of all possible
encrypted indices. This paper presents a lightweight complete
encryption algorithm based on an efficient XOR operation and a
hierarchical encryption strategy, which overcomes the weakness of
complete encryption while offering better security.
Key words--Video encryption; Group public key;
Watermarking; Video Conference
I. INTRODUCTION
With the rapid development of Internet technology and
video compression standards, it has become possible to transmit
video streams efficiently, which has given rise to new video
applications such as video conferencing [1]. A video conference,
particularly one involving secret content, usually has to meet
real-time and security requirements so that the video can be
watched smoothly and securely. In the past few years, despite
significant improvements in Internet bandwidth, real-time
performance has remained a challenging issue in video
conferencing because of the ever-increasing demand for better
visual quality, such as high resolution [2].
To achieve these goals, many investigators have paid
increasing attention to developing efficient video encryption
algorithms [3-5]. In contrast with ordinary text data, video
sequences are usually large in volume, follow special
compression standards, and have real-time demands. These
characteristics lead to problems in terms of video security,
real-time processing, compression performance, compliance
with the data format, etc. How to design an effective algorithm
that respects these characteristics is therefore an urgent problem.
The existing video encryption algorithms can be classified into
three categories based on the relationship between the encryption
and compression processes: complete encryption algorithms,
selective encryption algorithms, and encryption algorithms
based on entropy coding [6]. A complete encryption algorithm
often offers high security, but suffers from great computational
complexity because it encrypts all the data. A selective
algorithm, especially a DCT-based selective algorithm, gives
satisfying results with respect to real-time processing and
compliance with the video format, but degrades video
compression performance because encrypting the DCT
coefficients changes the statistics on which entropy coding
relies. That is, a video encryption algorithm should be well
integrated with the compression process so as to meet the
expectations for real-time transmission, error resilience, and
format compliance. In this sense, the entropy coding based
encryption algorithm seems to be the ideal choice for video
encryption. Unfortunately, this is not the case: such an
algorithm encrypts the index of a codeword, but the codeword
index set of the coding table is only a subset of all possible
encrypted indices, so some encrypted codeword indices may
have no corresponding codeword.
Clearly, all three kinds of algorithms still have
shortcomings. Moreover, the performance of video conferencing
needs to be further improved to satisfy new demands such as
high resolution and an open security infrastructure. To avoid
the aforementioned issues while addressing these new
requirements, this paper proposes a novel video encryption
algorithm that combines an efficient XOR operation with a
hierarchical encryption strategy to provide security guarantees
for visual information and identity authentication. In addition,
to save transmission bandwidth and thereby support better
visual quality such as high resolution, it adopts a novel
broadcast encryption for the downstream (i.e., the video stream
flowing from the main node to the sub nodes), which is
somewhat similar to that of the Conditional Access System
(CAS) [7] in digital TV but uses a different encryption
strategy. As a result, for a video conference session with one
main node and N sub nodes, the downstream transmission
bandwidth is reduced by a factor of approximately N.
Meanwhile, identity authentication can be conducted under a
public key infrastructure (PKI), since each node is allocated an
RSA-type key pair. The remainder of this paper is organized
as follows. Section II presents a visual information protection
framework and explains how to achieve the real-time and
security properties. Section III introduces an RSA-based group
key technique, which is the basis of the broadcast encryption in
Section II. Section IV shows empirically the performance of the
proposed scheme. Finally, Section V draws a conclusion.

___________________________________
978-1-4244-6943-7/10/$26.00 2010 IEEE

II. THE PROPOSED VISUAL CONTENT ENCRYPTION
FRAMEWORK
A typical video conference system is composed of one main
node and multiple sub nodes. Each sub node captures the local
visual information, compresses it in an efficient compression
format such as H.264, and transmits it to the main node. The
main node usually deploys a Multi-point Control Unit (MCU),
which integrates and compresses the visual information captured
from all nodes and then distributes it to all nodes. According to
the direction of the stream, the visual information exchanged
between the main node and the sub nodes can be classified into
two categories: downstream and upstream. Because the visual
information often involves commercial or military secrets, the
video stream must be encrypted prior to transmission. In existing
video conference systems, the upstream and downstream are
encrypted in the same way, which requires preparing distinct
encrypted data for each destination node. Under such an
encryption mode, the main node has to prepare multiple
encrypted copies for the sub nodes even though the visual
content of the downstream is identical for all of them. Such an
encryption mode for the downstream is evidently wasteful.
Although most video conference systems operate in a
high-bandwidth digital network environment such as the
Integrated Services Digital Network (ISDN), bandwidth is still a
scarce resource, especially when high resolution is demanded.
As summarized in Section I, existing video encryption
algorithms consider only the security of the video content.
Although they have made significant progress in time cost and
security, the security of a practical video conference system
should be considered comprehensively. That is, besides the
security of the visual information, the security of the system
should also be taken into account. This means that the
encryption algorithm for the video stream must be carefully
integrated with identity authentication and key encryption.
Based on these considerations, we propose a novel visual
information encryption algorithm in which the efficient XOR
operation, viewed as one kind of complete encryption, is used to
encrypt the video stream at lower computational cost than
current complete encryption such as AES or 3DES. In addition,
it adopts a hierarchical encryption strategy to ensure the
security of keys and sessions by hierarchically encrypting the
keys and providing RSA-based identity authentication. At the
same time, to further reduce the transmission bandwidth
overhead, it encrypts and transmits the downstream in a
broadcast way. Although this is similar to existing broadcast
encryption such as CAS for digital TV, the two are substantially
different. The detailed process is as follows. First, we introduce
how to implement identity authentication and encryption for the
downstream. Each node, whether main or sub node, is assigned
an RSA-based key pair, $\langle K_M, K_M^{-1} \rangle$ or
$\langle K_{S_i}, K_{S_i}^{-1} \rangle$, where $K_M$ and
$K_M^{-1}$ are the public and private keys of the main node, and
$K_{S_i}$ and $K_{S_i}^{-1}$ are the public and private keys of
the $i$-th sub node, respectively. The public key is issued as a
public key certificate (also referred to as a digital certificate),
which serves as the identity of each node. When a new session
starts, the main node and the sub nodes exchange and verify each
other's digital certificates. A sub node is permitted to join the
session only if it passes the verification. For a video conference
session, suppose that $N$ sub nodes have passed verification.
The main node then generates a new key $GPK$, called the group
public key, from the public keys $K_{S_1}, \ldots, K_{S_N}$ of the
$N$ sub nodes. An arbitrary message $m$ encrypted with $GPK$
can then be decrypted by any private key $K_{S_i}^{-1}$,
$i = 1, \ldots, N$, i.e.,
$$m = \mathrm{Decrypt}^{RSA}_{K_{S_i}^{-1}}\left(\mathrm{Encrypt}^{RSA}_{GPK}(m)\right).$$
Why $GPK$ has this property is explained in detail in
Section III. We first present the downstream encryption
algorithm, which is illustrated in Fig. 1.

Figure 1. Identity authentication and downstream encryption
As illustrated in Fig. 1, similar to broadcast encryption,
the downstream is encrypted hierarchically. The detailed
process at the main node is as follows:
(1) Authenticate each sub node based on its digital
certificate and generate a $GPK$ using the public keys of all
valid sub nodes;
(2) To ensure efficiency and security, the downstream is
first partitioned into two uneven parts (a larger one and a
smaller one), and the larger part is scrambled using the
efficient XOR operation with a control word $CW$. The smaller
part carries the encrypted message $\mathrm{Encrypt}_{SK}(CW)$
as a watermark, where $\mathrm{Encrypt}(\cdot)$ is a symmetric
encryption algorithm such as AES or 3DES and $SK$ is a session
key. To avoid sending the encrypted $CW$ in an additional
message packet, and thus to save bandwidth, a bit domain
watermarking algorithm from [8] is used to hide the encrypted
message in the smaller part of the downstream. Because both
the encryption and the watermarking operations are performed in
the lowest bit domain, they are computationally inexpensive;
(3) The session key $SK$, which is updated only when a sub
node departs or joins, is encrypted with $GPK$, and the
encrypted message $\mathrm{Encrypt}^{RSA}_{GPK}(SK)$ is
broadcast to every sub node, where
$\mathrm{Encrypt}^{RSA}(\cdot)$ is RSA, an asymmetric
encryption algorithm. The $GPK$ does not need to be passed to
the sub nodes, since any message encrypted with it can be
decrypted with any private key $K_{S_i}^{-1}$,
$i = 1, \ldots, N$.
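The watermark carrier in step (2) can be illustrated with a minimal sketch. This is not the algorithm of [8]; it assumes, for illustration only, that the smaller part of the stream is available as a list of integer codeword values and that each bit of the encrypted control word replaces the least significant bit of one value. The names `embed_bits` and `extract_bits` and the sample values are hypothetical.

```python
from typing import List


def embed_bits(codewords: List[int], bits: List[int]) -> List[int]:
    """Return a copy of codewords whose first len(bits) values carry bits in their LSB."""
    if len(bits) > len(codewords):
        raise ValueError("not enough codewords to carry the payload")
    marked = list(codewords)
    for position, bit in enumerate(bits):
        # Clear the least significant bit, then set it to the payload bit.
        marked[position] = (marked[position] & ~1) | (bit & 1)
    return marked


def extract_bits(codewords: List[int], count: int) -> List[int]:
    """Read back the LSB of the first count codewords."""
    return [value & 1 for value in codewords[:count]]


# Hypothetical carrier values and payload bits (e.g. part of an encrypted CW).
carrier = [12, 7, 33, 8, 5, 90, 41, 2]
payload = [1, 0, 1, 1, 0, 0, 1, 0]

marked = embed_bits(carrier, payload)
recovered = extract_bits(marked, len(payload))
```

Each carrier value changes by at most one, which is in line with the claim that working in the lowest bit domain keeps the operation cheap and visually unobtrusive.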
Each valid sub node applies the inverse of the main node's
operations to the encrypted keys and to the scrambled and
watermarked downstream, and thereby obtains the decrypted
downstream. The encryption and decryption algorithm for the
upstream is analogous to that of the downstream, with $GPK$
and $K_{S_i}^{-1}$, $i = 1, \ldots, N$, replaced by $K_M$ and
$K_M^{-1}$, respectively.
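The XOR scrambling and its inverse can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the larger part of the stream is a byte string and that the control word is simply repeated over it; the function name `xor_scramble` and the sample control word value are hypothetical.

```python
from itertools import cycle


def xor_scramble(payload: bytes, control_word: bytes) -> bytes:
    """XOR every payload byte with the repeating control word."""
    if not control_word:
        raise ValueError("control word must not be empty")
    return bytes(b ^ k for b, k in zip(payload, cycle(control_word)))


# A 16-bit control word; the value 0xBEEF is chosen only for illustration.
cw = (0xBEEF).to_bytes(2, "big")
stream = b"larger part of the compressed downstream"

scrambled = xor_scramble(stream, cw)    # main node side
restored = xor_scramble(scrambled, cw)  # sub node side: the same operation
```

Because XOR is its own inverse, a sub node that has recovered the control word descrambles with exactly the same routine that the main node used for scrambling.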
Based on the above, compared with existing video
conference systems, this downstream encryption algorithm
reduces the bandwidth by a factor of approximately N.
In addition, it offers good security owing to its hierarchical
encryption structure, whose security is guaranteed by
appropriately setting the size and update period of the related
keys such as $CW$ and $SK$. Although the message carrying the
encrypted session key $SK$ is somewhat large, it does not
seriously affect performance because the session key is updated
infrequently.
III. RSA-BASED GROUP PUBLIC KEY ALGORITHM
In Section II, it was stated that each node is allocated a
key pair $(K_i, K_i^{-1})$, $i = 1, \ldots, N$, where $K_i$ and
$K_i^{-1}$ are the public and private keys respectively and $N$
is the number of nodes or key pairs. In addition, one can
generate a group public key $GPK$ from the set of all public
keys $K_i$, $i = 1, \ldots, N$, so that a message encrypted
with $GPK$ can be decrypted by any private key $K_i^{-1}$; that
is, $GPK$ is a common public key with respect to every private
key. The reason for this property follows from the way the key
pairs are generated and from the proof given below. The key
pairs are generated as follows. First, generate $N$ pairs of
prime numbers $(p_i, q_i)$. Next, compute $N_i = p_i q_i$ and
$\varphi(N_i) = (p_i - 1)(q_i - 1)$. Then find a positive
integer $e$ that is coprime to every $\varphi(N_i)$ (i.e.,
$\gcd(e, \varphi(N_i)) = 1$). Finally, compute $d_i$ satisfying
$e d_i \equiv 1 \pmod{\varphi(N_i)}$. This gives the public key
$K_i = (e, N_i)$ and the private key $K_i^{-1} = (d_i, N_i)$.
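The key generation steps above can be sketched with toy numbers. This is an illustrative sketch only: the primes are far too small for real use, and the function name `make_key_pairs`, the prime pairs, and the choice e = 17 are assumptions made for the example.

```python
from math import gcd


def make_key_pairs(prime_pairs, e):
    """Build ((e, N_i), (d_i, N_i)) for every prime pair, sharing one exponent e."""
    pairs = []
    for p, q in prime_pairs:
        n = p * q
        phi = (p - 1) * (q - 1)
        if gcd(e, phi) != 1:
            raise ValueError(f"e={e} is not coprime to phi({n})={phi}")
        d = pow(e, -1, phi)  # modular inverse, available in Python 3.8+
        pairs.append(((e, n), (d, n)))
    return pairs


# Toy primes for illustration; real RSA moduli are hundreds of digits long.
toy_primes = [(61, 53), (47, 59), (67, 71)]
key_pairs = make_key_pairs(toy_primes, e=17)
```

All nodes share the same public exponent e while their moduli differ, which is what later allows a single group public key to be formed from the moduli.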
Next, we introduce how to generate $GPK$ and its
property. At the main node, one can derive a set $S$ composed
of the moduli $N_i$ extracted from the public keys of the valid
nodes. The generation of $GPK$ considers only the valid nodes
that have passed authentication:
$$GPK = \left\langle e, \prod_{N_j \in S} N_j \right\rangle.$$
Let $m$ denote the message to be encrypted with $GPK$. The
encrypted message is
$$\mathrm{Encrypt}^{RSA}_{GPK}(m) = m^e \bmod \prod_{N_t \in S} N_t.$$
Using any private key $K_j^{-1} = (d_j, N_j)$, the encrypted
message can be decrypted successfully, i.e.,
$$\mathrm{Decrypt}^{RSA}_{K_j^{-1}}\left(\mathrm{Encrypt}^{RSA}_{GPK}(m)\right) = \left(m^e \bmod \prod_{N_t \in S} N_t\right)^{d_j} \bmod N_j = m^{e d_j} \bmod N_j = m^{x\varphi(N_j)+1} \bmod N_j = m,$$
where $x$ is the integer for which $e d_j = x\varphi(N_j) + 1$
and $m < N_j$.
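Its expanded formulation is
$$
\begin{aligned}
\left(m^e \bmod \prod_{N_t \in S} N_t\right)^{d_j} \bmod N_j
&= \left(\left(m^e \bmod \Big(N_j \prod_{N_t \in S,\, t \neq j} N_t\Big)\right) \bmod N_j\right)^{d_j} \bmod N_j \\
&= \left(m^e \bmod N_j\right)^{d_j} \bmod N_j \\
&= m^{e d_j} \bmod N_j \\
&= m^{x\varphi(N_j)+1} \bmod N_j \\
&= m,
\end{aligned}
$$
where
$$
\left(m^e \bmod \Big(N_j \prod_{N_t \in S,\, t \neq j} N_t\Big)\right) \bmod N_j
= \left(m^e \bmod N_j\right) \bmod \Big(N_j \prod_{N_t \in S,\, t \neq j} N_t\Big).
$$
Both sides equal $m^e \bmod N_j$, because $N_j$ divides the
product of all moduli in $S$.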
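The derivation can be checked numerically with toy parameters. The sketch below is an illustration under stated assumptions, not the authors' code: the primes, the exponent e = 17, the message value, and the names `gpk_encrypt` and `node_decrypt` are all hypothetical, and the message is chosen smaller than every modulus so that each node recovers it exactly.

```python
from math import prod

E = 17  # shared public exponent (toy value)
PRIME_PAIRS = [(61, 53), (47, 59), (67, 71)]  # toy primes, one pair per sub node

# Per-node moduli N_j and private exponents d_j with e * d_j = 1 mod phi(N_j).
moduli = [p * q for p, q in PRIME_PAIRS]
private_exponents = [pow(E, -1, (p - 1) * (q - 1)) for p, q in PRIME_PAIRS]


def gpk_encrypt(m: int, e: int, all_moduli) -> int:
    """Encrypt m under GPK = <e, product of all moduli>."""
    return pow(m, e, prod(all_moduli))


def node_decrypt(c: int, d_j: int, n_j: int) -> int:
    """Decrypt with one node's private key (d_j, N_j)."""
    return pow(c, d_j, n_j)


message = 1234  # must be smaller than every N_j
ciphertext = gpk_encrypt(message, E, moduli)
recovered = [
    node_decrypt(ciphertext, d_j, n_j)
    for d_j, n_j in zip(private_exponents, moduli)
]
```

Every entry of `recovered` equals `message`, so one broadcast ciphertext is sufficient for all three toy nodes. The ciphertext is reduced modulo the product of all moduli, which is why its size grows with the number of nodes.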
From the above derivation, it can be seen that $GPK$ is
indeed a common public key for all valid sub nodes. With the
help of $GPK$, the main node can prepare just one copy of the
encrypted message for all sub nodes, thus saving a large
amount of bandwidth. Meanwhile, identity authentication can be
conducted based on PKI, which is the widely used security basis
of state-of-the-art digital rights management systems and
electronic commerce.
IV. EXPERIMENTAL RESULTS AND ANALYSIS
For a video conference, the main concerns are usually
complexity and security. To verify these two properties of
the proposed scheme, we chose three typical video
sequences (352×288) for testing: akiyo, flower, and football,
which have low, medium, and high motion complexity
respectively and contain luminance and chrominance data.
Our program was run on a PC with an Intel(R) Core(TM)2
2.26 GHz CPU under Windows XP Professional.
A. Computational complexity
We evaluate the complexity of the algorithm by
analyzing the time cost at the main node and at a sub node. At
the main node, the session key and $GPK$ are updated only when
the session membership changes. This rarely happens during a
video conference, so the running time is spent mainly on
encrypting and embedding the control word and on scrambling
the video stream, abbreviated as 'encryption+embed'. At a sub
node, the main operations are extracting and decrypting the
control word, descrambling the video stream, and playing the
video, abbreviated as 'decryption+extract+play'.

Fig. 2 shows the average time per frame consumed by the
operations 'encryption+embed', 'decryption+extract+play', and
'play' over the video sequences akiyo, flower, and football,
respectively. The time for 'encryption+embed' is far smaller
than that of 'decryption+extract+play' or 'play', and the time
difference between 'decryption+extract+play' and 'play' is very
small. This means that the encryption, decryption, and
watermarking operations are quite efficient, and thus the
algorithm performs well in terms of time cost and can meet the
real-time demand.

Figure 2. Comparison of the time consumed by three
processes (i.e., encryption+watermark embedding,
decryption+watermark extraction+play, and play) for the videos
akiyo, flower, and football
B. Analysis and Evaluation of Security
The security of the proposed algorithm can be divided
into two parts: the security of the encryption algorithms and
all related keys, and the security of the visual information
that is scrambled and watermarked. As for the first part,
because encryption algorithms such as AES and RSA are
well-established standard ciphers, their security is not in
doubt. The security of all related keys can be guaranteed by
setting proper parameters such as key size and key lifecycle.
The rules for setting these parameters are given in Table I.
From Table I, it can be observed that $GPK$ and $SK$ are updated
when the session membership changes. This is because the RSA
cipher is quite strong, and breaking these two keys during a
video conference session is considered computationally
infeasible when the length of $GPK$ is 1024 bits.
TABLE I. Rules for setting key sizes and update periods
Key    Key lifecycle                   Key size
GPK    Until the membership changes    1024 bits
SK     Until the membership changes    128 bits
CW     2-3 seconds                     16 bits
For the second part, a portion of the video stream is
selected to carry the encrypted control word as a watermark,
and the watermarking causes almost no visual artifacts; in
other words, the visual content associated with this portion of
the stream is not encrypted at all. In practice, however, this
negative impact can be ignored: even if 16 codewords are left
unprocessed, the content remains protected as long as all other
codewords are well scrambled. Furthermore, despite its
simplicity, XOR scrambling still gives a good encryption result,
since an XOR operation on a codeword in the bit domain can
cause a large variation in the spatial domain. In Fig. 3, the
upper row shows original frames of the three video sequences
and the bottom row shows the results of decryption with a wrong
key. Clearly, it is hard to obtain any valuable visual
information without the correct keys.

Figure 3. Upper row: original frames of (a) akiyo, (b) flower,
and (c) football; bottom row: (d) video frames decrypted with a
wrong key
V. CONCLUSION
Unlike existing video encryption algorithms, the algorithm
in this paper achieves good real-time and security properties
by appropriately combining the efficient XOR operation with a
hierarchical encryption strategy. Its main contributions are as
follows: (1) it proposes a novel RSA-based group public key
technique, which provides a security guarantee that only valid
nodes can decrypt a message encrypted with $GPK$; (2) it adopts
a broadcast encryption method that reduces the bandwidth by a
factor of approximately N in the case of N sub nodes; (3) it
integrates the watermarking technique with XOR-based visual
encryption to obtain real-time performance without loss of
security.
REFERENCES
[1] C. Xiao, S. Ma, J. Niu, L. Wang, B. Shan, and T. Chen, "A novel
security scheme for video conference system with wireless terminals,"
Beijing, China, 2008.
[2] F. Liu and H. Koenig, "A novel encryption algorithm for high
resolution video," New York, NY, USA, 2005.
[3] H. Guang-Ming, Y. Chun, W. Yi, and Z. Yu-Zhuo, "A quality-
controllable encryption for H.264/AVC video coding," Berlin,
Germany, 2006.
[4] Q. Zhang, J.-M. Wu, and H.-X. Zhao, "Efficiency video encryption
scheme based on H.264 coding standard and permutation code
algorithm," Los Angeles, CA, United States, 2009.
[5] Y. Wang, M. Cai, and F. Tang, "Design of a new selective video
encryption scheme based on H.264," Harbin, Heilongjiang, China,
2007.
[6] C. Mian, J. Jia, and Y. Lei, "An H.264 video encryption algorithm
based on entropy coding," Kaohsiung, Taiwan, 2007.
[7] C. Yang, J. Liu, J. Tian, and Y. Zhang, "Authentication scheme and
simplified CAS in mobile multimedia broadcast," Nanchang, China,
2009.
[8] G. C. Langelaar, R. L. Lagendijk, and J. Biemond, "Real-time labeling
of MPEG-2 compressed video," Journal of Visual Communication and
Image Representation, vol. 9, pp. 256-270, 1998.