Abstract—We design distributed spectrum sensing and access strategies for opportunistic spectrum access (OSA) under an energy constraint on secondary users. Both the continuous and the bursty traffic models are considered for different applications of the secondary network. In each slot, a secondary user sequentially decides whether to sense, where in the spectrum to sense, and whether to access. By casting this sequential decision-making problem in the framework of partially observable Markov decision processes, we obtain stationary optimal spectrum sensing and access policies that maximize the throughput of the secondary user during its battery lifetime. We also establish threshold structures of the optimal policies and study the fundamental tradeoffs involved in the energy-constrained OSA design. Numerical results are provided to investigate the impact of the secondary user’s residual energy on the optimal spectrum sensing and access decisions.

Index Terms—Cognitive radio, opportunistic spectrum access, partially observable Markov decision process (POMDP), spectrum sensing.

Manuscript received October 31, 2007; revised August 21, 2008. First published October 31, 2008; current version published January 30, 2009. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Mats Bengtsson. This work was supported in part by the Army Research Laboratory CTA on Communication and Networks under Grant DAAD19-01-2-0011 and by the National Science Foundation under Grants CNS-0627090 and ECS-0622200. This work was presented in part at the IEEE MILCOM Conference, Washington DC, October–November, 2006, and at the IEEE Global Telecommunications (GLOBECOM) Conference, Washington DC, November 2007.
Y. Chen was with the Department of Electrical and Computer Engineering, University of California, Davis, CA 95616 USA. She is now with Cisco Systems, Inc., San Jose, CA USA (e-mail: yxchen@ece.ucdavis.edu).
Q. Zhao is with the Department of Electrical and Computer Engineering, University of California, Davis, CA 95616 USA (e-mail: qzhao@ece.ucdavis.edu).
A. Swami is with the Army Research Laboratory, Adelphi, MD 20783 USA (e-mail: aswami@arl.army.mil).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TSP.2008.2007928
1053-587X/$25.00 © 2008 IEEE

I. INTRODUCTION

OPPORTUNISTIC SPECTRUM ACCESS (OSA), also referred to as spectrum overlay or spectrum pooling [1], is one of the approaches envisioned for dynamic spectrum management. It has received increasing attention due to its potential for improving spectrum efficiency and its compatibility with the current spectrum management policy and legacy wireless systems. The basic idea of OSA is to allow secondary users to search for and exploit local and instantaneous spectrum opportunities with limited interference to primary users. The physical platform of OSA and other dynamic spectrum access strategies is cognitive radio, which is capable of agile sensing and communication through adaptive learning [2]. As such, cognitive radio is often used as a synonym for different dynamic spectrum access strategies (see [3] for a survey of different approaches envisioned for dynamic spectrum access).

In this paper, we focus on the design of distributed medium access control (MAC) protocols for OSA under an energy constraint on secondary users. We consider secondary users, each with a finite amount of initial energy, exploiting temporal spectrum opportunities in a slotted primary system. In each slot, a secondary user either turns off its transceiver to save energy or chooses a channel in the spectrum to sense and possibly access, resulting in different levels of reduction in its battery energy. A MAC protocol governing such a sequential decision-making process thus consists of two components: i) a sensing strategy that specifies whether to sense and where in the spectrum to sense and ii) an access strategy that determines whether to access based on the sensing outcomes regarding the occupancy state (idle or occupied by primary users) and the fading condition of the channel. The design objective is to maximize the throughput of a secondary user during its battery lifetime. We propose optimal MAC protocols for both the continuous and the bursty traffic models. For brevity, we adopt the continuous traffic model, where the secondary user always has packets to transmit, unless otherwise specified.

A. Energy-Constrained OSA Design

While optimal distributed MAC protocols for OSA have been proposed in [4], [23], [5], [24], the impact of the energy constraint on optimal sensing and access protocols has not been studied. The incorporation of the energy constraint significantly complicates the problem. First consider the sensing strategy. Without the energy constraint, the secondary user should always sense, and its channel selection should exploit the spectrum occupancy statistics to achieve the best tradeoff between gaining immediate access and gaining statistical information about the spectrum occupancy [4], [23], [5], [24]. With the energy constraint, however, the secondary user, even with packets to transmit, may choose to sleep to conserve energy. Moreover, channel selection should also exploit channel fading statistics since a channel in deep fading requires more energy for transmission. The design tradeoff involved in sensing decisions thus lies among three often conflicting objectives: gaining immediate access, gaining spectrum occupancy information, and conserving energy.

It has been shown in [5] and [24] that without the energy constraint, the optimal access strategy is to access if and only if the channel is sensed as idle, provided that the operating
784 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 57, NO. 2, FEBRUARY 2009
characteristics (false alarm rate versus miss detection rate) of the spectrum sensor are chosen optimally according to the interference constraint defined by the probability of collision. With the energy constraint, channel fading statistics play an important role in access decision-making. For example, when the sensed channel is idle but has poor fading condition, should the secondary user with packets to send access this channel to gain immediate reward or wait for better channel realizations to save transmission energy but waste the energy already used in sensing? Clearly, such a decision depends on the secondary user’s residual energy level and its energy consumption characteristics, as well as the channel fading statistics.

Bursty Traffic: Bursty traffic of the secondary user further complicates the design. In this case, the design tradeoffs vary with the secondary user’s buffer state. Specifically, when the buffer is empty, the secondary user does not need to gain immediate access in the current slot. Hence, sensing decisions are made for the sole purpose of gaining statistical information about spectrum occupancy. The question raised here is whether the secondary user should continue tracking the dynamics of the spectrum for future use or turn off its transceiver to save energy. Intuitively, the sensing strategy employed by the secondary user when its buffer is empty should be fundamentally different from the one used when it has packets to transmit.

B. Main Results

Within the framework of partially observable Markov decision process (POMDP), we tackle the optimal MAC design for energy-constrained OSA. By modeling the primary users’ traffic as a Markov chain, we formulate the problem of dynamically choosing whether to sense, where in the spectrum to sense, and whether to access for maximum throughput as a POMDP with a finite but random time horizon. This formulation allows us to integrate the dynamics of spectrum occupancy and channel fading into the MAC design. The optimal MAC design is given by the stationary optimal policy of this POMDP, which can be solved using existing POMDP algorithms.

To gain insights into the energy-constrained OSA problem, we search for structures of the optimal sensing and access policies. We show that in the single-channel case, the optimal sensing decision (whether to sleep or sense) has a threshold structure: the secondary user should sense the channel if and only if the conditional probability that the channel is idle in the current slot (conditioned on the entire sensing and observation history) is above a certain threshold (referred to as the sensing threshold). We also show that the optimal access strategy is a threshold policy in terms of the channel fading condition. That is, the secondary user should access the channel if and only if the sensing outcome indicates that the channel is idle and its fading condition is better than a certain threshold (referred to as the access threshold). These structural results not only reveal the fundamental design tradeoffs but also reduce the computational complexity in searching for the optimal policies.

These structural results are complemented with numerical examples. We study different factors that affect the optimal sensing and access decisions. We find that the impact of the secondary user’s residual energy on the optimal decisions diminishes as the residual energy increases. This observation indicates that energy conservation only plays a critical role in sensing and access decisions when the battery of the secondary user is close to depletion. We also find that when the sensing energy consumption is large, the secondary user should be more conservative in sensing, but more aggressive in access. Specifically, the secondary user should increase the sensing threshold and lower the access threshold.

Bursty Traffic: We also extend our analysis to the case where the secondary user has bursty traffic. As explained in Section I-A, the optimal sensing decisions in this case should incorporate the secondary user’s buffer state. We, however, note that due to random packet arrivals, the receiver does not know the secondary user’s buffer state. This impedes optimal distributed design since in the absence of additional control channels, transceiver synchronization requires the secondary user and its receiver to have the same information for decision-making [4], [5], [23], [24]. We overcome this obstacle by treating the buffer state as a partially observable parameter. The secondary user and its receiver can thus make sensing decisions based on the conditional probability mass function (PMF) of the buffer state. We show that the secondary user with an empty buffer can benefit from sensing a channel if the time-correlation of the spectrum occupancy state is sufficiently large.

C. Related Work

Cognitive MAC design for OSA has been addressed under different network architectures (see [4]–[7], [23], [24], and references therein). In [6], the authors address the implementation of a MAC protocol for OSA in a GSM primary network. A dedicated control channel is required for the secondary transmitter and receiver to exchange information about channel selection. In [4] and [23], optimal distributed MAC protocols are proposed for OSA in slotted primary systems. The proposed protocols ensure synchronous hopping of the secondary transmitter and receiver in the spectrum without requiring central controllers or control channels. More recently, sensing errors have been taken into account in the MAC design [4], [5], [23], [24]. Significantly, a separation principle is established in [5] and [24] for the optimal joint design of the physical layer spectrum sensor and the MAC layer sensing and access strategies. In [7], access strategies for a slotted secondary user searching for opportunities in an unslotted primary network are considered, where a round-robin single-channel sensing scheme is used and sensing is considered to be perfect. The joint design of the spectrum sensor and sensing and access strategies for OSA in unslotted primary systems has been addressed in [8]. To the best of our knowledge, energy-constrained OSA design has not been considered in the literature.

Statistical models for spectrum usage of primary systems are important for OSA protocol design. Existing work along this line can be found in [9]–[11]. Measurements obtained from spectrum monitoring test-beds demonstrate the Markovian transition between busy and idle channel states in wireless LANs [9], a model similar to that used in this paper. With these active experimental research activities, we can perhaps foresee a public database of statistical models of spectrum usage in different bands and at different times and locations. Secondary
CHEN et al.: DISTRIBUTED SPECTRUM SENSING AND ACCESS IN COGNITIVE RADIO NETWORKS 785
users can then download the required model for the design of spectrum sensing and access strategies.

An overview of challenges and recent developments in OSA can be found in [12].

D. Organization and Notation

The rest of this paper is organized as follows. After describing the primary and the secondary network models in Section II, we formulate the optimal MAC design for energy-constrained OSA as a POMDP over a random horizon in Section III. In Section IV, we derive recursive formulas for solving this POMDP and establish structures of the solution. We also address the distributed implementation of the obtained optimal design. In Section V, we further establish the threshold structures of the optimal sensing and access policies and study different factors that affect the optimal decisions. Finally, Section VI focuses on the energy-constrained OSA design for secondary users with bursty traffic, and Section VII concludes the paper.

Random variables and their realizations are denoted by capital and small letters, respectively. Vectors are denoted by boldfaced letters. For two equal-length vectors and , we say if for all . Let denote the indicator function: if event occurs and zero otherwise.

II. NETWORK MODEL

A. Primary Network Model

Consider a spectrum consisting of channels, each with potentially different bandwidth . These channels are licensed to a primary network employing a synchronous slotted communication protocol. The primary traffic is modeled as a time-homogeneous discrete Markov process. Specifically, let denote the occupancy of channel by the primary network in slot . The spectrum occupancy state (SOS), denoted by , forms a Markov chain with state space . The transition probabilities are denoted by

(1)

which are determined by the statistics of the primary traffic and assumed known to secondary users.

B. Secondary Network Model

Consider an overlay ad hoc secondary network whose users independently and selfishly search for, according to a MAC protocol, instantaneous spectrum opportunities in these channels. We assume that each secondary user can only sense and access one channel in a slot. At the beginning of each slot, a secondary user first determines its operation mode: sleeping or sensing. If the former, the user turns off its transceiver until the next slot. If the latter, the user chooses one channel to sense and then decides whether to access this channel based on the sensing outcome. We assume that sensing errors are negligible.

The optimal sensing and access decisions are made based on the user’s statistical knowledge of the SOS and its own residual energy. Our goal is to design the optimal sensing and access strategies that maximize the throughput of an individual secondary user during its battery lifetime.

Channel Fading Model: We adopt a block channel fading model.¹ Specifically, we assume that the channel gain between the secondary user and its receiver is a random variable independently and identically distributed (i.i.d.) across slots but not necessarily i.i.d. across channels.

Energy Model: The secondary user is powered by a battery with finite initial energy . Energy consumption in a slot may include the following: i) the energy consumed in the sleeping mode; ii) the energy consumed in sensing the channel occupancy and estimating the channel fading condition²; iii) the energy consumed in successfully transmitting over channel in a slot. In general, we have . For ease of presentation, we assume that the sleeping energy and the sensing energy are constants, invariant to channel fading.

Due to hardware and power limitations, the secondary user only has a finite number of transmission power levels. We assume that the user transmits at a fixed rate. Hence, to ensure successful transmission, the user has to adjust its transmission power according to the current channel fading condition. The transmission energy consumption is thus a random variable depending on the current channel fading condition. In general, the better the channel, the lower the transmission power level. Let denote the energy consumed in transmitting at the th power level with . The PMF of is determined by channel fading statistics and is denoted by

(2)

More specifically, is the probability that the fading condition in channel falls into an interval that requires a minimum energy of for successful transmission.

Let denote the secondary user’s residual energy at the beginning of slot . Due to random transmission energy consumption, is also a random variable taking values from a finite set

(3)

where are, respectively, the numbers of slots when the secondary user switches to the sleeping mode, senses a channel, and transmits at the th power level. Since the secondary user is required to sense a channel before accessing it in order to avoid collisions with primary users, we have .

Traffic Model: In Sections III–V, we adopt a continuous traffic model, i.e., the secondary user always has packets to transmit. The case where secondary users have bursty traffic is considered in Section VI.

¹ Our analysis can be readily extended to a more general Markovian fading channel model. See details in Section III-B.
² An interesting variation is to separate the energy for sensing channel occupancy from that for estimating channel fading conditions; the latter would be consumed only if the channel is sensed to be idle. This variation is easily incorporated into the framework developed here.
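As an illustration of the primary traffic model of Section II-A, the following minimal Python sketch simulates the occupancy of a single channel as a two-state Markov chain. It is a hypothetical example, not the authors' implementation: the encoding (1 for idle, 0 for occupied), the names `alpha` (probability of moving from occupied to idle) and `beta` (probability of staying idle), and all numeric values are assumptions introduced here.

```python
import random


def simulate_occupancy(alpha, beta, slots, seed=0, initial_idle=True):
    """Simulate `slots` >= 1 slots of one channel (assumed encoding: 1 idle, 0 occupied).

    alpha: assumed Pr(idle in next slot | occupied now)
    beta:  assumed Pr(idle in next slot | idle now)
    """
    rng = random.Random(seed)  # seeded so the trace is reproducible
    state = 1 if initial_idle else 0
    trace = [state]
    for _ in range(slots - 1):
        p_idle = beta if state == 1 else alpha
        state = 1 if rng.random() < p_idle else 0
        trace.append(state)
    return trace


def stationary_idle_probability(alpha, beta):
    """Solve pi = pi * beta + (1 - pi) * alpha for the long-run idle probability.

    Undefined for the degenerate chain alpha = 0, beta = 1.
    """
    return alpha / (1.0 + alpha - beta)


if __name__ == "__main__":
    trace = simulate_occupancy(alpha=0.2, beta=0.8, slots=20, seed=1)
    print(trace)
    print(stationary_idle_probability(0.2, 0.8))
```

A secondary user that knows `alpha` and `beta` can use the same two numbers to track its belief about the channel without observing it in every slot, which is the role the transition probabilities play in the formulation that follows.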
III. A DECISION-THEORETIC FRAMEWORK

In this section, we formulate the optimal energy-constrained OSA design as a POMDP. This formulation allows us to incorporate the secondary user’s residual energy into sensing and access decisions at the MAC layer. We show that the optimal energy-constrained OSA strategy is given by the optimal policies of this POMDP.

A. Sequential Decision-Making

Sensing Decision: At the beginning of slot , based on its statistical knowledge of the SOS and its current residual energy, the secondary user first determines its operating mode in this slot: sleeping or sensing. If the sleeping mode is chosen, no more decisions need to be made in this slot. Otherwise, the user chooses a channel to sense. Let 0 represent the sleeping mode. We define sensing action as

where is the energy required for a successful transmission under the current sensing outcome . Hence, access decisions for different sensing outcomes should be chosen from the composite set :

(9)

At the end of the slot, the user updates its statistical knowledge of the SOS by incorporating its decisions and observations in this slot (see Section III-B for details). Depending on its sensing and access decisions, the user’s residual energy is reduced from to

(10)
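The per-slot energy reduction described above can be sketched in Python as follows. This is a hypothetical reading of the energy model of Section II-B, not the paper's equation: it assumes that a sleeping slot costs the sleeping energy, a sensing slot costs the sensing energy, and an access adds the transmission energy of the required power level on top of the sensing energy (since the user must sense before accessing). The function name, the action labels, and the parameter names are all assumptions.

```python
def energy_after_slot(energy, action, e_sleep, e_sense, e_tx=0.0):
    """Return the residual energy at the start of the next slot.

    action: "sleep", "sense" (sense without access), or
            "sense_and_access" (sense, then transmit with energy e_tx).
    All names are illustrative; they do not come from the paper.
    """
    if action == "sleep":
        cost = e_sleep
    elif action == "sense":
        cost = e_sense
    elif action == "sense_and_access":
        cost = e_sense + e_tx  # access is only possible after sensing
    else:
        raise ValueError(f"unknown action: {action!r}")
    if cost > energy:
        raise ValueError("not enough residual energy for this action")
    return energy - cost


if __name__ == "__main__":
    energy = 10.0
    for action, e_tx in [("sleep", 0.0), ("sense", 0.0), ("sense_and_access", 3.0)]:
        energy = energy_after_slot(energy, action, e_sleep=1.0, e_sense=2.0, e_tx=e_tx)
        print(action, energy)
```

Because every action costs a positive amount of energy under these assumptions, the residual energy strictly decreases from slot to slot, which is why the decision process ends after a finite but random number of slots.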
where

(12)

When the user chooses a channel to sense, the belief vector can be updated using Bayes rule based on the sensing outcome :

where

(13)

The belief vector together with the residual energy is a sufficient statistic⁴ for the system state . That is , referred to as the information state, is sufficient for making optimal sensing and access decisions. A sensing policy is thus given by a sequence of functions , where maps an information state to a sensing decision in slot . Given sensing policy , an access policy is given by a sequence of functions , where maps an information state satisfying (i.e., the user operates in the sensing mode) to an admissible access decision . If functions are identical for all is a stationary policy.

Reward and Objective: A natural definition of the reward is the number of bits delivered by the user in a slot. The immediate reward can thus be written as

(14)

the structure of the optimal solution and describe an efficient algorithm for obtaining the optimal decisions. At the end of this section, we discuss the distributed implementation of the optimal MAC design.

A. Stationary Optimal Policy

Stationary policies are usually preferred due to their reduced memory requirements and low complexity in implementation. The fact that the user consumes nonzero energy in each slot and that its battery has finite initial energy implies that the system always reaches a terminating state (i.e., ) in a finite but random time. The inevitable termination makes the energy-constrained OSA design an example of a stochastic shortest path problem, which always has a stationary optimal policy [13].

Proposition 1: For the energy-constrained OSA design given by (15), there exist stationary optimal sensing and access policies.

Proof: See Appendix A.

B. Value Function

Proposition 1 allows us to focus on stationary policies without losing optimality. We can thus omit the time index for notational convenience. The next step to solving (15) is to express the objective explicitly as a function of the information state and the sensing and access actions .

Let denote the action-value function or the Q-function, which represents the maximum expected total reward that can be obtained by taking sensing action in the current slot when the information state is . The value function, denoted by , is the maximum expected total reward that can be accumulated starting from information state . The value function and the corresponding optimal sensing action are given by
(20)

C. Solution Structure

We note that obtaining the optimal sensing and access decisions hinges on the computation of the action-value and the value functions. We thus seek structures of the value function that lead to efficient computation of the optimal decisions.

1) Reduced Dimension: One of the difficulties in calculating the value function is that the dimension of the belief vector grows exponentially with the number of channels. It has been shown in [4] and [23] that for independently evolving channels, an alternative sufficient statistic for the SOS is given by the marginal distribution of the SOS, where denotes the probability (conditioned on the entire decision and observation history ) that channel is idle at the beginning of slot :

(21)

Let and denote the transition probabilities of channel , where and . We can then obtain the belief updates similar to (12) and (13). Specifically, when the user operates in the sleeping mode, we have

where

(22)

When the user chooses channel , then the belief vector is updated according to the sensing outcome :

(25b)
(25c)

Compared with the original value function developed in Section IV-B, the above value function not only has simpler belief updates but also avoids computation of the summation in (18).

2) Monotonicity: Monotonicity results for the value function are usually desired since they not only provide insights into the underlying problem but also serve as a stepping stone for establishing the structure of optimal policies (see [14] for an example). In Proposition 2, we study the monotonicity of the value function with respect to each of its parameters.

Proposition 2: Monotonicity of Value Function
P2.1) The value function is monotonically increasing with the residual energy , i.e., for .
P2.2) Assume that the SOS evolves independently across channels. If , then the value function given in (24) is monotonically increasing with the belief vector , i.e., for .

Proof: See Appendix B.

P2.1) is straightforward. P2.2) considers the case where the SOS evolves independently across channels. It provides a sufficient condition for the value function to be monotonically increasing with the belief vector defined in (21). Note that represents the case where the channel occupancy state is positively correlated across time. In this case, a larger current belief vector indicates a larger probability that channels will be idle in all the future slots, leading to a higher chance of getting rewards. When , the channel occupancy state is negatively correlated across time. The value function is not necessarily monotonic. This is because when , a larger belief vector indicates a smaller probability that channels are idle in the next slot. The probabilities of channels being idle oscillate over time.
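The contrast between positively and negatively correlated occupancy can be illustrated by propagating the idle belief of one unobserved channel over several slots. The Python sketch below is a hypothetical illustration under stated assumptions: a two-state channel in which `alpha` is the probability of moving from occupied to idle and `beta` is the probability of staying idle. These names and the numeric values are introduced here and are not taken from the paper.

```python
def propagate_idle_belief(omega, alpha, beta, slots):
    """Idle probability of an unobserved two-state Markov channel over `slots` slots.

    omega: current probability that the channel is idle
    alpha: assumed Pr(idle next | occupied now)
    beta:  assumed Pr(idle next | idle now)
    """
    history = [omega]
    for _ in range(slots):
        # Total probability over the two possible current states.
        omega = omega * beta + (1.0 - omega) * alpha
        history.append(omega)
    return history


if __name__ == "__main__":
    # Positively correlated occupancy (beta > alpha): smooth decay.
    print(propagate_idle_belief(0.9, alpha=0.2, beta=0.8, slots=6))
    # Negatively correlated occupancy (beta < alpha): the belief oscillates.
    print(propagate_idle_belief(0.9, alpha=0.8, beta=0.2, slots=6))
```

Under these assumptions the deviation of the belief from the stationary idle probability alpha / (1 + alpha - beta) is multiplied by (beta - alpha) in every slot. It therefore keeps its sign when beta > alpha and flips sign in each slot when beta < alpha, which matches the oscillation described above.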
secondary users.⁵ If the channel remains idle when its backoff time expires, it transmits a short request-to-send (RTS) message at full power⁶ to the receiver. Upon receiving the RTS, the receiver estimates the channel fading condition using the RTS, and then replies with a clear-to-send (CTS) message which contains the estimated channel fading condition. After a successful exchange of RTS–CTS, the transmitter can adjust its transmission power according to the channel fading condition and communicate with the receiver over this channel.

2) Synchronous Hopping: Suppose that the transmitter and the receiver have tuned to the same channel after the initial handshake (one scheme for initial handshake can be found in [4] and [23]). To ensure synchronous hopping in the spectrum afterwards without extra control message exchange, the receiver must be aware of the transmitter’s sensing decisions at the beginning of each slot. For this purpose, the transmitter and the receiver must maintain the same information state in each slot.

We point out that when both the transmitter and the receiver can observe the true state of the sensed channel, they will have the same update of the belief vector and the residual energy, thus reaching the same information state. When the transmitter and the receiver are affected by different sets of primary users or when sensing errors cannot be ignored, the exchange of RTS–CTS for fading state acquisition can be exploited to ensure synchronous hopping between the transmitter and the receiver. In this case, the common observation used for updating the information state is whether there is a successful exchange of RTS–CTS. A similar discussion on using common observations to ensure synchronous hopping can be found in [5] and [24].

V. THRESHOLD STRUCTURES OF OPTIMAL POLICIES

In this section, we study different factors that affect the optimal decisions obtained in Section IV-B. We focus on the operating decision (sleeping versus sensing) and the access decision, which are unique to the energy-constrained OSA problem.

Careful inspection of (17) and (19) reveals that the user’s decision affects the total expected reward in three ways: i) it may acquire an immediate reward ; ii) it transforms the current belief vector to or which summarizes the information of the SOS up to this slot; and iii) it causes a reduction in battery energy. Hence, to maximize the total expected reward during the battery lifetime, optimal decisions should be made to achieve a tradeoff among gaining instantaneous reward, gaining information for future use, and conserving energy.

A. To Sense or Not to Sense?

Without the energy constraint, the user should always operate in a sensing mode since sensing provides not only a chance to gain immediate access but also statistical information about the SOS. With the energy constraint, however, the user may choose to sleep since sensing costs energy. In this case, the optimal operating decision should strike a balance between gaining reward/information and conserving energy.

1) Analytical Study: We first provide a sufficient condition for the user to operate in the sensing mode.

Proposition 4: When the secondary user’s belief vector is given by the stationary distribution of the underlying SOS, its optimal operating mode is to sense, i.e., if .

Proof: See Appendix D.

The intuition behind Proposition 4 is explained as follows. Suppose that the secondary user chooses to operate in the sleeping mode when its belief vector is given by the stationary distribution of the SOS. Then, it will have the same belief vector but reduced residual energy at the beginning of the next slot. The energy consumed in sleeping is thus wasted without gaining any statistical information about the SOS. This suggests that the optimal operating mode is to sense.

Next, we consider the single-channel case, where the belief vector can be characterized by a scalar as defined in (21), and the transition probabilities of this channel can be denoted by and .

Proposition 5: Threshold Optimal Sensing Decision
Consider the single-channel case. For any given residual energy , the optimal sensing decision has a threshold structure:

(27)

where is the optimal sensing threshold.

Proof: See Appendix D.

Proposition 5 states that the user should sense when the belief of the channel is large and should sleep when the channel is less likely to be idle. This agrees with our intuition.

Corollary 1: Consider the single-channel case. When , the secondary user should always operate in the sensing mode, i.e., .

Proof: See Appendix D.

2) Numerical Study: As indicated by Proposition 5, the user’s residual energy affects the optimal operating decision through sensing threshold . To study the role of the residual energy in choosing operating modes, we plot the optimal sensing threshold in Fig. 2 for different sensing energy consumption and channel occupancy statistics .

We find that the optimal sensing threshold is highly dependent on the user’s residual energy when is small. As increases, the impact of the residual energy on the user’s operating decision diminishes. When is sufficiently large, the optimal sensing threshold becomes independent of the residual energy . This observation implies that when the battery is depleting, the user should focus more on how to fully utilize its residual energy.

⁵ Carrier sensing among secondary users starts only after the state of the chosen channel has been identified as idle. In other words, a minimum value on the backoff time of secondary users is imposed to ensure the priority of the primary users. This minimum value of the backoff time also allows a secondary user to distinguish transmissions of primary users from those of competing secondary users.
⁶ Note that the energy consumed in channel state estimation is absorbed into the sensing energy consumption e .
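To make the sleep-versus-sense tradeoff concrete, the following Python sketch solves a toy single-channel version of the problem by backward recursion over the residual energy. It is a simplified illustration under assumptions made here, not the recursion of the paper: sensing is perfect, energies are small integers, each successful transmission earns one unit of reward, the process stops when no action is affordable, and the belief after a sensing slot is reset to `BETA` (channel found idle) or `ALPHA` (channel found occupied). All parameter names and values are hypothetical.

```python
from functools import lru_cache

# Hypothetical toy parameters (not taken from the paper).
ALPHA = 0.2          # assumed Pr(idle next slot | occupied now)
BETA = 0.8           # assumed Pr(idle next slot | idle now)
E_SLEEP = 1          # energy spent in a sleeping slot
E_SENSE = 2          # energy spent in a sensing slot
TX_ENERGY = (1, 3)   # transmission energy per power level, best fading first
TX_PMF = (0.6, 0.4)  # probability that each power level is the one required
REWARD = 1.0         # reward credited per successful transmission


def predict(omega):
    """Belief for the next slot when the channel is not observed."""
    return round(omega * BETA + (1.0 - omega) * ALPHA, 9)


@lru_cache(maxsize=None)
def q_values(omega, energy):
    """Return (Q_sleep, Q_sense); an unaffordable action gets -inf."""
    q_sleep = q_sense = float("-inf")
    if energy >= E_SLEEP:
        q_sleep = value(predict(omega), energy - E_SLEEP)
    if energy >= E_SENSE:
        left = energy - E_SENSE
        idle_branch = 0.0
        for eps, prob in zip(TX_ENERGY, TX_PMF):
            best = value(BETA, left)  # channel idle, user declines to access
            if left >= eps:           # access only if the power level is affordable
                best = max(best, REWARD + value(BETA, left - eps))
            idle_branch += prob * best
        q_sense = omega * idle_branch + (1.0 - omega) * value(ALPHA, left)
    return q_sleep, q_sense


def value(omega, energy):
    """Maximum expected total reward; zero once no action is affordable."""
    best = max(q_values(omega, energy))
    return 0.0 if best == float("-inf") else best


def should_sense(omega, energy):
    """True if sensing is affordable and at least as good as sleeping."""
    q_sleep, q_sense = q_values(omega, energy)
    return q_sense > float("-inf") and q_sense >= q_sleep


if __name__ == "__main__":
    grid = [i / 20 for i in range(21)]
    for energy in (4, 8, 16):
        sensing = [w for w in grid if should_sense(w, energy)]
        print(energy, min(sensing) if sensing else None)
```

The loop at the end reports, for a few energy levels, the smallest belief on a coarse grid at which sensing is preferred in this toy model. It is meant only as a way to explore how such a threshold may vary with the residual energy; it does not reproduce Fig. 2.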
We also see that the optimal sensing threshold fluctuates more dramatically when the channel occupancy state is negatively correlated. That is, in this case, the residual energy plays a more important role in decision-making. As explained below Proposition 2, the probability that the channel is idle fluctuates when the occupancy state is negatively correlated. Hence, the user should focus more on its residual energy in this case to save energy for those slots when the channel is more likely to be idle.

Furthermore, we find that the optimal sensing threshold increases with the sensing energy consumption. That is, the user should be more conservative in making operating decisions when the sensing energy consumption is large. This observation agrees with our expectation because when the sensing energy consumption is large, the extra energy consumed in sensing can only be paid off when the chance of gaining immediate access is higher. On the other hand, when the extra energy consumed in sensing is small, the user can afford sensing more often to gain statistical information.

B. To Access or Not to Access?

Without the energy constraint, the user should always access an idle channel. With the energy constraint, however, the access decision should take into account both the energy consumption characteristics and the channel fading statistics. For example, when the channel is idle but has poor fading condition, should the user access this channel to gain immediate reward or wait for better channel realizations for less transmission energy? We find that such a decision is a monotonic function of the channel fading condition.

1) Analytical Study:

Proposition 6: Given that a channel is sensed, the optimal access decision is monotonically increasing with the channel fading condition. Specifically, for any given residual energy, the optimal access decision is given by

(28)

where the threshold appearing in (28) is the optimal access threshold. Furthermore, in the single-channel case, the threshold is independent of the belief vector.

Proof: See Appendix E.

Note that the better the channel fading condition, the lower the sensing outcome. Proposition 6 indicates that the user should access when the channel is in good condition and not access when the channel experiences deep fading. In particular, when the sensed channel is in the best fading condition, the user should always access.

Proposition 6 also helps us reduce the size of the access decision space from exponential to linear with respect to the number of power levels, leading to a more efficient search for the optimal access policy. Specifically, we can restrict our search for the optimal access decision to the following set:

(29)

where the size of the set is linear in the number of power levels.

2) Numerical Study: For simplicity, we consider the single-channel case in the numerical study. As shown in Proposition 6, the optimal access threshold in this case is independent of the belief vector. In Fig. 3, we plot the optimal access threshold as a function of the residual energy for different values of the sensing energy consumption.

Similar to the behavior of the optimal sensing threshold, the optimal access threshold may vary considerably when the residual energy is small, but a common steady value is reached when the residual energy is sufficiently large. That is, the impact of the residual energy on optimal access decisions diminishes.
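The reduction of the access decision space implied by Proposition 6 can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: fading levels are indexed from 1 (best condition) upward, a rule is a tuple of binary access decisions with one entry per level, a "never access" rule (threshold 0) is included for completeness, and all function names are hypothetical rather than taken from the paper.

```python
from itertools import product


def threshold_rule(threshold, num_levels):
    """Access (1) at fading levels 1..threshold, do not access (0) above it.

    Level 1 is assumed to be the best fading condition (illustrative indexing).
    """
    return tuple(1 if level <= threshold else 0
                 for level in range(1, num_levels + 1))


def threshold_rules(num_levels):
    """All threshold-type rules: their number grows linearly with num_levels."""
    return [threshold_rule(k, num_levels) for k in range(num_levels + 1)]


def arbitrary_rules(num_levels):
    """All binary access rules: their number grows exponentially with num_levels."""
    return list(product((0, 1), repeat=num_levels))


def is_monotone(rule):
    """True if the rule never accesses at a worse level while skipping a better one."""
    return all(a >= b for a, b in zip(rule, rule[1:]))


if __name__ == "__main__":
    num_levels = 4  # hypothetical number of fading (power) levels
    print(len(threshold_rules(num_levels)), len(arbitrary_rules(num_levels)))
```

With four levels, the sketch yields 5 threshold rules against 16 arbitrary ones, and the monotone rules coincide exactly with the threshold rules, which is why a search over thresholds suffices in this toy setting.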
792 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 57, NO. 2, FEBRUARY 2009
TABLE I
A SAMPLE PATH OF THE SOS EVOLUTION AND THE CORRESPONDING OPTIMAL SENSING AND ACCESS DECISIONS. (e = 0.1; e = 0.5; E ∈ {1, 2, 3, 4}; [p(1), p(2), p(3), p(4)] = [0.4, 0.3, 0.2, 0.1]; [p(1), p(2), p(3), p(4)] = [0.2, 0.3, 0.4, 0.1].)
We further see that the optimal access threshold increases with the sensing energy consumption. Hence, when the sensing energy consumption is small, the user should refrain from transmission under poor channel conditions and wait for a better channel realization. On the other hand, when the sensing energy consumption is large, the user should be more aggressive in making access decisions: it should grab an instantaneous opportunity even when the channel is in a deep fade. This is because when the sensing energy consumption is large, the sensing energy consumed in waiting for the best channel realization may exceed the extra energy consumed in combating the poor channel fading.

C. A Sample Path

To further illustrate the behavior of the optimal sensing and access policies, we study an example of the SOS evolution and the corresponding optimal decisions. In Table I, we consider independent channels with identical transition probabilities but different channel fading statistics. At the beginning of the first slot, the user operates in the sensing mode since its belief vector is given by the stationary distribution of the SOS. This agrees with Proposition 4. We find that to conserve energy, the user never chooses the channel (i.e., Channel 2) in deep fading even if there is a higher probability that this channel is idle. This demonstrates the important role of channel fading statistics in deciding whether to sense. We also see that the exploitation of channel occupancy dynamics allows the user to efficiently track spectrum opportunities. Specifically, when the channel is less likely to be idle, the user operates in the sleeping mode to save energy. It wakes up when the probability that the channel is idle is large.

VI. BURSTY TRAFFIC IN ENERGY-CONSTRAINED OSA

In this section, we address the optimal distributed MAC design for energy-constrained OSA when the secondary user has bursty traffic. We show that in this case, the optimal sensing and access decisions should also take into account the traffic dynamics of the secondary user. We illustrate the impact of the secondary user’s buffer state on the optimal operating decision.

A. Bursty Traffic Model

We assume that the packet arrival process is i.i.d. across slots, for example, the Poisson packet arrival process. The arrival process is characterized by the probability that a given number of packets arrive in a slot. We assume that the user has a finite buffer with a given maximum size. It receives packets in every slot even if it operates in the sleeping mode. Packets are dropped when its buffer overflows. Let D(t) denote the number of packets in the user’s buffer at the beginning of slot t. Depending on the packet arrivals and departures, the buffer state D(t) follows a Markov chain with a finite state space and transition probabilities given by

(30)

We assume that the transmission time of a packet over a channel with unit bandwidth is equal to the slot length. Hence, the number of packets transmitted over a channel in a slot is either 0 or a fixed number determined by the bandwidth of that channel.

B. POMDP Formulation

The POMDP framework developed in Section III-B for energy-constrained OSA design in the continuous traffic case can be extended to the bursty traffic case. Specifically, the new system is characterized by the following three components: i) the primary network’s SOS; ii) the secondary user’s residual energy; and iii) the secondary user’s buffer size D(t).

Sufficient Statistics: As explained in Section IV-E, to ensure synchronous hopping in the spectrum without extra control messages, the user (i.e., transmitter) and its receiver must use a common knowledge of the system state for decision-making in each slot. We note that while the user and its receiver can maintain the same belief vector and residual energy, the receiver does not know the exact buffer state D(t) until notified by the user during the exchange of RTS–CTS.7 Hence, when making sensing decisions (which occur before the exchange of RTS–CTS), both the user and its receiver should treat the buffer state D(t) as a partially observable parameter and use statistical information about D(t). On the other hand, since both the user and its receiver know the exact buffer state after a successful exchange of RTS–CTS, access decisions should be made by taking into account the exact buffer state D(t). We define an admissible access decision under each sensing outcome and buffer state, where

(31)

7 The secondary user can piggyback its buffer state D(t) on the RTS message.

The statistical information about the buffer state D(t) can be summarized by a conditional PMF. Each element of this PMF denotes the conditional probability (given the user’s notifications of the buffer state) that the user’s buffer state takes a particular value at the beginning of slot t. When the user operates in the sleeping mode or the chosen channel is sensed as busy, the user is unable to inform the receiver of its buffer state. Hence, the statistical information of D(t) is updated at both the user and its receiver based solely on the packet arrival process:

(32)

where

(33)

where the updated knowledge of the buffer state is given in (32).

Next, we derive the maximum expected reward that can be achieved in the sensing mode. Consider the scenario where the secondary user chooses a given access decision under a given sensing outcome when its buffer state is known. In this case, the maximum expected reward can be calculated as

(36)

Optimizing over all admissible access decisions and then averaging over all sensing outcomes (see (18)) and all buffer states with the current statistical information, we obtain that

(37)

With (35), (36), and (37), the value function can be obtained as

(38)

where

(39)
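The arrival-only update of the buffer-state PMF described around (32) can be sketched in Python as follows. This is a minimal illustration, not the paper's formula: it assumes that the buffer PMF and the arrival PMF are given as plain lists indexed by packet count, that no packet departs in the slot, and that arrivals beyond the maximum buffer size are dropped, as stated in the bursty traffic model. The function name and the numerical values are hypothetical.

```python
def arrival_only_update(buffer_pmf, arrival_pmf):
    """Propagate the buffer-state PMF through one slot with arrivals only.

    buffer_pmf[d]  : probability that the buffer holds d packets
                     (d = 0 .. max_size, so max_size = len(buffer_pmf) - 1).
    arrival_pmf[k] : probability that k packets arrive in a slot.
    Packets that would overflow the buffer are dropped, so the new buffer
    state is min(d + k, max_size).
    """
    max_size = len(buffer_pmf) - 1
    updated = [0.0] * (max_size + 1)
    for d, prob_d in enumerate(buffer_pmf):
        for k, prob_k in enumerate(arrival_pmf):
            updated[min(d + k, max_size)] += prob_d * prob_k
    return updated


if __name__ == "__main__":
    # Hypothetical example: buffer of maximum size 2, known to be empty,
    # with 0, 1, or 2 arrivals per slot.
    pmf = [1.0, 0.0, 0.0]
    arrivals = [0.5, 0.3, 0.2]
    for _ in range(2):  # two consecutive slots without any notification
        pmf = arrival_only_update(pmf, arrivals)
    print(pmf)
```

Because both the user and its receiver know the arrival statistics, each can run the same update independently and remain synchronized without exchanging messages, which is the property the sufficient-statistics argument relies on.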
APPENDIX B
PROOF OF PROPOSITION 2

Proof of P2.1: Note that the secondary user with residual energy can always act as if it has a lower residual energy. Hence, the secondary user with a larger initial energy earns no fewer rewards.

Proof of P2.2: We prove P2.2 by induction over residual energies . Specifically, for the lowest possible residual energy , the value function of any information state is and hence P2.2 holds. Suppose that it holds for all possible residual energies lower than . Since implies as seen from (22), we obtain from (25a) that . Next, we show that for . We note that when , we have from (23) and hence from (25c). Since as seen from (23), we have from (25c). Using (25b), we then obtain that

(42)

where and are, respectively, the -vectors associated with the regions containing belief vectors and , respectively. Viewing each term in the square brackets of (41) and (42) as an element of a possible -vector , we find that the action-value functions can be written as an inner product of the belief vector and an -vector . Moreover, there are only a finite number of such -vectors since we have assumed that sets are finite for all . Since the maximum of a finite set of piecewise linear and convex functions is also piecewise linear and convex, Proposition 3 holds.
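The structural argument above, that a value function given by the maximum of finitely many inner products with the belief vector is piecewise linear and convex, can be checked numerically with a small Python sketch. The vectors and belief points below are made-up numbers for illustration only and the function name is hypothetical. Such vectors are often called alpha-vectors in the POMDP literature, which may differ from the notation used in this paper.

```python
def value(belief, vectors):
    """Value of a belief as the maximum inner product over a finite vector set."""
    return max(sum(b * v for b, v in zip(belief, vec)) for vec in vectors)


def midpoint(belief_a, belief_b):
    """Belief halfway between two beliefs (still a valid probability vector)."""
    return tuple(0.5 * (a + b) for a, b in zip(belief_a, belief_b))


if __name__ == "__main__":
    # Hypothetical vector set over a two-state belief space.
    vectors = [(1.0, 0.0), (0.0, 1.0), (0.6, 0.6)]
    b1, b2 = (1.0, 0.0), (0.0, 1.0)
    mid = midpoint(b1, b2)
    # Convexity: the value at the midpoint does not exceed the average value.
    print(value(mid, vectors), 0.5 * (value(b1, vectors) + value(b2, vectors)))
```

Each vector defines a linear function of the belief, and the maximization selects, for every belief, the vector that dominates in its region, which is what produces the piecewise linear shape.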
APPENDIX D
PROOF OF PROPOSITIONS 4-5 AND COROLLARY 1
Proof of Proposition 4: We prove by induction that , i.e., . Clearly, holds for any when . Suppose that this equality holds for all residual energies lower than . Since is the stationary distribution of the underlying SOS, we have . We thus obtain from (16) and (17) that

(44)
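The property used in this proof, namely that a belief equal to the stationary distribution is left unchanged by the channel dynamics, can be illustrated for a single two-state (busy/idle) Markov channel. The Python sketch below is an assumption-laden toy example: the parameter names, the restriction to one channel, and the numerical values are all hypothetical and are not taken from the paper.

```python
def stationary_two_state(p_busy_to_idle, p_idle_to_idle):
    """Stationary (busy, idle) distribution of a two-state Markov channel.

    Solves idle = (1 - idle) * p_busy_to_idle + idle * p_idle_to_idle.
    """
    idle = p_busy_to_idle / (1.0 + p_busy_to_idle - p_idle_to_idle)
    return (1.0 - idle, idle)


def step(belief, p_busy_to_idle, p_idle_to_idle):
    """One-slot prediction of the (busy, idle) belief with no observation."""
    busy, idle = belief
    new_idle = busy * p_busy_to_idle + idle * p_idle_to_idle
    return (1.0 - new_idle, new_idle)


if __name__ == "__main__":
    a, b = 0.3, 0.9  # hypothetical transition probabilities
    pi = stationary_two_state(a, b)
    print(pi, step(pi, a, b))  # the two beliefs coincide up to rounding
```

Starting from any other belief, repeated calls to `step` move the belief toward this fixed point whenever the two transition probabilities are not both 0 or 1 in a way that makes the chain periodic or reducible.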
[10] D. Datla, A. M. Wyglinski, and G. J. Minden, “A spectrum surveying framework for dynamic spectrum access networks,” IEEE J. Sel. Areas Commun.: Cognitive Radio: Theory Appl., Mar. 2007, submitted for publication.
[11] S. D. Jones, E. Jung, X. Liu, N. Merheb, and I.-J. Wang, “Characterization of spectrum activities in the U.S. public safety band for opportunistic spectrum access,” presented at the IEEE Symp. New Frontiers Dynamic Spectrum Access Networks (DySPAN), Dublin, Ireland, Apr. 2007.
[12] Q. Zhao and B. Sadler, “A survey of dynamic spectrum access,” IEEE Signal Process. Mag., vol. 24, no. 3, pp. 79–89, May 2007.
[13] D. P. Bertsekas, Dynamic Programming and Optimal Control. Belmont, MA: Athena Scientific, 1995, vol. 1.
[14] W. S. Lovejoy, “Some monotonicity results for partially observed Markov decision processes,” Oper. Res., vol. 35, no. 5, pp. 736–743, Sept.–Oct. 1987.
[15] R. D. Smallwood and E. J. Sondik, “The optimal control of partially observable Markov processes over a finite horizon,” Oper. Res., vol. 21, pp. 1071–1088, 1973.
[16] H.-T. Cheng, “Algorithms for partially observable Markov decision processes,” Ph.D. dissertation, Univ. of British Columbia, BC, Canada, 1988.
[17] M. Littman, The Witness algorithm: Solving partially observable Markov decision processes, Brown Univ., Providence, RI, Tech. Rep. CS-94-40, 1994.
[18] A. Cassandra, M. Littman, and N. L. Zhang, “Incremental pruning: A simple, fast, exact algorithm for partially observable Markov decision processes,” in Proc. 13th Conf. Uncertainty in Artificial Intelligence, Providence, RI, Aug. 1997, pp. 54–61.
[19] A. Cassandra, Optimal policies for partially observable Markov decision processes, Brown Univ., Providence, RI, Tech. Rep. CS-94-14, 1994.
[20] E. J. Sondik, “The Optimal control of partially observable Markov processes,” Ph.D. dissertation, Electr. Eng. Dept., Stanford Univ., Stanford, CA, 1971.
[21] D. Aberdeen, A Survey of approximate methods for solving partially observable Markov decision processes, National ICT Australia, Tech. Rep., Dec. 2003 [Online]. Available: http://users.rsise.anu.edu.au/~daa/papers.html
[22] K. Liu, Q. Zhao, and Y. Chen, “Distributed sensing and access in cognitive radio networks,” presented at the 10th Int. Symp. Spread Spectrum Techniques Applications (ISSSTA), Bologna, Italy, Aug. 2008.
[23] Q. Zhao, L. Tong, and A. Swami, “Decentralized cognitive MAC for dynamic spectrum access,” in Proc. 1st IEEE Symp. New Frontiers Dynamic Spectrum Access Networks, Nov. 2005, pp. 224–232.
[24] Y. Chen, Q. Zhao, and A. Swami, “Joint design and separation principle for opportunistic spectrum access,” in Proc. IEEE Asilomar Conference on Signals, Systems, and Computers, Asilomar, CA, Oct. 2006.

Yunxia Chen (S’03) received the M.Sc. degree from the Department of Electrical and Computer Engineering, the University of Alberta, Edmonton, AB, Canada, in 2004 and the Ph.D. degree from the University of California, Davis, in 2008.
Her research interests are in resource-constrained signal processing, communications, and networking. She is also interested in diversity techniques and fading channels, communication theory, and MIMO systems.
Dr. Chen received the Best Student Paper Award at the 2006 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) in May 2006 and First Place in the Student Paper Contest at the IEEE Asilomar Conference on Signals, Systems, and Computers in October 2006.

Qing Zhao (S’97–M’02–SM’08) received the Ph.D. degree in electrical engineering from Cornell University, Ithaca, NY, in 2001.
In August 2004, she joined the Department of Electrical and Computer Engineering at the University of California, Davis (UC Davis), where she is currently an Associate Professor. Her research interests are in the general area of signal processing, communications, and wireless networking.
Dr. Zhao is an Associate editor of the IEEE TRANSACTIONS ON SIGNAL PROCESSING and an elected member of IEEE Signal Processing Society SP-COM Technical Committee. She received the 2000 Young Author Best Paper Award from the IEEE Signal Processing Society and the 2008 Outstanding Junior Faculty Award from the UC Davis College of Engineering.

Ananthram Swami (S’79–M’79–SM’96–F’08) received the B.Tech. degree from the Indian Institute of Technology (IIT), Bombay; the M.S. degree from Rice University, Houston, TX, and the Ph.D. degree from the University of Southern California (USC), Los Angeles, all in electrical engineering.
He has held positions with Unocal Corporation, USC, CS-3, and Malgudi Systems. He was a Statistical Consultant to the California Lottery, developed a Matlab-based toolbox for non-Gaussian signal processing, and has held visiting faculty positions at INP, Toulouse. He is with the U.S. Army Research Laboratory (ARL), Adelphi, MD, where his work is in the broad area of signal processing, wireless communications, and sensor and mobile ad hoc networks. He is a Co-Editor of the book Wireless Sensor Networks: Signal Processing & Communications Perspectives (New York: Wiley, 2007).
Dr. Swami is an ARL Fellow. He is a member of the IEEE Signal Processing Society’s (SPS) Technical Committee (TC) on Sensor Array & Multi-Channel Systems. He has served as Chair of the IEEE SPS TC on Signal Processing for Communications; member of the SPS Statistical Signal and Array Processing TC; Associate Editor of the IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, the IEEE SIGNAL PROCESSING LETTERS, the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II, the IEEE Signal Processing Magazine, and the IEEE TRANSACTIONS ON SIGNAL PROCESSING. He was Guest Editor of a IEEE Signal Processing Magazine for a Special Issue on Signal Processing for Networking in 2004 and for a Special Issue on Distributed Signal Processing in Sensor Networks in 2006, a EURASIP Journal on Advances in Signal Processing Special Issue on Reliable Communications Over Rapidly Time-Varying Channels in 2006, and a EURASIP Journal on Wireless Communications and Networking Special Issue on Wireless Mobile Ad Hoc Networks in 2007, and he was the Lead Editor for the IEEE JOURNAL ON SELECTED TOPICS IN SIGNAL PROCESSING Special Issue Signal Processing and Networking for Dynamic Spectrum Access in 2008. He has co-organized and co-chaired three IEEE SPS Workshops (HOS’93, SSAP’96, and SPAWC’10) and a 1999 ASA/IMA Workshop on Heavy-Tailed Phenomena. He was a tutorial speaker on Networking Cognitive Radios for Dynamic Spectrum Access at ICASSP 2008, DySpan 2008, and MILCOM 2008.