Three distributed estimation algorithms are evaluated in terms of convergence, communication cost, computational cost, and robustness to communication errors on the IEEE 14-bus network. The convergence of one algorithm is also analyzed for a general graph.
Introduction
Power system state estimation (PSSE) refers to obtaining the voltage phasors of all system buses at a given moment. This is done by collecting observations of the many power flows through the network, then performing an inference to determine the underlying phasor values. In early days, PSSE was performed in a centralized data processing center which aggregated all the observations and computed a global solution for the entire network [8, 7]. However, there is a move towards expanding system sensing capabilities as well as increasing the rate of estimation, which makes decentralized estimation very attractive. Also, the new grid structure is open to two-way electricity flow and distributed generation down to the distribution level, at a scale where centralized operation might not be as effective. Distributed power system state estimation algorithms allow system operators to deal with large-scale problems by dividing the measurements and buses into control areas. Each control area collects its measurements, performs its own state estimation, and exchanges information with other control areas.
Recently proposed methods for computing state estimates come from the sensor networking literature [9, 4]. This report provides an experimental study of these two techniques. It also evaluates the performance of recent work in distributed subgradient-based optimization [1]. The study divides into two parts. In the first part, three algorithms are implemented and their performance compared with a centralized estimator. Comparisons are made on four performance indicators of interest: (1) convergence behavior, (2) computational complexity, (3) communication cost, and (4) local observability requirement. In the second part, we analyze the performance of these techniques on different tree networks and present an empirical model for one of the algorithms.
The centralized state estimation problem is the following. Given an underlying state of the network

    X = {(θ_0, V_0), (θ_1, V_1), . . . , (θ_N, V_N)}

at every bus except the reference, the state estimator will use a set of measurements of the form

    z_k = h_k(X) + ε_k

where ε_k is the noise on each measurement, to produce an estimate of the state x. The estimate solves

    x̂ = argmin_x Σ_{k=1}^{M} (z_k − h_k(x))² / σ_k²    (1)
For the sake of simplicity, we only consider power measurements for the entire system. In the fully nonlinear AC power flow, a state estimator will receive the following sets of measurements.

Real and reactive power injection at bus i:

    P_i = V_i Σ_{j∈N(i)} V_j (G_ij cos θ_ij + B_ij sin θ_ij)    (2)

    Q_i = V_i Σ_{j∈N(i)} V_j (G_ij sin θ_ij − B_ij cos θ_ij)    (3)

Real and reactive power flow on branch (i, j) from bus i to bus j:

    P_ij = g_ij V_i² − V_i V_j (g_ij cos θ_ij − b_ij sin θ_ij)    (4)

    Q_ij = b_ij V_i² − V_i V_j (b_ij cos θ_ij + g_ij sin θ_ij)    (5)

Here G_ij + jB_ij is the (i, j) entry of the bus admittance matrix, g_ij + jb_ij is the admittance of branch (i, j), and θ_ij = θ_i − θ_j.
In this paper, however, we will solve the state estimation problem based on linearized DC power flow. In linearized DC power flow, we assume first that all bus voltage magnitudes are close to 1.0 per unit. Next, that all the transmission lines are lossless and the nodal voltage phase angle differences are small. Applying these assumptions to the nonlinear measurements in eqs. (2) and (4), we have the following equations relating bus angle to real bus injection and branch flow power:

    P_i = Σ_{j∈N(i)} B_ij (θ_i − θ_j)    (6)

    P_ij = b_ij (θ_i − θ_j)    (7)
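Eqs. (6) and (7) make every measurement a linear function of the bus angles, so the measurement matrix can be assembled row by row. A minimal sketch (in Python rather than the report's MATLAB, with made-up line susceptances on a hypothetical 3-bus network):

```python
import numpy as np

# Hypothetical 3-bus network: branches as (from, to, susceptance b_ij).
branches = [(0, 1, 5.0), (1, 2, 4.0), (0, 2, 2.5)]
n_bus = 3

# One row per bus injection (eq. 6), then one row per branch flow (eq. 7).
H = np.zeros((n_bus + len(branches), n_bus))
for i, j, b in branches:
    # Injection rows: P_i = sum_j B_ij (theta_i - theta_j).
    H[i, i] += b; H[i, j] -= b
    H[j, j] += b; H[j, i] -= b
for r, (i, j, b) in enumerate(branches):
    # Flow rows: P_ij = b_ij (theta_i - theta_j).
    H[n_bus + r, i] = b
    H[n_bus + r, j] = -b

print(H)
```

Every row of H sums to zero, reflecting the fact that only angle differences are observable in the DC model.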
The linearized measurement model allows us to represent the measurement set as well as the central estimator as a simple weighted least squares formulation. From the relations in eqs. (6) and (7), the measurement vector and system state vector are related by Z = HX + ε. Here Z ∈ R^{n_obs} is the column vector containing all bus injection and branch flow measurements (M = n_bus + n_branch), and H ∈ R^{n_obs × n_bus} is the matrix containing the coefficients of eqs. (6) and (7). From the DC modeling assumptions, the state vector only contains bus angles; from now on, the state variable is x = {θ_1, . . . , θ_N}. We can therefore simplify eq. (1) to the following centralized weighted linear least squares estimate:

    x̂_c = (Hᵀ R⁻¹ H)⁻¹ Hᵀ R⁻¹ Z    (8)
Note that in order to be able to estimate the state vector, the number of measurements must be at least N − 1 and rank(Hᵀ H) = n_bus − 1. To make the system of equations full rank, we assume that bus 0 is always the slack bus, with phase angle zero. We solve for each angle relative to θ_0; this leads to a reduced sensing matrix H_r ∈ R^{n_obs × (n_bus − 1)}, which is H with the first column removed. We can then apply eq. (8) with H_r:

    [θ_1 − θ_0, . . . , θ_N − θ_0]ᵀ = (H_rᵀ R⁻¹ H_r)⁻¹ H_rᵀ R⁻¹ Z    (9)
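Once H_r is formed, eq. (9) is a single linear solve. A minimal sketch on a hypothetical 3-bus DC model (the susceptances, noise level, and identity noise covariance are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy DC model: true angles with bus 0 as slack (theta_0 = 0).
theta_true = np.array([0.0, -0.05, 0.08])
H = np.array([[ 7.5, -5.0, -2.5],   # bus injection rows (eq. 6)
              [-5.0,  9.0, -4.0],
              [-2.5, -4.0,  6.5],
              [ 5.0, -5.0,  0.0],   # branch flow rows (eq. 7)
              [ 0.0,  4.0, -4.0],
              [ 2.5,  0.0, -2.5]])
z = H @ theta_true + 1e-4 * rng.standard_normal(6)
R_inv = np.eye(6)                    # identity noise covariance assumed

# Eq. (9): drop the slack column, solve the reduced WLS normal equations.
Hr = H[:, 1:]
theta_hat = np.linalg.solve(Hr.T @ R_inv @ Hr, Hr.T @ R_inv @ z)
print(theta_hat)   # estimates of theta_1 - theta_0 and theta_2 - theta_0
```

For well-conditioned H_r and small noise, the estimate recovers the relative angles to within the noise floor.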
Distributed algorithms for estimation work by splitting large problems into many locally computable problems. An example partition of the network into control areas is shown in Figure 1. A decentralized algorithm works by distributing the observations and computation into different domains. Here, each area receives observations of variables local only to itself as well as variables it shares with neighboring areas. Each area then uses messages from its neighbors to compute estimates of its unknown variables and possibly the unknown variables contained in other areas. Local observability of an area is an important assumption in many distributed techniques. If an area is locally observable, then estimation can occur without the rest of the network. Many early and heuristic methods required local observability; however, not all of the algorithms considered here require local observability. In the following sections we will introduce the various algorithms and illustrate their behavior using the 14-bus network.
3.1 CSE
The algorithm introduced in [9] for use in distributed power system state estimation originates from work on distributed estimation in sensor networks [3]. In CSE, each area builds an estimate of the entire state of the system. In each step, an area computes a local estimate of the global state and then transmits the full estimate to all neighbors in its communication network. Note that in CSE, each area's state vector is the full set of unknown variables. Let x_k^t ∈ R^{n_bus} denote the estimate held by area k at time t, and let N_k denote the neighbors l ∈ N_k of area k. The update is

    x_k^{t+1} = x_k^t − a [ b Σ_{l∈N_k} (x_k^t − x_l^t) − H_kᵀ (z_k − H_k x_k^t) ]    (10)
This method puts no limitations on the communication architecture required; that is, the structure of N_k need not match the power network topology. Since the update term of eq. (10) does not involve any matrix inversion, the algorithm does not assume any local observability. Therefore, it can be used in a fully decentralized manner, in that each bus can be a computing area. However, there is no implicit inversion either, since the update term is similar to a gradient descent direction.
Analysis of the algorithm is based mainly on introducing an aggregate state x(i) = [x_1(i)ᵀ . . . x_{n_area}(i)ᵀ]ᵀ. The aggregate update is

    x(i + 1) = (I_{n_bus · n_area} − ab (L ⊗ I_{n_bus})) x(i) + a D_Hᵀ (Z − D_H x(i))    (11)

where L is the Laplacian of the communication graph and D_H is the block-diagonal measurement matrix D_H = blockdiag(H_1, . . . , H_{n_area}).

Note the similarity of the update step with that of recursive least squares filters, whose error process evolves as

    e_{k+1} = (I − A_k H) e_k    (12)

In the case of the CSE algorithm, we can use a similar method to show that the error process evolves as e_{k+1} = (I − ab (L ⊗ I_{n_bus}) − a D_Hᵀ D_H) e_k. Like RLS filters, this technique has an associated Riccati equation relating error covariance as a function of iteration.
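The consensus-plus-innovations update (10) can be sketched directly. The following is an illustrative Python implementation on a hypothetical two-area, 4-bus DC problem; the gains a and b, the measurement partition, and the noiseless data are assumptions, not the tuned values used later in the report:

```python
import numpy as np

theta_true = np.array([0.0, -0.05, 0.08, 0.02])   # hypothetical 4-bus state
# Two areas; each measurement matrix acts on the FULL state (CSE style).
H = [np.array([[1.0,  0.0, 0.0, 0.0],     # area 1: slack anchor + two flows
               [1.0, -1.0, 0.0, 0.0],
               [0.0,  1.0, -1.0, 0.0]]),
     np.array([[0.0, 0.0, 1.0, -1.0],     # area 2: two flows
               [1.0, 0.0, 0.0, -1.0]])]
z = [Hk @ theta_true for Hk in H]
neighbors = {0: [1], 1: [0]}               # communication graph

a, b = 0.05, 1.0                           # illustrative gains, not tuned
x = [np.zeros(4), np.zeros(4)]             # each area tracks the full state
for t in range(20000):
    x_new = []
    for k in range(2):
        consensus = sum(x[k] - x[l] for l in neighbors[k])
        innovation = H[k].T @ (z[k] - H[k] @ x[k])
        # Eq. (10): consensus pulls areas together, innovation fits local data.
        x_new.append(x[k] - a * (b * consensus - innovation))
    x = x_new
print(x[0])
```

No per-area observability is needed: neither area alone can identify the state, yet the coupled iteration converges to the global solution, at the cost of many iterations.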
3.2 ADMM
The ADMM algorithm was first introduced to power system state estimation in [4]. It solves the constrained optimization problem

    min_x Σ_{k=1}^{n_area} ||z_k − H_k x_k||²
    s.t. x_k[l] = x_{kl},  l ∈ N_k

where x_k[l] denotes the entries of area k's state vector shared with area l, and x_{kl} is an auxiliary consensus variable. From this we can formulate the augmented Lagrangian for the system. This is given for the problem as

    L = Σ_{k=1}^{n_area} [ ||z_k − H_k x_k||² + Σ_{l∈N_k} ( v_{k,l}ᵀ (x_k[l] − x_{kl}) + c ||x_k[l] − x_{kl}||² ) ]    (13)
The alternating direction term comes from the fact that the augmented Lagrangian in eq. (13) is minimized by each area partially minimizing its local Lagrangian:

    x_k^{t+1} = argmin_{x_k} L_k({x_k}, {x_{kl}^t}, {v_{k,l}^t})    (14)

    x_{kl}^{t+1} = argmin_{x_{kl}} L({x^{t+1}}, {x_{kl}}, {v_{k,l}^t})    (15)

    v_{k,l}^{t+1} = v_{k,l}^t + c (x_k^{t+1}[l] − x_{kl}^{t+1})    (16)
Note that in ADMM, each area's state vector is the local set of unknown variables as well as the variables that affect the observation set in the area. Carrying out the minimizations yields closed-form per-area updates:

    x_k^{t+1} = (H_kᵀ H_k + c D_k)⁻¹ (H_kᵀ z_k + c D_k p_k^t)    (17)

    s_k^{t+1}(i) = (1 / |N_k^i|) Σ_{l∈N_k^i} x_l^{t+1}[i]    (18)

    p_k^{t+1}(i) = p_k^t(i) + s_k^{t+1}(i) − (x_k^t(i) + s_k^t(i)) / 2    (19)
Here x_l[i] is the estimate of x_k(i) held by area l, defined for all l ∈ N_k^i. N_k^i is defined as the set of all areas which share the i-th element of area k's state vector (x_k[i]). D_k is a diagonal matrix with (i, i) entry |N_k^i|. With an appropriate choice of c, for each control area k, x_k^r converges to the corresponding subset of the whole-system estimate. Combining values from all areas, one can obtain the whole-system estimate.
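The per-area recursion (17)–(19) is a form of consensus ADMM. As an illustrative sketch (Python rather than the report's MATLAB; toy 4-bus data and c = 1 are assumptions, and for simplicity both areas duplicate the full state vector instead of tracking only border variables, which makes the averaging step a plain mean):

```python
import numpy as np

theta_true = np.array([0.0, -0.05, 0.08, 0.02])   # hypothetical 4-bus state
H = [np.array([[1.0,  0.0, 0.0, 0.0],      # area 1: slack anchor + two flows
               [1.0, -1.0, 0.0, 0.0],
               [0.0,  1.0, -1.0, 0.0]]),
     np.array([[0.0, 0.0, 1.0, -1.0],      # area 2: two flows
               [1.0, 0.0, 0.0, -1.0]])]
z = [Hk @ theta_true for Hk in H]

c = 1.0                                     # penalty parameter (illustrative)
x = [np.zeros(4) for _ in range(2)]         # per-area copies of the state
u = [np.zeros(4) for _ in range(2)]         # scaled dual variables
xbar = np.zeros(4)
for t in range(5000):
    # x-update: local regularized least squares (cf. eq. (17)).
    x = [np.linalg.solve(2 * Hk.T @ Hk + c * np.eye(4),
                         2 * Hk.T @ zk + c * (xbar - uk))
         for Hk, zk, uk in zip(H, z, u)]
    # Averaging over areas sharing each variable (cf. eq. (18)).
    xbar = (x[0] + x[1]) / 2
    # Dual ascent on the consensus constraint (cf. eqs. (16), (19)).
    u = [uk + xk - xbar for uk, xk in zip(u, x)]
print(xbar)
```

Each iteration only requires a small local solve plus an exchange of shared entries, which is why ADMM's per-iteration communication stays modest.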
3.3 DDA
In distributed dual averaging (DDA), estimates are constructed by each node calculating a local estimate of the global subgradient and then sharing it with neighbors defined by a communication graph. In the case of the power system estimation problem, the communication graph is independent of the topology of the power network, as shown in Figure 1. This makes DDA similar to the CSE algorithm in that regard. Specifically, we have a neighbor set N(i) = {j ∈ V | (i, j) ∈ E} for each node i ∈ V, where V and E are the nodes and edges of the communication graph. The core algorithm is defined for a general optimization problem as follows. For a given generic convex objective function of the form

    min_{x∈X} f(x) = (1/n) Σ_{i=1}^{n} f_i(x)    (21)

each node k updates

    w_k^{t+1} = Σ_{l∈N(k)} p_{kl} w_l^t + g_k^t    (22)

    x_k^{t+1} = Π_X^ψ(w_k^{t+1}, α(t))    (23)

where g_k^t is a subgradient of f_k at x_k^t, the p_{kl} are doubly stochastic mixing weights, and the projection Π_X^ψ is defined as

    Π_X^ψ(z, α) = argmin_{x∈X} { ⟨z, x⟩ + (1/α) ψ(x) }    (24)

Here ψ is the proximal function, and in this study it is set to the canonical proximal function ψ(x) = (1/2)||x||₂², as stated in [1].
In the case of distributed linear estimation for power system state estimation, the algorithm reduces to the following. The objective function now is f_k(x) = (z_k − H_k x)ᵀ R_k⁻¹ (z_k − H_k x), which gives us g_k(t) = −2 H_kᵀ R_k⁻¹ (z_k − H_k x_k^t). Next, Π_X^ψ(z, α) = argmin_{x∈X} { zᵀx + (1/2α) xᵀx }. In the unconstrained case, the solution is x_k^{t+1} = −α(t) w_k^{t+1}. Since x ∈ [−π, π] we need to constrain the update. The final recursion becomes

    w_i(t + 1) = Σ_{j∈N(i)} p_{ij} w_j(t) + g_i(t)    (25)

    x_i(t + 1) = min { max { −α(t) w_i(t + 1), −π }, π }    (26)
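The recursion (25)–(26) can be sketched as follows (Python; the toy 4-bus data, the mixing weights, and the step-size scale α₀ are assumptions for illustration, not the tuned values reported in Section 4):

```python
import numpy as np

theta_true = np.array([0.0, -0.05, 0.08, 0.02])   # hypothetical 4-bus state
# Area 1 observes buses 0-1, area 2 observes buses 2-3 (toy partition).
H = [np.array([[1.0,  0.0, 0.0, 0.0],
               [0.0,  1.0, 0.0, 0.0],
               [1.0, -1.0, 0.0, 0.0]]),
     np.array([[0.0, 0.0, 1.0,  0.0],
               [0.0, 0.0, 0.0,  1.0],
               [0.0, 0.0, 1.0, -1.0]])]
z = [Hk @ theta_true for Hk in H]
P = np.full((2, 2), 0.5)           # doubly stochastic mixing weights

alpha0 = 0.2                        # illustrative step-size scale
w = [np.zeros(4), np.zeros(4)]      # accumulated (mixed) gradients
x = [np.zeros(4), np.zeros(4)]
for t in range(1, 50001):
    g = [-2 * Hk.T @ (zk - Hk @ xk) for Hk, zk, xk in zip(H, z, x)]
    # Eq. (25): mix neighbors' dual variables, then add the local gradient.
    w = [P[k, 0] * w[0] + P[k, 1] * w[1] + g[k] for k in range(2)]
    alpha = alpha0 / np.sqrt(t)
    # Eq. (26): step from the dual average, clipped to the angle box.
    x = [np.clip(-alpha * wk, -np.pi, np.pi) for wk in w]

err = np.linalg.norm(x[0] - theta_true)
print(err)
```

The diminishing step size α(t) = α₀/√t gives the slow O(1/√t) error decay seen in the experiments below: the residual shrinks steadily but requires very many iterations.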
4
The centralized and three distributed algorithms are numerically tested using MATLAB. The power system used in the tests is the IEEE 14-bus system, which is shown in Figure 1. The associated admittance matrix and true underlying power states/measurements are obtained using MATPOWER and verified against the benchmark source. In the IEEE 14-bus grid, measurement sites and types are shown in the figure.

An abstraction of the buses, measurements, and control areas is also shown in Figure 1. The four rectangles represent local control areas with a total of 23 measurements in p.u., where boxes on an edge represent branch flow measurements. The redundancy ratio is 23/14 = 1.64; therefore, the system is very likely to have a unique solution. The communication between the areas forms a fully connected graph.
4.1

The MSE shown is between the current estimate and the underlying true power state. The centralized estimate x̂_c agrees with the true underlying state vector obtained from MATPOWER; that means all pieces of the centralized algorithm and related information are consistent. For the CSE algorithm with a = 8e-10 and b = 1e-10, we found that after 13,626 iterations the estimates of each area agree with the true underlying state vector obtained from MATPOWER. For the dual averaging algorithm with α_0 = 2.66e-10 and N_EW = 2000, we found that after 495,536 iterations the estimates of each control area agree with the true underlying state vector obtained from MATPOWER. For the ADMM algorithm with c = 9, we found agreement with the true underlying state vector obtained from MATPOWER after 34 iterations.
4.2

The convergence behavior is shown by calculating the error with respect to the centralized solution x̂_c, defined as e_{k,c}^r = (1/N) ||x̂_c − x_k^r||₂, and the error to the true underlying state x, defined as e_{k,x}^r = (1/N) ||x − x_k^r||₂. Here N is the size of the state vector. Note that we do not use stopping criteria here, in order to see the convergence behavior across a large number of iterations. The error curves obtained from the IEEE 14-bus network are shown below.
4.2.1

Figure 3: Performance of the three algorithms under only observation noise. Measurement noise takes values σ_HIGH = 0.01 and σ_LOW = 0.00001.

The ADMM convergence rate is high and the number of steps to reach the cut-off level is about 35 iterations, while the other two methods require 3 or 4 more orders of magnitude in the number of iterations. Moreover, the convergence rate of the ADMM algorithm in terms of e_{k,c} tends to be linear after reaching the cut-off level of MSE. The CSE algorithm continues to improve slowly after reaching this cut-off level, while the dual averaging algorithm has a very flat convergence rate long before reaching the cut-off level. It would be interesting to see why the convergence behavior of the dual averaging algorithm has a sudden increase and then becomes flat.
4.2.2

Figure 4: Performance of the three algorithms under communication noise, with σ_HIGH = 0.01 and σ_LOW = 0.001.

Given the iteration counts required by the consensus-based methods, and the fact that in any real deployment messaging times are finite, these techniques seem of only theoretical interest for this application.
4.2.3

Here we simulate the three methods with noise in the messages as well as in the observations. The results are shown in Figure 7. We use the same variances as in Sections 4.2.1 and 4.2.2, and only present the cases where both noises are in the low or high condition. In the first experiment we see again that, given enough time, the CSE algorithm will outperform ADMM and DDA in terms of forming consensus; however, as in the other situations, ADMM converges many orders of magnitude faster.
Figure 5: Performance under both observation and communication noise, with σ_HIGH = 0.01 and σ_LOW = 0.001. The measurement noise covariance is diagonal.
To compare communication requirements, we define the cost of sending one number from a local control area to the central control center to be c_lc, and the cost of sending one number from a local control area to a neighboring local area to be c_ll. We also assume there is no communication cost to collect local measurements within a local control area. Notice that the communication cost is proportional to the number and size of messages sent during the algorithm.

Similarly, to measure the communication time, we define the maximum time of sending one number from a local control area to the central control center to be t_lc, and the maximum time of sending one number from a local control area to a neighboring local area to be t_ll. These let us approximate each algorithm's communication requirements.

Centralized algorithm: the central control center collects all M measurements from the local control areas. Hence, the communication cost is bounded above by O(M c_lc). Since the areas can transmit in parallel, the communication time is bounded above by O(t_lc).

CSE algorithm: in each iteration, every control area sends its full state vector to its neighbors, so the communication cost is bounded above by O(n_iterate d n_area n_bus c_ll), where d is the degree of the communication graph, and the communication time is bounded above by O(n_iterate t_ll).

Dual averaging algorithm: the message pattern is the same as for CSE, but the number of iterations is much larger; with N_EW = 2000 it required 509,334 iterations.

ADMM-based algorithm: each area exchanges the shared entries of x_k^r with its neighbors in every iteration; in our experiments n_iterate = 33.
Computational Complexity

Algorithm     Pre-Operational                                 Operational
Centralized   O(n_bus M² + n_bus³)                            O(n_bus M + n_bus²)
CSE           0                                               O(n_area n_iterate max_i|z_i| + d n_bus)
DDA           0                                               O(n_area n_iterate n_bus max_i|z_i| + d n_bus)
ADMM          O(n_area (max_i|z_i|)² + (max_i|x_i|)³)         O(n_area n_iterate max_i|x_i| (max_i|z_i| + d))

Table 1: Computational complexity of the algorithms, separated into pre-operational complexity and operational complexity.
Algorithm     Communication Cost                                                Communication Time
Centralized   O(M c_lc)                                                         O(t_lc)
CSE           O(n_iterate d n_area n_bus c_ll)                                  O(n_iterate t_ll)
DDA           O(n_iterate d n_bus n_area c_ll)                                  O(n_iterate t_ll)
ADMM          O(n_iterate d n_area max_i|x_i| c_ll + n_area² max_i|x_i| c_ll)   O(n_area t_ll)

Table 2: Communication cost and time of the algorithms.
Because each area's ADMM iterate covers only a subset of the whole system state, making every control area know the whole system state requires extra communication cost of sending incomplete state information to all other control areas through local communication channels. For this 4-control-area system, with careful counting, we found that this extra communication cost is bounded by O(n_area² max_i|x_i| c_ll). So the total communication cost is O(n_iterate d n_area max_i|x_i| c_ll + n_area² max_i|x_i| c_ll).

Communication time: the communication time is bounded by O(n_area t_ll). However, to make every control area know the whole system state, extra time is required to send incomplete state information to all other control areas through local communication channels. This extra communication time is also bounded by O(n_area t_ll), so the total communication time remains O(n_area t_ll).
The computational complexity and communication time/cost for each algorithm are summarized in Tables 1 and 2. The tables include numerical results from the IEEE 14-bus system.
5.1 Discussion

In terms of pre-operational computational complexity, the CSE and dual averaging algorithms are the best among the four algorithms, while the centralized algorithm is the worst. This is because the centralized algorithm requires a big matrix inversion. In terms of operational complexity, the ADMM algorithm is the best among the four algorithms, while the dual averaging algorithm is the worst. This is because the dual averaging algorithm requires a high number of iterations.
Once the pre-operational computation is done, the result can be stored and reused unless there is any update in the system (e.g. a bus connection line is cut). The numerical result here implies that, in terms of overall computational complexity, the centralized algorithm is more advantageous than the distributed algorithms. Among the distributed algorithms, the ADMM algorithm is the best algorithm. In the general setting, especially when the problem is large, the ADMM algorithm may be as good as or better than the centralized algorithm if n_iterate is of the same order as the problem size. In [4], for the larger problem, the IEEE 118-bus system, n_iterate does not grow as the problem size grows: the 118-bus system reaches 10⁻⁴ accuracy in e_{k,c} after 10 iterations.
The comparison also depends on the communication parameters. If c_lc ≫ c_ll and t_lc ≫ t_ll, then the distributed algorithms become more attractive. We need a model for the relationship between the local-to-local parameters and the local-to-central parameters when the problem is large, as well as for how n_iterate in the ADMM algorithm relates to the problem size. Roughly speaking, if the local-to-central parameters grow as fast as n_iterate, the ADMM algorithm may be as good as or better than the centralized algorithm.
Local Observability
Due to the proliferation of sensing and computation capabilities on the distribution grid, exploring the convergence properties of distributed estimation on tree-like graphs seems an interesting direction of study. We simulated the PSSE problem using ADMM only. It would be fruitful to explore the other techniques as well; however, their large convergence times limited the study to the faster-converging technique. In the simulations, we tested randomly generated trees of size N = 2, 5, 10, 20, generated as documented in [6]. For these simulations, the ground truth state vectors were randomly generated, as were the observation matrices. We assumed that each bus made complete power observations of the entire system; that is, each bus had a bus power measurement as well as a branch flow measurement. This was chosen to reduce the variability of the convergence results. Convergence was measured by the number of iterations taken for e_{c,k} < ε. We chose ε values of 0.01 and 0.001.
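The graph-spectrum quantity used below, λ₂ (the second smallest eigenvalue of the graph Laplacian), is easy to compute for a random tree. A sketch with a simple random-attachment tree generator (an assumption — the generator in [6] may differ):

```python
import numpy as np

def random_tree_laplacian(n, rng):
    """Laplacian of a random tree: attach each new node to a uniform earlier node."""
    L = np.zeros((n, n))
    for v in range(1, n):
        u = rng.integers(0, v)        # random parent among existing nodes
        L[u, u] += 1; L[v, v] += 1
        L[u, v] -= 1; L[v, u] -= 1
    return L

rng = np.random.default_rng(1)
for n in (5, 10, 20):
    lam2 = np.sort(np.linalg.eigvalsh(random_tree_laplacian(n, rng)))[1]
    print(n, round(lam2, 4))
```

For any connected tree the smallest Laplacian eigenvalue is 0 and λ₂ > 0; λ₂ tends to shrink as the tree grows, which is the variable plotted against iteration count in Figure 6.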
k
Figure 6: (Top Left) Number of iterations required for ec < 0.001 to be satied vs. tree size.
k
(Top Right) Number of iterations required for ec < 0.001 to be satied v.s. 2 . (Bottom
k
Left) Number of iterations required for ec < 0.01 to be satied vs. tree size. (Bottom Right)
k
Number of iterations required for ec < 0.01 to be satied v.s. 2 .
Figure 6 illustrates the convergence properties of ADMM on randomly generated tree networks. The first apparent result is that there is a large variation in termination time for a given graph size. In the second set of plots, we see the variation of minimum iteration counts vs. the second smallest eigenvalue λ₂ of the system. The log-log scale of the plots shows a linear relationship. Also, there is high variation in the termination time conditioned on λ₂. This is because each randomly generated tree had a randomly generated bus matrix as well as observations. We are not aware of a closed-form convergence bound for the ADMM algorithm assuming single-bus areas. However, the experimental results point towards a polynomial relationship in λ₂(L). The simulation data is used to fit a simple regression model of the form iterations ∝ λ₂(L)^β. In the two experiments we obtained exponent values of β_{ε=.01} = −0.52143 and β_{ε=.001} = −0.51850. The 95% confidence intervals under linear regression are β_{(.001)}^{95} = [−0.56, −0.48] and β_{(.01)}^{95} = [−0.56, −0.47]. It would be of theoretical interest to derive this bound from first principles; however, the scope of this work leaves this to the future.
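The exponent β comes from ordinary least squares on log-transformed data: fitting log(iterations) = log(C) + β log(λ₂). A sketch with synthetic stand-in data (the report's raw simulation output is not available, so the constants below are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in: iterations ~ C * lambda2^beta with beta = -0.52,
# plus multiplicative noise mimicking run-to-run variation.
lam2 = rng.uniform(0.01, 1.0, size=200)
iters = 50.0 * lam2 ** -0.52 * np.exp(0.1 * rng.standard_normal(200))

# Fit log(iters) = log(C) + beta * log(lam2) by ordinary least squares.
beta, logC = np.polyfit(np.log(lam2), np.log(iters), 1)
print(round(beta, 3))
```

Because both axes are logged, a power law appears as a straight line, matching the linear trend visible in the log-log panels of Figure 6.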
Table 3 shows the mean number of iterations required before termination for different tree sizes. The results hint towards a linear relationship; however, there is a very large variance.

Tree Size    2     5     10     20
ε = .01      32    125   542    2198
ε = .001     20    116   1083   2931

Table 3: Mean number of iterations required for convergence for trees of size N.
Figure 7: Mean iteration count for convergence for various sizes of trees.
More analysis is needed to understand the fundamental evolution of the ADMM algorithm. One avenue of interest would be interpreting the entire algorithm as a linear dynamical system. It can be shown that, using the stacked state vector below, the iteration can be written as

    [x_1(t+1); s_1(t+1); s_1(t); p_1(t+1); . . . ; x_M(t+1); s_M(t+1); s_M(t); p_M(t+1)]
        = A [x_1(t); s_1(t); s_1(t−1); p_1(t); . . . ; x_M(t); s_M(t); s_M(t−1); p_M(t)] + B z
This system can be treated in the same manner as other stochastic approximation problems. It would be interesting to apply methodologies similar to those used in [5] to uncover the empirically determined results relating convergence and graph spectrum, as well as to introduce damped update schemes to combat communication noise.
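The linear-systems view can be prototyped directly: any iteration ξ(t+1) = Aξ(t) + Bz converges if and only if the spectral radius of A is below one, with fixed point (I − A)⁻¹Bz. A sketch using a simple Richardson iteration for a toy least squares problem as a stand-in (the full ADMM matrices A and B above are not derived here):

```python
import numpy as np

# Toy quadratic problem: minimize ||z - H x||^2 via the linear iteration
# x(t+1) = (I - a H^T H) x(t) + a H^T z, i.e. xi(t+1) = A xi(t) + B z.
H = np.array([[1.0, 0.0], [1.0, -1.0], [0.0, 1.0]])
z = H @ np.array([0.3, -0.1])      # consistent (noiseless) data
a = 0.3
A = np.eye(2) - a * H.T @ H
B = a * H.T

rho = max(abs(np.linalg.eigvals(A)))            # spectral radius governs rate
x_star = np.linalg.solve(np.eye(2) - A, B @ z)  # fixed point of the iteration
print(rho, x_star)
```

Here ρ(A) < 1 guarantees geometric convergence at rate ρ(A) per step, and the fixed point coincides with the least squares solution; an analogous eigenvalue analysis of the stacked ADMM matrix would connect convergence speed to the graph spectrum.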
Conclusion

In Section 4, we investigated the CSE and ADMM algorithms, which have appeared in the distributed power system estimation literature, together with the distributed dual averaging method. It turns out that CSE and the dual averaging algorithm suffer from a large number of iterations and from the size of the messages that need to be sent. The ADMM algorithm is experimentally justified to be a good algorithm for power system state estimation for many reasons: it converges fastest among the three algorithms, and it requires small computational complexity, communication cost, and time. The only poor performance indicator for the ADMM algorithm is its tolerance to communication noise: it is less tolerant to communication noise than the CSE algorithm. Yet this effect is small for moderate numbers of termination steps. In comparison with the centralized algorithm, the ADMM algorithm may be more efficient in terms of computational complexity, communication time, and cost when the problem size is large. While prior work found the number of iterations in power system state estimation to grow more slowly than the size of the whole network, we found a large increase as the size of the network grew in the case of trees. Future work should also include experimental results on connected graphs to correspond with future theoretical findings.
References
[1] J. Duchi, A. Agarwal, and M. Wainwright. Dual averaging for distributed optimization: convergence analysis and network scaling. IEEE Transactions on Automatic Control, (99):1–11, 2010.
[2] S. Haykin. Adaptive Filter Theory (ISE). 2003.
[3] S. Kar, J.M.F. Moura, and K. Ramanan. Distributed parameter estimation in sensor networks: nonlinear observation models and imperfect communication. IEEE Transactions on Information Theory, 2012.
[4] V. Kekatos and G.B. Giannakis. Distributed robust power system state estimation. arXiv preprint arXiv:1204.0991, 2012.
[7] F.C. Schweppe. Power system static-state estimation, Part III: Implementation. IEEE Transactions on Power Apparatus and Systems, (1):130–135, 1970.
[8] F.C. Schweppe and D.B. Rom. Power system static-state estimation, Part II: Approximate model. IEEE Transactions on Power Apparatus and Systems, (1):125–130, 1970.
[9] L. Xie, D.H. Choi, and S. Kar. Cooperative distributed state estimation: local observability relaxed. In Power and Energy Society General Meeting, 2011 IEEE, pages 1–11. IEEE, 2011.