When I finished writing The Black Swan, in 2006, I was confronted with ideas of "great moderation" by people who did not realize that the process was getting fatter and fatter tails (from operational and financial leverage, complexity, interdependence, etc.), meaning fewer but deeper departures from the mean. The fact that nuclear bombs explode less often than regular shells does not make them safer. Needless to say that with the arrival of the events of 2008, I did not have to explain myself too much. Nevertheless, people in economics are still using the methods that led to the "great moderation" narrative, and Bernanke, the protagonist of the theory, had his mandate renewed.
I had argued that we were undergoing a switch from the top graph (continuous low-grade volatility) to the bottom one, with the process moving by jumps, with fewer and fewer variations outside of jumps.
[Figure: two sample paths, Position (80 to 140) against Time (50 to 250): top, continuous low-grade volatility; bottom, the process moving by jumps, with little variation outside the jumps.]
Long Peace.nb
2. Pinker conflates nonscalable Mediocristan (death from encounters with simple weapons) with scalable Extremistan (death from heavy shells and nuclear weapons). The two have markedly distinct statistical properties. Yet he uses the statistics of one to make inferences about the other. And the book does not grasp the core difference between the scalable and the nonscalable (although he tried to define power laws). He claims that crime has dropped, which does not mean anything concerning casualties from violent conflict.
3. Another way to see the conflation: Pinker works with a time series process without dealing with the notion of temporal homogeneity. Ancestral man had no nuclear weapons, so it is downright foolish to assume that the statistics of conflicts in the 14th century can apply to the 21st. A mean person with a stick is categorically different from a mean person with a nuclear weapon, so the emphasis should be on the weapon and not exclusively on the psychological makeup of the person.
4. The statistical discussions are disturbingly amateurish, which would not be a problem except that the point of his book is statistical. Pinker misdefines fat tails by talking about probability rather than the contribution of rare events to the higher moments; he somehow himself accepts power laws, with low exponents, but does not connect the dots that, if true, statistics can allow no claim about the mean of the process. Further, he assumes that data reveal their properties without inferential errors. He talks about the process switching from 80/20 to 80/02, when the first has a tail exponent of 1.16 and the other 1.06, meaning they are statistically indistinguishable. (Errors in the computation of tail exponents are at least 0.6, so this discussion is noise, and, as shown in [1], [2], the exponent is lower than 1.) (It is an error to talk of 80/20 and derive the statistics of cumulative contributions from samples rather than fit exponents; an 80/20-style statement is interpolative from the existing sample, hence biased to clip the tail, while exponents extrapolate.)
5. He completely misses the survivorship biases (which I called the Casanova effect) that make an observation by an observer whose survival depends on the observation probabilistically invalid, or, at the least, favorably biased. Had a nuclear event taken place, Signor Pinker would not have been able to write the book.
6. He calls John Gray's critique anecdotal, yet it is more powerful statistically (an argument of via negativa) than his >700 pages of pseudostats.
7. Psychologically, he complains about the lurid leading people to make inferences about the state of the system, yet he uses lurid arguments
to make his point.
8. You can look at the data he presents and actually see a rise in war effects, comparing pre-1914 to post-1914.
9. Recursing a Bit (Point added Nov 8): Had a book proclaiming The Long Peace been published in 1913, it would have carried similar arguments to those in Pinker's book.
Technical Appendix
Pinker does not understand the difference between probability and expectation (a drop in observed volatility/fluctuation is not a drop in risk), nor the incompatibility of his claims with his acceptance of fat tails (he does not understand asymmetries, judging from his posts on FB and private correspondence). Yet it was Pinker who said "what is the actual risk for any individual? It is approaching zero."
Now Pinker is excusable. The practice is widespread in social science, where academics use mechanistic techniques of statistics without understanding the properties of the statistical claims. And in some areas not involving time series, the difference between M* and M is negligible. So I rapidly jot down a few rules before showing proofs and derivations (limiting M to the arithmetic mean), where E is the expectation operator under the real-world probability measure P:
1. Tails Sampling Property: E[|M* - M|] increases with fat-tailedness (the mean deviation of M* seen from the realizations in different samples of the same process). In other words, fat tails tend to mask the distributional properties.
2. Counterfactual Property: Another way to view the previous point: the distance between different values of M* one gets from repeated sampling of the process (say, counterfactual history) increases with fat tails.
3. Survivorship Bias Property: E[M* - M] increases under the presence of an absorbing barrier for the process (the Casanova effect).
4. Left Tail Sample Insufficiency: E[M* - M] increases with the negative skewness of the true underlying variable.
5. Asymmetry in Inference: Under both negative skewness and fat tails, negative deviations from the mean are more informational than positive deviations.
6. Power of Extreme Deviations (N = 1 is OK): Under fat tails, large deviations from the mean are vastly more informational than small ones. They are not anecdotal. (The last two properties correspond to the black swan problem.)
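A minimal Monte Carlo sketch of the first two properties (mine, not from the original; a Pareto with tail exponent 3/2 and an exponential, both with true mean 3, are illustrative choices): the observed mean M* of the fat-tailed variable disperses far more across repeated samples.

```python
import numpy as np

rng = np.random.default_rng(42)
n, runs, true_mean = 1000, 2000, 3.0

# Fat-tailed: Pareto with tail exponent 3/2 and x_min = 1, mean = alpha/(alpha-1) = 3
alpha = 1.5
pareto_means = (rng.uniform(size=(runs, n)) ** (-1.0 / alpha)).mean(axis=1)

# Thin-tailed comparison: exponential with the same true mean
expo_means = rng.exponential(scale=3.0, size=(runs, n)).mean(axis=1)

# E[|M* - M|]: dispersion of the observed mean across counterfactual samples
mad_pareto = np.abs(pareto_means - true_mean).mean()
mad_expo = np.abs(expo_means - true_mean).mean()
print(mad_pareto, mad_expo)  # the fat-tailed M* is markedly more dispersed
```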
Fig 1: First 100 years (Sample Path): A Monte Carlo generated realization of a process of the 80/20 or 80/02 style as described in Pinker's book, that is, tail exponent α = 1.1, dangerously close to 1.
Fig 2: The Turkey Surprise: Now 200 years, the second 100 years dwarf the first; these are realizations of the exact same process, seen with a longer window.
$$\hat{\kappa}_q = \frac{\sum_{i=1}^{n} X_i \,\mathbb{1}_{X_i > \hat{h}(q)}}{\sum_{i=1}^{n} X_i} \tag{1}$$

where $\hat{h}(q)$ is the estimated exceedance threshold for the probability $q$:

$$\hat{h}(q) = \inf\Big\{h : \frac{1}{n}\sum_{i=1}^{n} \mathbb{1}_{x_i > h} \le q\Big\}$$
I. INTRODUCTION
Vilfredo Pareto noticed that 80% of the land in Italy
belonged to 20% of the population, and vice-versa, thus both
giving birth to the power law class of distributions and the
popular saying 80/20. The self-similarity at the core of the
property of power laws [1] and [2] allows us to recurse and
reapply the 80/20 to the remaining 20%, and so forth until one
obtains the result that the top percent of the population will
own about 53% of the total wealth.
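That recursion can be checked with three lines of arithmetic (a sketch; the only ingredient is the Pareto relation that the top fraction f owns a share f^(1-1/α)):

```python
import math

def tail_exponent(share: float, fraction: float) -> float:
    """Solve fraction**(1 - 1/alpha) = share for the tail exponent alpha."""
    return 1.0 / (1.0 - math.log(share) / math.log(fraction))

alpha_8020 = tail_exponent(0.80, 0.20)  # tail exponent implied by "80/20"
alpha_8002 = tail_exponent(0.80, 0.02)  # tail exponent implied by "80/02"
top1_share = 0.01 ** (1.0 - 1.0 / alpha_8020)  # implied share of the top percentile
print(round(alpha_8020, 2), round(alpha_8002, 2), round(top1_share, 2))  # 1.16 1.06 0.53
```

This reproduces both the roughly 53% share of the top percentile and the 1.16 vs 1.06 exponents quoted in the Pinker discussion.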
It looks like such a measure of concentration can be seriously biased, depending on how it is measured, so it is very likely that the true ratio of concentration in what Pareto observed, that is, the share of the top percentile, was closer to 70%, hence changes year-on-year would drift higher, converging to such a level from larger samples. In fact, as we will show in this discussion, for, say, wealth, more complete samples resulting from technological progress, and also larger population and economic growth, will make such a measure converge by increasing over time, for no other reason than expansion in sample space or aggregate value.
The core of the problem is that, for the class of one-tailed fat-tailed random variables, that is, bounded on the left and unbounded on the right, with the random variable $X \in [x_{\min}, \infty)$, the in-sample quantile contribution is a biased estimator of the true value of the actual quantile contribution.
The true value is

$$\kappa_q = \frac{E\big[X\,\mathbb{1}_{X > h(q)}\big]}{E[X]} = \frac{\int_{h(q)}^{\infty} x\,\phi(x)\,dx}{\int_{x_{\min}}^{\infty} x\,\phi(x)\,dx} = q\,\frac{E[X \mid X > h(q)]}{E[X]} \tag{2}$$
Estimates of $\hat{\kappa}(n)$ across Monte Carlo runs:

$\hat{\kappa}(n)$          Mean       Median     STD across MC runs
$\hat{\kappa}(10^3)$    0.405235   0.367698   0.160244
$\hat{\kappa}(10^4)$    0.485916   0.458449   0.117917
$\hat{\kappa}(10^5)$    0.539028   0.516415   0.0931362
$\hat{\kappa}(10^6)$    0.581384   0.555997   0.0853593
$\hat{\kappa}(10^7)$    0.591506   0.575262   0.0601528
$\hat{\kappa}(10^8)$    0.606513   0.593667   0.0461397
$\hat{\kappa}_q(n)$ is "of the order of" $c(\alpha, q)\,n^{-b(q)(\alpha-1)}$, where the constants $b(q)$ and $c(\alpha, q)$ need to be evaluated. Simulations suggest that $b(q) = 1$, whatever the value of $\alpha$ and $q$, but the rather slow convergence of the estimator and of its standard deviation to 0 makes precise estimation difficult.
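The pattern in the table can be reproduced with a short simulation (a sketch, not the original code; a Pareto with α = 1.1 and q = 0.01 are assumed, for which the true value is κ_q = q^(1-1/α) ≈ 0.658):

```python
import numpy as np

rng = np.random.default_rng(7)
alpha, q = 1.1, 0.01
kappa_true = q ** (1 - 1 / alpha)  # ~0.658

def kappa_hat(n: int, runs: int) -> float:
    """Average in-sample share of the total held by the top q of observations."""
    est = np.empty(runs)
    for r in range(runs):
        x = rng.uniform(size=n) ** (-1 / alpha)  # Pareto(alpha), x_min = 1
        top = np.sort(x)[-max(1, int(q * n)):]
        est[r] = top.sum() / x.sum()
    return est.mean()

k3, k5 = kappa_hat(10**3, 300), kappa_hat(10**5, 100)
print(k3, k5, kappa_true)  # the estimate rises with n yet stays below the true value
```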
2) General Case: In the general case, let us fix the threshold $h$ and define:

$$\kappa_h = \frac{E\big[X\,\mathbb{1}_{X>h}\big]}{E[X]} = P(X > h)\,\frac{E[X \mid X > h]}{E[X]}$$
1 This value, which is lower than the estimated exponents one can find in the literature (around 2), is, following [7], a lower estimate which cannot be excluded from the observations.
Let $A(n) = \sum_{i=1}^{n} X_i\,\mathbb{1}_{X_i > h}$ and $S(n) = \sum_{i=1}^{n} X_i$, so that $\hat{\kappa}_h(n) = \frac{A(n)}{S(n)}$. Adding one observation,

$$\hat{\kappa}_h(n+1) = \frac{A(n) + X_{n+1}\,\mathbb{1}_{X_{n+1} > h}}{S(n) + X_{n+1}}$$

which is now concave in $X_{n+1}$, so that uncertainty on $X_{n+1}$ reduces its value. The competition between these two opposite effects is in favor of the latter, because of a higher concavity with respect to the variable, and also of a higher variability (whatever its measurement) of the variable conditionally to being above the threshold than to being below. The fatter the right tail of the distribution, the stronger the effect. Overall, we find that $E\big[\hat{\kappa}_h(n)\big] \le \frac{E[A_h(n)]}{E[S(n)]} = \kappa_h$ (note that unfreezing the threshold $\hat{h}(q)$ also tends to reduce the concentration measure estimate, adding to the effect when introducing one extra sample, because of a slight increase in the expected value of the estimator $\hat{h}(q)$).
This inequality is still valid with $\hat{\kappa}_q$:

$$E\big[\hat{\kappa}_q(N)\big] \ge E\Big[\sum_{i=1}^{m} \frac{S_i}{S}\,\hat{\kappa}_q(N_i)\Big]$$
Let $S = \sum_{i=1}^{m} X_i$ and $S' = \sum_{i=1}^{n} X'_i$. We define $A = \sum_{i=1}^{mq} X_{[i]}$, where $X_{[i]}$ is the $i$-th largest value of $(X_1, \ldots, X_m)$, and $A' = \sum_{i=1}^{nq} X'_{[i]}$, where $X'_{[i]}$ is the $i$-th largest value of $(X'_1, \ldots, X'_n)$. We also set $S'' = S + S'$ and $A'' = \sum_{i=1}^{(m+n)q} X''_{[i]}$, where $X''_{[i]}$ is the $i$-th largest value of the combined sample, and claim that

$$E\big[\hat{\kappa}''\big] \ge E\Big[\frac{S}{S''}\Big]\,E\big[\hat{\kappa}\big] + E\Big[\frac{S'}{S''}\Big]\,E\big[\hat{\kappa}'\big]$$
2 The same concavity and general bias apply when the distribution is lognormal, and are exacerbated by high variance.
We observe that $A = \max_{J \subset \{1,\ldots,m\},\,|J| = mq} \sum_{i \in J} X_i$ and, similarly, $A' = \max_{J' \subset \{1,\ldots,n\},\,|J'| = qn} \sum_{i \in J'} X'_i$ and $A'' = \max_{J'' \subset \{1,\ldots,m+n\},\,|J''| = q(m+n)} \sum_{i \in J''} X_i$, where we have denoted $X_{m+i} = X'_i$ for $i = 1 \ldots n$. If $J \subset \{1,\ldots,m\}$, $|J| = mq$, and $J' \subset \{m+1,\ldots,m+n\}$, $|J'| = qn$, then $J'' = J \cup J'$ has cardinal $q(m+n)$, hence $A + A' = \sum_{i \in J''} X_i \le A''$, whatever the particular sample. Therefore

$$\hat{\kappa}'' \ge \frac{S}{S''}\,\hat{\kappa} + \frac{S'}{S''}\,\hat{\kappa}'$$
and we have:

$$E\big[\hat{\kappa}''\big] \ge E\Big[\frac{S}{S''}\,\hat{\kappa}\Big] + E\Big[\frac{S'}{S''}\,\hat{\kappa}'\Big]$$

with $E\Big[\frac{S}{S''}\,\hat{\kappa}\Big] = E\Big[\frac{A}{S''}\Big]$; it remains to show that $E\Big[\frac{A}{S''}\Big] \ge E\Big[\frac{S}{S''}\Big]\,E\big[\hat{\kappa}\big]$ (and similarly for the primed terms), which yields

$$E\big[\hat{\kappa}''\big] \ge E\Big[\frac{S}{S''}\Big]\,E\big[\hat{\kappa}\big] + E\Big[\frac{S'}{S''}\Big]\,E\big[\hat{\kappa}'\big]$$
Decompose $S$ at the threshold $T$: $B = \sum_{i=1}^{n} X_i\,\mathbb{1}_{X_i < T}$ and $A = S - B$. Then

$$E\Big[\frac{S}{S''}\,\hat{\kappa}\Big] = \iint \frac{a+b}{a+b+s'}\,\frac{a}{a+b}\;p_A(t, da)\,p_B(t, db)\,q(dt)\,p_0(ds')$$

For given $b$, $t$ and $s'$, $a \mapsto \frac{a}{a+b+s'}$ and $a \mapsto \frac{a}{a+b}$ are two increasing functions of the same variable $a$, hence, conditionally to $T$, $B$ and $S'$, we have:
$$E\Big[\frac{S}{S''}\,\hat{\kappa}\;\Big|\;T, B, S'\Big] = E\Big[\frac{A}{A+B+S'}\;\Big|\;T, B, S'\Big] \ge E\Big[\frac{A+B}{A+B+S'}\;\Big|\;T, B, S'\Big]\;E\Big[\frac{A}{A+B}\;\Big|\;T, B, S'\Big]$$

and, integrating over $T$, $B$ and $S'$:

$$E\Big[\frac{S}{S''}\,\hat{\kappa}\Big] \ge E\Big[\frac{S}{S''}\Big]\,E\big[\hat{\kappa}\big]$$
$$E\big[\hat{\kappa}_q(n, X)\big] \ge E\Big[\sum_{i=1}^{m} \frac{S(\omega_i n, X_i)}{S(n, X)}\,\hat{\kappa}_q(\omega_i n, X_i)\Big]$$

When $n \to +\infty$, each ratio $\frac{S(\omega_i n, X_i)}{S(n, X)}$ converges almost surely to $\omega_i$ respectively; therefore we have the following convexity inequality:
$$\kappa_q(X) \ge \sum_{i=1}^{m} \omega_i\,\kappa_q(X_i)$$

(with $\kappa_h = \lim_{n \to +\infty} \hat{\kappa}_h(n)$ almost surely). For a Pareto variable the CDF is $F(x) = 1 - \left(\frac{x}{x_{\min}}\right)^{-\alpha}$ and we have:

$$\kappa_q(X) = q^{1-\frac{1}{\alpha}} \qquad \text{and} \qquad \frac{d^2}{d\alpha^2}\,\kappa_q(X) = q^{1-\frac{1}{\alpha}}\,\frac{(\log q)\left(\log q - 2\alpha\right)}{\alpha^4} > 0$$

so that $\kappa_q(X)$ is convex in $\alpha$, and

$$\sum_{i=1}^{m} \omega_i\,\kappa_q(X_i) \ge \kappa_q(X^*)$$

where $X^*$ has tail exponent $\alpha^* = \sum_{i=1}^{m} \omega_i\,\alpha_i$.
Suppose now that X is a positive random variable with unknown distribution, except that its tail decays like a power law with unknown exponent, so that its density takes the form of the mixture

$$\phi(x) = \sum_{i=1}^{N} \omega_i\,L_i(x)\,x^{-\alpha_i - 1} \tag{3}$$

with the $L_i$ slowly varying functions; an unbiased estimation must then account for the uncertainty in the exponent itself.
So in practice an estimated $\alpha$ of around 3/2, sometimes called the "half-cubic" exponent, would produce similar results as values of $\alpha$ much closer to 1, as we used in the previous section. Simply, $\kappa_q(\alpha)$ is convex, and dominated by the second-order effect

$$\frac{\ln(q)\,q^{1-\frac{1}{\alpha+\delta}}\left(\ln(q) - 2(\alpha+\delta)\right)}{(\alpha+\delta)^4},$$

an effect that is exacerbated at lower values of $\alpha$: with an uncertainty $\pm\delta$ on the exponent,

$$\frac{1}{2}\left(q^{1-\frac{1}{\alpha-\delta}} + q^{1-\frac{1}{\alpha+\delta}}\right) > q^{1-\frac{1}{\alpha}}.$$

To show how unreliable the measures of inequality concentration from quantiles are, consider that a standard error of 0.3 in the measurement of $\alpha$ causes $\kappa_q(\alpha)$ to rise by 0.25.

[Figure: $\hat{\kappa}$ for $n = 10^4$ plotted against wealth (60,000 to 120,000); vertical scale 0.3 to 1.0.]
tend to fail with fat-tailed data, see [6]. But, in addition, the problem here is worse: even if such "robust" methods were deemed unbiased, a method of direct centile estimation is still linked to a static and specific population and does not aggregate. Accordingly, such techniques do not allow us to make statistical claims or scientific statements about the true properties, which should necessarily carry out of sample.
Take an insurance (or, better, reinsurance) company. The "accounting" profits in a year in which there were few claims do not reflect on the "economic" status of the company, and it is futile to make statements on the concentration of losses per insured event based on a single-year sample. The "accounting" profits are not used to predict variations year-on-year; rather, the exposure to tail (and other) events, analyses that take into account the stochastic nature of the performance. This difference between "accounting" (deterministic) and "economic" (stochastic) values matters for policy making, particularly under fat tails. The same with wars: we do not estimate the severity of a (future) risk based on past in-sample historical data.
B. How Should We Measure Concentration?
Practitioners of risk management now tend to compute CVaR and other metrics, methods that are extrapolative and nonconcave, such as the information from the $\alpha$ exponent, taking the one closer to the lower bound of the range of exponents, as we saw in our extension to Theorem 2, and rederiving the corresponding $\kappa_q$, or, more rigorously, integrating the functions of $\alpha$ across the various possible states. Such methods of adjustment are less biased and do not get mixed up with problems of aggregation; they are similar to the "stochastic volatility" methods in mathematical finance that consist in adjustments to option prices by adding a "smile" to the standard deviation, in proportion to the variability of the parameter representing volatility and the errors in its measurement. Here it would be "stochastic alpha" or "stochastic tail exponent".6 By extrapolative, we mean the built-in extension of the tail in the measurement by taking into account realizations outside the sample path that are in excess of the extrema observed.7 8
ACKNOWLEDGMENT
The late Benot Mandelbrot, Branko Milanovic, Dominique
Guguan, Felix Salmon, Bruno Dupire, the late Marc Yor,
Albert Shiryaev, an anonymous referee, the staff at Luciano
Restaurant in Brooklyn and Naya in Manhattan.
6 Also note that, in addition to the centile estimation problem, some authors such as [11], when dealing with censored data, use Pareto interpolation for insufficient information about the tails (based on the tail parameter), filling in the bracket with the conditional average bracket contribution, which is not the same thing as using a full power-law extension; such a method retains a significant bias.
7 Even using a lognormal distribution, by fitting the scale parameter, works to some extent, as a rise of the standard deviation extrapolates probability mass into the right tail.
8 We also note that the theorems would also apply to Poisson jumps, but we focus on the power-law case in the application, as the methods for fitting Poisson jumps are interpolative and have proved to be easier to fit in-sample than out of sample, see [6].
REFERENCES
[1] B. Mandelbrot, "The Pareto-Lévy law and the distribution of income," International Economic Review, vol. 1, no. 2, pp. 79-106, 1960.
[2] B. Mandelbrot, "The stable Paretian income distribution when the apparent exponent is near two," International Economic Review, vol. 4, no. 1, pp. 111-115, 1963.
[3] C. Dagum, "Inequality measures between income distributions with applications," Econometrica, vol. 48, no. 7, pp. 1791-1803, 1980.
[4] C. Dagum, Income Distribution Models. Wiley Online Library, 1983.
[5] S. Singh and G. Maddala, "A function for size distribution of incomes: reply," Econometrica, vol. 46, no. 2, 1978.
[6] N. N. Taleb, Silent Risk: Lectures on Probability, vol. 1. Available at SSRN 2392310, 2015.
[7] M. Falk et al., "On testing the extreme value index via the POT-method," The Annals of Statistics, vol. 23, no. 6, pp. 2013-2035, 1995.
[8] X. Gabaix, "Power laws in economics and finance," National Bureau of Economic Research, Tech. Rep., 2008.
[9] T. Piketty, Capital in the 21st Century, 2014.
[10] S. Pinker, The Better Angels of Our Nature: Why Violence Has Declined. Penguin, 2011.
[11] T. Piketty and E. Saez, "The evolution of top incomes: a historical and international perspective," National Bureau of Economic Research, Tech. Rep., 2006.
This is extracted from Chapter 6 of Silent Risk. This chapter is in progress and comments are welcome. It has been slowed down by the author's tinkering with explicit expressions for partial expectations of asymmetric alpha-stable distributions and the accidental discovery of semi-closed-form techniques for assessing convergence for convolutions of one-tailed power laws.
You observe data and get some confidence that the average is represented by the sample thanks to a standard metrified "n". Now what if the data were fat-tailed? How much more do you need? What if the model were uncertain, that is, we had uncertainty about the parameters or the probability distribution itself?
Main Results: In addition to explicit extractions of partial expectations for alpha-stable distributions, one main result in this paper is the expression of how uncertainty about parameters (in terms of parameter volatility) translates into a larger (or smaller) required n. Model Uncertainty: The practical import is that model uncertainty worsens inference, in a quantifiable way.
Fig. 1: How thin tails (Gaussian) and fat tails ($1 < \alpha \le 2$) converge to the mean.
$$E\left(\Big|\frac{1}{n}\sum_{i=1}^{n} X_i - m\Big|\right) \tag{1}$$
1 The Pinker Problem: a class of naive empiricism. It has been named so in reference to the sloppy use of statistical techniques in social science and policy making, based on a theory promoted by the science writer S. Pinker [1] about the drop of violence, a theory that rests on such statistical fallacies since wars, unlike domestic violence, are fat-tailed. But this is a very general problem with the (irresponsible) mechanistic use of statistical methods in social science and biology.
[Figure: the cumulant ratio $\frac{C_2}{C_1}$ as a function of $\alpha$ (1.0 to 3.0); vertical scale 1.4 to 1.7.]
And since we know that convergence for the Gaussian happens at speed $n^{\frac{1}{2}}$, we can compare the convergence of other classes. We are expressing in Equation 1 the expected error (that is, a risk function) in $L^1$ as mean absolute deviation from the observed average, to accommodate the absence of variance, while assuming of course the existence of the first moment, without which there is no point discussing averages.
Typically, in statistical inference, one uses standard deviations of the observations to establish the sufficiency of n. But in fat-tailed data standard deviations do not exist, or, worse, when they exist, as in power laws with tail exponent $\alpha > 3$, they are extremely unstable, particularly in cases where kurtosis is infinite.
Using mean deviations of the samples (when these exist) doesn't accommodate the fact that fat-tailed data hide properties. The "volatility of volatility", or the dispersion around the mean deviation, increases nonlinearly as the tails get fatter. For instance, a stable distribution with tail exponent at 3/2 matched to exactly the same mean deviation as the Gaussian will deliver measurements of mean deviation 1.4 times as unstable as the Gaussian.
Using mean absolute deviation for "volatility", and its mean deviation, the "volatility of volatility", expressed in the $L^1$ norm as the $C_1$ and $C_2$ cumulants:

$$C_1 = E\big(|X - m|\big)$$

$$C_2 = E\Big(\big|\,|X - m| - E(|X - m|)\,\big|\Big)$$

We can see that matching mean deviations does not go very far in matching cumulants (see Appendix 1).
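A quick stand-in experiment (mine, not the author's; a Student T with 3 degrees of freedom replaces the stable distribution, rescaled so that C1 matches the Gaussian): the matched fat-tailed variable shows a visibly larger C2.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 10**6
g = rng.normal(size=N)            # Gaussian benchmark
t = rng.standard_t(df=3, size=N)  # fat-tailed stand-in

def c1_c2(x):
    """First two L1 'cumulants': mean deviation and its own mean deviation."""
    d = np.abs(x - x.mean())
    return d.mean(), np.abs(d - d.mean()).mean()

c1_g, c2_g = c1_c2(g)
t_matched = t * (c1_g / c1_c2(t)[0])  # rescale so C1 matches the Gaussian's
c1_t, c2_t = c1_c2(t_matched)
print(c2_t / c2_g)  # > 1: same mean deviation, more "volatility of volatility"
```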
Further, a sum of Gaussian variables will have its extreme values distributed as a Gumbel, while a sum of fat-tailed variables will follow a Fréchet distribution regardless of the number of summands. The difference is not trivial, as shown in the figures: in $10^6$ realizations of an average with 100 summands, we can expect to observe maxima more than 4,000 times the average, while for a Gaussian we can hardly encounter more than 5 times.
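The difference can be sketched in a few lines (an illustration; a Pareto with tail exponent 1.1 stands in for the fat-tailed case):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10**5
gauss = rng.normal(size=n)
pareto = rng.uniform(size=n) ** (-1 / 1.1)  # Pareto, tail exponent 1.1

# Gumbel-type extremes: the Gaussian max stays a few sigmas out
print(gauss.max())
# Frechet-type extremes: the Pareto max is a huge multiple of the sample mean
print(pareto.max() / pareto.mean())
```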
II. GENERALIZING MEAN DEVIATION AS PARTIAL EXPECTATION
It is unfortunate that even if one matches mean deviations, the dispersion of the distributions of the mean deviations (and their skewness) would be such that a "tail" would remain markedly different in spite of a number of summands that allows the matching of the first-order cumulant. So we can match the "special" part of the distribution, the expectation $> K$ or $< K$, where $K$ can be any arbitrary level.
Let $\phi(t)$ be the characteristic function of the random variable. Let $\theta$ be the Heaviside theta function. Since $\mathrm{sgn}(x) = 2\theta(x) - 1$,

$$\Theta(t) = \int_{-\infty}^{\infty} e^{itx}\left(2\theta(x - K) - 1\right)dx = \frac{2i\,e^{iKt}}{t}$$
And the special expectation becomes, by convoluting the Fourier transforms:

$$E\big(X\,\mathbb{1}_{X>K}\big) = i\,\frac{\partial}{\partial t}\int_{-\infty}^{\infty} \Theta(t-u)\,\phi(u)\,du\,\Big|_{t=0} \tag{2}$$

Mean deviation becomes a special case of equation 2: $E(|X|) = E\big(X\,\mathbb{1}_{X>\mu}\big) + E\big(-X\,\mathbb{1}_{X<\mu}\big) = 2\,E\big(X\,\mathbb{1}_{X>\mu}\big)$.
For $X \sim S(\alpha, \beta, \mu, \sigma)$, with $0 < \alpha \le 2$:

$$\phi(t) = \exp\left(i\mu t - |\sigma t|^{\alpha}\left(1 - i\beta\tan\Big(\frac{\pi\alpha}{2}\Big)\,\mathrm{sgn}(t)\right)\right)$$

which, for an $n$-summed variable (the equivalent of mixing with equal weights), becomes:

$$\phi(t) = \exp\left(i n \mu t - n\,|\sigma t|^{\alpha}\left(1 - i\beta\tan\Big(\frac{\pi\alpha}{2}\Big)\,\mathrm{sgn}(t)\right)\right)$$
A. Results
$$E\big(X\,\mathbb{1}_{X>K}\big) = \frac{1}{2\pi}\int_{-\infty}^{\infty} |u|^{\alpha-1}\left(1 + i\beta\tan\Big(\frac{\pi\alpha}{2}\Big)\,\mathrm{sgn}(u)\right)\exp\left(-|u|^{\alpha}\left(1 - i\beta\tan\Big(\frac{\pi\alpha}{2}\Big)\,\mathrm{sgn}(u)\right) + iKu\right)du \tag{3}$$

which yields, for $K = 0$, a closed form $E(\alpha, \beta, \mu, 0)$ in terms of $\left(1 + i\beta\tan\frac{\pi\alpha}{2}\right)^{1/\alpha}$ and $\left(1 - i\beta\tan\frac{\pi\alpha}{2}\right)^{1/\alpha}$, and, for general $K$, the series

$$E(\alpha, \beta, \mu, K) = E(\alpha, \beta, \mu, 0) + \sum_{k=1}^{\infty}\frac{(-1)^{k}\,(iK)^{k+1}}{k!}\;\cdots \tag{4}$$
Our formulation in Equation 4 generalizes and simplifies the commonly used one from Wolfe [5], from which Hardin [6] got the explicit form, promoted in Samorodnitsky and Taqqu [4] and Zolotarev [3]:

$$E(|X|) = \frac{2}{\pi}\,\Gamma\Big(1-\frac{1}{\alpha}\Big)\left(\beta^{2}\tan^{2}\Big(\frac{\pi\alpha}{2}\Big)+1\right)^{\frac{1}{2\alpha}}\cos\left(\frac{1}{\alpha}\tan^{-1}\Big(\beta\tan\Big(\frac{\pi\alpha}{2}\Big)\Big)\right)$$
1) Relative convergence: The general case with $\beta \neq 0$: for so and so, assuming so and so, (precisions) etc.:

$$n_\alpha = \cdots\; n_g^{\frac{\alpha}{2(\alpha-1)}}\left(\left(1 - i\beta\tan\frac{\pi\alpha}{2}\right)^{1/\alpha} + \left(1 + i\beta\tan\frac{\pi\alpha}{2}\right)^{1/\alpha}\right)^{\frac{\alpha}{\alpha-1}}\cdots \tag{5}$$

which in the symmetric case $\beta = 0$ reduces to:

$$n_\alpha = \cdots\; n_g^{\frac{\alpha}{2(\alpha-1)}} \tag{6}$$

$$\frac{n_\alpha}{n_g} \to 1 \tag{7}$$

$$E\left[\Big|\sum_{i=1}^{kn}\frac{X_i}{kn} - m\Big|\right] \Big/ E\left[\Big|\sum_{i=1}^{n}\frac{X_i}{n} - m\Big|\right] = k^{\frac{1}{\alpha}-1} \tag{8}$$
TABLE I: Corresponding $n_\alpha$, or how many observations are needed for an equivalent $\alpha$-stable distribution. The Gaussian case is $\alpha = 2$. For the case with tails equivalent to the 80/20 one needs $10^{11}$ times more data than for the Gaussian.

α        n_α (β = 0)       n_α (β = ±1/2)    n_α (β = ±1)
1        Fughedaboudit     -                 -
9/8      6.09 × 10^12      2.8 × 10^13       1.86 × 10^14
5/4      574,634           895,952           1.88 × 10^6
11/8     5,027             6,002             8,632
3/2      567               613               737
13/8     165               171               186
7/4      75                77                79
15/8     44                44                44
2        30                30                30
Proof. (Sketch) From the characteristic function of the stable distribution. Other distributions need to converge to the basin
S.
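The order of magnitude in Table I can be sketched from scaling alone (a simplification under stated assumptions: the L1 error of a stable mean shrinks as n^(1/α-1), the Gaussian's as n^(-1/2), and the constants of Eq. 5 are dropped, so the values come out smaller than the table's):

```python
def n_alpha(n_gauss: float, alpha: float) -> float:
    """n needed for an alpha-stable mean to match the L1 error of a Gaussian
    mean estimated with n_gauss points (prefactor constants omitted)."""
    # n_a**(1/alpha - 1) = n_gauss**(-1/2)  =>  n_a = n_gauss**(alpha/(2*(alpha-1)))
    return n_gauss ** (alpha / (2 * (alpha - 1)))

print(n_alpha(30, 2.0))    # 30.0: the Gaussian benchmark is recovered
print(n_alpha(30, 1.5))    # ~164: already far more than the Gaussian
print(n_alpha(30, 1.125))  # explodes as alpha approaches 1
```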
B. Stochastic Alpha or Mixed Samples

Define the mixed population $X_{\bar{\alpha}}$ and $\sigma(X_{\bar{\alpha}})$ as the mean deviation of ...

Proposition 1. For so and so:

$$\sigma(X_{\bar{\alpha}}) \le \sum_{i=1}^{m} \omega_i\,\sigma(X_{\alpha_i})$$

where $\bar{\alpha} = \sum_{i=1}^{m} \omega_i\,\alpha_i$ and $\sum_{i=1}^{m} \omega_i = 1$.
Proof. A sketch for now: $\forall\,\alpha \in (1, 2)$, where $\gamma$ is the Euler-Mascheroni constant $\approx 0.5772$, $\psi^{(1)}$ the first derivative of the Polygamma function $\psi(x) = \Gamma'[x]/\Gamma[x]$, and $H_n$ the $n$-th harmonic number:

$$\frac{\partial^2 \sigma}{\partial \alpha^2} = \frac{n^{\frac{1}{\alpha}-1}}{\alpha^4}\left(\cdots\big(1 + \log(n) + \cdots - H_{\cdots}\big)^2 + \cdots\;\psi^{(1)}\big(\cdots\big)\right)$$
[Fig. 3: Asymmetries and Mean Deviation: $\sigma(|X|)$ against $\beta \in (-1, 1)$ for $\alpha = 5/4$, $3/2$, $7/4$.]

[Fig. 4: the same quantity against $\alpha$ (1.4 to 2.0); vertical scale 50 to 350.]
which is positive for values in the specified range, keeping $\alpha < 2$ as it would no longer converge to the Stable basin; and which is also negative with respect to alpha, as can be seen in Figure 4. The implication is that one's sample underestimates the required "n". (Commentary).
IV. SYMMETRIC NONSTABLE DISTRIBUTIONS IN THE SUBEXPONENTIAL CLASS
A. Symmetric Mixed Gaussians, Stochastic Mean

While mixing Gaussians the kurtosis rises, which makes it convenient to simulate fat-tailedness. But mixing means has the opposite effect, as if it were more "stabilizing". We can observe a similar effect of "thin-tailedness" as far as the $n$ required to match the standard benchmark. The situation is the result of multimodality, noting that stable distributions are unimodal (Ibragimov and Chernin [7]) and infinitely divisible (Wolfe [8]). For $X_i$ Gaussian with mean $\mu$, $E\big[|X|\big] = \mu\,\mathrm{erf}\Big(\frac{\mu}{\sqrt{2}\,\sigma}\Big) + \sqrt{\frac{2}{\pi}}\,\sigma\,e^{-\frac{\mu^2}{2\sigma^2}}$, and keeping the average $\pm\mu$ with probability 1/2 each. With the perfectly symmetric case (overall mean 0) and sampling with equal
probability:
$$\frac{1}{2}\left(E_+ + E_-\right) = \mu\,\mathrm{erf}\Big(\frac{\mu}{\sqrt{2}\,\sigma}\Big) + \sqrt{\frac{2}{\pi}}\,\sigma\,e^{-\frac{\mu^2}{2\sigma^2}}$$
Fig. 5: Different Speed: the fatter tailed processes are not just more uncertain; they also converge more slowly.
$$E\left[\Big|\sum_{i=1}^{kn}\frac{X_i - m}{kn}\Big|\right] \Big/ E\left[\Big|\sum_{i=1}^{n}\frac{X_i - m}{n}\Big|\right] = \frac{1}{c_1}\left(\cdots\,e^{-\frac{\cdots}{2\sigma^2}} + \cdots\,\mathrm{erf}\big(\cdots\big) + \cdots\,\mathrm{erf}\big(\cdots\big)\right) \tag{9}$$

where:

$$c_1 = k^{\cdots}, \qquad c_2 = 2^{7/2}\,\cdots$$
Note that, because of the instability of the distribution outside the basin, they end up converging to $S_{\min(\alpha, 2)}$, so at $k = 2$, $n = 1$, equation 9 becomes an equality, and as $k \to \infty$ we satisfy the equalities in ?? and 8.
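The folded-normal expression for E|X| used in this section can be verified by quadrature (a sketch; trapezoidal integration on a truncated grid):

```python
import numpy as np
from math import erf, exp, pi, sqrt

mu, sigma = 1.0, 2.0
# Closed form: E|X| = mu*erf(mu/(sqrt(2)*sigma)) + sqrt(2/pi)*sigma*exp(-mu^2/(2*sigma^2))
closed = mu * erf(mu / (sqrt(2) * sigma)) + sqrt(2 / pi) * sigma * exp(-mu**2 / (2 * sigma**2))

# Trapezoidal quadrature of |x| against the N(mu, sigma) density
x = np.linspace(mu - 12 * sigma, mu + 12 * sigma, 400001)
dx = x[1] - x[0]
pdf = np.exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * sqrt(2 * pi))
y = np.abs(x) * pdf
numeric = ((y[1:] + y[:-1]) / 2).sum() * dx
print(closed, numeric)  # agree to high precision
```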
Proof. (Sketch)
The characteristic function for $\alpha = \frac{3}{2}$:

$$\phi(t) = \frac{3^{3/8}\,|t|^{3/4}}{\sqrt{8}}\,\cdots\;K_{\frac{3}{4}}\Big(\cdots\Big)\;\cdots\;{}_2F_1\Big(\frac{5}{4},\, 2;\, \frac{7}{4};\, \cdots\Big)\,\cdots$$
Student T with 3 degrees of freedom (higher exponent resembles the Gaussian). We can get a semi-explicit density for the Cubic Student T:

$$p(x) = \frac{6\sqrt{3}}{\pi\left(x^2 + 3\right)^2}$$

we have:

$$\varphi(t) = E\big[e^{itX}\big] = \left(1 + \sqrt{3}\,|t|\right)e^{-\sqrt{3}\,|t|}$$

$$\varphi(t)^n = \left(1 + \sqrt{3}\,|t|\right)^n e^{-n\sqrt{3}\,|t|}$$

$$p(x) = \frac{1}{\pi}\int_{0}^{+\infty}\left(1 + \sqrt{3}\,t\right)^n e^{-n\sqrt{3}\,t}\cos(tx)\,dt$$

using

$$\int_{0}^{\infty} t^{k} e^{-t}\cos(st)\,dt = \frac{T_{1+k}\big(1/\sqrt{1+s^2}\big)\,k!}{\left(1+s^2\right)^{(k+1)/2}}$$
where $T_a(x)$ is the T-Chebyshev polynomial, the pdf $p(x)$ can be written:

$$p(x) = \frac{\sqrt{3}}{\pi}\,\cdots\sum_{k=0}^{n}\frac{n!}{(n-k)!}\,\cdots\;T_{k+1}\left(\frac{1}{\sqrt{\frac{x^2}{3n^2}+1}}\right)\cdots$$
which allows explicit solutions for specific values of $n$, but not for the general form:

$$\{E_n\}_{1 \le n < \infty} = \left\{\frac{2\sqrt{3}}{\pi},\;\cdots\right\}$$
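Both the cubic-Student density and the first term 2√3/π ≈ 1.1027 can be verified by direct numerical integration (a sketch on a truncated grid):

```python
import numpy as np
from math import pi, sqrt

x = np.linspace(-2000, 2000, 2_000_001)
dx = x[1] - x[0]
p = 6 * sqrt(3) / (pi * (x**2 + 3) ** 2)  # Student T density, 3 degrees of freedom

# Total probability mass (trapezoidal rule), should be ~1
mass = ((p[1:] + p[:-1]) / 2).sum() * dx
# Mean absolute deviation E|X|, should be ~2*sqrt(3)/pi
y = np.abs(x) * p
e_abs = ((y[1:] + y[:-1]) / 2).sum() * dx
print(mass, e_abs)
```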
[Figure: "Betting against the long shot": Bernoulli averages for p = 1/2, 1/8, 1/100, compared with the Gaussian.]
Consider a sum of Bernoulli variables $X$. The average $\frac{1}{n}\sum_{i \le n} x_i$ follows a Binomial Distribution. Assuming $np \in \mathbb{N}^{+}$ to simplify:

$$E\big(|X - np|\big) = 2\sum_{x \ge np}^{n}\left(x - np\right)\binom{n}{x}\,p^{x}\,(1-p)^{n-x}$$

which admits a closed form in terms of

$$\mathcal{H}_1 = {}_2F_1\left(1,\; n(p-1)+1;\; np+2;\; \frac{p}{p-1}\right)$$

and

$$\mathcal{H}_2 = {}_2F_1\left(2,\; n(p-1)+2;\; np+3;\; \frac{p}{p-1}\right)$$
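The identity behind this computation, E|X - m| = 2E[(X - m)+] when m = E[X], can be checked exactly for a small Binomial (a sketch with n = 20, p = 1/4, so that np = 5 is an integer as assumed):

```python
from math import comb

n, p = 20, 0.25
m = n * p  # = 5, an integer

pmf = [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]
# Two-sided mean absolute deviation
mad_direct = sum(abs(x - m) * pmf[x] for x in range(n + 1))
# One-sided sum doubled, as in the formula above
mad_onesided = 2 * sum((x - m) * pmf[x] for x in range(n + 1) if x > m)
print(mad_direct, mad_onesided)  # equal
```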
VII. ACKNOWLEDGEMENT
Colman Humphrey,...
APPENDIX: METHODOLOGY, PROOFS, ETC.

A. Cumulants
$$C_2^{g} = \cdots\;\mathrm{erf}\big(\cdots\big)\cdots + \cdots\,e^{-\cdots}\,\cdots$$

[Move to appendix]

B. Derivations using explicit E(|X|)

For $\alpha = 3/2$:

$$C_2^{\alpha=3/2} = \frac{1}{\pi}\,\cdots\,\Gamma\Big(\frac{5}{4}\Big)\cdots\Gamma\Big(\frac{3}{4}\Big)\cdots\Big(1536\,\cdots\,\mathcal{H}_2 + \cdots\Big)\cdots$$

where $\mathcal{H}_1 = {}_2F_1\Big(\frac{3}{4}, \frac{5}{4}; \frac{7}{4}; \cdots\Big)$ and $\mathcal{H}_2 = {}_2F_1\Big(\frac{1}{2}, \frac{5}{4}; \frac{3}{2}; \cdots\Big)$.

See Wolfe [5], from which Hardin got the explicit form [6].
C. Derivations using the Hilbert Transform and $\beta = 0$

Section obsolete since I found forms for asymmetric stable distributions. Some commentary on Hilbert transforms for symmetric stable distributions, given that for $Z = |X|$, $dF_Z(z) = dF_X(x)\left(1 + \mathrm{sgn}(x)\right)$, that type of thing.
Hilbert Transform for a function $f$ (see Hlusek [10], Pinelis [11]):

$$H(f) = \frac{1}{\pi}\;\mathrm{p.v.}\int_{-\infty}^{\infty}\frac{f(x)}{t-x}\,dx$$

Here p.v. means principal value in the Cauchy sense, in other words

$$\mathrm{p.v.}\int_{-\infty}^{\infty} = \lim_{a \to \infty}\,\lim_{b \to 0}\left(\int_{-a}^{-b} + \int_{b}^{a}\right)$$

In our case:

$$E(|X|) = \frac{\partial}{\partial t}\,H\big(\phi\big)(t)\Big|_{t=0} = \frac{1}{\pi}\,\frac{\partial}{\partial t}\;\mathrm{p.v.}\int_{-\infty}^{\infty}\frac{\phi(z)}{t-z}\,dz\,\Big|_{t=0}$$

$$E(|X|) = \frac{1}{\pi}\;\mathrm{p.v.}\int_{-\infty}^{\infty}\frac{\phi(z)}{z^{2}}\,dz$$

$$\mathrm{p.v.}\int_{-\infty}^{\infty}\frac{e^{-|t|^{\alpha}}}{t^{2}}\,dt = \cdots$$
REFERENCES
[1] S. Pinker, The Better Angels of Our Nature: Why Violence Has Declined. Penguin, 2011.
[2] V. V. Uchaikin and V. M. Zolotarev, Chance and Stability: Stable Distributions and Their Applications. Walter de Gruyter, 1999.
[3] V. M. Zolotarev, One-Dimensional Stable Distributions. American Mathematical Soc., 1986, vol. 65.
[4] G. Samorodnitsky and M. S. Taqqu, Stable Non-Gaussian Random Processes: Stochastic Models with Infinite Variance. CRC Press, 1994, vol. 1.
[5] S. J. Wolfe, "On the local behavior of characteristic functions," The Annals of Probability, pp. 862-866, 1973.
[6] C. D. Hardin Jr, "Skewed stable variables and processes," DTIC Document, Tech. Rep., 1984.
[7] I. Ibragimov and K. Chernin, "On the unimodality of geometric stable laws," Theory of Probability & Its Applications, vol. 4, no. 4, pp. 417-419, 1959.
[8] S. J. Wolfe, "On the unimodality of infinitely divisible distribution functions," Probability Theory and Related Fields, vol. 45, no. 4, pp. 329-335, 1978.
[9] I. Zaliapin, Y. Y. Kagan, and F. P. Schoenberg, "Approximating the distribution of Pareto sums," Pure and Applied Geophysics, vol. 162, no. 6-7, pp. 1187-1228, 2005.
[10] M. Hlusek, "On distribution of absolute values," 2011.
[11] I. Pinelis, "On the characteristic function of the positive part of a random variable," arXiv preprint arXiv:1309.5928, 2013.