Tails of Gaussian Distributions and Asymptotic Expansions

Asymptotics of Gaussian Tails

The Gaussian probability distribution is given by the expression

(1) P[x,\bar{x};\sigma] = \frac{1}{\sigma \sqrt{2\pi}} \exp\left( -\frac{1}{2} \left( \frac{x-\bar{x}}{\sigma} \right)^2 \right) ,

where P[x,\bar{x};\sigma] dx is the probability of obtaining a value between x and x + dx; the average of this distribution is \bar{x}; and its standard deviation is \sigma.

Suppose we wish to compute what fraction f[x_\star] of some population described by a Gaussian distribution P[x,\bar{x};\sigma] has a value greater than some threshold x_\star. This is given by the integral

(2) f[x_\star] = \int_{x_\star}^\infty P[x,\bar{x};\sigma] dx .

When this threshold is very large, namely (x_\star - \bar{x})/\sigma \gg 1, this integral serves as a good example of an asymptotic expansion, a subject I feel is not well taught in US physics curricula these days. In this case, the asymptotic expansion of f[x_\star] may be carried out simply via integration by parts. To begin, we write

(3) f[x_\star] = \frac{1}{\sigma \sqrt{2\pi}} \int_{x_\star}^\infty \frac{\partial_x \exp\left[-\frac{1}{2} \left( \frac{x-\bar{x}}{\sigma} \right)^2 \right]}{-(x-\bar{x})/\sigma^2} dx .

After one integration-by-parts,

(3′) f[x_\star] = \frac{1}{\sigma \sqrt{2\pi}} \left( \frac{\exp\left[-\frac{1}{2} \left( \frac{x_\star-\bar{x}}{\sigma} \right)^2 \right]}{(x_\star-\bar{x})/\sigma^2} - \int_{x_\star}^\infty \frac{dx}{(x-\bar{x})^2/\sigma^2} \frac{\partial_x \exp\left[-\frac{1}{2} \left( \frac{x-\bar{x}}{\sigma} \right)^2 \right]}{-(x-\bar{x})/\sigma^2} \right).

After n integrations by parts, we would find

(3”) f[x_\star] = \frac{\exp\left[-\frac{1}{2} \left( \frac{x_\star-\bar{x}}{\sigma} \right)^2 \right]}{\sqrt{2\pi} \cdot (x_\star-\bar{x})/\sigma} \left( 1 + \frac{A_2}{(x_\star - \bar{x})^2/\sigma^2} + \dots + \frac{A_{2(n-1)}}{(x_\star - \bar{x})^{2(n-1)}/\sigma^{2(n-1)}} \right) + B_{2n} \int_{x_\star}^\infty  \frac{\exp\left[-\frac{1}{2} \left( \frac{x-\bar{x}}{\sigma} \right)^2 \right]}{(x-\bar{x})^{2n}/\sigma^{2n-1}} dx ,

for some numerical constants \{ A_2, \dots, A_{2(n-1)} , B_{2n} \}. In particular, the first two terms read

(4) f[x_\star] = \frac{\exp\left[-\frac{1}{2} \left( \frac{x_\star-\bar{x}}{\sigma} \right)^2 \right]}{\sqrt{2\pi} \cdot (x_\star-\bar{x})/\sigma} \left( 1 - \frac{\sigma^2}{(x_\star - \bar{x})^2} + \mathcal{O}\left( (x_\star-\bar{x})^{-4} \right)  \right).

We infer from eq. (3”) that a power series in \sigma^2/(x_\star - \bar{x})^2 \ll 1 has emerged from our integration-by-parts. Moreover, the final “remainder” integral can be shown to be bounded as

\left\vert \frac{1}{\sqrt{2\pi}} \int_{x_\star}^\infty \frac{\exp\left[-\frac{1}{2} \left( \frac{x-\bar{x}}{\sigma} \right)^2 \right]}{(x-\bar{x})^{2n}/\sigma^{2n-1}} dx \right\vert \leq \left\vert \frac{\sigma^{2n}}{(x_\star - \bar{x})^{2n}} \right\vert f[x_\star] .

Dividing the (n+1)th term by the nth term shows that each integration by parts generates a term that is suppressed by \sigma^2/(x_\star - \bar{x})^2 relative to the previous one, at least in the limit of a very large threshold x_\star. However, the n \to \infty limit of eq. (3”) is not a convergent Taylor series in \sigma^2/(x_\star - \bar{x})^2: if you sum up the entire infinite series, you will find the result to be divergent! The reason is that the coefficients \{ A_2, A_4, \dots, A_{2n} \} grow factorially with n, so that as n \to \infty the contributions are not in fact suppressed for any finite x_\star. Instead, asymptotic series such as the one at hand should always be truncated at a finite n; and it is this truncated sum that becomes a better approximation to the integral the larger x_\star grows:

\frac{\sqrt{2\pi} \cdot (x_\star-\bar{x})/\sigma}{\exp\left[-\frac{1}{2} \left( \frac{x_\star-\bar{x}}{\sigma} \right)^2 \right]} f[x_\star] = 1 + \frac{A_2}{(x_\star - \bar{x})^2/\sigma^2} + \dots + \frac{A_{2n}}{(x_\star - \bar{x})^{2n}/\sigma^{2n}} + \mathcal{O}\left( \left( \frac{\sigma}{x_\star-\bar{x}} \right)^{2n+2} \right) \qquad \text{as } (x_\star-\bar{x})/\sigma \to \infty .
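
The behaviour described above can be checked numerically. The following is a minimal Python sketch, not part of the original derivation. It assumes the standard values A_{2k} = (-1)^k (2k-1)!! for the coefficients (consistent with A_2 = -1 in eq. (4)), works with the scaled threshold t = (x_\star - \bar{x})/\sigma, and takes the exact tail from the complementary error function. All function names are illustrative.

```python
import math

def tail_exact(t):
    """Exact upper-tail fraction for t = (x_star - mean)/sigma, via erfc."""
    return 0.5 * math.erfc(t / math.sqrt(2.0))

def tail_asymptotic(t, n_terms):
    """Truncated asymptotic series keeping n_terms terms inside the bracket.

    Assumes A_{2k} = (-1)^k (2k-1)!!, so each term is -(2k+1)/t^2 times
    the previous one.
    """
    prefactor = math.exp(-0.5 * t * t) / (math.sqrt(2.0 * math.pi) * t)
    total, term = 0.0, 1.0
    for k in range(n_terms):
        total += term
        term *= -(2 * k + 1) / (t * t)
    return prefactor * total

def relative_error(t, n_terms):
    """Relative error of the truncated series against the exact tail."""
    exact = tail_exact(t)
    return abs(tail_asymptotic(t, n_terms) - exact) / exact

# At fixed t the error first shrinks and then blows up as more terms are kept;
# at a fixed number of terms it shrinks as t grows.
for t in (3.0, 5.0, 10.0):
    for n_terms in (1, 2, 5, 10, 30):
        print(f"t={t:5.1f}  terms={n_terms:2d}  rel.err={relative_error(t, n_terms):.3e}")
```

Since the terms stop shrinking once 2k+1 exceeds t^2, keeping roughly t^2/2 terms is about the best one can do at a given threshold.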

Comparing Gaussian Tails

A key observation that I wish to highlight in this post is that, when comparing the tail ends of two different Gaussian distributions, the higher the threshold x_\star the more sensitive the result is to the differences in their variances and means. For concreteness we shall consider the following two scenarios.

Different Means, Same Variance

Let group A have the smaller mean \bar{x}_< and let group B have the larger mean \bar{x}_>; namely, \bar{x}_> > \bar{x}_<. If the two groups have the same variance \sigma^2, we ask the following questions:

  • What fraction f_A of A lies above the threshold x_\star?
  • What fraction f_B of B lies above the threshold x_\star?
  • If N_A denotes the total population of A and N_B that of B, what is the ratio R_{\sigma} of the total number from A above x_\star to the total number from B above x_\star?

From eq. (4), we may answer these questions, respectively, as

f_A[x_\star] = \frac{\exp\left[-\frac{1}{2} \left( \frac{x_\star-\bar{x}_<}{\sigma} \right)^2 \right]}{\sqrt{2\pi} \cdot (x_\star-\bar{x}_<)/\sigma} \left( 1 - \frac{\sigma^2}{(x_\star - \bar{x}_<)^2} + \mathcal{O}\left( (x_\star-\bar{x}_<)^{-4} \right) \right) ;

f_B[x_\star] = \frac{\exp\left[-\frac{1}{2} \left( \frac{x_\star-\bar{x}_>}{\sigma} \right)^2 \right]}{\sqrt{2\pi} \cdot (x_\star-\bar{x}_>)/\sigma} \left( 1 - \frac{\sigma^2}{(x_\star - \bar{x}_>)^2} + \mathcal{O}\left( (x_\star-\bar{x}_>)^{-4} \right) \right) ;

and

(5) R_\sigma = \frac{N_A}{N_B} \left( \frac{x_\star-\bar{x}_>}{x_\star-\bar{x}_<} \right) \exp\left[-\frac{1}{2 \sigma^2}\left( 2 x_\star - \bar{x}_> - \bar{x}_< \right)(\bar{x}_> - \bar{x}_<) \right] \left( 1 - \frac{\sigma^2}{(x_\star - \bar{x}_<)^2} + \frac{\sigma^2}{(x_\star - \bar{x}_>)^2} + \dots \right) .
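
Eq. (5) can be compared against the exact ratio of the two tails. The sketch below is only illustrative: the means, the common standard deviation and the thresholds are made-up numbers in arbitrary units, N_A/N_B is set to 1 by default, and the exact tails come from math.erfc.

```python
import math

def tail_exact(x_star, mean, sigma):
    """Exact fraction of a Gaussian population above x_star, via erfc."""
    return 0.5 * math.erfc((x_star - mean) / (sigma * math.sqrt(2.0)))

def ratio_same_sigma(x_star, mean_lo, mean_hi, sigma, n_a=1.0, n_b=1.0):
    """Eq. (5), keeping the bracket to the order written there."""
    d_lo = x_star - mean_lo   # distance of the threshold from the smaller mean
    d_hi = x_star - mean_hi   # distance of the threshold from the larger mean
    exponent = -(2.0 * x_star - mean_hi - mean_lo) * (mean_hi - mean_lo) / (2.0 * sigma**2)
    bracket = 1.0 - sigma**2 / d_lo**2 + sigma**2 / d_hi**2
    return (n_a / n_b) * (d_hi / d_lo) * math.exp(exponent) * bracket

def ratio_same_sigma_exact(x_star, mean_lo, mean_hi, sigma, n_a=1.0, n_b=1.0):
    """The same ratio computed from the exact tails."""
    return (n_a / n_b) * tail_exact(x_star, mean_lo, sigma) / tail_exact(x_star, mean_hi, sigma)

# Hypothetical inputs: means 0 and 1, common sigma 1.
for x_star in (4.0, 6.0, 8.0):
    approx = ratio_same_sigma(x_star, 0.0, 1.0, 1.0)
    exact = ratio_same_sigma_exact(x_star, 0.0, 1.0, 1.0)
    print(f"x_star={x_star}: eq.(5)={approx:.4e}  exact={exact:.4e}")
```

The ratio falls roughly exponentially in x_\star, since the exponent in eq. (5) is linear in the threshold.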

Same Mean, Different Variances

Let group A have the smaller standard deviation \sigma_< and let group B have the larger standard deviation \sigma_>; namely, \sigma_> > \sigma_<. If the two groups have the same mean \bar{x}, we ask the following questions:

  • What fraction f_A of A lies above the threshold x_\star?
  • What fraction f_B of B lies above the threshold x_\star?
  • If N_A denotes the total population of A and N_B that of B, what is the ratio R_{\bar{x}} of the total number from A above x_\star to the total number from B above x_\star?

From eq. (4), we may answer these questions, respectively, as

f_A[x_\star] = \frac{\exp\left[-\frac{1}{2} \left( \frac{x_\star-\bar{x}}{\sigma_<} \right)^2 \right]}{\sqrt{2\pi} \cdot (x_\star-\bar{x})/\sigma_<} \left( 1 - \frac{\sigma_<^2}{(x_\star - \bar{x})^2} + \mathcal{O}\left( (x_\star-\bar{x})^{-4} \right) \right) ;

f_B[x_\star] = \frac{\exp\left[-\frac{1}{2} \left( \frac{x_\star-\bar{x}}{\sigma_>} \right)^2 \right]}{\sqrt{2\pi} \cdot (x_\star-\bar{x})/\sigma_>} \left( 1 - \frac{\sigma_>^2}{(x_\star - \bar{x})^2} + \mathcal{O}\left( (x_\star-\bar{x})^{-4} \right) \right) ;

and

(6) R_{\bar{x}} = \frac{N_A}{N_B} \cdot \frac{\sigma_<}{\sigma_>} \cdot \exp\left[-\frac{1}{2 \sigma_>^2 \sigma_<^2} \left( x_\star - \bar{x} \right)^2 \left( \sigma_>^2 - \sigma_<^2 \right) \right] \left( 1 - \frac{\sigma_<^2}{(x_\star - \bar{x})^2} + \frac{\sigma_>^2}{(x_\star - \bar{x})^2} + \dots \right) .
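
A similar check applies to eq. (6). Again this is a sketch with made-up inputs (a common mean of 0 and standard deviations 1 and 1.5 in arbitrary units), with N_A/N_B defaulting to 1 and the exact tails taken from math.erfc.

```python
import math

def tail_exact(x_star, mean, sigma):
    """Exact fraction of a Gaussian population above x_star, via erfc."""
    return 0.5 * math.erfc((x_star - mean) / (sigma * math.sqrt(2.0)))

def ratio_same_mean(x_star, mean, sigma_lo, sigma_hi, n_a=1.0, n_b=1.0):
    """Eq. (6), keeping the bracket to the order written there."""
    d = x_star - mean
    exponent = -d**2 * (sigma_hi**2 - sigma_lo**2) / (2.0 * sigma_hi**2 * sigma_lo**2)
    bracket = 1.0 - sigma_lo**2 / d**2 + sigma_hi**2 / d**2
    return (n_a / n_b) * (sigma_lo / sigma_hi) * math.exp(exponent) * bracket

def ratio_same_mean_exact(x_star, mean, sigma_lo, sigma_hi, n_a=1.0, n_b=1.0):
    """The same ratio computed from the exact tails."""
    return (n_a / n_b) * tail_exact(x_star, mean, sigma_lo) / tail_exact(x_star, mean, sigma_hi)

# Hypothetical inputs: common mean 0, sigma_lo = 1, sigma_hi = 1.5.
for x_star in (3.0, 4.5, 6.0):
    approx = ratio_same_mean(x_star, 0.0, 1.0, 1.5)
    exact = ratio_same_mean_exact(x_star, 0.0, 1.0, 1.5)
    print(f"x_star={x_star}: eq.(6)={approx:.4e}  exact={exact:.4e}")
```

Here the exponent is quadratic in (x_\star - \bar{x}), so the ratio falls off faster with the threshold than in the equal-variance case.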

Very Taboo Topic: IQ of Different Human Groups

Even though the tail ends of a Gaussian distributed population comprises an exponentially small fraction of the total population, they correspond to extreme characteristics — for e.g., the fastest runners, most brilliant geniuses, etc. — and can therefore assert an outsized impact. For example, it is reasonable to expect, progress in fundamental science has been driven largely by people occupying the upper tail end of the cognitive spectrum. This topic has become extremely taboo in recent times due to the politically correct Far Left Fundamentalism err I meant the Holy Church of Diversity, Inclusion and Equity that has deeply permeated Western Academia.

If human intelligence is well reflected by IQ scores, and if IQ scores of a given human group are well modeled by a Gaussian distribution, then my understanding of the psychology literature indicates:

  • There are differences in average intelligence levels of distinct human groups; for e.g., between Blacks and Ashkenazi Jews in the US. In each group, there will of course be extremely dull and very bright folks; but the means are not the same. In particular, the Black average IQ is lower than the White average IQ; whereas Ashkenazi Jews have an average IQ that is among the highest across human groups. In other words, Ashkenazi Jews are on average more intelligent than Blacks.
  • Males and females, on average, have roughly the same intelligence. However, the spread is larger for males than for females — this is known as greater male variability: more males than females are extremely clever; and more males than females are extremely dumb.

The asymptotic analysis we performed above can be applied to these situations, and I believe is important for understanding why for instance we see far more American Jews than Blacks winning Nobel Prizes in Physics or holding distinguished Chair Professorships in Mathematics at Princeton / Harvard / UC Berkeley / etc. For similar reasons, we should expect far more men than women to acquire the most coveted leadership positions in STEM fields. The latter is exacerbated by the also well documented fact that, on average, women are less interested in ‘things’ than men.

More specifically, if we — as a crude approximation — suppose the standard deviation \sigma of Black and Jewish IQs to be the same, identifying group A to be Blacks and group B to be Ashkenazi Jews in eq. (5) then tells us the higher the threshold IQ x_\star of a particular human activity, the more exponentially dominant Jews would be over Blacks in numbers. (In the US, N_A/N_B is roughly of order 5 or so; and would readily be depleted for large enough x_\star by the exponential multiplying it.) Similarly, if we identify group A with women and group B with men, and recognizing N_A/N_B \approx 1, eq. (6) tells us the higher the IQ a particular task demands, the more exponentially “over-represented” men would be over women.

Personal Thoughts

The above are the facts as I understand them — I think it is important to distinguish between the Science and the Ethics; because the former deals strictly with reality as it is, whereas the latter deals with subjective human values. To entangle them will only lead to the corruption of our grasp on reality itself.

In my opinion, the nobility of equal dignity for all should not be predicated upon the sameness of humans. That is, I do believe in treating our fellow humans equally, but not because they are all clones of one another. In particular, equal opportunity means: we should allow all to compete equally, so that the best ideas and the most competent folks would win. Trying to enforce equal outcomes — same number of males and females in STEM fields, for e.g. — will, I am afraid, only lead to disaster. On the other hand, just because we are less talented in one or more areas does not mean we cannot find personal fulfillment and self-worth in life through other means. Growing up with a life-long disability, I have learned this quite early on.

I do worry significantly about where the free Western societies of our human civilization are moving. Even the scientific communities are unwilling or unable to come to terms with the reality of human group differences. This does not bode well for our long term collective scientific integrity.

At the end of the investigation into the Space Shuttle Challenger disaster, theoretical physicist Richard Feynman gave the following warning:

For a successful technology, reality must take precedence over public relations, for nature cannot be fooled.

Richard P. Feynman

If I may paraphrase the great Feynman for our current climate:

For a successful Democracy that values Science, Reality must take precedence over Political Correctness, for Nature cannot be fooled.

Integration Via Differential Equations: Two Bessel(-Trigonometric) Integrals

It is rather easy to find integrals that cannot be expressed in “closed form”, which in this context means in terms of functions whose properties we know a lot about. I, for one, am rather grateful for resources (and the people who compiled them!) such as the Table of Integrals, Series, and Products, DLMF, AMS55 and Wolfram MathWorld, which we may consult for calculus results and properties of “special functions”.

In this post I will discuss how to evaluate the following two integrals by solving the relevant differential equations they satisfy.

(I): \frac{E_1[m]}{I_\nu[m]} \equiv \int_{-Z-\sqrt{Z^2-1}}^{-Z+\sqrt{Z^2-1}} \frac{d \rho'}{\sqrt{\rho'}} \frac{I_\nu[m \rho'] \cos\left[ m \sqrt{2 \bar{\sigma}} \right]}{I_\nu[m] \sqrt{2 \bar{\sigma}}} = \pi P_{\nu-\frac{1}{2}}\left[-Z\right]

(II): \frac{E_2[m]}{I_\nu[m]} \equiv \int_{-Z-\sqrt{Z^2-1}}^{-Z+\sqrt{Z^2-1}} \frac{d \rho'}{\rho'} \frac{I_{\nu}[m \rho']}{I_\nu[m]} J_0\left[m\sqrt{2\bar{\sigma}}\right] = \frac{1}{\nu} \left\{ \left( -Z + \sqrt{Z^2-1} \right)^\nu - \left( -Z - \sqrt{Z^2-1} \right)^\nu \right\}

Here, J_\nu is the Bessel function of the first kind; I_\nu is the modified Bessel function of the first kind; and

(III): \bar{\sigma} \equiv -\frac{1}{2} \left( \rho'^2 + 1 + 2 \rho' Z \right) = -\frac{1}{2} \left\{ \rho' - \left( -Z - \sqrt{Z^2-1} \right) \right\} \left\{ \rho' - \left( -Z + \sqrt{Z^2-1} \right) \right\}.

Motivation       Since there are infinitely many intractable integrals anyway, you may wonder why you should pay attention to this result. There is in fact a physical reason for doing so. I hope to start writing about it more, but in curved spacetimes, waves associated with massless particles in Nature (including light itself) do not in fact travel strictly on the null cone. Results (I) and (II) describe the inside-the-light-cone (aka “tail”) portion of massive scalar waves in de Sitter spacetime, obtained by viewing the latter as a hyperboloid embedded in a Minkowski spacetime of one higher dimension. The massless wave tails can in turn be obtained by setting m=0.

Derivation of I and II       By a direct calculation, you may readily verify that

(D1): \mathcal{D}_m E[m] \equiv m^2 E''[m] + m E'[m] - (m^2+\nu^2) E[m] = 0

where E[m] is either E_1 or E_2. Note that this is the ordinary differential equation (ODE) satisfied by I_\nu[m] itself; i.e., \mathcal{D}_m I_\nu[m] = 0.

In more detail, you should find that applying \mathcal{D}_m upon the left hand sides of equations (I) and (II) yields

(D2): \mathcal{D}_m E_1[m] = -2m \int_{-Z-\sqrt{Z^2-1}}^{-Z+\sqrt{Z^2-1}} d\rho' \frac{\partial}{\partial \rho'} \left( \sqrt{\rho'} \sin\left[ m \sqrt{2\bar{\sigma}} \right] I_\nu[m\rho'] \right)

and

(D3): \mathcal{D}_m E_2[m] = -2m \int_{-Z-\sqrt{Z^2-1}}^{-Z+\sqrt{Z^2-1}} d\rho' \frac{\partial}{\partial \rho'} \left( \sqrt{2\bar{\sigma}} I_\nu[m\rho'] J_1\left[ m \sqrt{2\bar{\sigma}} \right] \right) .

That is, acting with \mathcal{D}_m on E_{1,2} converts the integrands into total derivatives, so the result is simply the quantities inside the derivatives evaluated at the end points \rho_\pm \equiv -Z \pm \sqrt{Z^2-1}. But from eq. (III) we see these \rho_\pm are precisely the zeroes of \bar{\sigma}, and hence of \sin[m \sqrt{2\bar{\sigma}}] and \sqrt{2\bar{\sigma}} J_1[m \sqrt{2 \bar{\sigma}}]; which in turn means the integrals are zero. In other words, \mathcal{D}_m E[m] = 0. As already alluded to, this is precisely the ODE satisfied by I_\nu[m] itself.

However, since this ODE has two linearly independent solutions, we still need to show that our integrals satisfy E[m] \propto I_\nu[m]. For non-integer \nu, note that I_{\pm\nu} are linearly independent. Moreover, we may check from equations (I) and (II) that E_{1,2}[m] are in fact power series in m that begin with an overall m^\nu pre-factor arising from the I_\nu[m \rho']. Since I_{\pm\nu}[m] is m^{\pm\nu} times a series in non-negative powers of m^2, this tells us there cannot be an I_{-\nu}[m] term in our E_{1,2}. What remains is to figure out the \chi_{1,2} in

(D4): E_{1,2}[m] = \chi_{1,2} I_\nu[m] .

To do so, notice \chi_{1,2} cannot depend on m. We may therefore extract their values through the limits

(D5): \chi_{1,2} = \lim_{m \to 0} \frac{E_{1,2}[m]}{I_\nu[m]} .

From equations (I) and (II), and utilizing the Taylor series results

I_\nu[z] = \frac{(z/2)^\nu}{\Gamma[\nu+1]} \left( 1 + \mathcal{O}[z^2] \right)

and

J_0[z] = 1 + \mathcal{O}[z^2]

we have

(D6): \lim_{m \to 0} \frac{E_1[m]}{I_\nu[m]} = \int_{-Z-\sqrt{Z^2-1}}^{-Z+\sqrt{Z^2-1}} d \rho' \frac{\rho'^{\nu-\frac{1}{2}}}{\sqrt{-\rho'^2 - 1 - 2\rho' Z}}

and

(D7): \lim_{m \to 0} \frac{E_2[m]}{I_\nu[m]} = \int_{-Z-\sqrt{Z^2-1}}^{-Z+\sqrt{Z^2-1}} d \rho' \rho'^{\nu-1} = \frac{1}{\nu} \left\{ \left( -Z + \sqrt{Z^2-1} \right)^\nu - \left( -Z - \sqrt{Z^2-1} \right)^\nu \right\} .
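
Result (II), which follows from (D4), (D5) and (D7), can be spot-checked numerically at non-zero m. The sketch below is an illustration under stated assumptions rather than part of the derivation: it evaluates I_\nu and J_0 from their standard power series (adequate only for moderate arguments), integrates with a composite Simpson rule, and uses arbitrary test values \nu = 0.7, m = 1.3, Z = -1.5 (chosen so that both end points are positive). All function names are made up for this example.

```python
import math

def bessel_i(nu, z, terms=40):
    """Modified Bessel function I_nu(z) from its power series (moderate z > 0)."""
    return sum((z / 2.0) ** (2 * k + nu) / (math.factorial(k) * math.gamma(k + nu + 1.0))
               for k in range(terms))

def bessel_j0_of_sqrt(w, terms=40):
    """J_0(sqrt(w)) written as a series in w, so no square root is needed near w = 0."""
    return sum((-w / 4.0) ** k / math.factorial(k) ** 2 for k in range(terms))

def simpson(func, a, b, n=2000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0

def check_result_two(nu, m, big_z):
    """Return (left-hand side, right-hand side) of eq. (II)."""
    root = math.sqrt(big_z * big_z - 1.0)
    rho_minus, rho_plus = -big_z - root, -big_z + root

    def integrand(rho):
        two_sigma_bar = -(rho * rho + 1.0 + 2.0 * rho * big_z)  # 2 * sigma-bar from eq. (III)
        return (bessel_i(nu, m * rho) * bessel_j0_of_sqrt(m * m * two_sigma_bar)
                / (rho * bessel_i(nu, m)))

    lhs = simpson(integrand, rho_minus, rho_plus)
    rhs = (rho_plus ** nu - rho_minus ** nu) / nu
    return lhs, rhs

lhs, rhs = check_result_two(nu=0.7, m=1.3, big_z=-1.5)
print(lhs, rhs)
```

The two numbers should agree to within the quadrature error, and the right-hand side has no dependence on m.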

From equations (D4), (D5) and (D7), we see that equation (II) has been proven for non-integer \nu. What remains, therefore, is to tackle eq. (D6). If we put

(D8): \rho' \equiv - Z + \cos[u] \sqrt{Z^2-1}, \qquad\qquad u \in [0,\pi]

so that

\frac{d\rho'}{\sqrt{2 \bar{\sigma}}} = \frac{d\rho'}{\sqrt{-\rho'^2 -1 - 2\rho' Z}} = \frac{d(\cos u) \sqrt{Z^2-1}}{\sin[u] \sqrt{Z^2-1}} = -du ;

with the end points \rho' = -Z \mp \sqrt{Z^2-1} mapped to u = \pi and u = 0 respectively, eq. (D6) now reads as

(D9): \lim_{m \to 0} \frac{E_1[m]}{I_\nu[m]} = \int_{u=0}^{u=\pi} d u \left( -Z + \sqrt{Z^2-1} \cos u\right)^{\nu-\frac{1}{2}}.

Since the cosine is an even function we may extend the integration limit to u = -\pi, namely

(D9′): \lim_{m \to 0} \frac{E_1[m]}{I_\nu[m]} = \frac{1}{2}\int_{u=-\pi}^{u=\pi} d u \left( -Z + \sqrt{Z^2-1} \cos u\right)^{\nu-\frac{1}{2}} .

At this point, referring to Laplace's integral representation of the Legendre function, P_{\nu-\frac{1}{2}}[z] = \frac{1}{\pi} \int_0^\pi \left( z + \sqrt{z^2-1} \cos u \right)^{\nu-\frac{1}{2}} du, tells us we have arrived at the right-hand side of eq. (I), at least for non-integer \nu. Our results are very likely true for integer \nu as well, since I believe it is safe to assume they are continuous functions of \nu.
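
The last step can be checked in the special case where \nu - 1/2 is a non-negative integer, so that the Legendre function reduces to a polynomial. The sketch below is an illustration only: it takes \nu = 5/2 and \nu = 7/2 (both non-integer, as the derivation requires), writes z for -Z > 1, and compares a Simpson estimate of the integral in eq. (D9) with \pi P_2[z] and \pi P_3[z]. The function names and the test value z = 1.5 are assumptions made for the example.

```python
import math

def d9_integral(order, z, n=2000):
    """Simpson estimate of the u-integral in (D9), with z = -Z > 1 and order = nu - 1/2."""
    root = math.sqrt(z * z - 1.0)
    h = math.pi / n

    def f(u):
        return (z + root * math.cos(u)) ** order

    total = f(0.0) + f(math.pi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3.0

def legendre_p2(z):
    """Legendre polynomial P_2."""
    return 0.5 * (3.0 * z * z - 1.0)

def legendre_p3(z):
    """Legendre polynomial P_3."""
    return 0.5 * (5.0 * z ** 3 - 3.0 * z)

z = 1.5  # hypothetical test value, i.e. Z = -1.5
print(d9_integral(2, z), math.pi * legendre_p2(z))   # nu = 5/2
print(d9_integral(3, z), math.pi * legendre_p3(z))   # nu = 7/2
```

This only tests the identification with the Legendre function at two values of \nu; it says nothing about integer \nu.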


References

  • Y.-Z. Chu, “A line source in Minkowski for the de Sitter spacetime scalar Green’s function: massive case,” Class. Quant. Grav. 32, no. 13, 135008 (2015); doi:10.1088/0264-9381/32/13/135008; [arXiv:1310.2939 [gr-qc]].