Zodiacal Light from Death Valley

After attending the conference honoring my PhD adviser Tanmay Vachaspati, I got to spend a few days in Death Valley National Park. At Stovepipe Wells, the skies were very dark, and during one of the evenings I saw Venus immersed in the Zodiacal Light.


Since the winter Milky Way was nearby, I took a shot of the duo — notice, the Andromeda Galaxy (M31) was visible too.


Technical details for the photographer reader: Canon 5D Mark IV with the Sigma 14mm f/1.8 lens.

Intellectual Independence Index

III     Intellectual independence is something I personally hold dear as a theoretical physicist, and respect very much when I see it reflected in senior scientists. I think some sort of Intellectual Independence Index should be one of the metrics used to judge a theorist, though I admittedly do not have a good sense of exactly how to quantify it.

My primary scientific motto can be summed up as:

Listen to and learn from others, but always think for oneself.

I have always thought all of us theorists should be doing so; but with the intense pressure to publish frequently, I believe it is not at all the norm. If the latter is indeed true — how does one properly measure it? — then this is yet another reason for my worry that the integrity of theoretical physics will be gradually eroded away: namely, where are the checks-&-balances from independent minds? A recent article in Scientific American pointed out:

…… Of course, reputations for good work affect scientists as much as anyone else, but one or two “real” advances by a researcher will erase any downside to even a litany of other findings that disappeared into the trash pile of time since no one else can reproduce them. Indeed, in a now famous report from Bayer Pharmaceuticals, 65 percent of published scientific findings were not reproducible by Bayer scientists when they tried to use them for drug development.

This is not an issue of scientific fraud or misconduct where scientists invent data or purposefully lie; the data are real and were really observed. However, the fiercely competitive environment leads to a haste to publish and a larger number of less rigorous papers results. Careful and self-critical scientists who spend more time and resources to carry out more rigorous and careful studies may be promoted less often, receive fewer research resources and get less recognition for their work.

I’ve always wondered what happens if we’d carefully work through the details of theoretical physics papers; how many mistakes will be found, and how many papers will turn out to be wrong? In other words, how reliable is the theoretical physics literature?

Closely related to the integrity of the science itself is the proper attribution of intellectual credit to the researchers who have contributed to it. I find it hypocritical that Western Academia (including theoretical physics) is increasingly “woke” (for instance, the Diversity, Inclusion and Equity (DIE) religion of equal representation, as opposed to equal opportunities, is now its cultural norm), and yet there is comparatively little discussion of actual scientific accountability: namely, where it costs to signal honestly, to borrow evolution-speak. To this end, when the Physical Review journals were soliciting feedback more than a year ago, I recommended that a statement of author contributions be made mandatory.

Two Theoretical Physicists

I regard my PhD advisor, Tanmay Vachaspati (who is turning, or has already turned, 60; see here for the Arizona State U conference organized to honor him), to be one of the most intellectually independent senior scientists I have ever interacted or worked with. It is not difficult to verify that Tanmay writes many papers on his own, even in recent years. He also wrote a single-author book on topological aspects of field theory, a copy of which he gifted me upon my graduation. Tanmay certainly does not chase after the winds of fashion, and yet regularly generates ideas of his own. This is unlike many senior scientists, who become essentially project managers: sure, their names appear on many papers addressing the latest fad, but they oftentimes merely attend the meetings and ask the occasional smart question; they no longer steer the project intellectually, let alone provide deep-ploughing insights or important guidance, especially when the going gets rough.

Although I’ve never had the privilege to meet Steven Weinberg, I find him to be an amazing theorist; not just for his past, very fundamental (Nobel prize winning) work in quantum field theory and particle physics — he played a key role in building what’s now known as the Standard Model — but for remaining very active in cosmological research till now (circa 2019). As the reader may readily verify, he is also one who still writes single-author papers; not to mention recent books on cosmology and quantum mechanics. According to Wikipedia, he is now in his mid 80’s! He is most definitely a scientific role model for the rest of us theoretical physicists.

Many senior and/or famous scientists travel regularly; and they use their status/connections/influence to sit on papers where they contribute very little of actual substance. (In theoretical physics, it’s often the junior/less famous colleagues who explain the physics and technicalities to the seniors/the famous. The traffic the other way round can be rather sporadic and vague, depending on who the senior/famous physicists are, as well as the relationships of the people involved.) Because of these factors, they remain plugged into the collective wisdom of their colleagues as well as the mainstream of physics; and as such, are able to maintain their clout. Serious problems arise when the graduate student(s) gets royally stuck on a problem and do not have experienced postdocs/other graduate students to collaborate with; here, the senior adviser can prove to be of little help.

A Funding Question

Why can’t research funding be structured in a more sophisticated manner to reflect this reality, while maintaining high levels of scientific integrity? That is, senior/famous scientists could continue to be rewarded; not for their intellectual contributions (i.e., if they are no longer making any), but for running their group, hiring the right personnel, etc. Meanwhile, those scientists who do in fact continue to make substantive intellectual contributions to science would be rewarded for doing so. Those who manage to do both should of course be rewarded even more! This way, there will be far less pressure on senior scientists to pad their CVs with papers to which they have contributed little.

Personal Experiences

My personal experiences have reinforced the need to enjoy one’s work; and, to this end, to be as intellectually independent as possible. Why pad someone else’s CV when one can write one’s own papers, for example? This is especially the case in theoretical physics, where many would ultimately not find permanent positions.

Another of my personal mottos:

One should live for one’s own curiosity and scientific conscience.

I worked with a senior physicist a few years back, where the whole paper was my idea and the calculations were carried out entirely by myself. I put his name on the paper, to be completely honest, because I knew I needed his recommendation letter. (In this sense, I did not behave in the most scientifically ethical manner. Moreover, during a meeting later on that involved his graduate student, he reminded me I needed his recommendation letter, through no antagonism on my part. Is this professional behavior? Since then, I have developed a skeptical attitude towards the requirement of recommendation letters within Academia.) Halfway through the project, he came to learn from a then-graduate student of our “friends” that they were working on exactly the same problem; in particular, the first part of their work was apparently completed years ago. The short of the story is, we soon got scooped by them, once the graduate student returned to inform them of what we were doing. I had to work extra hard to do more, in order to publish a legitimate research paper. It took a couple of months to do so, and at the end of it there were some discrepancies, which I described in a footnote. These “friends” ended up accusing me (I wrote the paper) of misrepresenting their work. This senior scientist then held some private negotiations with them without me; following which, I was confronted in a one-on-one meeting with him, where he twisted my arm (figuratively speaking) and had me remove a good chunk of that footnote. Suffice it to say, if I had written that paper by myself, which I had the full intellectual right to do, I would not have budged unless I was provided with valid scientific reasons. The primary problem was, because of the manner in which the senior scientist dealt with the situation, I never had the chance to properly discuss with our “friends” the scientific points of disagreement! I was told by this senior scientist:

I can fight with them [our “friends”]. You cannot fight with them.

It was all politics; and zero science.

I was also very engaged with a project, based off a misguided idea of the senior scientist’s, involving his student and a senior postdoc. When the going got rough, the senior postdoc “fell off the bandwagon” (his words, not mine) and at the end of the project I requested he remove his name from the paper, because he hardly participated in the effort leading up to the primary results. For this pushback, I received passive-aggressive backlash from two senior scientists. I never had the chance to speak to the graduate student on a more personal level; how this young scientist felt, given the huge amount of work expended and the low return on investment. (This young scientist has recently left Academia.)

Remark     The career trajectory of this senior postdoc taught me how important politics is within Academia. To be sure, he is highly competent and well educated. But it was clear to outsiders his particularly close relationship to his supervisors meant that he appeared on nearly every paper they put out — regardless of how much work he had actually exerted. I have never understood why this is scientifically acceptable behavior.

A physicist friend of mine told me a story involving a former student — whom I will denote as X — of those “friends” I described above. (She is now faculty.) X was supposed to work with my friend and a senior scientist, but did not end up contributing much. Still, X strong-armed herself onto the paper. As I understand it, she was working on a parallel paper — that all 3 of them were supposed to write together as a follow-up project — and proceeded to scoop her 2 collaborators (i.e., my friend and the senior scientist).

Remark     I fear the behavior of theoretical physicists will become increasingly unethical, as the number of tenure-track/permanent jobs dwindles and the pressure to publish frequently increases. I wish senior scientists, instead of playing politics, would set good examples; and show the proper moral leadership to set up the right (dis)incentives so that high scientific standards will be properly maintained.

I mentored another senior faculty member’s student for more than a year, suggesting a project and supervising it through. At the end of the project, I was somewhat bemused that, although this senior scientist did not contribute very much, there was not only zero acknowledgement from his end; he also asked if I were going to write a strong letter for the student. Now, I’m a believer in proper scientific mentorship; so I supposed he meant well for the student, and hence immediately proceeded to inform him I had already done so. But as far as intellectual credit is concerned, is it perhaps too presumptuous of me to ask: who really has the moral and scientific authority to question me in this situation? For I was in fact the student’s de facto primary adviser; and, furthermore, shouldn’t my reference letter be an independent assessment of the young scientist?

I met a remarkable physicist while taking a year off between my Master’s degree from Yale and re-starting my PhD program at Case. He was initially working on String Theory for the most part of his PhD; but towards the end, discovered he wanted to work on Loop Quantum Gravity instead. He did in fact manage to switch fields; though he has since spent an unreasonable amount of time (in my humble opinion) as a postdoc, despite being consistently research-active. I truly hope a scientist with his intellectual independence will soon be rewarded with a tenure-track position!


Theoretical physics still attracts highly intelligent and competent people; and I’ve personally met quite a few of them. While I admire them for their intellectual prowess, I have yet to meet a contemporary who shares my concern for the hyper-competitive environment that I fear is leading to an erosion of our Scientific Integrity. To be clear, collaborations can be very beneficial to Science itself, by bringing together people with different expertise, etc. But given the current climate, I do encourage theorists to set some of their time aside, and to gather their intellectual courage, to write their own papers!

Wave Tails in Flat Spacetimes

In 3+1 dimensional flat spacetime, electromagnetic waves travel strictly on the null cone. How does one quantify this statement? It is through the retarded Green’s function of its wave operator.

In Lorentz covariant notation, the electromagnetic fields are encoded with the antisymmetric tensor F_{\mu\nu} = -F_{\nu\mu}, which in turn is built out of the vector potential A_\mu through

(1)    F_{\mu\nu} \equiv \partial_\mu A_\nu - \partial_\nu A_\mu .

If J_\mu is the electromagnetic current, the electromagnetic fields themselves are sourced through the equation

(1)    \partial^2 F_{\mu\nu} = \partial_{[\mu} J_{\nu]} \equiv \partial_\mu J_\nu - \partial_\nu J_\mu .
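For readers who want the intermediate step, here is a sketch of how the wave equation above can be obtained. It assumes Maxwell's equations in the form \partial^\sigma F_{\sigma\nu} = J_\nu (a sign and unit convention this post does not state explicitly), together with the Bianchi identity, which follows automatically from the definition of F_{\mu\nu} in terms of A_\mu.

```latex
\begin{aligned}
0 &= \partial_\sigma F_{\mu\nu} + \partial_\mu F_{\nu\sigma} + \partial_\nu F_{\sigma\mu}
  && \text{(Bianchi identity)} \\
\partial^2 F_{\mu\nu}
  &= -\partial_\mu \partial^\sigma F_{\nu\sigma} - \partial_\nu \partial^\sigma F_{\sigma\mu}
  && \text{(contract with } \partial^\sigma \text{)} \\
  &= \partial_\mu \left( \partial^\sigma F_{\sigma\nu} \right)
   - \partial_\nu \left( \partial^\sigma F_{\sigma\mu} \right)
  && \text{(antisymmetry of } F_{\mu\nu} \text{)} \\
  &= \partial_\mu J_\nu - \partial_\nu J_\mu
  && \text{(assumed convention: } \partial^\sigma F_{\sigma\nu} = J_\nu \text{)}
\end{aligned}
```

With the opposite sign convention for the current, the right-hand side would simply flip sign.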

Throughout this post, we shall assume the time-space coordinates (t,\vec{x}) and (t',\vec{x}') parametrize some global Lorentzian inertial frames; i.e., they describe the invariant intervals

(1′)    ds^2 = dt^2 - d\vec{x} \cdot d\vec{x}    and    ds^2 = dt'^2 - d\vec{x}' \cdot d\vec{x}'.

Suppose we have solved the massless scalar Green’s function G, which obeys

(2)    \partial^2 G[x-x'] = \delta^{(d)}[x-x'],

where \delta^{(d)}[x-x'] is the d-dimensional Dirac delta function; and suppose further we’ve managed to specify the initial spatial components of the vector potential A_{i}[t=t_0,\vec{x}] and the electric field F^{i0}[t_0,\vec{x}]. It is then possible to express F_{\mu\nu}[t > t_0,\vec{x}] at any later time via the following Kirchhoff integral representation:

(3)    F_{\mu\nu}[t>t_0,\vec{x}] = \int_{t_0}^\infty d t' \int_{\mathbb{R}^{d-1}} d^{d-1} \vec{x}' G[t-t',\vec{x}-\vec{x}'] \partial_{[\mu'} J_{\nu']}[t',\vec{x}'] \\ + \int_{\mathbb{R}^{d-1}} d^{d-1}\vec{x}' \left( \partial^{[0'} \partial_{[\mu} G[t-t_0,\vec{x}-\vec{x}'] \delta_{\nu]}^{i']} A_{i'}[t_0,\vec{x}'] - \partial_{[\mu} G[t-t_0,\vec{x}-\vec{x}'] \eta_{\nu]i'} F^{0'i'}[t_0,\vec{x}']  \right);

where the un-primed indices denote derivatives with respect to the observer location at (t,\vec{x}) and the primed ones the source location at (t',\vec{x}').

Note that the magnetic field can be defined as F_{ij} = \partial_i A_j - \partial_j A_i; hence, eq. (3) may be viewed as a variant of the statement:

To determine the electric and magnetic fields at some later time t > t_0, it suffices to specify them at the initial time t = t_0.

Huygens’ principle in 3+1D     Suppose we were dealing with free electromagnetic waves — i.e., homogeneous solutions to the wave equation, with J_\mu = 0 — then we have

(3′)    F_{\mu\nu}[x] = \int_{\mathbb{R}^{d-1}} d^{d-1}\vec{x}' \left( \partial^{[0'} \partial_{[\mu} G[t-t_0,\vec{x}-\vec{x}'] \delta_{\nu]}^{i']} A_{i'}[t_0,\vec{x}'] - \partial_{[\mu} G[t-t_0,\vec{x}-\vec{x}'] \eta_{\nu]i'} F^{0'i'}[t_0,\vec{x}']  \right).

In d=3+1 dimensions, the Green’s function is non-zero strictly on the light cone:

(3.1′)    G[t-t',\vec{x}-\vec{x}'] = \frac{\delta[t-t' - |\vec{x}-\vec{x}'|]}{4\pi |\vec{x}-\vec{x}'|}

When eq. (3.1′) is inserted into eq. (3′), we obtain the quantitative form of Huygens’ principle: the electromagnetic field at each point in space at the initial time t_0 will spread out in an infinitesimally thin spherical shell at the speed of light.

If we focus instead on the inhomogeneous solution (i.e., attribute F_{\mu\nu} entirely to J_\mu, set all initial fields to zero, and send t_0 \to -\infty), we obtain

(3”)    F_{\mu\nu}[x] = \int_{\mathbb{R}^{1,d-1}} d^d x' G[x-x'] \partial_{[\mu'} J_{\nu']}[x']

What this statement says is: the electromagnetic field generated from the electric current J_\mu[t',\vec{x}'] at each spacetime point (t', \vec{x}') again propagates outwards in an infinitesimally thin spherical shell at the speed of light: t-t' = |\vec{x}-\vec{x}'|.
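To make the last statement concrete in d=3+1, one can insert the light-cone Green's function of eq. (3.1′) into eq. (3”) and carry out the t' integral using the delta function. The result is a sketch of the familiar retarded-time form; the symbol t_r is introduced here only for illustration.

```latex
F_{\mu\nu}[t,\vec{x}]
  = \int_{\mathbb{R}^{3}} d^{3}\vec{x}' \,
    \frac{ \left( \partial_{[\mu'} J_{\nu']} \right)[t_r, \vec{x}'] }{ 4\pi |\vec{x}-\vec{x}'| } ,
\qquad
t_r \equiv t - |\vec{x}-\vec{x}'| .
```

The derivatives \partial_{\mu'} act on J_{\nu'}[t',\vec{x}'] first; only afterwards is t' set equal to the retarded time t_r. Each source point therefore contributes to the observer only at the single instant t' = t_r.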

Other dimensions     In higher dimensions, the situation is a tad more complicated; splitting into even versus odd cases.

For even dimensions, the Green’s function G again propagates signals strictly on the null cone. Hence, Huygens’ principle continues to hold; except the structure of the Green’s function does become more involved — see eq. (12) of this post.

In odd dimensions, the Green’s function (see eq. (12′) of the same post) now has a non-zero piece inside the light cone (t-t' > |\vec{x}-\vec{x}'|). This inside-the-light-cone portion is known as the tail. Equation (3′), which really holds in all d \geq 3, tells us the homogeneous signal now receives contributions from inside the past light cone of the observer. And equation (3”) tells us the signal produced by the electric current at (t',\vec{x}') now travels inside the forward light cone.

Huygens’ principle is violated in odd dimensional flat spacetime; but respected in even dimensional ones (higher than 2).
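The appearance of a tail in odd dimensions can be illustrated numerically in the simplest case, d=2+1. The sketch below integrates the 3+1 dimensional Green's function of eq. (3.1′) over one spatial direction (the "method of descent") and compares against the standard closed form \theta[t-r]/(2\pi\sqrt{t^2-r^2}) for the 2+1 dimensional retarded Green's function. That closed form is quoted here independently of the post referenced above. The Dirac delta is replaced by a narrow Gaussian, and the width `eps`, the cutoff `z_max` and the step count `n` are illustrative choices.

```python
import math

def smeared_delta(u, eps):
    """Narrow Gaussian standing in for the Dirac delta (the width eps is an assumption)."""
    return math.exp(-0.5 * (u / eps) ** 2) / (eps * math.sqrt(2.0 * math.pi))

def green_3p1(t, R, eps):
    """Eq. (3.1') with the delta function smeared: delta[t - R] / (4 pi R)."""
    return smeared_delta(t - R, eps) / (4.0 * math.pi * R)

def green_2p1_by_descent(t, r, eps=0.01, z_max=5.0, n=100_000):
    """Integrate the 3+1D Green's function over the extra coordinate z'.

    z_max must exceed sqrt(t^2 - r^2) so that the light-cone peaks are captured.
    """
    dz = 2.0 * z_max / n
    total = 0.0
    for k in range(n):
        z = -z_max + (k + 0.5) * dz  # midpoint rule
        total += green_3p1(t, math.hypot(r, z), eps)
    return total * dz

def green_2p1_exact(t, r):
    """Closed form in 2+1D: theta[t - r] / (2 pi sqrt(t^2 - r^2))."""
    if t <= r:
        return 0.0
    return 1.0 / (2.0 * math.pi * math.sqrt(t * t - r * r))

if __name__ == "__main__":
    # Inside the light cone (t > r) the 2+1D Green's function is non-zero: the tail.
    print(green_2p1_by_descent(2.0, 1.0), green_2p1_exact(2.0, 1.0))
    # Outside the light cone (t < r) it vanishes.
    print(green_2p1_by_descent(0.5, 1.0), green_2p1_exact(0.5, 1.0))
```

The point of the exercise is that a 2+1 dimensional point source is a 3+1 dimensional line source: signals from ever more distant parts of the line keep arriving after the first wavefront, which is one way to picture the tail.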


Petition to CERN: Invite External Gender Experts

It is likely the outrage cycle regarding the Strumia affair is long over. But if one is genuinely concerned about Scientific Integrity, I would assert that, given how publicly we physicists have flaunted our far Left Gender Ideology at the center for particle physics, there is much to do to ‘restore the balance’.

One key thought I’ve always had is, since so many of us claim to be on the side of Science when it comes to women-in-STEM issues, why not invite external gender experts to inform us of the latest research? This is precisely what I have tried to do, to urge CERN to invite external gender experts to its future “High Energy Physics and Gender” conferences — see below. Together with collaborators, I started with experts suggested by Psychology Professor Lee Jussim; then wrote to these experts for further suggestions, etc. And from those who responded positively, we proposed a panel for CERN’s consideration. I believe this panel does include a spectrum of views.

If you are interested in gender-and-STEM issues from a scientific standpoint, I urge you to sign the petition below.

Balance at CERN

Transverse-Traceless Massless Spin-2 Gravitational Waves Are Acausal

Let X^i be the Cartesian coordinate vector joining one end of a laser interferometer arm to another; and let this interferometer be freely-falling in a weakly curved spacetime

(1)     g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}.

Practically all the pedagogical literature on gravitational physics tells us the distortion \delta X^i of this arm due to the presence of a gravitational wave is proportional to the transverse-traceless part of the metric perturbation h_{\mu\nu}:

(2)     \delta X^i = \frac{1}{2} h_{ij}^{\text{TT}} X^j .

But, what does “transverse-traceless” (TT) actually mean here? The field theorist reader would likely think that the h_{ij}^{\text{TT}} must be the gauge-invariant massless spin-2 graviton, which obeys

(2′)     \partial_i h_{ij}^{\text{TT}} = 0 \qquad \qquad \text{(Transverse)}


(2”)     \delta^{ij} h_{ij}^{\text{TT}} = 0 \qquad \qquad \text{(Trace-less)} .

Massless Spin-2     The helicity–2 character is the result of these TT conditions; for each Fourier \vec{k}-mode, it is always possible to find a basis of polarization tensors \epsilon_{ij}^\pm, namely

(2”.I)     h_{ij}^{\text{TT}} = \int \frac{d^3 \vec{k}}{(2\pi)^3} \left\{ \left( a_+[\vec{k}] \epsilon^+_{ij}[\vec{k}] e^{-i |\vec{k}|t} + a_-[\vec{k}] \epsilon^-_{ij}[\vec{k}] e^{-i |\vec{k}|t} \right) e^{i\vec{k}\cdot\vec{x}} + \text{c.c.} \right\} ;

such that under a rotation about the axis \vec{k} through an angle \theta,

(2”’)     \epsilon_{ij}^\pm e^{-i |\vec{k}|t+i\vec{k}\cdot\vec{x}} \to e^{(\pm 2) i \theta} \epsilon_{ij}^\pm e^{-i |\vec{k}|t+i\vec{k}\cdot\vec{x}}.

The \pm 2 may be viewed as the eigenvalues of the generator of rotation on the plane perpendicular to \vec{k}.
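The helicity statement in eq. (2”’) can be checked with a few lines of arithmetic. The sketch below takes \vec{k} along the z-axis and builds \epsilon^\pm_{ij} = e_i e_j from the circular vectors e = (1, \mp i, 0)/\sqrt{2}. This explicit basis, and the use of an active rotation \epsilon \to R \epsilon R^T, are assumptions made here for illustration; with the opposite rotation convention the two labels \pm simply swap.

```python
import cmath
import math

def rotation_about_z(theta):
    """3x3 matrix for an active rotation by theta about the z-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rotate_tensor(R, eps):
    """Active rotation of a rank-2 tensor: eps'_{ij} = R_{ia} R_{jb} eps_{ab}."""
    return [[sum(R[i][a] * R[j][b] * eps[a][b] for a in range(3) for b in range(3))
             for j in range(3)] for i in range(3)]

def polarization(sign):
    """eps_{ij} = e_i e_j with e = (1, -sign*i, 0)/sqrt(2), for k along z (assumed basis)."""
    e = [1.0 / math.sqrt(2.0), -sign * 1j / math.sqrt(2.0), 0.0]
    return [[e[i] * e[j] for j in range(3)] for i in range(3)]

def helicity_phase(sign, theta):
    """The phase exp[(+/-2) i theta] appearing in eq. (2''')."""
    return cmath.exp(2j * sign * theta)

if __name__ == "__main__":
    theta = 0.3
    for sign in (+1, -1):
        eps = polarization(sign)
        rotated = rotate_tensor(rotation_about_z(theta), eps)
        phase = helicity_phase(sign, theta)
        # Largest deviation between the rotated tensor and phase * eps.
        print(sign, max(abs(rotated[i][j] - phase * eps[i][j])
                        for i in range(3) for j in range(3)))
```

The same tensors are also traceless (e \cdot e = 0) and transverse (they have no z components), which is the content of eqs. (2′) and (2”) for a single Fourier mode.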

Gauge-Invariance     Next, by gauge-invariance, I mean here that, under an infinitesimal change in coordinates

(3)     x^\alpha \to x^\alpha + \xi^\alpha ,

the TT character of this gravitational wave ensures it remains unaltered:

(3′)     h_{ij}^{\text{TT}} \to h_{ij}^{\text{TT}} .

Now, this gauge-invariance is often invoked as a criterion for physical observability: for, if some observable is expressed in terms of the gauge dependent components h_{\mu\nu} in eq. (1), how does one know if the physical effect at hand cannot be rendered trivial simply by choosing an infinitesimally different coordinate system? However, the main point of this post is — the converse most certainly does not hold:

Gauge invariance does not imply physical observability.

The reason is simple: even though h_{ij}^{\text{TT}} is gauge-invariant, it is acausal. More specifically, within the linearized approximation of General Relativity, this massless spin-2 gravitational wave (GW) admits the solution

(4)     h_{ij}^{\text{TT}}[x] = \int_{\mathbb{R}^{3,1}} d^4 x' G_{ij a'b'}[x-x'] T_{a'b'}[x'];

where G_{ij a'b'} is the Green’s function of the TT GW and T_{ab} is the stress-energy tensor. Through a direct calculation, in arXiv: 1902.03294, Yen-Wei Liu and I showed that G_{ij a'b'}[x,x'] is non-zero outside the past light cone of the observer at x. In other words, the signal h_{ij}^{\text{TT}} receives contributions from portions of T_{a'b'} that are spacelike separated from the observer — and therefore cannot be a standalone observable.

Tidal Forces & GW Strain    So, what is one to make of the formula in eq. (2) then? To this end, we first recall that — if \ell^\mu describes the displacement between a pair of infinitesimally nearby timelike geodesics (a pair of freely-falling test masses, for instance); the fully covariant acceleration a^\mu of this displacement vector is driven by the Riemann tensor:

(5)     a^\mu = -R^\mu_{\phantom{\mu} \nu \alpha\beta} U^\nu \ell^\alpha U^\beta.

(The U^\alpha is the unit norm timelike vector tangent to one of the two geodesics.) In a flat spacetime, the Riemann tensor is exactly zero; i.e., a pair of parallel lines will remain parallel because their relative acceleration is zero. Now, at first order in the perturbation h_{\mu\nu}, both sides of eq. (5) must be gauge-invariant since their ‘background value’ (evaluated on g_{\mu\nu} = \eta_{\mu\nu}) is zero. This in turn means we can choose any gauge we wish. Synchronous gauge, where the perturbations are strictly spatial

(5′)     h_{\mu\nu} d x^\mu d x^\nu \to h_{ij}^{\text{(s)}} d x^i d x^j ,

is particularly pertinent in this context of 2 infinitesimally close-by free falling test masses. For, if the x^0 of the synchronous-gauge coordinate system refers to the proper times of these free falling objects, their spatial coordinates are then automatically time independent, and

(5”)     U^\mu = \delta^\mu_0 .

If we assume the clocks on this pair of test masses are synchronized at some initial time t_0, then one may demonstrate using eq. (5) they will continue to remain so for later times; namely, a^0[x^0 > t_0] = 0 if \ell^0[x^0 = t_0] = 0 = U^\sigma \nabla_\sigma \ell^0[x^0 = t_0]. Employing eq. (5”), the spatial tidal forces described by the geometrically induced relative acceleration are now

(6)     \delta_1 a^i = \delta_1 R_{0i0j} \ell^j + \mathcal{O}[h^2] ;

with the notation \delta_1 R_{0i0j} denoting the 0i0j components of the linearized Riemann tensor.

Within the synchronous gauge, the proper distance between two free falling test masses \vec{Z}_1 and \vec{Z}_2 at a given time t (accurate to first order in perturbations) is

(7)     L_{1 \leftrightarrow 2}[t] = R \left( 1 - \frac{1}{2} \widehat{R}^i \widehat{R}^j \int_0^1 h_{ij}^{\text{(s)}}\left[ t, \vec{Z}_1 + \lambda(\vec{Z}_2-\vec{Z}_1) \right] d \lambda + \mathcal{O}[h^2] \right) , \\ R \equiv |\vec{Z}_1 - \vec{Z}_2|, \qquad \widehat{R}^i = (\vec{Z}_1 - \vec{Z}_2)/R ;

from which, we see that the fractional distortion \delta L[t]/R is

(7′)     \delta L[t]/R = - \frac{1}{2} \widehat{R}^i \widehat{R}^j \int_0^1 h_{ij}^{\text{(s)}}\left[ t, \vec{Z}_1 + \lambda(\vec{Z}_2-\vec{Z}_1) \right] d \lambda + \mathcal{O}[h^2].

Moreover, the linearized Riemann \delta_1 R_{0i0j} in the synchronous gauge reads

(8)     \delta_1 R_{0i0j} = -\frac{1}{2} \partial_0^2 h_{ij}^{\text{(s)}}.

Remember the linearized Riemann is gauge-invariant, so it ought to be possible to re-express \delta_1 R_{0i0j} in terms of gauge-invariant metric perturbation variables. (More on this below.) In fact, what Yen-Wei and I argued in arXiv:1902.03294 was that, in the far zone where (observer-source distance)/(characteristic timescale of source)\gg 1,

(8′)     \delta_1 R_{0i0j} = -\frac{1}{2} \partial_0^2 h_{ij}^{\text{TT}} \qquad \text{(Far zone)}.

Therefore, in frequency space

(8”)     \widetilde{h}_{ij}^{\text{TT}}[\omega,\vec{x}] = \int_{\mathbb{R}} d t e^{i \omega t} h_{ij}^{\text{TT}}[t,\vec{x}] ,

that the linearized Riemann is gauge-invariant allows us to equate (8) and (8′) to conclude — for finite frequencies \omega

(9)     \widetilde{h}_{ij}^{\text{TT}}[\omega,\vec{x}] = \widetilde{h}_{ij}^{\text{(s)}}[\omega,\vec{x}]  \qquad \text{(Far zone)} .
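The restriction to finite frequencies can be seen in a short worked step. Equating (8) and (8′) fixes only the second time derivative of the difference between the two perturbations, so the difference may still contain a piece that is at most linear in time. The coefficient functions a_{ij} and b_{ij} below are names introduced here purely for illustration.

```latex
\begin{aligned}
\partial_0^2 \left( h_{ij}^{\text{(s)}} - h_{ij}^{\text{TT}} \right) &= 0
  && \text{(from (8) and (8'), far zone)} \\
h_{ij}^{\text{(s)}}[t,\vec{x}] - h_{ij}^{\text{TT}}[t,\vec{x}]
  &= a_{ij}[\vec{x}] + t \, b_{ij}[\vec{x}] \\
\widetilde{h}_{ij}^{\text{(s)}}[\omega,\vec{x}] - \widetilde{h}_{ij}^{\text{TT}}[\omega,\vec{x}]
  &= 2\pi \left( a_{ij}[\vec{x}] \, \delta[\omega] - i \, b_{ij}[\vec{x}] \, \delta'[\omega] \right)
  && \text{(using the convention of (8''))}
\end{aligned}
```

The right-hand side of the last line is supported only at \omega = 0, so the two perturbations agree at every finite frequency.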

Important aside     By placing \vec{0} at the center-of-mass of the material source of gravity, in the same far-zone limit, the transverse-traceless GW h_{ij}^{\text{TT}} reduces to

(9′)     h_{ij}^{\text{TT}} \approx \left( P_{ia} P_{jb} - \frac{1}{2} P_{ij} P_{ab} \right) h_{ab}[\text{de Donder}] , \\ P_{ij} = \delta_{ij} - \widehat{r}_i \widehat{r}_j ,  \qquad \widehat{r}_i \equiv x_i/|\vec{x}| , \qquad \widehat{r}_i P_{ij} = 0 .

Namely, the far-zone massless spin-2 GW is the de Donder gauge gravitational perturbation projected locally-in-space transverse to the propagation direction. But the de Donder gauge graviton is in fact causally dependent on the stress tensor; in the far zone, in particular,

(9”)     h_{ij}[\text{de Donder}] \approx -\frac{4 G_{\text{N}}}{r} \int_{\mathbb{R}^3} d^3 \vec{x}' T_{ij}[t-r+\vec{x}'\cdot\widehat{r},\vec{x}'] . \qquad r \equiv |\vec{x}| .

This means the far zone TT GW in eq. (9′) is causal, even though its full form in eq. (4) is not. The reason is, the acausal portions begin at higher order in 1/r. In the GW literature, the local-in-space projection in eq. (9′), which Ashtekar and Bonga (referenced below) dubbed h_{ij}^{\text{tt}} to distinguish it from h_{ij}^{\text{TT}} in equations (2′) and (2”), is actually the one that is employed, not the transverse-traceless one subject to equations (2′) and (2”). We see that the reason why it is possible to get away with mixing these two distinct notions of transverse-traceless projections is that they coincide when \omega r \gg 1; i.e., in the far zone. (Note: Racz and Ashtekar-Bonga, whose papers can be found below, have correctly complained that the GW literature wrongly mixes ‘tt’ versus ‘TT’.)
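The local-in-space projection of eq. (9′) is purely algebraic at each observer position, which makes it easy to sketch. The function below applies P_{ia} P_{jb} - \frac{1}{2} P_{ij} P_{ab} to an arbitrary symmetric 3x3 array; the function name and the sample numbers are hypothetical, and the input stands in for h_{ab}[\text{de Donder}] at a single point.

```python
import math

def tt_project(h, x):
    """Apply (P_ia P_jb - P_ij P_ab / 2) to a symmetric 3x3 array h.

    P_ij = delta_ij - rhat_i rhat_j, with rhat = x / |x| the unit radial vector.
    """
    r = math.sqrt(sum(c * c for c in x))
    n = [c / r for c in x]
    P = [[(1.0 if i == j else 0.0) - n[i] * n[j] for j in range(3)] for i in range(3)]
    # First term: P_ia h_ab P_bj (P is symmetric).
    PhP = [[sum(P[i][a] * h[a][b] * P[b][j] for a in range(3) for b in range(3))
            for j in range(3)] for i in range(3)]
    # Trace of h over the plane transverse to rhat: P_ab h_ab.
    trace = sum(P[a][b] * h[a][b] for a in range(3) for b in range(3))
    return [[PhP[i][j] - 0.5 * P[i][j] * trace for j in range(3)] for i in range(3)]

if __name__ == "__main__":
    h = [[1.0, 2.0, 3.0], [2.0, 4.0, 5.0], [3.0, 5.0, 6.0]]  # arbitrary sample values
    # Observer on the z-axis: only the x-y block survives, with its trace removed.
    for row in tt_project(h, (0.0, 0.0, 1.0)):
        print(row)
```

Because this projector is built from \widehat{r} at the observer's own location, it involves no integral over space. That is the sense in which it is local, in contrast to the conditions in eqs. (2′) and (2”), which constrain the field everywhere at once.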

Summary     Let us sum up the discussion within this section. In the far zone, the fractional distortion of the proper distance between the pair of free-falling test masses \vec{Z}_1 and \vec{Z}_2 at a given time t is

(10)     \frac{\delta L[t]}{R} = - \frac{1}{2} \widehat{R}^i \widehat{R}^j \int_0^1 h_{ij}^{\text{TT}}\left[ t, \vec{Z}_1 + \lambda(\vec{Z}_2-\vec{Z}_1) \right] d \lambda + \mathcal{O}[h^2].

This formula is to be understood as valid only for finite frequencies — for instance, LIGO is built to be sensitive to a limited bandwidth centered roughly at 100Hz. Otherwise, equating (8) and (8′), which was what led to eq. (9)-(10), actually misses the initial h_{ij}^{\text{(s)}} and its time derivative; in frequency space these initial conditions correspond to zero-\omega Dirac \delta-function terms. In the limit where the wavelength of the GW is long compared to R, so h_{ij}^{\text{TT}} is approximately constant between \vec{Z}_1 and \vec{Z}_2, eq. (10) then reduces to

(10′)     \delta L[t]/R \approx - \frac{1}{2} \widehat{R}^i \widehat{R}^j h_{ij}^{\text{TT}} + \mathcal{O}[h^2].

This is equivalent to eq. (2); but to arrive at it we have assumed the following.

  • The GW detector is in the far zone.
  • The GW detector is only sensitive to finite gravitational wave frequencies.
  • The GW detector’s proper size is much smaller than the gravitational wavelength.
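The third assumption can be probed numerically by comparing the line integral in eq. (10) with its long-wavelength limit, eq. (10′). The sketch below assumes a monochromatic plus-polarized wave travelling along z, h_{xx} = -h_{yy} = h_+ \cos[\omega(t-z)], in units where the speed of light is 1, and an arm starting at the origin and pointing along (\sin\alpha, 0, \cos\alpha). The waveform, the geometry and the LIGO-like numbers (100 Hz, 4 km arm) are illustrative assumptions, not taken from the paper.

```python
import math

def strain_integrated(t, omega, h_plus, R, alpha, n=2000):
    """Eq. (10) for h_xx = -h_yy = h_plus * cos(omega (t - z)), with c = 1.

    The arm runs from the origin to R * (sin alpha, 0, cos alpha).
    """
    proj = math.sin(alpha) ** 2  # Rhat^i Rhat^j contracted with the plus polarization
    total = 0.0
    for k in range(n):
        lam = (k + 0.5) / n  # midpoint rule in the parameter lambda
        z = lam * R * math.cos(alpha)
        total += h_plus * math.cos(omega * (t - z))
    return -0.5 * proj * total / n

def strain_long_wavelength(t, omega, h_plus, alpha):
    """Eq. (10'): the wave is evaluated at a single point (here the origin)."""
    return -0.5 * math.sin(alpha) ** 2 * h_plus * math.cos(omega * t)

if __name__ == "__main__":
    c = 299_792_458.0            # metres per second
    omega = 2.0 * math.pi * 100  # 100 Hz, roughly where LIGO is most sensitive
    R = 4000.0 / c               # a 4 km arm expressed in seconds
    a = strain_integrated(0.0, omega, 1.0, R, math.pi / 4)
    b = strain_long_wavelength(0.0, omega, 1.0, math.pi / 4)
    print(a, b, abs(a - b) / abs(b))
```

For these numbers \omega R is below 10^{-2}, so the two expressions differ only by a correction of order (\omega R)^2. Raising the frequency or lengthening the arm until \omega R approaches 1 makes the difference visible.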

Dynamical Degrees-Of-Freedom vs. Physical Observables     In field theory speak, one often hears the statement that “4D Einstein-Hilbert gravity has only 2 dynamical degrees-of-freedom”. In its linearized form, we shall see this statement amounts to:

Of all the gauge-invariant variables formed from the metric perturbation h_{\mu\nu} in eq. (1) — the transverse-traceless tensor h_{ij}^{\text{TT}} = D_{ij}; the transverse vector V_i; and the scalars \Psi and \Phi — only the tensor obeys a wave equation.

To build h_{ij}^{\text{TT}} = D_{ij}, V_i, \Psi and \Phi out of the perturbation h_{\mu\nu}, refer to equations (A10), (A15) and (A16) of arXiv: 1611.00018. (Put d=4; remove the over-bars and note that h_{\mu\nu}[\text{here}] = \chi_{\mu\nu}[\text{1611.00018}].) What I wish to highlight here are the (3+1)D version of the equations-of-motion in (A25) and (A26):

(11)     \vec{\nabla}^2 \Phi = 8\pi G_{\text{N}} \rho, \qquad \Phi - \Psi = 16\pi G_{\text{N}} \Upsilon, \\ \vec{\nabla}^2 V_i = 16 \pi G_{\text{N}} \Sigma_i, \qquad \partial^2 h_{ij}^{\text{TT}} = -16\pi G_{\text{N}} \sigma_{ij}.

The transverse-traceless conditions of equations (2′) and (2”) tell us that, of the 3+(3^2-3)/2=6 components of h_{ij}^{\text{TT}}, only 6-3-1=2 are independent. However, despite this “2 d.o.fs” assertion regarding the TT GW, as I have already pointed out above, its solution is acausal and cannot possibly be a standalone physical observable. In eq. (11) the \sigma_{ij} is in fact a non-local functional of the spatial components of the stress tensor: heuristically, T_{ij} is smeared out over all space in such a manner that the resulting object \sigma_{ij} obeys the constraints \partial_i \sigma_{ij} = 0 = \sigma_{ij} \delta^{ij}.
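The component counting at the start of the paragraph above can be written out as a small helper. Extending it from d=4 to a general spacetime dimension d is an extrapolation made here for illustration (the same counting with d-1 spatial dimensions), and the function name is hypothetical.

```python
def tt_independent_components(d):
    """Count independent components of a symmetric, transverse, traceless spatial tensor.

    d is the spacetime dimension, so there are d - 1 spatial dimensions.
    """
    spatial = d - 1
    symmetric = spatial * (spatial + 1) // 2  # e.g. 3 + (3^2 - 3)/2 = 6 for d = 4
    transverse_constraints = spatial          # one condition per free index of eq. (2')
    trace_constraints = 1                     # eq. (2'')
    return symmetric - transverse_constraints - trace_constraints

if __name__ == "__main__":
    for d in (3, 4, 5):
        print(d, tt_independent_components(d))
```

For d=4 this reproduces the 6-3-1=2 of the text; the general result simplifies to d(d-3)/2.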

What, then, is one to make of this acausality; as well as the gauge-invariant content of linearized gravitation? A partial answer is offered by the spatial tidal forces exerted by geometric curvature, encoded within the \delta_1 R_{0i0j} discussed above. Yen-Wei and I showed that, even though the TT GW h_{ij}^{\text{TT}} and its acceleration \partial_0^2 h_{ij}^{\text{TT}} are acausal, the vector V_i and scalars \Phi and \Psi appear in \delta_1 R_{0i0j} in such a way to precisely cancel out the acausal contributions from the tensor; with the end result yielding tidal forces that are strictly causally dependent on the material stress tensor:

(12)     \delta_1 R_{0i0j} = \frac{1}{2} \left( -\frac{1}{2} \partial_i \partial_j \Psi - \delta_{ij} \partial_0^2 \Phi + \partial_0 \partial_{\{ i} V_{j \}} - \partial_0^2 h_{ij}^{\text{TT}} \right) = - \frac{1}{2} \left( \partial_0^2 h_{ij}^{\text{TT}} \right)_{\text{causal}}.

Therefore, the tidal squeezing and stretching of a Weber bar or of a laser interferometer’s arms is not to be attributed to the entire spin-2 massless graviton, because of its acausal character, but only to the causal part of its acceleration. Even in the (quasi-)static limit, where the \Phi, V_i and h_{ij}^{\text{TT}} terms in eq. (12) all appear to become negligible (say, for instance, in the contribution to the tides on Earth due to differential gravitational tugs from either the Moon or the Sun), we should not attribute the rising and ebbing of the oceans to the second derivatives of the Newtonian-like potential \Psi in eq. (11). Rather, on the grounds that physical tidal forces ought to be causal, according to eq. (12) they still have to be attributed to the causal part of the TT tensor perturbation’s acceleration.

Micro-causality in QFT     If you have taken a course on Quantum Field Theory, you might have been told that the amplitude for a particle to propagate from y to x is given by the vacuum expectation value \langle 0 \vert \varphi[x] \varphi[y] \vert 0 \rangle (for scalar particles \varphi). However, a direct calculation for non-interacting scalars would reveal this object is non-zero for spacelike separated x and y; i.e., a particle has a non-zero quantum mechanical amplitude to propagate outside the light cone. See discussion in \S 2.4 of Peskin and Schroeder (P&S) for instance. P&S goes on to assert

To really discuss causality, however, we should ask not whether particles can propagate over spacelike intervals, but whether a measurement performed at one point can affect a measurement at another point whose separation from the first is spacelike. The simplest thing we could try to measure is the field \phi(x), so we should compute the commutator [\phi(x),\phi(y)]; if this commutator vanishes, one measurement cannot affect the other. In fact, if the commutator vanishes for (x-y)^2 < 0, causality is preserved quite generally, … [truncated] — Chapter 2, page 28

At the quantum level, are the transverse massless spin-1 photon A_i^\text{T} (subject to \partial_i A_i^{\text{T}} = 0) or spin-2 graviton field h_{ij}^{\text{TT}} physically observable? That is, can we perform a direct measurement on them? P&S does not tell us. However, although the commutators of free scalar fields vanish outside the light cone (i.e., they obey micro-causality), those of the helicity-1 photons and helicity-2 gravitons do not:

(13)     \left[ A_i^{\text{T}} [x], A_j^{\text{T}} [y] \right] \neq 0, \qquad \qquad (x-y)^2 < 0 ;

(13′)     \left[ h_{ij}^{\text{TT}}[x], h_{ab}^{\text{TT}}[y] \right] \neq 0, \qquad \qquad (x-y)^2 < 0 .

This is simply because their commutators are proportional to the difference between their corresponding retarded and advanced Green’s functions. As already alluded to after eq. (4), these retarded/advanced transverse Green’s functions are in fact non-zero outside the light cone. I end with the following question:

Can this violation of micro-causality by massless spin-1 photons be exploited within a physical setup?
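One textbook way to get a concrete handle on this non-locality is the equal-time commutator between the transverse photon field and its time derivative in Coulomb gauge. The sketch below is the standard textbook result rather than something derived in this post; the overall sign and factors of i depend on conventions, and \delta^{\text{T}}_{ij} (the "transverse delta function") is notation introduced here for illustration.

```latex
\left[ A_i^{\text{T}}[t,\vec{x}], \partial_t A_j^{\text{T}}[t,\vec{y}] \right]
  = i \, \delta^{\text{T}}_{ij}[\vec{x}-\vec{y}] ,
\qquad
\delta^{\text{T}}_{ij}[\vec{z}]
  \equiv \int \frac{d^3\vec{k}}{(2\pi)^3} e^{i\vec{k}\cdot\vec{z}}
         \left( \delta_{ij} - \frac{k_i k_j}{\vec{k}^2} \right)
  = \delta_{ij} \delta^{(3)}[\vec{z}]
    + \partial_i \partial_j \frac{1}{4\pi|\vec{z}|} ;
\\
\delta^{\text{T}}_{ij}[\vec{z} \neq \vec{0}]
  = \frac{3 \widehat{z}_i \widehat{z}_j - \delta_{ij}}{4\pi |\vec{z}|^3} ,
\qquad \widehat{z}_i \equiv z_i/|\vec{z}| .
```

The second line is non-zero at any finite equal-time (hence spacelike) separation and falls off only as 1/|\vec{z}|^3. It is the same kind of transverse-projection smearing that makes \sigma_{ij} in eq. (11) a non-local functional of T_{ij}.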

Note added: I forgot to mention an interesting related discussion that took place over at Distler’s blog regarding micro-causality. My sense is, he knows a whole lot more than I do — but, sadly, he has closed his comments section for the post.


  • I. Racz, “Gravitational radiation and isotropic change of the spatial geometry,” arXiv:0912.0128 [gr-qc]
  • A. Ashtekar and B. Bonga, “On the ambiguity in the notion of transverse traceless modes of gravitational waves,” Gen. Rel. Grav. 49, no. 9, 122 (2017) doi:10.1007/s10714-017-2290-z [arXiv:1707.09914 [gr-qc]]
  • A. Ashtekar and B. Bonga, “On a basic conceptual confusion in gravitational radiation theory,” Class. Quant. Grav. 34, no. 20, 20LT01 (2017) doi:10.1088/1361-6382/aa88e2 [arXiv:1707.07729 [gr-qc]]
  • Y. Z. Chu and Y. W. Liu, “The Transverse-Traceless Spin-2 Gravitational Wave Cannot Be A Standalone Observable Because It Is Acausal,” arXiv:1902.03294 [gr-qc].
  • Y. Z. Chu, “More On Cosmological Gravitational Waves And Their Memories,” Class. Quant. Grav. 34, no. 19, 194001 (2017) doi:10.1088/1361-6382/aa8392 [arXiv:1611.00018 [gr-qc]].
  • S. Weinberg, “Photons and gravitons in perturbation theory: Derivation of Maxwell’s and Einstein’s equations,” Phys. Rev. 138, B988 (1965).

Linear Displacement Gravitational Wave Memory

There has been a recent surge of interest in the phenomenon of gravitational memory, likely due to investigations undertaken by Andrew Strominger’s group at Harvard — see here for a pedagogical treatment — linking memory to symmetries and their corresponding Ward identities in the “soft limit” (i.e., where the gravitational signals’ frequencies are low/wavelengths are long). Gravitational memory itself was first discovered (as I understand it) by Zel’dovich and Polnarev: there is a permanent distortion of space due to stars scattering off each other on unbound trajectories.

I first stumbled upon this phenomenon myself while examining the causal structure of gravitational waves, namely how they propagate both on and within the null cone, in cosmological spacetimes. I found that the portion of gravitational waves (GWs) that travel inside the light cone (aka its “tail”) does not always decay with increasing distance from its source. This leads to a novel, albeit tiny, tail-induced gravitational memory that has no counterpart in the flat spacetime limit.

In this post, I will focus on linear displacement gravitational memory. As we shall see, this is the permanent distortion of space due to the passage of a primary GW train, induced directly by the matter source itself. As discovered by Christodoulou and independently by Blanchet and Damour, there is also a contribution to GW memory from the stress-energy of the GWs themselves, which is dubbed “nonlinear memory”; I hope to discuss this in a later post.

Synchronous gauge     To this end, we shall work in the synchronous gauge, because it allows us to readily discuss the proper geodesic length between two freely-falling observers at a given time. In particular, synchronous gauge refers to the coordinate system where the metric has no time-time and time-space components, namely,

(1):     ds^2 = dt^2 + g_{ij} dx^i dx^j .

The interpretation is that spacetime is foliated by the worldlines of freely falling timelike trajectories (i.e., spatial point-like “observers”), with proper time t and spatial trajectories \vec{x}. This interpretation may be confirmed by verifying that co-moving timelike worldlines Z^\mu with time-independent spatial components,

(2):     Z^\mu = (t, Z^i), \qquad\qquad Z^i = \text{constant}

in fact satisfy the geodesic equation automatically.
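For completeness, here is a short sketch of that verification, using only the standard Christoffel-symbol formula and the synchronous gauge conditions g_{00} = 1 and g_{0i} = 0 read off from eq. (1):

```latex
\frac{dZ^\mu}{dt} = \delta^\mu_0 , \qquad \frac{d^2 Z^\mu}{dt^2} = 0 ;
\\
\Gamma^\mu_{\phantom{\mu}00}
  = \frac{1}{2} g^{\mu\nu} \left( 2 \partial_0 g_{\nu 0} - \partial_\nu g_{00} \right) = 0
  \qquad \text{since } g_{00} = 1 , \ g_{0i} = 0 ;
\\
\frac{d^2 Z^\mu}{dt^2}
  + \Gamma^\mu_{\phantom{\mu}\alpha\beta} \frac{dZ^\alpha}{dt} \frac{dZ^\beta}{dt}
  = \Gamma^\mu_{\phantom{\mu}00} = 0 ,
\qquad
g_{\mu\nu} \frac{dZ^\mu}{dt} \frac{dZ^\nu}{dt} = g_{00} = 1 .
```

The last equality confirms that t is the proper time along each co-moving worldline, regardless of the form of g_{ij}.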

What Is Displacement Gravitational Wave Memory?     Consider a pair of test masses (Z_1,Z_2) co-moving in a weakly curved spacetime,

(3):     ds^2 = dt^2 - (\delta_{ij} - h_{ij}) dx^i dx^j , \qquad \qquad|h_{ij}| \ll 1.

As we have discussed in a previous post, we may use Synge’s world function — which also defines the action for affinely-parametrized geodesics — to express the proper geodesic spatial distance L[t] between \vec{Z}_1 and \vec{Z}_2 at time t.

(4):     L[t] \approx R \left( 1 - \frac{1}{2} \widehat{R}^i \widehat{R}^j \int_0^1 h_{ij}\left[t, \vec{Z}_1 + \lambda (\vec{Z}_2-\vec{Z}_1)\right] d\lambda + \mathcal{O}[h^2] \right) ;


(5):     \widehat{R}^i \equiv \frac{Z_1^i - Z_2^i}{R}, \qquad \qquad R \equiv |\vec{Z}_1-\vec{Z}_2| .
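As a sanity check on eq. (4), here is a minimal numerical sketch. It compares the first-order formula against the length of the straight coordinate segment joining the two masses, computed with the full spatial metric \delta_{ij} - h_{ij} of eq. (3). The Gaussian profile in `h`, the function names and the test-mass positions are all hypothetical choices made for illustration. The true geodesic deviates from the straight segment at order h, but because geodesic length is stationary this only affects the result at order h^2.

```python
import math


def h(x, eps):
    """Hypothetical static perturbation h_ij[x] with a Gaussian profile."""
    profile = math.exp(-(x[0] ** 2 + x[1] ** 2 + x[2] ** 2))
    return [[eps * profile, 0.0, 0.0],
            [0.0, -eps * profile, 0.0],
            [0.0, 0.0, 0.0]]


def lengths(z1, z2, eps, n=2000):
    """Return (length with full metric, first-order length of eq. (4))."""
    d = [b - a for a, b in zip(z1, z2)]
    R = math.sqrt(sum(c * c for c in d))
    rhat = [c / R for c in d]
    full = 0.0
    first_order = 0.0
    for s in range(n):
        lam = (s + 0.5) / n                      # midpoint rule in lambda
        x = [a + lam * c for a, c in zip(z1, d)]
        hx = h(x, eps)
        u = sum(rhat[i] * rhat[j] * hx[i][j]
                for i in range(3) for j in range(3))
        full += math.sqrt(1.0 - u) / n           # sqrt((delta - h) Rhat Rhat)
        first_order += (1.0 - 0.5 * u) / n       # integrand of eq. (4)
    return R * full, R * first_order


z1, z2 = (-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)
for eps in (1e-2, 2e-2):
    full, first_order = lengths(z1, z2, eps)
    print(eps, full, first_order, full - first_order)
```

Doubling eps roughly quadruples the difference between the two lengths, as expected of an \mathcal{O}[h^2] remainder, so the first-order expression in eq. (4) is all that is needed in what follows.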

Therefore, there is a permanent distortion \Delta L \equiv L[t \to +\infty] - L[t \to -\infty], i.e., gravitational memory, if there is a non-trivial “DC-shift” of the gravitational perturbation between the two test masses over a large time period enveloping the duration of the primary GW train. The fractional distortion, in particular, is

(6):     \Delta L/R = - \frac{1}{2} \widehat{R}^i \widehat{R}^j \int_0^1 \Delta h_{ij}\left[\vec{Z}_1 + \lambda (\vec{Z}_2-\vec{Z}_1)\right] d\lambda ,


(6′):     \Delta h_{ij}\left[\vec{z}\right] \equiv h_{ij}\left[t \to +\infty, \vec{z}\right] - h_{ij}\left[t \to -\infty, \vec{z}\right] .

By setting up pairs of test masses with different orientations \widehat{R}^i, one can probe the full pattern of GW memory encoded within \Delta h_{ij}. In other words, the distortion of space is generically anisotropic.
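To illustrate the anisotropy, here is a minimal sketch that evaluates the fractional distortion for several orientations. It assumes, purely for illustration, that \Delta h_{ij} is uniform between the two test masses, so the \lambda-integral is trivial, and that it has a "plus"-like pattern of hypothetical amplitude A; the function name is likewise hypothetical.

```python
import math


def fractional_memory(direction, delta_h):
    """-(1/2) Rhat^i Rhat^j Delta h_ij for a spatially uniform Delta h_ij."""
    norm = math.sqrt(sum(c * c for c in direction))
    rhat = [c / norm for c in direction]
    return -0.5 * sum(rhat[i] * rhat[j] * delta_h[i][j]
                      for i in range(3) for j in range(3))


A = 1e-21                                  # hypothetical memory amplitude
delta_h = [[A, 0.0, 0.0],
           [0.0, -A, 0.0],
           [0.0, 0.0, 0.0]]                # assumed "plus"-like pattern

for name, direction in [("x", (1, 0, 0)), ("y", (0, 1, 0)),
                        ("z", (0, 0, 1)), ("diagonal", (1, 1, 0))]:
    print(name, fractional_memory(direction, delta_h))
```

With this pattern, a pair of masses separated along x ends up closer together, a pair along y ends up further apart, and pairs along z or along the x-y diagonal are unaffected at this order.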

Linear GW memory     Linear memory arises directly due to the matter source itself. Let r be the spatial distance between the observer and the center-of-mass of the said GW source. At finite frequencies \{ \omega \} and for non-self-gravitating sources, the synchronous gauge metric perturbation in the far zone |\omega|r \gg 1 and at first order in G_\text{N} reads

(7):     h_{ij} = h^{tt}_{ij} = -\left( P_{ia} P_{jb} - \frac{1}{2} P_{ij} P_{ab} \right) \frac{4 G_{\text{N}}}{r} \int_{\mathbb{R}^3} d^3 \vec{x}' T_{ab}[t-r+\vec{x}'\cdot\vec{x}/r, \vec{x}'] ,

with the projector

(7′):     P_{ij} \equiv \delta_{ij} - \widehat{r}_i \widehat{r}_j, \qquad \widehat{r}^i \equiv x^i/r.

Here, T_{ij} are the spatial components of the matter stress-energy tensor. (Nonlinear memory would involve that of the GWs themselves.) If the source(s) were self-gravitating (e.g., binary systems held together by their mutual gravitational pull), then we may instead phrase the result in terms of the acceleration of the system’s quadrupole moment, at least in the non-relativistic limit:

(7”):     h_{ij} = h^{tt}_{ij} = -\left( P_{ia} P_{jb} - \frac{1}{2} P_{ij} P_{ab} \right) \frac{2 G_{\text{N}}}{r} \ddot{Q}_{ab}[t-r],


(7”’):    Q_{ab}[s] \equiv \int_{\mathbb{R}^3} d^3 \vec{x}'  x'^a x'^b T_{00}[s, \vec{x}'] .

Because t-r+\vec{x}'\cdot\vec{x}/r is essentially the retarded time (up to relativistic corrections), what eq. (7) (or, eq. (7”)) inserted into eq. (6) teaches us is that:

Since linear GWs propagate on the null cone in 4D Minkowski, the corresponding memory is really a probe of the difference in the asymptotic — i.e., far future versus far past — configurations of the matter source itself.

Causal Structure
Gravitational memory measured by the GW detector (world-line on the left) is the difference between the gravitational perturbation at C and that at A; for linear memory, this in turn probes the difference between the matter configuration (world-tube on the right) at C’ and that at A’. The dashed segment on the matter world-tube is the dominant duration of gravitational radiation production. Figure from arXiv: 1611.00018.
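To see this statement at work, here is a minimal sketch of the linear memory from a non-relativistic scattering event, based on eq. (7”). It rests on assumptions that are not spelled out in this post: the bodies are point masses that move freely in the far past and far future, so that \ddot{Q}_{ab} \to 2 \sum_A m_A v_A^a v_A^b asymptotically; units are such that c = 1; and the function names and numerical values are hypothetical.

```python
def tt_apply(n, S):
    """(P_ia P_jb - 1/2 P_ij P_ab) S_ab with P_ij = delta_ij - n_i n_j.

    n must be a unit vector pointing from the source to the observer.
    """
    idx = range(3)
    P = [[(1.0 if i == j else 0.0) - n[i] * n[j] for j in idx] for i in idx]
    trace_PS = sum(P[a][b] * S[a][b] for a in idx for b in idx)
    return [[sum(P[i][a] * S[a][b] * P[j][b] for a in idx for b in idx)
             - 0.5 * P[i][j] * trace_PS
             for j in idx] for i in idx]


def linear_memory(masses, v_in, v_out, n, r, G=1.0):
    """Delta h_ij from eq. (7'') using the asymptotic free-particle limit."""
    idx = range(3)
    # Change in sum_A m_A v^a v^b between the far future and the far past
    dS = [[sum(m * (vo[a] * vo[b] - vi[a] * vi[b])
               for m, vi, vo in zip(masses, v_in, v_out))
           for b in idx] for a in idx]
    projected = tt_apply(n, dS)
    return [[-(4.0 * G / r) * projected[i][j] for j in idx] for i in idx]


# Two equal masses approach along x and leave along y (90-degree deflection),
# observed from the z direction at distance r.
masses = [1.0, 1.0]
v_in = [(0.1, 0.0, 0.0), (-0.1, 0.0, 0.0)]
v_out = [(0.0, 0.1, 0.0), (0.0, -0.1, 0.0)]
delta_h = linear_memory(masses, v_in, v_out, n=(0.0, 0.0, 1.0), r=100.0)
for row in delta_h:
    print(row)
```

Only the initial and final velocities enter: the details of the encounter itself drop out, and if the bodies merely reverse their velocities the memory computed this way vanishes.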

Cosmology     In a spatially flat FLRW spacetime, namely

(8):     g_{\mu\nu} = a[\eta]^2 \eta_{\mu\nu} , \qquad x^\mu \equiv (\eta,\vec{x}) ,

I was able to show in arXiv:1504.06337 that the null cone portion of the massless scalar Green’s function takes a universal form, namely the Minkowski Green’s function divided by one power of the scale factor at the observer time and another at the emission time:

(9):     G^{\text{(light cone)}}_{\text{4D FLRW}}[x,x'] = \frac{\delta\left[ \eta-\eta' - |\vec{x}-\vec{x}'| \right]}{4\pi a[\eta] a[\eta'] |\vec{x}-\vec{x}'|} .

Now, in spatially flat cosmologies driven by perfect fluids, GWs obey a massless scalar wave equation. I also estimated that, for the most part, GW tails in cosmology are highly suppressed unless the time-duration of and observer distance to the source are of cosmological time/length scales. Altogether, these imply the GW memories known in 4D Minkowski spacetime should carry over to spatially flat FLRW, except for the redshift (from the 1/a[\eta]) due to cosmic expansion.


  • Y. B. Zel’dovich and A. G. Polnarev, Astron. Zh. 51, 30 (1974) [Sov. Astron. 18, 17 (1974)].
  • A. Strominger, “Lectures on the Infrared Structure of Gravity and Gauge Theory,” arXiv:1703.05448 [hep-th].
  • D. Christodoulou, “Nonlinear nature of gravitation and gravitational-wave experiments,” Phys. Rev. Lett. 67, 1486 (1991)
  • Blanchet and Damour, 1989.
  • Y.Z. Chu, “Transverse traceless gravitational waves in a spatially flat FLRW universe: Causal structure from dimensional reduction,” Phys. Rev. D 92, no. 12, 124038 (2015) doi:10.1103/PhysRevD.92.124038 [arXiv:1504.06337 [gr-qc]].