1. Quantum Mechanics#

The goal of this chapter is to cover the basics of quantum mechanics, i.e., the quantum theory of particles.

1.1. What is a Quantum State?#

Quantum theory postulates that any physical state (of the world) can be represented by a ray in some complex Hilbert space. It’s worth noting that it is the state, rather than the Hilbert space, that we actually care about. Let’s write a state as

\[[\Psi] \coloneqq \{ c\Psi ~|~ c \in \Cbb \setminus 0 \}\]

where \(\Psi\) is a nonzero vector in the Hilbert space. It is, however, rather inconvenient to have to deal with \([\Psi]\) all the time. So instead, we will almost always pick a representative \(\Psi\), often out of a natural choice, and call it a state vector, and keep in mind that anything physically meaningful must not be sensitive to a scalar multiplication.

Assumption

Throughout this post we always assume that state vectors are normalized so that \(||\Psi|| = 1\).

In fact, we don’t really care about the states themselves either, because they are more of an abstraction than something one can physically measure. What we do care about are the (Hermitian) inner products between state vectors, denoted by \((\Psi, \Phi)\). According to the so-called Copenhagen interpretation of quantum mechanics, such an inner product represents an amplitude, i.e., its squared modulus gives the probability of finding a state \([\Psi]\) in \([\Phi]\) if we ever perform a measurement. We can write this statement as an equation as follows

\[P([\Psi] \to [\Phi]) = |(\Psi, \Phi)|^2\]

In particular, the probability of finding any state in itself is one, due to the normalization above.
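As a quick numerical illustration of the probability rule, here is a minimal sketch in \(\Cbb^2\). The helper names `inner`, `normalize` and `prob` are made up for this example, and the inner product is assumed to be conjugate-linear in its first argument, a convention the text does not fix.

```python
import cmath
import math

def inner(psi, phi):
    """Hermitian inner product (psi, phi), conjugate-linear in psi (assumed convention)."""
    return sum(a.conjugate() * b for a, b in zip(psi, phi))

def normalize(psi):
    """Rescale psi to unit norm."""
    norm = math.sqrt(inner(psi, psi).real)
    return [a / norm for a in psi]

def prob(psi, phi):
    """Transition probability P([psi] -> [phi]) = |(psi, phi)|^2 for unit vectors."""
    return abs(inner(psi, phi)) ** 2

psi = normalize([1 + 2j, 3 - 1j])
phi = normalize([2 - 1j, 1j])

# Multiplying a representative by a phase must not change anything measurable.
phase = cmath.exp(0.7j)
psi_rephased = [phase * a for a in psi]

p_original = prob(psi, phi)
p_rephased = prob(psi_rephased, phi)
p_self = prob(psi, psi)  # probability of finding a state in itself
```

Here `p_original` and `p_rephased` agree up to rounding, which is the statement that the probability depends only on the rays \([\Psi]\) and \([\Phi]\), and `p_self` equals one because of the normalization.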

1.2. What is a Symmetry?#

We start with a symmetry transformation, by which we mean a transformation that preserves all quantities that one can ever measure about a system. Since it is the probabilities, rather than the states themselves, that are measurable, one is led to define a quantum symmetry transformation as a transformation of states \(T\) such that

(1.2.1)#\[P(T[\Psi] \to T[\Phi]) = P([\Psi] \to [\Phi])\]

for any states \([\Psi]\) and \([\Phi]\). Now a theorem of E. Wigner asserts that such \(T\) can be realized either as a linear unitary or as an anti-linear anti-unitary transformation \(U = U(T)\) of state vectors in the sense that \([U\Psi] = T[\Psi]\) for any \(\Psi\). In other words, \(U\) satisfies either

\[U(c\Psi) = cU\Psi, \quad (U\Psi, U\Phi) = (\Psi, \Phi)\]

or

\[U(c\Psi) = c^{\ast} U\Psi, \quad (U\Psi, U\Phi) = (\Psi, \Phi)^{\ast}\]

where \(c\) is any complex number.
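To make the two alternatives concrete, the following sketch compares a linear unitary map with an anti-linear anti-unitary one on \(\Cbb^2\). Both maps are arbitrary choices made for illustration (the names `apply_unitary` and `apply_antiunitary` are not from the text), and the inner product is again assumed to be conjugate-linear in its first argument.

```python
import math

def inner(psi, phi):
    """Hermitian inner product, conjugate-linear in the first argument (assumed)."""
    return sum(a.conjugate() * b for a, b in zip(psi, phi))

def apply_unitary(psi, angle=0.3):
    """A linear unitary map: a real rotation followed by a phase i on the second component."""
    c, s = math.cos(angle), math.sin(angle)
    return [c * psi[0] - s * psi[1], 1j * (s * psi[0] + c * psi[1])]

def apply_antiunitary(psi):
    """An anti-linear anti-unitary map: complex conjugation followed by the unitary above."""
    return apply_unitary([a.conjugate() for a in psi])

psi = [0.6 + 0j, 0.8j]
phi = [(1 + 1j) / 2, (1 - 1j) / 2]

ip = inner(psi, phi)
ip_unitary = inner(apply_unitary(psi), apply_unitary(phi))
ip_anti = inner(apply_antiunitary(psi), apply_antiunitary(phi))
```

The unitary map reproduces `ip`, the anti-unitary one reproduces its complex conjugate, and in both cases the modulus, hence the transition probability, is unchanged.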

Proof of Wigner’s theorem

The construction of a realization \(U\) of \(T\) takes the following steps.

Step 1:

Fix an orthonormal basis \(\Psi_i, i \in \Nbb\), of the Hilbert space.

Step 2:

For each \(\Psi_i\), choose a unit vector \(\Psi'_i\) such that \(T[\Psi_i] = [\Psi'_i]\). Then \(\Psi'_i, i \in \Nbb\), also form an orthonormal basis by Eq.1.2.1. We’d like to define \(U\) by asking

\[U\Psi_i = \Psi'_i\]

for all \(i\), and extend by (anti-)linearity. But this is not going to realize \(T\) in general because we haven’t fixed the extra degrees of freedom – the phases of \(\Psi'_i\).

Step 3:

We fix the phases of \(\Psi'_k, k \geq 1\), relative to \(\Psi'_0\), by asking

\[T[\Psi_0 + \Psi_k] = [\Psi'_0 + \Psi'_k]\]

To see why this is possible, note first that \(T[\Psi_0 + \Psi_k] = [\alpha \Psi'_0 + \beta \Psi'_k]\), where \(\alpha, \beta\) are phase factors, due to Eq.1.2.1 and the basis being orthonormal. Now \([\alpha \Psi'_0 + \beta \Psi'_k] = [\Psi'_0 + (\beta/\alpha) \Psi'_k]\) and we can absorb the phase \(\beta/\alpha\) into the definition of \(\Psi'_k\). This is indeed the best one can do, because the one remaining degree of freedom, which is to multiply all \(\Psi'_i\) by a common phase, cannot be fixed.

Step 4:

We have so far specified the value of \(U\) on all of \(\Psi_i, i \geq 0\), and \(\Psi_0 + \Psi_k, k \geq 1\). Notice that all the coefficients of the \(\Psi_i\) that have appeared so far are real. It is therefore instructive to ask what \(U(\Psi_0 + \ifrak \Psi_1)\) should be. By the same argument as in the previous step, we can write

\[T[\Psi_0 + \ifrak \Psi_1] = [\Psi'_0 + c \Psi'_1]\]

where \(c\) is a phase. Let’s apply Eq.1.2.1 once again as follows

\[\begin{split}\sqrt{2} &= \left| \left( [\Psi_0 + \ifrak \Psi_1], [\Psi_0 + \Psi_1] \right) \right| \\ &= \left| \left( T[\Psi_0 + \ifrak \Psi_1], T[\Psi_0 + \Psi_1] \right) \right| \\ &= \left| \left( [\Psi'_0 + c \Psi'_1], [\Psi'_0 + \Psi'_1] \right) \right| \\ &= |1 + c|\end{split}\]

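Since \(c\) is a phase, we can write \(c = \exp(\ifrak \varphi)\), so that \(|1 + c|^2 = 2 + 2 \cos\varphi = 2\) forces \(\cos\varphi = 0\). It follows that \(c = \pm \ifrak\), which correspond to \(U\) being (complex) linear or anti-linear, respectively.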

At this point, we can extend \(U\) to either a linear or anti-linear map of the Hilbert space. But we’ll not be bothered about any further formal argument, including showing that (anti-)linearity must be coupled with (anti-)unitarity, respectively.

Note

The adjoint of a linear operator \(A\) is another linear operator \(A^{\dagger}\) such that

\[(\Psi, A\Phi) = (A^{\dagger} \Psi, \Phi)\]

for any two state vectors \(\Psi\) and \(\Phi\). On the other hand, the adjoint of an anti-linear operator \(A\) is another anti-linear operator \(A^{\dagger}\) such that

\[(\Psi, A\Phi) = (A^{\dagger} \Psi, \Phi)^{\ast}\]

An (anti-)unitary operator \(U\) thus satisfies \(U^{\dagger} = U^{-1}\).

In general we’re not interested in just one symmetry transformation, but rather a group – whether continuous or discrete – of symmetry transformations, or just symmetry for short. In particular, if \(T_1, T_2\) are two symmetry transformations, then we’d like \(T_2 T_1\) to also be a symmetry transformation. In light of the \(U\)-realization of symmetry transformations discussed above, we can rephrase this condition as

(1.2.2)#\[U(T_2 T_1) \Psi = \exp(\ifrak \theta(T_1, T_2, \Psi)) U(T_2) U(T_1) \Psi\]

where \(\ifrak = \sqrt{-1}\), and \(\theta(T_1, T_2, \Psi)\) is an angle, which depends a priori on \(T_1, T_2\), and \(\Psi\).

It turns out, however, that the angle \(\theta(T_1, T_2, \Psi)\) cannot depend on the state. Indeed, if we apply Eq.1.2.2 to the sum of two linearly independent state vectors \(\Psi_A + \Psi_B\), then we’ll find

\[\exp(\pm \ifrak \theta(\Psi_A)) \Psi_A + \exp(\pm \ifrak \theta(\Psi_B)) \Psi_B = \exp(\pm \ifrak \theta(\Psi_A + \Psi_B)) (\Psi_A + \Psi_B)\]

where we have suppressed the dependency of \(\theta\) on \(T\), and the signs correspond to the cases of \(U\) being linear or anti-linear, respectively. In any case, it follows that

\[\exp(\pm \ifrak \theta(\Psi_A)) = \exp(\pm \ifrak \theta(\Psi_B)) = \exp(\pm \ifrak \theta(\Psi_A + \Psi_B))\]

which says precisely that \(\theta\) is independent of \(\Psi\).

Todo

While the argument here appears to be purely mathematical, Weinberg pointed out in [Wei95] (page 53) the potential inability to create a state like \(\Psi_A + \Psi_B\). More precisely, he mentioned the general belief that it’s impossible to prepare a superposition of two states, one with integer total angular momentum and the other with half-integer total angular momentum, in which case there will be a “super-selection rule” between different classes of states. After all, one Hilbert space may just not be enough to describe all states. It’d be nice to elaborate a bit more on the super-selection rules.

We can now simplify Eq.1.2.2 to the following

\[U(T_2 T_1) = \exp(\ifrak \theta(T_1, T_2)) U(T_2) U(T_1)\]

which, in mathematical terms, says that \(U\) furnishes a projective representation of \(T\), or a representation up to a phase. It becomes a genuine representation if the phase is constantly one.

Assumption

We will assume that \(U\) furnishes a genuine representation of \(T\) unless otherwise stated, because it’s simpler and will suffice for most scenarios of interest.

1.2.1. Continuous symmetry#

Besides a handful of important discrete symmetries such as time reversal, charge conjugation, and parity, most of the interesting symmetries come in continuous families, mathematically known as Lie groups. Note that continuous symmetries are necessarily unitary (and linear) because they can be continuously deformed into the identity, which is obviously unitary.

In fact, it will be of great importance to just look at the symmetry up to the first order at the identity transformation, mathematically known as the Lie algebra. Let \(\theta\) be an element in the Lie algebra such that \(T(\theta) = 1 + \theta\) up to the first order. We can expand \(U(T(\theta))\) in a power series as follows

(1.2.3)#\[U(T(\theta)) = 1 + \ifrak \theta^a u_a + \tfrac{1}{2} \theta^a \theta^b u_{ab} + \cdots\]

where \(\theta^a\) are the (real) components of \(\theta\), and \(u_a\) are operators independent of \(\theta\), and as a convention, repeated indexes are summed up. Here we put a \(\ifrak\) in front of the linear term so that the unitarity of \(U\) implies that \(u_a\) are Hermitian.

Note

We’ve implicitly used a summation convention in writing Eq.1.2.3 that the same upper and lower indexes are automatically summed up. For example

\[\theta^a \theta^b u_{ab} \equiv \sum_{a, b} \theta^a \theta^b u_{ab}\]

This convention will be used throughout this note, unless otherwise specified.

Another noteworthy point is how one writes matrix or tensor elements using indexes. The point is that the indexes must come in certain order. This wouldn’t really cause a problem if all indexes are lower or upper. However, care must be taken when both lower and upper indexes appear. For example, an element written as \(M^a_b\) would be ambiguous as it’s unclear whether it refers to \(M_{ab}\) or \(M_{ba}\) assuming that one can somehow raise/lower the indexes. To avoid such ambiguity, one writes either \({M^a}_b\) or \({M_b}^a\).

This is a particularly convenient convention when dealing with matrix or tensor multiplications. For example, one can multiply two matrices as follows

\[{M^a}_b {N^b}_c = {(MN)^a}_c\]

while \({M^a}_b {N_c}^b\), though still summed up over \(b\), wouldn’t correspond to a matrix multiplication.

Now let \(\eta\) be another element of the Lie algebra, and expand both sides of the equality \(U(T(\eta)) U(T(\theta)) = U(T(\eta) T(\theta))\) as follows

\[\begin{split}U(T(\eta)) U(T(\theta)) &= \left( 1 + \ifrak \eta^a u_a + \tfrac{1}{2} \eta^a \eta^b u_{ab} + \cdots \right) \left( 1 + \ifrak \theta^a u_a + \tfrac{1}{2} \theta^a \theta^b u_{ab} + \cdots \right) \\ &= 1 + \ifrak (\eta^a + \theta^a) u_a \blue{- \eta^a \theta^b u_a u_b} + \cdots \\ U(T(\eta) T(\theta)) &= U \left( 1 + \eta + \theta + f_{ab} \eta^a \theta^b + \cdots \right) \\ &= 1 + \blue{\ifrak} \left( \eta^c + \theta^c + \blue{{f^c}_{ab} \eta^a \theta^b} + \cdots \right) \blue{u_c} \\ &\phantom{=} + \blue{\tfrac{1}{2}} \left( \blue{\eta^a + \theta^a} + \cdots \right) \left( \blue{\eta^b + \theta^b} + \cdots \right) \blue{u_{ab}} + \cdots\end{split}\]

where \({f^c}_{ab}\) are the coefficients of the expansion of \(T(f(\eta, \theta)) = T(\eta) T(\theta)\). Equating the coefficients of \(\eta^a \theta^b\), i.e., the terms colored in blue, we get

\[-u_a u_b = \ifrak {f^c}_{ab} u_c + u_{ab} \implies u_{ab} = -u_a u_b - \ifrak {f^c}_{ab} u_c.\]

It implies that one can calculate the higher-order operator \(u_{ab}\) from the lower-order ones, assuming of course that we know the structure of the symmetry (Lie) group/algebra. In fact, this bootstrapping procedure can be continued to all orders, but we’ll not be bothered about the details.

Next, note that \(u_{ab} = u_{ba}\) since they are just partial derivatives. It follows that

\[[u_a, u_b] \coloneqq u_a u_b - u_b u_a = \ifrak ({f^c}_{ba} - {f^c}_{ab}) u_c \eqqcolon \ifrak {C^c}_{ab} u_c\]

where the bracket is known as the Lie bracket and \({C^c}_{ab}\) are known as the structure constants.
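As a sanity check of the bracket relation, the sketch below verifies \([u_a, u_b] = \ifrak {C^c}_{ab} u_c\) for the rotation algebra, where \({C^c}_{ab} = \epsilon_{abc}\). The concrete \(3 \times 3\) generators \((u_a)_{bc} = -\ifrak \epsilon_{abc}\) are a standard choice assumed here for illustration; they are not derived in the text.

```python
def eps(i, j, k):
    """Totally anti-symmetric symbol on indices 0, 1, 2 with eps(0, 1, 2) = 1."""
    return (i - j) * (j - k) * (k - i) // 2

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(row_ab, row_ba)] for row_ab, row_ba in zip(ab, ba)]

# Assumed generators (not from the text): (u_a)_{bc} = -i * eps(a, b, c).
u = [[[-1j * eps(a, b, c) for c in range(3)] for b in range(3)] for a in range(3)]

def rhs(a, b):
    """i * C^c_{ab} u_c with structure constants C^c_{ab} = eps(a, b, c)."""
    return [[sum(1j * eps(a, b, c) * u[c][r][s] for c in range(3)) for s in range(3)]
            for r in range(3)]

# Largest deviation between the two sides over all index pairs and matrix entries.
max_error = max(
    abs(commutator(u[a], u[b])[r][s] - rhs(a, b)[r][s])
    for a in range(3) for b in range(3) for r in range(3) for s in range(3)
)
```

All entries are small integer multiples of \(\ifrak\), so `max_error` comes out as exactly zero.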

We conclude the general discussion about continuous symmetry by considering a special, but important, case when \(T\) is additive in the sense that \(T(\eta) T(\theta) = T(\eta + \theta)\). Notable examples of such symmetry include translations and rotations about a fixed axis. In this case \(f\) vanishes, and it follows from Eq.1.2.3 that

(1.2.4)#\[U(T(\theta)) = \lim_{N \to \infty} (U(T(\theta / N)))^N = \lim_{N \to \infty} (1 + \ifrak \theta^a u_a / N)^N = \exp(\ifrak \theta^a u_a)\]
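The limit in Eq.1.2.4 can be watched converging numerically. The sketch below uses a single parameter \(\theta\) and takes the Pauli matrix \(\sigma_y\) as the Hermitian generator \(u\); this choice of generator is an assumption made only for illustration, for which \(\exp(\ifrak \theta \sigma_y)\) is a plane rotation.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def matpow(a, n):
    """Compute the n-th matrix power by repeated squaring."""
    result = [[1, 0], [0, 1]]
    while n:
        if n & 1:
            result = matmul(result, a)
        a = matmul(a, a)
        n >>= 1
    return result

theta = 0.8
u = [[0, -1j], [1j, 0]]  # Hermitian generator (Pauli sigma_y), chosen for illustration

def approx(n):
    """(1 + i * theta * u / n) ** n"""
    step = [[(1 if i == j else 0) + 1j * theta * u[i][j] / n for j in range(2)]
            for i in range(2)]
    return matpow(step, n)

# Closed form: exp(i theta sigma_y) = cos(theta) * 1 + i sin(theta) * sigma_y.
exact = [[math.cos(theta), math.sin(theta)], [-math.sin(theta), math.cos(theta)]]

def error(n):
    a = approx(n)
    return max(abs(a[i][j] - exact[i][j]) for i in range(2) for j in range(2))

errors = [error(n) for n in (10, 1000, 100000)]
```

The entries of `errors` shrink roughly like \(1/N\), in line with the first-order truncation used in each factor.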

1.2.2. Lorentz symmetry#

A particularly prominent continuous symmetry in our physical world is the Lorentz symmetry postulated by Einstein’s special relativity, which supersedes the Galilean symmetry respected by Newtonian mechanics. We shall start from the classical theory of Lorentz symmetry, and then quantize it following the procedure discussed in the previous section.

1.2.2.1. Classical Lorentz symmetry#

Classical Lorentz symmetry is a symmetry that acts on the (flat) spacetime and preserves the so-called proper time

(1.2.5)#\[d\tau^2 \coloneqq dx_0^2 - dx_1^2 - dx_2^2 - dx_3^2 \eqqcolon -\eta^{\mu \nu} dx_{\mu} dx_{\nu}\]

where

  1. \(x_0\) is also known as the time, and sometimes denoted by \(t\), and

  2. the speed of light is set to \(1\), and

  3. \(\eta = \op{diag}(-1, 1, 1, 1)\) and the indexes \(\mu, \nu\) run from \(0\) to \(3\).

Note

  1. We will follow the common convention in physics that Greek letters such as \(\mu, \nu, \dots\) run from \(0\) to \(3\), while Roman letters such as \(i, j, \dots\) run from \(1\) to \(3\).

  2. We often write \(x\) for a spacetime point \((x_0, x_1, x_2, x_3)\), and \(\xbf\) for a spatial point \((x_1, x_2, x_3)\).

  3. A \(4\)-index, i.e., those named by Greek letters, of a matrix or a tensor can be raised or lowered by \(\eta\). For example, one can raise an index of a matrix \(M_{\mu \nu}\) by \(\eta^{\rho \mu} M_{\mu \nu} = {M^{\rho}}_{\nu}\) or \(\eta^{\rho \nu} M_{\mu \nu} = {M_{\mu}}^{\rho}\), such that the order of the indexes (regardless of upper or lower) is kept.

Einstein’s special theory of relativity

Using the notations introduced above, we can rewrite Eq.1.2.5 as \(d\tau^2 = dt^2 - d\xbf^2\), so that it’s obvious that if a particle travels at the speed of light in one inertial frame, i.e., \(|d\xbf / dt| = 1\), and equivalently \(d\tau = 0\), then it travels at the speed of light in any other inertial frame, in direct contradiction with Newtonian mechanics.

Instead of working with the spacetime coordinates, it can sometimes be convenient to work with the “dual” energy-momentum coordinates, also known as the four momentum. The transition can be done by imagining a particle of mass \(m\), and defining \(p = (E, \pbf) \coloneqq m dx / d\tau\). It follows from Eq.1.2.5 that

(1.2.6)#\[1 = (dt / d\tau)^2 - (d\xbf / d\tau)^2 \implies m^2 = (m dt / d\tau)^2 - (m d\xbf / d\tau)^2 = E^2 - \pbf^2\]

which looks just like Eq.1.2.5, and indeed, the mass (in our convention) is invariant in all inertial frames.

One can also recover Newtonian mechanics at the low-speed limit (i.e., \(|\vbf| \ll 1\)) using \(d\tau / dt = \sqrt{1 - \vbf^2}\) as follows

(1.2.7)#\[\begin{split}\begin{alignat*}{2} \pbf &= m d\xbf / d\tau &&= \frac{m \vbf}{\sqrt{1 - \vbf^2}} = m \vbf + O(|\vbf|^3) \\ E &= m dt / d\tau &&= m + \tfrac{1}{2} m \vbf^2 + O(|\vbf|^4) \end{alignat*}\end{split}\]
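The low-speed expansions in Eq.1.2.7 are easy to check numerically. In this sketch the function name `four_momentum` and the sample values of \(m\) and \(v\) are arbitrary choices for illustration, and units with the speed of light equal to \(1\) are used as in the text.

```python
import math

def four_momentum(m, v):
    """Return (E, |p|) for a particle of mass m moving at speed v, with c = 1."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return m * gamma, m * gamma * v

m, v = 2.0, 0.01
energy, momentum = four_momentum(m, v)

mass_shell = energy**2 - momentum**2   # should equal m**2 by Eq.1.2.6
newton_momentum = m * v                # leading term of the momentum
newton_energy = m + 0.5 * m * v * v    # rest mass plus Newtonian kinetic energy

momentum_gap = abs(momentum - newton_momentum)  # expected to be O(v**3)
energy_gap = abs(energy - newton_energy)        # expected to be O(v**4)
```

For \(v = 0.01\) the two gaps are of order \(10^{-6}\) and \(10^{-8}\) respectively, consistent with the stated error terms.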

More precisely, by a Lorentz transformation we mean an inhomogeneous linear transformation

\[L(\Lambda, a)x \coloneqq \Lambda x + a\]

which consists of a homogeneous part \(\Lambda\) and a translation by \(a\). The proper time is obviously preserved by any translation, and also by \(\Lambda\) if

(1.2.8)#\[\eta^{\mu \nu} dx_{\mu} dx_{\nu} = \eta^{\mu \nu} {\Lambda_{\mu}}^{\rho} {\Lambda_{\nu}}^{\kappa} dx_{\rho} dx_{\kappa} \ \implies \eta^{\mu \nu} = \eta^{\rho \kappa} {\Lambda_{\rho}}^{\mu} {\Lambda_{\kappa}}^{\nu}\]

for any \(\mu\) and \(\nu\). Moreover the group law is given by

\[L(\Lambda', a') L(\Lambda, a) x = L(\Lambda', a')(\Lambda x + a) = \Lambda' \Lambda x + \Lambda' a + a' = L(\Lambda' \Lambda, \Lambda' a + a') x\]
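The group law can be checked on sample data by applying two transformations one after the other and comparing with the composed parameters \((\Lambda' \Lambda, \Lambda' a + a')\). The helper names and the particular rotation, boost and translations below are illustrative assumptions, and matrices act on columns \((x_0, x_1, x_2, x_3)\).

```python
import math

def matvec(m, x):
    return [sum(m[i][j] * x[j] for j in range(4)) for i in range(4)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def apply(lam, a, x):
    """L(Lambda, a) x = Lambda x + a"""
    return [y + s for y, s in zip(matvec(lam, x), a)]

def compose(lam2, a2, lam1, a1):
    """Parameters of L(lam2, a2) L(lam1, a1) according to the group law."""
    return matmul(lam2, lam1), [y + s for y, s in zip(matvec(lam2, a1), a2)]

def rotation_about_3(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[1, 0, 0, 0], [0, c, -s, 0], [0, s, c, 0], [0, 0, 0, 1]]

def boost_along_3(v):
    g = 1 / math.sqrt(1 - v * v)
    return [[g, 0, 0, g * v], [0, 1, 0, 0], [0, 0, 1, 0], [g * v, 0, 0, g]]

lam1, a1 = rotation_about_3(0.4), [1.0, 0.0, -2.0, 0.5]
lam2, a2 = boost_along_3(0.6), [0.0, 3.0, 1.0, -1.0]
x = [0.2, -1.0, 0.7, 2.0]

step_by_step = apply(lam2, a2, apply(lam1, a1, x))
lam21, a21 = compose(lam2, a2, lam1, a1)
all_at_once = apply(lam21, a21, x)
```

The two results `step_by_step` and `all_at_once` agree up to rounding.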

For later use, let’s also calculate the inverse matrix of \(\Lambda\) using Eq.1.2.8 as follows

(1.2.9)#\[\delta_{\sigma}^{\nu} = \eta_{\sigma \mu} \eta^{\mu \nu} = \eta_{\sigma \mu} \eta^{\rho \kappa} {\Lambda_{\rho}}^{\mu} {\Lambda_{\kappa}}^{\nu} \implies {(\Lambda^{-1})_{\sigma}}^{\kappa} = \eta_{\sigma\mu} \eta^{\rho\kappa} {\Lambda_{\rho}}^{\mu} = {\Lambda^{\kappa}}_{\sigma}\]

Now we’ll take a look at the topology of the group of homogeneous Lorentz transformations. Taking the determinant on both sides of Eq.1.2.8, we see that \(\op{det}(\Lambda) = \pm 1\). Moreover, setting \(\mu = \nu = 0\), we have

\[1 = \left( {\Lambda_0}^0 \right)^2 - \sum_{i=1}^3 \left( {\Lambda_i}^0 \right)^2 \implies \left| {\Lambda_0}^0 \right| \geq 1\]

It follows that the homogeneous Lorentz group has four components. In particular, the one with \(\op{det}(\Lambda) = 1\) and \({\Lambda_0}^0 \geq 1\) is the most commonly used and is given a name: the proper orthochronous Lorentz group. Nonetheless, one can map one component to another by composing with either a time reversal transformation

(1.2.10)#\[\Tcal: (t, \xbf) \mapsto (-t, \xbf)\]

or a space reversal transformation

(1.2.11)#\[\Pcal: (t, \xbf) \mapsto (t, -\xbf)\]

or both.

So far everything has been rather abstract, but in fact, the (homogeneous) Lorentz group can be understood quite intuitively. There are basically two building blocks: one is a rotation in the \(3\)-space, which says that the space is isotropic, i.e., it looks the same in all (spatial) directions, and the other is a so-called boost, which says that, as Galileo originally noted, one cannot tell if a system is at rest or is moving at a constant velocity without making a reference to the outside of the system. To spell out the details, let’s consider a rest frame with \(d\xbf = 0\) and a moving frame with \(d\xbf' / dt' = \vbf\). Then the transformation \(dx' = \Lambda dx\) can be simplified as

\[dt' = {\Lambda_0}^0 dt, \quad dx'_i = {\Lambda_i}^0 dt \implies {\Lambda_i}^0 = v_i {\Lambda_0}^0\]

Then using Eq.1.2.8, we get

(1.2.12)#\[\begin{split}1 &= -\eta^{\mu \nu} {\Lambda_{\mu}}^0 {\Lambda_{\nu}}^0 \\ &= \left( {\Lambda_0}^0 \right)^2 - \left( {\Lambda_i}^0 \right)^2 \\ &= \left( 1 - \vbf^2 \right) \left( {\Lambda_0}^0 \right)^2 \implies {\Lambda_0}^0 = \frac{1}{\sqrt{1 - \vbf^2}} \eqqcolon \gamma\end{split}\]

assuming \(\Lambda\) is proper orthochronous. It follows that

(1.2.13)#\[{\Lambda_i}^0 = -{\Lambda^0}_i = \gamma v_i\]

The other components \({\Lambda_i}^j, 1 \leq i, j \leq 3\), are not uniquely determined because a composition with a (spatial) rotation about the direction of \(\vbf\) has no effect on \(\vbf\). To make it easier, one can apply a rotation so that \(\vbf\) aligns with the \(3\)-axis. Then an obvious choice of \(\Lambda\) is given by

(1.2.14)#\[\begin{split}\begin{alignat*}{2} t' &= {\Lambda_0}^{\mu} x_{\mu} &&= \gamma (t + v_3 x_3) \\ x'_1 &= {\Lambda_1}^{\mu} x_{\mu} &&= x_1 \\ x'_2 &= {\Lambda_2}^{\mu} x_{\mu} &&= x_2 \\ x'_3 &= {\Lambda_3}^{\mu} x_{\mu} &&= \gamma (x_3 + v_3 t) \end{alignat*}\end{split}\]

Time dilation and length contraction

A few consequences can be drawn from the boost transformation, most notably the effects of time dilation and length contraction. The time dilation, i.e., a clock ticks slower in a moving frame than in a rest frame, is quite obvious from Eq.1.2.14 and the fact that \(\gamma > 1\). But the length contraction requires some elaboration.

To be more concrete, let’s consider a rod of some fixed length. To measure the length, the measurement must be done simultaneously at the two ends of the rod. This constraint causes not much trouble in a rest frame, but must be taken care of in a moving frame since being simultaneous is not a Lorentz invariant property. Let \(x = (t, \xbf)\) and \(y = (t', \ybf)\) be the two endpoints of the rod in the rest frame, so that the length is \(|\xbf - \ybf|\) regardless of whether \(t\) and \(t'\) are the same or not. Under the Lorentz transformation defined by Eq.1.2.14, they become

\[\begin{split}\Lambda x &= (\gamma (t + v_3 x_3), x_1, x_2, \gamma(x_3 + v_3 t)) \\ \Lambda y &= (\gamma (t' + v_3 y_3), y_1, y_2, \gamma(y_3 + v_3 t'))\end{split}\]

respectively. Setting the equal-time condition \((\Lambda x)_0 = (\Lambda y)_0\) gives \(t' = t + v_3 (x_3 - y_3)\). Substituting it into \((\Lambda x)_3\) and \((\Lambda y)_3\) then gives

\[|(\Lambda x)_3 - (\Lambda y)_3| = \gamma \left| x_3 - y_3 - v_3^2 (x_3 - y_3) \right| = \frac{|x_3 - y_3|}{\gamma} < |x_3 - y_3|\]

This calculation says that the length of the rod is contracted in the direction of movement.
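The computation above can be replayed with numbers. The values of \(v_3\), the endpoint coordinates and the time \(t\) below are arbitrary sample inputs, and only the \((t, x_3)\)-plane is tracked since the other two coordinates are untouched by Eq.1.2.14.

```python
import math

v3 = 0.6
gamma = 1 / math.sqrt(1 - v3**2)

def boost(t, x3):
    """Eq.1.2.14 restricted to the (t, x_3)-plane."""
    return gamma * (t + v3 * x3), gamma * (x3 + v3 * t)

# 3-components of the two endpoints of the rod in its rest frame.
x3, y3 = 5.0, 2.0
t = 1.0
t_prime = t + v3 * (x3 - y3)  # equal-time condition in the moving frame

time_x, pos_x = boost(t, x3)
time_y, pos_y = boost(t_prime, y3)

rest_length = abs(x3 - y3)
moving_length = abs(pos_x - pos_y)
```

With these inputs \(\gamma = 1.25\), the two transformed events are simultaneous, and the measured length is \(3 / 1.25 = 2.4\).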

It should be emphasized that such contraction of length can only be observed in a frame where the rod is moving. Imagine for example a scenario where you’re given a square box with equal sides while standing still, then after some unconscious period of time, e.g., sleeping, you wake up with the same box in hand, and you’d like to know if you’re now moving. If you happen to have heard of such contraction of length, you might try to measure the sides of the box again. If one of the sides suddenly becomes shorter, then you know not only that you’re moving, but also the direction of movement! This is of course absurd because the box is still at rest relative to you.

Finally, one can apply a rotation to Eq.1.2.14 to get the general formula

(1.2.15)#\[\Lambda_{ij} = \delta_{ij} + \frac{v_i v_j}{\vbf^2} (\gamma - 1)\]

for \(1 \leq i, j \leq 3\), which, together with Eq.1.2.12, Eq.1.2.13 and \({\Lambda_0}^i = {\Lambda_i}^0\), gives the general formula for \(\Lambda\).
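The general boost can be assembled entry by entry and tested against the defining condition Eq.1.2.8. This is a minimal sketch: the function names are made up, the first matrix index is taken as the row index so that \(x'_{\mu} = {\Lambda_{\mu}}^{\nu} x_{\nu}\), and the velocity is an arbitrary sample with \(|\vbf| < 1\).

```python
import math

ETA = [-1, 1, 1, 1]  # diagonal of eta

def boost(v):
    """General boost for a nonzero velocity v = (v1, v2, v3) with |v| < 1,
    assembled from Eq.1.2.12, Eq.1.2.13 and Eq.1.2.15."""
    v2 = sum(c * c for c in v)
    gamma = 1 / math.sqrt(1 - v2)
    lam = [[0.0] * 4 for _ in range(4)]
    lam[0][0] = gamma
    for i in range(3):
        lam[0][i + 1] = lam[i + 1][0] = gamma * v[i]
        for j in range(3):
            lam[i + 1][j + 1] = (1.0 if i == j else 0.0) + v[i] * v[j] * (gamma - 1) / v2
    return lam

def invariance_defect(lam):
    """Largest entry of |Lambda^T eta Lambda - eta|, which vanishes when Eq.1.2.8 holds."""
    worst = 0.0
    for m in range(4):
        for n in range(4):
            total = sum(ETA[r] * lam[r][m] * lam[r][n] for r in range(4))
            target = ETA[m] if m == n else 0.0
            worst = max(worst, abs(total - target))
    return worst

v = (0.3, -0.4, 0.5)
lam = boost(v)
defect = invariance_defect(lam)

# Boosting the rest-frame momentum (m, 0, 0, 0) should give (E, p) = (gamma m, gamma m v).
m = 2.0
rest = [m, 0.0, 0.0, 0.0]
p = [sum(lam[i][j] * rest[j] for j in range(4)) for i in range(4)]
```

Besides preserving \(\eta\), the boost sends the rest-frame momentum of a particle of mass \(m\) to the energy-momentum \((E, \pbf)\) of Eq.1.2.7, which is how boosts will be used later to move between momenta.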

Note

Any proper orthochronous Lorentz transformation can be written as the composition of a boost followed by a rotation.

1.2.2.2. Quantum Lorentz symmetry#

We will quantize the Lorentz symmetry \(L(\Lambda, a)\) by looking for unitary representations \(U(\Lambda, a)\). As discussed in Continuous symmetry, we proceed by looking for infinitesimal symmetries. First of all, let’s expand \(\Lambda\) as

(1.2.16)#\[{\Lambda_{\mu}}^{\nu} = {\delta_{\mu}}^{\nu} + {\omega_{\mu}}^{\nu} + \cdots\]

where \(\delta\) is the Kronecker delta, and not a tensor. It follows from \(\eta^{\mu \nu} = \eta^{\rho \kappa} {\Lambda_{\rho}}^{\mu} {\Lambda_{\kappa}}^{\nu}\) that

(1.2.17)#\[\begin{split}\eta^{\mu \nu} &= \eta^{\rho \kappa} ({\delta_{\rho}}^{\mu} + {\omega_{\rho}}^{\mu} + \cdots) ({\delta_{\kappa}}^{\nu} + {\omega_{\kappa}}^{\nu} + \cdots) \\ &= \eta^{\mu \nu} + \eta^{\mu \kappa} {\omega_{\kappa}}^{\nu} + \eta^{\nu \rho} {\omega_{\rho}}^{\mu} + \cdots \\ &= \eta^{\mu \nu} + \omega^{\mu \nu} + \omega^{\nu \mu} + \cdots\end{split}\]

Comparing the first order terms shows that \(\omega^{\mu \nu} = -\omega^{\nu \mu}\) is anti-symmetric. It is therefore more convenient to use \(\omega^{\mu \nu}\), rather than \({\omega_{\mu}}^{\nu}\), as the infinitesimal parameters in the expansion of \(\Lambda\).
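The role of anti-symmetry can be seen numerically: for \({\Lambda_{\mu}}^{\nu} = {\delta_{\mu}}^{\nu} + \varepsilon\, \eta_{\mu \rho} \omega^{\rho \nu}\), the violation of Eq.1.2.8 is of second order in \(\varepsilon\) when \(\omega^{\mu \nu}\) is anti-symmetric, and of first order otherwise. The sample matrices and the bookkeeping parameter `eps` in this sketch are arbitrary choices for illustration.

```python
ETA = [-1, 1, 1, 1]  # diagonal of eta, which is its own inverse

def lam_from_omega(omega_upper, eps):
    """Lambda_mu^nu = delta_mu^nu + eps * eta_{mu rho} omega^{rho nu}, truncated at first order."""
    return [[(1.0 if m == n else 0.0) + eps * ETA[m] * omega_upper[m][n] for n in range(4)]
            for m in range(4)]

def defect(lam):
    """Largest entry of |eta^{rho kappa} Lambda_rho^mu Lambda_kappa^nu - eta^{mu nu}|."""
    worst = 0.0
    for m in range(4):
        for n in range(4):
            total = sum(ETA[r] * lam[r][m] * lam[r][n] for r in range(4))
            worst = max(worst, abs(total - (ETA[m] if m == n else 0.0)))
    return worst

# An arbitrary anti-symmetric omega^{mu nu} with 6 independent entries.
upper = [[0, 1, -2, 3], [0, 0, 4, -5], [0, 0, 0, 6], [0, 0, 0, 0]]
omega = [[upper[m][n] - upper[n][m] for n in range(4)] for m in range(4)]
# A symmetric perturbation for comparison; it is not an infinitesimal Lorentz transformation.
symmetric = [[upper[m][n] + upper[n][m] for n in range(4)] for m in range(4)]

eps = 1e-4
defect_antisymmetric = defect(lam_from_omega(omega, eps))  # of order eps**2
defect_symmetric = defect(lam_from_omega(symmetric, eps))  # of order eps
```

The anti-symmetric case leaves only a residue of order \(\varepsilon^2\), while the symmetric perturbation already fails at order \(\varepsilon\).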

Note

A count of free parameters shows that the inhomogeneous Lorentz symmetry has \(10\) degrees of freedom, \(4\) of which come from the translation, and the remaining \(6\) come from the rank-\(2\) anti-symmetric tensor \(\omega\).

We first postulate that \(U(1, 0) = 1\) is the identity operator because the Lorentz transformation itself is the identity. Then we can write the power series expansion up to first order as follows

(1.2.18)#\[U(1 + \omega, \epsilon) = 1 - \ifrak \epsilon^{\mu} P_{\mu} + \frac{\ifrak}{2} \omega^{\mu \nu} J_{\mu \nu} + \cdots\]

Here we have inserted \(\ifrak\) as usual so that the unitarity of \(U\) implies that both \(P_{\mu}\) and \(J_{\mu \nu}\) are Hermitian. Moreover, since \(\omega^{\mu \nu}\) is anti-symmetric, we can assume the same holds for \(J_{\mu \nu}\).

Note

Since we are expanding \(U(1 + \omega, \epsilon)\) which is complex linear, the operators \(P\) and \(J\) are also complex linear. Hence we can freely move \(\ifrak\) around these operators in calculations that follow. However, this will become an issue when we later consider other operators such as the space and time inversions, which can potentially be either complex linear or anti-linear. In the latter case, a sign needs to be added when commuting with the multiplication by \(\ifrak\).

Let’s evaluate how the expansion transforms under conjugation

\[\begin{split}U(\Lambda, a) U(1 + \omega, \epsilon) U^{-1}(\Lambda, a) &= U(\Lambda, a) U(1 + \omega, \epsilon) U(\Lambda^{-1}, -\Lambda^{-1} a) \\ &= U(\Lambda, a) U((1 + \omega) \Lambda^{-1}, \epsilon - (1 + \omega) \Lambda^{-1} a) \\ &= U(1 + \Lambda \omega \Lambda^{-1}, \Lambda \epsilon - \Lambda \omega \Lambda^{-1} a) \\ &= 1 - \ifrak ({\Lambda^{\rho}}_{\mu} \epsilon^{\mu} - (\Lambda \omega \Lambda^{-1})^{\rho \kappa} a_{\kappa}) P_{\rho} \ + \tfrac{\ifrak}{2} (\Lambda \omega \Lambda^{-1})^{\rho \kappa} J_{\rho \kappa} + \cdots \\ &= 1 -\ifrak \epsilon^{\mu} {\Lambda^{\rho}}_{\mu} P_{\rho} + \tfrac{\ifrak}{2} (\Lambda \omega \Lambda^{-1})^{\rho \kappa} (J_{\rho \kappa} + 2a_{\kappa} P_{\rho}) + \cdots \\ &= 1 -\ifrak \epsilon^{\mu} {\Lambda^{\rho}}_{\mu} P_{\rho} + \tfrac{\ifrak}{2} {\Lambda^{\rho}}_{\mu} \omega^{\mu \nu} {\Lambda^{\kappa}}_{\nu} (J_{\rho \kappa} + 2a_{\kappa} P_{\rho}) + \cdots\end{split}\]

where we have used Eq.1.2.9 for \(\Lambda^{-1}\). Substituting \(U(1 + \omega, \epsilon)\) with the expansion Eq.1.2.18 and equating the coefficients of \(\epsilon^{\mu}\) and \(\omega^{\mu \nu}\), we have

(1.2.19)#\[\begin{split}U(\Lambda, a) P_{\mu} U^{-1}(\Lambda, a) &= {\Lambda^{\rho}}_{\mu} P_{\rho} \\ U(\Lambda, a) J_{\mu \nu} U^{-1}(\Lambda, a) &= {\Lambda^{\rho}}_{\mu} {\Lambda^{\kappa}}_{\nu} (J_{\rho \kappa} + a_{\kappa} P_{\rho} - a_{\rho} P_{\kappa})\end{split}\]

where in the second equation, we have also made the right-hand-side anti-symmetric with respect to \(\mu\) and \(\nu\). It’s now clear that \(P\) transforms like a vector and is translation invariant, while \(J\) transforms like a \(2\)-tensor only for homogeneous Lorentz transformations and is not translation invariant in general. These are of course as expected since both \(P\) and \(J\) are quantizations of rather familiar objects, which we now spell out.

We start with \(P\) by writing \(H \coloneqq P_0\) and \(\Pbf \coloneqq (P_1, P_2, P_3)\). Then \(H\) is the energy operator, also known as the Hamiltonian, and \(\Pbf\) is the momentum \(3\)-vector. Similarly, let’s write \(\Kbf \coloneqq (J_{01}, J_{02}, J_{03})\) and \(\Jbf \coloneqq (J_{23}, J_{31}, J_{12})\), as the boost \(3\)-vector and the angular momentum \(3\)-vector, respectively.

Now that we have named all the players (i.e., \(H, \Pbf, \Jbf, \Kbf\)) in the game, it remains to find out their mutual commutation relations since they should form a Lie algebra of the (infinitesimal) Lorentz symmetry. This can be done by applying Eq.1.2.19 to \(U(\Lambda, a)\) that is itself infinitesimal. More precisely, keeping up to first order terms, we have \({\Lambda^{\rho}}_{\mu} = {\delta^{\rho}}_{\mu} + {\omega^{\rho}}_{\mu}\) and \(a_{\mu} = \epsilon_{\mu}\). It follows that Eq.1.2.19, up to first order, can be written as follows

\[\begin{split}\left( {\delta^{\rho}}_{\mu} + {\omega^{\rho}}_{\mu} \right) P_{\rho} &= \left( 1 - \ifrak \epsilon^{\nu} P_{\nu} + \tfrac{\ifrak}{2} \omega^{\rho \kappa} J_{\rho \kappa} \right) P_{\mu} \left( 1 + \ifrak \epsilon^{\nu} P_{\nu} - \tfrac{\ifrak}{2} \omega^{\rho \kappa} J_{\rho \kappa} \right) \\ &= P_{\mu} - \ifrak \epsilon^{\nu} [P_{\nu}, P_{\mu}] - \tfrac{\ifrak}{2} \omega^{\rho \kappa} [P_{\mu}, J_{\rho \kappa}]\end{split}\]

Equating the coefficients of \(\epsilon\) and \(\omega\) gives the following

(1.2.20)#\[\begin{split}[P_{\mu}, P_{\nu}] &= 0 \\ [P_{\mu}, J_{\rho \kappa}] &= \ifrak (\eta_{\kappa \mu} P_{\rho} - \eta_{\rho \mu} P_{\kappa})\end{split}\]

where for the second identity, we’ve also used the fact that \(J_{\rho \kappa} = -J_{\kappa \rho}\).

Similarly, expanding the second equation of Eq.1.2.19 up to first order, we have

\[\begin{split}J_{\mu \nu} + \epsilon_{\nu} P_{\mu} - \epsilon_{\mu} P_{\nu} + {\omega^{\rho}}_{\mu} J_{\rho \nu} + {\omega^{\kappa}}_{\nu} J_{\mu \kappa} &= ({\delta^{\rho}}_{\mu} + {\omega^{\rho}}_{\mu}) ({\delta^{\kappa}}_{\nu} + {\omega^{\kappa}}_{\nu}) (J_{\rho \kappa} + \epsilon_{\kappa} P_{\rho} - \epsilon_{\rho} P_{\kappa}) \\ &= \left( 1 - \ifrak \epsilon^{\rho} P_{\rho} + \tfrac{\ifrak}{2} \omega^{\rho \kappa} J_{\rho \kappa} \right) J_{\mu \nu} \left( 1 + \ifrak \epsilon^{\rho} P_{\rho} - \tfrac{\ifrak}{2} \omega^{\rho \kappa} J_{\rho \kappa} \right) \\ &= J_{\mu \nu} - \ifrak \epsilon^{\rho} [P_{\rho}, J_{\mu \nu}] + \tfrac{\ifrak}{2} \omega^{\rho \kappa} [J_{\rho \kappa}, J_{\mu \nu}]\end{split}\]

Equating the coefficients of \(\epsilon\) reproduces Eq.1.2.20, but equating the coefficients of \(\omega\) gives the following additional

(1.2.21)#\[[J_{\rho \kappa}, J_{\mu \nu}] = \ifrak (\eta_{\rho \nu} J_{\mu \kappa} - \eta_{\rho \mu} J_{\nu \kappa} - \eta_{\kappa \mu} J_{\rho \nu} + \eta_{\kappa \nu} J_{\rho \mu})\]

Now that we have all the commutation relations, let’s reorganize Eq.1.2.20 and Eq.1.2.21 in terms of \(H, \Pbf, \Jbf, \Kbf\) as follows

(1.2.22)#\[\begin{split}[H, P_i] &= 0 \\ [H, J_i] &= 0 \\ [H, K_i] &= \ifrak P_i \\ [P_i, P_j] &= 0 \\ [P_i, J_j] &= \ifrak \epsilon_{ijk} P_k \\ [P_i, K_j] &= \ifrak \delta_{ij} H \\ [J_i, J_j] &= \ifrak \epsilon_{ijk} J_k \\ [J_i, K_j] &= \ifrak \epsilon_{ijk} K_k \\ [K_i, K_j] &= -\ifrak \epsilon_{ijk} J_k\end{split}\]

where \(\epsilon_{ijk}\) is totally anti-symmetric with respect to permutations of indexes and satisfies \(\epsilon_{123} = 1\). [1]
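The homogeneous part of these relations, i.e., those involving only \(\Jbf\) and \(\Kbf\), can be verified on explicit matrices. The sketch below uses \(4 \times 4\) matrices acting on spacetime vectors: rotations act on the spatial block via \(-\ifrak \epsilon_{ijk}\), and boosts mix the time and one spatial direction. This concrete representation is an assumption made for illustration and is not constructed in the text; the relations checked are insensitive to the overall sign chosen for the boost matrices.

```python
def eps(i, j, k):
    """Totally anti-symmetric symbol on indices 1, 2, 3 with eps(1, 2, 3) = 1."""
    return (i - j) * (j - k) * (k - i) // 2

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def comm(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(4)] for i in range(4)]

def combo(coeffs, mats):
    """Linear combination sum_k coeffs[k] * mats[k]."""
    return [[sum(c * m[i][j] for c, m in zip(coeffs, mats)) for j in range(4)]
            for i in range(4)]

def dist(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(4) for j in range(4))

IDX = (1, 2, 3)
# Assumed matrices: (J_i)_{rs} = -i eps(i, r, s) on the spatial block, zero elsewhere.
J = [[[(-1j * eps(i, r, s) if r and s else 0) for s in range(4)] for r in range(4)]
     for i in IDX]
# Assumed matrices: K_i has entries i at positions (0, i) and (i, 0), zero elsewhere.
K = [[[(1j if {r, s} == {0, i} else 0) for s in range(4)] for r in range(4)]
     for i in IDX]

def max_violation(left, right, sign, target):
    """Largest entry of [left_i, right_j] - sign * i * eps_ijk * target_k over all i, j."""
    return max(
        dist(comm(left[i - 1], right[j - 1]),
             combo([sign * 1j * eps(i, j, k) for k in IDX], target))
        for i in IDX for j in IDX
    )

err_JJ = max_violation(J, J, +1, J)  # [J_i, J_j] = i eps_ijk J_k
err_JK = max_violation(J, K, +1, K)  # [J_i, K_j] = i eps_ijk K_k
err_KK = max_violation(K, K, -1, J)  # [K_i, K_j] = -i eps_ijk J_k
```

All three violations vanish. Note in particular the minus sign in \([K_i, K_j]\), which says that two boosts in different directions fail to commute by a rotation.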

Note

The Lie algebra generated by \(H, \Pbf, \Jbf, \Kbf\) with commutation relations Eq.1.2.22 is known as the Poincaré algebra.

Since the time evolution of a physical system is dictated by the Hamiltonian \(H\), quantities that commute with \(H\) are conserved. In particular we see from Eq.1.2.22 that both momentum \(\Pbf\) and angular momentum \(\Jbf\) are conserved. Boosts \(\Kbf\), on the other hand, are not conserved, and therefore cannot be used to label (stable) physical states. Moreover, momenta (which generate translations) commute with each other, while angular momenta (which generate rotations) do not. Indeed, the latter furnish an infinitesimal representation of the \(3\)-rotation group \(SO(3)\). This is all consistent with our intuition.

1.3. One-Particle States#

One neat application of our knowledge about Lorentz symmetry is to classify (free) one-particle states according to their transformation laws under (inhomogeneous) Lorentz transformations. Throughout this section, the Lorentz transformations will be assumed to be proper orthochronous, i.e., \(\op{det}(\Lambda) = 1\) and \({\Lambda_0}^0 \geq 1\).

In order to do so, we need some labels to identify states, which are typically conserved quantities. According to the commutation relations between \(H, \Pbf\) and \(\Jbf\) obtained in the previous section, we see that the components of \(p = (H, \Pbf)\) are conserved and mutually commuting, whereas the components of \(\Jbf\), though conserved, do not commute with each other. Hence we can write our one-particle states as \(\Psi_{p, \sigma}\) such that

\[P_{\mu} \Psi_{p, \sigma} = p_{\mu} \Psi_{p, \sigma}\]

where \(\sigma\) are additional labels such as spin components that we will later specify.

1.3.1. Reduction to the little group#

Let’s first consider translations \(U(1, a)\). Since translations form an abelian group, it follows from Eq.1.2.4 that

(1.3.1)#\[U(1, a) \Psi_{p, \sigma} = \exp(-\ifrak a^{\mu} P_{\mu}) \Psi_{p, \sigma} = \exp(-\ifrak a^{\mu} p_{\mu}) \Psi_{p, \sigma}\]

where the minus sign comes from our choice of expansion Eq.1.2.18. Hence it remains to consider the action of homogeneous Lorentz transformations. For the convenience of notation, let’s write \(U(\Lambda) \coloneqq U(\Lambda, 0)\). We would first like to know how \(U(\Lambda)\) affects the \(4\)-momentum. It follows from the following calculation (using Eq.1.2.19)

\[P_{\mu} U(\Lambda) \Psi_{p, \sigma} = U(\Lambda) (U^{-1} (\Lambda) P_{\mu} U(\Lambda)) \Psi_{p, \sigma} = U(\Lambda) {\Lambda_{\mu}}^{\nu} P_{\nu} \Psi_{p, \sigma} = \left( {\Lambda_{\mu}}^{\nu} p_{\nu} \right) U(\Lambda) \Psi_{p, \sigma}\]

that \(U(\Lambda) \Psi_{p, \sigma}\) has \(4\)-momentum \(\Lambda p\). Therefore we can write

(1.3.2)#\[U(\Lambda) \Psi_{p, \sigma} = C_{\sigma \sigma'} (\Lambda, p) \Psi_{\Lambda p, \sigma'}\]

where \(C_{\sigma \sigma'}\) furnishes a representation of \(\Lambda\) and \(p\) under straightforward transformation rules, and an implicit summation over \(\sigma'\) is assumed although it’s not a \(4\)-index.

Next we’d like to remove the dependency of \(C_{\sigma \sigma'}\) on \(p\) since, after all, it is \(\Lambda\) that carries the symmetry. We can achieve this by noticing that \(U(\Lambda)\) acts on the \(\Lambda\)-orbits of \(p\) transitively. The \(\Lambda\)-orbits of \(p\), in turn, are uniquely determined by the value of \(p^2\), and in the case of \(p^2 \leq 0\), also by the sign of \(p_0\). In light of Eq.1.2.6, we can pick a convenient representative \(k\) for each case as follows

| Case | Standard \(k\) | Physical |
| --- | --- | --- |
| \(p^2 = -M^2 < 0,~p_0 > 0\) | \((M, 0, 0, 0)\) | Yes |
| \(p^2 = -M^2 < 0,~p_0 < 0\) | \((-M, 0, 0, 0)\) | No |
| \(p^2 = 0,~p_0 > 0\) | \((1, 0, 0, 1)\) | Yes |
| \(p^2 = 0,~p_0 = 0\) | \((0, 0, 0, 0)\) | Yes |
| \(p^2 = 0,~p_0 < 0\) | \((-1, 0, 0, 1)\) | No |
| \(p^2 = N^2 > 0\) | \((0, N, 0, 0)\) | No |

It turns out that only three of these cases are realized physically, and they correspond to the cases of a massive particle of mass \(M\), a massless particle and the vacuum, respectively. Since there is not much to say about the vacuum state, there are only two cases that we need to investigate.
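The case distinction in the table can be sketched as a small classifier. This is only an illustration: the function name `standard_momentum`, the labels, and the absolute tolerance `tol` are assumptions introduced here, and \(p^2\) is computed with the sign convention of the table (\(p^2 = -M^2\) for a massive particle).

```python
import math

def standard_momentum(p, tol=1e-12):
    """Classify p = (p0, p1, p2, p3) and return (label, standard k)."""
    p0, p1, p2, p3 = p
    # p^2 in the convention of the table: negative for timelike momenta.
    p_sq = -p0 * p0 + p1 * p1 + p2 * p2 + p3 * p3
    if p_sq < -tol:
        m = math.sqrt(-p_sq)
        if p0 > 0:
            return "massive", (m, 0.0, 0.0, 0.0)
        return "massive, negative energy", (-m, 0.0, 0.0, 0.0)
    if p_sq > tol:
        return "spacelike", (0.0, math.sqrt(p_sq), 0.0, 0.0)
    if p0 > tol:
        return "massless", (1.0, 0.0, 0.0, 1.0)
    if p0 < -tol:
        return "massless, negative energy", (-1.0, 0.0, 0.0, 1.0)
    return "vacuum", (0.0, 0.0, 0.0, 0.0)

print(standard_momentum((5.0, 0.0, 0.0, 3.0)))  # timelike, p0 > 0
print(standard_momentum((5.0, 3.0, 0.0, 4.0)))  # lightlike, p0 > 0
```

Only the labels "massive", "massless" and "vacuum" correspond to the physical rows of the table.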

With the choices of the standard \(k\) in hand, we need to make one more set of choices. Namely, we will choose for each \(p\) a standard Lorentz transformation \(L(p)\) such that \(L(p) k = p\). Such \(L(p)\) for a massive particle has been chosen in Eq.1.2.15, albeit in spacetime coordinates, and we’ll also handle the case of massless particles later. Once these choices have been made, we can define

(1.3.3)#\[\Psi_{p, \sigma} \coloneqq N(p) U(L(p)) \Psi_{k, \sigma}\]

where \(N(p)\) is a normalization factor to be determined later. In this way, we’ve also determined how \(\sigma\) depends on \(p\). Applying \(U(\Lambda)\) to Eq.1.3.3 we can refactor the terms as follows

(1.3.4)#\[\begin{split}U(\Lambda) \Psi_{p, \sigma} &= N(p) U(\Lambda) U(L(p)) \Psi_{k, \sigma} \\ &= N(p) U(L(\Lambda p)) U(L(\Lambda p)^{-1} \Lambda L(p)) \Psi_{k, \sigma}\end{split}\]

so that \(L(\Lambda p)^{-1} \Lambda L(p)\) maps \(k\) to itself, and hence \(U(L(\Lambda p)^{-1} \Lambda L(p))\) acts solely on \(\sigma\).

At this point, we have reduced the problem to the classification of representations of the so-called little group, defined as the subgroup of (proper orthochronous) Lorentz transformations \(W\) that fix \(k\), i.e., \({W_{\mu}}^{\nu} k_{\nu} = k_{\mu}\). An element of the little group is known as a Wigner rotation (hence the notation \(W\)). More precisely, the task now is to find (unitary) representations \(D(W)\) such that

\[\sum_{\sigma'} D_{\sigma \sigma'}(W_1) D_{\sigma' \sigma''}(W_2) \Psi_{k, \sigma''} = D_{\sigma \sigma''}(W_1 W_2) \Psi_{k, \sigma''}\]

Once this is done, we can define

(1.3.5)#\[U(W) \Psi_{k, \sigma} \coloneqq \sum_{\sigma'} D_{\sigma' \sigma}(W) \Psi_{k, \sigma'}, \quad\text{where}~~ W(\Lambda, p) \coloneqq L(\Lambda p)^{-1} \Lambda L(p)\]
Validation of Eq.1.3.5

One can verify that Eq.1.3.5 indeed respects the group law as follows

\[\begin{split}U(W_2) U(W_1) \Psi_{k, \sigma} &= U(W_2) \sum_{\sigma'} D_{\sigma' \sigma}(W_1) \Psi_{k, \sigma'} \\ &= \sum_{\sigma' \sigma''} D_{\sigma' \sigma}(W_1) D_{\sigma'' \sigma'}(W_2) \Psi_{k, \sigma''} \\ &= \sum_{\sigma''} D_{\sigma'' \sigma}(W_2 W_1) \Psi_{k, \sigma''}\end{split}\]

Now we can rewrite Eq.1.3.4 (using Eq.1.3.3 and Eq.1.3.5) as follows

(1.3.6)#\[\begin{split}U(\Lambda) \Psi_{p, \sigma} &= N(p) U(L(\Lambda p)) U(W(\Lambda, p)) \Psi_{k, \sigma} \\ &= N(p) \sum_{\sigma'} D_{\sigma' \sigma}(W(\Lambda, p)) U(L(\Lambda p)) \Psi_{k, \sigma'} \\ &= \frac{N(p)}{N(\Lambda p)} \sum_{\sigma'} D_{\sigma' \sigma}(W(\Lambda, p)) \Psi_{\Lambda p, \sigma'}\end{split}\]

which gives the sought-after coefficients \(C_{\sigma \sigma'}\) in Eq.1.3.2.

It remains now, as far as the general discussion is concerned, to settle the normalization factor \(N(p)\). Indeed, it would not have been needed at all if we wanted \(\Psi_{p, \sigma}\) to be orthonormal in the sense that

(1.3.7)#\[(\Psi_{p', \sigma'}, \Psi_{p, \sigma}) = \delta_{\sigma' \sigma} \delta(p' - p)\]

where the first delta is the Kronecker delta (for discrete indexes) and the second is the Dirac delta (for continuous indexes), since they are eigenvectors of the (Hermitian) operator \(P\). All we would need is for \(D_{\sigma \sigma'}\) to be unitary, as is clear from Eq.1.3.6.

However, the Dirac delta in Eq.1.3.7 is tricky to use since \(p\) is constrained to the so-called mass shell, i.e., \(p_0 > 0\) together with \(p^2 = -M^2\) in the massive case and \(p^2 = 0\) in the massless case, respectively. Hence the actual normalization we’d like to impose on the one-particle states is, instead of Eq.1.3.7, the following

(1.3.8)#\[(\Psi_{p', \sigma'}, \Psi_{p, \sigma}) = \delta_{\sigma' \sigma} \delta(\pbf' - \pbf)\]

In fact, the problem eventually boils down to how to define the \(3\)-momentum space Dirac delta in a Lorentz-invariant manner.

Since \(\Psi_{p, \sigma}\) can be derived from \(\Psi_{k, \sigma}\) by Eq.1.3.3, we can first ask \(\Psi_{k, \sigma}\) to be orthonormal in the sense of Eq.1.3.8, where the Dirac delta plays no role, and then figure out how integration works on the mass shell (because the Dirac delta is defined by integrals against test functions). As far as the mass shell integration is concerned, we can temporarily unify the massive and massless cases by allowing \(M \geq 0\). Consider a general mass shell integral of an arbitrary test function \(f(p)\)

\[\begin{split}\int d^4 p ~\delta(p^2 + M^2) \theta(p_0) f(p) &= \int d^3\pbf dp_0 ~\delta(p_0^2 - \pbf^2 - M^2) \theta(p_0) f(p_0, \pbf) \\ &= \int d^3\pbf ~\frac{f\left( \sqrt{\pbf^2 + M^2}, \pbf \right)}{2 \sqrt{\pbf^2 + M^2}}\end{split}\]

where \(\theta(p_0)\) is the step function defined to be \(0\) if \(p_0 \leq 0\) and \(1\) if \(p_0 > 0\). It follows that the Lorentz-invariant volume element in the \(3\)-momentum space is

(1.3.9)#\[\frac{d^3\pbf}{\sqrt{\pbf^2 + M^2}}\]
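The invariance of this volume element can be sanity-checked numerically. The sketch below assumes a boost along the \(3\)-axis with a hypothetical velocity `beta` and mass `mass`. Since such a boost leaves \(p_1, p_2\) unchanged, the Jacobian determinant of \(\pbf \mapsto \pbf'\) reduces to \(\partial p_3' / \partial p_3\), and invariance of \(d^3\pbf / p_0\) is equivalent to this derivative being equal to \(p_0' / p_0\). The derivative is estimated by a central difference, so agreement is only expected up to discretization error.

```python
import math

def energy(pvec, mass):
    """On-shell energy p0 = sqrt(|p|^2 + M^2)."""
    return math.sqrt(sum(c * c for c in pvec) + mass * mass)

def boost_3(pvec, mass, beta):
    """Boost an on-shell 3-momentum along the 3-axis with velocity beta."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    p1, p2, p3 = pvec
    return (p1, p2, gamma * (p3 + beta * energy(pvec, mass)))

def jacobian_3(pvec, mass, beta, h=1e-6):
    """Central-difference estimate of d p3' / d p3 (p1' and p2' are unchanged)."""
    p1, p2, p3 = pvec
    up = boost_3((p1, p2, p3 + h), mass, beta)[2]
    down = boost_3((p1, p2, p3 - h), mass, beta)[2]
    return (up - down) / (2.0 * h)

mass, beta = 1.0, 0.6          # hypothetical values
p = (0.3, -0.2, 0.5)
jac = jacobian_3(p, mass, beta)
ratio = energy(boost_3(p, mass, beta), mass) / energy(p, mass)
print(jac, ratio)              # the two numbers should agree closely
```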

We can use it to find the Lorentz-invariant Dirac delta (marked in blue) as follows

\[\begin{split}f(\pbf') &\eqqcolon \int d^3\pbf ~\delta(\pbf' - \pbf) f(\pbf) \\ &= \int \frac{d^3\pbf}{\sqrt{\pbf^2 + M^2}} \blue{p_0 \delta(\pbf' - \pbf)} f(\pbf)\end{split}\]

It follows from Lorentz invariance that \(p_0 \delta(\pbf' - \pbf) = k_0 \delta(\kbf' - \kbf)\). Hence we can finally establish Eq.1.3.8 as follows

\[\begin{split}(\Psi_{p', \sigma'}, \Psi_{p, \sigma}) &= N(p) N(p')^{\ast} (U(L(p')) \Psi_{k', \sigma'}, U(L(p)) \Psi_{k, \sigma}) \\ &= |N(p)|^2 \delta_{\sigma' \sigma} \delta(\kbf' - \kbf) \\ &= \delta_{\sigma' \sigma} \delta(\pbf' - \pbf)\end{split}\]

if we define \(N(p) = \sqrt{k_0 / p_0}\).

Putting everything together, we’ve obtained the following grand formula for the Lorentz transformation law

(1.3.10)#\[U(\Lambda) \Psi_{p, \sigma} = \sqrt{\frac{(\Lambda p)_0}{p_0}} \sum_{\sigma'} D_{\sigma' \sigma}(W(\Lambda, p)) \Psi_{\Lambda p, \sigma'}\]

where \(D_{\sigma' \sigma}\) is a unitary representation of the little group, and \(W(\Lambda, p)\) is defined by Eq.1.3.5.

1.3.2. Massive particle states#

Recall the standard \(4\)-momentum \(k = (M, 0, 0, 0)\) in this case. Obviously the little group here is nothing but the \(3\)-rotation group \(SO(3)\). We can work out \(D_{\sigma \sigma'}(\Rcal)\) for a rotation \(\Rcal \in SO(3)\) up to first order as follows.

First write \(\Rcal^{ij} = \delta^{ij} + \Theta^{ij}\) such that \(\Theta\) is anti-symmetric. Then expand \(D_{\sigma \sigma'} (\Rcal)\) similar to Eq.1.2.18 up to first order as follows

\[D_{\sigma \sigma'} (\Rcal) = \delta_{\sigma \sigma'} + \tfrac{\ifrak}{2} \Theta^{ij} (J_{ij})_{\sigma \sigma'}\]

where \(J_{ij}\) is a collection of Hermitian operators that satisfy \(J_{ij} = -J_{ji}\) and the commutation relations Eq.1.2.22. It turns out that there exists an infinite number of such unitary representations indexed by nonnegative half-integers \(\jfrak = 0, \tfrac{1}{2}, 1, \tfrac{3}{2}, \cdots\), each of which has dimension \(2\jfrak + 1\). Choosing the \(3\)-axis as the preferred axis of (definite) spin, we can summarize the result as follows

(1.3.11)#\[D^{(\jfrak)}_{\sigma \sigma'} (\Rcal) = \delta_{\sigma \sigma'} + \tfrac{\ifrak}{2} \Theta^{ij} \left( J^{(\jfrak)}_{ij} \right)_{\sigma \sigma'}\]

where the matrices \(J^{(\jfrak)}_{ij}\) have the following entries

(1.3.12)#\[\begin{split}\left( J^{(\jfrak)}_{23} \pm \ifrak J^{(\jfrak)}_{31} \right)_{\sigma \sigma'} \equiv \left( J^{(\jfrak)}_1 \pm \ifrak J^{(\jfrak)}_2 \right)_{\sigma \sigma'} &= \delta_{\sigma \pm 1, \sigma'} \sqrt{(\jfrak \mp \sigma)(\jfrak \pm \sigma + 1)} \\ \left( J^{(\jfrak)}_{12} \right)_{\sigma \sigma'} \equiv \left( J^{(\jfrak)}_3 \right)_{\sigma \sigma'} &= \sigma \delta_{\sigma \sigma'} \label{eq_j3_matrix}\end{split}\]

where \(\sigma, \sigma'\) run through the values \(-\jfrak, -\jfrak + 1, \cdots, \jfrak - 1, \jfrak\).
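As a sanity check, one can build these matrices for small spins and verify \([J_1, J_2] = \ifrak J_3\) and \(J_1^2 + J_2^2 + J_3^2 = \jfrak(\jfrak + 1)\) numerically. The sketch below makes one assumption about index placement that is not fixed by the text: the raising operator \(J_1 + \ifrak J_2\) is stored with row \(\sigma + 1\) and column \(\sigma\), so that it maps \(\Psi_{k, \sigma}\) to a multiple of \(\Psi_{k, \sigma + 1}\). All function names are illustrative.

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def spin_matrices(two_spin):
    """Return (J1, J2, J3) for spin two_spin / 2, basis ordered sigma = -j, ..., +j."""
    dim = two_spin + 1
    spin = two_spin / 2.0
    sigmas = [-spin + n for n in range(dim)]
    raising = [[0.0] * dim for _ in range(dim)]
    for n in range(dim - 1):
        s = sigmas[n]
        # Assumed placement: row sigma + 1, column sigma.
        raising[n + 1][n] = math.sqrt((spin - s) * (spin + s + 1))
    lowering = [[raising[c][r] for c in range(dim)] for r in range(dim)]  # transpose
    j1 = [[(raising[r][c] + lowering[r][c]) / 2 for c in range(dim)] for r in range(dim)]
    j2 = [[(raising[r][c] - lowering[r][c]) / (2 * 1j) for c in range(dim)] for r in range(dim)]
    j3 = [[sigmas[r] if r == c else 0.0 for c in range(dim)] for r in range(dim)]
    return j1, j2, j3

def commutator_defect(two_spin):
    """Largest entry of [J1, J2] - i J3, which should vanish."""
    j1, j2, j3 = spin_matrices(two_spin)
    left, right = matmul(j1, j2), matmul(j2, j1)
    comm = [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(left, right)]
    return max_abs_diff(comm, [[1j * x for x in row] for row in j3])

def casimir_defect(two_spin):
    """Largest entry of J1^2 + J2^2 + J3^2 - j(j+1) times the identity."""
    dim, spin = two_spin + 1, two_spin / 2.0
    squares = [matmul(m, m) for m in spin_matrices(two_spin)]
    total = [[sum(sq[r][c] for sq in squares) for c in range(dim)] for r in range(dim)]
    target = [[spin * (spin + 1) if r == c else 0.0 for c in range(dim)] for r in range(dim)]
    return max_abs_diff(total, target)

for two_spin in range(5):  # spins 0, 1/2, 1, 3/2, 2
    print(two_spin / 2, commutator_defect(two_spin), casimir_defect(two_spin))
```

With the opposite index placement, \(J_2\) changes sign and the commutator check fails, which is why the placement is spelled out as an assumption.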

We end the discussion about massive particle states by working out the little group elements \(W(\Lambda, p)\) defined by Eq.1.3.5. To this end, it suffices to work out the standard \(L(p)\) such that \(L(p) k = p\), where \(k = (M, 0, 0, 0)\). We have already worked out such a transformation in Eq.1.2.13 and Eq.1.2.15 in spacetime coordinates, so we only need to translate it into \(4\)-momentum coordinates.

Using Eq.1.2.7, we can rewrite \(\gamma\) defined by Eq.1.2.12 as follows

\[\pbf = \frac{M \vbf}{\sqrt{1 - \vbf^2}} \implies \gamma \coloneqq \frac{1}{\sqrt{1 - \vbf^2}} = \frac{\sqrt{M^2 + \pbf^2}}{M} \left( = \frac{p_0}{M} \right)\]

It follows that

(1.3.17)#\[\begin{split}{L(p)_0}^0 = {L(p)^0}_0 &= \gamma \\ {L(p)_i}^0 = -{L(p)^0}_i &= \frac{p_i}{M} \\ L(p)_{ij} &= \delta_{ij} + \frac{p_i p_j}{\pbf^2} (\gamma - 1)\end{split}\]

Finally, we note an important fact that when \(\Lambda = \Rcal\) is a \(3\)-rotation, then

(1.3.18)#\[W(\Rcal, p) = \Rcal\]

for any \(p\). To see this, we’ll work out how \(W(\Rcal, p)\) acts on \((1, \mathbf{0}), (0, \pbf)\), and \((0, \qbf)\), respectively, where \(\qbf\) is any \(3\)-vector perpendicular to \(\pbf\), as follows

\[\begin{split}\begin{alignat*}{2} W(\Rcal, p)(1, \mathbf{0}) &= L(\Rcal p)^{-1} \Rcal (\gamma, \pbf / M) &&= L(\Rcal p)^{-1} (\gamma, \Rcal \pbf / M) &&= (1, \mathbf{0}) \\ W(\Rcal, p)(0, \pbf) &= L(\Rcal p)^{-1} \Rcal (\pbf^2 / M, \gamma \pbf) &&= L(\Rcal p)^{-1} (\pbf^2 / M, \gamma \Rcal \pbf) &&= (0, \Rcal \pbf) \\ W(\Rcal, p)(0, \qbf) &= L(\Rcal p)^{-1} \Rcal (0, \qbf) &&= L(\Rcal p)^{-1} (0, \Rcal \qbf) &&= (0, \Rcal \qbf) \end{alignat*}\end{split}\]

where we have used the fact that \(\gamma\) is \(\Rcal\)-invariant.

This observation is important since it implies that non-relativistic calculations about angular momenta, such as the Clebsch-Gordan coefficients, can be literally carried over to the relativistic setting.
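The identity Eq.1.3.18 can also be checked numerically. The sketch below is written under stated assumptions: \(L(p)\) is taken to be the symmetric \(4 \times 4\) matrix acting on components \((p_0, \pbf)\) with \(L(p)(1, \mathbf{0}) = (\gamma, \pbf / M)\) and \(L(p)(0, \pbf) = (\pbf^2 / M, \gamma \pbf)\), as used in the calculation above, and the inverse of a pure boost is taken to be the boost with the opposite \(3\)-momentum. The mass, momentum and angle are hypothetical values.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def apply(m, v):
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]

def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def boost(pvec, mass):
    """Standard boost L(p) with L(p)(M, 0, 0, 0) = (p0, pvec), cf. Eq.1.3.17."""
    p_sq = sum(c * c for c in pvec)
    gamma = math.sqrt(mass * mass + p_sq) / mass
    out = [[0.0] * 4 for _ in range(4)]
    out[0][0] = gamma
    for i in range(3):
        out[0][i + 1] = out[i + 1][0] = pvec[i] / mass
        for j in range(3):
            out[i + 1][j + 1] = 1.0 if i == j else 0.0
            if p_sq > 0:
                out[i + 1][j + 1] += pvec[i] * pvec[j] / p_sq * (gamma - 1.0)
    return out

def rotation_3(theta):
    """Rotation about the 3-axis, embedded in 4 dimensions."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0, 0, 0], [0, c, -s, 0], [0, s, c, 0], [0, 0, 0, 1.0]]

def wigner(lam, pvec, mass):
    """W(Lambda, p) = L(Lambda p)^{-1} Lambda L(p), using L(q)^{-1} = L(-q)."""
    p = [math.sqrt(mass * mass + sum(c * c for c in pvec))] + list(pvec)
    q = apply(lam, p)
    inverse = boost([-c for c in q[1:]], mass)
    return matmul(inverse, matmul(lam, boost(pvec, mass)))

mass, pvec = 2.0, (0.3, -0.4, 1.2)       # hypothetical values
rot = rotation_3(0.7)
w_rot = wigner(rot, pvec, mass)          # expected to reproduce rot
lam = boost((0.8, 0.0, 0.0), 1.0)        # a boost along the 1-axis, used as Lambda
w_boost = wigner(lam, pvec, mass)        # expected to fix k
k = [mass, 0.0, 0.0, 0.0]
print(max_abs_diff(w_rot, rot), apply(w_boost, k))
```

The second check illustrates that \(W(\Lambda, p)\) fixes \(k\) for a boost \(\Lambda\) as well, although in that case \(W\) is a nontrivial rotation rather than \(\Lambda\) itself.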

1.3.3. Massless particle states#

Recall the standard \(k = (1, 0, 0, 1)\) for massless particles. Our first task is to work out the little group, i.e., Lorentz transformations \(W\) such that \(Wk = k\). More precisely, we’ll work out the column vectors of \(W\) by thinking of them as the results of \(W\) acting on the standard basis vectors. Let’s start with \(v \coloneqq (1, 0, 0, 0)\), and perform the following calculations on \(Wv\) using properties of Lorentz transformations

\[\begin{split}\begin{alignat*}{2} (Wv)^{\mu} (Wv)_{\mu} &= v^{\mu} v_{\mu} &&= -1 \\ (Wv)^{\mu} k_\mu &= v^{\mu} k_{\mu} &&= -1 \end{alignat*}\end{split}\]

It follows that we can write \(Wv = (1 + c, a, b, c)\) with \(a^2 + b^2 = 2c\). Playing similar games with the other basis vectors, we can engineer a particular Lorentz transformation as follows

(1.3.27)#\[\begin{split}{S_{\mu}}^{\nu}(a, b) = \begin{bmatrix*}[r] 1 + c & a & b & -c \\ a & 1 & 0 & -a \\ b & 0 & 1 & -b \\ c & a & b & 1 - c \end{bmatrix*}\end{split}\]

which leaves \(k\) invariant, and satisfies \(Sv = Wv\). It follows that \(S^{-1} W\) must be a rotation about the \(3\)-axis, which can be written as follows

(1.3.28)#\[\begin{split}R(\theta) = \begin{bmatrix*}[r] 1 & 0 & 0 & 0 \\ 0 & \cos\theta & \sin\theta & 0 \\ 0 & -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix*}\end{split}\]

Hence we can write any element in the little group as \(W(a, b, \theta) = S(a, b) R(\theta)\).
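The claimed properties of \(S(a, b)\) and \(R(\theta)\) are easy to test numerically. The sketch below assumes that the matrices of Eq.1.3.27 and Eq.1.3.28 act on column vectors \((x_0, x_1, x_2, x_3)\) and that the Lorentz condition reads \(W^T \eta W = \eta\) with \(\eta = \op{diag}(-1, 1, 1, 1)\). It checks that \(W = S(a, b) R(\theta)\) is a Lorentz transformation fixing \(k\), that \(S(a, b) S(a', b') = S(a + a', b + b')\), and that conjugating \(S(a, b)\) by \(R(\theta)\) gives \(S\) of the rotated pair. The parameter values are arbitrary.

```python
import math

ETA = [[-1.0, 0, 0, 0], [0, 1.0, 0, 0], [0, 0, 1.0, 0], [0, 0, 0, 1.0]]
K = [1.0, 0.0, 0.0, 1.0]

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def transpose(x):
    return [list(col) for col in zip(*x)]

def apply(m, v):
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]

def max_abs_diff(x, y):
    return max(abs(p - q) for rx, ry in zip(x, y) for p, q in zip(rx, ry))

def S(a, b):
    """The matrix of Eq.1.3.27 with c = (a^2 + b^2) / 2."""
    c = (a * a + b * b) / 2.0
    return [[1 + c, a, b, -c], [a, 1.0, 0.0, -a], [b, 0.0, 1.0, -b], [c, a, b, 1 - c]]

def R(theta):
    """The rotation about the 3-axis of Eq.1.3.28."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0, 0, 0], [0, c, s, 0], [0, -s, c, 0], [0, 0, 0, 1.0]]

def lorentz_defect(m):
    """Largest entry of m^T eta m - eta (zero for a Lorentz transformation)."""
    return max_abs_diff(matmul(transpose(m), matmul(ETA, m)), ETA)

a, b, theta = 0.4, -1.1, 0.9
w = matmul(S(a, b), R(theta))
fixes_k = max(abs(x - y) for x, y in zip(apply(w, K), K))
addition = max_abs_diff(matmul(S(a, b), S(0.7, 0.2)), S(a + 0.7, b + 0.2))
a_rot = a * math.cos(theta) + b * math.sin(theta)
b_rot = -a * math.sin(theta) + b * math.cos(theta)
conjugation = max_abs_diff(matmul(R(theta), matmul(S(a, b), R(-theta))), S(a_rot, b_rot))
print(lorentz_defect(w), fixes_k, addition, conjugation)  # all should be tiny
```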

The little group is \(~E^+(2)\)

Although not necessary for our purposes here, we’d like to better understand the little group for \(k = (1, 0, 0, 1)\) in terms of more familiar groups. It turns out that it’s isomorphic to the \(2\)-dimensional orientation-preserving Euclidean group \(E^+(2)\), i.e., the group of rotations and translations on the plane.

To see this, we go back to the defining property of \(W\) that it fixes \(k\). It follows that it must also fix the orthogonal complement \(k^{\bot}\) with respect to the bilinear form \(d\tau^2\) defined in Eq.1.2.5. Since \(k\) is orthogonal to itself, we can uniquely determine \(W\) by knowing its action on \((1, 1, 0, 1)\) and \((1, 0, 1, 1)\). Letting \(S(a, b)\) act on them, we see

\[\begin{split}S(a, b)(1, 1, 0, 1) &= (1 + a, 1, 0, 1 + a) \\ S(a, b)(1, 0, 1, 1) &= (1 + b, 0, 1, 1 + b)\end{split}\]

Hence \(S\) is isomorphic to a \(2\)-dimensional translation group. Moreover, the direction of translation is determined by the rotation on the plane spanned by the second and the third coordinates, which is nothing but \(R\).

Note that \(E^+ (2)\) is not semisimple in the sense that it possesses an abelian normal subgroup. Indeed, it’s obvious from the above discussion that a translation conjugated by a rotation is again a translation (in the rotated direction). The non-semisimplicity will have consequences on the representation as we will see below.

As in the massive case, we’ll work out \(D_{\sigma \sigma'}\) up to first order. To this end, note that up to first order

\[\begin{split}{W(a, b, \theta)_{\mu}}^{\nu} &= \left(1 + \begin{bmatrix*}[r] 0 & a & b & 0 \\ a & 0 & 0 & -a \\ b & 0 & 0 & -b \\ 0 & a & b & 0 \end{bmatrix*} \right) \left(1 + \begin{bmatrix*}[r] 0 & 0 & 0 & 0 \\ 0 & 0 & \theta & 0 \\ 0 & -\theta & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix*} \right) + \cdots \\ &= 1 + \begin{bmatrix*}[r] 0 & a & b & 0 \\ a & 0 & \theta & -a \\ b & -\theta & 0 & -b \\ 0 & a & b & 0 \end{bmatrix*} + \cdots\end{split}\]

where we’ve added the \(4\)-indexes since we recall from discussions in Quantum Lorentz symmetry that we must lift the \(\omega\) index to make it anti-symmetric. We now rewrite

\[\begin{split}W(a, b , \theta)^{\mu \nu} = \eta^{\mu \sigma} {W_{\sigma}}^{\nu} = 1 + \begin{bmatrix*}[r] 0 & -a & -b & 0 \\ a & 0 & \theta & -a \\ b & -\theta & 0 & -b \\ 0 & a & b & 0 \end{bmatrix*} + \cdots\end{split}\]

and spell out the expansion of \(D(a, b, \theta) \coloneqq D(W(a, b, \theta))\) as follows

(1.3.29)#\[D(a, b, \theta) = 1 + \ifrak aA + \ifrak bB + \ifrak \theta J_3\]

where

\[\begin{split}\begin{alignat*}{2} A &= -J_{01} - J_{13} &&= -K_1 + J_2 \\ B &= -J_{02} - J_{23} &&= -K_2 - J_1 \end{alignat*}\end{split}\]

Next we use Eq.1.2.22 to calculate commutation relations between \(A, B\) and \(J_3\) as follows

\[\begin{split}\begin{alignat*}{2} [J_3, A] &= -&&\ifrak K_2 &&- \ifrak J_1 &&= \ifrak B \\ [J_3, B] &= &&\ifrak K_1 &&- \ifrak J_2 &&= -\ifrak A \\ [A, B] &= -&&\ifrak J_3 &&+ \ifrak J_3 &&= 0 \end{alignat*}\end{split}\]

Since \(A, B\) commute, we can use their eigenvalues to label states as follows

\[\begin{split}A \Psi_{k, a, b} &= a \Psi_{k, a, b} \\ B \Psi_{k, a, b} &= b \Psi_{k, a, b}\end{split}\]

In fact, these states, corresponding to translation symmetries, come in continuous families as shown below

\[\begin{split}AU^{-1}(R(\theta)) \Psi_{k, a, b} &= (a\cos\theta - b\sin\theta)U^{-1}(R(\theta)) \Psi_{k, a, b} \\ BU^{-1}(R(\theta)) \Psi_{k, a, b} &= (a\sin\theta + b\cos\theta)U^{-1}(R(\theta)) \Psi_{k, a, b}\end{split}\]

According to [Wei95] (page 72), massless particle states are not observed to come in such \(S^1\)-families. Hence the only possibility is that \(a = b = 0\) and the only symmetry left then is \(J_3\), which corresponds to a rotation about the \(3\)-axis.

Unlike the \(SO(3)\)-symmetry discussed in Representations of angular momenta, representations of \(J_3\) alone cannot be characterized at the infinitesimal level, which would have resulted in a continuous spectrum. Instead, since a \(2\pi\)-rotation about the \(3\)-axis gives the identity transformation, one might expect an integer spectrum for \(J_3\). This is indeed the case if we assume the representation is genuine. However, since the Lorentz group is not simply connected (with fundamental group \(\Zbb/2\)), one may encounter projective representations. Indeed, the \(2\pi\)-rotation about the \(3\)-axis represents a generator of the fundamental group, which has order \(2\), i.e., only the \(4\pi\)-rotation about the \(3\)-axis represents a contractible loop in the Lorentz group (see the Plate trick). As a result, the \(J_3\)-spectrum actually consists of half-integers, just like the spins. We can therefore write a general massless particle state as \(\Psi_{k, \sigma}\) such that

\[J_3 \Psi_{k, \sigma} = \sigma \Psi_{k, \sigma}\]

where the integer or half-integer \(\sigma\) is known as the helicity.

Combining the discussions so far, we can write down the \(D\)-matrix defined by Eq.1.3.29 as follows

(1.3.30)#\[D_{\sigma \sigma'}(W(a, b, \theta)) = \exp(\ifrak \theta \sigma) \delta_{\sigma \sigma'}\]

where we recall \(W(a, b, \theta) = L(\Lambda p)^{-1} \Lambda L(p) = S(a, b)R(\theta)\). The Lorentz transformation formula Eq.1.3.10 for massless particles now becomes

(1.3.31)#\[U(\Lambda) \Psi_{p, \sigma} = \sqrt{\frac{(\Lambda p)_0}{p_0}} \exp(\ifrak \theta(\Lambda, p) \sigma) \Psi_{\Lambda p, \sigma}\]

In particular, we see that, unlike the spin \(z\)-component of massive particles, helicity is Lorentz invariant (at least under genuine representations). It is reasonable, therefore, to think of massless particles of different helicity as different particle species. Examples include photons with \(\sigma = \pm 1\) and gravitons with \(\sigma = \pm 2\), but not (anti-)neutrinos with hypothetical \(\sigma = \pm \tfrac{1}{2}\) as otherwise stated in [Wei95] (page 73 – 74), which are now known to have a nonzero mass. Here the \(\pm\) signs are related to the space-inversion symmetry Eq.1.2.11, which will be discussed in detail later.

In order to use Eq.1.3.30 for a general \((\Lambda, p)\), we first need to fix the choice of \(L(p)\) that takes the standard \(k = (1, 0, 0, 1)\) to \(p\). This can be done in two steps. First apply a (pure) boost along the \(3\)-axis

(1.3.32)#\[\begin{split}\begin{bmatrix} (p_0^2 + 1) / 2p_0 & 0 & 0 & (p_0^2 - 1) / 2p_0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ (p_0^2 - 1) / 2p_0 & 0 & 0 & (p_0^2 + 1) / 2p_0 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} p_0 \\ 0 \\ 0 \\ p_0 \end{bmatrix} = \begin{bmatrix} p_0 \\ 0 \\ 0 \\ |\pbf| \end{bmatrix}\end{split}\]

Then apply a (pure) rotation that takes \((0, 0, |\pbf|)\) to \(\pbf\). However, in contrast to the massive case Eq.1.3.17, where \(L(p)\) depends continuously on \(p\), there exists no continuous family of rotations that take \((0, 0, |\pbf|)\) to any other \(3\)-vector (of the same length). Fortunately, any two choices of such rotations differ by (a pre-composition of) a rotation about the \(3\)-axis, which, according to Eq.1.3.30, only produces a physically immaterial phase factor.
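The boost step Eq.1.3.32 can be checked in a few lines. The sketch below only covers the boost, not the subsequent rotation, and the function name `boost_3` is illustrative. It verifies that the matrix sends \(k = (1, 0, 0, 1)\) to \((p_0, 0, 0, p_0)\), that its entries satisfy \(\cosh^2 - \sinh^2 = 1\), and that two such boosts compose by multiplying their parameters, which reflects the fact that \(p_0 = e^{\text{rapidity}}\) here.

```python
def boost_3(p0):
    """The boost of Eq.1.3.32 taking (1, 0, 0, 1) to (p0, 0, 0, p0), for p0 > 0."""
    ch = (p0 * p0 + 1.0) / (2.0 * p0)   # cosh of the rapidity ln(p0)
    sh = (p0 * p0 - 1.0) / (2.0 * p0)   # sinh of the rapidity ln(p0)
    return [[ch, 0.0, 0.0, sh], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0], [sh, 0.0, 0.0, ch]]

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def apply(m, v):
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]

def max_abs_diff(x, y):
    return max(abs(p - q) for rx, ry in zip(x, y) for p, q in zip(rx, ry))

k = [1.0, 0.0, 0.0, 1.0]
image = apply(boost_3(2.5), k)                        # close to (2.5, 0, 0, 2.5)
b = boost_3(2.5)
hyperbolic = b[0][0] ** 2 - b[0][3] ** 2              # close to 1
composition = max_abs_diff(matmul(boost_3(2.0), boost_3(3.0)), boost_3(6.0))
print(image, hyperbolic, composition)
```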

1.3.4. Space and time inversions#

So far the discussions have been focused on orthochronous (and mostly homogeneous) Lorentz transformations, and in particular, the infinitesimal symmetries in the vicinity of the identity. Now it’s time to take a look at the space and time inversions, defined in Eq.1.2.11 and Eq.1.2.10, respectively, which take us to the other components of the Lorentz group. The main goal is to understand their actions on the one-particle states that have been worked out in the previous two sections.

Let’s write

\[U(\Pcal) \coloneqq U(\Pcal, 0), \quad U(\Tcal) \coloneqq U(\Tcal, 0)\]

for the corresponding quantum symmetry operators, for which we haven’t yet decided whether they should be complex linear or anti-linear. The same calculations that led to Eq.1.2.19 now give

(1.3.33)#\[\begin{split}U(\Pcal) \ifrak P_{\mu} U^{-1}(\Pcal) &= \ifrak {\Pcal_{\mu}}^{\rho} P_{\rho} \\ U(\Pcal) \ifrak J_{\mu \nu} U^{-1}(\Pcal) &= \ifrak {\Pcal_{\mu}}^{\rho} {\Pcal_{\nu}}^{\kappa} J_{\rho \kappa} \\ U(\Tcal) \ifrak P_{\mu} U^{-1}(\Tcal) &= \ifrak {\Tcal_{\mu}}^{\rho} P_{\rho} \\ U(\Tcal) \ifrak J_{\mu \nu} U^{-1}(\Tcal) &= \ifrak {\Tcal_{\mu}}^{\rho} {\Tcal_{\nu}}^{\kappa} J_{\rho \kappa}\end{split}\]

The complex (anti-)linearity of \(U(\Pcal)\) and \(U(\Tcal)\) can then be decided by the postulate that physically meaningful energy must not be negative. More precisely, recall that \(P_0\) is the energy operator. Then Eq.1.3.33 shows

\[U(\Pcal) \ifrak P_0 U^{-1}(\Pcal) = \ifrak P_0\]

If \(U(\Pcal)\) were anti-linear, then \(U(\Pcal) P_0 U^{-1}(\Pcal) = -P_0\). Then for any state \(\Psi\) with positive energy, i.e., \(P_0 \Psi = p_0 \Psi\), we would have a state \(U^{-1}(\Pcal) \Psi\) with negative energy \(-p_0\). Hence we conclude that \(U(\Pcal)\) must be linear. The same argument shows also that \(U(\Tcal)\) must be anti-linear (since \({\Tcal_0}^0 = -1\)).

As before, it’ll be useful to rewrite Eq.1.3.33 in terms of \(H, \Pbf, \Jbf, \Kbf\) as follows

(1.3.34)#\[\begin{split}\begin{alignat*}{3} U(\Pcal) &H U^{-1}(\Pcal) &&= &&H \\ U(\Pcal) &\Pbf U^{-1}(\Pcal) &&= -&&\Pbf \\ U(\Pcal) &\Jbf U^{-1}(\Pcal) &&= &&\Jbf \\ U(\Pcal) &\Kbf U^{-1}(\Pcal) &&= -&&\Kbf \\ U(\Tcal) &H U^{-1}(\Tcal) &&= &&H \\ U(\Tcal) &\Pbf U^{-1}(\Tcal) &&= -&&\Pbf \\ U(\Tcal) &\Jbf U^{-1}(\Tcal) &&= -&&\Jbf \\ U(\Tcal) &\Kbf U^{-1}(\Tcal) &&= &&\Kbf \\ \end{alignat*}\end{split}\]

One can (and should) try to reconcile these implications with commonsense. For example, the \(3\)-momentum \(\Pbf\) changes direction under either space or time inversion as expected. Moreover, the spin (of for example a basketball) remains the same under space inversion because both the direction of the axis and the handedness of the rotation get reversed simultaneously, but it gets reversed under time inversion because the direction of rotation is reversed if time flows backwards.

In what follows we will work out the effects of space and time inversions on massive and massless particles, respectively.

1.3.4.1. Space inversion for massive particles#

We start by considering a state at rest \(\Psi_{k, \sigma}\), where \(k = (M, 0, 0, 0)\) and \(\sigma\) is an eigenvalue of \(J_3\) under one of the spin representations discussed in Representations of angular momenta. Since the state is at rest and \(U(\Pcal)\) commutes with \(J_3\) according to Eq.1.3.34, we can write

(1.3.35)#\[U(\Pcal) \Psi_{k, \sigma} = \eta \Psi_{k, \sigma}\]

where \(\eta\) is a phase that depends a priori on \(\sigma\). It turns out, however, that \(\eta\) is actually independent of \(\sigma\), and hence justifies the notation, since \(U(\Pcal)\) commutes with the raising/lowering operators \(J_1 \pm \ifrak J_2\) by Eq.1.3.34.

To move on to the general case, we recall that the general formula Eq.1.3.3 takes the following form

\[\Psi_{p, \sigma} = \sqrt{\frac{M}{p_0}} U(L(p)) \Psi_{k, \sigma}\]

We can calculate as follows

(1.3.36)#\[U(\Pcal) \Psi_{p, \sigma} = \sqrt{\frac{M}{p_0}} U(\Pcal L(p) \Pcal^{-1}) U(\Pcal) \Psi_{k, \sigma} = \eta~\sqrt{\frac{M}{p_0}} U(L(\Pcal p)) \Psi_{k, \sigma} = \eta \Psi_{\Pcal p, \sigma}\]

which generalizes Eq.1.3.35. Such \(\eta\) is known as the intrinsic parity, which is intrinsic to a particle species.

1.3.4.2. Time inversion for massive particles#

Consider the same \(\Psi_{k, \sigma}\) as in the space inversion case. Now since \(U(\Tcal)\) anti-commutes with \(J_3\) according to Eq.1.3.34, we can write

\[U(\Tcal) \Psi_{k, \sigma} = \zeta_{\sigma} \Psi_{k, -\sigma}\]

where \(\zeta_{\sigma}\) is a phase. Applying the raising/lowering operators and using Eq.1.3.12, we can calculate the left-hand-side, recalling that \(U(\Tcal)\) is anti-linear, as follows

\[\begin{split}(J_1 \pm \ifrak J_2) U(\Tcal) \Psi_{k, \sigma} &= -U(\Tcal) (J_1 \mp \ifrak J_2) \Psi_{k, \sigma} \\ &= -U(\Tcal) \sqrt{(\jfrak \pm \sigma)(\jfrak \mp \sigma + 1)} \Psi_{k, \sigma \mp 1} \\ &= -\zeta_{\sigma \mp 1} \sqrt{(\jfrak \pm \sigma)(\jfrak \mp \sigma + 1)} \Psi_{k, -\sigma \pm 1}\end{split}\]

where \(\jfrak\) is the particle spin, and the right-hand-side as follows

\[(J_1 \pm \ifrak J_2) \zeta_{\sigma} \Psi_{k, -\sigma} = \zeta_{\sigma} \sqrt{(\jfrak \pm \sigma)(\jfrak \mp \sigma + 1)} \Psi_{k, -\sigma \pm 1}\]

Equating the two sides, we see that \(\zeta_{\sigma} = -\zeta_{\sigma \pm 1}\). Up to an overall phase, we can set \(\zeta_{\sigma} = \zeta (-1)^{\jfrak - \sigma}\) so that

\[U(\Tcal) \Psi_{k, \sigma} = \zeta (-1)^{\jfrak - \sigma} \Psi_{k, -\sigma}\]

Here we have chosen to keep the option of a physically inconsequential phase \(\zeta\) open. As in the case of space inversion, the formula generalizes to any \(4\)-momentum \(p\)

(1.3.37)#\[U(\Tcal) \Psi_{p, \sigma} = \zeta (-1)^{\jfrak - \sigma} \Psi_{\Pcal p, -\sigma}\]

since \(\Tcal L(p) \Tcal^{-1} = L(\Pcal p)\).
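Both Eq.1.3.36 and Eq.1.3.37 rely on the matrix identities \(\Pcal L(p) \Pcal^{-1} = L(\Pcal p)\) and \(\Tcal L(p) \Tcal^{-1} = L(\Pcal p)\). A minimal numeric check is sketched below, assuming \(\Pcal = \op{diag}(1, -1, -1, -1)\), \(\Tcal = \op{diag}(-1, 1, 1, 1)\), and the same matrix form of the standard boost \(L(p)\) as in Eq.1.3.17 acting on components \((p_0, \pbf)\). The mass and momentum are hypothetical values.

```python
import math

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def max_abs_diff(x, y):
    return max(abs(p - q) for rx, ry in zip(x, y) for p, q in zip(rx, ry))

def diag(entries):
    return [[entries[i] if i == j else 0.0 for j in range(4)] for i in range(4)]

def boost(pvec, mass):
    """Standard boost L(p) with L(p)(M, 0, 0, 0) = (p0, pvec), cf. Eq.1.3.17."""
    p_sq = sum(c * c for c in pvec)
    gamma = math.sqrt(mass * mass + p_sq) / mass
    out = [[0.0] * 4 for _ in range(4)]
    out[0][0] = gamma
    for i in range(3):
        out[0][i + 1] = out[i + 1][0] = pvec[i] / mass
        for j in range(3):
            out[i + 1][j + 1] = 1.0 if i == j else 0.0
            if p_sq > 0:
                out[i + 1][j + 1] += pvec[i] * pvec[j] / p_sq * (gamma - 1.0)
    return out

P = diag([1.0, -1.0, -1.0, -1.0])   # space inversion, equal to its own inverse
T = diag([-1.0, 1.0, 1.0, 1.0])     # time inversion, equal to its own inverse

mass, pvec = 1.5, (0.6, -0.2, 0.9)
flipped = boost([-c for c in pvec], mass)   # L(P p): same energy, opposite 3-momentum
space_defect = max_abs_diff(matmul(P, matmul(boost(pvec, mass), P)), flipped)
time_defect = max_abs_diff(matmul(T, matmul(boost(pvec, mass), T)), flipped)
print(space_defect, time_defect)            # both should vanish
```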

1.3.4.3. Space inversion for massless particles#

Let’s consider a state \(\Psi_{k, \sigma}\) with \(k = (1, 0, 0, 1)\) and \(\sigma\) being the helicity, i.e., \(J_3 \Psi_{k, \sigma} = \sigma \Psi_{k, \sigma}\). Since \(U(\Pcal)\) commutes with \(J_3\), the space inversion preserves \(\sigma\), just as in the massive case. However, since \(\Pcal\) reverses the direction of motion, the helicity in the direction of motion actually reverses sign. It follows, in particular, that (massless) particles that respect the space inversion symmetry must be accompanied by another particle of opposite helicity.

To spell out more details, note that since \(\Pcal\) doesn’t fix \(k\), it’ll be convenient to introduce an additional rotation \(R_2\), which is defined to be a \(\pi\)-rotation about the \(2\)-axis, so that \(U(R_2) = \exp(\ifrak \pi J_2)\) and \(R_2 \Pcal k = k\). Since \(U(R_2)\) flips the sign of \(J_3\), as can be seen from the very definition of \(J_3\) in Eq.1.2.18, we have

\[U(R_2 \Pcal) \Psi_{k, \sigma} = \eta_{\sigma} \Psi_{k, -\sigma}\]

where we see indeed that the helicity reverses sign (when \(k\) is fixed).

To move on to the general case, recall that the \(L(p)\) that takes \(k\) to \(p\) consists of a boost \(B\) defined by Eq.1.3.32 followed by a (chosen) pure rotation \(R(\pbf)\) that takes \((0, 0, |\pbf|)\) to \(\pbf\). We calculate as follows

(1.3.38)#\[\begin{split}U(\Pcal) \Psi_{p, \sigma} &= p_0^{-1/2} U(\Pcal R(\pbf)B) \Psi_{k, \sigma} \\ &= p_0^{-1/2} U(R(\pbf) R_2^{-1} B) U(R_2 \Pcal) \Psi_{k, \sigma} \\ &= p_0^{-1/2} \eta_{\sigma} U(R(\pbf) R_2^{-1} B) \Psi_{k, -\sigma} \\ &= \eta_{\sigma} \rho \Psi_{\Pcal p, -\sigma}\end{split}\]

where we have used that \(\Pcal\) commutes with rotations and that \(R_2 \Pcal\) commutes with the boost \(B\) along the \(3\)-axis, and \(\rho\) is an extra phase due to the fact that although \(R(\pbf) R_2^{-1}\) takes \((0, 0, |\pbf|)\) to \(-\pbf\), it may not be the chosen one.

To spell out \(\rho\), we need to be a bit more specific about \(R(\pbf)\). Following the usual convention of spherical coordinates, we can get from \((0, 0, |\pbf|)\) to \(\pbf\) by first rotating (according to the right-hand rule) about the \(1\)-axis by an angle \(0 \leq \phi \leq \pi\), known as the polar angle, and then rotating about the \(3\)-axis by an angle \(0 \leq \theta < 2\pi\), known as the azimuthal angle. Now since we know that \(R(\pbf)R_2^{-1}\) differs from \(R(-\pbf)\) by a rotation about the \(3\)-axis, we can figure the rotation out by examining their actions on some suitably generic \(\pbf\), for example, the unit vector along the \(2\)-axis, which is fixed by \(R_2\). In this case \(R(-\pbf)\) is a \(\pi/2\)-rotation about the \(1\)-axis, while \(R(\pbf)R_2^{-1}\) is the same \(\pi/2\)-rotation about the \(1\)-axis, followed by a \(\pi\)-rotation about the \(3\)-axis. Therefore we conclude that the difference is a \(\pi\)-rotation about the \(3\)-axis. In other words, we should have \(\rho = \exp(-\ifrak \pi \sigma)\). However, recalling that the helicity \(\sigma\) may be a half-integer (though none has yet been found in Nature), there is a sign difference between \(\pm \pi\)-rotations about the \(3\)-axis. Without going into further details, we write down the general formula for the space inversion as follows

(1.3.39)#\[U(\Pcal) \Psi_{p, \sigma} = \eta_{\sigma} \exp(\mp \ifrak \pi \sigma) \Psi_{\Pcal p, -\sigma}\]

where the sign depends on the sign of \(p_2\) (which can be seen by playing the same game as above with the negative unit vector along the \(2\)-axis).

1.3.4.4. Time inversion for massless particles#

Let \(k = (1, 0, 0, 1)\) as usual and consider the state \(\Psi_{k, \sigma}\). Since \(U(\Tcal)\) anti-commutes with both \(\Pbf\) and \(J_3\) by Eq.1.3.34, we have

\[U(\Tcal) \Psi_{k, \sigma} = \Psi_{\Pcal k, -\sigma}.\]

Composing with the rotation \(R_2\) as in the previous section to fix \(k\), we have

\[U(R_2 \Tcal) \Psi_{k, \sigma} = \zeta_{\sigma} \Psi_{k, \sigma}\]

where \(\zeta_{\sigma}\) is (yet another) phase. We see that, unlike the space inversion, the time inversion doesn’t produce a doublet of opposite helicity. Proceeding as in the space inversion case, one can derive the following general formula similar to Eq.1.3.39

(1.3.40)#\[U(\Tcal) \Psi_{p, \sigma} = \zeta_{\sigma} \exp(\pm \ifrak \pi \sigma) \Psi_{\Pcal p, \sigma}\]

where the sign depends on the sign of \(p_2\) as before.

1.3.4.5. Kramers’ degeneracy#

We end our discussion about space and time inversions of one-particle states with an interesting observation on the squared time inversion \(U(\Tcal)^2\). It follows from both Eq.1.3.37 and Eq.1.3.40 that

\[U(\Tcal)^2 \Psi_{p, \sigma} = (-1)^{2 s} \Psi_{p, \sigma}\]

where \(s \in \tfrac{1}{2} \Zbb\) equals the spin \(\jfrak\) in the massive case and the absolute helicity \(|\sigma|\) in the massless case.
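For the massive case, the sign \((-1)^{2\jfrak}\) can be reproduced by applying Eq.1.3.37 twice to a generic superposition at fixed momentum. The sketch below is an illustration under stated assumptions: a state is represented as a hypothetical dictionary mapping \(\sigma\) to a complex coefficient, the momentum label is suppressed, and anti-linearity is implemented by complex-conjugating the coefficients. The phase \(\zeta\) drops out as expected.

```python
import cmath
from fractions import Fraction

def time_reverse(state, spin, zeta=1.0):
    """Anti-linear map sending c * Psi_sigma to conj(c) * zeta * (-1)^(j - sigma) * Psi_{-sigma}."""
    out = {}
    for sigma, coeff in state.items():
        sign = -1 if int(spin - sigma) % 2 else 1   # j - sigma is always an integer
        out[-sigma] = zeta * sign * complex(coeff).conjugate()
    return out

def squared_defect(state, spin, zeta):
    """Largest deviation of U(T)^2 state from (-1)^(2j) state."""
    twice = time_reverse(time_reverse(state, spin, zeta), spin, zeta)
    expected = -1 if (2 * spin) % 2 else 1          # (-1)^(2j)
    return max(abs(twice[s] - expected * c) for s, c in state.items())

zeta = cmath.exp(0.4j)   # an arbitrary phase
half_integer = {Fraction(3, 2): 0.5 + 0.1j, Fraction(1, 2): -0.3j,
                Fraction(-1, 2): 0.2, Fraction(-3, 2): 1.0}
integer = {Fraction(1): 0.3j, Fraction(0): 1.0, Fraction(-1): -0.4}

print(squared_defect(half_integer, Fraction(3, 2), zeta))  # U(T)^2 = -1 for spin 3/2
print(squared_defect(integer, Fraction(1), zeta))          # U(T)^2 = +1 for spin 1
```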

Hence in a non-interacting system consisting of an odd number of half-integer spin/helicity particles and any number of integer spin/helicity particles, we have

(1.3.41)#\[U(\Tcal)^2 \Psi = -\Psi\]

Now for any eigenstate \(\Psi\) of the Hamiltonian, there is an accompanying eigenstate \(U(\Tcal) \Psi\) since \(U(\Tcal)\) commutes with the Hamiltonian. The key observation then is that they are necessarily different states! To see this, let’s suppose on the contrary that \(U(\Tcal) \Psi = \zeta \Psi\) represents the same state, where \(\zeta\) is a phase. Then

\[U(\Tcal)^2 \Psi = U(\Tcal) \zeta \Psi = \zeta^{\ast} U(\Tcal) \Psi = |\zeta|^2 \Psi = \Psi\]

which contradicts Eq.1.3.41.

In conclusion, we see that for such systems, any energy eigenvalue has at least a two-fold degeneracy. This is known as Kramers’ degeneracy.

Footnotes