The Quantum Theory of Fields (S. Weinberg)#
Warning
This note is work in progress.
This note covers the three volumes [Wei95], [Wei96], and [Wei00], written by S. Weinberg on quantum field theory, with occasional references to [Wei15] by the same author. What I like the most about these books is his attempt to make logical deductions from the most basic principles, in particular the principle of symmetries, rather than to argue by analogy with experience, e.g., with classical physics (as was done historically). Such an endeavor may not always be possible because, after all, physics is about how we interpret Nature based on nothing but experience, and not about how Nature actually works. By the same token, the arguments that are considered logical here should really be interpreted as “reasonable-sounding”, and have nothing to do with what a mathematician would call “rigorous”.
The order of the sections corresponds roughly to the order of the chapters of the books.
What is a Quantum State?#
Quantum theory postulates that any physical state (of the world) can be represented by a ray in some complex Hilbert space. It’s worth noting that it is the state, rather than the Hilbert space, that we actually care about. Let’s write a state as
where \(\Psi\) is a nonzero vector in the Hilbert space. It is, however, rather inconvenient to have to deal with \([\Psi]\) all the time. So instead, we will almost always pick a representative \(\Psi\), often out of a natural choice, and call it a state vector, and keep in mind that anything physically meaningful must not be sensitive to a scalar multiplication.
Assumption
Throughout this post we always assume that state vectors are normalized so that \(||\Psi|| = 1\).
In fact, we don’t really care about the states themselves either, because they are more of an abstraction than something one can physically measure. What we do care about are the (Hermitian) inner products between state vectors, denoted by \((\Psi, \Phi)\). According to the so-called Copenhagen interpretation of quantum mechanics, such an inner product represents an amplitude, i.e., its squared norm gives the probability of finding a state \([\Psi]\) in \([\Phi]\) if we ever perform a measurement. We can write this statement as an equation as follows
In particular, the probability of finding any state in itself is one, due to the normalization above.
What is a Symmetry?#
We start with a symmetry transformation, by which we mean a transformation that preserves all quantities that one can ever measure about a system. Since it is the probabilities, rather than the states themselves, that are measurable, one is led to define a quantum symmetry transformation as a transformation of states \(T\) such that
for any states \([\Psi]\) and \([\Phi]\). Now a theorem of E. Wigner asserts that such \(T\) can be realized either as a linear unitary or as an anti-linear anti-unitary transformation \(U = U(T)\) of state vectors in the sense that \([U\Psi] = T[\Psi]\) for any \(\Psi\). In other words, \(U\) satisfies either
or
where \(c\) is any complex number.
Note
The adjoint of a linear operator \(A\) is another linear operator \(A^{\dagger}\) such that
for any two state vectors \(\Psi\) and \(\Phi\). On the other hand, the adjoint of an anti-linear \(A\) is another anti-linear \(A^{\dagger}\) such that
An (anti-)unitary operator \(U\) thus satisfies \(U^{\dagger} = U^{-1}\).
In general we’re not interested in just one symmetry transformation, but rather a group – whether continuous or discrete – of symmetry transformations, or just symmetry for short. In particular, if \(T_1, T_2\) are two symmetry transformations, then we’d like \(T_2 T_1\) to also be a symmetry transformation. In light of the \(U\)-realization of symmetry transformations discussed above, we can rephrase this condition as
where \(\ifrak = \sqrt{-1}\), and \(\theta(T_1, T_2, \Psi)\) is an angle, which depends a priori on \(T_1, T_2\), and \(\Psi\).
It turns out, however, that the angle \(\theta(T_1, T_2, \Psi)\) cannot depend on the state, because if we apply \(\eqref{eq_u_depends_on_psi}\) to the sum of two linearly independent state vectors \(\Psi_A + \Psi_B\), then we’ll find
where we have suppressed the dependency of \(\theta\) on \(T\), and the signs correspond to the cases of \(U\) being linear or anti-linear, respectively. In any case, it follows that
which says nothing but the independence of \(\theta\) on \(\Psi\).
Todo
While the argument here appears to be purely mathematical, Weinberg pointed out in [Wei95] (page 53) the potential inability to create a state like \(\Psi_A + \Psi_B\). More precisely, he mentioned the general belief that it’s impossible to prepare a superposition of two states, one with integer total angular momentum and the other with half-integer total angular momentum, in which case there will be a “super-selection rule” between different classes of states. After all, one Hilbert space may just not be enough to describe all states. It’d be nice to elaborate a bit more on the super-selection rules.
We can now simplify \(\eqref{eq_u_depends_on_psi}\) to the following
which, in mathematical terms, says that \(U\) furnishes a projective representation of \(T\), or a representation up to a phase. It becomes a genuine representation if the phase is constantly one.
Assumption
We will assume that \(U\) furnishes a genuine representation of \(T\) unless otherwise stated, because it’s simpler and will suffice for most scenarios of interest.
Continuous symmetry#
Besides a handful of important discrete symmetries such as the time, charge, and parity conjugations, most of the interesting symmetries come in a continuous family, mathematically known as Lie groups. Note that continuous symmetries are necessarily unitary (and linear) because they can be continuously deformed into the identity, which is obviously unitary.
In fact, it will be of great importance to just look at the symmetry up to the first order at the identity transformation, mathematically known as the Lie algebra. Let \(\theta\) be an element in the Lie algebra such that \(T(\theta) = 1 + \theta\) up to the first order. We can expand \(U(T(\theta))\) in a power series as follows
where \(\theta^a\) are the (real) components of \(\theta\), and \(u_a\) are operators independent of \(\theta\), and as a convention, repeated indexes are summed up. Here we put a \(\ifrak\) in front of the linear term so that the unitarity of \(U\) implies that \(u_a\) are Hermitian.
Note
We’ve implicitly used a summation convention in writing \(\eqref{eq_u_expansion}\) that the same upper and lower indexes are automatically summed up. For example
This convention will be used throughout this note, unless otherwise specified.
Another noteworthy point is how one writes matrix or tensor elements using indexes. The point is that the indexes must come in a certain order. This wouldn’t really cause a problem if all indexes are lower or all are upper. However, care must be taken when both lower and upper indexes appear. For example, an element written as \(M^a_b\) would be ambiguous as it’s unclear whether it refers to \(M_{ab}\) or \(M_{ba}\) assuming that one can somehow raise/lower the indexes. To avoid such ambiguity, one writes either \({M^a}_b\) or \({M_b}^a\).
This is a particularly convenient convention when dealing with matrix or tensor multiplications. For example, one can multiply two matrices as follows
while \({M^a}_b {N_c}^b\), though still summed up over \(b\), wouldn’t correspond to a matrix multiplication.
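For readers who like to see the convention in action, here is a tiny NumPy sketch (the matrices are random and purely illustrative) that mimics the index placement with `einsum`:

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.normal(size=(3, 3))    # stores {M^a}_b, upper index first
N = rng.normal(size=(3, 3))    # stores {N^b}_c, upper index first
N2 = rng.normal(size=(3, 3))   # stores {N_c}^b, lower index first

# {M^a}_b {N^b}_c: the summed index b is adjacent, so this is an
# ordinary matrix product.
assert np.allclose(np.einsum("ab,bc->ac", M, N), M @ N)

# {M^a}_b {N_c}^b: still a sum over b, but it contracts against the
# second slot of N2, i.e. it computes M @ N2.T rather than M @ N2.
assert np.allclose(np.einsum("ab,cb->ac", M, N2), M @ N2.T)
```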
Now let \(\eta\) be another element of the Lie algebra, and expand both sides of \(U(T(\eta)) U(T(\theta)) = U(T(\eta) T(\theta))\) as follows
where \({f^c}_{ab}\) are the coefficients of the expansion of \(T(f(\eta, \theta)) = T(\eta) T(\theta)\). Equating the coefficients of \(\eta^a \theta^b\), i.e., the terms colored in blue, we get
It implies that one can calculate the higher-order operator \(u_{ab}\) from the lower-order ones, assuming of course that we know the structure of the symmetry (Lie) group/algebra. In fact, this bootstrapping procedure can be continued to all orders, but we’ll not be bothered about the details.
Next, note that \(u_{ab} = u_{ba}\) since they are just partial derivatives. It follows that
where the bracket is known as the Lie bracket and \({C^c}_{ab}\) are known as the structure constants.
We conclude the general discussion about continuous symmetry by considering a special, but important, case when \(T\) is additive in the sense that \(T(\eta) T(\theta) = T(\eta + \theta)\). Notable examples of such symmetry include translations and rotations about a fixed axis. In this case \(f\) vanishes, and it follows from \(\eqref{eq_u_expansion}\) that
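As a quick sanity check of the resulting exponential form \(U(T(\theta)) = \exp(\ifrak \theta^a u_a)\) (my reading of \(\eqref{eq_additive_symmetry}\)), here is a minimal sketch using rotations about a fixed axis in their defining \(2\)-dimensional real representation. This is of course not a Hilbert-space representation, but it already shows how additivity forces a one-parameter family to be the exponential of its generator.

```python
import numpy as np
from scipy.linalg import expm

# Generator of rotations about a fixed axis in the defining 2D representation;
# in the quantum setting the Hermitian generator would be u = -i * G.
G = np.array([[0.0, -1.0],
              [1.0,  0.0]])

def T(theta):
    # one-parameter group generated by G
    return expm(theta * G)

th1, th2 = 0.3, 1.1
# additivity T(th1) T(th2) = T(th1 + th2), which is what singles out the
# exponential form in the additive case
assert np.allclose(T(th1) @ T(th2), T(th1 + th2))
```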
Lorentz symmetry#
A particularly prominent continuous symmetry in our physical world is the Lorentz symmetry postulated by Einstein’s special relativity, which supersedes the Galilean symmetry respected by Newtonian mechanics. We shall start from the classical theory of Lorentz symmetry, and then quantize it following the procedure discussed in the previous section.
Classical Lorentz symmetry#
Classical Lorentz symmetry is a symmetry that acts on the (flat) spacetime and preserves the so-called proper time
where
\(x_0\) is also known as the time, and sometimes denoted by \(t\), and
the speed of light is set to \(1\), and
\(\eta = \op{diag}(-1, 1, 1, 1)\) and the indexes \(\mu, \nu\) run from \(0\) to \(3\).
Note
We will follow the common convention in physics that Greek letters such as \(\mu, \nu, \dots\) run from \(0\) to \(3\), while Roman letters such as \(i, j, \dots\) run from \(1\) to \(3\).
We often write \(x\) for a spacetime point \((x_0, x_1, x_2, x_3)\), and \(\xbf\) for a spatial point \((x_1, x_2, x_3)\).
A \(4\)-index, i.e., one named by a Greek letter, of a matrix or a tensor can be raised or lowered by \(\eta\). For example, one can raise an index of a matrix \(M_{\mu \nu}\) by \(\eta^{\rho \mu} M_{\mu \nu} = {M^{\rho}}_{\nu}\) or \(\eta^{\rho \nu} M_{\mu \nu} = {M_{\mu}}^{\rho}\), such that the order of the indexes (regardless of upper or lower) is kept.
More precisely, by a Lorentz transformation we mean an inhomogeneous linear transformation
which consists of a homogeneous part \(\Lambda\) and a translation by \(a\). The proper time is obviously preserved by any translation, and also by \(\Lambda\) if
for any \(\mu\) and \(\nu\). Moreover the group law is given by
For later use, let’s also calculate the inverse matrix of \(\Lambda\) as follows
Now we’ll take a look at the topology of the group of homogeneous Lorentz transformations. Taking the determinant of both sides of \(\eqref{eq_homogeneous_lorentz_transformation}\), we see that \(\op{det}(\Lambda) = \pm 1\). Moreover, setting \(\mu = \nu = 0\), we have
It follows that the homogeneous Lorentz group has four components. In particular, the one with \(\op{det}(\Lambda) = 1\) and \({\Lambda_0}^0 \geq 1\) is the most commonly used and is given a name: the proper orthochronous Lorentz group. Nonetheless, one can map one component to another by composing with either a time reversal transformation
or a space reversal transformation
or both.
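Here is a quick numerical check of the defining condition, of the determinant and \({\Lambda_0}^0\) dichotomy, and of the inverse formula \(\eqref{eq_lambda_inverse}\) written in matrix form, using a boost along the \(3\)-axis as a sample \(\Lambda\) (the velocity \(0.6\) is arbitrary):

```python
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])

# a sample homogeneous Lorentz transformation: boost along the 3-axis
v = 0.6
gamma = 1.0 / np.sqrt(1.0 - v**2)
Lam = np.array([
    [gamma,     0.0, 0.0, gamma * v],
    [0.0,       1.0, 0.0, 0.0      ],
    [0.0,       0.0, 1.0, 0.0      ],
    [gamma * v, 0.0, 0.0, gamma    ],
])

# the defining condition eta = Lambda^T eta Lambda ...
assert np.allclose(Lam.T @ eta @ Lam, eta)
# ... which forces det(Lambda) = +-1; this Lambda is proper orthochronous
assert np.isclose(abs(np.linalg.det(Lam)), 1.0) and Lam[0, 0] >= 1.0
# the inverse is obtained by raising/lowering indexes with eta:
# in matrix form Lambda^{-1} = eta Lambda^T eta (using eta^2 = 1)
assert np.allclose(eta @ Lam.T @ eta @ Lam, np.eye(4))
```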
So far everything has been rather abstract, but in fact, the (homogeneous) Lorentz group can be understood quite intuitively. There are basically two building blocks: one is a rotation in the \(3\)-space, which says that the space is homogeneous in all (spatial) directions, and the other is a so-called boost, which says that, as G. Galileo originally noted, one cannot tell if a system is at rest or is moving at a constant velocity without referring to anything outside of the system. To spell out the details, let’s consider a rest frame with \(d\xbf = 0\) and a moving frame with \(d\xbf' / dt' = \vbf\). Then the transformation \(dx' = \Lambda dx\) can be simplified as
Then using \(\eqref{eq_homogeneous_lorentz_transformation}\), we get
assuming \(\Lambda\) is proper orthochronous. It follows that
The other components \({\Lambda_i}^j, 1 \leq i, j \leq 3\), are not uniquely determined because a composition with a (spatial) rotation about the direction of \(\vbf\) has no effect on \(\vbf\). To make it easier, one can apply a rotation so that \(\vbf\) aligns with the \(3\)-axis. Then an obvious choice of \(\Lambda\) is given by
Finally, one can apply a rotation to \(\eqref{eq_lambda_in_3_axis}\) to get the general formula
for \(1 \leq i, j \leq 3\), which, together with \(\eqref{eq_lambda_boost}\) and \({\Lambda_0}^i = {\Lambda_i}^0,\) gives the general formula for \(\Lambda\).
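Here is a small numerical check of the boost just constructed, assuming the standard form \({\Lambda_0}^0 = \gamma\), \({\Lambda_i}^0 = {\Lambda_0}^i = \gamma v_i\), \({\Lambda_i}^j = \delta_{ij} + (\gamma - 1) v_i v_j / \vbf^2\) (my reading of \(\eqref{eq_lambda_boost}\) and \(\eqref{eq_general_lambda_in_spacetime}\)); it verifies that the matrix preserves the proper time and takes the rest frame to \(3\)-velocity \(\vbf\).

```python
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])

def boost(v):
    """Boost taking the rest frame to 3-velocity v (speed of light = 1).

    Assumed form: Lambda^0_0 = gamma, Lambda^i_0 = Lambda^0_i = gamma v_i,
    Lambda^i_j = delta_ij + (gamma - 1) v_i v_j / |v|^2.
    """
    v = np.asarray(v, dtype=float)
    v2 = v @ v
    gamma = 1.0 / np.sqrt(1.0 - v2)
    Lam = np.eye(4)
    Lam[0, 0] = gamma
    Lam[0, 1:] = Lam[1:, 0] = gamma * v
    Lam[1:, 1:] += (gamma - 1.0) * np.outer(v, v) / v2
    return Lam

v = np.array([0.2, -0.3, 0.5])
Lam = boost(v)
assert np.allclose(Lam.T @ eta @ Lam, eta)   # it preserves the proper time
dx = np.array([1.0, 0.0, 0.0, 0.0])          # a rest-frame displacement ...
dxp = Lam @ dx
assert np.allclose(dxp[1:] / dxp[0], v)      # ... which ends up moving at velocity v
```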
Note
Any Lorentz transformation can be written as the composition of a boost followed by a rotation.
Quantum Lorentz symmetry#
We will quantize the Lorentz symmetry \(L(\Lambda, a)\) by looking for unitary representations \(U(\Lambda, a)\). As discussed in Continuous symmetry, we proceed by looking for infinitesimal symmetries. First of all, let’s expand \(\Lambda\) as
where \(\delta\) is the Kronecker delta, and not a tensor. It follows from \(\eta^{\mu \nu} = \eta^{\rho \kappa} {\Lambda_{\rho}}^{\mu} {\Lambda_{\kappa}}^{\nu}\) that
Comparing the first order terms shows that \(\omega^{\mu \nu} = -\omega^{\nu \mu}\) is anti-symmetric. It is therefore more convenient to use \(\omega^{\mu \nu}\), rather than \({\omega^{\mu}}_{\nu}\), as the infinitesimal parameters in the expansion of \(\Lambda\).
Note
A count of free parameters shows that the inhomogeneous Lorentz symmetry has \(10\) degrees of freedom, \(4\) of which come from the translation, and the rest \(6\) come from the rank-\(2\) anti-symmetric tensor \(\omega\).
We first postulate that \(U(1, 0) = 1\) is the identity operator because the Lorentz transformation itself is the identity. Then we can write the power series expansion up to first order as follows
Here we have inserted \(\ifrak\) as usual so that the unitarity of \(U\) implies that both \(P_{\mu}\) and \(J_{\mu \nu}\) are Hermitian. Moreover, since \(\omega^{\mu \nu}\) is anti-symmetric, we can assume the same holds for \(J_{\mu \nu}\).
Note
Since we are expanding \(U(1 + \epsilon)\) which is complex linear, the operators \(P\) and \(J\) are also complex linear. Hence we can freely move \(\ifrak\) around these operators in calculations that follow. However, this will become an issue when we later consider other operators such as the space and time inversions, which can potentially be either complex linear or anti-linear. In the latter case, a sign needs to be added when commuting with the multiplication by \(\ifrak\).
Let’s evaluate how the expansion transforms under conjugation
where we have used \(\eqref{eq_lambda_inverse}\) for \(\Lambda^{-1}\). Substituting \(U(1 + \omega, \epsilon)\) with the expansion \(\eqref{eq_u_lorentz_expansion}\) and equating the coefficients of \(\epsilon^{\mu}\) and \(\omega_{\mu \nu}\), we have
where in the second equation, we have also made the right-hand-side anti-symmetric with respect to \(\mu\) and \(\nu\). It’s now clear that \(P\) transforms like a vector and is translation invariant, while \(J\) transforms like a \(2\)-tensor only for homogeneous Lorentz transformations and is not translation invariant in general. These are of course as expected since both \(P\) and \(J\) are quantizations of rather familiar objects, which we now spell out.
We start with \(P\) by writing \(H \coloneqq P_0\) and \(\Pbf \coloneqq (P_1, P_2, P_3)\). Then \(H\) is the energy operator, also known as the Hamiltonian, and \(\Pbf\) is the momentum \(3\)-vector. Similarly, let’s write \(\Kbf \coloneqq (J_{01}, J_{02}, J_{03})\) and \(\Jbf = (J_{23}, J_{31}, J_{12})\), as the boost \(3\)-vector and the angular momentum \(3\)-vector, respectively.
Now that we have named all the players (i.e., \(H, \Pbf, \Jbf, \Kbf\)) in the game, it remains to find out their mutual commutation relations since they should form a Lie algebra of the (infinitesimal) Lorentz symmetry. This can be done by applying \(\eqref{eq_p_conjugated_by_u}\) and \(\eqref{eq_j_conjugated_by_u}\) to \(U(\Lambda, a)\) that is itself infinitesimal. More precisely, keeping up to first order terms, we have \({\Lambda^{\rho}}_{\mu} = {\delta^{\rho}}_{\mu} + {\omega^{\rho}}_{\mu}\) and \(a_{\mu} = \epsilon_{\mu}\). It follows that \(\eqref{eq_p_conjugated_by_u}\), up to first order, can be written as follows
Equating the coefficients of \(\epsilon\) and \(\omega\) gives the following
where for the second identity, we’ve also used the fact that \(J_{\rho \kappa} = -J_{\kappa \rho}\).
Similarly, expanding \(\eqref{eq_j_conjugated_by_u}\) up to first order, we have
Equating the coefficients of \(\epsilon\) reproduces \(\eqref{eq_bracket_p4_j4}\), but equating the coefficients of \(\omega\) gives the following additional
Now that we have all the commutator relations, let’s reorganize \(\eqref{eq_bracket_p4_p4}, \eqref{eq_bracket_p4_j4}, \eqref{eq_bracket_j4_j4}\) in terms of \(H, \Pbf, \Jbf, \Kbf\) as follows
where \(\epsilon_{ijk}\) is totally anti-symmetric with respect to permutations of indexes and satisfies \(\epsilon_{123} = 1\). [1]
Note
The Lie algebra generated by \(H, \Pbf, \Jbf, \Kbf\) with commutation relations \(\eqref{eq_hp_commute}\) – \(\eqref{eq_kkj_commutation}\) is known as the Poincaré algebra.
Since the time evolution of a physical system is dictated by the Hamiltonian \(H\), quantities (i.e., observables) that commute with \(H\) are conserved. In particular \(\eqref{eq_hp_commute}\) and \(\eqref{eq_hj_commute}\) imply that both momentum and angular momentum are conserved. Boosts, on the other hand, are not conserved, and therefore cannot be used to label (stable) physical states. Moreover \(\eqref{eq_pp_commute}\) implies that translations commute with each other (as expected), which is not the case for the angular momenta according to \(\eqref{eq_jjj_commutation}\). Indeed, they furnish an infinitesimal representation of the \(3\)-rotation group \(SO(3)\).
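As a cross-check of the rotational part of these relations, here is a sketch verifying \(\eqref{eq_jjj_commutation}\) in the defining (spin-\(1\)) representation \((J_i)_{jk} = -\ifrak \epsilon_{ijk}\); the convention \([J_i, J_j] = \ifrak \epsilon_{ijk} J_k\) is assumed.

```python
import numpy as np

# Levi-Civita symbol eps_{ijk}
eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k], eps[i, k, j] = 1.0, -1.0

# angular momentum generators in the defining (spin-1) representation
J = [-1j * eps[i] for i in range(3)]

for i in range(3):
    for j in range(3):
        lhs = J[i] @ J[j] - J[j] @ J[i]
        rhs = 1j * sum(eps[i, j, k] * J[k] for k in range(3))
        assert np.allclose(lhs, rhs)      # [J_i, J_j] = i eps_{ijk} J_k
```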
One-Particle States#
One neat application of our knowledge about Lorentz symmetry is to classify (free) one-particle states according to their transformation laws under (inhomogeneous) Lorentz transformations. Throughout this section, the Lorentz transformations will be assumed to be proper orthochronous, i.e., \(\op{det}(\Lambda) = 1\) and \({\Lambda_0}^0 \geq 1\).
In order to do so, we need some labels to identify states, which are typically conserved quantities. According to the commutation relations between \(H, \Pbf\) and \(\Jbf\) obtained in the previous section, we see that \(p = (H, \Pbf)\) consists of mutually commutative conserved components, but not \(\Jbf\). Hence we can write our one-particle states as \(\Psi_{p, \sigma}\) such that
where \(\sigma\) are additional labels such as spin components that we will later specify.
Reduction to the little group#
Let’s first consider translations \(U(1, a)\). Since translations form an abelian group, it follows from \(\eqref{eq_additive_symmetry}\) that
where the minus sign comes from our choice of expansion \(\eqref{eq_u_lorentz_expansion}\). Hence it remains to consider the action of homogeneous Lorentz transformations. For the convenience of notation, let’s write \(U(\Lambda) \coloneqq U(\Lambda, 0)\). We would first like to know how \(U(\Lambda)\) affects the \(4\)-momentum. It follows from the following calculation
that \(U(\Lambda) \Psi_{p, \sigma}\) has \(4\)-momentum \(\Lambda p\). Therefore we can write
where the coefficients \(C_{\sigma \sigma'}\) depend on both \(\Lambda\) and \(p\), and an implicit summation over \(\sigma'\) is assumed although it’s not a \(4\)-index.
Next we’d like to remove the dependency of \(C_{\sigma \sigma'}\) on \(p\) since, after all, it is \(\Lambda\) that carries the symmetry. We can achieve this by noticing that the homogeneous Lorentz transformations act transitively on each \(\Lambda\)-orbit of \(p\). These orbits, in turn, are uniquely determined by the value of \(p^2\), and in the case of \(p^2 \leq 0\), also by the sign of \(p_0\). In light of \(\eqref{eq_four_momentum_mass_identity}\), we can pick a convenient representative \(k\) for each case as follows
Case | Standard \(k\) | Physical
---|---|---
\(p^2 = -M^2 < 0,~p_0 > 0\) | \((M, 0, 0, 0)\) | Yes
\(p^2 = -M^2 < 0,~p_0 < 0\) | \((-M, 0, 0, 0)\) | No
\(p^2 = 0,~p_0 > 0\) | \((1, 0, 0, 1)\) | Yes
\(p^2 = 0,~p_0 = 0\) | \((0, 0, 0, 0)\) | Yes
\(p^2 = 0,~p_0 < 0\) | \((-1, 0, 0, 1)\) | No
\(p^2 = N^2 > 0\) | \((0, N, 0, 0)\) | No
It turns out that only three of these cases are realized physically, and they correspond to the cases of a massive particle of mass \(M\), a massless particle and the vacuum, respectively. Since there is not much to say about the vacuum state, there are only two cases that we need to investigate.
With the choices of the standard \(k\) in hand, we need to make one more set of choices. Namely, we will choose for each \(p\) a standard Lorentz transformation \(L(p)\) such that \(L(p) k = p\). Such \(L(p)\) for a massive particle has been chosen in \(\eqref{eq_general_lambda_in_spacetime}\), albeit in spacetime coordinates, and we’ll also handle the case of massless particles later. Once these choices have been made, we can define
where \(N(p)\) is a normalization factor to be determined later. In this way, we’ve also determined how \(\sigma\) depends on \(p\). Applying \(\eqref{eq_lorentz_acts_on_p_and_sigma}\) to \(\eqref{eq_def_of_one_particle_psi}\) we can refactor the terms as follows
so that \(L(\Lambda p)^{-1} \Lambda L(p)\) maps \(k\) to itself, and hence \(U(L(\Lambda p)^{-1} \Lambda L(p))\) acts solely on \(\sigma\).
At this point, we have reduced the problem to the classification of representations of the so-called little group, defined as the subgroup of (proper orthochronous) Lorentz transformations \(W\) that fix \(k\), i.e., \({W_{\mu}}^{\nu} k_{\nu} = k_{\mu}\). An element of the little group is known as a Wigner rotation (hence the notation \(W\)). More precisely, the task now is to find (unitary) representations \(D(W)\) such that
Once this is done, we can define
One can verify that \(\eqref{eq_d_repr_of_little_group}\) indeed respects the group law as follows
Now we can rewrite \(\eqref{eq_def_of_one_particle_psi_refactored}\) as follows
which gives the sought-after coefficients \(C_{\sigma \sigma'}\) in \(\eqref{eq_lorentz_acts_on_p_and_sigma}\).
It remains now, as far as the general discussion is concerned, to settle the normalization factor \(N(p)\). Indeed, it would not have been needed at all if we had asked \(\Psi_{p, \sigma}\) to be orthonormal in the sense that
where the first delta is the Kronecker delta (for discrete indexes) and the second is the Dirac delta (for continuous indexes), since they are eigenvectors of the (Hermitian) operator \(P\). All we need is for \(D_{\sigma \sigma'}\) to be unitary, as is obvious from \(\eqref{eq_little_group_acts_on_p_and_sigma}\).
However, the Dirac delta in \(\eqref{eq_psi_p4_sigma_orthonormal}\) is tricky to use since \(p\) is constrained to the so-called mass shell, i.e., \(p_0 > 0\) together with \(p^2 = -M^2\) in the massive case and \(p^2 = 0\) in the massless case, respectively. Hence the actual normalization we’d like to impose on the one-particle states is, instead of \(\eqref{eq_psi_p4_sigma_orthonormal}\), the following
In fact, the problem eventually boils down to how to define the \(3\)-momentum space Dirac delta in a Lorentz-invariant manner.
Since \(\Psi_{p, \sigma}\) can be derived from \(\Psi_{k, \sigma}\) by \(\eqref{eq_def_of_one_particle_psi}\), we can first ask \(\Psi_{k, \sigma}\) to be orthonormal in the sense of \(\eqref{eq_psi_p3_sigma_orthonormal}\), where the Dirac delta plays no role, and then figure out how integration works on the mass shell (because the Dirac delta is defined by integrals against test functions). As far as the mass shell integration is concerned, we can temporarily unify the massive and massless cases by allowing \(M \geq 0\). Consider a general mass shell integral of an arbitrary test function \(f(p)\)
where \(\theta(p_0)\) is the step function defined to be \(0\) if \(p_0 \leq 0\) and \(1\) if \(p_0 > 0\). It follows that the Lorentz-invariant volume element in the \(3\)-momentum space is
We can use it to find the Lorentz-invariant Dirac delta (marked in blue) as follows
It follows from Lorentz invariance that \(p_0 \delta(\pbf' - \pbf) = k_0 \delta(\kbf' - \kbf)\). Hence we can finally establish \(\eqref{eq_psi_p3_sigma_orthonormal}\) as follows
if we define \(N(p) = \sqrt{k_0 / p_0}\).
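The Lorentz invariance of \(d^3 p / p_0\) can also be checked numerically: for a boost along the \(3\)-axis restricted to the mass shell, the Jacobian of \(\pbf \mapsto \pbf'\) equals \(p'_0 / p_0\). A minimal finite-difference sketch (the mass and velocity are arbitrary illustrative values):

```python
import numpy as np

M, beta = 1.0, 0.6                      # illustrative mass and boost velocity
gamma = 1.0 / np.sqrt(1.0 - beta**2)

def p0(p):
    return np.sqrt(M**2 + p @ p)        # energy on the mass shell

def boosted(p):
    # 3-momentum after a boost along the 3-axis, restricted to the mass shell
    return np.array([p[0], p[1], gamma * (p[2] + beta * p0(p))])

p = np.array([0.3, -0.7, 1.2])
h = 1e-6
jac = np.array([(boosted(p + h * e) - boosted(p - h * e)) / (2 * h)
                for e in np.eye(3)])
p0_prime = gamma * (p0(p) + beta * p[2])
# det(d^3 p' / d^3 p) = p'_0 / p_0, i.e. d^3 p / p_0 is Lorentz invariant
assert np.isclose(np.linalg.det(jac), p0_prime / p0(p), rtol=1e-5)
```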
Putting everything together, we’ve obtained the following grand formula for the Lorentz transformation law
where \(D_{\sigma' \sigma}\) is a unitary representation of the little group, and \(W(\Lambda, p)\) is defined by \(\eqref{eq_w_from_l}\).
Massive particle states#
Recall the standard \(4\)-momentum \(k = (M, 0, 0, 0)\) in this case. Obviously the little group here is nothing but the \(3\)-rotation group \(SO(3)\). We can work out \(D_{\sigma \sigma'}(\Rcal)\) for a rotation \(\Rcal \in SO(3)\) up to first order as follows.
First write \(\Rcal^{ij} = \delta^{ij} + \Theta^{ij}\) such that \(\Theta\) is anti-symmetric. Then expand \(D_{\sigma \sigma'} (\Rcal)\) similar to \(\eqref{eq_u_lorentz_expansion}\) up to first order as follows
where \(J_{ij}\) is a collection of Hermitian operators that satisfy \(J_{ij} = -J_{ji}\) and the commutation relations \(\eqref{eq_jjj_commutation}\). It turns out that there exists an infinite number of such unitary representations indexed by nonnegative half-integers \(\jfrak = 0, \tfrac{1}{2}, 1, \tfrac{3}{2}, \cdots\), each of which has dimension \(2\jfrak + 1\). Choosing the \(3\)-axis as the preferred axis of (definite) spin, we can summarize the result as follows
where \(\sigma, \sigma'\) run through the values \(-\jfrak, -\jfrak + 1, \cdots, \jfrak - 1, \jfrak\).
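For concreteness, here is a sketch that builds the spin-\(\jfrak\) matrices in the basis ordered by the \(J_3\)-eigenvalue (the ladder matrix elements \(\sqrt{(\jfrak \mp \sigma)(\jfrak \pm \sigma + 1)}\) in the usual Condon–Shortley convention are assumed) and verifies the commutation relations and the dimension count:

```python
import numpy as np

def spin_matrices(j):
    """Hermitian J_1, J_2, J_3 in the spin-j representation (dimension 2j + 1)."""
    dim = int(round(2 * j)) + 1
    sigmas = j - np.arange(dim)                 # sigma = j, j-1, ..., -j
    J3 = np.diag(sigmas).astype(complex)
    Jp = np.zeros((dim, dim), dtype=complex)    # J_+ = J_1 + i J_2 raises sigma
    for a in range(1, dim):
        s = sigmas[a]
        Jp[a - 1, a] = np.sqrt((j - s) * (j + s + 1))
    Jm = Jp.conj().T
    return (Jp + Jm) / 2, (Jp - Jm) / (2 * 1j), J3

for j in (0.5, 1.0, 1.5):
    J1, J2, J3 = spin_matrices(j)
    assert np.allclose(J1 @ J2 - J2 @ J1, 1j * J3)   # and cyclic permutations
    assert np.allclose(J2 @ J3 - J3 @ J2, 1j * J1)
    assert np.allclose(J3 @ J1 - J1 @ J3, 1j * J2)
    # the Casimir J^2 acts as j(j + 1) on the whole (2j + 1)-dimensional space
    assert np.allclose(J1 @ J1 + J2 @ J2 + J3 @ J3,
                       j * (j + 1) * np.eye(len(J3)))
```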
We end the discussion about massive particle states by working out the little group elements \(W(\Lambda, p)\) defined by \(\eqref{eq_w_from_l}\). To this end, it suffices to work out the standard \(L(p)\) such that \(L(p) k = p\), where \(k = (M, 0, 0, 0)\). We have already worked out such a transformation in \(\eqref{eq_lambda_boost}\) and \(\eqref{eq_general_lambda_in_spacetime}\) in spacetime coordinates, so we only need to translate it into \(4\)-momentum coordinates.
Using \(\eqref{eq_p_from_v}\), we can rewrite \(\gamma\) defined by \(\eqref{eq_def_gamma}\) as follows
It follows that
Finally, we note the important fact that when \(\Lambda = \Rcal\) is a \(3\)-rotation, then
for any \(p\). To see this, we’ll work out how \(W(\Rcal, p)\) acts on \((1, \mathbf{0}), (0, \pbf)\), and \((0, \qbf)\), respectively, where \(\qbf\) is any \(3\)-vector perpendicular to \(\pbf\), as follows
where we have used the fact that \(\gamma\) is \(\Rcal\)-invariant.
This observation is important since it implies that non-relativistic calculations about angular momenta, such as the Clebsch-Gordan coefficients, can be literally carried over to the relativistic setting.
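This fact is also easy to confirm numerically. The sketch below builds the standard boost \(L(p)\) in \(4\)-momentum coordinates, assuming the usual form \({L(p)_0}^0 = p_0/M\), \({L(p)_i}^0 = {L(p)_0}^i = p_i/M\), \({L(p)_i}^j = \delta_{ij} + (p_0/M - 1) p_i p_j / \pbf^2\) (my reading of \(\eqref{eq_L_transformation_for_massive_1}\) – \(\eqref{eq_L_transformation_for_massive_3}\)), and checks that \(W(\Rcal, p) = L(\Rcal p)^{-1} \Rcal L(p)\) is indeed \(\Rcal\) itself for an arbitrarily chosen rotation and momentum:

```python
import numpy as np

M = 1.0   # illustrative mass

def L(p):
    """Standard boost taking k = (M, 0, 0, 0) to the on-shell 4-momentum p."""
    p = np.asarray(p, dtype=float)
    p0, pvec = p[0], p[1:]
    Lam = np.eye(4)
    Lam[0, 0] = p0 / M
    Lam[0, 1:] = Lam[1:, 0] = pvec / M
    Lam[1:, 1:] += (p0 / M - 1.0) * np.outer(pvec, pvec) / (pvec @ pvec)
    return Lam

def rotation(axis, angle):
    """Pure spatial rotation as a 4x4 Lorentz transformation (Rodrigues formula)."""
    axis = np.asarray(axis, dtype=float)
    axis /= np.linalg.norm(axis)
    K = np.array([[0, -axis[2], axis[1]],
                  [axis[2], 0, -axis[0]],
                  [-axis[1], axis[0], 0]])
    R = np.eye(4)
    R[1:, 1:] = np.eye(3) + np.sin(angle) * K + (1 - np.cos(angle)) * (K @ K)
    return R

pvec = np.array([0.4, -0.2, 0.9])
p = np.concatenate(([np.sqrt(M**2 + pvec @ pvec)], pvec))
R = rotation([1.0, 2.0, -0.5], 0.7)

W = np.linalg.inv(L(R @ p)) @ R @ L(p)
assert np.allclose(W, R)   # the Wigner rotation of a pure rotation is the rotation itself
```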
Massless particle states#
Recall the standard \(k = (1, 0, 0, 1)\) for massless particles. Our first task is to work out the little group, i.e., Lorentz transformations \(W\) such that \(Wk = k\). More precisely, we’ll work out the column vectors of \(W\) by thinking of them as the results of \(W\) acting on the standard basis vectors. Let’s start with \(v \coloneqq (1, 0, 0, 0)\), and perform the following calculations on \(Wv\) using properties of Lorentz transformations
It follows from \(\eqref{eq_vk_is_one}\) that we can write \(Wv = (1 + c, a, b, c)\), and then from \(\eqref{eq_vv_is_one}\) that \(c = (a^2 + b^2) / 2\). Playing the same game with the other basis vectors, we can engineer a particular Lorentz transformation as follows
which leaves \(k\) invariant, and satisfies \(Sv = Wv\). It follows that \(S^{-1} W\) must be a rotation about the \(3\)-axis, which can be written as follows
Hence we can write any element in the little group as \(W(a, b, \theta) = S(a, b) R(\theta)\).
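One explicit choice of \(S(a, b)\) (an assumed concrete matrix, consistent with the construction above but not copied from the elided formula) can be checked numerically to be a Lorentz transformation fixing \(k\), and so can the product \(W(a, b, \theta) = S(a, b) R(\theta)\):

```python
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])
k = np.array([1.0, 0.0, 0.0, 1.0])

def S(a, b):
    # "translation" part of the little group of k = (1, 0, 0, 1)
    c = (a**2 + b**2) / 2
    return np.array([
        [1 + c, a, b, -c],
        [a,     1, 0, -a],
        [b,     0, 1, -b],
        [c,     a, b, 1 - c],
    ])

def R(theta):
    # rotation by theta about the 3-axis
    ct, st = np.cos(theta), np.sin(theta)
    return np.array([
        [1, 0,   0,  0],
        [0, ct, -st, 0],
        [0, st,  ct, 0],
        [0, 0,   0,  1],
    ])

W = S(0.7, -1.2) @ R(0.4)
assert np.allclose(W.T @ eta @ W, eta)   # a genuine Lorentz transformation ...
assert np.allclose(W @ k, k)             # ... that fixes k
```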
As in the massive case, we’ll work out \(D_{\sigma \sigma'}\) up to first order. To this end, note that up to first order
where we’ve added the \(4\)-indexes since we recall from discussions in Quantum Lorentz symmetry that we must lift the \(\omega\) index to make it anti-symmetric. We now rewrite
and spell out the expansion of \(D(a, b, \theta) \coloneqq D(W(a, b, \theta))\) as follows
where
Next we use \(\eqref{eq_jjj_commutation}, \eqref{eq_jkk_commutation}\) and \(\eqref{eq_kkj_commutation}\) to calculate commutation relations between \(A, B\) and \(J_3\) as follows
Since \(A, B\) commute, we can use their eigenvalues to label states as follows
In fact, these states, corresponding to translation symmetries, come in continuous families as shown below
According to [Wei95] (page 72), massless particle states are not observed to come in such \(S^1\)-families. Hence the only possibility is that \(a = b = 0\) and the only symmetry left then is \(J_3\), which corresponds to a rotation about the \(3\)-axis.
Unlike the \(SO(3)\)-symmetry discussed in Representations of angular momenta, representations of \(J_3\) alone cannot be characterized at the infinitesimal level, which would have resulted in a continuous spectrum. Instead, since a \(2\pi\)-rotation about the \(3\)-axis gives the identity transformation, one might expect an integer spectrum for \(J_3\). This is indeed the case if we assume the representation is genuine. However, since the Lorentz group is not simply connected (with fundamental group \(\Zbb/2\)), one may encounter projective representations. Indeed, the \(2\pi\)-rotation about the \(3\)-axis represents a generator of the fundamental group, which has order \(2\), i.e., only the \(4\pi\)-rotation about the \(3\)-axis represents a contractible loop in the Lorentz group (see the Plate trick). As a result, the \(J_3\)-spectrum actually consists of half-integers, just like the spins. We can therefore write a general massless particle state as \(\Psi_{k, \sigma}\) such that
where \(\sigma\) is a half-integer, known as the helicity.
Combining the discussions so far, we can write down the \(D\)-matrix defined by \(\eqref{eq_massless_D_matrix_expansion}\) as follows
where we recall \(W(a, b, \theta) = L(\Lambda p)^{-1} \Lambda L(p) = S(a, b)R(\theta)\). The Lorentz transformation formula \(\eqref{eq_lorentz_transformation_formula_for_particle_state}\) for massless particles now becomes
In particular, we see that, unlike the spin \(z\)-component of massive particles, helicity is Lorentz invariant (at least under genuine representations). It is reasonable, therefore, to think of massless particles of different helicity as different particle species. Examples include photons with \(\sigma = \pm 1\) and gravitons with \(\sigma = \pm 2\), but not (anti-)neutrinos with \(\sigma = \pm \tfrac{1}{2}\), as stated otherwise in [Wei95] (page 73 – 74), since they are now known to have a nonzero mass. Here the \(\pm\) signs are related to the space-inversion symmetry \(\eqref{eq_space_inversion}\), which will be discussed in detail later.
In order to use \(\eqref{eq_lorentz_transformation_formula_for_massless}\) for a general \((\Lambda, p)\), we first need to fix the choices of \(L(p)\) that takes the standard \(k = (1, 0, 0, 1)\) to \(p\). This can be done in two steps. First apply a (pure) boost along the \(3\)-axis
Then apply a (pure) rotation that takes \((0, 0, |\pbf|)\) to \(\pbf\). However, in contrast to the massive case \(\eqref{eq_L_transformation_for_massive_1}\) – \(\eqref{eq_L_transformation_for_massive_3}\), where \(L(p)\) depends continuously on \(p\), there exists no continuous family of rotations that take \((0, 0, |\pbf|)\) to any other \(3\)-vector (of the same length). Fortunately, any two choices of such rotations differ by (a pre-composition of) a rotation about the \(3\)-axis, which, according to \(\eqref{eq_lorentz_transformation_formula_for_massless}\), only produces a physically immaterial phase factor.
Space and time inversions#
So far the discussions have been focused on orthochronous (and mostly homogeneous) Lorentz transformations, and in particular, the infinitesimal symmetries in the vicinity of the identity. Now it’s time to take a look at the space and time inversions, defined in \(\eqref{eq_space_inversion}\) and \(\eqref{eq_time_inversion}\), which take us to the other components of the Lorentz group. The main goal is to understand their actions on the one-particle states that have been worked out in the previous two sections.
Let’s write
for the corresponding quantum symmetry operators, for which we haven’t yet decided whether they should be complex linear or anti-linear. The same calculations that led to \(\eqref{eq_p_conjugated_by_u}\) and \(\eqref{eq_j_conjugated_by_u}\) now give
The complex (anti-)linearity of \(U(\Pcal)\) and \(U(\Tcal)\) can then be decided by the postulate that physically meaningful energies must not be negative. More precisely, recall that \(P_0\) is the energy operator. Then \(\eqref{eq_p_conjugated_by_p}\) shows
If \(U(\Pcal)\) were anti-linear, then \(U(\Pcal) P_0 U^{-1}(\Pcal) = -P_0\). Then for any state \(\Psi\) with positive energy, i.e., \(P_0 \Psi = p_0 \Psi\), we would have a state \(U^{-1}(\Pcal) \Psi\) with negative energy \(-p_0\). Hence we conclude that \(U(\Pcal)\) must be linear. The same argument shows also that \(U(\Tcal)\) must be anti-linear (since \({\Tcal_0}^0 = -1\)).
As before, it’ll be useful to rewrite \(\eqref{eq_p_conjugated_by_p}\) – \(\eqref{eq_j_conjugated_by_t}\) in terms of \(H, \Pbf, \Jbf, \Kbf\) as follows
One can (and should) try to reconcile these implications with common sense. For example, \(\eqref{eq_p3_conjugated_by_p}\) and \(\eqref{eq_p3_conjugated_by_t}\) say that the \(3\)-momentum changes direction under either space or time inversion, which is of course as expected. Moreover \(\eqref{eq_j3_conjugated_by_p}\) says that the spin (of, for example, a basketball) remains the same under space inversion because both the direction of the axis and the handedness of the rotation get reversed simultaneously, but it gets reversed under a time inversion according to \(\eqref{eq_j3_conjugated_by_t}\) because the direction of rotation is reversed if time flows backwards.
In what follows we will work out the effects of space and time inversions on massive and massless particles, respectively.
Space inversion for massive particles#
We start by considering a state at rest \(\Psi_{k, \sigma}\), where \(k = (M, 0, 0, 0)\) and \(\sigma\) is an eigenvalue of \(J_3\) under one of the spin representations discussed in Representations of angular momenta. Since the state is at rest and \(U(\Pcal)\) commutes with \(J_3\) according to \(\eqref{eq_j3_conjugated_by_p}\), we can write
where \(\eta\) is a phase that depends a priori on \(\sigma\). It turns out, however, that \(\eta\) is actually independent of \(\sigma\), and hence justifies the notation, since \(U(\Pcal)\) commutes with the raising/lowering operators \(J_1 \pm \ifrak J_2\) by \(\eqref{eq_j3_conjugated_by_p}\).
To move on to the general case, we recall that the general formula \(\eqref{eq_def_of_one_particle_psi}\) takes the following form
We can calculate as follows
which generalizes \(\eqref{eq_space_inversion_on_massive_standard}\). Such \(\eta\) is known as the intrinsic parity, which is intrinsic to a particle species.
Time inversion for massive particles#
Consider the same \(\Psi_{k, \sigma}\) as in the space inversion case. Now since \(U(\Tcal)\) anti-commutes with \(J_3\) according to \(\eqref{eq_j3_conjugated_by_t}\), we can write
where \(\zeta_{\sigma}\) is a phase. Applying the raising/lowering operators and using \(\eqref{eq_j1_j2_matrix}\), we can calculate the left-hand-side, recalling that \(U(\Tcal)\) is anti-linear, as follows
where \(\jfrak\) is the particle spin, and the right-hand-side as follows
Equating the two sides, we see that \(\zeta_{\sigma} = -\zeta_{\sigma \pm 1}\). Up to an overall phase, we can set \(\zeta_{\sigma} = \zeta (-1)^{\jfrak - \sigma}\) so that
Here we have chosen to keep the option of a physically inconsequential phase \(\zeta\) open. As in the case of space inversion, the formula generalizes to any \(4\)-momentum \(p\)
since \(\Tcal L(p) \Tcal^{-1} = L(\Pcal p)\).
Space inversion for massless particles#
Let’s consider a state \(\Psi_{k, \sigma}\) with \(k = (1, 0, 0, 1)\) and \(\sigma\) being the helicity, i.e., \(J_3 \Psi_{k, \sigma} = \sigma \Psi_{k, \sigma}\). Since \(U(\Pcal)\) commutes with \(J_3\), the space inversion preserves \(\sigma\), just as in the massive case. However, since \(\Pcal\) reverses the direction of motion, the helicity in the direction of motion actually reverses sign. It follows, in particular, that (massless) particles that respect the space inversion symmetry must come in companion with another particle of opposite helicity.
To spell out more details, note that since \(\Pcal\) doesn’t fix \(k\), it’ll be convenient to introduce an additional rotation \(R_2\), which is defined to be a \(\pi\)-rotation about the \(2\)-axis, so that \(U(R_2) = \exp(\ifrak \pi J_2)\) and \(R_2 \Pcal k = k\). Since \(U(R_2)\) flips the sign of \(J_3\), as can be seen from the very definition of \(J_3\) in \(\eqref{eq_u_lorentz_expansion}\), we have
where we see indeed that the helicity reverses sign (when \(k\) is fixed).
To move on to the general case, recall that the \(L(p)\) that takes \(k\) to \(p\) consists of a boost \(B\) defined by \(\eqref{eq_massless_boost}\) followed by a (chosen) pure rotation \(R(\pbf)\) that takes \((0, 0, |\pbf|)\) to \(\pbf\). We calculate as follows
where \(\rho\) is an extra phase due to the fact that although \(R(\pbf) R_2^{-1}\) takes \((0, 0, |\pbf|)\) to \(-\pbf\), it may not be the chosen one.
To spell out \(\rho\), we need to be a bit more specific about \(R(\pbf)\). Following the usual convention of spherical coordinates, we can get from \((0, 0, |\pbf|)\) to \(\pbf\) by first rotating (according to the right-handed rule) about the \(1\)-axis by an angle \(0 \leq \phi \leq \pi\), known as the polar angle, and then rotating about the \(3\)-axis by an angle \(0 \leq \theta < 2\pi\), known as the azimuthal angle. Now since we know that \(R(\pbf)R_2^{-1}\) differs from \(R(-\pbf)\) by a rotation about the \(3\)-axis, we can figure the rotation out by examining their actions on some suitably generic \(\pbf\), for example, the unit vector along the \(2\)-axis, which is fixed by \(R_2\). In this case \(R(-\pbf)\) is a \(\pi/2\)-rotation about the \(1\)-axis, while \(R(\pbf)R_2^{-1}\) is the same \(\pi/2\)-rotation about the \(1\)-axis, followed by a \(\pi\)-rotation about the \(3\)-axis. Therefore we conclude that the difference is a \(\pi\)-rotation about the \(3\)-axis. In other words, we should have \(\rho = \exp(-\ifrak \pi \sigma)\). However, recalling that the helicity \(\sigma\) may be a half-integer (though not yet found in Nature), there is a sign difference between \(\pm \pi\)-rotations about the \(3\)-axis. Without going into further details, we write down the general formula for the space inversion as follows
where the sign depends on the sign of \(p_2\) (which can be seen by playing the same game as above with the negative unit vector along the \(2\)-axis).
Time inversion for massless particles#
Let \(k = (1, 0, 0, 1)\) as usual and consider the state \(\Psi_{k, \sigma}\). Since \(U(\Tcal)\) anti-commutes with both \(\Pbf\) and \(J_3\) by \(\eqref{eq_p3_conjugated_by_t}\) and \(\eqref{eq_j3_conjugated_by_t}\), we have
Composing with the rotation \(R_2\) as in the previous section to fix \(k\), we have
where \(\zeta_{\sigma}\) is (yet another) phase. We see that, unlike the space inversion, the time inversion doesn’t produce a doublet of opposite helicity. Proceeding as in the space inversion case, one can derive the following general formula similar to \(\eqref{eq_space_inversion_on_massless_general}\)
where the sign depends on the sign of \(p_2\) as before.
Kramers’ degeneracy#
We end our discussion about space and time inversions of one-particle states with an interesting observation on the squared time inversion \(U(\Tcal)^2\). It follows from both \(\eqref{eq_time_inversion_on_massive_general}\) and \(\eqref{eq_time_inversion_on_massless_general}\) that
where \(s \in \tfrac{1}{2} \Zbb\) equals the spin \(\jfrak\) in the massive case and the absolute helicity \(|\sigma|\) in the massless case.
Hence in a non-interacting system consisting of an odd number of half-integer spin/helicity particles and any number of integer spin/helicity particles, we have
Now for any eigenstate \(\Psi\) of the Hamiltonian, there is an accompanying eigenstate \(U(\Tcal) \Psi\) since \(U(\Tcal)\) commutes with the Hamiltonian. The key observation then is that they are necessarily different states! To see this, let’s suppose otherwise, i.e., that \(U(\Tcal) \Psi = \zeta \Psi\) for some phase \(\zeta\), so that the two represent the same state. Then
contradicts \(\eqref{eq_time_inversion_squared_reverse_sign}\).
As a conclusion, we see that for such systems, any energy eigenvalue has at least a two-fold degeneracy. This is known as Kramers’ degeneracy.
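For a single spin-\(\tfrac{1}{2}\) the whole argument can be made completely concrete. A minimal sketch, assuming the standard realization \(U(\Tcal) = \exp(-\ifrak \pi J_2) K = -\ifrak \sigma_y K\) (any choice differing by a phase works the same way): it checks both \(U(\Tcal)^2 = -1\) and the orthogonality of \(\Psi\) and \(U(\Tcal)\Psi\).

```python
import numpy as np

def time_reversal(psi):
    # anti-unitary U(T) = -i sigma_y K for a single spin-1/2
    minus_i_sigma_y = np.array([[0.0, -1.0],
                                [1.0,  0.0]])
    return minus_i_sigma_y @ psi.conj()

rng = np.random.default_rng(1)
psi = rng.normal(size=2) + 1j * rng.normal(size=2)
psi /= np.linalg.norm(psi)

# U(T)^2 = -1 on a half-integer spin ...
assert np.allclose(time_reversal(time_reversal(psi)), -psi)
# ... and psi is always orthogonal to its time-reversed partner, so the
# two energy eigenstates are genuinely distinct (Kramers' degeneracy).
assert np.isclose(np.vdot(psi, time_reversal(psi)), 0.0)
```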
Scattering Theory#
Physics would have been rather boring if nothing interacted, like the free particles that we have been studying so far. On the flip side, physics would have been impossible if we tried to know exactly what happens during the interactions. The middle ground, where we assume that the particles are non-interacting long before and after the interaction, and something mysterious happens in between, is called scattering theory – a place where theories meet experiments.
Non-interacting many-particle states#
We shall, as always, start from the easiest part of the theory, which is clearly the non-interacting part. Recall our grand formulae for the Lorentz transformation on one-particle states, \(\eqref{eq_translation_formula_for_particle_state}\) and \(\eqref{eq_lorentz_transformation_formula_for_particle_state}\). For a non-interacting many-particle system, it’s conceivable to assume that the Lorentz transformation law is simply a direct product over the individual particles as follows (recall that \(U(\Lambda, a) = U(1, a) U(\Lambda, 0)\))
where the first component is the translation transformation \(\eqref{eq_translation_formula_for_particle_state}\), the second component is the normalization factor, and the third component is the little group representation, and the \(\sigma\)’s are either the spin \(z\)-component for massive particles or the helicity for massless particles, and the \(n\)’s are additional (discrete) labels such as mass, charge, spin, etc.
Notice that by writing a many-particle state as \(\Psi_{p_1, \sigma_1, n_1; ~p_2, \sigma_2, n_2; ~\cdots}\), we have given the particles an order, which is by no means unique. Hence the normalization of these states must take permutations into account as follows
The sign in front of the permutations has to do with the species of the particles, which will be discussed later. Note that although there are many terms in \(\eqref{eq_many_particles_state_normalization_rough}\), there is at most one nonzero term, which happens exactly when the two states differ by a permutation.
To suppress the annoyingly many sub-indexes in the states, we shall use letters such as \(\alpha, \beta, \cdots\) to denote the compound index such as \((p_1, \sigma_1, n_1; ~p_2, \sigma_2, n_2; ~\cdots)\), so that, for example, \(\eqref{eq_many_particles_state_normalization_rough}\) can be simplified as
where the integral volume element reads
We have postulated that the transformation law \(\eqref{eq_lorentz_transformation_formula_for_many_free_particles}\) works for non-interacting particles, but in fact, it’s also only possible for non-interacting particles. One way to see this is through an energy calculation by letting \(\Lambda = 1\) and \(a = (\tau, 0, 0, 0)\) in \(\eqref{eq_lorentz_transformation_formula_for_many_free_particles}\) to see that
where \(E_i \coloneqq (p_i)_0\) is the energy of the \(i\)-th particle. There is obviously no energy left for any interaction.
In- and out-states#
As mentioned earlier, scattering theory is concerned with a scenario where interactions happen within a finite time period, long before and after which the system can be regarded as non-interacting. We can therefore define the in-state \(\Psi_{\alpha}^-\) and the out-state \(\Psi_{\alpha}^+\), where \(\alpha\) is the compound index as defined in the previous section, such that the states appear, when observed at \(t \to \mp \infty\) respectively, to be non-interacting states with the prescribed particle content. [2]
Now it’s time to bring forward an implicit assumption on the quantum states that we’ve been studying so far: they’re defined in one chosen inertial frame. Indeed, the Lorentz transformation law \(\eqref{eq_lorentz_transformation_formula_for_many_free_particles}\) tells us exactly how to transform the state to any other frame. States of this sort are called Heisenberg picture states: they contain the entire history/future of the system and are not dynamical in time as opposed to the so-called Schrödinger picture states.
Back to the scattering scenario, let’s imagine a reference observer \(\Ocal\), who at \(t = 0\) observes that the system is in a state \(\Psi\). Then imagine another observer \(\Ocal'\) at rest with respect to \(\Ocal\), who sets his clock \(t' = 0\) when \(t = \tau\), in other words \(t' = t - \tau\). Then from the viewpoint of \(\Ocal'\), the time-\(0\) state should look like \(\exp(-\ifrak \tau H) \Psi\). It follows that the state \(\Psi\), viewed long before and long after the reference \(t = 0\), should look like \(\exp(-\ifrak \tau H) \Psi\) for \(\tau \to \mp\infty\), respectively.
It follows that energy eigenstates such as \(\Psi_{\alpha}\) will look the same at all times because \(\exp(-\ifrak \tau H) \Psi_{\alpha} = \exp(-\ifrak \tau E_{\alpha}) \Psi_{\alpha}\) creates merely an inconsequential phase factor. This is one form of the uncertainty principle: if the energy is definitely known, then the time is completely unknown. Therefore we must consider a localized packet (or superposition) of states as follows
where \(g(\alpha)\) is a reasonably smooth function (e.g. without poles) which is non-vanishing within a finite range of energies. We can then demand that the time limits
as \(\tau \to \pm\infty\), respectively, approach the corresponding superpositions of non-interacting particle states.
To be more precise, let’s split the Hamiltonian into the free part and the interaction part as follows
such that the energy eigenstates \(\Phi_{\alpha}\) of \(H_0\) (in the same frame as \(\Psi_{\alpha}^{\pm}\)) transform according to \(\eqref{eq_lorentz_transformation_formula_for_many_free_particles}\). Then the asymptotic freeness translates into the following conditions
or equivalently in terms of the Hamiltonians
This motivates the following definition
so that \(\Psi_{\alpha}^{\pm} = \Omega(\pm\infty) \Phi_{\alpha}\), at least formally. Moreover, since \(\Omega\) is unitary, the in- and out-states \(\Psi_{\alpha}^{\pm}\) are normalized as long as \(\Phi_{\alpha}\) are normalized.
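Purely to illustrate the algebra of \(\Omega\) (and of the operator \(U(\tau_1, \tau_0) = \Omega^{\dagger}(\tau_1) \Omega(\tau_0)\) that will appear below), here is a toy sketch with finite-dimensional random Hermitian \(H_0\) and \(V\); genuine in- and out-states require a continuous spectrum and wave packets, so no \(\tau \to \pm\infty\) limit is attempted.

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(2)

def random_hermitian(n):
    A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (A + A.conj().T) / 2

n = 4
H0 = random_hermitian(n)
V = 0.1 * random_hermitian(n)       # "small" interaction, purely illustrative
H = H0 + V

def Omega(tau):
    # Omega(tau) = exp(+i tau H) exp(-i tau H_0)
    return expm(1j * tau * H) @ expm(-1j * tau * H0)

U = Omega(7.0).conj().T @ Omega(-5.0)           # U(tau_1, tau_0) at finite times
assert np.allclose(Omega(0.0), np.eye(n))       # Omega(0) is the identity
assert np.allclose(U.conj().T @ U, np.eye(n))   # and U is unitary
```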
In practice it will be assumed that the interaction term \(V\) in \(\eqref{eq_h_as_h0_plus_v}\) is relatively small so that a formal solution as a power series in \(V\) may be meaningful. As the first step, let’s try to apply \(\eqref{eq_h_as_h0_plus_v}\) to \(\Psi_{\alpha}^{\pm}\) as follows
Note that \(\Phi_{\alpha}\) is also annihilated by \(E_{\alpha} - H_0\). Considering the asymptotics \(\eqref{eq_in_out_states_asymptotic_by_energy}\) or \(\eqref{eq_in_out_states_asymptotic_by_hamiltonian}\), it’s reasonable to guess the following formal solution
where the infinitesimal \(\mp \ifrak \epsilon\) is a mathematical trick added to avoid division by zero, and the signs will be justified momentarily. One can obviously apply \(\eqref{eq_lippmann_schwinger_mixed}\) recursively to get an expansion of \(\Psi_{\alpha}^{\pm}\) as a power series in \(V\), and we shall come back to this point later. In order to express \(\Psi_{\alpha}^{\pm}\) in terms of \(\Phi_{\alpha}\), let’s expand the right-hand-side of \(\eqref{eq_lippmann_schwinger_mixed}\) as follows
Both \(\eqref{eq_lippmann_schwinger_mixed}\) and \(\eqref{eq_lippmann_schwinger_pure}\) are known as the Lippmann-Schwinger equation.
Now let’s justify the term \(\pm \ifrak \epsilon\) by showing that \(\eqref{eq_lippmann_schwinger_pure}\) indeed satisfies the asymptotic condition \(\eqref{eq_in_out_states_asymptotic_by_energy}\) as follows
Now the integral colored in blue can be integrated over \(E_{\alpha}\) by a contour that runs from \(-\infty\) to \(+\infty\), followed by a semicircle at infinity, in the upper-half-plane in the case of \(\Psi_{\alpha}^-\) and the lower-half-plane in the case of \(\Psi_{\alpha}^+\), back to \(-\infty\). In either case, the sign in \(\mp \ifrak \epsilon\) is chosen so that the integrand has no poles with infinitesimally small imaginary part, though both \(g(\alpha)\) and \((\Phi_{\beta}, V \Psi_{\alpha}^{\pm})\), viewed as complex functions, may have poles with finite imaginary parts. It follows then from the residue theorem and the damping factor \(\exp(-\ifrak \tau E_{\alpha})\) as \(\tau \to \pm\infty\) that the integral in blue vanishes, as desired.
S-matrix and its symmetry#
The S-matrix defined by
records the probability amplitude of finding the out-state \(\Psi_{\beta}^+\) given the in-state \(\Psi_{\alpha}^-\). Note that since the in- and out-states both form an orthonormal basis of the same Hilbert space, the S-matrix is unitary. However, the way \(S\) is defined in \(\eqref{eq_defn_s_matrix_by_in_and_out_states}\) disqualifies it as an operator on the Hilbert space. Therefore it’ll be convenient to convert both in- and out-states to the free states and define the S-operator by
Using \(\eqref{eq_defn_of_Omega}\) we see that
where
The most straightforward way to calculate \(S_{\beta \alpha}\) is probably to use \(\eqref{eq_lippmann_schwinger_pure}\) directly. However, this turns out to be rather involved, and doesn’t lead to a simple result. The issue is that we don’t really want to convert both the in- and out-states to the non-interacting states, but rather to push, say, the in-states from the far past to the far future and compare with the out-states. To spell out the details, let’s first calculate the asymptotic of the in-packet as \(\tau \to \infty\) (but omitting the \(\lim_{\tau \to \infty}\) symbol) using \(\eqref{eq_packet_expansion_by_lippmann_schwinger}\)
where we’ve used the residue theorem again in the second equality. Next expand the left-hand-side of the equation in terms of the out-states and then let \(\tau \to \infty\)
where we’ve used the fact that the S-matrix contains a \(\delta(E_{\alpha} - E_{\beta})\) factor by energy conservation in the second equality, and the defining property \(\eqref{eq_in_out_states_asymptotic_by_energy}\) of the out-state in the third equality.
Equating the blue terms from \(\eqref{eq_positive_limit_of_in_state_by_lippmann_schwinger}\) and \(\eqref{eq_positive_limit_of_in_state_by_expanding_out_states}\), we’ve derived the following formula
Up to the first order in \(V\), one can replace \(\Psi_{\alpha}^-\) on the right-hand-side by \(\Phi_{\alpha}\) and arrive at the so-called Born approximation of the S-matrix.
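To see the spirit of the Born approximation in the simplest possible setting, one can step outside the relativistic formalism and look at nonrelativistic scattering off a \(1\)-dimensional delta potential \(V(x) = g\,\delta(x)\) (with \(\hbar = m = 1\)); the exact reflection amplitude and its first-order-in-\(V\) approximation are standard textbook results, quoted here only to illustrate that replacing \(\Psi_{\alpha}^-\) by \(\Phi_{\alpha}\) is accurate to first order in the interaction.

```python
import numpy as np

def r_exact(g, k):
    # exact reflection amplitude for V(x) = g * delta(x), hbar = m = 1
    return g / (1j * k - g)

def r_born(g, k):
    # first order in V: the free state is used in place of the in-state
    return -1j * g / k

k = 1.3
for g in (0.1, 0.01, 0.001):
    # the discrepancy is of second order in the interaction strength
    assert abs(r_exact(g, k) - r_born(g, k)) < 5 * (g / k)**2
```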
Lorentz symmetry#
Recall that in \(\eqref{eq_lorentz_transformation_formula_for_many_free_particles}\), or really in Lorentz symmetry of one-particle states, we understood how Lorentz transformations act on particle states. Now we’d like to understand how they act on the S-matrix. Of course, since \(U(\Lambda, a)\) is unitary, we always have
but this is not what we mean by Lorentz symmetry. What we do want to know is, just like in \(\eqref{eq_lorentz_transformation_formula_for_many_free_particles}\), how Lorentz transformations act on the particle states, i.e., the (compound) indexes \(\alpha\) and \(\beta\). Now although \(\eqref{eq_lorentz_transformation_formula_for_many_free_particles}\) doesn’t work for general (interacting) states, it does work for, say, \(\Psi_{\alpha}^-\) in the \(\tau \to -\infty\) limit because of the asymptotic freeness. By Lorentz symmetry we mean that \(U(\Lambda, a)\) acts in the same way on both in- and out-states. In other words, we’ll be looking for some \(U(\Lambda, a)\) such that the following general formula holds.
where we’ve used primes to distinguish between labels from in- and out-states, and bars to distinguish between labels, specifically the spin-\(z\) or helicity, before and after the Lorentz transformation.
Since the left-hand-side doesn’t depend on the translation parameter \(a\), the blue term on the right-hand-side must be \(1\). In other words,
which is nothing but the conservation of (total) momentum. Note that a special case, which is the energy conservation, has already been used in the derivation of \(\eqref{eq_positive_limit_of_in_state_by_expanding_out_states}\) from the previous section.
As a consequence, we can now extract a delta function from the S-matrix as follows
which should be compared with \(\eqref{eq_s_matrix_pre_born_approx}\).
Back to the core question of this section, how in the world can one engineer a magic \(U(\Lambda, a)\) to satisfy the monstrous \(\eqref{eq_lorentz_transformation_formula_for_s_matrix}\)? One cannot. But remember that \(\eqref{eq_lorentz_transformation_formula_for_s_matrix}\) is readily satisfied for non-interacting particles. It follows that if we consider instead the S-operator defined by \(\eqref{eq_defn_s_operator}\), and let \(U_0(\Lambda, a)\) be the Lorentz transformation on free particles defined by \(\eqref{eq_lorentz_transformation_formula_for_many_free_particles}\), then \(\eqref{eq_lorentz_transformation_formula_for_s_matrix}\) would be satisfied if \(U_0(\Lambda, a)\) commutes with \(S\). Indeed, using shorthand notations, we have
where \(\cdots\) denotes the coefficients on the right-hand-side of \(\eqref{eq_lorentz_transformation_formula_for_s_matrix}\) and the sub-index \(\underline{\beta \alpha}\) is short for the compound index on the right-hand-side of \(\eqref{eq_lorentz_transformation_formula_for_s_matrix}\).
In order for \(S\) to commute with \(U_0(\Lambda, a)\), it suffices that it commutes with the infinitesimal generators of \(U_0(\Lambda, a)\), namely,
where \(H_0, \Pbf_0, \Jbf_0, \Kbf_0\) are discussed in Quantum Lorentz symmetry and satisfy the commutation relations \(\eqref{eq_hp_commute}\) – \(\eqref{eq_kkj_commutation}\).
This shall be done in three steps, where \(\eqref{eq_p30_s_commute}, \eqref{eq_j30_s_commute}\) will be handled first, followed by \(\eqref{eq_k30_s_commute}\), and finally \(\eqref{eq_h0_s_commute}\).
- Step 1.
Recall from \(\eqref{eq_s_operator_by_u}\) and \(\eqref{eq_defn_u_operator}\) that the S-operator can be understood as time translations dictated by \(H\) and \(H_0\). It’s therefore necessary to understand how the free infinitesimal Lorentz transformations commute with \(H\). To this end, let’s consider the in-states at \(\tau \to -\infty\), which are approximately free. We can similarly define infinitesimal operators \(\Pbf, \Jbf, \Kbf\) that together with \(H\) satisfy the same commutation relations \(\eqref{eq_hp_commute}\) – \(\eqref{eq_kkj_commutation}\).
Now comes the crucial part, which is to make assumptions about \(H\) so that \(\eqref{eq_h0_s_commute}\) – \(\eqref{eq_k30_s_commute}\) are satisfied. Recall that \(H = H_0 + V\) where \(V\) describes the interactions. The first assumption we’ll make is the following
Assumption on \(H\) for Lorentz invariance of S-matrix #1
The interaction \(V\) affects neither the momentum \(\Pbf\) nor the angular momentum \(\Jbf\). In other words, we assume that
\begin{equation}
\Pbf = \Pbf_0, ~~\Jbf = \Jbf_0, ~~[V, \Pbf_0] = [V, \Jbf_0] = 0
\label{eq_s_matrix_lorentz_invariance_assump_1}
\end{equation}

It follows from this assumption that \(\eqref{eq_p30_s_commute}\) and \(\eqref{eq_j30_s_commute}\) hold.
- Step 2.
Next we turn to \(\eqref{eq_k30_s_commute}\). This time we cannot “cheat” by assuming that \(\Kbf = \Kbf_0\) because it would lead to the undesirable consequence \(H = H_0\) by \(\eqref{eq_pkh_commutation}\). So instead, let’s write
\begin{equation}
\Kbf = \Kbf_0 + \Wbf
\label{eq_k_as_k0_plus_w}
\end{equation}

where \(\Wbf\) denotes the perturbation term. Let’s calculate
\begin{equation}
[\Kbf_0, S] = \lim_{\substack{\tau_0 \to -\infty \\ \tau_1 \to \infty\phantom{-}}} [\Kbf_0, U(\tau_1, \tau_0)]
= \lim_{\substack{\tau_0 \to -\infty \\ \tau_1 \to \infty\phantom{-}}} [\Kbf_0, \exp(\ifrak \tau_1 H_0) \exp(\ifrak (\tau_0 - \tau_1) H) \exp(-\ifrak \tau_0 H_0)]
\label{eq_k0_s_commutator_by_u}
\end{equation}

as follows. First, using \(\eqref{eq_hkp_commutation}\) again,
\begin{alignat*}{2} [\Kbf_0, \exp(\ifrak \tau H_0)] &= [\Kbf_0, \ifrak \tau H_0] \exp(\ifrak \tau H_0) &&= \tau \Pbf_0 \exp(\ifrak \tau H_0) \\ [\Kbf, \exp(\ifrak \tau H)] &= [\Kbf, \ifrak \tau H] \exp(\ifrak \tau H) &&= \tau \Pbf \exp(\ifrak \tau H) =\tau \Pbf_0 \exp(\ifrak \tau H) \end{alignat*}from which we can calculate
\begin{align} [\Kbf_0, U(\tau_1, \tau_0)] &= [\Kbf_0, \exp(\ifrak \tau_1 H_0) \exp(\ifrak (\tau_0 - \tau_1) H) \exp(-\ifrak \tau_0 H_0)] \label{eq_k30_u_commutation} \\ &= [\Kbf_0, \exp(\ifrak \tau_1 H_0)] \exp(\ifrak (\tau_0 - \tau_1) H) \exp(-\ifrak \tau_0 H_0) \nonumber \\ &\phantom{=} + \exp(\ifrak \tau_1 H_0) [\Kbf - \Wbf, \exp(\ifrak (\tau_0 - \tau_1) H)] \exp(-\ifrak \tau_0 H_0) \nonumber \\ &\phantom{=} + \exp(\ifrak \tau_1 H_0) \exp(\ifrak (\tau_0 - \tau_1) H) [\Kbf_0, \exp(-\ifrak \tau_0 H_0)] \nonumber \\ &= \blue{\tau_1 \Pbf_0 \exp(\ifrak \tau_1 H_0) \exp(\ifrak (\tau_0 - \tau_1) H) \exp(-\ifrak \tau_0 H_0)} \nonumber \\ &\phantom{=} \blue{+ (\tau_0 - \tau_1) \Pbf_0 \exp(\ifrak \tau_1 H_0) \exp(\ifrak (\tau_0 - \tau_1) H) \exp(-\ifrak \tau_0 H_0)} \nonumber \\ &\phantom{=} \blue{- \tau_0 \Pbf_0 \exp(\ifrak \tau_1 H_0) \exp(\ifrak (\tau_0 - \tau_1) H) \exp(-\ifrak \tau_0 H_0)} \nonumber \\ &\phantom{=} - \exp(\ifrak \tau_1 H_0) [\Wbf, \exp(\ifrak (\tau_0 - \tau_1) H)] \exp(-\ifrak \tau_0 H_0) \nonumber \\ &= -\Wbf(\tau_1) U(\tau_1, \tau_0) + U(\tau_1, \tau_0) \Wbf(\tau_0) \nonumber \end{align}where \(\Wbf(\tau) \coloneqq \exp(\ifrak \tau H_0) \Wbf \exp(-\ifrak \tau H_0)\). Note that the three blue terms cancel out.
We see that \(\eqref{eq_k0_s_commutator_by_u}\), and hence \(\eqref{eq_k30_s_commute}\), would follow if \(\Wbf(\tau) \to 0\) as \(\tau \to \pm\infty\). The latter, in turn, would follow from the following assumption
Assumption on \(H\) for Lorentz invariance of S-matrix #2
The matrix elements of \(\Wbf\) with respect to the eigenstates \(\Phi_{\alpha}\) of \(H_0\) are smooth, so that \(\Wbf(\tau)\) vanishes on any local packet of \(\Phi_{\alpha}\) as in \(\eqref{eq_psi_packet}\) as \(\tau \to \pm\infty\).
This assumption should be compared with \(\eqref{eq_in_out_states_asymptotic_by_energy}\) and \(\eqref{eq_in_out_states_asymptotic_by_hamiltonian}\), and can be justified by the asymptotic freeness of S-matrix theory.
- Step 3.
Finally let’s handle \(\eqref{eq_h0_s_commute}\). Recall from \(\eqref{eq_s_operator_by_u}\) that \(S = \Omega^{\dagger}(\infty) \Omega(-\infty)\). Hence the idea is to work out how \(H\) and \(H_0\) intertwine with \(\Omega(\pm\infty)\). To this end, let’s use \(\eqref{eq_k30_u_commutation}\) by setting \(\tau_1 = 0\) and \(\tau_0 = \mp\infty\) as follows
\begin{equation*} [\Kbf_0, \Omega(\mp\infty)] = -\Wbf \Omega(\mp\infty) \implies \Kbf \Omega(\mp\infty) = \Omega(\mp\infty) \Kbf_0 \end{equation*}Moreover, by \(\eqref{eq_s_matrix_lorentz_invariance_assump_1}\), we have also \(\Pbf \Omega(\mp\infty) = \Omega(\mp\infty) \Pbf_0\). Now using the commutation relation \(\eqref{eq_pkh_commutation}\) we conclude that
\begin{equation*} H \Omega(\mp\infty) = \Omega(\mp\infty) H_0 \end{equation*}which readily implies \(\eqref{eq_h0_s_commute}\).
Note
Besides showing that \(\eqref{eq_h0_s_commute}\) – \(\eqref{eq_k30_s_commute}\) hold, our calculations actually establish the following intertwining identities
which imply, in particular, that the standard commutation relations \(\eqref{eq_hp_commute}\) – \(\eqref{eq_kkj_commutation}\) also hold in a frame where \(\tau \to \infty\), as expected.
Internal symmetry#
An internal symmetry is a symmetry that leaves \(p\) and \(\sigma\) invariant and acts only on the remaining labels such as charge, isospin, and so on. We can write the general form of an internal symmetry on in- and out-states as follows
where \(U(T)\) is the unitary operator associated with the symmetry transformation \(T\), and the \(\Dscr\)’s are analogs of the little group representations from \(\eqref{eq_d_repr_of_little_group}\).
Similar to \(\eqref{eq_lorentz_transformation_formula_for_s_matrix}\), we can formulate the internal symmetry of S-matrix as follows
where we have suppressed the irrelevant \(p\) and \(\sigma\) labels.
For what kind of Hamiltonian \(H\) does there exist an internal symmetry \(U(T)\) that acts like \(\eqref{eq_internal_symmetry_transformation_for_in_and_out_states}\)? The answer is similar to the case of Lorentz symmetry. Namely, it suffices to split \(H = H_0 + V\) into the free and perturbation terms in such a way that the free symmetry transformation \(U_0(T)\), which satisfies \(\eqref{eq_internal_symmetry_transformation_for_in_and_out_states}\) with \(\Phi\) in place of \(\Psi^{\pm}\), commutes with both \(H_0\) and \(V\).
Similar to the translations in Lorentz symmetry, let’s consider a symmetry \(T(\theta)\) parametrized by a real number. It follows from \(\eqref{eq_additive_symmetry}\) that we can write
where \(Q\) is a Hermitian operator called the charge. Probably the best known example of it is the electric charge. In this case, we can also write
The general formula \(\eqref{eq_internal_symmetry_transformation_formula_for_s_matrix}\) then translates into
which is nothing but the conservation of charge. Besides the electric charge, there also exist other similar conserved, or approximately conserved, quantities, such as the baryon number and the lepton number.
Parity symmetry#
Recall from Space inversion for massive particles that for non-interacting massive particles
where \(\eta_n\) denotes the intrinsic parity of particle \(n\). The S-matrix version of the parity symmetry is as follows
Although the space inversion operator \(\Pcal\) is defined explicitly in \(\eqref{eq_space_inversion}\), the parity operator \(U(\Pcal)\), as far as the S-matrix is concerned, is completely determined by \(\eqref{eq_space_inversion_acts_on_in_and_out_states}\) and \(\eqref{eq_space_inversion_formula_for_s_matrix}\). In particular, it’s not uniquely determined if the particle species in question possess internal symmetries as discussed in the previous section, because the composition of such a symmetry with \(U(\Pcal)\) will also satisfy \(\eqref{eq_space_inversion_acts_on_in_and_out_states}\) and \(\eqref{eq_space_inversion_formula_for_s_matrix}\), and therefore may equally well be called a parity operator.
Since \(\Pcal^2 = 1\), it’s natural to ask whether \(U(\Pcal)^2 = 1\) necessarily. This would have been the case if \(U\) furnished a genuine representation, but it doesn’t have to. In general, we have
which looks just like an internal symmetry. Now if \(U(\Pcal)^2\) belongs to a continuous family of internal symmetries, then \(U(\Pcal)\) may be redefined, by suitably composing it with internal symmetries, so that all \(\eta^2 = 1\). Examples of this kind notably include protons and neutrons. On the other hand, non-examples, i.e., particles whose intrinsic parity cannot be reduced to \(\pm 1\), include only the hypothetical Majorana fermions.
Todo
Revise this part after I’ve learned more…
Time inversion symmetry#
Recall from Time inversion for massive particles that for a single massive particle
To generalize this to the in- and out-states, we need to remember that the time inversion also interchanges the very frame with respect to which the in- and out-states are defined. The result is as follows
The invariance of S-matrix can then be formulated as follows
Since we’ll be mainly concerned with the rate of interactions in this section, the phase factors in front of \(\Psi\) play little role. So let’s simplify the notations in \(\eqref{eq_time_inversion_acts_on_in_and_out_states}\) and \(\eqref{eq_time_inversion_acts_on_s_matrix}\) using compound indexes as follows
where the phase factors have been “absorbed” in the right-hand-side.
Unlike the space inversions discussed in the previous section, time inversions don’t directly lead to implications on reaction rates because, after all, we cannot turn time around in any experiment. However, under certain circumstances, one can use a trick to draw experimentally verifiable conclusions, which we now present.
The main assumption here is that one can expand the S-operator as follows
such that \(S^{(1)} \ll S^{(0)}\) can be regarded as the first-order perturbation. The unitarity of \(S\) shows that
which, in turn, implies
Using \(\eqref{eq_time_inversion_formula_for_s_matrix}\), the (anti-)Hermitian condition can be spelled out in matrix notations as follows
where we recall that the adjoint \(\dagger\) equals the composition of the (complex) conjugation \(\ast\) and transpose. Together with the unitarity of \(S^{(0)}\), we see that the rate of reaction \(\left| S^{(1)}_{\beta \alpha} \right|^2\), when summed up against a complete set of \(S^{(0)}\) eigenstates, remains the same after applying \(\Tcal\) to both initial and final states.
The simplest case where \(\eqref{eq_first_order_s_matrix_is_hermitian}\) becomes applicable is obviously when both \(\alpha\) and \(\beta\) are eigenstates of \(S^{(0)}\), with eigenvalues, say, \(\exp(\ifrak \theta_{\alpha})\) and \(\exp(\ifrak \theta_{\beta})\), respectively. In this case \(\eqref{eq_first_order_s_matrix_is_hermitian}\) becomes
This is to say that, under the assumption that \(\eqref{eq_s_operator_first_order_expansion}\) is valid, at least approximately, the rate of reaction \(\left| S^{(1)}_{\beta \alpha} \right|^2\) should be invariant under a flip of the \(3\)-momentum as well as the spin \(z\)-component. This is not contradicted by Wu’s experiment, which disproved parity conservation.
Rates and cross-sections#
As we already mentioned, the S-matrix entries \(S_{\beta \alpha}\) can be interpreted as probability amplitudes of a reaction that turns an in-state \(\Psi^-_{\alpha}\) into an out-state \(\Psi^+_{\beta}\). In other words, the probability \(P(\Psi^-_{\alpha} \to \Psi^+_{\beta}) = \left| S_{\beta \alpha} \right|^2\). It is, however, not completely straightforward to square S-matrix entries because, as we’ve seen in \(\eqref{eq_s_matrix_with_m}\), they contain Dirac delta functions.
Derivation in box model#
One trick that is often used in physics to deal with integration over an infinite space is to restrict the space to a (large) box, often with additional periodic boundary conditions, and hope that the final results will not depend on the size of the box, as long as it’s large enough. This is exactly what we shall do.
Consider a cubic box whose sides have length \(L\), so that its volume is \(V = L^3\). Imposing periodic boundary conditions on the box, the \(3\)-momentum is discretized as follows
where \(n_1, n_2, n_3\) are integers. Of course, the larger the \(|n_i|\), the shorter the wavelength in the wave-mechanical picture. By analogy with the continuous case, we can define the Dirac delta function as follows
where \(\delta_{\pbf \pbf'}\) is the usual Kronecker delta. With this setup, the inner product of states \(\eqref{eq_many_particles_state_normalization_rough}\) will produce, from the Dirac deltas, an overall factor of \(\left( V/(2\pi)^3 \right)^N\), where \(N\) denotes the number of particles in the box. In order for the amplitudes to be independent of the size of the box, let’s normalize the states as follows
such that \(\left( \Psi^{\square}_{\beta}, \Psi^{\square}_{\alpha} \right) = \delta_{\beta \alpha}\) is properly normalized. Correspondingly, we can express the S-matrix with respect to the box-normalized states as follows
where \(N_{\alpha}, N_{\beta}\) are the numbers of particles in the in- and out-states, respectively.
Now the transition probability in the box model takes the following form
which we can further turn to a differential form as follows
where \(d\beta\) denotes an infinitesimal volume element around the state \(\beta\), or more precisely, a product of \(d^3 \pbf\), one for each particle. Then \(\Nscr_{\beta}\) counts the number of states within the infinitesimal \(d\beta\), which can be readily calculated from \(\eqref{eq_momentum_by_wave_number}\).
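For concreteness, here is a minimal numerical sketch (with hypothetical values for \(L\) and a momentum cutoff, and assuming the standard discretization \(\pbf = 2\pi \nbf / L\)) comparing a brute-force count of the discretized momenta inside a ball with the continuum estimate \(V \, d^3 p / (2\pi)^3\):

```python
import numpy as np

# Count discretized momenta p = 2*pi*n/L inside a ball of radius p_max and
# compare with the continuum density of states V/(2*pi)^3.
L = 20.0                      # hypothetical box size (natural units)
V = L**3
p_max = 3.0

n_max = int(np.ceil(p_max * L / (2 * np.pi)))
n = np.arange(-n_max, n_max + 1)
nx, ny, nz = np.meshgrid(n, n, n, indexing="ij")
p2 = (2 * np.pi / L) ** 2 * (nx**2 + ny**2 + nz**2)

exact_count = np.count_nonzero(p2 <= p_max**2)
continuum_estimate = V / (2 * np.pi) ** 3 * (4 / 3) * np.pi * p_max**3
print(exact_count, continuum_estimate)   # agree to a few percent for large L
```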
Let’s get back to our core problem, which is to make sense of \(\left| S_{\beta \alpha} \right|^2\) with \(S_{\beta \alpha}\) calculated by \(\eqref{eq_s_matrix_with_m}\). The first assumption we will make, at least for now, is a genericity condition
Genericity assumption on the S-matrix
No subset of particles in the state \(\beta\) has exactly the same (total) \(4\)-momentum as some subset in the state \(\alpha\).
Under this assumption, we can remove the term \(\delta(\beta - \alpha)\) from \(\eqref{eq_s_matrix_with_m}\) and write
and moreover, ensure that \(M_{\beta \alpha}\) contains no more delta functions. Now the question becomes how to define \(\left| \delta^4(p_{\beta} - p_{\alpha}) \right|^2\). In fact, to align with the main theme of using in- and out-states to calculate the S-matrix, the interaction must be turned on for a finite period of time, say, \(T\). Hence the time-wise delta function becomes
We can then modify \(\eqref{eq_generic_s_matrix_with_m}\) in a “timed box” as follows
Now using \(\eqref{eq_3_momentum_delta_in_a_box}\) and \(\eqref{eq_time_delta_in_a_period}\), we can calculate the squares as follows
All together, we can now rewrite \(\eqref{eq_differential_form_of_s_matrix_probability}\) as follows
where we have restored \(\delta^4(p_{\beta} - p_{\alpha})\) by taking the large \(V\) and \(T\) limits.
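As a sanity check on the squared-delta heuristic, here is a small numerical sketch; it assumes the usual finite-time regularization \(\delta_T(E) = \sin(ET/2)/(\pi E)\), which may differ from the convention used above by inessential factors:

```python
import numpy as np

# delta_T(E) = sin(E*T/2)/(pi*E) comes from integrating exp(i*E*t)/(2*pi)
# over a time window of length T; its square integrates to T/(2*pi).
def delta_T(E, T):
    return np.sinc(E * T / (2 * np.pi)) * T / (2 * np.pi)   # = sin(E*T/2)/(pi*E)

T = 50.0
E = np.linspace(-40.0, 40.0, 400001)
dE = E[1] - E[0]

print((delta_T(E, T) * dE).sum())        # ~ 1, as a delta function should
print((delta_T(E, T) ** 2 * dE).sum())   # ~ T/(2*pi), i.e. |delta_T|^2 ~ delta_T * T/(2*pi)
print(T / (2 * np.pi))
```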
If taking partial limits in \(V\) and \(T\) in the above derivation is not suspicious enough, then let’s define the rate of transition by moving the \(T\) factor from the right to the left as follows
where \(M_{\beta \alpha}\) is defined by \(\eqref{eq_generic_s_matrix_with_m}\) instead of \(\eqref{eq_generic_s_matrix_in_time_box}\) because we have restored \(\delta^4(p_{\beta} - p_{\alpha})\). This is totally wild because by taking a rate one typically thinks of processes that happen within infinitesimal time periods, but we have at the same time taken the large \(T\) limit to recover the \(\delta^4(p_{\beta} - p_{\alpha})\) factor. Despite the insane derivation, the end result seems reasonable, and it will be the key formula that connects the S-matrix to experimental measurements of probabilities.
Examples with few initial particles#
One special case of interest is when \(N_{\alpha} = 1\), or in other words, processes where one particle decays into multi-particles. In this case \(\eqref{eq_rate_of_reaction_master_formula}\) becomes
which is independent of the volume of the box. This is reasonable because the decay rate of one particle shouldn’t care about the size of the containing box. However, the \(T \to \infty\) limit in \(\delta^4(p_{\beta} - p_{\alpha})\) is no longer valid. In fact, \(T\) cannot be longer than the (mean) lifetime \(\tau_{\alpha}\) of the particle \(\alpha\), because the interaction wouldn’t make sense if the particle itself had already disintegrated. In this case, in order for \(\eqref{eq_time_delta_in_a_period}\) to still approximate a delta function, we must assume that any characteristic energy of the interaction satisfies
where the right-hand-side is known as the total decay rate.
Another case of interest is when \(N_{\alpha} = 2\). In this case \(\eqref{eq_rate_of_reaction_master_formula}\) takes the following form
It turns out that in the world of experimentalists, it’s more common to use, instead of the transition rate, something called cross-section, or equivalently, rate per flux, where the flux is defined as [4]
and \(u_{\alpha}\) is the (relativistic) relative velocity between the two particles, to be discussed in more detail in the next section by considering Lorentz symmetry. We can then rewrite \(\eqref{eq_differential_reaction_rate_two_particles}\) in terms of the cross-section as follows
Note that \(d\sigma\) has the dimension of an area.
Lorentz symmetry of rates and cross-sections#
We can investigate the Lorentz symmetry on the rates and cross-sections as follows. Squaring \(\eqref{eq_lorentz_transformation_formula_for_s_matrix}\), and using the fact that the little group representations are unitary, we see that the following quantity
is Lorentz invariant, where \(E = p_0 = \sqrt{\pbf^2 + m^2}\) for each particle in \(\alpha\) and \(\beta\), respectively.
It follows that in the one-particle case, \(\eqref{eq_differential_reaction_rate_one_particle}\) gives
In particular, we recognize \(d\beta / \prod_{\beta} E\) as a product of the Lorentz invariant \(3\)-momentum volume elements constructed in \(\eqref{eq_lorentz_invariant_3_momentum_volume_element}\). Hence the only factor on the right-hand-side which is not Lorentz invariant is \(E_{\alpha}^{-1}\). It follows that the decay rate of a particle, summed over all spins, is inversely proportional to its energy; in other words, a faster moving particle decays more slowly, which is consistent with the special theory of relativity and with the experimentally observed slow decay rates of high energy particles coming from cosmic rays.
Next, let’s turn to the two-particle case. In this case \(\eqref{eq_cross_section_two_particles}\) gives
where \(E_1, E_2\) are the energies of the two particles in state \(\alpha\). As in the one-particle case, in order for the cross-section to be Lorentz invariant, we must define the relative velocity \(u_{\alpha}\) such that the product \(u_{\alpha} E_1 E_2\) is Lorentz invariant. Indeed, such a quantity is uniquely determined by the requirement that when one of the particles is at rest, \(u_{\alpha}\) is the velocity of the other particle, and it takes the following form
For later use, let’s rewrite \(u_{\alpha}\) in the center-of-mass frame as follows. In the center-of-mass frame, the total momentum vanishes, and therefore we can write \(p_1 = (E_1, \pbf)\) and \(p_2 = (E_2, -\pbf)\). It follows that
which indeed looks more like a relative velocity. Note, however, that this is not a physical velocity because its value may approach \(2\) (i.e., faster than the speed of light) in the relativistic limit.
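Here is a quick numerical check of these statements, assuming Weinberg’s standard expression \(u_{\alpha} = \sqrt{(p_1 \cdot p_2)^2 - m_1^2 m_2^2} / (E_1 E_2)\) with \(p_1 \cdot p_2 = E_1 E_2 - \pbf_1 \cdot \pbf_2\); the masses and momenta are made up:

```python
import numpy as np

# Invariant relative (Moller) velocity u = sqrt((p1.p2)^2 - m1^2 m2^2) / (E1 E2).
def u_rel(m1, m2, pbf1, pbf2):
    E1 = np.sqrt(m1**2 + pbf1 @ pbf1)
    E2 = np.sqrt(m2**2 + pbf2 @ pbf2)
    p1dotp2 = E1 * E2 - pbf1 @ pbf2
    return np.sqrt(p1dotp2**2 - (m1 * m2) ** 2) / (E1 * E2)

m1, m2, k = 1.0, 2.0, 30.0      # hypothetical masses and CM momentum
pbf = np.array([0.0, 0.0, k])

# particle 2 at rest: u reduces to the speed |p1|/E1 of particle 1
print(u_rel(m1, m2, pbf, 0 * pbf), k / np.sqrt(m1**2 + k**2))

# center-of-mass frame: u = k (E1 + E2) / (E1 E2), which approaches 2
E1, E2 = np.sqrt(m1**2 + k**2), np.sqrt(m2**2 + k**2)
print(u_rel(m1, m2, pbf, -pbf), k * (E1 + E2) / (E1 * E2))
```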
The phase-space factor#
By phase-space factor we mean the factor \(\delta^4(p_{\beta} - p_{\alpha}) d\beta\) that appears in transition probabilities, rates and cross-sections discussed above. The goal of this section is to calculate it, particularly in the scenario where the final state consists of two particles. We’ll use the center-of-mass frame with respect to the initial state so that \(\pbf_{\alpha} = 0\). Then the phase-space factors can be written as follows
where we recall that the primes indicate that the quantities are taken from state \(\beta\), and \(E_{\alpha}\) denotes the total energy of state \(\alpha\). In the case where the final state consists of exactly two particles, the phase-space factor can be further simplified as follows
where \(\Omega\) is the solid angle in \(\pbf'_1\)-space, if in the integration we replace any occurrence of \(\pbf'_2\) with \(-\pbf'_1\).
To further simplify the delta function in \(\eqref{eq_simplified_two_final_particles_phase_space_factor}\), we recall the following identity, which is an incarnation of integration by substitution,
where \(x_0\) is a simple zero of \(f\). In the case of \(\eqref{eq_simplified_two_final_particles_phase_space_factor}\), we make the following choices
where \(k'\) is the unique simple zero of \(f\). Differentiating \(f\) at \(k'\) we get
where
Putting all together, we can further simplify \(\eqref{eq_simplified_two_final_particles_phase_space_factor}\) as follows
where \(k', E'_1\) and \(E'_2\) are defined by \(\eqref{eq_defn_root_k_prime}, \eqref{eq_defn_e1_prime}\) and \(\eqref{eq_defn_e2_prime}\), respectively.
Substituting \(\eqref{eq_two_particles_final_state_phase_factor_formula}\) into \(\eqref{eq_differential_reaction_rate_one_particle}\), we see that in the case of one particle decaying into two particles,
The two-body scattering \(1~2 \to 1'~2'\), according to \(\eqref{eq_cross_section_two_particles}\) and \(\eqref{eq_two_particles_relative_velocity_in_center_of_mass_frame}\), takes the following form
where \(k \coloneqq |\pbf_1| = |\pbf_2|\). These calculations will be used in the next section to get some insights into the scattering process.
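To make the two-body kinematics concrete, here is a small sketch (with made-up energies and masses) that finds the simple zero \(k'\) numerically and compares it with the standard closed form, which is not quoted above but follows from the same equations:

```python
import numpy as np
from scipy.optimize import brentq

# k' is the unique root of f(k) = sqrt(k^2+m1^2) + sqrt(k^2+m2^2) - E
# in the center-of-mass frame of the final two-particle state.
def k_prime(E, m1, m2):
    f = lambda k: np.sqrt(k**2 + m1**2) + np.sqrt(k**2 + m2**2) - E
    return brentq(f, 0.0, E)            # simple zero of f, as used above

E, m1, m2 = 10.0, 1.0, 3.0              # hypothetical total energy and final masses
k = k_prime(E, m1, m2)
closed_form = np.sqrt((E**2 - (m1 + m2)**2) * (E**2 - (m1 - m2)**2)) / (2 * E)
E1, E2 = np.sqrt(k**2 + m1**2), np.sqrt(k**2 + m2**2)
print(k, closed_form)                   # the two agree
print(E1 + E2, E)                       # energy conservation E1' + E2' = E_alpha
```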
Implications of the unitarity of S-matrix#
In this section we’ll no longer assume the Genericity of the S-matrix. This means that we’ll get back to use \(\eqref{eq_s_matrix_with_m}\), instead of \(\eqref{eq_generic_s_matrix_with_m}\), which we recall as follows
However, all the calculations from the previous sections can still be used here because we’ll be concerned with, for example, total rates, which are integrals over all possible final states, and the degenerate ones do not contribute to such integrals.
First, let’s spell out the consequence of the unitarity of the S-matrix, or more precisely \(S^{\dagger} S = 1\), as follows
which implies
In the special case where \(\alpha = \gamma\), \(\eqref{eq_s_matrix_unitarity_implication_on_m_general}\) gives the following key identity, known as the generalized optical theorem
As an application we can calculate the total rate of all transitions produced by the initial state \(\alpha\) using \(\eqref{eq_rate_of_reaction_master_formula}\) as follows
Another application of the unitarity of the S-matrix is along the lines of statistical mechanics. Applying the same calculation as in \(\eqref{eq_s_matrix_unitarity_first_half}\) to \(S S^{\dagger} = 1\), we get the counterpart to \(\eqref{eq_s_matrix_unitarity_implication_on_m_special}\)
Combining with the master equation \(\eqref{eq_rate_of_reaction_master_formula}\) we have
where \(c_{\alpha} \coloneqq \left( V / (2\pi)^3 \right)^{N_{\alpha}}\).
We shall carry out an equilibrium analysis for state \(\alpha\). To this end, let \(P_{\alpha} d\alpha\) be the infinitesimal probability of finding the system in state \(\alpha\). Then we have
where the first term calculates the total rate that other states transit into \(\alpha\), and the second term calculates the total rate that the state \(\alpha\) transits into other states. Recall that the entropy of the system is defined to be
Its rate of change can be estimated as follows
where the fourth inequality follows from the general inequality \(\ln(x) \geq 1 - 1 / x\) for any \(x > 0\). This is nothing but the famous slogan: entropy never decreases! Here we see it arise as a consequence of the unitarity of the S-matrix.
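As a toy illustration of this H-theorem-style argument, the following sketch evolves a probability distribution over a handful of discrete states under a master equation whose rate matrix satisfies the unitarity sum rule (taken symmetric here for simplicity, which satisfies it trivially); the entropy indeed never decreases:

```python
import numpy as np

# A finite set of "states", symmetric rates R, and the master equation
# dP/dt = gain - loss, with gain = R @ P and loss = (column sums of R) * P.
rng = np.random.default_rng(0)
n = 6
R = rng.random((n, n))
np.fill_diagonal(R, 0.0)
R = 0.5 * (R + R.T)                      # symmetric rates

P = rng.random(n)
P /= P.sum()                             # initial probability distribution

dt, steps = 1e-3, 20000
entropy = np.empty(steps)
for i in range(steps):
    P = P + dt * (R @ P - R.sum(axis=0) * P)
    entropy[i] = -np.sum(P * np.log(P))

print(entropy[0], entropy[-1], np.log(n))      # entropy climbs toward ln(n)
print(np.all(np.diff(entropy) >= -1e-12))      # and never decreases
```

With a generic symmetric rate matrix the distribution relaxes to the uniform one, so the entropy saturates at \(\ln(n)\), exactly as the general argument predicts.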
Perturbation theory of S-matrix#
Rather than being the epilogue of Scattering Theory, this section is more like a prelude to what comes next. In particular, we will work out a candidate Hamiltonian that satisfies the Lorentz invariance condition discussed in S-matrix and its symmetry.
One possible starting point of the perturbation theory is \(\eqref{eq_s_matrix_pre_born_approx}\) together with the Lippmann-Schwinger formula \(\eqref{eq_lippmann_schwinger_pure}\) which we recollect as follows
Applying \(V\) to \(\eqref{eq_lippmann_schwinger_repeated}\) and taking scalar product with \(\Phi_{\beta}\), we get
where \(V_{\beta \alpha} \coloneqq \left( \Phi_{\beta}, V\Phi_{\alpha} \right)\). One can apply \(\eqref{eq_base_iter_old_fashioned_s_matrix_perturbation}\) iteratively to get the following
and therefore a power series expansion in \(V\) of \(S_{\beta \alpha}\) in view of \(\eqref{eq_s_matrix_pre_born_approx_repeated}\).
One obvious drawback of the expansion \(\eqref{eq_s_matrix_power_series_expansion_old_fashioned}\) is that it obscures the Lorentz symmetry of the S-matrix because the denominators consist of only the energy terms. To overcome this, we shall use instead the other interpretation of the S-matrix in terms of the Hamiltonians given by \(\eqref{eq_s_operator_by_u}\) and \(\eqref{eq_defn_u_operator}\), which we recall as follows
Differentiating \(\eqref{eq_defn_u_operator_repeated}\) in \(\tau\) gives
Here
is a time-dependent operator in the so-called interaction picture, to be distinguished from the Heisenberg picture operator where the true Hamiltonian \(H\) should be used in place of \(H_0\). The differential equation \(\eqref{eq_evolution_equation_of_u_operator}\) can be easily solved as follows
which can then be iterated to give the following
Letting \(\tau \to \infty\) and \(\tau_0 \to -\infty\) we get another power series expansion of \(S\) in \(V\) as follows
It’s somewhat inconvenient that the integration limits in \(\eqref{eq_s_matrix_power_series_expansion_raw}\) ruin the permutation symmetry of the products of \(V\). But this can be fixed by introducing a time-ordered product as follows
where \(\theta(\tau)\) is the step function which equals \(1\) for \(\tau > 0\) and \(0\) for \(\tau < 0\), and it doesn’t matter what the value at \(\tau = 0\) is because it doesn’t contribute to the integrals in \(\eqref{eq_s_matrix_power_series_expansion_raw}\) anyway. With this definition, we can rewrite \(\eqref{eq_s_matrix_power_series_expansion_raw}\) as follows
where the division by \(n!\) is to account for the duplicated integrals introduced by the time-ordered product. Note that this power series looks much like the Taylor series of an exponential function. Indeed, in the unlikely event where \(V(t)\) at different times all commute, one can remove the time-ordering and write \(\eqref{eq_s_matrix_power_series_expansion_time_ordered}\) as an exponential function.
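To see the expansion at work, here is a minimal numerical sketch in which finite Hermitian matrices stand in for \(H_0\) and \(V\): the exact \(U(\tau_1, \tau_0)\) from \(\eqref{eq_defn_u_operator}\) is compared with its Dyson series truncated at second order, with the time ordering imposed explicitly:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)

def random_hermitian(n):
    A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return 0.5 * (A + A.conj().T)

n = 4
H0 = random_hermitian(n)
V = 0.01 * random_hermitian(n)                              # weak interaction
Vt = lambda t: expm(1j * H0 * t) @ V @ expm(-1j * H0 * t)   # interaction-picture V(t)

t0, t1 = -4.0, 4.0
# exact evolution operator U(t1, t0) = exp(i H0 t1) exp(-i (H0+V)(t1-t0)) exp(-i H0 t0)
U_exact = expm(1j * H0 * t1) @ expm(-1j * (H0 + V) * (t1 - t0)) @ expm(-1j * H0 * t0)

# Dyson series through second order, with the time ordering t >= s made explicit
N = 2000
ts = np.linspace(t0, t1, N, endpoint=False)
dt = (t1 - t0) / N
first = np.zeros((n, n), dtype=complex)
second = np.zeros((n, n), dtype=complex)
prefix = np.zeros((n, n), dtype=complex)      # running sum of V(s) for earlier times s
for t in ts:
    Vt_now = Vt(t)
    second += Vt_now @ prefix                 # only earlier times appear on the right
    prefix += Vt_now
    first += Vt_now
U_dyson = np.eye(n, dtype=complex) - 1j * dt * first - dt**2 * second

print(np.abs(U_exact - np.eye(n)).max())      # size of the full effect of V
print(np.abs(U_exact - U_dyson).max())        # much smaller residual, O(V^3) plus discretization
```

The residual shrinks further as higher orders of the series are kept, which is the content of \(\eqref{eq_s_matrix_power_series_expansion_time_ordered}\).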
One great benefit of writing \(S\) in the form of \(\eqref{eq_s_matrix_power_series_expansion_time_ordered}\) is that we can reformulate the condition of \(S\) being Lorentz symmetric in terms of some condition on \(V\). Recall from S-matrix and its symmetry that a sufficient condition for a Lorentz invariant S-matrix is that the S-operator commutes with \(U_0(\Lambda, a)\), or equivalently, in infinitesimal terms, that \(\eqref{eq_h0_s_commute}\) – \(\eqref{eq_k30_s_commute}\) are satisfied. Now the main postulate is to express \(V\) using a density function as follows
such that \(\Hscr(x)\) is a scalar in the sense that
Under these assumptions, we can further rewrite \(\eqref{eq_s_matrix_power_series_expansion_time_ordered}\) in terms of \(\Hscr(x)\) as follows
This expression of \(S\) is manifestly Lorentz invariant, except for the time-ordering part. In fact, the time-ordering of two spacetime points \(x_1, x_2\) is Lorentz invariant if and only if \(x_1 - x_2\) is time-like, namely, \((x_1 - x_2)^2 \geq 0\). This is consistent with intuition because two events with time-like (or light-like) separation can both lie on the worldline of a single observer, who certainly knows which of them happened first. Therefore we obtain a sufficient condition for the Lorentz invariance of \(S\) as follows
where we’ve also included the light-like case for technical reasons that will only become clear later. This condition is also referred to as the causality condition as it may be interpreted as saying that interactions happening at space-like separations should not be correlated.
At last we’ve finally climbed the highest peak in scattering theory, namely \(\eqref{eq_h_commutativity_for_space_like_separations}\). It is specific to the relativistic theory because time-ordering is always preserved in Galilean symmetry. It is also this restriction that eventually leads us to a quantum field theory. In the words of the author
“It is this condition that makes the combination of Lorentz invariance and quantum mechanics so restrictive.” [5]
—S. Weinberg
The Cluster Decomposition Principle#
Nearly all modern texts on quantum field theory use the so-called creation and annihilation operators to describe Hamiltonians, but few explain why it is the way it is. Historically speaking, this formalism grew out of the so-called canonical quantization of electromagnetic fields. But of course we prefer logical reasons over historical ones, and it is the goal of this chapter to explain how this formalism leads to an S-matrix that satisfies the so-called cluster decomposition principle, or in plain words, the principle that distant experiments yield uncorrelated results. This is quite a fundamental assumption to keep because otherwise it’d be impossible to make an “isolated” experiment.
Bosons and fermions#
We will now address an issue we left behind from \(\eqref{eq_many_particles_state_normalization_rough}\), namely, the permutations of particles in a many-particles state
Note that we’ve used the \(3\)-momenta instead of the \(4\)-momenta to label the particles since we implicitly assume that the particles are all living on their mass shells. Moreover, we’ve decided to use the free particles states, which could equally well be in- or out-states. Since there is really no ordering of the particles, it’s conceivable that any swap of two particles should just give the same state back. More precisely, there should exist a phase \(\alpha = \alpha(\pbf, \sigma, n, \pbf', \sigma', n')\), which depends a priori on the swapping particles, such that
Let’s first argue why \(\alpha\) should not depend on the other particles that appear in the states. This is in fact another essence of the cluster decomposition principle, namely, what happens between two particles should not depend on other unrelated particles, which in principle can be arbitrarily far away. Next, we argue that \(\alpha\) should not depend on the spin \(z\)-component (or the helicity for massless particles). This is because \(\alpha\) would otherwise have to furnish a \(1\)-dimensional representation of the rotation group, which, as we’ve seen in Clebsch-Gordan coefficients, doesn’t exist. Finally, if \(\alpha\) were to depend on the momenta of the two swapping particles, then Lorentz invariance would demand that the dependency take the form of \(p^i p'_i\), which is symmetric under the swap. Hence we can conclude, by applying \(\eqref{eq_factor_alpha_for_swapping_two_particles}\) twice, that \(\alpha^2 = 1\).
Warning
The argument above that led to the conclusion \(\alpha^2 = 1\) neglected the possibility that \(\alpha\) depends on the path along which the particles are brought to the momenta \(\pbf_1, \pbf_2\), and so on. We will come back to this point (much) later.
Now the question has become: should \(\alpha\) be \(1\) or \(-1\)? At this point we shall just make up a story as follows. In this world there exist two types of particles, known as bosons and fermions, such that \(\alpha = -1\) if the two swapping particles are both fermions and \(\alpha = 1\) otherwise. This is really a convention rather than any sort of dark magic – we could have from the beginning agreed upon a rule about how the particles should be ordered and always write states in that order. This convention, however, will turn out to be mathematically convenient when we have to deal with symmetries that involve multiple particles, such as the isospin symmetry.
We can now fix the signs in \(\eqref{eq_many_particles_state_normalization_rough}\). For the simplicity of notations, we shall write \(q \coloneqq (\pbf, \sigma, n)\), when details of the particle states are not important, so that a state can be shorthanded as \(\Phi_{q_1 q_2 \cdots q_N}\). In this notation \(\eqref{eq_many_particles_state_normalization_rough}\) can be written as follows
where \(\Pscr: \{1, 2, \cdots, N\} \to \{1, 2, \cdots, N\}\) is a permutation and \(\delta_{\Pscr} = -1\) if and only if \(\Pscr\), written as a product of swaps, contains an odd number of swaps of fermions.
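As a concrete rendering of the sign rule, here is a small sketch (the function and its encoding are ours, not the book’s) that computes \(\delta_{\Pscr}\) as the parity of the permutation induced on the fermionic slots:

```python
def permutation_sign(perm, is_fermion):
    """Sign delta_P for reordering particles 0..N-1 into the order `perm`.

    Only transpositions of two fermions contribute a factor of -1, so the
    sign is the parity of the permutation induced on the fermionic slots,
    i.e. the number of fermion-fermion inversions modulo 2.
    """
    fermions = [p for p in perm if is_fermion[p]]
    inversions = sum(1 for i in range(len(fermions))
                       for j in range(i + 1, len(fermions))
                       if fermions[i] > fermions[j])
    return -1 if inversions % 2 else +1

# swapping a boson (index 0) with a fermion (index 1) gives +1,
# swapping two fermions (indices 1 and 2) gives -1
print(permutation_sign([1, 0, 2], [False, True, True]))   # +1
print(permutation_sign([0, 2, 1], [False, True, True]))   # -1
```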
Creation and annihilation operators#
In a nutshell, creation and annihilation operators provide us a different way to write states like \(\Phi_{q_1 \cdots q_N}\) and to write operators, e.g., the Hamiltonian, that act on the states. In this section, we shall
define the creation and annihilation operators by how they act on particle states,
calculate their (anti-)commutator,
show how to write any operator in terms of the creation and annihilation operators, and
work out the Lorentz and CPT transformation laws.
Definition of the operators#
Let’s start with the creation operator that “creates” a particle with quantum numbers \(q\) as follows
By introducing a special state \(\Phi_{\VAC}\), called the vacuum state, which is a state with no particles, we can express any state as follows
The adjoint of \(a^{\dagger}(q)\), denoted by \(a(q)\), is then the annihilation operator, which “removes” a particle from the state. Unlike the creation operator, which according to \(\eqref{eq_defn_creation_operator}\) always adds to the left of the list of existing particles, the annihilation operator necessarily needs to be able to remove a particle from anywhere in the state due to the permutation symmetry discussed in the previous section. To work out the formula for \(a(q)\), let’s first write down the most general expression as follows
where the hat means that the corresponding term is missing, and \(\sigma(i) = \pm 1\) are the indeterminates that we need to solve for. Next we pair it with a test state \(\Phi_{q'_1 q'_2 \cdots q'_N}\) and calculate the result in two ways. The first is a direct calculation using \(\eqref{eq_indeterminate_annihilation_formula}\)
where \(\Pscr: \{1, 2, \cdots, N-1\} \to \{1, 2, \cdots, \hat{i}, \cdots, N\}\) is a bijection. The second calculation uses the fact that \(a\) and \(a^{\dagger}\) are adjoint operators
A few notes are in order to explain the above calculation
If we think of \(q\) in \(\Phi_{q q'_1 \cdots q'_{N-1}}\) as having index \(0\), then \(\Pscr': \{0, 1, 2, \cdots, N-1\} \to \{1, 2, \cdots, N\}\) is a bijection such that \(\Pscr' 0 = i\) and the rest being the same as \(\Pscr\).
The sign in \(\pm 1\) is positive if \(q\) is a boson and negative if \(q\) is a fermion.
The power \(c_i\) counts the number of fermions among \(q_1, \cdots, q_{i-1}\) because the map \(\Pscr' 0 = i\) can be realized by a product of \(i\) swaps \((0 \leftrightarrow 1)(1 \leftrightarrow 2) \cdots (i-1 \leftrightarrow i)\) and only those swaps with a fermion may contribute a \(-1\).
Comparing the results of the two ways of calculating the same quantity, we see that \(\sigma(i) = (\pm 1)^{c_i}\) and therefore can rewrite \(\eqref{eq_indeterminate_annihilation_formula}\) as follows
Note that \(a(q)\) annihilates the vacuum state \(\Phi_{\VAC}\), whether \(q\) is a boson or a fermion, since there is no state that contains \(-1\) particles.
Note
Although \(a^{\dagger}(q)\) and \(a(q)\) are called the creation and annihilation operators and they indeed appear to add and remove particles from a state, respectively, at least in our framework, it is really more of a mathematical convenience than anything physically realistic – one should not imagine particles getting created or destroyed like magic.
The (anti-)commutation relation#
Just like all the operators we’ve encountered so far, it’ll be important to calculate some sort of commutator \(\left[ a(q'), a^{\dagger}(q) \right]\). The calculation naturally splits into two halves: the first half is as follows
where the power \(c'_i = c_i + 1\) if both \(q\) and \(q'\) are fermions, and \(c'_i = c_i\) otherwise.
The second half is more straightforward and is done as follows
Now we would like to combine the two halves to cancel the blue expressions. More precisely, we need to sum up the two if both \(q\) and \(q'\) are fermions, and subtract the two otherwise. The result can be formulated as follows
where the sign \(\pm\) is positive if both \(q\) and \(q'\) are fermions, and negative otherwise.
Moreover, one can use the definitions \(\eqref{eq_defn_creation_operator}\) and \(\eqref{eq_defn_annihilation_operator}\) to show the following complementary identities
with the same sign convention as in \(\eqref{eq_creation_annihilation_commutator}\).
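For a concrete finite-dimensional check of these relations, here is a sketch using a truncated harmonic-oscillator matrix for a bosonic mode and a Jordan-Wigner-style construction for two fermionic modes; this is not how the operators are built in the text, but it realizes the same (anti-)commutation algebra:

```python
import numpy as np

def anticomm(A, B):
    return A @ B + B @ A

# bosonic mode truncated at occupation n_max: [a, a_dag] = 1 away from the cutoff
n_max = 8
a_b = np.diag(np.sqrt(np.arange(1, n_max)), k=1)
comm = a_b @ a_b.T - a_b.T @ a_b
print(np.allclose(comm[:-1, :-1], np.eye(n_max - 1)))   # True away from the truncation

# two fermionic modes via a Jordan-Wigner sign string
f = np.array([[0., 1.], [0., 0.]])        # single fermionic mode, basis (|0>, |1>)
Z = np.diag([1., -1.])                    # sign string
I2 = np.eye(2)
f1, f2 = np.kron(f, I2), np.kron(Z, f)    # the two modes on the 4-dim Fock space

print(np.allclose(anticomm(f1, f1.T), np.eye(4)))   # {f1, f1_dag} = 1
print(np.allclose(anticomm(f2, f2.T), np.eye(4)))   # {f2, f2_dag} = 1
print(np.allclose(anticomm(f1, f2.T), 0))           # {f1, f2_dag} = 0
print(np.allclose(anticomm(f1, f2), 0))             # {f1, f2} = 0
```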
A universal formula of operators#
We will show that any operator (on states) \(\Ocal\) can be written as a sum of products of creation and annihilation operators as follows
where \(C_{NM}(q'_1, \cdots, q'_N, q_1, \cdots, q_M)\) are the coefficients to be determined. Indeed, the coefficients \(C_{NM}\) can be worked out by induction as follows. The base case is when \(N = M = 0\), where we simply define
Now suppose inductively that \(C_{NM}\) have been defined for all \(N < L, M \leq K\) or \(N \leq L, M < K\). Then one calculates
where the factorials \(L!\) and \(K!\) are included to account for the total permutations of \(q'_1, \cdots, q'_L\) and \(q_1, \cdots q_K\), respectively. The coefficient \(C_{LK}\) is therefore uniquely determined, and hence all \(C_{NM}\) by induction.
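The lowest steps of this induction can be made concrete on a tiny truncated Fock space: reading off \(C_{00}\) and \(C_{11}\) from vacuum and one-particle matrix elements of an arbitrary operator reproduces that operator on the zero- and one-particle sectors, with the higher sectors left to \(C_{22}, C_{21}, \ldots\). The sketch below uses two bosonic modes and a made-up operator:

```python
import numpy as np

n_occ = 3                                   # occupations 0, 1, 2 per mode
ab = np.diag(np.sqrt(np.arange(1, n_occ)), k=1)
I = np.eye(n_occ)
a = [np.kron(ab, I), np.kron(I, ab)]        # annihilation operators of the two modes
dim = n_occ**2

vac = np.zeros(dim); vac[0] = 1.0
one = [ai.T @ vac for ai in a]              # one-particle states a_dag(i) |vac>

rng = np.random.default_rng(2)
O = rng.standard_normal((dim, dim))         # an arbitrary operator to expand

C00 = vac @ O @ vac
C11 = np.array([[one[i] @ O @ one[j] - C00 * (i == j) for j in range(2)]
                for i in range(2)])
O_trunc = C00 * np.eye(dim) + sum(C11[i, j] * a[i].T @ a[j]
                                  for i in range(2) for j in range(2))

# the normal-ordered expansion through (N, M) = (1, 1) matches O on the
# zero- and one-particle sectors
print(np.isclose(vac @ O_trunc @ vac, vac @ O @ vac))
print(np.allclose([[one[i] @ O_trunc @ one[j] for j in range(2)] for i in range(2)],
                  [[one[i] @ O @ one[j] for j in range(2)] for i in range(2)]))
```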
Note
In \(\eqref{eq_general_operator_expansion_in_creation_and_annihilation}\) we’ve chosen a specific ordering of the creation and annihilation operators, known as the normal order. Namely, all the creation operators lie to the left of all the annihilation operators. It has at least the advantage of making \(\eqref{eq_defn_c00}\) obvious. Finally we note that any product of creation and annihilation operators can be brought into normal order by applying \(\eqref{eq_creation_annihilation_commutator}\).
The Lorentz and CPT transformation laws#
Let’s first work out how \(a^{\dagger}(\pbf, \sigma, n)\) and \(a(\pbf, \sigma, n)\) transform under Lorentz transformations. To this end, we recall the general Lorentz transformation law on free-particles state \(\eqref{eq_lorentz_transformation_formula_for_many_free_particles}\), and use \(b\) for the translational parameter to avoid conflict of notations, as follows
where we also recall that \(U_0\) is the Lorentz transformation on free-particles state. Expanding \(\eqref{eq_particles_state_from_creation_operators}\) as follows
and imposing the assumption that the vacuum state is fixed by any Lorentz transformation
we see that \(a^{\dagger}(\pbf, \sigma, n)\) had better transform as follows
where \(\pbf_{\Lambda}\) is the \(3\)-momentum part of \(\Lambda p\). The corresponding transformation law for the annihilation operator \(a(\pbf, \sigma, n)\) can be obtained by taking the adjoint of \(\eqref{eq_lorentz_transformation_formula_for_creation_operator}\) and remembering that \(U_0\) is unitary as follows
The parity, time, and charge symmetry on the creation and annihilation operators can be worked out as follows
where \(\Ccal\) replaces a particle of species \(n\) with its antiparticle – a notion we haven’t really explained yet – \(n^c\). The corresponding transformation laws for \(a(\pbf, \sigma, n)\) can be derived from the above identities by taking the adjoint, and are omitted here.
Cluster decomposition of S-matrix#
We are now ready to formalize, in terms of the S-matrix, the cluster decomposition principle, which states that experiments conducted at places far away from each other – a notion which will be made more precise later in the section – should yield uncorrelated results.
Recall that \(S_{\beta \alpha} \coloneqq \left( \Psi^+_{\beta}, \Psi^-_{\alpha} \right)\) is the amplitude of a reaction with in-state \(\alpha\) and out-state \(\beta\), both of which are compound indexes of many-particles states. Now suppose the particles admit partitions \(\alpha = \alpha_1 \sqcup \cdots \sqcup \alpha_N\) and \(\beta = \beta_1 \sqcup \cdots \sqcup \beta_N\), respectively, so that any particle from \(\alpha_i \cup \beta_i\) is spatially far away from any particle from \(\alpha_j \cup \beta_j\) for \(i \neq j\), or in plain words, the in- and out-state particles form \(N\) clusters that are spatially far away from each other. Then the cluster decomposition principle demands a corresponding splitting of the S-matrix as follows
It is, however, not clear from \(\eqref{eq_s_matrix_splitting}\) what conditions on \(S_{\beta \alpha}\) would guarantee the cluster decomposition principle. To this end, we’ll introduce a recursive expansion of \(S_{\beta \alpha}\) in terms of the so-called “connected parts” \(S^C_{\beta \alpha}\), which can be best understood visually within the framework of Feynman diagrams to be introduced later. Roughly speaking, the idea is to decompose \(S_{\beta \alpha}\) into components that involve only a (possibly full) subset of the particles.
Let’s work out the recursive definitions of \(S^C_{\beta \alpha}\) from bottom up. For simplicity, we’ll in addition make the following assumption
Assumption on particle stability
The single-particle states are stable, i.e., it’s not possible for a single-particle state to transit into any other state, including the vacuum state.
Under this assumption we can define the base case as follows
For a two-body interaction, we can define \(S^C_{q'_1 q'_2,~q_1 q_2}\) by the following identity
where the sign \(\pm\) is negative if both \(q_1\) and \(q_2\) are fermions, and positive otherwise. This should be interpreted as saying that \(S^C_{q'_1 q'_2,~q_1 q_2}\) is the amplitude that the two particles actually interact, and the rest is the amplitude that they do not interact at all. Indeed \(\eqref{eq_defn_connected_part_two_body}\) can be seen as another incarnation of \(\eqref{eq_s_matrix_with_m}\).
This procedure can be continued to, say, a three-body problem as follows
and in general by recursion as follows
where the sum is taken over all nontrivial partitions of \(\alpha\) and \(\beta\).
The upshot of all the trouble of writing \(S_{\beta \alpha}\) as a sum of connected parts is that the cluster decomposition principle \(\eqref{eq_s_matrix_splitting}\) can now be rephrased as \(S^C_{\beta \alpha} = 0\) if any two particles in \(\alpha \cup \beta\) are (spatially) far apart. This can be illustrated by the following example of a four-body reaction \(1234 \to 1'2'3'4'\) such that the particles \(\{1,2,1',2'\}\) are far away from \(\{3,4,3',4'\}\). With obvious shorthand notations, we have
as expected.
Finally, let’s quantify the phrase “spatially far apart”, which we’ve been using so far without much explanation. Since our states are defined in the momentum space, we first need to translate the amplitude into spatial coordinates using Fourier transform as follows
where we’ve kept only the momentum of each particle because the other quantum numbers do not play a role in the current discussion.
Now if \(S^C_{\pbf'_1 \pbf'_2 \cdots,~\pbf_1 \pbf_2 \cdots}\) were (reasonably) smooth, e.g., Lebesgue integrable, then the Riemann-Lebesgue lemma asserts that the left-hand-side \(S^C_{\xbf'_1 \xbf'_2 \cdots,~\xbf_1 \xbf_2 \cdots}\) vanishes as any linear combination of \(\xbf'_1, \xbf'_2, \cdots, \xbf_1, \xbf_2, \cdots\) goes to infinity. This constraint is slightly too strong because we know from translational invariance that the left-hand-side shouldn’t change under an overall translation, no matter how large it is. To remedy this defect, we introduce the factors of energy and momentum conservation delta functions in \(S^C_{\pbf'_1 \pbf'_2 \cdots,~\pbf_1 \pbf_2 \cdots}\) as follows
which guarantees that an overall translation will not change the integral. Moreover, we see that, in fact, the remaining \(C_{\pbf'_1 \pbf'_2 \cdots,~\pbf_1 \pbf_2 \cdots}\) cannot contain any more delta functions of the momenta, such as \(\delta^3(\pbf'_1 - \pbf_1)\), because otherwise one could translate a subset of the particles, e.g., \(\{\xbf'_1, \xbf_1\}\), far away from the others, while keeping their relative positions fixed, and hence leave \(S^C_{\xbf'_1 \xbf'_2 \cdots,~\xbf_1 \xbf_2 \cdots}\) unchanged. But this would violate the cluster decomposition principle. All in all, we’ve arrived at the following key conclusion
The cluster decomposition principle is equivalent to the condition that every connected parts of the S-matrix contain exactly one momentum-conservation delta function.
at least under the assumption on particle stability.
Cluster decomposable Hamiltonians#
We have seen from the previous section how the cluster decomposition principle is equivalent to a “single momentum-conservation delta function” condition on the connected parts of the S-matrix. The goal of this section is to further translate this condition to one on the Hamiltonian, as a prelude to the next chapter. Writing the Hamiltonian in its most general form as follows
we claim that the S-matrix corresponding to \(H\) satisfies the cluster decomposition principle if each of the coefficients \(h_{NM}\) contains exactly one momentum-conservation delta function.
To this end, we’ll use the time-dependent perturbation of the S-matrix given by \(\eqref{eq_s_matrix_power_series_expansion_time_ordered}\), rephrased in terms of the matrix entries as follows
where we also recall \(V(t) \coloneqq \exp(\ifrak H_0 t) V \exp(-\ifrak H_0 t)\) and \(H = H_0 + V\). Now we remember the following facts
Both \(\Phi_{\alpha}\) and \(\Phi_{\beta}\) can be written as a number of creation operators applied to \(\Phi_{\VAC}\).
The creation operators from \(\Phi_{\beta}\) can be moved to the front of \(T\{V(t_1) \cdots V(t_n)\} \Phi_{\alpha}\) by taking adjoints, where they become annihilation operators.
Each \(V(t)\) can be written as a polynomial in creation and annihilation operators as in \(\eqref{eq_general_operator_expansion_in_creation_and_annihilation}\).
Creation and annihilation operators satisfy the canonical commutation relations \(\eqref{eq_creation_annihilation_commutator}\), which can be reorganized as \(a(q') a^{\dagger}(q) = \pm a^{\dagger}(q) a(q') + \delta(q' - q)\) to highlight the effect of moving a creation operator to the left over an annihilation operator – it produces a delta function. Moreover, since we’ll only care about the momenta in this section, we’ll assume that all the Kronecker deltas of discrete quantum numbers have already been summed up, so that \(\delta(q' - q)\) becomes really \(\delta(p' - p)\), or even \(\delta(\pbf' - \pbf)\) assuming the particles are on their mass shells.
One particularly convenient way to calculate \(\eqref{eq_time_dependent_perturbative_s_matrix_element}\) is to move all the creation operators to the left of the annihilation operators, while collecting all the delta functions along the way. In the end, the only nonzero term is a vacuum expectation value which is a polynomial of these delta functions. In other words, none of the creation and annihilation operators survive, because otherwise, by construction, the rightmost operator would necessarily be an annihilation operator, which would then vanish when acting on \(\Phi_{\VAC}\). This procedure can be rather messy, but luckily there already exists a convenient bookkeeping device, known as Feynman diagrams. We will encounter Feynman diagrams many times going forward as they are such a convenient tool, and here we’ll just focus on the momentum conservations.
Let’s write \(S^{(n)}_{\beta \alpha} \coloneqq \left( \Phi_{\beta}, T\{V(t_1) \cdots V(t_n)\} \Phi_{\alpha} \right)\). Then the general recipe for constructing a Feynman diagram to keep track of the delta functions in \(S^{(n)}_{\beta \alpha}\) consists of the following steps.
Warning
The Feynman diagram described here will not be very useful in quantum-theoretic calculations since, as we’ll see in the next chapter, the interaction densities can only be constructed out of quantum fields, rather than the individual annihilation and creation operators. The only purpose of the diagrams to be described below is to illustrate the cluster decomposition principle in terms of the Hamiltonian. The generally useful Feynman diagrams will be introduced in The Feynman Rules.
Orient the paper on which the diagram will be drawn so that the time direction goes upwards. (This is of course a random choice just to set up the scene.)
Place as many vertical strands as the number of particles in \(\Phi_{\alpha}\) at the bottom of the page. Do the same to \(\Phi_{\beta}\) but at the top of the page. In actual Feynman diagrams the strands will be oriented according to the flow of time (and a distinction between particles and antiparticles), but it’s not important for our purposes here – we will only use the orientations to guarantee that an annihilation operator is paired with a creation operator. For definiteness, let’s orient these strands to point upwards.
Place one vertex (i.e., a fat dot) for each \(V(t)\), or more precisely, each monomial in annihilation and creation operators in \(V(t)\), with as many incoming strands as the number of annihilation operators, and as many outgoing strands as the number of creation operators. Note that since we’ll only be doing a momentum conservation analysis, the details of \(V(t)\), e.g., the coefficients in front of the products of the annihilation and creation operators, which will be worked out in the next chapter, are not important here.
To each strand from the previous two steps, associate a \(3\)-momentum \(\pbf\) of the corresponding particle. To each vertex, associate a momentum-conservation delta function so that the total incoming momenta equals the total outgoing momenta. This is in fact a consequence of an assumption on \(V(t)\), which is related to our assumption on \(h_{NM}\) in \(\eqref{eq_general_expansion_of_hamiltonian}\), and will be elaborated on later in this section.
Connect all the loose ends of the strands in a way that is compatible with the orientations on the strands.
To each edge from the previous step, which connects two strands with momenta \(\pbf\) and \(\pbf'\), respectively, associate a delta function \(\delta(\pbf' - \pbf)\), which comes from the canonical commutation relation \(\eqref{eq_creation_annihilation_commutator}\).
Finally \(S^{(n)}_{\beta \alpha}\) is simply a sum over products of delta functions, one for each such diagram.
As an example to illustrate the above steps, let’s consider a five-body scattering with all particles identical to each other except for their momenta as follows
where the coefficient of \(V(t)\), except for the momentum conserving delta function, is suppressed. The figure below illustrates a few summands of the third order \(S^{(3)}_{\beta \alpha} = \left( \Phi_{\beta}, T\{V(t_1)V(t_2)V(t_3)\}\Phi_{\alpha} \right)\).
Here we’ve used shorthand notations such as \(\delta_{1,6'} \coloneqq \delta^3(\pbf_1 - \pbf'_6)\) and \(\delta_{8,9 - 6,7} \coloneqq \delta^3(\pbf_8 + \pbf_9 - \pbf_6 - \pbf_7)\).
Notice that some summands of \(S^{(n)}_{\beta \alpha}\) are connected, in the sense that the graph, viewed as undirected, is connected, and some, for example the second one, are disconnected. For the disconnected ones, it’s obvious that the product of the delta functions splits into a product of products, one for each connected component. Therefore we can rewrite \(S^{(n)}_{\beta \alpha}\) as follows
where the sum is taken over all partitions \(\op{PART}'\) of \(\alpha = \sqcup_{j=1}^{\kappa} \alpha_j\), \(\beta = \sqcup_{j=1}^{\kappa} \beta_j\) and \(\{1,2,\cdots,n\} = \sqcup_{j=1}^{\kappa} \{j1, \cdots, jn_j\}\), and the subscript \(C\) indicates that only connected diagrams are allowed. Such refactorization is possible because in the eventual evaluation of a Feynman diagram, such as the ones worked out in the above example, different components are essentially independent of each other.
When evaluating \(\eqref{eq_time_dependent_perturbative_s_matrix_element}\) using Feynman diagrams as explained above, we note that all the \(V(t)\)’s in \(T\{V(t_1) \cdots V(t_n)\}\) are interchangeable. It follows that a partition \(\op{PART}'\) can be split up into a partition \(\op{PART}\) of \(\alpha\) and \(\beta\), respectively, into \(\kappa\) clusters, and, modulo permutations, a partition \(n = n_1 + \cdots + n_{\kappa}\). We can then evaluate \(\eqref{eq_time_dependent_perturbative_s_matrix_element}\) as follows
Comparing with \(\eqref{eq_s_matrix_recursive_by_connected_parts}\), we see that
which also justifies calling \(S^C_{\beta \alpha}\) a connected part of \(S_{\beta \alpha}\), because it corresponds to connected Feynman diagrams.
It remains to argue that \(S^C_{\beta \alpha}\) contains exactly one momentum-conservation delta function, assuming that the same applies to the coefficients \(h_{NM}\) in \(\eqref{eq_general_expansion_of_hamiltonian}\). Indeed, since \(H = H_0 + V\) and the single momentum-conservation condition holds automatically true for \(H_0\), the same holds for \(V\). In other words, each vertex in the Feynman diagram produces one single momentum-conservation delta function, as promised earlier.
Finally, we note that the fact that each connected Feynman diagram gives rise to exactly one momentum-conservation delta function is the consequence of an elimination process. More precisely, we can first get rid of delta functions on the internal edges, i.e., edges between vertices, by equating the momenta on the two ends of the edge. So we’re left with as many delta functions as there are vertices. Then for each internal edge, we can choose one of its endpoints, and solve the associated momentum in terms of the other momenta. Hence we can get rid of all but one delta function, as claimed.
An example elimination process is presented for the following Feynman diagram.
We skip the first step of equating momenta on the ends of edges, and continue with the conservation delta functions on the vertices as follows
where we’ve first eliminated \(\pbf_3\) using the first delta function, and then \(\pbf_5\) using either one of the two remaining delta functions, to arrive at the final result.
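The same elimination can be mimicked symbolically; in the sketch below the momenta are treated as one-dimensional symbols and the two-vertex diagram is made up, but the bookkeeping is the same: solving for the internal momentum leaves exactly one overall conservation constraint:

```python
import sympy as sp

# Hypothetical connected diagram with two vertices joined by one internal edge:
# vertex A absorbs p1, p2 and emits the internal momentum k;
# vertex B absorbs k and emits p3, p4.
p1, p2, p3, p4, k = sp.symbols("p1 p2 p3 p4 k")

deltas = [sp.Eq(k, p1 + p2),          # conservation delta at vertex A
          sp.Eq(p3 + p4, k)]          # conservation delta at vertex B

sol = sp.solve(deltas[0], k, dict=True)[0]                  # eliminate the internal momentum
remaining = [d.subs(sol) for d in deltas if d.subs(sol) != True]
print(sol)        # {k: p1 + p2}
print(remaining)  # [Eq(p3 + p4, p1 + p2)]: exactly one overall delta survives
```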
Quantum Fields and Antiparticles#
In this chapter we will construct the Hamiltonians in the form of \(H = H_0 + V\), where \(H_0\) is the Hamiltonian of free particles, and \(V = \int d^3 x~\Hscr(0, \xbf)\) is a (small) interaction term in the form of \(\eqref{eq_defn_v_by_density}\), and the interaction density \(\Hscr(x)\) is a Lorentz scalar in the sense of \(\eqref{eq_h_density_is_scalar}\) and satisfies the cluster decomposition principle \(\eqref{eq_h_commutativity_for_space_like_separations}\). As a byproduct of the construction, we’ll also demystify the so-called antiparticles which have been mentioned a number of times so far without definition.
Symmetries and quantum fields#
Following the discussions in The Cluster Decomposition Principle, we’ll construct \(\Hscr(x)\) out of creation and annihilation operators. However, as we’ve seen in The Lorentz and CPT transformation laws, the Lorentz transformations of both \(a^{\dagger}(q)\) and \(a(q)\) involve coefficients given by a matrix element \(D_{\sigma \sigma'}(W(\Lambda, p))\) that depends on the momenta of the particles, and hence are not scalars. The idea, then, is to construct \(\Hscr(x)\) out of the so-called creation and annihilation fields defined as follows
where \(\ell\) is reserved for labeling particles later. We see, in particular, that the creation field \(\psi_{\ell}^-(x)\) is a superposition of creation operators. Applying it to the vacuum state and letting \(x\) wander around the whole spacetime then creates a (quantum) field.
Note
Just like particles, fields may be either bosonic or fermionic, but not mixed. For example, \(\psi^-_{\ell}(x)\) is bosonic/fermionic if and only if all the particles created by \(a^{\dagger}\) in \(\eqref{eq_defn_creation_field}\) are bosonic/fermionic, respectively.
Lorentz symmetries#
Now the hope is that the creation and annihilation fields transform, under (proper orthochronous) Lorentz symmetry, by a matrix that is independent of the spacetime coordinate \(x\). More precisely, we’d like the following to hold
Note that we’ve put \(\Lambda^{-1}\) inside \(D_{\ell \ell'}\) so that \(D\) furnishes a representation of the homogeneous Lorentz transformations in the sense that \(D(\Lambda_1) D(\Lambda_2) = D(\Lambda_1 \Lambda_2)\). [6] There is a priori no reason to use the same representation \(D\) for both \(\psi^{\pm}_{\ell}\), but this turns out to be possible just by calculation. Moreover, the representation \(D\) is not assumed to be irreducible. Indeed, as we’ll see later, it generally decomposes into blocks fixed by further labels.
Then we can try to construct \(\Hscr(x)\) by a formula similar to \(\eqref{eq_general_expansion_of_hamiltonian}\) as follows
It follows from \(\eqref{eq_conjugate_annihilation_field}\) and \(\eqref{eq_conjugate_creation_field}\) that the interaction density defined by \(\eqref{eq_construct_interaction_density_by_fields}\) is a scalar if the following holds
The solution to the last problem relies on a classification of the representations of the Lorentz group, which has been discussed (in terms of the little group representations) in One-Particle States, and shall be dealt with at a later point. The main goal of this section is to pin down the conditions \(u_{\ell}\) and \(v_{\ell}\) must satisfy so that they can be explicitly solved in the following sections.
For this section, we’ll focus on the massive case. Recall the Lorentz transformation laws \(\eqref{eq_lorentz_transformation_formula_for_creation_operator}, \eqref{eq_lorentz_transformation_formula_for_annihilation_operator}\) for the creation and annihilation operators of massive particles as follows
where we’ve used the fact that \(D\) is unitary in the sense that \(D^{\dagger} = D^{-1}\) to invert \(W(\Lambda, p)\) (and flip the indexes) for later convenience – mostly because of the use of \(\Lambda^{-1}\) in \(\eqref{eq_conjugate_annihilation_field}\) and \(\eqref{eq_conjugate_creation_field}\).
Applying \(\eqref{eq_lorentz_transformation_formula_for_annihilation_operator_revisited}\) to \(\eqref{eq_defn_annihilation_field}\), we get
where in the last equality we’ve used the fact \(\eqref{eq_lorentz_invariant_3_momentum_volume_element}\) that \(d^3 p / p_0\) is Lorentz invariant. Comparing this with the right-hand-side of \(\eqref{eq_conjugate_annihilation_field}\)
Equating the blue parts of the two calculations, and inverting \(D^{(j_n)}_{\sigma \sigma'} (W^{-1}(\Lambda, p))\) and \(D_{\ell \ell'}(\Lambda^{-1})\), we get
A parallel calculation for the \(v_{\ell}\) in \(\eqref{eq_conjugate_creation_field}\), which we’ll omit, gives the following
The identities \(\eqref{eq_annihilation_u_transformation}\) and \(\eqref{eq_creation_v_transformation}\) pose the fundamental conditions on \(u_{\ell}\) and \(v_{\ell}\), respectively, which we’ll eventually use to solve for them. Currently both \(u_{\ell}\) and \(v_{\ell}\) depend on \(x, \pbf, \sigma\) and \(n\), and the goal is to use the Lorentz symmetry to reduce the dependencies. This will be carried out in three steps, corresponding to the three types of Lorentz symmetries: translations, boosts, and rotations, as follows.
- Translations
Taking \(\Lambda = 1\) in \(\eqref{eq_annihilation_u_transformation}\) gives \(\exp(\ifrak b \cdot p) u_{\ell}(x;~\pbf, \sigma, n) = u_{\ell}(x + b;~\pbf, \sigma, n)\), which then implies
\begin{equation} u_{\ell}(x;~\pbf, \sigma, n) = (2\pi)^{-3/2} e^{\ifrak p \cdot x} u_{\ell}(\pbf, \sigma, n) \label{eq_redefine_u_after_translation} \end{equation}and similarly
\begin{equation} v_{\ell}(x;~\pbf, \sigma, n) = (2\pi)^{-3/2} e^{-\ifrak p \cdot x} v_{\ell}(\pbf, \sigma, n) \label{eq_redefine_v_after_translation} \end{equation}where we’ve slightly abused notations by keeping the names of \(u_{\ell}\) and \(v_{\ell}\), while changing their arguments. Here the seemingly redundant \((2\pi)^{-3/2}\) is inserted so that the fields
\begin{align} \psi^+_{\ell}(x) &= \sum_{\sigma, n} (2\pi)^{-3/2} \int d^3 p~e^{\ifrak p \cdot x} u_{\ell}(\pbf, \sigma, n) a(\pbf, \sigma, n) \label{eq_annihilation_field_simplified_by_translation} \\ \psi^-_{\ell}(x) &= \sum_{\sigma, n} (2\pi)^{-3/2} \int d^3 p~e^{-\ifrak p \cdot x} v_{\ell}(\pbf, \sigma, n) a^{\dagger}(\pbf, \sigma, n) \label{eq_creation_field_simplified_by_translation} \end{align}look like the usual Fourier transforms.
Plugging \(\eqref{eq_redefine_u_after_translation}\) and \(\eqref{eq_redefine_v_after_translation}\) into \(\eqref{eq_annihilation_u_transformation}\) and \(\eqref{eq_creation_v_transformation}\), respectively, we can simplify the latter two as follows
\begin{align} \sqrt{\frac{p_0}{(\Lambda p)_0}} \sum_{\ell} D_{\ell' \ell}(\Lambda) u_{\ell}(\pbf, \sigma, n) &= \sum_{\sigma'} u_{\ell'}(\pbf_{\Lambda}, \sigma', n) D^{(j_n)}_{\sigma' \sigma}(W(\Lambda, p)) \label{eq_annihilation_u_transformation_simplified_by_translation} \\ \sqrt{\frac{p_0}{(\Lambda p)_0}} \sum_{\ell} D_{\ell' \ell}(\Lambda) v_{\ell}(\pbf, \sigma, n) &= \sum_{\sigma'} v_{\ell'}(\pbf_{\Lambda}, \sigma', n) D^{(j_n) \ast}_{\sigma' \sigma}(W(\Lambda, p)) \label{eq_creation_v_transformation_simplified_by_translation} \end{align}for any homogeneous Lorentz transformation \(\Lambda\).
- Boosts
Taking \(\pbf = 0\) and \(\Lambda = L(q)\) which takes a particle at rest to one with (arbitrary) momentum \(q\), we see that
\begin{equation*} W(\Lambda, p) \xlongequal{\eqref{eq_w_from_l}} L(\Lambda p)^{-1} \Lambda L(p) = L(q)^{-1} L(q) = 1 \end{equation*}In this case \(\eqref{eq_annihilation_u_transformation_simplified_by_translation}\) and \(\eqref{eq_creation_v_transformation_simplified_by_translation}\) take the following form (with \(\qbf\) substituted by \(\pbf\))
\begin{align} \sqrt{\frac{m}{p_0}} \sum_{\ell} D_{\ell' \ell}(L(p)) u_{\ell}(0, \sigma, n) &= u_{\ell'}(\pbf, \sigma, n) \label{eq_annihilation_u_transformation_simplified_by_boost} \\ \sqrt{\frac{m}{p_0}} \sum_{\ell} D_{\ell' \ell}(L(p)) v_{\ell}(0, \sigma, n) &= v_{\ell'}(\pbf, \sigma, n) \label{eq_creation_v_transformation_simplified_by_boost} \end{align}It follows that one can calculate \(u_{\ell}(\pbf, \sigma, n)\) for any \(\pbf\) from the special case of \(\pbf = 0\) given a representation \(D\).
- Rotations
Taking \(\pbf = 0\) and \(\Lambda = \Rcal\) a \(3\)-rotation, and recalling from \(\eqref{eq_little_group_rotation}\) that \(W(\Lambda, p) = \Rcal\), we get special cases of \(\eqref{eq_annihilation_u_transformation_simplified_by_translation}\) and \(\eqref{eq_creation_v_transformation_simplified_by_translation}\) as follows
\begin{align*} \sum_{\ell} D_{\ell' \ell}(\Rcal) u_{\ell}(0, \sigma, n) &= \sum_{\sigma'} u_{\ell'}(0, \sigma', n) D_{\sigma' \sigma}^{(j_n)}(\Rcal) \\ \sum_{\ell} D_{\ell' \ell}(\Rcal) v_{\ell}(0, \sigma, n) &= \sum_{\sigma'} v_{\ell'}(0, \sigma', n) D_{\sigma' \sigma}^{(j_n) \ast}(\Rcal) \end{align*}Using \(\eqref{eq_representation_rotation_first_order}\) we can further reduce it to the first order as follows
\begin{align} \sum_{\ell} \hat{\Jbf}_{\ell' \ell} u_{\ell}(0, \sigma, n) &= \sum_{\sigma'} u_{\ell'}(0, \sigma', n) \Jbf^{(j_n)}_{\sigma' \sigma} \label{eq_j_intertwines_u} \\ \sum_{\ell} \hat{\Jbf}_{\ell' \ell} v_{\ell}(0, \sigma, n) &= -\sum_{\sigma'} v_{\ell'}(0, \sigma', n) \Jbf^{(j_n) \ast}_{\sigma' \sigma} \label{eq_j_intertwines_v} \end{align}where \(\hat{\Jbf}\) denotes the angular momentum vector for the representation \(D_{\ell' \ell}(\Rcal)\), in analogy with the usual angular momentum \(\Jbf^{(\jfrak)}\) for \(D^{(j)}(\Rcal)\).
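As a quick sanity check of \(\eqref{eq_j_intertwines_u}\), here is a minimal numerical sketch, anticipating the spin-\(1\) vector field discussed later: \(D\) is taken to be the tautological \(4\)-vector representation (so \(\hat{\Jbf}\) acts only on the spatial components), and the \(u_{\ell}(0, \sigma)\) are taken to be the standard spherical polarization vectors. The Condon–Shortley phases assumed below are a convention that needs to match \(\eqref{eq_j1_j2_matrix}\) and \(\eqref{eq_j3_matrix}\).

```python
import numpy as np

# spin-1 matrices J^{(1)} in the basis sigma = +1, 0, -1
# (standard Condon-Shortley conventions, assumed to match eq_j1_j2_matrix / eq_j3_matrix)
J1 = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=complex) / np.sqrt(2)
J2 = np.array([[0, -1j, 0], [1j, 0, -1j], [0, 1j, 0]], dtype=complex) / np.sqrt(2)
J3 = np.diag([1.0, 0.0, -1.0]).astype(complex)

# rotation generators J-hat of the 4-vector representation, index order (t, x, y, z)
eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k], eps[i, k, j] = 1.0, -1.0
Jhat = np.zeros((3, 4, 4), dtype=complex)
Jhat[:, 1:, 1:] = -1j * eps

# u(0, sigma): the usual spherical polarization 4-vectors at rest
u0 = {+1: np.array([0, -1, -1j, 0]) / np.sqrt(2),
       0: np.array([0, 0, 0, 1], dtype=complex),
      -1: np.array([0, 1, -1j, 0]) / np.sqrt(2)}

# sum_l Jhat_{l'l} u_l(0, sigma) = sum_{sigma'} u_{l'}(0, sigma') J^{(1)}_{sigma' sigma}
sigmas = [+1, 0, -1]
for k, Jk in enumerate((J1, J2, J3)):
    for col, sigma in enumerate(sigmas):
        lhs = Jhat[k] @ u0[sigma]
        rhs = sum(u0[sp] * Jk[row, col] for row, sp in enumerate(sigmas))
        assert np.allclose(lhs, rhs)
print("eq_j_intertwines_u holds for the vector representation with spin-1 particles")
```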
The cluster decomposition principle#
Let’s verify that the fields defined by \(\eqref{eq_annihilation_field_simplified_by_translation}\) and \(\eqref{eq_creation_field_simplified_by_translation}\), when plugged into \(\eqref{eq_construct_interaction_density_by_fields}\), indeed satisfy the cluster decomposition principle as discussed in The Cluster Decomposition Principle. It’s really just a straightforward but tedious calculation which we spell out as follows
Besides re-ordering the terms, the only actual calculation is highlighted in the two blue terms, where the second one is the integral of the first. One can compare this calculation with \(\eqref{eq_general_expansion_of_hamiltonian}\) and see that the cluster decomposition principle is indeed satisfied because there is a unique momentum conservation delta function in each coefficient, as long as \(g, u, v\) are reasonably smooth, i.e., it’s fine to have poles and/or branch-point singularities but no delta functions.
Causality and antiparticles#
We now turn to the other crucial condition on the Hamiltonian, namely, the causality condition \(\eqref{eq_h_commutativity_for_space_like_separations}\). Given the general formula \(\eqref{eq_construct_interaction_density_by_fields}\) of the interaction density, we are forced to require that \([\psi^+_{\ell}(x), \psi^-_{\ell'}(y)] = 0\) whenever \(x - y\) is space-like. However, according to \(\eqref{eq_annihilation_field_simplified_by_translation}\) and \(\eqref{eq_creation_field_simplified_by_translation}\), we have
where the sign \(\pm\) is positive if the field is fermionic, and negative otherwise. This quantity is not necessarily vanishing even if \(x - y\) is space-like.
Therefore in order to construct \(\Hscr\) in the form of \(\eqref{eq_construct_interaction_density_by_fields}\) that satisfies \(\eqref{eq_h_commutativity_for_space_like_separations}\), we must not just use \(\psi^{\pm}(x)\) as the building blocks. It turns out that one may consider a linear combination of the two as follows
as well as its adjoint \(\psi^{\dagger}_{\ell}(x)\), and hope that they satisfy
whenever \(x-y\) is space-like, and replace \(\psi^{\pm}_{\ell}(x)\) with \(\psi_{\ell}(x), \psi^{\dagger}_{\ell}(x)\) in \(\eqref{eq_construct_interaction_density_by_fields}\). Under these assumptions, we can then construct the interaction density \(\Hscr\) as a polynomial in \(\psi_{\ell}(x), \psi_{\ell}^{\dagger}(x)\) with an even number of fermionic fields (so that the sign in \(\eqref{eq_space_like_commutativity_for_combined_field}\) is negative).
There remains, however, one issue with a field like \(\eqref{eq_defn_psi_field}\) that mixes creation and annihilation fields. Namely, special conditions must hold in order for such fields to play well with conserved quantum numbers. To be more specific, let \(Q\) be a conserved quantum number, e.g., the electric charge. Then the following hold
where \(q(n)\) denotes the quantum number of the particle species \(n\). These identities can be verified by applying both sides to \(\Psi_{\pbf, \sigma, n}\) and \(\Psi_{\VAC}\), respectively.
Now in order for \(Q\) to commute with \(\Hscr\), which is constructed as a polynomial of \(\psi_{\ell}(x)\) and \(\psi_{\ell}^{\dagger}(x)\), we better have
so that each monomial (with coefficient neglected) \(\psi^{\dagger}_{\ell'_1}(x) \cdots \psi^{\dagger}_{\ell'_M}(x) \psi_{\ell_1}(x) \cdots \psi_{\ell_N}(x)\) in \(\Hscr\) will commute with \(Q\) if
Note that the negative sign in \(\eqref{eq_charge_of_psi_field}\) is a formal analogy to \(\eqref{eq_charge_of_annihilation_operator}\), where we think of \(\psi_{\ell}(x)\) as an annihilation field even though it’s really not. Since \(\psi_{\ell}(x)\) is a linear combination of \(\psi^+_{\ell}(x)\) and \(\psi^-_{\ell}(x)\), which in turn are superpositions of annihilation and creation operators, respectively, it follows from \(\eqref{eq_charge_of_creation_operator}\) and \(\eqref{eq_charge_of_annihilation_operator}\) that in order for \(\eqref{eq_charge_of_psi_field}\) to hold, the following conditions must be satisfied
all particles annihilated by \(\psi^+_{\ell}(x)\) must have the same charge \(q(n) = q_{\ell}\),
all particles created by \(\psi^-_{\ell}(x)\) must have the same charge \(q(n) = -q_{\ell}\), and
for any particle of species \(n\), which is annihilated by \(\psi^+_{\ell}(x)\), there exists a particle of species \(\bar{n}\), which is created by \(\psi^-_{\ell}(x)\), such that \(q(n) = -q(\bar{n})\).
The particles of species \(n\) and \(\bar{n}\) are called antiparticles of each other – they are exactly the same except for the charges which are opposite. It is the last condition that demands the existence of particle-antiparticle pairs so that one can formulate a consistent (relativistic) quantum field theory.
Scalar fields#
We’ll start, as always, with the simplest case of scalar fields, namely, when \(\psi^+(x) = \psi^+_{\ell}(x)\) and \(\psi^-(x) = \psi^-_{\ell}(x)\) are scalar functions. We argue first that such fields can only create/annihilate spinless particles. Indeed, since \(\hat{\Jbf}\) necessarily vanishes, it follows from \(\eqref{eq_j_intertwines_u}\) and \(\eqref{eq_j_intertwines_v}\) that \(u\) and \(v\) may be nonzero if and only if \(j_n = 0\). If we, for the moment, are concerned with just one particle species, then we can write \(u(\pbf, \sigma, n) = u(\pbf)\) and \(v(\pbf, \sigma, n) = v(\pbf)\). Lastly, we note that since \(D = 1\) in this case, \(\eqref{eq_annihilation_u_transformation_simplified_by_translation}\) and \(\eqref{eq_creation_v_transformation_simplified_by_translation}\) become
It follows that
where the factor \(2\) is purely conventional. In particular \(u(0) = v(0) = (2m)^{-1/2}\).
Plugging \(\eqref{eq_scalar_u_and_v}\) into \(\eqref{eq_annihilation_field_simplified_by_translation}, \eqref{eq_creation_field_simplified_by_translation}\), we get
To construct \(\Hscr\), we first note that \(\eqref{eq_coefficient_g_transformation_law}\) trivially holds in this case because \(D = 1\) and \(g\) is simply a scalar. Next let’s consider the causality condition which demands that \(\left[ \psi^+(x), \psi^-(y) \right] = 0\) whenever \(x - y\) is space-like. Using the canonical commutation relation \(\eqref{eq_creation_annihilation_commutator}\) we calculate
where
We notice that \(\Delta_+(x)\) is manifestly (proper orthochronous) Lorentz invariant – the invariance of the volume element comes from \(\eqref{eq_lorentz_invariant_3_momentum_volume_element}\). It is, however, not in general invariant under transformations like \(x \to -x\). But, as we’ll see, such invariance holds assuming \(x\) is space-like.
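As a quick numerical sanity check of this space-like behavior (and of the explicit evaluation carried out below), here is a minimal sketch, assuming \(\Delta_+(x) = (2\pi)^{-3} \int \frac{d^3 p}{2 p_0} e^{\ifrak p \cdot x}\) as defined above, together with the standard closed form \(\Delta_+(0, \xbf) = m K_1(m |\xbf|) / (4 \pi^2 |\xbf|)\) in terms of the modified Bessel function \(K_1\); the values \(m = 1\) and \(|\xbf| = 1.5\) are chosen purely for illustration.

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import k1

m, r = 1.0, 1.5   # illustrative values in natural units

# At x = (0, xbf) with r = |xbf|, the angular integration gives
#   Delta_+(0, xbf) = (4 pi^2 r)^{-1} Integral_0^inf dp p sin(p r) / sqrt(p^2 + m^2).
# Split off the non-decaying part of the amplitude, p / sqrt(p^2 + m^2) = 1 + g(p),
# and use the Abel-regularized Integral_0^inf sin(p r) dp = 1 / r for that part.
g = lambda p: p / np.sqrt(p**2 + m**2) - 1.0
tail, _ = quad(g, 0.0, np.inf, weight='sin', wvar=r)
delta_plus = (1.0 / r + tail) / (4 * np.pi**2 * r)

closed_form = m * k1(m * r) / (4 * np.pi**2 * r)
print(delta_plus, closed_form)   # the two numbers should agree to many digits
```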
Note
The plus subscript in \(\Delta_+(x)\) is there to distinguish it from an anti-symmetrized version \(\Delta(x)\) to be introduced later.
Now we’ll restrict ourselves to the special case of a space-like \(x\) which, up to a Lorentz transformation, can be assumed to take the form \(x = (0, \xbf)\) with \(|\xbf| > 0\). In this case, we can then calculate \(\Delta_+(x)\) as follows [7]
The last integral cannot be easily evaluated, at least without some knowledge about special functions. Nonetheless, we observe that \(\Delta_+(x) \neq 0\), which means that \(\Hscr\) cannot be just any polynomial in \(\psi^{\pm}(x)\). Moreover, we note that \(\Delta_+(x) = \Delta_+(-x)\) as promised earlier. As already mentioned in \(\eqref{eq_defn_psi_field}\), let’s try
We can then try to make \(\eqref{eq_space_like_commutativity_for_combined_field}\) hold by the following calculations
We see that \(\eqref{eq_space_like_commutativity_for_combined_field}\) holds for scalar fields if the fields are bosonic, i.e., the bottom sign in \(\pm\) applies, and \(|\kappa| = |\lambda|\). By adjusting the phase of \(a(\pbf)\), we can actually arrange so that \(\kappa = \lambda\), in which case we have
Note
Although the arrangement of phase so that \(\kappa = \lambda\) is a mere convention, it’s a convention that needs to be applied to all scalar fields appearing in \(\Hscr\). Namely, one cannot have both \(\psi(x)\) as in \(\eqref{eq_scalar_field_psi_fixed_phase}\) and another
for some \(\theta\), because \(\psi(x)\) won’t commute with \(\psi'(y)\) even if \(x - y\) is space-like.
Now if the particle created and annihilated by \(\psi(x)\) carries a (non-vanishing) conserved quantum number \(Q\), then by the discussions on the charge conservation from the previous section, a density \(\Hscr\) made up of \(\psi(x)\) as defined by \(\eqref{eq_scalar_field_psi_fixed_phase}\) will not commute with \(Q\). Instead, one must assume the existence of a field \(\psi^{+ c}(x)\) that creates and annihilates the corresponding antiparticle, in the sense that
Here the superscript \(c\) stands for charge (conjugation). Now instead of \(\eqref{eq_scalar_field_first_defn_of_psi}\), let’s try
so that \([Q, \psi(x)] = -q \psi(x)\). We calculate the commutators, assuming the antiparticle is different from the particle, just as before as follows
where we’ve assumed, in particular, that the particle and its antiparticle share the same mass so that \(\eqref{eq_scalar_field_commutator_as_Delta}\) equally applies.
By the same argument as in the case where no quantum number is involved, we see that a scalar field can satisfy the causality condition if it describes a boson. Moreover, by adjusting the phase of \(a(\pbf)\), one can arrange so that \(\kappa = \lambda\) so that
Note that this is compatible with \(\eqref{eq_scalar_field_psi_fixed_phase}\) in the case where the particle is its own antiparticle.
Using \(\eqref{eq_scalar_field_psi_plus}\) and \(\eqref{eq_scalar_field_psi_plus_and_minus_are_adjoints}\), we can write \(\psi(x)\) in terms of the creation and annihilation operators as follows
with the possibility of \(a^{c \dagger}(\pbf) = a^{\dagger}(\pbf)\) in the case where the created particle is its own antiparticle.
For later use (e.g., the evaluation of Feynman diagrams), we note the following identity which holds for any, and not just space-like, \(x\) and \(y\).
where \(\Delta(x)\) is defined as follows
The CPT symmetries#
Let’s investigate how a scalar field transforms under spatial inversion \(\Pcal\), time inversion \(\Tcal\), and charge conjugation \(\Ccal\). This follows essentially from \(\eqref{eq_scalar_field_psi_by_creation_and_annihilation_operators}\) together with our knowledge about how creation/annihilation operators transform under CPT transformations in The Lorentz and CPT transformation laws. Recall that we consider the case of massive particles here, leaving the massless case to a later section.
We start with the spatial inversion \(\Pcal\) by recalling the following transformation rules
where \(\eta\) and \(\eta^c\) are the intrinsic parities of the particle and antiparticle, respectively. In order for the scalar field \(\eqref{eq_scalar_field_psi_by_creation_and_annihilation_operators}\) to transform nicely with \(\Pcal\), one must have \(\eta^{\ast} = \eta^c\) (or \(\eta^{\ast} = \eta\) in the case where the particle is its own antiparticle). As a result, we have
Next let’s consider the time inversion \(\Tcal\). We recall the transformation rules as follows
Similar to the case of spatial inversions, in order for \(\psi(x)\) to transform nicely with \(U(\Tcal)\), one must have \(\zeta^{\ast} = \zeta^c\). Moreover, since \(U(\Tcal)\) is anti-unitary, we have
Finally let’s consider the charge conjugation \(\Ccal\) with the following transformation laws
As before, we must have \(\xi^{\ast} = \xi^c\) and therefore
Vector fields#
The next simplest scenario after scalar fields is vector fields, where the representation \(D(\Lambda) = \Lambda\). Once again, let’s consider particles of one species so that we can drop the \(n\) label from, for example, \(a(\pbf, \sigma, n)\). In this case, we can rewrite \(\eqref{eq_annihilation_field_simplified_by_translation}\) and \(\eqref{eq_creation_field_simplified_by_translation}\) as follows
where \(\mu, \nu\) are the \(4\)-indexes. Moreover, the boost transformation formulae \(\eqref{eq_annihilation_u_transformation_simplified_by_boost}\) and \(\eqref{eq_creation_v_transformation_simplified_by_boost}\) take the following form
Finally the (linearized) rotation transformation formulae \(\eqref{eq_j_intertwines_u}\) and \(\eqref{eq_j_intertwines_v}\) take the following form
where \(\hat{\Jbf}\) is the angular momentum vector associated with the (tautological) representation \(\Lambda\). It follows from \(\eqref{eq_expansion_of_Lambda}\) and \(\eqref{eq_u_lorentz_expansion}\) that
where \(\{i,j,k\} = \{1,2,3\}\). From this one can then calculate the following squares
It follows then from \(\eqref{eq_vector_field_j_intertwines_u}\) and \(\eqref{eq_vector_field_j_intertwines_v}\) that
where we’ve worked out the details of the calculations for \(u\), but not \(v\) because they are essentially the same.
The reason for squaring the angular momentum \(3\)-vector as above is that it has a particularly simple matrix form
by \(\eqref{eq_angular_momentum_squared_eigenvalue}\). It follows that in order for \(\eqref{eq_vector_field_u0_intertwines_j_sq}\) – \(\eqref{eq_vector_field_vi_intertwines_j_sq}\) to have nonzero solutions, one must have either \(\jfrak = 0\), in which case only the time-components \(u_0(0)\) and \(v_0(0)\) may be nonzero, where we’ve also suppressed \(\sigma\) because the spin vanishes, or \(\jfrak = 1\), in which case only the space-components \(u_i(0, \sigma)\) and \(v_i(0, \sigma)\) may be nonzero. These two cases are discussed in more detail as follows.
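Before turning to these two cases, here is a minimal numerical sketch of the block structure just used: in the vector representation the rotation generators \(\hat{\Jbf}\) act only on the spatial components, and the diagonal of \(\hat{\Jbf}^2\) reads \(0, 2, 2, 2\), i.e., \(\jfrak(\jfrak+1)\) with \(\jfrak = 0\) on the time component and \(\jfrak = 1\) on the space components. (The overall sign convention of the generators is an assumption that doesn’t affect the square.)

```python
import numpy as np

eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k], eps[i, k, j] = 1.0, -1.0

# rotation generators of the vector representation, index order (t, x, y, z);
# they act only on the spatial components mu = 1, 2, 3
J = np.zeros((3, 4, 4), dtype=complex)
J[:, 1:, 1:] = -1j * eps

J_sq = sum(J[k] @ J[k] for k in range(3))
print(np.real_if_close(np.diag(J_sq)))   # -> [0. 2. 2. 2.]
```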
Spin-\(0\) vector fields#
In this case \(\jfrak = 0\). For reasons that will become clear momentarily, let’s fix the constants \(u_0(0), v_0(0)\) as follows
It follows from \(\eqref{eq_vector_field_boost_u}\) and \(\eqref{eq_vector_field_boost_v}\) (see also \(\eqref{eq_L_transformation_for_massive_1}, \eqref{eq_L_transformation_for_massive_2}\)) that
where we once again have omitted the details of the calculation of \(v\) because it’s similar to that of \(u\). Plugging into \(\eqref{eq_vector_field_psi_plus}\) and \(\eqref{eq_vector_field_psi_minus}\), we see that the field components take the following form
Comparing these with \(\eqref{eq_scalar_field_psi_plus}\) and \(\eqref{eq_scalar_field_psi_plus_and_minus_are_adjoints}\), and thanks to the choices of \(u_0(0)\) and \(v_0(0)\) above, we see that
It follows that in fact a spinless vector field defined by \(\psi_{\mu}(x) \coloneqq \psi^+_{\mu}(x) + \psi^-_{\mu}(x)\) as usual is nothing but the gradient vector field of a (spinless) scalar field. Hence we get nothing new from spinless vector fields.
Spin-\(1\) vector fields#
In this case \(\jfrak = 1\). We start with the state whose spin \(z\)-component vanishes, i.e., \(u_{\mu}(0,0)\) and \(v_{\mu}(0,0)\). First we claim that they are both in the \(z\)-direction, i.e., \(u_{\mu}(0,0) = v_{\mu}(0,0) = 0\) unless \(\mu=3\). Indeed, taking the \(z\)-components of both sides of \(\eqref{eq_vector_field_angular_momentum_intertwines_u}\) and recalling that \(\left( J_3^{(1)} \right)_{0 \sigma} = 0\), we have for \(\mu = 1\)
and for \(\mu = 2\)
These, together with \(\eqref{eq_vector_field_u0_intertwines_j_sq}\), imply that only \(u_3(0, 0)\) is nonzero. The same conclusion can also be drawn for \(v_3(0, 0)\). Therefore up to a normalization factor, we can write
Now to calculate \(u\) and \(v\) for the other spin \(z\)-components, we’ll try to use \(\eqref{eq_j1_j2_matrix}\) as follows. According to \(\eqref{eq_vector_field_angular_momentum_intertwines_u}\) we have
Letting \(\sigma=0\) and \(\mu=1\), we have
Changing to \(\mu=2\), we have
Finally, taking \(\mu=3\), we have
Putting these all together, we have calculated \(u_{\mu}(0, 1)\) as follows
Calculations for \(\sigma = -1\) as well as for \(v\) are similar and hence omitted. The results are listed for future reference as follows
Applying the boosting formulae \(\eqref{eq_vector_field_boost_u}, \eqref{eq_vector_field_boost_v}\) to \(\eqref{eq_vector_field_uv_spin_z_0}, \eqref{eq_vector_field_uv_spin_z_1}, \eqref{eq_vector_field_uv_spin_z_2}\), we obtain the formulae for \(u, v\) at arbitrary momentum as follows
where
Now we can rewrite the fields \(\eqref{eq_vector_field_psi_plus}\) and \(\eqref{eq_vector_field_psi_minus}\) as follows
Similar to the calculation \(\eqref{eq_scalar_field_commutator_as_Delta}\) for scalar field, the (anti-)commutator can be calculated as follows
where
To better understand the quantity \(\Pi_{\mu \nu}(\pbf)\), let’s first evaluate it at \(\pbf = 0\) as follows
which is nothing but the projection to the spatial \(3\)-space, or in other words, the orthogonal complement of the time direction \(\pbf = 0\). Considering the definition \(e_{\mu}(\pbf, \sigma) \coloneqq L(p)_{\mu}^{\nu} e_{\nu}(0, \sigma)\) as in \(\eqref{eq_vector_field_defn_e_vector_at_p}\), we see that the general \(\Pi_{\mu \nu}(\pbf)\) is really just a projection to the orthogonal complement of \(p\), and therefore can be written as
because of the mass-shell condition \(p^2 + m^2 = 0\).
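Here is a minimal numerical sketch of this projection property, written with upper indices as \(\Pi^{\mu \nu}(\pbf) = \eta^{\mu \nu} + p^{\mu} p^{\nu} / m^2\). The rest-frame polarization vectors below are the usual spherical ones (their phases are conventional and drop out of the sum), and the boost matrix is the standard \(L(p)\), assumed to match \(\eqref{eq_L_transformation_for_massive_1}\) – \(\eqref{eq_L_transformation_for_massive_2}\); the momentum and \(m = 1\) are chosen arbitrarily.

```python
import numpy as np

m = 1.0
p3 = np.array([0.3, -0.4, 1.2])        # an arbitrary 3-momentum
p0 = np.sqrt(p3 @ p3 + m**2)
eta = np.diag([-1.0, 1.0, 1.0, 1.0])   # mostly-plus metric, index order (t, x, y, z)

# standard boost L(p) taking (m, 0) to (p0, p3)
lorentz_gamma = p0 / m
phat = p3 / np.linalg.norm(p3)
L = np.eye(4)
L[0, 0] = lorentz_gamma
L[0, 1:] = L[1:, 0] = p3 / m
L[1:, 1:] = np.eye(3) + (lorentz_gamma - 1.0) * np.outer(phat, phat)

# rest-frame polarization vectors for sigma = +1, 0, -1 (spherical basis)
e_rest = np.array([[0, -1, -1j, 0],
                   [0,  0,   0, 1],
                   [0,  1, -1j, 0]], dtype=complex)
e_rest[0] /= np.sqrt(2)
e_rest[2] /= np.sqrt(2)

e_p = np.array([L @ e for e in e_rest])   # e(p, sigma) = L(p) e(0, sigma)
p4 = np.concatenate(([p0], p3))
Pi = sum(np.outer(e, e.conj()) for e in e_p)
assert np.allclose(Pi, eta + np.outer(p4, p4) / m**2)
print("polarization sum = eta^{mu nu} + p^mu p^nu / m^2")
```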
In light of \(\eqref{eq_scalar_field_commutator_as_Delta}\), we can rewrite \(\eqref{eq_vector_field_commutator_by_Pi}\) as follows
where \(\Delta_+(x - y)\) is defined by \(\eqref{eq_defn_Delta_plus}\). As in the case of scalar fields, this (anti-)commutator doesn’t vanish even for space-like \(x - y\). Nonetheless, it’s still an even function for space-like separations. The trick, as usual, is to consider a linear combination of \(\psi^+_{\mu}(x)\) and \(\psi^-_{\mu}(x)\) as follows
Now for space-separated \(x\) and \(y\), we can calculate using \(\eqref{eq_vector_field_commutator_by_Delta}\) and \(\eqref{eq_vector_field_psi_minus_adjoint_to_plus}\) as follows
For them to vanish, we see that, first of all, we must adopt the top sign, i.e., take the commutator, or in other words, the vector field of spin \(1\) must be bosonic. In addition, we must have \(|\kappa| = |\lambda|\). In fact, by adjusting the phase of the creation/annihilation operators, we can arrange so that \(\kappa = \lambda = 1\). To summarize, we can write a general vector field in the following form
just like \(\eqref{eq_scalar_field_psi_fixed_phase}\). It’s also obvious that \(\psi_{\mu}(x)\) is Hermitian.
Now if the vector field carries a nonzero (conserved) quantum charge, then one must adjust \(\eqref{eq_vector_field_psi_fixed_phase}\) as follows
in analogy with \(\eqref{eq_scalar_field_psi_fixed_phase_with_antiparticle}\) for scalar fields. Finally, we can express the vector field in terms of creation and annihilation operators as follows
in analogy with \(\eqref{eq_scalar_field_psi_by_creation_and_annihilation_operators}\) for scalar fields. Lastly, let’s calculate the commutator (for general \(x\) and \(y\)) for later use as follows
where \(\Delta(x-y)\) is defined by \(\eqref{eq_defn_Delta}\).
So far, besides the introduction of the vectors \(e_{\mu}(\pbf, \sigma)\) in \(\eqref{eq_vector_field_defn_e_vector_at_p}\) and \(\eqref{eq_vector_field_defn_e_vector_at_rest}\), the discussion on vector fields looks very much like scalar fields. A key difference, however, stems from the following observation
which, in turn, implies that
This condition turns out to coincide with a so-called “gauge fixing” condition for spin-\(1\) photons in quantum electrodynamics. However, it’s known that photons are massless particles. Therefore we may wonder whether the vanishing mass limit \(m \to 0\) may be applied. Now the simplest way to construct a (scalar) interaction density \(\Hscr(x)\) using \(\psi_{\mu}(x)\) is
where \(J^{\mu}(x)\) is a \(4\)-vector current. Suppose we fix the in- and out-states in the interaction. Then according to \(\eqref{eq_vector_field_psi_by_creation_and_annihilation_operators}\), the rate of (anti-)particle emission is proportional to
where \(\langle J^{\mu} \rangle\) denotes the matrix element of the current between the fixed in- and out-states. Now this rate blows up in the \(m \to 0\) limit unless \(p_{\mu} \langle J^{\mu} \rangle = 0\). This last condition can be translated to spacetime coordinates as follows
or in other words \(J^{\mu}(x)\) is a conserved current.
The CPT symmetries#
Let’s start with the spatial inversion. First recall from The Lorentz and CPT transformation laws
It follows that we need to express \(e_{\mu}(-\pbf, \sigma)\) in terms of \(e_{\mu}(\pbf, \sigma)\). To this end, let’s calculate
It follows that the spatial inversion transformation law is given as follows
under the assumption \(\eta^c = \eta^{\ast}\). Omitting further details, the transformation laws for time inversion and charge conjugation are given as follows
under the assumptions \(\zeta^c = \zeta^{\ast}\) and \(\xi^c = \xi^{\ast}\).
Dirac fields#
Here we’ll encounter the first nontrivial representation of the (homogeneous orthochronous) Lorentz group, first discovered by P. Dirac in a completely different (and more physical) context. Our treatment here will be purely mathematical, and will serve as a warm-up for the general representation theory.
Dirac representation and gamma matrices#
Let \(D\) be a representation of the Lorentz group in the sense that \(D(\Lambda_1) D(\Lambda_2) = D(\Lambda_1 \Lambda_2)\). By the discussion in Quantum Lorentz symmetry and ignoring the translation part, we can write \(D(\Lambda)\) up to first order as follows
where \(\Jscr_{\mu \nu} = -\Jscr_{\nu \mu}\) are (Hermitian) matrices that, according to \(\eqref{eq_bracket_j4_j4}\), satisfy in addition the following Lie-algebraic condition
The idea is to turn \(\eqref{eq_bracket_repr_j}\) into a Jacobi identity. To this end, let’s assume the existence of matrices \(\gamma_{\mu}\) such that
where the curly bracket denotes the anti-commutator and is equivalent to the notation \([\cdots]_+\) used in the previous chapters. Here the right-hand-side, being a bare number, should be interpreted as a multiple of the identity matrix of the same rank as \(\gamma_{\mu}\). Such matrices \(\gamma_{\mu}\) form a so-called Clifford algebra of the symmetric bilinear form \(\eta_{\mu \nu}\). Then we can define
Now we can calculate
Then we can verify \(\eqref{eq_bracket_repr_j}\), starting from the left-hand-side, as follows
The last expression is easily seen to be equal to the right-hand-side of \(\eqref{eq_bracket_repr_j}\) using the anti-symmetry of \(\Jscr_{\mu \nu}\). Hence \(\eqref{eq_bracket_repr_j}\) is indeed satisfied for \(\Jscr_{\mu \nu}\) defined by \(\eqref{eq_dirac_field_defn_j}\).
In fact, the calculation \(\eqref{eq_dirac_field_j_gamma_commutator}\) may be rephrased more conveniently as follows
or in plain words, \(\gamma_{\mu}\) is a vector. Indeed, one can calculate the left-hand-side of \(\eqref{eq_dirac_field_gamma_is_vector}\) up to first order as follows [8]
Using the very definition \(\eqref{eq_dirac_field_defn_j}\), one then sees that \(\Jscr_{\mu \nu}\) is an anti-symmetric tensor in the sense that
Indeed, one can construct even more anti-symmetric tensors as follows
but no more because we’re constrained by the \(4\)-dimensional spacetime. Here we remark also that the consideration of only anti-symmetric tensors loses no generality because of \(\eqref{eq_dirac_field_clifford_algebra}\). Now we claim that the matrices \(1, \gamma_{\mu}, \Jscr_{\mu \nu}, \Ascr_{\mu \nu \rho}\) and \(\Pscr_{\mu \nu \rho \kappa}\) are all linearly independent. This can be seen either by observing that they transform differently under conjugation by \(D(\Lambda)\), or, more directly, by observing that they are orthogonal to each other under the inner product defined by the trace of the product (of two matrices). For example, one easily sees that \(\Jscr_{\mu \nu}\) is traceless because the trace of a commutator vanishes, and \(\op{tr}(\gamma_{\mu} \gamma_{\nu}) = 0\) for \(\mu \neq \nu\) using \(\eqref{eq_dirac_field_clifford_algebra}\). It’s slightly trickier to see that \(\gamma_{\mu}\) itself is also traceless, but this is again a consequence of the Clifford algebra relations \(\eqref{eq_dirac_field_clifford_algebra}\), which we demonstrate as follows
Counting these linearly independent matrices, we see that there are \(1 + \binom{4}{1} + \binom{4}{2} + \binom{4}{3} + \binom{4}{4} = 16\) of them. It means that the size of the \(\gamma_{\mu}\) matrices is at least \(4 \times 4\).
It turns out that there exists indeed a solution of \(\eqref{eq_dirac_field_clifford_algebra}\) in terms of \(4 \times 4\) matrices, conveniently known as the gamma matrices, which we define as follows
where \(\bm{\sigma} = (\sigma_1, \sigma_2, \sigma_3)\) is made up of the so-called Pauli matrices defined as follows
Indeed the Pauli matrices furnish not only a solution to a \(3\)-dimensional Clifford algebra with respect to the Euclidean inner product, but also, when multiplied by \(1/2\), an angular momentum representation, or more precisely, a spin-\(1/2\) representation. Note that this representation, in terms of Hermitian matrices, is apparently different from the one given by \(\eqref{eq_representation_rotation_first_order} - \eqref{eq_j3_matrix}\) with \(\jfrak = 1/2\), in terms of real matrices. One can verify that they differ by just a change of basis. As far as the terminology is concerned, one often uses Hermitian and real interchangeably.
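The claims above are easy to check numerically. The sketch below assumes the explicit \(2 \times 2\)-block form of \(\eqref{eq_dirac_field_defn_gamma_matrices}\) used by Weinberg together with the mostly-plus metric \(\eta = \op{diag}(-1, 1, 1, 1)\) (both are assumptions about conventions not spelled out here), and verifies the Clifford relations \(\eqref{eq_dirac_field_clifford_algebra}\) as well as the two stated properties of the Pauli matrices.

```python
import numpy as np

s = [np.array(m, dtype=complex) for m in
     ([[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]])]

# Pauli matrices: a 3d Euclidean Clifford algebra, and (when halved) a spin-1/2
# angular momentum representation
eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k], eps[i, k, j] = 1.0, -1.0
for i in range(3):
    for j in range(3):
        assert np.allclose(s[i] @ s[j] + s[j] @ s[i], 2 * (i == j) * np.eye(2))
        assert np.allclose((s[i] / 2) @ (s[j] / 2) - (s[j] / 2) @ (s[i] / 2),
                           1j * sum(eps[i, j, k] * s[k] / 2 for k in range(3)))

# gamma matrices in the assumed (Weinberg-style) convention, eta = diag(-1, 1, 1, 1)
I2, O2 = np.eye(2, dtype=complex), np.zeros((2, 2), dtype=complex)
gamma = [-1j * np.block([[O2, I2], [I2, O2]])] + \
        [-1j * np.block([[O2, si], [-si, O2]]) for si in s]
eta = np.diag([-1.0, 1.0, 1.0, 1.0])
for mu in range(4):
    for nu in range(4):
        anti = gamma[mu] @ gamma[nu] + gamma[nu] @ gamma[mu]
        assert np.allclose(anti, 2 * eta[mu, nu] * np.eye(4))
print("Pauli and gamma matrix relations verified")
```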
Now using the definition \(\eqref{eq_dirac_field_defn_j}\), we can calculate, using the Clifford relations, as follows
where \(i, j \in \{1,2,3\}\) and \(\epsilon\) is the totally anti-symmetric sign. We see that the representation \(\Jscr_{\mu \nu}\) is in fact reducible. Moreover, we see that the corresponding representation \(D\) of the Lorentz group given by \(\eqref{eq_dirac_field_linearize_representation}\) is not unitary, since while \(\Jscr_{ij}\) are Hermitian, \(\Jscr_{i0}\) are anti-Hermitian. The fact that \(D\) is not unitary will have consequences when we try to construct the interaction density as in \(\eqref{eq_construct_interaction_density_by_fields}\), because products like \(\psi^{\dagger} \psi\) will not be scalars in light of \(\eqref{eq_conjugate_annihilation_field}\) or \(\eqref{eq_conjugate_creation_field}\).
Next let’s consider the parity transformation in the formalism of gamma matrices. In light of the transformation laws \(\eqref{eq_j3_conjugated_by_p}\) and \(\eqref{eq_k3_conjugated_by_p}\), we can define
as the parity transformation. It’s clear that \(\beta^2 = 1\). Moreover, it follows from the Clifford relations \(\eqref{eq_dirac_field_clifford_algebra}\) that
It follows then from the definition \(\eqref{eq_dirac_field_defn_j}\) that
which are in complete analogy with \(\eqref{eq_j3_conjugated_by_p}\) and \(\eqref{eq_k3_conjugated_by_p}\) if we think of \(\Jscr_{ij}\) as the “angular momenta” and \(\Jscr_{i0}\) as the “boosts”.
In connection to the non-unitarity of \(D(\Lambda)\), let’s note that since \(\beta \gamma_{\mu}^{\dagger} \beta^{-1} = -\gamma_{\mu}\), we have \(\beta \Jscr_{\mu \nu}^{\dagger} \beta^{-1} = \Jscr_{\mu \nu}\), and therefore
in light of \(\eqref{eq_dirac_field_linearize_representation}\). This identity will be useful when we later construct the interaction density.
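Continuing the numerical sketch above (same assumed gamma-matrix convention, with \(\beta = \ifrak \gamma_0\) and the normalization \(\Jscr_{\mu \nu} = -\tfrac{\ifrak}{4} [\gamma_{\mu}, \gamma_{\nu}]\) assumed for \(\eqref{eq_dirac_field_defn_j}\)), one can check the pseudo-unitarity identity for a finite, randomly chosen \(D(\Lambda)\):

```python
import numpy as np
from scipy.linalg import expm

s = [np.array(m, dtype=complex) for m in
     ([[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]])]
I2, O2 = np.eye(2, dtype=complex), np.zeros((2, 2), dtype=complex)
gamma = [-1j * np.block([[O2, I2], [I2, O2]])] + \
        [-1j * np.block([[O2, si], [-si, O2]]) for si in s]
beta = 1j * gamma[0]            # assumed form of the parity matrix, beta^2 = 1

# J_{mu nu} = -(i/4) [gamma_mu, gamma_nu]; only the purely imaginary coefficient
# matters for the check below
J = [[-0.25j * (gamma[m] @ gamma[n] - gamma[n] @ gamma[m]) for n in range(4)]
     for m in range(4)]

# beta J^dagger beta^{-1} = J ...
assert all(np.allclose(beta @ J[m][n].conj().T @ np.linalg.inv(beta), J[m][n])
           for m in range(4) for n in range(4))

# ... hence beta D^dagger beta^{-1} = D^{-1} for D built from any real combination
# of the generators (here a random one)
rng = np.random.default_rng(0)
omega = rng.normal(size=(4, 4))
omega -= omega.T
D = expm(0.5j * sum(omega[m, n] * J[m][n] for m in range(4) for n in range(4)))
assert np.allclose(beta @ D.conj().T @ np.linalg.inv(beta), np.linalg.inv(D))
print("pseudo-unitarity verified")
```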
At last we’ll introduce yet another special element to the family of gamma matrices, namely \(\gamma_5\), defined as follows
One nice thing about \(\gamma_5\) is that it anti-commutes with all other \(\gamma\) matrices, and in particular
In fact, the collection \(\gamma_0, \gamma_1, \gamma_2, \gamma_3, \gamma_5\) forms exactly the Clifford algebra of a \(5\)-dimensional spacetime.
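One more numerical sketch, under the same assumed conventions and with the normalization \(\gamma_5 = -\ifrak \gamma_0 \gamma_1 \gamma_2 \gamma_3\) assumed for \(\eqref{eq_dirac_field_defn_gamma_5}\): it confirms that \(\gamma_5\) anti-commutes with every \(\gamma_{\mu}\) and squares to one, and that the \(16\) matrices counted above are indeed linearly independent (the triple and quadruple products are represented here by \(\gamma_5 \gamma_{\mu}\) and \(\gamma_5\), to which they are proportional).

```python
import numpy as np
from itertools import combinations

s = [np.array(m, dtype=complex) for m in
     ([[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]])]
I2, O2 = np.eye(2, dtype=complex), np.zeros((2, 2), dtype=complex)
gamma = [-1j * np.block([[O2, I2], [I2, O2]])] + \
        [-1j * np.block([[O2, si], [-si, O2]]) for si in s]
gamma5 = -1j * gamma[0] @ gamma[1] @ gamma[2] @ gamma[3]

assert all(np.allclose(gamma5 @ g + g @ gamma5, 0) for g in gamma)
assert np.allclose(gamma5 @ gamma5, np.eye(4))

# 1, gamma_mu, gamma_mu gamma_nu (mu < nu), gamma_5 gamma_mu, gamma_5:
# 1 + 4 + 6 + 4 + 1 = 16 linearly independent matrices, spanning all 4x4 matrices
basis = [np.eye(4)] + gamma \
    + [gamma[m] @ gamma[n] for m, n in combinations(range(4), 2)] \
    + [gamma5 @ g for g in gamma] + [gamma5]
stack = np.array([b.flatten() for b in basis])
assert np.linalg.matrix_rank(stack) == 16
print("gamma_5 relations and the 16-dimensional span verified")
```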
Construction of Dirac fields#
As in the case of scalar and vector fields, let’s write the Dirac fields as follows
Moreover, using the particularly convenient form of \(\Jscr_{ij}\) in \(\eqref{eq_dirac_field_jscr_matrix}\), we can write the zero-momentum conditions \(\eqref{eq_j_intertwines_u}\) and \(\eqref{eq_j_intertwines_v}\) as follows
Here the signs \(\pm\) correspond to the two (identical) irreducible representations of \(\Jscr_{ij}\) in the form of \(\eqref{eq_dirac_field_jscr_matrix}\), and \(m\) and \(m'\) index the two components on which the Pauli matrices act.
Now if we think of \(u_{m \pm}(0, \sigma)\) as matrix elements of a matrix \(U_{\pm}\) and similarly for \(v\), then \(\eqref{eq_dirac_field_sigma_intertwines_j_by_u}\) and \(\eqref{eq_dirac_field_sigma_intertwines_j_by_v}\) can be rewritten as
We recall that both \(\tfrac{1}{2} \bm{\sigma}\) and \(\Jbf^{(\jfrak)}\) (as well as \(-\Jbf^{(\jfrak) \ast}\)) are irreducible representations of the rotation group (or rather, its Lie algebra). We first claim that \(U_{\pm}\) must be an isomorphism. Indeed, the kernel of \(U_{\pm}\) is easily seen to be an invariant subspace under the action of \(\Jbf^{(\jfrak)}\), and hence must be null if \(U_{\pm} \neq 0\). On the other hand, the image of \(U_{\pm}\) is an invariant subspace under the action of \(\bm{\sigma}\), and hence must be the whole space if \(U_{\pm} \neq 0\). It follows then that \(\Jbf^{(\jfrak)}\) and \(U_{\pm}\) must have the same size as \(\bm{\sigma}\), which is \(2\). The same argument applies also to \(V_{\pm}\). In particular we must have \(\jfrak = 1/2\), or in other words, the Dirac fields describe spin-\(1/2\) particles.
Note
The mathematical argument above is commonly known as Schur’s lemma.
The matrix form of \(\Jbf^{(1/2)}\) according to \(\eqref{eq_j1_j2_matrix}\) and \(\eqref{eq_j3_matrix}\) is given by
Comparing with the Pauli matrices \(\eqref{eq_pauli_matrices}\), we see that
Hence \(\eqref{eq_dirac_field_sigma_intertwines_j_by_u_matrix_form}\) may be rewritten as \(\bm{\sigma} U_{\pm} = U_{\pm} \bm{\sigma}\). One can apply Schur’s lemma once again to conclude that \(U_{\pm}\) must be a scalar. To spell out more details, note that since \(\bm{\sigma}\) commutes with any scalar, it commutes with \(U_{\pm} - \lambda\) for any \(\lambda \in \Cbb\). It follows that \(U_{\pm} - \lambda\) is either an isomorphism or zero. The latter must be the case if \(\lambda\) is an eigenvalue of \(U_{\pm}\). Hence \(U_{\pm}\) must be a scalar as claimed. A similar argument can be applied to \(V_{\pm}\) by rewriting \(\eqref{eq_dirac_field_sigma_intertwines_j_by_v_matrix_form}\) as \(\bm{\sigma} (V_{\pm} \sigma_2) = (V_{\pm} \sigma_2) \bm{\sigma}\). Hence \(V_{\pm}\) must be proportional to \(\sigma_2\).
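Here is a small symbolic sketch of the two facts used above: a \(2 \times 2\) matrix commuting with all Pauli matrices must be a multiple of the identity, and the identity \(\sigma_2 \sigma_i \sigma_2 = -\sigma_i^{\ast}\), which is what turns the condition on \(V_{\pm}\) into \(\bm{\sigma} (V_{\pm} \sigma_2) = (V_{\pm} \sigma_2) \bm{\sigma}\).

```python
from sympy import I, Matrix, symbols, solve

s1 = Matrix([[0, 1], [1, 0]])
s2 = Matrix([[0, -I], [I, 0]])
s3 = Matrix([[1, 0], [0, -1]])

# Schur's lemma, concretely: [M, sigma_i] = 0 for all i forces M = lambda * identity
a, b, c, d = symbols('a b c d')
M = Matrix([[a, b], [c, d]])
eqs = []
for si in (s1, s2, s3):
    eqs += list(si * M - M * si)
print(solve(eqs, [a, b, c, d], dict=True))   # forces b = c = 0 and a = d

# sigma_2 sigma_i sigma_2 = -sigma_i^*, the identity behind the rewriting for V
assert all(s2 * si * s2 == -si.conjugate() for si in (s1, s2, s3))
```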
Going back to \(u_{m \pm}(0, \sigma)\) (resp. \(v_{m \pm}(0, \sigma)\)) from \(U_{\pm}\) (resp. \(V_{\pm}\)), we have concluded that
where we’ve inserted the extra factor \(-\ifrak\) in front of \(d_{\pm}\) to make the final results look more uniform. We can further unwrap this result in matrix notations as follows
These are known as the (Dirac) spinors. Spinors at finite momentum can be determined by \(\eqref{eq_annihilation_u_transformation_simplified_by_boost}\) and \(\eqref{eq_creation_v_transformation_simplified_by_boost}\) as follows
In general the constants \(c_{\pm}\) and \(d_{\pm}\) may be arbitrary. However, if we assume the conservation of parity, or in other words, that the fields transform covariantly under the spatial inversion, then we can determine the constants, and henceforth the spinors uniquely as follows. Applying the spatial inversion transformation laws of creation and annihilation operators from The Lorentz and CPT transformation laws to \(\eqref{eq_dirac_field_psi_plus}\) and \(\eqref{eq_dirac_field_psi_minus}\), we can calculate as follows
Here we’ve evaluated \(u(-\pbf, \sigma)\) (and \(v(-\pbf, \sigma)\)) in a way similar to \(\eqref{eq_vector_field_revert_momentum_transformation}\), where, instead of applying \(\Pcal\) directly on vectors, we recall for Dirac fields that the spatial inversion acts as \(\beta\), defined by \(\eqref{eq_dirac_field_beta_matrix},\) on spinors.
In order for \(U(\Pcal) \psi^+(x) U^{-1}(\Pcal)\) to be proportional to \(\psi^+(x)\), we observe that it would be sufficient (and most likely also necessary) if \(u(0, \sigma)\) is an eigenvector of \(\beta\). A similar argument applies to \(\psi^-(x)\) as well, so we can write
where \(b^2_{\pm} = 1\) since \(\beta^2 = 1\). Assuming this, we can rewrite \(\eqref{eq_dirac_field_spatial_inversion_acts_on_psi_plus}\) and \(\eqref{eq_dirac_field_spatial_inversion_acts_on_psi_minus}\) as follows
so that the parity is indeed conserved.
Now using \(\eqref{eq_dirac_field_beta_eigenvalue_of_u}\) and \(\eqref{eq_dirac_field_beta_eigenvalue_of_v}\) (and an appropriate rescaling), we can rewrite \(\eqref{eq_dirac_field_u_matrix_by_c}\) and \(\eqref{eq_dirac_field_v_matrix_by_d}\) as follows
So far we haven’t really achieved much by assuming the parity conservation. What eventually pins down the spinors is, once again, the causality condition. To see this, let’s try to construct the Dirac field following \(\eqref{eq_defn_psi_field}\) as follows
The (anti-)commutator can be calculated using \(\eqref{eq_dirac_field_psi_plus}\) and \(\eqref{eq_dirac_field_psi_minus}\) as follows
where \(N_{\ell \ell'}\) and \(M_{\ell \ell'}\) are the spin sums defined as follows
To evaluate the spin sums, we first turn back to the zero-momentum case and use \(\eqref{eq_dirac_field_beta_eigenvalue_of_u}\) and \(\eqref{eq_dirac_field_beta_eigenvalue_of_v}\) to express the values in terms of \(\beta\) as follows
where \(\dagger\) here means transpose conjugation. The easiest way to see this is probably to momentarily forget about the spin \(z\)-component \(\sigma\) so that \(\beta\) as in \(\eqref{eq_dirac_field_beta_matrix}\) behaves like a \(2 \times 2\) matrix with obvious eigenvectors \([1, 1]^T\) for \(b_{\pm} = 1\) and \([1, -1]^T\) for \(b_{\pm} = -1\). Then \(\eqref{eq_dirac_field_n_matrix_at_zero}\) and \(\eqref{eq_dirac_field_m_matrix_at_zero}\) can be verified by a direct calculation. Here the superscript \(T\) means taking transpose so we have column vectors.
Now we can evaluate the \(N\) and \(M\) matrices in terms of \(\beta\) as follows
To go further, we need to invoke the transformation laws of the gamma matrices and their relatives, e.g., \(\beta\), under \(D(\Lambda)\) from Dirac representation and gamma matrices. More precisely, using the pseudo-unitarity \(\eqref{eq_dirac_field_pseudo_unitarity_of_d_matrix}\) we have the following
Plugging them into \(\eqref{eq_dirac_field_n_matrix_first_evaluation}\) and \(\eqref{eq_dirac_field_m_matrix_first_evaluation}\), respectively, we can continue our evaluation as follows
Now that we’ve finished evaluating the spin sums, we can plug them into \(\eqref{eq_dirac_field_commutator_raw}\) to get the following evaluation of the (anti-)commutator
where \(\Delta_+\) is defined by \(\eqref{eq_defn_Delta_plus}\). Recall that \(\Delta_+(x) = \Delta_+(-x)\) for space-like \(x\). Hence for space-like \(x-y\), the following holds
Error
\(\eqref{eq_delta_plus_derivative_is_odd}\) is what is claimed in the bottom of page 223 in [Wei95], but it’s not true! Indeed, the fact that \(\Delta_+(x)\) is even for space-like \(x\) only implies that \(\p_i \Delta_+(x)\) is odd for spatial indexes \(i=1,2,3\). However \(\dot{\Delta}_+(x)\) is not odd even for space-like \(x\). It has a serious consequence that the anti-commutator doesn’t vanish, which in turn means that the Dirac fields, as constructed here, cannot be arbitrarily assembled into a causal interaction density. This error is obviously not because of Weinberg’s ignorance, since he later also pointed out the non-vanishing of the anti-commutator on page 295.
It follows that in order for \(\eqref{eq_dirac_field_commutator_first_evaluation}\) to vanish for space-separated \(x\) and \(y\), the following must hold
We see that first of all, the top sign applies, which means in particular that we must be considering the anti-commutator in \(\eqref{eq_dirac_field_commutator_first_evaluation}\), or in other words, the Dirac fields must be fermionic. In addition, we must also have \(|\kappa| = |\lambda|\) and \(b_+ + b_- = 0\).
By the usual phase adjustments on the creation and annihilation operators and rescaling, we can arrange so that \(\kappa = \lambda = 1\). Recalling \(\eqref{eq_dirac_field_gamma_5_anti_commutes_beta}\) and replacing \(\psi\) with \(\gamma_5 \psi\) if necessary, we can arrange so that \(b_{\pm} = \pm 1\). Putting these all together, we have evaluated the Dirac field \(\eqref{eq_dirac_field_psi_field_raw}\) as follows
where the zero-momentum spinors are
and the spin sums are
and the anti-commutator, calculated by plugging the spin sums into \(\eqref{eq_dirac_field_commutator_raw}\), is
where \(\p_{\mu} = \p_{x_{\mu}}\) and \(\Delta(x-y)\) is defined by \(\eqref{eq_defn_Delta}\).
For \(\psi(x)\) defined by \(\eqref{eq_dirac_field_psi_field}\) to transform covariantly under spatial inversion, we recall \(\eqref{eq_dirac_field_psi_plus_conjugated_by_u}\) and \(\eqref{eq_dirac_field_psi_minus_conjugated_by_u}\) to conclude that
It follows that \(\eta \eta^c = -1\), or in other words, the intrinsic parity of the state consisting of a spin \(1/2\) particle and its antiparticle is odd in the sense of Parity symmetry. The parity transformation law for Dirac fields is as follows
The CPT symmetries#
The transformation law of Dirac fields under spatial inversion has already been worked out in \(\eqref{eq_dirac_field_spatial_inversion_transformation_law}\), so we’re left to work out the transformation laws under time inversion and charge conjugation.
Recall from Space and time inversions that the time-inversion operator \(U(\Tcal)\) is complex anti-linear. Hence to work out the transformation law under time inversion, we’ll need to work out the complex-conjugated spinors \(u^{\ast}(\pbf, \sigma)\) and \(v^{\ast}(\pbf, \sigma)\). Now in light of \(\eqref{eq_dirac_field_spinor_u_at_finite_momentum}\) and \(\eqref{eq_dirac_field_spinor_v_at_finite_momentum}\), and the fact that the spinors are real at zero-momentum, we just need to work out the complex conjugate \(D^{\ast}(L(p))\) in terms of \(D(L(p))\) and the gamma matrices. Now according to \(\eqref{eq_dirac_field_linearize_representation}\), it suffices to work out \(\Jscr^{\ast}_{\mu \nu}\). Finally according to \(\eqref{eq_dirac_field_defn_j}\), it suffices to work out \(\gamma^{\ast}_{\mu}\).
Inspecting the explicit forms of the gamma matrices given by \(\eqref{eq_dirac_field_defn_gamma_matrices}\) and \(\eqref{eq_pauli_matrices}\), we see that \(\gamma_0, \gamma_1, \gamma_3\) are purely imaginary while \(\gamma_2\) is real, or more explicitly
Using the Clifford algebra relations, this can be written more concisely as follows
While this result could’ve been satisfactory in its own right, as we’ll see, it’ll be more convenient to factor out a \(\beta\) matrix. Hence we’re motivated to introduce yet another special matrix
and rewrite \(\eqref{eq_dirac_field_gamma_conjugation_by_gamma_2}\) as follows
where we also note that \((\Cscr^{-1} \beta)^{-1} = \beta \Cscr\). It follows from \(\eqref{eq_dirac_field_defn_j}\) that
and hence from \(\eqref{eq_dirac_field_linearize_representation}\) that
Using the explicit formula \(\eqref{eq_dirac_field_defn_c_matrix}\) for \(\Cscr\), as well as \(\eqref{eq_dirac_field_u_spinor_zero_momentum}\) and \(\eqref{eq_dirac_field_v_spinor_zero_momentum}\) for \(u(0, \sigma)\) and \(v(0, \sigma)\), we get the following
These relations turn out to be useful for the charge conjugation transformation, but not for the time inversion because the spinors \(u\) and \(v\) are swapped. To remedy this, we notice from \(\eqref{eq_dirac_field_u_spinor_zero_momentum}\) and \(\eqref{eq_dirac_field_v_spinor_zero_momentum}\) that the \(u\) and \(v\) spinors at zero-momentum are related by \(\gamma_5\) defined by \(\eqref{eq_dirac_field_defn_gamma_5}\). Moreover, in order to cancel the \(\beta\) matrices at the two ends of the right-hand-side of \(\eqref{eq_dirac_field_d_matrix_conjugation_1}\), we can replace \(\pbf\) with \(-\pbf\), which is also desirable as far as the time inversion is concerned in light of \(\eqref{eq_creation_operator_time_inversion_conjugation_massive}\). Putting all these considerations together, let’s try the following
where the third equality holds because of the Clifford relations, namely, \(\gamma_5\) commutes with \(\Jscr_{\mu \nu}\), and hence \(D^{\ast}(L(p))\), and anti-commutes with \(\beta\). Now instead of \(\eqref{eq_dirac_field_u_conjugate_to_v}\), we can calculate as follows
A similar calculation can be done to show that in fact \(v(-\pbf, \sigma)\) satisfies exactly the same conjugation formula.
With all the preparations above, we can now calculate the time inversion transformation laws as follows
In order for \(\psi(x)\) to transform nicely under the time inversion, we’re forced to make the following assumption
Under this assumption we’ve finally worked out the time inversion transformation law for Dirac fields
Next let’s calculate the charge conjugation transformation as follows
Just as for the time inversion, we are forced to assume the following condition on the charge conjugation parities
Under this assumption, we can work out the charge conjugation transformation law as follows [9]
Here we’ve used \(\psi^{\ast}(x)\) instead of \(\psi^{\dagger}(x)\) because we don’t want to transpose the spinors, but it should be understood that the \(\ast\), when applied to the creation/annihilation operators, is the same as taking the adjoint.
Finally, let’s consider the special case where the spin-\(1/2\) particles are their own antiparticles. These particles are known as Majorana fermions, as already discussed in Parities of elementary particles. According to \(\eqref{eq_dirac_field_spatial_parity_relation}, \eqref{eq_dirac_field_time_parity_relation}\) and \(\eqref{eq_dirac_field_charge_parity_relation}\), we see that the spatial parity of a Majorana fermion must be \(\pm \ifrak\), while the time and charge parity must be \(\pm 1\).
Construction of the interaction density#
As mentioned in Dirac representation and gamma matrices, the fact that the Dirac representation is not unitary means that we cannot construct the interaction density using \(\psi^{\dagger} \psi\) because it won’t be a scalar. Indeed, let’s work out how \(\psi^{\dagger}\) transforms under a (homogeneous orthochronous) Lorentz transformation using \(\eqref{eq_dirac_field_pseudo_unitarity_of_d_matrix}\) as follows
We see that if we define a new adjoint
then \(\bar{\psi}\) transforms nicely as follows
It follows that we can construct a bilinear form as follows
where \(M\) is a \(4 \times 4\) matrix, so that
Letting \(M\) be \(1, \gamma_{\mu}, \Jscr_{\mu \nu}, \gamma_5 \gamma_{\mu}\) or \(\gamma_5\) then produces a scalar, vector, tensor, axial vector or pseudo-scalar, respectively. Here the adjectives “axial” and “pseudo-” refer to the opposite of the usual parities under spatial and/or time inversion.
An important example is Fermi’s theory of beta-decay, which involves an interaction density of the following form
where \(p, n, e, \nu\) stand for proton, neutron, electron and neutrino, respectively.
General fields#
We’ve now seen how scalar, vector, and Dirac fields can be constructed out of specific representations of the (homogeneous orthochronous) Lorentz group. These constructions can be generalized and unified by understanding the general representation theory of the Lorentz group.
General representation theory of the Lorentz group#
The starting point, as in the case of Dirac fields, is the general commutation relation \(\eqref{eq_bracket_repr_j}\) that the \(\Jscr_{\mu \nu}\) matrices must satisfy. As explained in Quantum Lorentz symmetry, we can rename the \(\Jscr_{\mu \nu}\) matrices as follows
and rewrite \(\eqref{eq_bracket_repr_j}\) as a set of equations as follows
which correspond to \(\eqref{eq_jjj_commutation}, \eqref{eq_jkk_commutation}\) and \(\eqref{eq_kkj_commutation}\), respectively.
Let’s write \(\bm{\Jscr} \coloneqq \left(\Jscr_1, \Jscr_2, \Jscr_3\right)\) and \(\bm{\Kscr} \coloneqq \left(\Kscr_1, \Kscr_2, \Kscr_3\right)\). It turns out that this Lie algebra generated by \(\bm{\Jscr}\) and \(\bm{\Kscr}\) can be complex linearly transformed into one that splits. The transformation is defined as follows
so that \(\eqref{eq_jjj_commutation_general_repr}\) – \(\eqref{eq_kkj_commutation_general_repr}\) take the following form
In other words, both \(\bm{\Ascr}\) and \(\bm{\Bscr}\) form a Lie algebra of the \(3\)-dimensional rotation group, and they commute with each other. It follows then from Representations of angular momentum that representations of the Lie algebra defined by \(\eqref{eq_aaa_commutation}\) – \(\eqref{eq_ab0_commutation}\) can be parametrized by two nonnegative (half-)integers \(A\) and \(B\) such that
where \(a, a' \in \{-A, -A+1, \cdots, A\}\) and \(b, b' \in \{-B, -B+1, \cdots, B\}\), and \(\Jbf^{(A)}\) and \(\Jbf^{(B)}\) are matrices given by \(\eqref{eq_j1_j2_matrix}\) and \(\eqref{eq_j3_matrix}\). In particular, all these representations have dimension \((2A+1)(2B+1)\) and are unitary.
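As a concrete sketch, this splitting can be checked numerically in the vector (tautological) representation, using the combinations \(\bm{\Ascr} = \tfrac{1}{2} (\bm{\Jscr} + \ifrak \bm{\Kscr})\) and \(\bm{\Bscr} = \tfrac{1}{2} (\bm{\Jscr} - \ifrak \bm{\Kscr})\) (assumed to be the transformation referred to above) and one common sign convention for the boost generators; only the relations \(\eqref{eq_jjj_commutation_general_repr}\) – \(\eqref{eq_kkj_commutation_general_repr}\) matter for the outcome.

```python
import numpy as np
from itertools import product

def E(a, b):   # elementary 4x4 matrix, index 0 = time
    M = np.zeros((4, 4), dtype=complex)
    M[a, b] = 1.0
    return M

eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k], eps[i, k, j] = 1.0, -1.0

# rotation and boost generators of the vector representation (one sign convention)
J = [sum(-1j * eps[k, i, j] * E(i + 1, j + 1) for i in range(3) for j in range(3))
     for k in range(3)]
K = [-1j * (E(0, i + 1) + E(i + 1, 0)) for i in range(3)]

A = [(J[i] + 1j * K[i]) / 2 for i in range(3)]
B = [(J[i] - 1j * K[i]) / 2 for i in range(3)]

comm = lambda X, Y: X @ Y - Y @ X
for i, j in product(range(3), repeat=2):
    assert np.allclose(comm(A[i], A[j]), 1j * sum(eps[i, j, k] * A[k] for k in range(3)))
    assert np.allclose(comm(B[i], B[j]), 1j * sum(eps[i, j, k] * B[k] for k in range(3)))
    assert np.allclose(comm(A[i], B[j]), 0)
print("A and B each satisfy the rotation algebra and commute with each other")
```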
Now each one of these representations gives rise to a representation of the Lorentz group, which will be referred to as the \((A, B)\) representation. As we’ve seen for Dirac fields, these representations are not unitary, because while \(\bm{\Jscr} = \bm{\Ascr} + \bm{\Bscr}\) is Hermitian, \(\bm{\Kscr} = -\ifrak \left( \bm{\Ascr} - \bm{\Bscr} \right)\) is anti-Hermitian. For the corresponding unitary representation of the \(3\)-dimensional rotation group, we recall from Clebsch-Gordan coefficients that it may be split into irreducible components of spin \(\jfrak\), which takes values in the following range
according to \(\eqref{eq_composite_total_angular_momentum_range}\). Under this setup, the scalar, vector, and Dirac fields discussed before correspond to the following three scenarios.
Scalar field: \(A = B = 0\). In this case \(\jfrak\) must vanish, and hence no spin is possible.
Vector field: \(A = B = \tfrac{1}{2}\). In this case \(\jfrak\) may be \(0\) or \(1\), which correspond to the time and space components of the vector field, respectively.
Dirac field: \(A = \tfrac{1}{2}, B = 0\) or \(A = 0, B = \tfrac{1}{2}\). In either case \(\jfrak\) must be \(\tfrac{1}{2}\). Indeed, they correspond to the two irreducible components of (the angular momentum part of) the Dirac representation \(\eqref{eq_dirac_field_jscr_matrix}\). Therefore the Dirac field may be written in short hand as \(\left( \tfrac{1}{2}, 0 \right) \oplus \left( 0, \tfrac{1}{2} \right)\).
It turns out that any general \((A, B)\) fields can be derived from the above basic ones by taking tensor products and irreducible components. For example \((1, 0)\) and \((0, 1)\) fields can be derived, using again Clebsch-Gordan coefficients, by the following calculation
In fact, all \((A, B)\) fields with \(A + B\) being an integer can be obtained in this way by tensoring copies of \(\left( \tfrac{1}{2}, \tfrac{1}{2} \right)\). To get those fields with \(A + B\) being a half-integer, we can consider the following calculation
Construction of general fields#
We’ve seen that general fields can be indexed by two (half-)integers \(a\) and \(b\), and take the following general form
As usual, let’s translate \(\eqref{eq_j_intertwines_u}\) and \(\eqref{eq_j_intertwines_v}\) in the context of \((A, B)\) representations as follows
Using the fact that \(\bm{\Jscr} = \bm{\Ascr} + \bm{\Bscr}\) and \(\eqref{eq_general_field_a_repr}\) and \(\eqref{eq_general_field_b_repr}\), we can further rewrite these conditions as follows
Looking at \(\eqref{eq_general_field_uj_relation}\), it’s an identity that relates an angular momentum representation of spin \(\jfrak\) on the left to the sum of two independent angular momentum representations on the right. Hence it’s not unreasonable to guess that \(u\) might have something to do with the Clebsch-Gordan coefficients defined by \(\eqref{eq_defn_clebsch_gordan_coefficients}\). This turns out to be indeed the case as we now demonstrate. Since \(\bigoplus_{\jfrak} \Jbf^{(\jfrak)} = \Jbf^{(A)} + \Jbf^{(B)}\), where the direct sum is taken over the range \(\eqref{eq_general_field_j_range}\) and can be thought of as a block diagonal matrix, we can calculate as follows
We are now one step away from being able to compare with \(\eqref{eq_general_field_uj_relation}\). Namely, we need to bring the coefficients \(C^{AB}(\jfrak, \sigma'; a', b')\) to the left side of the equation. To do this, we recall the following identity of Clebsch-Gordan coefficients
which follows from the orthonormality of the states \(\Psi^{AB}_{ab}\) and the reality of the Clebsch-Gordan coefficients as constructed in Clebsch-Gordan coefficients. [10] Now \(\eqref{eq_general_field_calculate_j_in_clebsch_gordan_coefficients}\) will be satisfied if we set the blue terms to equal to the following quantity
due to \(\eqref{eq_clebsch_gordan_coefficients_orthonormal_relation}\).
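Here is a small symbolic sketch of the orthonormality relation \(\eqref{eq_clebsch_gordan_coefficients_orthonormal_relation}\) just used, with the illustrative choice \(A = 1\), \(B = \tfrac{1}{2}\), so that \(\jfrak \in \{\tfrac{1}{2}, \tfrac{3}{2}\}\); sympy’s Clebsch-Gordan coefficients are assumed to follow the same (real) conventions as in Clebsch-Gordan coefficients.

```python
from sympy import Rational, S, simplify
from sympy.physics.quantum.cg import CG

A, B = S(1), Rational(1, 2)                 # illustrative choice of (A, B)
js = [Rational(1, 2), Rational(3, 2)]       # the range |A - B|, ..., A + B

def C(j, sigma, a, b):                      # C^{AB}(j, sigma; a, b)
    return CG(A, a, B, b, j, sigma).doit()

def mrange(j):                              # -j, -j + 1, ..., j
    out, m = [], -j
    while m <= j:
        out.append(m)
        m += 1
    return out

for j in js:
    for jp in js:
        for sigma in mrange(j):
            for sigmap in mrange(jp):
                total = sum(C(j, sigma, a, b) * C(jp, sigmap, a, b)
                            for a in mrange(A) for b in mrange(B))
                expected = 1 if (j == jp and sigma == sigmap) else 0
                assert simplify(total - expected) == 0
print("sum_{a,b} C(j,sigma;a,b) C(j',sigma';a,b) = delta_{j j'} delta_{sigma sigma'}")
```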
Now comparing \(\eqref{eq_general_field_condition_on_j_by_clebsch_gordan_coefficients}\) with \(\eqref{eq_general_field_uj_relation}\), we’ve solved the \(u\)-fields of dimension \((2A+1)(2B+1)\) as follows
where \((2m)^{-1/2}\) is a conventional coefficient added here to cancel the mass term in \(\eqref{eq_annihilation_u_transformation_simplified_by_boost}\) later. Using the fact that
which can be verified directly using \(\eqref{eq_j1_j2_matrix}\) and \(\eqref{eq_j3_matrix}\), we can express the \(v\)-fields in terms of the \(u\)-fields as follows
To get the \(u\) and \(v\) fields at finite momentum, we need to invoke the general boost formulae \(\eqref{eq_annihilation_u_transformation_simplified_by_boost}\) – \(\eqref{eq_creation_v_transformation_simplified_by_boost}\), as well as the \(L\) transformation \(\eqref{eq_L_transformation_for_massive_1}\) – \(\eqref{eq_L_transformation_for_massive_3}\). Here we’ll think of a boost as a \(1\)-parameter transformation in a given direction \(\hat{\pbf} \coloneqq \pbf / |\pbf|\). It turns out to be neat to use a hyperbolic angle \(\theta\), rather than \(|\pbf|\), defined by
to parametrize the boost as follows
The nice thing about this parametrization is that \(L(\theta)\) becomes additive in \(\theta\) in the following sense
Indeed, one can verify it by, for example, the following calculation
In light of \(\eqref{eq_dirac_field_linearize_representation}\), we can then write
at least for \(\theta\) infinitesimal. Here the minus sign comes from the fact that we have to bring the upper-index \(\mu=0\) in \(\omega^{\mu\nu}\) down.
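As a quick numerical check of the additivity property above: with the pure boost along a unit direction \(\hat{\pbf}\) parametrized by \(\cosh \theta = \sqrt{\pbf^2 + m^2} / m\) (the explicit matrix below is the standard one, assumed to match \(\eqref{eq_L_transformation_for_massive_1}\) – \(\eqref{eq_L_transformation_for_massive_3}\)), two boosts along the same direction indeed compose additively in \(\theta\).

```python
import numpy as np

def L(theta, nhat):
    """Pure boost with rapidity theta along the unit 3-vector nhat, index order (t, x, y, z)."""
    M = np.eye(4)
    M[0, 0] = np.cosh(theta)
    M[0, 1:] = M[1:, 0] = np.sinh(theta) * nhat
    M[1:, 1:] = np.eye(3) + (np.cosh(theta) - 1.0) * np.outer(nhat, nhat)
    return M

nhat = np.array([1.0, 2.0, -2.0])
nhat /= np.linalg.norm(nhat)
t1, t2 = 0.7, 1.9
assert np.allclose(L(t1, nhat) @ L(t2, nhat), L(t1 + t2, nhat))
print("L(theta_1) L(theta_2) = L(theta_1 + theta_2) along a fixed direction")
```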
For an \((A, B)\) representation, we can further write \(\ifrak \bm{\Kscr} = \bm{\Ascr} - \bm{\Bscr}\), and hence
since the representation splits into a direct sum of \(\bm{\Ascr}\) and \(\bm{\Bscr}\). Combining \(\eqref{eq_general_field_d_transformation_in_ab_repr}\) with \(\eqref{eq_general_field_u_at_zero_momentum}\) and \(\eqref{eq_annihilation_u_transformation_simplified_by_boost}\), we obtain the following formula for the \(u\)-field at finite momentum for an \((A, B)\) representation
where we assume implicitly that the spin \(\jfrak\) is within the range \(\eqref{eq_general_field_j_range}\), and \(\sigma\) is the corresponding spin \(z\)-component.
Parallel to \(\eqref{eq_general_field_v_at_zero_momentum}\), we can express the \(v\)-field at finite momentum in terms of the \(u\)-field as follows
The construction of interaction densities for general \((A, B)\) fields relies on Clebsch-Gordan coefficients, and is discussed in some detail in the following dropdown block.
We will now turn to what is arguably the most interesting condition, namely the causality condition \(\eqref{eq_h_commutativity_for_space_like_separations}\). Indeed, it is this condition that clarifies the correlation between the spin and whether a particle/field is bosonic or fermionic. As before, we need to evaluate the (anti-)commutator between the fields using \(\eqref{eq_general_field_defn_psi_field}\) as follows
where \(\pi(\pbf)\) is the (rescaled) spin sum defined by
Here the second equality can be mostly easily seen using \(\eqref{eq_general_field_v_at_finite_momentum}\). Note also that we are considering the general scenario where \(\psi(x)\) is an \((A, B)\) field, while \(\psi'(x)\) is a possibly different \((A', B')\) field.
Using \(\eqref{eq_general_field_u_at_finite_momentum}\), we can spell out more details of the spin sum as follows
This looks horribly complicated, but it has been evaluated by the author in [Wei69]. Without going into the actual calculations, we note the following two facts, which suffice for our purposes. The first is that \(\pi_{ab,a'b'}(\pbf)\) is a polynomial \(P\) in \(p\) on the mass shell as follows
The second is that this polynomial is even or odd depending on the parity of \(2A + 2B'\) as follows
Assuming \(\eqref{eq_general_field_spin_sum_is_polynomial}\), we note that any \(P_{ab, a'b'}\) can be written in such a way that it’s (at most) linear in the first argument \(\sqrt{\pbf^2 + m^2}\), since any even power of \(p_0\) can be replaced by a polynomial in \(\pbf\) using the mass-shell relation \(p_0^2 = \pbf^2 + m^2\). Redefining \(P_{ab, a'b'}\) in \(\eqref{eq_general_field_spin_sum_is_polynomial}\) accordingly, we may then rewrite it as follows
where \(P, Q\) are polynomials in \(\pbf\) that satisfy the following parity conditions
Returning to the causality condition \(\eqref{eq_general_field_psi_commutator}\), let’s consider space-like separated \(x\) and \(y\). Up to a Lorentz transformation, we may assume that \(x-y = (0, \xbf-\ybf)\). Under this assumption, we can calculate as follows
where the derivative \(\nabla\) is always taken with respect to \(x\). Here we’ve also used the fact that \(\Delta_+(x)\) (for space-like \(x\)) and the Dirac delta \(\delta^3(x)\) are even functions. We see that for the (anti-)commutator to vanish for \(\xbf \neq \ybf\), i.e., when \(\delta^3(\xbf - \ybf) = 0\), we must have
Now consider an important special case where \(\psi = \psi'\). It implies in particular that \((A, B) = (A', B')\) and \((\kappa, \lambda) = (\kappa', \lambda')\). In this case we can rewrite \(\eqref{eq_general_field_causality_kappa_lambda_condition}\) as follows
since \(\jfrak\) differs from \(A+B\) by an integer according to \(\eqref{eq_general_field_j_range}\). Hence in addition to the condition \(|\kappa| = |\lambda|\), the field (or rather the particle it describes) must be bosonic, i.e., the bottom sign is taken, if \(\jfrak\) is an integer, and fermionic, i.e., the top sign is taken, if \(\jfrak\) is a half-integer. This is consistent with the corresponding conclusions for scalar, vector, and Dirac fields found in previous sections, and is indeed a great clarification of the relationship between spin and statistics, e.g., Pauli’s exclusion principle.
Back to the general case. We know from \(\eqref{eq_general_field_self_kappa_lambda_relation}\) that \(|\kappa'| = |\lambda'|\) and \((-1)^{2A+2B} = (-1)^{2\jfrak} = \mp\), which is the same sign as in \(\eqref{eq_general_field_causality_kappa_lambda_condition}\). Hence we can rewrite \(\eqref{eq_general_field_causality_kappa_lambda_condition}\) by dividing both sides by \(|\kappa'|^2 = |\lambda'|^2\) as follows
Hence we conclude the following relationship between the coefficients \(\kappa\) and \(\lambda\)
where \(c\) is a constant that depends only on the field, or rather, the particle it describes, and not on the specific representation that gives rise to the field. Moreover we note that \(c\) is a phase since \(|\kappa| = |\lambda|\). Hence by adjusting the phase of the creation operator (and correspondingly the annihilation operator), we can arrange so that \(c = 1\).
This marks the end of the discussion about the causality condition on general \((A, B)\). As a result, we’ve obtained the following grand formula for a general (causal) field.
where the spinors \(u_{ab}\) and \(v_{ab}\) are given by \(\eqref{eq_general_field_u_at_finite_momentum}\) and \(\eqref{eq_general_field_v_at_finite_momentum}\), respectively.
The CPT symmetries#
The calculation of the space, time, and charge conjugation transformations in the general case is essentially the same as for the Dirac field. In particular, instead of reversing the \(3\)-momentum in Dirac spinors as in \(\eqref{eq_dirac_field_spatial_inversion_acts_on_psi_plus}\), we need to do it for general \((A, B)\) spinors \(\eqref{eq_general_field_u_at_finite_momentum}\), which involves the Clebsch-Gordan coefficients.
Without going into the details, we list the relevant symmetry properties of Clebsch-Gordan coefficients as follows
The first relation is proved in [Wei00] page 124, and the second relation can be deduced from the time reversal transformation law \(\eqref{eq_time_inversion_on_massive_general}\).
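One standard symmetry of this kind (possibly the first relation above, up to the phase conventions of these notes) is \(\langle j_1 m_1; j_2 m_2 | J M \rangle = (-1)^{j_1 + j_2 - J} \langle j_2 m_2; j_1 m_1 | J M \rangle\), which can be spot-checked with SymPy as follows.

```python
# Spot-check of the swap symmetry of Clebsch-Gordan coefficients (Condon-Shortley phases).
from sympy import S, Rational, simplify
from sympy.physics.quantum.cg import CG

j1, j2 = S(1), Rational(1, 2)
J, M = Rational(1, 2), Rational(1, 2)
m1, m2 = S(1), Rational(-1, 2)

lhs = CG(j1, m1, j2, m2, J, M).doit()
rhs = (-1) ** int(j1 + j2 - J) * CG(j2, m2, j1, m1, J, M).doit()
assert simplify(lhs - rhs) == 0
```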
Consider first the spatial inversion. Combining \(\eqref{eq_clebsch_gordan_symmetry_swap}\) with \(\eqref{eq_general_field_u_at_finite_momentum}\), one obtains the following relations on the spinors
We can calculate the transformation under spatial inversion as follows
In order for the blue terms to be proportional to the corresponding terms of a field, which in this case is a \((B, A)\) field, we must have
Under this assumption, we can complete the transformation law for spatial inversion as follows
which recovers the cases of scalar field \(\eqref{eq_scalar_field_spatial_inversion_transformation_law}\), vector field \(\eqref{eq_vector_field_spatial_inversion_transformation_law}\), and Dirac field \(\eqref{eq_dirac_field_spatial_inversion_transformation_law}\) where \(\beta\), as defined by \(\eqref{eq_dirac_field_beta_matrix}\), serves the function of swapping \(A\) and \(B\).
Next consider the time inversion. As for the spatial inversion, we’ll need the following identities
where it’s convenient for the third equality to reformulate \(\eqref{eq_angular_momentum_representation_conjugate_formula}\) as follows
Since \(v^{AB}_{ab}(\pbf, \sigma)\) is related to \(u^{AB}_{ab}(\pbf, \sigma)\) by \(\eqref{eq_general_field_v_at_finite_momentum}\), we can derive the \(v\)-counterpart of \(\eqref{eq_general_field_u_symmetry_negation}\) as follows
Remembering that \(U(\Tcal)\) is anti-unitary, we can calculate the time inversion transformation as follows
where the last equality assumes the following symmetry on the time-reversal parity
At this point, we’re pretty proficient at (and tired of) this kind of calculation. Hence we’ll not spell out the (rather similar) details for the charge conjugation symmetry, but rather list the result as follows
under the following assumption on the charge-reversal parity
The CPT theorem#
With all the hard work we’ve done in General fields, we can now reward ourselves a bit with the celebrated CPT theorem which is stated as follows
For an appropriate choice of the inversion phases \(\eta\) (space), \(\zeta\) (time), and \(\xi\) (charge), the product \(U(CPT)\) is conserved.
We’ll skip over the special case of scalar, vector, and Dirac fields, and jump directly into the general, and in fact simpler, case of \((A, B)\) fields.
Hence if we assume the following condition on the inversion parities
Assumption on the inversion parities
then we can rewrite \(\eqref{eq_cpt_conjugation_general_field_calculation}\) as follows
A few words are needed, however, to justify the seemingly strange assumption on the product of inversion parities. Indeed, it is physically meaningless to specify any inversion parity for a single species of particles because it’s just a phase. The only conditions that we’ve seen on the parities come from pairs of particles and their antiparticles, notably \(\eqref{eq_general_field_space_inversion_parity_relation}, \eqref{eq_general_field_time_inversion_parity_relation}\), and \(\eqref{eq_general_field_charge_inversion_parity_relation}\). We saw that the time and charge inversion parities are the same between the particle and its antiparticle, respectively. However, a sign \((-1)^{2\jfrak}\) is involved in the space inversion parity. So if we impose \(\eqref{eq_cpt_parities_product_assumption}\) on one particle species, then it will fail on its antiparticle species if the particle in question is a fermion! We’re eventually saved by the fact that the interaction density must involve an even number of fermions (cf. discussions in Causality and antiparticles). In any case \(\eqref{eq_cpt_parities_product_assumption}\) is a fairly sloppy assumption, which cannot hold in general, but it also doesn’t make a difference to the CPT theorem.
Now suppose the interaction density \(\Hscr(x)\) is defined by \(\eqref{eq_general_field_interaction_density}\) as a linear combination of monomials like
Hence in light of \(\eqref{eq_general_field_g_coefficients_covariance}\), we know that both \(A_1 + A_2 + \cdots + A_n\) and \(B_1 + B_2 + \cdots + B_n\) must be integers, for otherwise they cannot be coupled to a spinless state. It then follows that the interaction density obeys the following CPT transformation law
Recall from \(\eqref{eq_evolution_equation_of_u_operator}\) and \(\eqref{eq_defn_v_by_density}\) that the interaction term \(V = \int d^3 x~\Hscr(0, \xbf)\) satisfies the following [12]
Since the CPT symmetry is clearly conserved for free particles, it is also conserved in interactions according to \(\eqref{eq_h_as_h0_plus_v}\).
Massless fields#
So far the story about quantum fields has been a 100% success. We’ve namely found the general formula \(\eqref{eq_general_field_psi_field}\) for any field that represents a massive particle. However, this success comes to an end when we consider massless particles instead, as we’ll see in this section. This should not come as a surprise, though, since we’ve seen in \(\eqref{eq_vector_field_defn_Pi}\), for example, that the spin sum blows up in the massless limit \(m \to 0\).
Let’s nonetheless kickstart the routine of constructing fields as follows, and see where the problem arises.
This is reasonable because the translation symmetry is the same for massive and massless particles, and hence \(\eqref{eq_redefine_u_after_translation}\) and \(\eqref{eq_redefine_v_after_translation}\) apply.
Next, using the general transformation laws \(\eqref{eq_lorentz_transformation_formula_for_creation_operator}\) and \(\eqref{eq_lorentz_transformation_formula_for_annihilation_operator}\) for creation and annihilation operators, as well as the \(D\) matrix \(\eqref{eq_little_group_d_matrix_massless}\) for massless particles, we can infer the homogeneous Lorentz transformation laws as follows
Just as in the massive case, we’d like \(\psi_{\ell}(x)\) to satisfy the following transformation law
To see what conditions the spinors must satisfy, let’s first expand the left-hand-side as follows
Then the right-hand-side as follows
Equating the coefficients of \(a(\pbf_{\Lambda, \sigma})\) and \(a^{c \dagger}(\pbf_{\Lambda}, \sigma)\) (i.e., the blue terms), and inverting \(D_{\ell \ell'}(\Lambda^{-1})\) as in \(\eqref{eq_annihilation_u_transformation}\), we get the following conditions on the spinors
The next step is to consider the massless analog of the boost in the massive case. Namely, if we let \(\Lambda = L(p)\) be the chosen Lorentz transformation that takes the standard \(k = (1, 0, 0, 1)\) to \(p\), then \(\theta(\Lambda, p) = 0\). Taking \(p = k\) in \(\eqref{eq_massless_field_spinor_u_condition}\) and \(\eqref{eq_massless_field_spinor_v_condition}\), we obtain the following
Next, in analogy to the rotation transformation, let’s consider a little group element \(W\) that fixes \(k\). In this case \(\eqref{eq_massless_field_spinor_u_condition}\) and \(\eqref{eq_massless_field_spinor_v_condition}\) take the following form
Recall from Massless particle states that any \(W\) can be written in the following form
where \(S(a, b)\) is defined by \(\eqref{eq_massless_little_group_s_matrix}\), and \(R(\theta)\) is defined by \(\eqref{eq_massless_little_group_r_matrix}\). Considering separately the two cases \(W(0, 0, \theta) = R(\theta)\) and \(W(a, b, 0) = S(a, b)\), we get the following two consequences of \(\eqref{eq_massless_field_spinor_u_condition}\)
Similar constraints hold for \(v\) as well, but we’ll not bother to write them down.
It turns out, however, that these conditions can never be satisfied! To illustrate the difficulties, we’ll first consider the case of vector fields, both as a warm-up and for later reference when we analyze the electromagnetic theory. Then we’ll show that the difficulties persist in the general case of arbitrary \((A, B)\) fields.
The failure for vector fields#
For a vector field, \(D_{\mu}^{\nu}(\Lambda) = \Lambda_{\mu}^{\nu}\) as in the massive case. As a convention, let’s write
Since \(D_{\mu}^{\nu}(\Lambda)\) is real, it follows from \(\eqref{eq_massless_field_spinor_u_condition}\) and \(\eqref{eq_massless_field_spinor_v_condition}\) that \(v\) satisfies equations that are complex conjugate to those that \(u\) satisfies. Hence \(v_{\mu}(\pbf, \sigma) = u^{\ast}_{\mu}(\pbf, \sigma)\).
Now we can translate \(\eqref{eq_massless_field_u_from_k}\) from a boosting formula for \(u\) to one for \(e\) as follows
Moreover, \(\eqref{eq_massless_field_spinor_u_condition_from_r}\) and \(\eqref{eq_massless_field_spinor_u_condition_from_s}\) can likewise be translated into conditions on \(e\) as follows
Using the explicit formula \(\eqref{eq_massless_little_group_r_matrix}\) for \(R(\theta)\), we can derive from \(\eqref{eq_massless_field_spinor_e_condition_from_r}\) that the helicity \(\sigma = \pm 1\), and moreover,
up to normalization. However, by the explicit formula \(\eqref{eq_massless_little_group_s_matrix}\) for \(S(a, b)\), we see that \(\eqref{eq_massless_field_spinor_e_condition_from_s}\) then requires
which is impossible for any real \(a, b\) that are not both zero.
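The obstruction can also be seen numerically. The snippet below uses the standard matrix forms of \(R(\theta)\) and \(S(a, b)\) with coordinates ordered \((t, x, y, z)\) and \(e(\kbf, \pm 1) = (0, 1, \pm\ifrak, 0)/\sqrt{2}\); these explicit forms are assumptions which may differ from the conventions of these notes by signs and phases, but the punchline is convention-independent: \(R(\theta)\) only multiplies \(e\) by a phase, while \(S(a, b)\) shifts it by a term proportional to \(k\), so the invariance condition fails unless \(a = b = 0\).

```python
import numpy as np

def R(th: float) -> np.ndarray:
    """Rotation about the z-axis (fixes k)."""
    c, s = np.cos(th), np.sin(th)
    return np.array([[1, 0,  0, 0],
                     [0, c, -s, 0],
                     [0, s,  c, 0],
                     [0, 0,  0, 1]])

def S(a: float, b: float) -> np.ndarray:
    """Standard form of the remaining little-group transformations (fixes k)."""
    z = (a ** 2 + b ** 2) / 2
    return np.array([[1 + z, a, b,    -z],
                     [a,     1, 0,    -a],
                     [b,     0, 1,    -b],
                     [z,     a, b, 1 - z]])

k = np.array([1.0, 0.0, 0.0, 1.0])
eta = np.diag([-1.0, 1.0, 1.0, 1.0])
th, a, b = 0.4, 0.3, -0.8

for W in (R(th), S(a, b)):
    assert np.allclose(W @ k, k)            # both fix the standard momentum k
    assert np.allclose(W.T @ eta @ W, eta)  # both preserve the Minkowski metric

for sgn in (+1, -1):
    e = np.array([0, 1, sgn * 1j, 0]) / np.sqrt(2)                            # e(k, +-1)
    assert np.allclose(R(th) @ e, np.exp(-sgn * 1j * th) * e)                 # only a phase
    assert np.allclose(S(a, b) @ e - e, (a + sgn * 1j * b) / np.sqrt(2) * k)  # shift ~ k
```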
For reasons that will be justified later, it’s nonetheless legitimate to adopt the vectors \(e_{\mu}\) as defined by \(\eqref{eq_massless_vector_field_e_at_k}\) as the spinors, as well as the condition \(\kappa = \lambda = 1\) as in the case of massive vector fields. With these assumptions, we can rename \(\psi\) by \(a\) (as it’ll correspond to the electromagnetic potential which is conventionally named by \(a\)), and rewrite \(\eqref{eq_massless_field_defn_psi_field}\) as follows
Warning
The notations are getting slightly out of hand here. Namely, we’ve used \(a\) for at least three different things in one place: the vector field \(a_{\mu}\), the parameter in \(S(a, b)\), and the creation operator \(a(\pbf, \sigma)\). There will actually be a fourth place where \(a\) is used as the spin \(z\)-component in an \((A, B)\) field. We can only hope that the context will make it clear what \(a\) (or \(b\)) really represents.
As for massive vector fields, we’ll search for field equations that \(a_{\mu}(x)\) must satisfy. First of all, it obviously satisfies the (massless) Klein-Gordon equation
which is nothing but an incarnation of the mass-shell condition \(p_0^2 = \pbf^2\). Then let’s consider the massless analog of the gauge-fixing condition \(\eqref{eq_vector_field_gauge_fixing_condition}\). To this end, we claim that \(e_0(\kbf, \pm 1) = 0\) and \(\kbf \cdot \ebf(\kbf, \pm 1) = 0\) imply the following
in analogy to \(\eqref{eq_vector_field_spinor_orthogonal_to_momentum}\) by the following argument. First, note that \(e_{\mu}(\pbf, \pm 1)\) can be obtained from \(e_{\mu}(\kbf, \pm 1)\) by applying \(L(p)\) as in \(\eqref{eq_massless_vector_field_spinor_from_k_to_p}\). Second, \(L(p)\) can be decomposed into a boost along the \(z\)-axis as in \(\eqref{eq_massless_boost}\) followed by a \(3\)-rotation. Finally, we conclude \(\eqref{eq_massless_vector_field_spinor_zero_vanishes}\) and \(\eqref{eq_massless_vector_field_spinor_orthogonal_to_momentum}\) by noting that \(e_{\mu}(\kbf, \pm 1)\) is unaffected by any boost along the \(z\)-axis, and the dot product is preserved by any \(3\)-rotation. In contrast to the massive case \(\eqref{eq_vector_field_spinor_orthogonal_to_momentum}\), we have a stronger constraint \(\eqref{eq_massless_vector_field_spinor_zero_vanishes}\) here because the helicity \(0\) spinor is missing.
The corresponding constraints on \(a_{\mu}(x)\) are the following
which is clearly not Lorentz invariant.
But let’s calculate \(U(\Lambda) a_{\mu}(x) U^{-1}(\Lambda)\) anyway and see to what extent \(\eqref{eq_massless_field_psi_transform_by_d_matrix}\) fails. To this end, we’ll need to calculate the action of the \(D\) matrix on the spinors, and we’ll first do this for \(e_{\mu}(k, \pm 1)\) as follows
Next we recall the little group element \(W(\Lambda, p) = L^{-1}(\Lambda p) \Lambda L(p)\) by definition. Plugging into \(\eqref{eq_massless_vector_field_dw_acts_on_spinor}\), we obtain the following
where
is the extra term that makes it different from \(\eqref{eq_massless_field_spinor_u_condition}\) which would have been satisfied if \(\eqref{eq_massless_field_psi_transform_by_d_matrix}\) holds.
To most conveniently utilize \(\eqref{eq_massless_vector_field_spinor_e_condition}\), let’s calculate \({D_{\mu}}^{\nu} \left( U(\Lambda) a_{\nu}(x) U^{-1}(\Lambda) \right)\) using \(\eqref{eq_massless_vector_field_a}, \eqref{eq_massless_vector_field_creation_operator_conjugated_by_lorentz_transformation}\), and \(\eqref{eq_massless_vector_field_annihilation_operator_conjugated_by_lorentz_transformation}\) as follows
where \(\Omega(\Lambda, x)\) is a linear combination of creation and annihilation operators, whose precise form is not important here. Finally, moving \(D(\Lambda)\) to the right-hand-side, we obtain the following variation of \(\eqref{eq_massless_field_psi_transform_by_d_matrix}\)
which the massless vector field \(\eqref{eq_massless_vector_field_a}\) actually satisfies.
It follows, using integration by parts, that one can construct interaction densities by coupling \(j^{\mu}(x) a_{\mu}(x)\) as long as \(\p_{\mu} j^{\mu}(x) = 0\). This is completely parallel to \(\eqref{eq_vector_field_j_coupling}\) and \(\eqref{eq_vector_field_j_coupling_condition}\) for massive vector fields, and hence partially justifies the choice of the spinors in \(\eqref{eq_massless_vector_field_e_at_k}\), which satisfy \(\eqref{eq_massless_field_spinor_e_condition_from_r}\) but not \(\eqref{eq_massless_field_spinor_e_condition_from_s}\).
Another byproduct of \(\eqref{eq_massless_vector_field_a_conjugation_by_lorentz_transformation}\) is the observation that although \(a_{\mu}\) fails to be a vector, one can easily construct a \(2\)-tensor as follows
which obviously satisfies \(U(\Lambda) f_{\mu \nu} U^{-1}(\Lambda) = D_{\mu}^{\lambda}(\Lambda^{-1}) D_{\nu}^{\sigma}(\Lambda^{-1}) f_{\lambda \sigma}\). In fact, using \(\eqref{eq_massless_vector_field_klein_gordon}, \eqref{eq_massless_vector_field_a0_vanishes}\) and \(\eqref{eq_massless_vector_field_a_divergence_vanishes}\), one can show that \(f_{\mu \nu}\) satisfies the vacuum Maxwell equations
where \(\epsilon^{\rho \tau \mu \nu}\) is the totally anti-symmetric symbol. Indeed \(f_{\mu \nu}\) is the quantization of the electromagnetic field, while \(a_{\mu}\) is the quantization of the electromagnetic potential.
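As a small consistency check, one can verify symbolically, for a single plane-wave mode, that the Maxwell equations indeed hold. The sketch below assumes the metric signature \((-,+,+,+)\), \(k = (1, 0, 0, 1)\), \(e = (0, 1, \ifrak, 0)/\sqrt{2}\), and the standard form \(f_{\mu\nu} = \p_{\mu} a_{\nu} - \p_{\nu} a_{\mu}\); these are assumptions about the conventions, not quotes from the notes.

```python
import sympy as sp

t, x, y, z = sp.symbols('t x y z', real=True)
X = [t, x, y, z]
eta = sp.diag(-1, 1, 1, 1)

k = sp.Matrix([1, 0, 0, 1])                  # null momentum along z
e = sp.Matrix([0, 1, sp.I, 0]) / sp.sqrt(2)  # helicity +1 polarization, e.k = 0, e_0 = 0
phase = sp.exp(sp.I * sum(eta[m, m] * k[m] * X[m] for m in range(4)))  # e^{i k.x}
a = [e[m] * phase for m in range(4)]         # one plane-wave mode of a_mu(x)

f = [[sp.diff(a[n], X[m]) - sp.diff(a[m], X[n]) for n in range(4)] for m in range(4)]

for n in range(4):
    # d^mu f_{mu nu} = 0
    assert sp.simplify(sum(eta[m, m] * sp.diff(f[m][n], X[m]) for m in range(4))) == 0
for r in range(4):
    # epsilon^{r tau mu nu} d_tau f_{mu nu} = 0 (holds identically for f built from a)
    assert sp.simplify(sum(sp.LeviCivita(r, tau, m, n) * sp.diff(f[m][n], X[tau])
                           for tau in range(4) for m in range(4) for n in range(4))) == 0
```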
The failure for general fields#
The problem that massless fields cannot be made to satisfy the transformation law \(\eqref{eq_massless_field_psi_transform_by_d_matrix}\) is not specific to vector fields. Indeed, we’ll show in this section that the same problem persists for all \((A, B)\) representations.
Recall from \(\eqref{eq_general_field_a_from_jk}\) – \(\eqref{eq_general_field_b_from_jk}\), and \(\eqref{eq_general_field_a_repr}\) – \(\eqref{eq_general_field_b_repr}\), that we can explicitly write the \(\Jscr_{\mu \nu}\) matrix as follows
It follows from \(\eqref{eq_dirac_field_linearize_representation}\), and the fact, according to \(\eqref{eq_massless_little_group_r_matrix}\), that \(R(\theta)\) is the rotation in the \(xy\)-plane, that
for \(\theta\) infinitesimal. The linearized version of \(\eqref{eq_massless_field_spinor_u_condition_from_r}\), and its counterpart for \(v\), in this case is given by the following
It follows that
Next let’s linearize \(\eqref{eq_massless_field_spinor_u_condition_from_s}\) using the explicit formula \(\eqref{eq_massless_little_group_s_matrix}\) for \(S(a, b)\) as follows
which correspond to taking \(a\) infinitesimal and \(b = 0\), and \(b\) infinitesimal and \(a = 0\), respectively. Since \(u_{ab}\) are linearly independent, the above constraints are equivalent to the following
In order for \(u_{ab'}\) to be annihilated by the lowering operator, and for \(u_{a'b}\) to be annihilated by the raising operator, we must have
if \(u_{a'b'}(\kbf, \sigma) \neq 0\). Combining \(\eqref{eq_massless_general_field_ab_condition_1}\) and \(\eqref{eq_massless_general_field_ab_condition_2}\), we conclude that for the \(u\)-spinor to not vanish, we must have
It follows that a general massless \((A, B)\) field, according to \(\eqref{eq_general_field_defn_psi_field}\), can only destroy particles of helicity \(B-A\). A similar argument can be applied to the \(v\)-spinor, which, together with \(\eqref{eq_massless_general_field_ab_condition_for_v}\), implies that the field can only create antiparticles of helicity \(A-B\).
As a special case, we see once again that a massless helicity \(\pm 1\) field cannot be constructed as a vector field, i.e., a \(\left( \tfrac{1}{2}, \tfrac{1}{2} \right)\) field, because such a vector field could only describe helicity \(0\) by \(\eqref{eq_massless_general_field_helicity_condition}\). Indeed, the simplest massless helicity \(\pm 1\) field must be a \((1, 0) \oplus (0, 1)\) field, which is nothing but the anti-symmetric \(2\)-tensor \(f_{\mu \nu}\) defined by \(\eqref{eq_massless_vector_field_curvature_tensor}\).
The Feynman Rules#
In Cluster decomposable Hamiltonians, we’ve discussed the condition that the Hamiltonian must satisfy in order for the cluster decomposition principle to hold. Namely, the Hamiltonian can be written as a polynomial \(\eqref{eq_general_expansion_of_hamiltonian}\) of creation and annihilation operators in normal order, such that the coefficients contain exactly one momentum-conserving delta function. In order to derive this condition, we’ve encountered the idea of Feynman diagrams, which is a bookkeeping device for the evaluation of the S-matrix. We couldn’t say more about the coefficients besides the delta function because we had not introduced the building blocks of the Hamiltonian, namely, the quantum fields. Now that we’ve seen how to construct even the most general fields in the previous chapter, we’re ready to spell out the full details of Feynman diagrams, first in spacetime coordinates, and then in momentum space coordinates.
Warning
We’ll consider only fields of massive particles in this chapter.
Spacetime Feynman rules#
In this section, we’ll derive the Feynman rules in spacetime coordinates, which will establish a diagrammatic framework for calculating S-matrices. In the next section, we’ll translate these rules to the momentum space.
First, recall the S-matrix formulated in terms of the S-operator \(\eqref{eq_s_matrix_power_series_expansion_time_ordered_density}\) as follows
where \(T\{ \cdots \}\) is the time-ordered product defined by \(\eqref{eq_defn_time_ordered_product}\). As we’ve seen from the previous chapter, the interaction density \(\Hscr(x)\) may be written as a polynomial in fields and their adjoints as follows
where \(\Hscr_i(x)\) is a monomial in certain fields and their adjoints. Finally, we recall the general formula \(\eqref{eq_general_field_psi_field}\) for a quantum field as follows
where we’ve restored the particle species index \(n\), and absorbed the sign \((-1)^{2B}\) in \(\eqref{eq_general_field_psi_field}\) into the \(v\)-spinor. Unlike the notation used in Quantum Lorentz symmetry, the sub-index \(\ell\) here includes not only the running indexes of the Lorentz representation, but also the representation itself, as well as the particle species \(n\).
Warning
Any specific field of interest, whether it’s scalar, vector, or Dirac, must be first put into the form \(\eqref{eq_generic_field_expression}\) in order to use the Feynman rules that we’ll introduce below.
According to \(\eqref{eq_defn_annihilation_field}\) and \(\eqref{eq_defn_creation_field}\), we can also write, with obvious modifications, \(\psi_{\ell}(x) = \psi^+_{\ell}(x) + \psi^-_{\ell}(x)\) such that \(\psi^+_{\ell}(x)\) is a linear combination of annihilation operators, and \(\psi^-_{\ell}(x)\) is a linear combination of creation operators.
Now the idea of the Feynman rules to calculate the S-matrix is the same as what has been discussed in Cluster decomposable Hamiltonians. Namely, we’d like to move any annihilation operator to the right of a creation operator using the standard commutation rule \(\eqref{eq_creation_annihilation_commutator}\). To be more specific, we’ll list all the possible scenarios as follows
- Pairing a final particle (in out-state) \((\pbf, \sigma, n)\) with a field adjoint \(\psi^{\dagger}_{\ell}(x)\) gives
- \begin{equation} \underbracket{a(\pbf, \sigma, n) \psi^{\dagger}_{\ell}(x)} \coloneqq \left[ a(\pbf, \sigma, n), \psi^{\dagger}_{\ell}(x) \right]_{\pm} = (2\pi)^{-3/2} e^{-\ifrak p \cdot x} u^{\ast}_{\ell}(\pbf, \sigma, n) \label{eq_feynman_rule_a_psi_dagger} \end{equation}
- Pairing a final antiparticle \((\pbf, \sigma, n^c)\) with a field \(\psi_{\ell}(x)\) gives
- \begin{equation} \underbracket{a(\pbf, \sigma, n^c) \psi_{\ell}(x)} \coloneqq \left[ a(\pbf, \sigma, n^c), \psi_{\ell}(x) \right]_{\pm} = (2\pi)^{-3/2} e^{-\ifrak p \cdot x} v_{\ell}(\pbf, \sigma, n) \label{eq_feynman_rule_a_psi} \end{equation}
- Pairing a field \(\psi_{\ell}(x)\) with an initial particle (in in-state) \((\pbf, \sigma, n)\) gives
- \begin{equation} \underbracket{\psi_{\ell}(x) a^{\dagger}(\pbf, \sigma, n)} \coloneqq \left[ \psi_{\ell}(x), a^{\dagger}(\pbf, \sigma, n) \right]_{\pm} = (2\pi)^{-3/2} e^{\ifrak p \cdot x} u_{\ell}(\pbf, \sigma, n) \label{eq_feynman_rule_psi_a_dagger} \end{equation}
- Pairing a field adjoint \(\psi^{\dagger}_{\ell}(x)\) with an initial antiparticle \((\pbf, \sigma, n^c)\) gives
- \begin{equation} \underbracket{\psi^{\dagger}(x) a^{\dagger}(\pbf, \sigma, n^c)} \coloneqq \left[ \psi^{\dagger}(x), a^{\dagger}(\pbf, \sigma, n^c) \right]_{\pm} = (2\pi)^{-3/2} e^{\ifrak p \cdot x} v^{\ast}_{\ell}(\pbf, \sigma, n) \label{eq_feynman_rule_psi_dagger_a_dagger} \end{equation}
- Pairing a final particle \((\pbf, \sigma, n)\) (or antiparticle) with an initial particle \((\pbf', \sigma', n')\) (or antiparticle) gives
- \begin{equation} \underbracket{a(\pbf', \sigma', n') a^{\dagger}(\pbf, \sigma, n)} \coloneqq \left[ a(\pbf', \sigma', n'), a^{\dagger}(\pbf, \sigma, n) \right]_{\pm} = \delta^3(\pbf' - \pbf) \delta_{\sigma' \sigma} \delta_{n' n} \label{eq_feynman_rule_a_a_dagger} \end{equation}
- Pairing a field \(\psi_{\ell}(x)\) in \(\Hscr_i(x)\) with a field adjoint \(\psi_m^{\dagger}(y)\) in \(\Hscr_j(y)\) gives
- \begin{equation} \underbracket{\psi_{\ell}(x) \psi^{\dagger}_m(y)} \coloneqq \theta(x_0 - y_0) \left[ \psi^+_{\ell}(x), \psi^{+ \dagger}_m(y) \right]_{\pm} \mp \theta(y_0 - x_0) \left[ \psi^{- \dagger}_m(y), \psi^-_{\ell}(x) \right]_{\pm} \eqqcolon -\ifrak \Delta_{\ell m}(x, y) \label{eq_feynman_rule_propagator} \end{equation}
where \(\theta(\tau)\) is the step function which equals \(1\) for \(\tau > 0\) and vanishes for \(\tau < 0\). Here we remind ourselves once again that the Feynman rule is all about moving annihilation operators, e.g. \(\psi^+_{\ell}(x)\) and \(\psi^{- \dagger}_m(y)\), to the right of creation operators, e.g. \(\psi^-_{\ell}(x)\) and \(\psi^{+ \dagger}_m(y)\). The sign \(\mp\) in the middle is due to the fact that when the top sign is to be used, the particles are fermions, and hence the interchange of the fields due to time ordering requires an extra minus sign.
This quantity is known as a propagator, which will be evaluated in the next section.
Note
In the above listing, we’ve assumed that the item on the left in the (anti-)commutators also lies to the left of the item on the right in the (anti-)commutator in the vacuum expectation value in \(\eqref{eq_s_matrix_fully_expanded_by_timed_ordered_interaction_density}\) if we ignore the time ordering operator. The same applies to case (6) where \(\psi_{\ell}(x)\) is assumed to lie to the left of \(\psi_m^{\dagger}(y)\). In particular, if we assume that the interaction density \(\Hscr(x)\) is normally ordered in the sense that all the field adjoints lie to the left of the fields, then \(\Hscr_i(x)\) necessarily lies to the left of \(\Hscr_j(y)\), and we don’t have to define \(\Delta_{\ell m}(x, x)\), which would require certain regularization to avoid blowing up integrals.
The pairings listed above are commonly known as Wick contractions.
A great invention of Feynman is the following diagrammatic representation of the above rules, known as the Feynman diagrams.
A few comments are in order to clarify the meaning of these diagrams
The arrow points towards the (positive) direction of time, which is upwards for particles and downwards for antiparticles. In other words, we interpret an antiparticle as a particle that moves backwards in time. An exceptional case is (6), where the edge is placed horizontally. The reason is that a field or its adjoint doesn’t just create or destroy (anti-)particles – they create/destroy a particle and at the same time destroy/create the corresponding antiparticle, respectively. In light of \(\eqref{eq_feynman_rule_propagator}\), there is no reason to prefer either an upward or a downward arrow.
The arrow in (6) points from \((m, y)\) to \((\ell, x)\) since \(\psi_{\ell}(x)\) is a field and \(\psi^{\dagger}_m(y)\) is a field adjoint. Two processes happen in this scenario, namely, a particle created by \(\psi^{+ \dagger}_m(y)\) is absorbed by \(\psi^+_{\ell}(x)\), and an antiparticle created by \(\psi^-_{\ell}(x)\) is absorbed by \(\psi^{- \dagger}_m(y)\). The arrow is compatible with both processes.
In the case where the particle is its own antiparticle, the arrows in (1) – (6) will be omitted because one cannot tell apart a field and a field adjoint according to \(\eqref{eq_general_field_charge_inversion_transformation}\).
We didn’t draw the other scenario in (5) where an antiparticle is created and then destroyed without any interaction. In this case we need to flip the direction of the arrow.
Every node in the diagram, marked by a fat dot, corresponds to a monomial \(\Hscr_i(x)\) in \(\eqref{eq_interaction_density_as_sum_of_monomials}\). Moreover, for each node, there are as many incoming edges as there are fields, and as many outgoing edges as there are field adjoints.
With these basic building blocks at hand, we’re ready to evaluate \(\eqref{eq_s_matrix_fully_expanded_by_timed_ordered_interaction_density}\) using Feynman diagrams. The first thing to notice is that the S-matrix, as defined by \(\eqref{eq_s_matrix_fully_expanded_by_timed_ordered_interaction_density}\), should be viewed as a power series, where a term of order \(N\) corresponds to a monomial given as a product of powers, each of which is the \(N_i\)-th power of an interaction type \(g_i \Hscr_i(x)\) (see \(\eqref{eq_interaction_density_as_sum_of_monomials}\)), such that \(N = \sum_i N_i\).
To a given term of order \(N\), one can draw the associated Feynman diagrams in two steps as follows. The first step is to draw (on a piece of paper) one upward-pointing and one downward-pointing strand for each in-state particle and antiparticle, respectively, at the bottom; do the same to the out-state particles and antiparticles at the top; and draw in the middle \(N_i\) vertices for each interaction type \(i\), which corresponds to a monomial \(\Hscr_i(x)\), with incoming and outgoing edges corresponding to its fields and field adjoints, respectively. The second step is to connect any pair of open strands by an oriented edge if the particle in question is different from its antiparticle, and by an unoriented edge otherwise. Moreover, the vertices and the edges are labelled as illustrated in the figure above.
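To make the combinatorics of the second step concrete, here is a toy enumeration (with purely hypothetical strand labels) of all ways to join a set of open strands in pairs; each complete pairing, after discarding those forbidden by the particle/antiparticle and field/field-adjoint selection rules, corresponds to one Feynman diagram.

```python
def pairings(strands):
    """Yield every way of joining an even number of strands into unordered pairs."""
    if not strands:
        yield []
        return
    first, rest = strands[0], strands[1:]
    for i, partner in enumerate(rest):
        for tail in pairings(rest[:i] + rest[i + 1:]):
            yield [(first, partner)] + tail

# e.g. 3 external strands plus the 3 legs of a single hypothetical 3-field vertex:
strands = ['in_fermion', 'out_fermion', 'ext_boson', 'vtx_psi_dag', 'vtx_psi', 'vtx_phi']
assert sum(1 for _ in pairings(strands)) == 15  # 5!! pairings, before selection rules
```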
Now knowing how to draw a Feynman diagram of any given order for an interaction, we can spell out the recipe for a diagrammatic calculation of the S-matrix given by \(\eqref{eq_s_matrix_fully_expanded_by_timed_ordered_interaction_density}\) in the following steps.
Draw all (distinct) Feynman diagrams (in reality, to a finite order) following the rules described above. We’ll come back to what we mean by “distinct” right after the recipe.
For each diagram, we assign a factor \(-\ifrak\) to each vertex, corresponding to one factor of the power \((-\ifrak)^N\) in \(\eqref{eq_s_matrix_fully_expanded_by_timed_ordered_interaction_density}\); and a factor \(g_i\) to each vertex corresponding to the coefficient of \(\Hscr_i(x)\) in \(\eqref{eq_interaction_density_as_sum_of_monomials}\); and a factor in the listing of Feynman rules to each edge. Multiplying all the factors together and integrating over all the coordinates \(x_1, x_2, \cdots\), one for each vertex, we obtain a (numeric) valuation of the Feynman diagram.
The S-matrix is the “sum” over all the evaluations of the Feynman diagrams. Here the sum is put in quotation marks because we might do subtraction instead of addition when there are fermionic fields involved in the interaction. More precisely, for each Feynman diagram, one can move the fields and field adjoints over each other so that the two ends of every edge are next to each other (in the right order). Then we add a minus sign in front of the evaluation if such rearrangement involves an odd number of swaps between fermionic fields (or field adjoints).
These are really all one needs to evaluate S-matrices using Feynman diagrams, but a few further notes may be necessary to make it completely clear. Firstly, note that we’ve ignored the factor \(1 / N!\) in \(\eqref{eq_s_matrix_fully_expanded_by_timed_ordered_interaction_density}\) from our recipe above. The reason lies in the word “distinct” from the first step. More precisely, we consider two Feynman diagrams, which differ by a re-labeling of the vertices, to be the same. Since there are \(N!\) ways of labeling \(N\) vertices, we have already taken the fraction \(1 / N!\) into account by including only distinct diagrams.
Secondly, by the discussion in Cluster decomposable Hamiltonians, only connected Feynman diagrams will be included so that the resulting S-matrix satisfies the cluster decomposition principle.
The last note is more of a convention (for convenience), which aims to further remove duplications among Feynman diagrams. Here by duplication we mean diagrams that are not exactly the same but whose evaluations are the same. A basic example is when a monomial in the interaction density \(\Hscr(x)\) contains a power of the same field (or field adjoint). From the viewpoint of Feynman diagrams, it means that a vertex may have more than one identical attached (incoming or outgoing) strand. Now when other strands want to connect to these identical ones, they may choose which one to connect first, and second, and so on, but the result will be the same regardless of the choices. Hence it’s a convention to write the coefficient of \(\Hscr_i(x)\) as \(g_i / k!\) if it contains \(k\) identical fields (or field adjoints), so that in a diagrammatic calculation, one only needs to include one such diagram. Other numerical factors might be inserted in more complex situations, such as when two vertices with identical attached strands try to connect to each other, or when there is a loop of identical vertices. These cases are discussed in [Wei95] page 265 – 267, and we’ll come back to them when they become relevant in calculations.
To make things concrete and to prepare for the calculations in the next sections, we conclude the discussion of Feynman rules with two prototypical examples.
Example 1: \(\psi^{\dagger} \psi \phi\)-interaction#
Consider the following interaction density
where \(\psi(x)\) is a (complex) fermionic field, and \(\phi(x)\) is a real, i.e., \(\phi(x) = \phi^{\dagger}(x)\), bosonic field. This is an important type of interaction since it shows up not only in quantum electrodynamics, but in fact in the whole Standard Model of electromagnetic, weak, and strong interactions.
This type of interaction allows three kinds of scattering processes, namely, fermion-fermion, fermion-boson, and boson-boson scattering, which we’ll discuss one by one.
- Fermion-fermion scattering
The scattering is represented as \(12 \to 1'2'\), where all in- and out-state particles \(1, 2, 1', 2'\) are fermions. Up to the second order, there are two (connected) Feynman diagrams
where the solid (directed) edges represent the fermions, and the dashed (undirected) edges represent the (neutral) boson. More explicitly, the two diagrams correspond to the following two contractions
\begin{align} \underbracket{ a(1') \psi_{\ell}^{\dagger}(x) } \underbracket{ a(2') \psi_{\ell'}^{\dagger}(y) } \underbracket{ \phi_k(x) \phi_{k'}(y) } \underbracket{ \psi_m(x) a^{\dagger}(1) } \underbracket{ \psi_{m'}(y) a^{\dagger}(2) } \label{eq_fermion_fermion_scattering_1212} \\ \underbracket{ a(2') \psi_{\ell}^{\dagger}(x) } \underbracket{ a(1') \psi_{\ell'}^{\dagger}(y) } \underbracket{ \phi_k(x) \phi_{k'}(y) } \underbracket{ \psi_m(x) a^{\dagger}(1) } \underbracket{ \psi_{m'}(y) a^{\dagger}(2) } \label{eq_fermion_fermion_scattering_1221} \end{align}respectively. Moreover, comparing with the original order
\begin{equation} a(2') a(1') \psi^{\dagger}(x) \psi(x) \phi(x) \psi^{\dagger}(y) \psi(y) \phi(y) a^{\dagger}(1) a^{\dagger}(2) \label{eq_two_particles_scattering_original_order} \end{equation}we see that the contractions \(\eqref{eq_fermion_fermion_scattering_1212}\) and \(\eqref{eq_fermion_fermion_scattering_1221}\) require an even and odd number of swaps between fermionic operators, respectively. We note that whether a given diagram requires an even or odd number of fermionic swaps is rather arbitrary, and depends on many conventions. However, the fact that the two diagrams corresponding to \(\eqref{eq_fermion_fermion_scattering_1212}\) and \(\eqref{eq_fermion_fermion_scattering_1221}\) carry opposite signs is independent of the conventions and hence meaningful. Indeed, it’s another incarnation of the Fermi statistics in the sense that the S-matrix switches sign if either the in-state fermions \(1 \leftrightarrow 2\) or the out-state fermions \(1' \leftrightarrow 2'\) are swapped.
Now let’s use the Feynman rules \(\eqref{eq_feynman_rule_a_psi_dagger}\) – \(\eqref{eq_feynman_rule_propagator}\) to evaluate the fermion-fermion scattering S-matrix up to second order as follows [13]
\begin{align*} S^C_{\pbf'_1 \sigma'_1 n'_1,~\pbf'_2 \sigma'_2 n'_2;~~\pbf_1 \sigma_1 n_1,~\pbf_2 \sigma_2 n_2} &= \sum_{\ell m k, \ell' m' k'} (-\ifrak)^2 g_{\ell m k} g_{\ell' m' k'} \int d^4 x d^4 y \big( \eqref{eq_fermion_fermion_scattering_1212} - \eqref{eq_fermion_fermion_scattering_1221} \big) \\ &= (2\pi)^{-6} \sum_{\ell m k, \ell' m' k'} (-\ifrak)^2 g_{\ell m k} g_{\ell' m' k'} \int d^4 x d^4 y~(-\ifrak) \Delta_{k k'}(x, y) \times \\ &\phantom{=} \times e^{\ifrak p_1 \cdot x + \ifrak p_2 \cdot y} u_m(\pbf_1, \sigma_1, n_1) u_{m'}(\pbf_2, \sigma_2, n_2) \\ &\phantom{=} \times \left( e^{-\ifrak p'_1 \cdot x - \ifrak p'_2 \cdot y} u^{\ast}_{\ell}(\pbf'_1, \sigma'_1, n'_1) u^{\ast}_{\ell'}(\pbf'_2, \sigma'_2, n'_2) - e^{-\ifrak p'_2 \cdot x - \ifrak p'_1 \cdot y} u^{\ast}_{\ell}(\pbf'_2, \sigma'_2, n'_2) u^{\ast}_{\ell'}(\pbf'_1, \sigma'_1, n'_1) \right) \end{align*}
- Fermion-boson scattering
Consider the scattering \(12 \to 1'2'\), where particles \(1, 1'\) are fermions and \(2, 2'\) are bosons, under interaction density \(\eqref{eq_psi_dagger_psi_phi_interaction_density}\). Up to second order, there are again two Feynman diagrams as follows
They correspond to the following two contractions
\begin{equation} \underbracket{ a(2') \phi_k(x) } \underbracket{ a(1') \psi^{\dagger}_{\ell}(x) } \underbracket{ \psi_m(x) \psi_{\ell'}^{\dagger}(y) } \underbracket{ \psi_{m'}(y) a^{\dagger}(1) } \underbracket{ \phi_{k'}(y) a^{\dagger}(2) } \label{eq_fermion_boson_scattering_1212} \end{equation}and
\begin{equation} \underbracket{ a(2') \phi_{k'}(y) } \underbracket{ a(1') \psi_{\ell}^{\dagger}(x) } \underbracket{ \psi_m(x) \psi_{\ell'}^{\dagger}(y) } \underbracket{ \psi_{m'}(y) a^{\dagger}(1) } \underbracket{ \phi_k(x) a^{\dagger}(2) } \label{eq_fermion_boson_scattering_1221} \end{equation}respectively. Comparing with the ordering of operators \(\eqref{eq_two_particles_scattering_original_order}\), we see that neither \(\eqref{eq_fermion_boson_scattering_1212}\) nor \(\eqref{eq_fermion_boson_scattering_1221}\) requires any fermionic swap, and hence no extra signs are needed in this case, in contrast to the previous case of fermion-fermion scattering.
Next let’s use \(\eqref{eq_feynman_rule_a_psi_dagger}\) – \(\eqref{eq_feynman_rule_propagator}\) to evaluate the second order S-matrix as follows
\begin{align*} S^C_{\pbf'_1 \sigma'_1 n'_1,~\pbf'_2 \sigma'_2 n'_2;~\pbf_1 \sigma_1 n_1,~\pbf_2 \sigma_2 n_2} &= \sum_{\ell m k, \ell' m' k'} (-\ifrak)^2 g_{\ell m k} g_{\ell' m' k'} \int d^4 x d^4 y \ \big( \eqref{eq_fermion_boson_scattering_1212} + \eqref{eq_fermion_boson_scattering_1221} \big) \\ &= (2\pi)^{-6} \sum_{\ell m k, \ell' m' k'} (-\ifrak)^2 g_{\ell m k} g_{\ell' m' k'} \int d^4 x d^4 y~(-\ifrak) \Delta_{m \ell'}(x, y) \times \\ &\phantom{=} \times e^{-\ifrak p'_1 \cdot x + \ifrak p_1 \cdot y} u^{\ast}_{\ell}(\pbf'_1, \sigma'_1, n'_1) u_{m'}(\pbf_1, \sigma_1, n_1) \\ &\phantom{=} \times \left( e^{-\ifrak p'_2 \cdot x + \ifrak p_2 \cdot y} u^{\ast}_k(\pbf'_2, \sigma'_2, n'_2) u_{k'}(\pbf_2, \sigma_2, n_2) \ + e^{-\ifrak p'_2 \cdot y + \ifrak p_2 \cdot x} u^{\ast}_{k'}(\pbf'_2, \sigma'_2, n'_2) u_k(\pbf_2, \sigma_2, n_2) \right) \end{align*}
- Boson-boson scattering
It turns out that the lowest order at which boson-boson scattering occurs under the interaction density \(\eqref{eq_psi_dagger_psi_phi_interaction_density}\) is four. An example of such scattering is given by the following Feynman diagram
which involves a fermionic loop. We’ll not evaluate the corresponding S-matrix here, but note that the corresponding (fermionic) contraction
\begin{equation*} \underbracket{ \psi(x_1) \psi^{\dagger}(x_2) } \underbracket{ \psi(x_2) \psi^{\dagger}(x_3) } \underbracket{ \psi(x_3) \psi^{\dagger}(x_4) } \underbracket{ \psi(x_4) \psi^{\dagger}(x_1) } \end{equation*}where we’ve ignored the terms involving the bosonic operators, requires an odd number of fermionic swaps, which, in turn, requires an extra minus sign. This is a general phenomenon for any diagram that involves a fermionic loop.
Example 2: \(\phi^3\)-interaction#
Now let’s consider an interaction density that involves a power of the same field as follows
where \(g_{\ell m k}\) is totally symmetric, and the extra factor \(1/3!\) is to account for this symmetry.
We’ll evaluate the scattering amplitude of two (identical) bosons \(12 \to 1'2'\). In this case, there are three connected Feynman diagrams of second order as follows
They correspond to the following three contractions
respectively. We note once again that the factor \(1/3!\) in \(\eqref{eq_phi3_interaction_density}\) allows us to include just the above three contractions, and ignore the other ones obtained by, say, permuting \(\ell, m\), and \(k\) (or equivalently \(\ell', m'\), and \(k'\)). With this in mind, we can now calculate the (connected) S-matrix using \(\eqref{eq_feynman_rule_a_psi_dagger}\) – \(\eqref{eq_feynman_rule_propagator}\) as follows
A special case is when \(\phi\) is a scalar field so that \(\eqref{eq_phi3_interaction_density}\) takes the following form
In this case, the S-matrix can be simplified as follows
where \(\Delta_F\) is the so-called Feynman propagator and will be discussed in detail in the next section.
Momentum space Feynman rules#
There turn out to be many advantages in working in the \(4\)-momentum space instead of spacetime. Hence the goal of this section is to translate the Feynman rules \(\eqref{eq_feynman_rule_a_psi_dagger}\) – \(\eqref{eq_feynman_rule_propagator}\) to the momentum space. The biggest challenge, however, lies in the fact that so far we’ve been working exclusively under the assumption that the \(4\)-momentum lies on the mass shell, whether in the spinors appearing in \(\eqref{eq_feynman_rule_a_psi_dagger}\) – \(\eqref{eq_feynman_rule_psi_a_dagger}\) or in the propagator \(\eqref{eq_feynman_rule_propagator}\). We’ll tackle this challenge in three steps. First, we’ll rewrite the propagator as an integral on the momentum space, as opposed to the integral \(\eqref{eq_defn_Delta_plus}\) defined on the mass shell. This will then allow us to translate the Feynman rules to the momentum space, except for the external edges which still live on the mass shell because the in- and out-state particles do so. Finally, we’ll discuss how to generalize the Feynman rules so that the “external lines” do not necessarily have to live on the mass shell.
Propagator in momentum space#
Plugging \(\eqref{eq_generic_field_expression}\) into \(\eqref{eq_feynman_rule_propagator}\), we can rewrite the propagator in terms of the spin sums as follows
where the top and bottom signs apply to bosonic and fermionic fields, respectively.
Then, in light of \(\eqref{eq_general_field_spin_sums_as_pi}\) and \(\eqref{eq_general_field_spin_sum_is_polynomial}\), we can write the spin sums as polynomials in (on-mass-shell) \(4\)-momenta as follows
where the top and bottom signs correspond to bosonic and fermionic fields, respectively. Note that although according to \(\eqref{eq_general_field_spin_sums_as_pi}\), the spin sums for \(u\) and \(v\) can be made the same (with respect to some basis of representation), we’ve introduced here an additional sign for \(v\) using the symmetry property \(\eqref{eq_general_field_spin_sum_polynomial_parity}\). In particular, as we’ll see right below, this extra sign is also needed to be consistent with our previous calculations for Dirac fields.
Before we continue evaluating the propagator, let’s take a moment to figure out what \(P_{\ell m}(p)\) looks like in the cases of scalar, vector, Dirac, and general \((A, B)\) fields. The case of scalar fields is the simplest, and it follows from \(\eqref{eq_scalar_u_and_v}\) that
Next, for spin-\(1\) vector fields, it follows from \(\eqref{eq_vector_field_Pi_matrix}\) and \(\eqref{eq_vector_field_defn_Pi}\) that
which is even in \(p\), and hence consistent with \(\eqref{eq_spin_sum_v_as_polynomial}\).
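Assuming the spin sum takes the standard massive spin-\(1\) form \(\Pi_{\mu \nu}(p) = \eta_{\mu \nu} + p_{\mu} p_{\nu} / m^2\) (which is presumably what the formula above amounts to, up to conventions), the evenness in \(p\) and the on-shell transversality can be spot-checked numerically as follows.

```python
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])
m = 1.3
pvec = np.array([0.4, -0.2, 0.9])
p_up = np.concatenate(([np.sqrt(pvec @ pvec + m ** 2)], pvec))  # on-shell p^mu
p_dn = eta @ p_up                                               # p_mu

def Pi(q):
    """Assumed spin-1 spin sum, as a function of the index-lowered momentum q."""
    return eta + np.outer(q, q) / m ** 2

assert np.allclose(Pi(p_dn), Pi(-p_dn))   # even in p, as claimed
assert np.allclose(p_up @ Pi(p_dn), 0.0)  # p^mu Pi_{mu nu} = 0 on the mass shell
```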
Then for spin-\(1/2\) Dirac fields, it follows from \(\eqref{eq_dirac_field_n_matrix_as_spinor_sum}\) – \(\eqref{eq_dirac_field_m_matrix_as_spinor_sum}\) and \(\eqref{eq_dirac_field_spin_sum_u}\) – \(\eqref{eq_dirac_field_spin_sum_v}\) that
We see that \(\eqref{eq_dirac_field_spin_sum_v}\) is consistent with \(\eqref{eq_spin_sum_v_as_polynomial}\), thanks to the sign convention.
Finally, for any \((A, B)\) field, the polynomial \(P_{ab, a'b'}(p) = \pi_{ab, a'b'}(p)\) (on the mass shell) is given by \(\eqref{eq_general_field_spin_sum}\). We note that \(\eqref{eq_spin_sum_v_as_polynomial}\) is consistent with \(\eqref{eq_general_field_spin_sums_as_pi}\) due to the parity symmetry \(\eqref{eq_general_field_spin_sum_polynomial_parity}\), which we only proved in a special case.
Back to the evaluation of the propagator \(\eqref{eq_propagator_as_spin_sums}\). By plugging \(\eqref{eq_spin_sum_u_as_polynomial}\) and \(\eqref{eq_spin_sum_v_as_polynomial}\) into \(\eqref{eq_propagator_as_spin_sums}\), we can further rewrite the propagator as follows
where \(\Delta_+(x)\) is defined by \(\eqref{eq_defn_Delta_plus}\), and \(P_{\ell m}(p)\) takes the form of \(\eqref{eq_general_field_spin_sum_as_polynomial}\) and is linear in \(p_0 = \sqrt{\pbf^2 + M^2}\). In particular, we see that \(\Delta_{\ell m}(x, y)\) is really a function of \(x - y\), and hence can be written as
Now we must remember that although \(P_{\ell m}(p)\) for scalar, vector, and Dirac fields, look like a polynomial defined generally on the momentum space, they’re really only defined on the mass shell, as shown in \(\eqref{eq_general_field_spin_sum_as_polynomial}\) for general fields. We’ll first try the poor man’s extension of \(P_{\ell m}(p)\) to a genuine momentum space polynomial by the following definition
where \(P_{\ell m}^{(0)}, P_{\ell m}^{(1)}\) correspond to \(P_{ab, a'b'}, 2Q_{ab, a'b'}\) in \(\eqref{eq_general_field_spin_sum_as_polynomial}\), respectively. Clearly \(P^{(L)}(p) = P(p)\) when \(p\) is on the mass shell, or equivalently \(p_0 = \sqrt{\pbf^2 + M^2}\). Here the superscript \(L\) stands for linear, since \(P^{(L)}(p)\) is linear in \(p_0\). This linearity, as we’ll now demonstrate, turns out to be the key feature of this rather naive extension.
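The reduction to a \(p_0\)-linear form is just polynomial division by the mass-shell relation. Here is a tiny SymPy illustration with a made-up polynomial; the symbol \(E\) stands for \(\sqrt{\pbf^2 + M^2}\).

```python
import sympy as sp

p0, E = sp.symbols('p0 E')
P = 3 * p0 ** 4 + 2 * p0 ** 3 - p0 + 5          # a hypothetical polynomial in p0
_, remainder = sp.div(P, p0 ** 2 - E ** 2, p0)  # reduce modulo p0^2 = E^2
assert sp.degree(remainder, p0) <= 1            # on-shell form is (at most) linear in p0
```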
Since the step function \(\theta(\tau)\), which equals \(1\) for \(\tau > 0\) and vanishes for \(\tau < 0\), has the following property
we can rewrite \(\eqref{eq_propagator_as_delta_plus}\) as follows
where the blue terms vanish since \(\Delta_+(x)\) is even if \(x_0 = 0\).
Define the Feynman propagator \(\Delta_F(x)\) by
which, by the way, is also the propagator for scalar fields. Then the general propagator can be derived from it as follows
Now we’ll evaluate \(\eqref{eq_defn_feynman_propagator}\) as an integral over the momentum space. The key trick is to express the step function \(\theta(t)\) as follows
where \(\epsilon > 0\) is infinitesimally small. The fact that \(\theta(t)\) vanishes for \(t < 0\) and equals \(1\) for \(t > 0\) can be easily verified using the residue theorem. Plugging \(\eqref{eq_theta_step_function_as_contour_integral}\) and \(\eqref{eq_defn_Delta_plus}\) into \(\eqref{eq_defn_feynman_propagator}\), we can calculate as follows
where in the last equality we’ve replaced \(2\sqrt{\pbf^2 + M^2} \epsilon\) with \(\epsilon\), and ignored the \(\epsilon^2\) term. It follows that
Plugging \(\eqref{eq_feynman_propagator_as_momentum_space_integral}\) into \(\eqref{eq_general_propagator_from_feynman_propagator}\), we can express the propagator as an integral over the entire momentum space (without the mass-shell constraint) as follows
This expression is, however, not Lorentz covariant in general since \(P^{(L)}_{\ell m}(p)\), being linear in \(p_0\), is not. More precisely, we’d like \(P_{\ell m}(p)\) to satisfy the following
where we recall that the \(m\) index corresponds to a field adjoint according to \(\eqref{eq_feynman_rule_propagator}\). Nonetheless, we see from \(\eqref{eq_p_polynomial_scalar}\) – \(\eqref{eq_p_polynomial_dirac}\) that the polynomial \(P\) is Lorentz covariant in the case of scalar, vector, and Dirac fields. Among these three cases, the only case where \(P\) is not linear in \(p_0\) is the vector field. Indeed, we have the following from \(\eqref{eq_p_polynomial_vector}\)
Plugging into \(\eqref{eq_propagator_as_momentum_space_integral_linear}\), we have
We see that the price to pay for making \(P_{\mu \nu}(p)\) Lorentz covariant is the blue term, which is local in the sense that it’s non-vanishing only when \(x = y\).
Todo
It’s claimed in [Wei95] page 278 – 279 that such an extra non-covariant term may be countered by a modification to the interaction density, but I have had a hard time seeing why that’s the case. In order to not get stuck at this point, we’ll ignore the difference between \(P^{(L)}\) and \(P\) in the momentum space integral \(\eqref{eq_propagator_as_momentum_space_integral_linear}\) and write
and remember to be extra careful when, in a concrete case, \(P\) is not linear in \(p_0\).
Feynman rules in momentum space#
To turn the Feynman rules derived in Spacetime Feynman rules, specifically \(\eqref{eq_feynman_rule_a_psi_dagger}\) – \(\eqref{eq_feynman_rule_propagator}\), from an integral over spacetime coordinates as in \(\eqref{eq_s_matrix_fully_expanded_by_timed_ordered_interaction_density}\) to an integral over the momentum space coordinates, we need to integrate out the \(x\) coordinates. Indeed, as can be seen from \(\eqref{eq_feynman_rule_a_psi_dagger}\) – \(\eqref{eq_feynman_rule_psi_dagger_a_dagger}\), as well as \(\eqref{eq_propagator_as_momentum_space_integral}\), the \(x\) variables appear only in the exponential terms. More precisely, at each vertex, the only term that involves \(x\) is the following
where \(p_{\text{in}}\) and \(p_{\text{out}}\) denote the momenta of the in- and out-state (anti-)particles that connect to the vertex, respectively, if there are any, and \(p_{\text{entering}}\) and \(p_{\text{leaving}}\) denote the momenta on the internal edges entering and leaving the vertex, respectively. Integrating \(x\) out, we get the following momentum-conservation factor
assigned to each vertex.
Now with all the \(x\) coordinates integrated out, we can reformulate the diagrammatic Feynman rules as follows
To evaluate the (connected) S-matrix of the form \(\eqref{eq_s_matrix_fully_expanded_by_timed_ordered_interaction_density}\), we follow the steps explained in Spacetime Feynman rules with one modification:
Associate to each vertex a factor \(-\ifrak (2\pi)^4 \delta^4 \left( \sum p_{\text{in}} - \sum p_{\text{out}} + \sum p_{\text{entering}} - \sum p_{\text{leaving}} \right)\), and to each edge a factor as indicated in the figure above.
Indeed, the momentum-conservation delta functions at the vertices correspond precisely to the delta functions that appeared in Cluster decomposable Hamiltonians. Hence the same argument implies, once again, that the connected S-matrix contains exactly one momentum-conservation delta function.
Before moving onto the discussion about the external edges, let’s revisit the example calculations of S-matrices in \(\psi^{\dagger} \psi \phi\)-interaction in momentum space as follows.
- Fermion-boson scattering in momentum space
The goal is to re-evaluate the (connected) S-matrix \(S^C_{1'2',12}\), where \(1\) stands for the fermion, and \(2\) stands for the boson, in momentum space. Using the momentum space Feynman rules, together with obvious abbreviations, the calculation goes as follows.
\begin{align*} S^C_{1'2',12} &= \sum_{\ell m k, \ell' m' k'} (-\ifrak)^{2+1} (2\pi)^{8-6-4} g_{\ell m k} g_{\ell' m' k'} \int d^4 p~\frac{P_{m \ell'}(p)}{p^2 + M^2 - \ifrak \epsilon} u_{m'}(1) u^{\ast}_{\ell}(1') \\ &\phantom{=~} \times \left( u_{k'}(2) u^{\ast}_k(2') \delta^4(p-p_1-p_2) \delta^4(p'_1+p'_2-p) + u_k(2) u^{\ast}_{k'}(2') \delta^4(p+p'_2-p_1) \delta^4(p'_1-p_2-p) \right) \\ &= \ifrak (2\pi)^{-2} \delta^4(p'_1+p'_2-p_1-p_2) \sum_{\ell m k, \ell' m' k'} g_{\ell m k} g_{\ell' m' k'} u_{m'}(1) u^{\ast}_{\ell}(1') \\ &\phantom{=~} \times \left( \frac{P_{m \ell'}(p_1+p_2)}{(p_1+p_2)^2+M^2-\ifrak \epsilon} u_{k'}(2) u^{\ast}_k(2') + \frac{P_{m \ell'}(p_1-p'_2)}{(p_1-p'_2)^2+M^2-\ifrak \epsilon} u_k(2) u^{\ast}_{k'}(2') \right) \\ &= \ifrak (2\pi)^{-2} \delta^4(p'_1+p'_2-p_1-p_2) \sum_{k, k'} \bigg( \phantom{)} \\ &\phantom{=~} \left( \blue{u^{\dagger}(1') \Gamma_k \frac{P(p_1+p_2)}{(p_1+p_2)^2+M^2-\ifrak \epsilon} \Gamma_{k'} u(1)} \right) u_{k'}(2) u^{\ast}_k(2') \\ &\phantom{=~(} + \left( \blue{u^{\dagger}(1') \Gamma_k \frac{P(p_1-p'_2)}{(p_1-p'_2)^2+M^2-\ifrak \epsilon} \Gamma_{k'} u(1)} \right) u_k(2) u^{\ast}_{k'}(2') \bigg) \end{align*}where the blue terms are supposed to be understood as matrix multiplications with
\begin{equation} (\Gamma_k)_{\ell m} \coloneqq g_{\ell m k} \label{eq_three_way_coupling_constant_as_matrix} \end{equation}and \(u^{\dagger}\) as a transpose-conjugated row vector, and \(M\) is the mass of the intermediate fermion.
One can see from the calculation above that a general S-matrix can readily be read off from the Feynman diagrams, and is made up of the following pieces
An appropriate power of \(\ifrak\) and \(\pi\) determined by the number of edges and vertices.
A momentum-conservation delta function equating the total momenta of the in- and out-state particles.
A summation of (matrix) products of field coefficients, propagator integrands, and coupling constants, one for each Feynman diagram. In particular, the momentum carried by a propagator is determined by the momentum-conservation law at each vertex. A schematic sketch of how these pieces are assembled is given after the examples below.
- Fermion-fermion scattering in momentum space
Using the observation we made from the previous calculation, we can immediately write down the S-matrix \(S^C_{1'2',12}\), where both \(1\) and \(2\) are fermions, as follows
\begin{align*} S^C_{1'2',12} &= \ifrak (2\pi)^{-2} \delta^4(p'_1+p'_2-p_1-p_2) \sum_{k, k'} \bigg( \phantom{)} \\ &\phantom{=~} \frac{P_{k k'}(p'_1 - p_1)}{(p'_1-p_1)^2 + M^2 - \ifrak\epsilon} \left( u^{\dagger}(1) \Gamma_k u(1') \right) \left( u^{\dagger}(2) \Gamma_{k'} u(2') \right) \\ &\phantom{=(} - \frac{P_{k k'}(p'_2-p_1)}{(p'_2-p_1)^2 + M^2 - \ifrak\epsilon} \left( u^{\dagger}(1) \Gamma_k u(2') \right) \left( u^{\dagger}(2) \Gamma_{k'} u(1') \right) \bigg) \end{align*}
- Boson-boson scattering in momentum space
We didn’t actually calculate the (4th order) boson-boson scattering S-matrix \(S^C_{1'2',12}\) in spacetime coordinates due to its complexity. But it becomes much simpler in momentum space coordinates, and can be calculated as follows. First, let’s figure out the powers of \(\ifrak\) and \(\pi\), respectively. Since there are \(4\) vertices and \(4\) internal edges, each of which contribute one \(-\ifrak\), we get a contribution of \((-\ifrak)^8 = 1\). Moreover, since there are equal numbers of vertices and internal edges, each of which contribute \((2\pi)^4\) and \((2\pi)^{-4}\), respectively, and \(4\) external edges, each of which contribute \((2\pi)^{-3/2}\), we get a total contribution of \((2\pi)^{-6}\). Remembering an additional minus sign coming from the fermionic loop, we have the following
\begin{align*} S^C_{1'2',12} &= -(2\pi)^{-6} \delta^4(p'_1+p'_2-p_1-p_2) \sum_{k_1, k_2, k'_1, k'_2} u^{\ast}_{k'_1}(1') u^{\ast}_{k'_2}(2') u_{k_1}(1) u_{k_2}(2) \\ &\phantom{=~} \times \int d^4 p~\op{Tr} \bigg( \frac{P(p)}{p^2+M^2-\ifrak\epsilon} \Gamma_{k'_1} \frac{P(p-p'_1)}{(p-p'_1)^2+M^2-\ifrak\epsilon} \Gamma_{k'_2} \frac{P(p-p'_1-p'_2)}{(p-p'_1-p'_2)^2+M^2-\ifrak\epsilon} \Gamma_{k_2} \\ &\phantom{=~} \times \frac{P(p-p'_1-p'_2+p_2)}{(p-p'_1-p'_2+p_2)^2+M^2-\ifrak\epsilon} \Gamma_{k_1} \bigg) + \cdots \end{align*}where \(\cdots\) denotes valuations of other (4th order) Feynman diagrams.
External edges off the mass shell#
By transforming propagators into integrals over the \(4\)-momentum space as in \(\eqref{eq_propagator_as_momentum_space_integral}\), and integrating out the spacetime coordinates, we’ve been able to express the S-matrix as an integral over a product of \(4\)-momentum space coordinates, one for each internal edge that doesn’t split the Feynman diagram into disconnected components. In particular, as we’ve seen in several examples from the previous section, the S-matrix involves no momentum space integrals at all when the Feynman diagram is a tree.
However, the external edges are still confined to the mass shell since the in- and out-state particles are. We'd like to relax this confinement and remove the mass-shell constraints altogether. Doing so will allow us to calculate contributions from a large Feynman diagram in terms of its smaller local pieces, and will facilitate the derivation of the path integral formalism later. The idea of a “local” Feynman diagram, i.e., a diagram without external edges, is to attach additional “vertices at infinity” at the places of the in- and out-state particles, which turn the external edges into internal ones. Moreover, unlike the (internal) vertices, whose spacetime coordinates will be integrated as in \(\eqref{eq_s_matrix_fully_expanded_by_timed_ordered_interaction_density}\), the spacetime coordinates of the added vertices will be kept intact. In particular, no momentum conservation law will be imposed on these vertices.
It turns out that this procedure can be generalized to a field theory with external interaction, in which case the interaction-picture perturbation term takes the following form
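\begin{equation*} V_{\epsilon}(t) = V(t) + \sum_a \int d^3 x~\epsilon_a(t, \xbf) o_a(t, \xbf) \end{equation*}at least schematically.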
Here \(\epsilon_a(t, \xbf)\) are the infinitesimal parameters, and \(o_a(t, \xbf)\) are the external currents in the interaction picture in the following sense
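\begin{equation*} o_a(t, \xbf) = e^{\ifrak H_0 t} o_a(0, \xbf) e^{-\ifrak H_0 t} \end{equation*}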
in analogy to \(\eqref{eq_defn_interaction_perturbation_term}\).
Now under the perturbation \(V_{\epsilon}(t)\), the S-matrix, whose entry \(S_{\beta\alpha}\) is a complex number, becomes a functional \(S_{\beta\alpha}[\epsilon]\) in complex functions \(\epsilon_a\). The functional derivative
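\begin{equation*} \frac{\delta^r S_{\beta\alpha}[\epsilon]}{\delta \epsilon_{a_1}(x_1) \cdots \delta \epsilon_{a_r}(x_r)} \bigg|_{\epsilon = 0} \end{equation*}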
can be evaluated using the usual Feynman rules. Indeed, besides the internal vertices coming from the interaction density \(\Hscr(t, \xbf)\), we must also include external vertices coming from \(o(t, \xbf)\). For example, if \(o(t, \xbf)\) are monomials of fields and field adjoints, then there will be \(r\) external vertices, each of which has as many incoming and outgoing edges as there are fields and field adjoints in the corresponding \(o(t, \xbf)\), respectively. In particular, if \(o(t, \xbf)\) are all degree one, then we recover the off-mass-shell Feynman diagrams, as promised.
The Canonical Formalism#
The quantum theory we’ve been developing so far has been based almost solely on the symmetry principles, especially Lorentz symmetries. This is a very satisfying approach since it’s logically clean and relies only on the most fundamental principles, however, this is not the way quantum theory historically had been developed. Not surprisingly, the original development of quantum theory is much messier and requires substantial experience in “classical” physics. It’s largely based on the so-called Lagrangian formalism, which is a readily well-established principle in classical physics and can be “quantized”. The main goal of this chapter is to go through this formalism, not for historical sake, but because it offers a particularly convenient way to construct Hamiltonians that generate Lorentz-invariant S-matrices, which has been difficult for us as can be seen in Feynman rules in momentum space.
Canonical variables#
We’ve seen in Quantum Fields and Antiparticles a few ways of constructing (Lorentz-invariant) interaction densities. However, we don’t have a systematic way to do so. The so-called Lagrangian formalism will not provide a systematic solution either, but it’ll allow us to construct more interesting interaction densities (from classical physics theories), to the extent that all known quantum field theories arise in this way! In addition, it’ll shed light on the mysterious local terms as for example in \(\eqref{eq_vector_field_propagator_needs_local_term}\), that are needed to compensate for a Lorentz-invariant momentum space propagator.
What the Lagrangian formalism offers for constructing a quantum field theory is the following. Instead of using the fields \(\eqref{eq_defn_annihilation_field}\) – \(\eqref{eq_defn_creation_field}\) to construct the Hamiltonians, we'll use the so-called canonical variables, which have particularly simple (equal time) commutation relations. More precisely, they consist of a collection of quantum operators \(q_n(t, \xbf)\) and their canonical conjugates \(p_n(t, \xbf)\), which satisfy the following (anti-)commutation relations
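\begin{align} \left[ q_n(t, \xbf), p_m(t, \ybf) \right]_{\pm} &= \ifrak \delta^3(\xbf-\ybf) \delta_{nm} \label{eq_canonical_commutation_relation_1} \\ \left[ q_n(t, \xbf), q_m(t, \ybf) \right]_{\pm} &= 0 \label{eq_canonical_commutation_relation_2} \\ \left[ p_n(t, \xbf), p_m(t, \ybf) \right]_{\pm} &= 0 \label{eq_canonical_commutation_relation_3} \end{align}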
where the \(\pm\) sign corresponds to the cases where the particle in question is fermionic or bosonic, respectively.
To see how canonical variables may be constructed from fields considered in Quantum Fields and Antiparticles, let’s consider a few examples.
- Scalar fields
Let’s start by considering scalar fields of particles that are their own antiparticles. Using notations from Scalar fields, it means that \(\psi(x) = \psi^{ \dagger}(x)\), i.e., the field is Hermitian. It follows then from \(\eqref{eq_scalar_field_commutator}\) and \(\eqref{eq_defn_Delta}\) that
\begin{equation} \left[ \psi(x), \psi(y) \right] = \Delta(x-y) = \frac{1}{(2\pi)^3} \int \frac{d^3 p}{2p_0} \left( e^{\ifrak p \cdot (x-y)} - e^{-\ifrak p \cdot (x-y)} \right) \label{eq_scalar_field_commutation_relation_reproduced} \end{equation}where \(p_0 = \sqrt{\pbf^2 + m^2}\).
We claim that the canonical commutation relations \(\eqref{eq_canonical_commutation_relation_1}\) – \(\eqref{eq_canonical_commutation_relation_3}\) are satisfied by
\begin{align} q(t, \xbf) &\coloneqq \psi(t, \xbf) \label{eq_defn_q_scalar_field_self_dual} \\ p(t, \xbf) &\coloneqq \dot{\psi}(t, \xbf) \label{eq_defn_p_scalar_field_self_dual} \end{align}Indeed, it follows from the following calculations
\begin{alignat}{2} \left[ q(t, \xbf), p(t, \ybf) \right] &= \left[ \psi(t, \xbf), \dot{\psi}(t, \ybf) \right] &&= -\dot{\Delta}(0, \xbf-\ybf) = \ifrak \delta^3(\xbf-\ybf) \nonumber \\ \left[ q(t, \xbf), q(t, \ybf) \right] &= \left[ \psi(t, \xbf), \psi(t, \ybf) \right] &&= \Delta(0, \xbf-\ybf) = 0 \label{eq_canonical_commutator_scalar_field_self_dual_qq} \\ \left[ p(t, \xbf), p(t, \ybf) \right] &= \left[ \dot{\psi}(t, \xbf), \dot{\psi}(t, \ybf) \right] &&= -\ddot{\Delta}(0, \xbf-\ybf) = 0 \nonumber \end{alignat}Now for particles that are different from their antiparticles, we must modify \(\eqref{eq_defn_q_scalar_field_self_dual}\) – \(\eqref{eq_defn_p_scalar_field_self_dual}\) as follows
\begin{align*} q(t, \xbf) &= \psi(t, \xbf) \\ p(t, \xbf) &= \dot{\psi}^{\dagger}(t, \xbf) \end{align*}and note that in this case \(\left[ \psi(t, \xbf), \psi(t', \ybf) \right] = 0\), in contrast to \(\eqref{eq_canonical_commutator_scalar_field_self_dual_qq}\).
- Spin-\(1\) vector fields
Consider once again particles that are self-charge-dual. Using notations from Spin-1 vector fields, we recall the commutation relation \(\eqref{eq_vector_field_commutator}\) as follows
\begin{equation*} \left[ \psi_{\mu}(x), \psi_{\nu}(y) \right] = \left( \eta_{\mu\nu} - \frac{\p_{\mu} \p_{\nu}}{m^2} \right) \Delta(x-y) \end{equation*}The canonical variables in this case can be defined as follows
\begin{align} q_i(t, \xbf) &= \psi_i(t, \xbf) \label{eq_defn_q_vector_field_self_dual} \\ p_i(t, \xbf) &= \dot{\psi}_i(t, \xbf) - \frac{\p \psi_0(t, \xbf)}{\p x_i} \label{eq_defn_p_vector_field_self_dual} \end{align}where \(i=1,2,3\). Indeed, let’s calculate the equal-time commutators as follows
\begin{align*} \left[ q_i(t, \xbf), p_j(t, \ybf) \right] &= \left[ \psi_i(t, \xbf), \dot{\psi}_j(t, \ybf) \right] - \left[ \psi_i(t, \xbf), \frac{\p \psi_0(t, \ybf)}{\p y_j} \right] \\ &= -\left( \eta_{ij} -\frac{\p_i \p_j}{m^2} \right) \dot{\Delta}(0, \xbf-\ybf) - \left. \frac{\p_i \p_0}{m^2} \right|_{t=0} \left( \p_j \Delta(t, \xbf-\ybf) \right) \\ &= \ifrak \delta^3(\xbf-\ybf) \delta_{ij} \\ \left[ q_i(t, \xbf), q_j(t, \ybf) \right] &= \left( \eta_{ij} - \frac{\p_i \p_j}{m^2} \right) \Delta(0, \xbf-\ybf) = 0 \\ \left[ p_i(t, \xbf), p_j(t, \ybf) \right] &= \left[ \dot{\psi}_i(t, \xbf), \dot{\psi}_j(t, \ybf)\right] + \p_{x_i} \p_{y_j} \left[ \psi_0(t, \xbf), \psi_0(t, \ybf) \right] \\ &\phantom{=} - \p_{x_i} \left[ \psi_0(t, \xbf), \dot{\psi}_j(t, \ybf) \right] - \p_{y_j} \left[ \dot{\psi}_i(t, \xbf), \psi_0(t, \ybf) \right] = 0 \end{align*}We've omitted some details about the vanishing of the last quantities – it turns out that the first and second terms cancel out, and the third and fourth terms also cancel out.
In any case, we’ve constructed three pairs of canonical variables, one for each spatial index. But what about the time index? It turns out that \(\psi_0\) is not an independent variable. Indeed, we can derive from \(\eqref{eq_defn_p_vector_field_self_dual}\) an expression of \(\psi_0\) as follows
\begin{align*} & & p_i & = \p_0 \psi_i - \p_i \psi_0 \\ & \xRightarrow{\phantom{\eqref{eq_klein_gordon}}} & \p_i p_i & = \p_0 \p_i \psi_i - \p^2_i \psi_0 \\ & \xRightarrow{\phantom{\eqref{eq_klein_gordon}}} & \nabla \cdot \pbf & = \p_0 \sum_{i=1}^3 \p_i \psi_i - \sum_{i=1}^3 \p^2_i \psi_0 \\ & \xRightarrow{\eqref{eq_vector_field_gauge_fixing_condition}} & \nabla \cdot \pbf & = \p_0^2 \psi_0 - \sum_{i=1}^3 \p_i^2 \psi_0 = -\square \psi_0 \\ & \xRightarrow{\eqref{eq_klein_gordon}} & \psi_0 & = -m^{-2} \nabla \cdot \pbf \end{align*}
- Spin-\(1/2\) Dirac fields
Recall the anti-commutator of Dirac fields \(\eqref{eq_dirac_field_commutator}\) as follows
\begin{equation*} \left[ \psi_{\ell}(x), \psi^{\dagger}_{\ell'}(y) \right]_+ = \left( (-\gamma^{\mu} \p_{\mu} + m) \beta \right)_{\ell \ell'} \Delta(x-y) \end{equation*}where \(\ell, \ell'\) are indexes corresponding to the two spin \(z\)-components \(\pm 1/2\). Assuming that the particle in question has a distinct antiparticle, i.e., it's not a Majorana fermion, the following holds trivially
\begin{equation*} \left[ \psi_{\ell}(x), \psi_{\ell'}(y) \right]_+ = 0 \end{equation*}It follows that the canonical variables can be defined by
\begin{align*} q_{\ell}(x) &= \psi_{\ell}(x) \\ p_{\ell}(x) &= \ifrak \psi^{\dagger}_{\ell}(x) \end{align*}Indeed, the only nontrivial (and non-vanishing) anti-commutator can be calculated as follows
\begin{align*} \left[ q_{\ell}(t, \xbf), p_{\ell'}(t, \ybf) \right]_+ &= \ifrak \left[ \psi_{\ell}(t, \xbf), \psi_{\ell'}^{\dagger}(t, \ybf) \right]_+ \\ &= -\ifrak \left( \gamma^0 \beta \right)_{\ell \ell'} \dot{\Delta}(0, \xbf-\ybf) \\ &= \ifrak \delta^3(\xbf-\ybf) \delta_{\ell \ell'} \end{align*}
Through these examples, we see that there is no particular pattern in how one may define canonical variables. In fact, one doesn't really define canonical variables in this way either – they are simply taken for granted in the Lagrangian formalism, as we will see.
We begin with a general discussion of functionals \(F[q(t), p(t)]\) of canonical variables, since both Hamiltonians and Lagrangians will be such functionals. A few notes are in order. First, we've used the shorthand notations \(q(t)\) and \(p(t)\) to denote a collection of canonical variables. Moreover, in writing \(q(t)\) (and similarly \(p(t)\)) we implicitly think of them as fields at a given time. Indeed, as we'll see, the time variable plays an exceptional role in the Lagrangian formalism, in contrast to our mindset so far that space and time are all mixed up in a Lorentz-invariant theory. Finally, we've used square brackets to distinguish functionals from regular functions of spacetime or momentum variables.
At the heart of the Lagrangian formalism lies a variational principle. Hence it’s crucial to be able to take infinitesimal variations on \(F[q(t), p(t)]\), which we write as follows
Here the infinitesimal fields \(\delta q_n\) and \(\delta p_n\) are assumed to (anti-)commute with all other fields. Now assuming \(F[q(t), p(t)]\) is written so that all the \(q\) fields lie to the left of all the \(p\) fields, then \(\eqref{eq_infinitesimal_variation_of_functional_of_canonical_variables}\) can be realized by the following definition of variational derivatives
Hamiltonian and Lagrangian for free fields#
For free fields we have
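\begin{align} q_n(t, \xbf) &= e^{\ifrak H_0 t} q_n(0, \xbf) e^{-\ifrak H_0 t} \label{eq_free_field_q_time_evolution} \\ p_n(t, \xbf) &= e^{\ifrak H_0 t} p_n(0, \xbf) e^{-\ifrak H_0 t} \label{eq_free_field_p_time_evolution} \end{align}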
where \(H_0\) is the free field Hamiltonian, also known as the symmetry generator for the time translation, or the energy operator. However, rather than thinking of it as an abstract operator as we’ve done so far, we’ll (momentarily) make it a functional of canonical variables. With this in mind, we can take the time derivative of \(\eqref{eq_free_field_q_time_evolution}\) and \(\eqref{eq_free_field_p_time_evolution}\) as follows
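\begin{align} \dot{q}_n(t, \xbf) &= \ifrak \left[ H_0, q_n(t, \xbf) \right] = \frac{\delta H_0}{\delta p_n(t, \xbf)} \label{eq_free_field_hamilton_equation_q_dot} \\ \dot{p}_n(t, \xbf) &= \ifrak \left[ H_0, p_n(t, \xbf) \right] = -\frac{\delta H_0}{\delta q_n(t, \xbf)} \label{eq_free_field_hamilton_equation_p_dot} \end{align}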
We recognize these as the quantum analog of Hamilton’s equation of motion.
To turn \(H_0\) into a functional of canonical variables, we first make it a functional of creation and annihilation operators. Remembering that \(H_0\) is the energy operator, and that \(p_0 = \sqrt{\pbf^2 + m^2}\) is the energy component of the \(4\)-momentum, we can write \(H_0\) as a diagonal operator as follows
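\begin{equation} H_0 = \sum_{n, \sigma} \int d^3 p~p_0 a^{\dagger}(\pbf, \sigma, n) a(\pbf, \sigma, n) \label{eq_free_field_hamiltonian_diagonal} \end{equation}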
For simplicity, let’s consider the case of a real scalar field \(\psi(x)\) given by \(\eqref{eq_scalar_field_psi_by_creation_and_annihilation_operators}\) as follows
The canonical conjugate variable is
These look a bit far from \(\eqref{eq_free_field_hamiltonian_diagonal}\). But since \(H_0\) involves products like \(a^{\dagger}(\pbf, \sigma, n) a(\pbf, \sigma, n)\), let’s try to square the canonical variables as follows
and
and finally, inspired by the calculations above
Putting these calculations together in a specific way, and using the identity \(p_0^2 - \pbf^2 = m^2\), we can eliminate the blue terms as follows
Here we’ve encountered for the first time an infinite term (which we’ve marked in blue). As long as the Hamiltonian dynamics \(\eqref{eq_free_field_hamilton_equation_q_dot}\) – \(\eqref{eq_free_field_hamilton_equation_p_dot}\) is concerned, it makes no difference adding a constant to the Hamiltonian. Hence we can write the free Hamiltonian for real scalar fields as follows
Warning
Throwing away the infinite term in \(\eqref{eq_calculate_free_real_scalar_field_hamiltonian}\) is an instance of a well-known criticism in quantum field theory: “just because something is infinite doesn't mean it's zero”. Indeed, Weinberg mentions on page 297 of [Wei95] that such “infinities” shouldn't be thrown away when, for example, the fields are constrained within a finite space, or when gravity is involved.
Now it’s time to introduce the rather mysterious Lagrangian, which can be derived from the Hamiltonian via the so-called Legendre transformation as follows
where each occurrence of \(p_n(t)\) is replaced by its expression in \(q_n(t)\) and \(\dot{q}_n(t)\).
As a concrete example, let’s consider again the real scalar field, where \(p = \dot{q}\). It follows that
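\begin{equation*} L = \int d^3 x~\dot{q}^2 - H_0 = \frac{1}{2} \int d^3 x \left( \dot{q}^2 - (\nabla q)^2 - m^2 q^2 \right) \end{equation*}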
It should be noted that expressing \(p\) in terms of \(q\) and \(\dot{q}\) isn't always easy. Indeed, it's far from obvious how the \(p_i\) defined by \(\eqref{eq_defn_p_vector_field_self_dual}\) could be expressed in terms of the corresponding \(q_i\) and \(\dot{q}_i\). (Un)Fortunately, we'll never really need to do so – writing down a Lagrangian turns out to be mostly guesswork.
Hamiltonian and Lagrangian for interacting fields#
Let \(H\) be the full Hamiltonian. Then the Heisenberg picture canonical variables can be defined as follows
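\begin{align} Q_n(t, \xbf) &\coloneqq e^{\ifrak H t} q_n(0, \xbf) e^{-\ifrak H t} \label{eq_defn_heisenberg_canonical_q} \\ P_n(t, \xbf) &\coloneqq e^{\ifrak H t} p_n(0, \xbf) e^{-\ifrak H t} \nonumber \end{align}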
Then obviously these canonical variables also satisfy the canonical (anti-)commutation relations
Moreover, the analog of \(\eqref{eq_free_field_hamilton_equation_q_dot}\) and \(\eqref{eq_free_field_hamilton_equation_p_dot}\) holds as follows
As an example, we note that, in light of \(\eqref{eq_free_scalar_field_hamiltonian}\), the full Hamiltonian for real scalar fields may be written as
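\begin{equation*} H = \int d^3 x \left( \frac{1}{2} P^2 + \frac{1}{2} (\nabla Q)^2 + \frac{1}{2} m^2 Q^2 + \Hscr(Q) \right) \end{equation*}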
where \(\Hscr(Q)\) is the perturbation term giving rise to the interaction.
The Lagrangian formalism#
We’ll leave aside the discussion of canonical variables for a bit to introduce the Lagrangian formalism in its most general form. After that we’ll play the game backwards. Namely, instead of constructing canonical variables out of the free fields that we’ve been exclusively considering since Quantum Fields and Antiparticles, we’ll get canonically conjugate fields out of the (magically appearing) Lagrangians, and then impose the canonical commutation relations \(\eqref{eq_canonical_commutation_relation_1}\) – \(\eqref{eq_canonical_commutation_relation_3}\) on them – a procedure generally known as “quantization”.
In the classical physical theory of fields, a Lagrangian is a functional \(L[\Psi(t), \dot{\Psi}(t)]\), where \(\Psi(t)\) is any field and \(\dot{\Psi}(t)\) is its time derivative. Here we’ve capitalized the field variables to distinguish them from the free fields considered in the previous section. Define the conjugate fields as follows
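\begin{equation} \Pi_n(t, \xbf) \coloneqq \frac{\delta L[\Psi(t), \dot{\Psi}(t)]}{\delta \dot{\Psi}_n(t, \xbf)} \label{eq_general_lagrangian_conjugate_pi} \end{equation}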
so that the field equations are given by
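\begin{equation} \dot{\Pi}_n(t, \xbf) = \frac{\delta L[\Psi(t), \dot{\Psi}(t)]}{\delta \Psi_n(t, \xbf)} \label{eq_equation_of_motion_for_fields} \end{equation}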
Warning
Unlike the functional derivatives considered in \(\eqref{eq_infinitesimal_variation_of_functional_of_canonical_variables}\) for canonical variables, the functional derivative \(\eqref{eq_general_lagrangian_conjugate_pi}\), interpreted quantum mechanically, is not really well-defined since \(\Psi(t)\) and \(\dot{\Psi}(t)\) don’t in general satisfy a simple (same time) commutation relation. According to Weinberg (see footnote on page 299 in [Wei95]), “no important issues hinge on the details here”. So we’ll pretend that it behaves just like usual derivatives.
Indeed, recall that in the classical Lagrangian formalism, the field equations are given by a variational principle applied to the so-called action, defined as follows
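\begin{equation} I[\Psi] \coloneqq \int_{-\infty}^{\infty} dt~L[\Psi(t), \dot{\Psi}(t)] \label{eq_defn_action_of_fields} \end{equation}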
The infinitesimal variation of \(I[\Psi]\) is given by
where for the last equality, integration by parts is used under the assumption that the infinitesimal variation \(\delta \Psi_n(t, \xbf)\) vanishes at \(t \to \pm\infty\). Obviously \(\delta I[\Psi]\) vanishes for any \(\delta \Psi_n(t, \xbf)\) if and only if \(\eqref{eq_equation_of_motion_for_fields}\) is satisfied.
Now we’re interested in constructing Lorentz invariant theories, but an action defined by \(\eqref{eq_defn_action_of_fields}\) apparently distinguishes the time from space variables. This motivates the hypothesis that the Lagrangian itself is given by a spatial integral of a so-called Lagrangian density as follows
In terms of the Lagrangian density, we can rewrite the action \(\eqref{eq_defn_action_of_fields}\) as a \(4\)-integral as follows
We’d also like to reexpress the field equations \(\eqref{eq_equation_of_motion_for_fields}\) in terms of the Lagrangian density. To this end, let’s first calculate the variation of \(\eqref{eq_defn_lagrangian_density}\) by an amount \(\delta \Psi_n(t, \xbf)\) as follows
It follows that
Combining these with \(\eqref{eq_general_lagrangian_conjugate_pi}\) and \(\eqref{eq_equation_of_motion_for_fields}\), we've derived the so-called Euler-Lagrange equations for the Lagrangian density
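\begin{equation} \p_{\mu} \frac{\delta \Lscr}{\delta (\p_{\mu} \Psi_n)} = \frac{\delta \Lscr}{\delta \Psi_n} \label{eq_euler_lagrange} \end{equation}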
Note that the summed \(4\)-index \(\mu\) here runs over the spacetime coordinates \(x_{\mu}\). Most importantly, the field equations given by \(\eqref{eq_euler_lagrange}\) will be Lorentz invariant if \(\Lscr\) is. Indeed, guessing such \(\Lscr\) will be more or less the only way to construct Lorentz-invariant (quantum) field theories.
Note
The Lagrangian density \(\Lscr\) is assumed to be real for two reasons. First, if \(\Lscr\) were complex, then splitting it into real and imaginary parts, \(\eqref{eq_euler_lagrange}\) would contain twice as many equations as there are fields, regardless of whether the fields are real or complex. This is undesirable because generically there would be no solutions. The second reason has to wait until the next section, where symmetries will be discussed. It turns out that the reality of \(\Lscr\) will guarantee that the symmetry generators are Hermitian.
Now recall from the previous section that the anchor of our knowledge is the Hamiltonian – we know what it must look like, at least for free fields. To go from the Lagrangian to the Hamiltonian, we use again the Legendre transformation (cf. \(\eqref{eq_legendre_transformation_lagrangian_from_hamiltonian}\)) to define the Hamiltonian as follows
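\begin{equation} H = \sum_n \int d^3 x~\Pi_n(t, \xbf) \dot{\Psi}_n(t, \xbf) - L[\Psi(t), \dot{\Psi}(t)] \label{eq_legendre_transformation_hamiltonian_from_lagrangian} \end{equation}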
Warning
In order to realize \(H\) as a functional of \(\Psi\) and \(\Pi\), one must in principle be able to solve for \(\dot{\Psi}_n\) in terms of \(\Psi_n\) and \(\Pi_n\) from \(\eqref{eq_general_lagrangian_conjugate_pi}\). This isn't always easy, if at all possible, but it rarely poses serious difficulties in applications either.
As a double check, let’s verify that the Hamiltonian defined by \(\eqref{eq_legendre_transformation_hamiltonian_from_lagrangian}\) also satisfies Hamilton’s equations (cf. \(\eqref{eq_free_field_hamilton_equation_q_dot}\) – \(\eqref{eq_free_field_hamilton_equation_p_dot}\)). Indeed, the variational derivatives are calculated as follows
It’s therefore attempting to demand, in the Lagrangian formalism, that \(\Psi_n\) and \(\Pi_n\), defined by \(\eqref{eq_general_lagrangian_conjugate_pi}\), satisfy the canonical commutation relations. In other words, they are (Heisenberg picture) canonically conjugate fields. But this is not true in general, as it turns out.
The issue is that the Lagrangian \(L[\Psi(t), \dot{\Psi}(t)]\) may contain a certain field but not its time derivative. One example is spin-\(1\) vector fields, where we see from \(\eqref{eq_defn_q_vector_field_self_dual}\) that the spatial fields \(\psi_i\) are part of the canonical variables, but not \(\psi_0\), which nonetheless should be present in the Lagrangian by Lorentz invariance. It turns out that what's missing from the Lagrangian is \(\dot{\psi}_0\), which causes its conjugate variable defined by \(\eqref{eq_general_lagrangian_conjugate_pi}\) to vanish.
But instead of dealing with vector fields further, we'll turn back to more general ground to establish the fundamental principles. Inspired by the above discussion, we can rewrite the Lagrangian as
where each \(Q_n(t)\) comes with its time derivative \(\dot{Q}_n(t)\), while the \(C(t)\) appear without time derivatives. It follows that one can define the canonical conjugates by
and hence the Hamiltonian takes the following form
Global symmetries#
Of course, the reason for introducing the Lagrangian formalism is not to reproduce the Hamiltonians and the fields that we already knew. The main motivation is that, as we'll see, the Lagrangian formalism provides a framework for studying symmetries. Recall from What is a Symmetry? that a symmetry was defined to be a(n anti-)unitary transformation on the Hilbert space of states, i.e., a transformation that preserves amplitudes. Now in the Lagrangian formalism, field equations come out of the stationary action condition. Therefore in this context, we'll redefine a symmetry as an infinitesimal variation of the fields that leaves the action invariant. As it turns out, symmetries in this sense lead to conserved currents, which are nothing but the symmetry operators considered earlier. Hence, up to a slight abuse of terminology, the two notions of symmetry are consistent.
Note
Throughout this section, repeated indexes like \(n\), which are used to index various fields, are not automatically summed over in an equation. On the other hand, repeated \(4\)-indexes like \(\mu\) do follow the Einstein summation convention.
Consider an infinitesimal variation
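\begin{equation} \Psi_n(x) \to \Psi_n(x) + \ifrak \epsilon \Fscr_n(x) \label{eq_infinitesimal_variation_of_field} \end{equation}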
which leaves the action \(I[\Psi]\) invariant
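\begin{equation} \delta I[\Psi] = 0 \label{eq_vanishing_of_action_under_infinitesimal_variation} \end{equation}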
A few remarks are in order. First of all, if we think of \(\eqref{eq_infinitesimal_variation_of_field}\) as an infinitesimal (unitary) symmetry transformation, then the coefficient \(\ifrak\) can be justified by the intention of making \(\Fscr_n(x)\) Hermitian. Next, although \(\eqref{eq_vanishing_of_action_under_infinitesimal_variation}\) always holds when \(\Psi_n(x)\) is stationary, the infinitesimal \(\Fscr_n(x)\) being a symmetry demands that \(\eqref{eq_vanishing_of_action_under_infinitesimal_variation}\) hold true for any \(\Psi_n(x)\). Finally, we emphasize that the fact that \(\epsilon\) is an infinitesimal constant, rather than a function of \(x\), is the defining property for the symmetry to be called “global”. Indeed, we'll be dealing with symmetries that are not global in the next chapter, namely, the gauge symmetries.
From symmetries to conservation laws#
The general principle that “symmetries imply conservation laws” is mathematically known as Noether's theorem, but we'll not bother with any mathematical formality here. To see how to derive conserved quantities from an assumed symmetry, let's change \(\eqref{eq_infinitesimal_variation_of_field}\) as follows
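\begin{equation} \Psi_n(x) \to \Psi_n(x) + \ifrak \epsilon(x) \Fscr_n(x) \label{eq_functional_infinitesimal_variation_of_field} \end{equation}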
where \(\epsilon(x)\) now is an infinitesimal function of \(x\). Under this variation, the corresponding \(\delta I\) may not vanish. But it must take the following form
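\begin{equation} \delta I = -\int d^4 x~J^{\mu}(x) \p_{\mu} \epsilon(x) \label{eq_variation_of_action_by_functional_deformation} \end{equation}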
because it must vanish when \(\epsilon(x)\) is constant. Here \(J^{\mu}(x)\) is a function(al) to be determined in individual cases, and is usually known as a current. Now if \(\Psi_n(x)\) satisfies the field equations, i.e., it's a stationary point of the action, then \(\eqref{eq_variation_of_action_by_functional_deformation}\) must vanish for any \(\epsilon(x)\). Applying integration by parts (and assuming \(\Fscr_n(x)\) vanishes at infinity), we must have
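\begin{equation} \p_{\mu} J^{\mu}(x) = 0 \label{eq_general_conservation_of_current} \end{equation}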
which is the conservation law for \(J\), which can then be called a conserved current. One also gets a conserved quantity, i.e., a quantity that doesn't change with time, by integrating \(\eqref{eq_general_conservation_of_current}\) over the \(3\)-space as follows
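\begin{equation*} F \coloneqq \int d^3 x~J^0(t, \xbf) \implies \dot{F} = \int d^3 x~\p_0 J^0(t, \xbf) = -\int d^3 x~\nabla \cdot \Jbf(t, \xbf) = 0 \end{equation*}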
Unfortunately, not much more can be said about the conserved current \(J\) at this level of generality. This is, however, not the case if one imposes stronger assumptions on the symmetry, as we now explain.
- Lagrangian-preserving symmetry
This is the first strengthening of the symmetry assumption. Namely, instead of assuming that the variation \(\eqref{eq_infinitesimal_variation_of_field}\) fixes the action, we assume that it fixes the Lagrangian itself, i.e.,
\begin{equation} \delta L = \ifrak \epsilon \sum_n \int d^3 x \left( \frac{\delta L}{\delta \Psi_n(t, \xbf)} \Fscr_n(t, \xbf) + \frac{\delta L}{\delta \dot{\Psi}_n(t, \xbf)} \dot{\Fscr}_n(t, \xbf) \right) = 0 \label{eq_stationary_lagrangian} \end{equation}Now let \(\epsilon(t)\) be a time-dependent infinitesimal in \(\eqref{eq_functional_infinitesimal_variation_of_field}\). Then we can calculate \(\delta I\) under such variation as follows
\begin{align*} \delta I &= \ifrak \sum_n \int dt \int d^3 x \left( \frac{\delta L}{\delta \Psi_n(t, \xbf)} \epsilon(t) \Fscr_n(t, \xbf) + \frac{\delta L}{\delta \dot{\Psi}_n(t, \xbf)} \frac{d}{dt} \big( \epsilon(t) \Fscr_n(t, \xbf) \big) \right) \\ &= \ifrak \sum_n \int dt \int d^3 x~\frac{\delta L}{\delta \dot{\Psi}_n(t, \xbf)} \dot{\epsilon}(t) \Fscr_n(t, \xbf) \end{align*}Comparing with \(\eqref{eq_variation_of_action_by_functional_deformation}\), we can derive an explicit formula for the conserved quantity as follows
\begin{equation} F = -\ifrak \sum_n \int d^3 x~\frac{\delta L}{\delta \dot{\Psi}_n(t, \xbf)} \Fscr_n(t, \xbf) \label{eq_lagrangian_preserving_symmetry_conserved_quantity} \end{equation}Indeed, one can verify directly that \(\dot{F}(t) = 0\) using \(\eqref{eq_stationary_lagrangian}\) together with the field equations \(\eqref{eq_general_lagrangian_conjugate_pi}\) and \(\eqref{eq_equation_of_motion_for_fields}\).
- Lagrangian-density-preserving symmetry
Taking the previous assumption further, let’s impose the even stronger condition that the Lagrangian density is invariant under \(\eqref{eq_infinitesimal_variation_of_field}\). It means that
\begin{equation} \delta \Lscr = \ifrak \epsilon \sum_n \left( \frac{\delta \Lscr}{\delta \Psi_n(x)} \Fscr_n(x) + \frac{\delta \Lscr}{\delta (\p_{\mu} \Psi_n(x))} \p_{\mu} \Fscr_n(x) \right) = 0 \label{eq_stationary_lagrangian_density} \end{equation}Now under \(\eqref{eq_functional_infinitesimal_variation_of_field}\), we can calculate the variation of the action as follows
\begin{align*} \delta I &= \ifrak \sum_n \int d^4 x~\left( \frac{\delta \Lscr}{\delta \Psi_n(x)} \epsilon(x) \Fscr_n(x) + \frac{\delta \Lscr}{\delta (\p_{\mu} \Psi_n(x))} \p_{\mu} \big( \epsilon(x) \Fscr_n(x) \big) \right) \\ &= \ifrak \sum_n \int d^4 x~\frac{\delta \Lscr}{\delta (\p_{\mu} \Psi_n(x))} \Fscr_n(x) \p_{\mu}\epsilon(x) \end{align*}Comparing with \(\eqref{eq_variation_of_action_by_functional_deformation}\) as before, we can derive an explicit formula for the conserved current as follows
\begin{equation} J^{\mu}(x) = -\ifrak \sum_n \frac{\delta \Lscr}{\delta (\p_{\mu} \Psi_n(x))} \Fscr_n(x) \label{eq_lagrangian_density_preserving_symmetry_conserved_density} \end{equation}Once again, one can directly verify that \(\p_{\mu} J^{\mu}(x) = 0\) using \(\eqref{eq_stationary_lagrangian_density}\) together with the Euler-Lagrange equation \(\eqref{eq_euler_lagrange}\).
So far everything has been completely classical. To make it a quantum theory, we’ll involve the canonical fields introduced in Hamiltonian and Lagrangian for interacting fields. More precisely, instead of any \(\Fscr_n(t, \xbf)\), we’ll suppose that it takes the following form
where \(Q(t)\) is defined by \(\eqref{eq_defn_heisenberg_canonical_q}\). Next, recall from \(\eqref{eq_general_quantum_lagrangian}\) that the field \(\Psi_n\) is either a \(Q_n\), in which case \(\delta L / \delta \dot{Q}_n = P_n\), or a \(C_n\), in which case the functional derivative vanishes.
Now in the case of a Lagrangian-preserving symmetry, the conserved quantity \(\eqref{eq_lagrangian_preserving_symmetry_conserved_quantity}\) takes the following form
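\begin{equation} F = -\ifrak \sum_n \int d^3 x~P_n(t, \xbf) \Fscr_n(t, \xbf) \label{eq_lagrangian_preserving_symmetry_generator_formula} \end{equation}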
which of course is time-independent. Moreover, one can show that \(F\) in fact generates the quantum symmetry in the following sense
where we’ve taken advantage of the time-independency of \(F\) to arrange the same-time commutator.
Spacetime translations#
So far the symmetries have been rather abstract. To make things more explicit, and also to get warmed up for the general case, let's assume that the Lagrangian is invariant under the (spacetime) translation transformation given as follows
Comparing with \(\eqref{eq_infinitesimal_variation_of_field}\) we see that
It follows from \(\eqref{eq_variation_of_action_by_functional_deformation}\) and \(\eqref{eq_general_conservation_of_current}\) that there exists a conserved \(4\)-current \({T^{\nu}}_{\mu}\), which is known as the energy-momentum tensor, such that
The corresponding conserved quantities then take the form
such that \(\dot{P}_{\mu} = 0\). Here it’s important to not confuse \(P_{\mu}\) with a canonical variable – it’s just a conserved quantity which turns out to be the \(4\)-momentum.
Now recall from \(\eqref{eq_defn_lagrangian_density}\) that the Lagrangian is usually the spatial integral of a density functional. Hence it’s not unreasonable to suppose that the Lagrangian is indeed invariant under spatial translations. Under this assumption, we can rewrite \(\eqref{eq_lagrangian_preserving_symmetry_generator_formula}\) as follows
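\begin{equation} \Pbf = -\sum_n \int d^3 x~P_n(t, \xbf) \nabla Q_n(t, \xbf) \label{eq_spatial_translation_conserved_quantity} \end{equation}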
with the understanding that \(\Psi_n = Q_n\).
To verify that \(\Pbf\) indeed generates spatial translations, let’s calculate using the fact that \(\Pbf\) is time-independent as follows
It follows that
for any functional \(\Gscr\) that doesn’t explicitly involve \(\xbf\). This verifies that \(\Pbf\) indeed generates the spatial translation.
In contrast, one cannot hope for the Lagrangian to be invariant under time translations if there is to be any interaction. But we already know the operator that generates time translation, namely, the Hamiltonian. In other words, we define \(P_0 \coloneqq -H\) such that
for any functional \(\Gscr\) that doesn’t explicitly involve \(t\).
In general, the Lagrangian density is not invariant under spacetime translations. However, it turns out that the conserved current, which in this case is \({T^{\mu}}_{\nu}\), can nonetheless be calculated. To spell out the details, let’s consider the following variation
The corresponding variation of the action is given as follows
where we’ve used the chain rule for derivatives in the second equality, and integration by parts in the third. Comparing with \(\eqref{eq_variation_of_action_by_functional_deformation}\), we see that
Note
The energy-momentum tensor \({T^{\nu}}_{\mu}\) is not yet suitable for general relativity since it’s not symmetric. As we’ll see in Lorentz symmetry, when taking homogeneous Lorentz transformation symmetry into account, one can supplement \({T^{\nu}}_{\mu}\) with some extra terms to make it both conserved and symmetric.
Indeed, this calculation recovers \(\eqref{eq_spatial_translation_conserved_quantity}\) by letting \(\nu = 0\) and \(\mu \neq 0\). Moreover, it recovers the Hamiltonian by letting \(\mu = \nu = 0\) as follows
Linear transformations#
As another example, let’s consider linear variations as follows
where we’ve adopted the Einstein summation convention for repeated upper and lower indexes because it’d otherwise be too tedious to write out the summations. Here \((t_{\square})^{\square}_{\square}\) should furnish a representation of the Lie algebra of the symmetry group.
As before, the invariance of the action under such variations implies the existence of conserved currents \(J^{\mu}_a\) such that
as well as the conserved quantity
If, in addition, the Lagrangian is invariant under such variations, then \(T_a\) takes the following form by \(\eqref{eq_lagrangian_preserving_symmetry_generator_formula}\)
It follows that
In particular, when \(t_a\) is diagonal (e.g., in electrodynamics), the operators \(Q^n\) and \(P_n\) may be regarded as raising/lowering operators. In fact, we claim that \(T_a\) form a Lie algebra by the following calculation
Now if \(t_a\) form a Lie algebra with structure constants \({f_{ab}}^c\) as follows
then
In other words, the conserved quantities also form the same Lie algebra.
Now if, in addition, the Lagrangian density is also invariant, then \(\eqref{eq_lagrangian_density_preserving_symmetry_conserved_density}\) takes the following form
Note that since \(\Lscr\) doesn’t have \(\dot{C}_r\) dependencies, we have the following by letting \(\mu = 0\) in \(\eqref{eq_lagrangian_density_invariant_linear_transformation_conserved_current}\)
whose equal-time commutation relations with canonical variables \(P\) and \(Q\) can be easily calculated.
Lorentz invariance#
The goal of this section is to show that the Lorentz invariance of the Lagrangian density implies the Lorentz invariance of the S-matrix, which justifies our interest in the Lagrangian formalism in the first place.
Recall from \(\eqref{eq_expansion_of_Lambda}\) and \(\eqref{eq_lorentz_lie_algebra_is_antisymmetric}\) that
is a \((\mu, \nu)\)-parametrized anti-symmetric variation. It follows then from \(\eqref{eq_variation_of_action_by_functional_deformation}\) that there exist \((\mu, \nu)\)-parametrized anti-symmetric conserved currents as follows
which, in turn, give rise to the conserved quantities
such that \(\dot{J}^{\mu \nu} = 0\). These, as we’ll see, turn out to be rather familiar objects that we’ve encountered as early as in \(\eqref{eq_u_lorentz_expansion}\).
In light of \(\eqref{eq_lagrangian_density_preserving_symmetry_conserved_density}\), one can work out an explicit formula for \(\Mscr^{\rho \mu \nu}\) if the Lagrangian density is invariant under the symmetry transformation. Now since the Lagrangian density is expressed in terms of quantum fields, one’d like to know how they transform under Lorentz transformations. Since the translation symmetry has already been dealt with in Spacetime translations, we’ll consider here homogeneous Lorentz transformations. Luckily this has been worked out already in Quantum Fields and Antiparticles. More precisely, recall from \(\eqref{eq_dirac_field_linearize_representation}\) that the variation term can be written as follows
where \(\Jscr\) are matrices satisfying \(\eqref{eq_bracket_repr_j}\). The corresponding derivatives then have the following variation term
where the second summand on the right-hand-side corresponds to the fact that the Lorentz transformation also acts on the spacetime coordinates.
Now the invariance of the Lagrangian density under such variation can be written as follows
Since \(\omega^{\mu \nu}\) is arbitrary, its coefficient must vanish, which, taking \(\eqref{eq_lorentz_omega_is_antisymmetric}\) into account, implies the following
Using \(\eqref{eq_euler_lagrange}\), we can get rid of the \(\delta\Lscr / \delta\Psi_n\) term in \(\eqref{eq_lorentz_invariance_current_raw_identity}\) to arrive at the following
Now we can address the issue of the energy-momentum tensor not being symmetric by introducing the following so-called Belinfante tensor
which is both conserved in the sense that
and symmetric in the sense that
Indeed \(\eqref{eq_belinfante_tensor_is_conserved}\) follows from the observation that the term inside the parenthesis of \(\eqref{eq_defn_belinfante_tensor}\) is anti-symmetric in \(\mu\) and \(\kappa\), and \(\eqref{eq_belinfante_tensor_is_symmetric}\) is a direct consequence of \(\eqref{eq_lorentz_invariance_current_identity}\).
The conserved quantities corresponding to \(\Theta_{\mu\nu}\) are
where the first equality holds because, again, the term in the parenthesis of \(\eqref{eq_defn_belinfante_tensor}\) is anti-symmetric in \(\mu\) and \(\nu\), and therefore \(\nu \neq 0\) given \(\mu = 0\). Hence it's at least equally legitimate to call \(\Theta_{\mu \nu}\) the energy-momentum tensor. Indeed, the fact that \(\Theta_{\mu \nu}\) is symmetric makes it the right choice in general relativity.
Unlike the other conserved currents, which are derived under the general principles explained in From symmetries to conservation laws, we’ll construct the anti-symmetric \(\Mscr^{\rho \mu \nu}\) declared in \(\eqref{eq_lorentz_invariance_m_conservation}\) and \(\eqref{eq_lorentz_invariance_conserved_m_antisymmetric}\) by hand as follows
While \(\eqref{eq_lorentz_invariance_conserved_m_antisymmetric}\) is automatically satisfied by definition, we can verify \(\eqref{eq_lorentz_invariance_m_conservation}\) as follows
Moreover \(\eqref{eq_lorentz_invariance_conserved_j}\) takes the following form
Now if we consider the rotation generators \(J_i \coloneqq \tfrac{1}{2} \epsilon_{ijk} J^{jk}\), then it follows from \(\eqref{eq_hamiltonian_acts_as_time_derivative}\) that
since \(\Jbf\) doesn’t implicitly involve \(t\). This recovers one of the commutation relations \(\eqref{eq_hj_commute}\) for the Poincaré algebra. Next, let’s verify \(\eqref{eq_pjp_commutation}\) as follows
Next come the boost operators, defined as follows
Footnotes