6. The Canonical Formalism
The quantum theory we’ve been developing so far has been based almost solely on symmetry principles, especially Lorentz symmetry. This is a very satisfying approach since it’s logically clean and relies only on the most fundamental principles. However, this is not how quantum theory was developed historically. Not surprisingly, the original development is much messier and requires substantial experience in “classical” physics. It’s largely based on the so-called Lagrangian formalism, which is a well-established framework in classical physics that can be “quantized”. The main goal of this chapter is to go through this formalism, not for history’s sake, but because it offers a particularly convenient way to construct Hamiltonians that generate Lorentz-invariant S-matrices, something that has been difficult for us, as can be seen in Feynman rules in momentum space.
6.1. Canonical Variables
We’ve seen in Quantum Fields and Antiparticles a few ways of constructing (Lorentz-invariant) interaction densities. However, we don’t have a systematic way to do so. The so-called Lagrangian formalism will not provide a systematic solution either, but it’ll allow us to construct more interesting interaction densities (from classical physics theories), to the extent that all known quantum field theories arise in this way! In addition, it’ll shed light on the mysterious local terms as for example in Eq.5.2.14, that are needed to compensate for a Lorentz-invariant momentum space propagator.
What the Lagrangian formalism offers for constructing a quantum field theory is the following. Instead of using the creation and annihilation fields defined by Eq.4.1.1 to construct the Hamiltonians, we’ll use the so-called canonical variables, which have particularly simple (equal-time) commutation relations. More precisely, they consist of a collection of quantum operators \(q_n(t, \xbf)\) and their canonical conjugates \(p_n(t, \xbf)\), which satisfy the following (anti-)commutation relations
where \(\pm\) correspond to when the particle under question is fermionic or bosonic, respectively.
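For reference, a form of the relations Eq.6.1.1 that is consistent with the explicit checks in Eq.6.1.3 below can be written down. This is a sketch assuming the same normalization as Eq.6.1.3 and the notation \([A, B]_{\pm} \coloneqq AB \pm BA\), which is introduced here only for illustration:

\[\begin{split}\left[ q_n(t, \xbf), p_{n'}(t, \ybf) \right]_{\pm} &= \ifrak \delta^3(\xbf-\ybf) \delta_{n n'} \\ \left[ q_n(t, \xbf), q_{n'}(t, \ybf) \right]_{\pm} &= 0 \\ \left[ p_n(t, \xbf), p_{n'}(t, \ybf) \right]_{\pm} &= 0\end{split}\]

Only the mixed relation between a variable and its own conjugate is nonzero, and all three are imposed at equal times.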
To see how canonical variables may be constructed from fields considered in Quantum Fields and Antiparticles, let’s consider a few examples.
- Scalar fields
Let’s start by considering scalar fields of particles that are their own antiparticles. Using notations from Scalar Fields, it means that \(\psi(x) = \psi^{ \dagger}(x)\), i.e., the field is Hermitian. It follows then from Eq.4.2.9 and Eq.4.2.10 that
\[\left[ \psi(x), \psi(y) \right] = \Delta(x-y) = \frac{1}{(2\pi)^3} \int \frac{d^3 p}{2p_0} \left( e^{\ifrak p \cdot (x-y)} - e^{-\ifrak p \cdot (x-y)} \right)\]where \(p_0 = \sqrt{\pbf^2 + m^2}\).
We claim that the canonical commutation relations Eq.6.1.1 are satisfied by
(6.1.2) \[\begin{split}q(t, \xbf) &\coloneqq \psi(t, \xbf) \\ p(t, \xbf) &\coloneqq \dot{\psi}(t, \xbf)\end{split}\] Indeed, this follows from the following calculations
(6.1.3) \[\begin{split}\begin{alignat*}{3} \left[ q(t, \xbf), p(t, \ybf) \right] &= \left[ \psi(t, \xbf), \dot{\psi}(t, \ybf) \right] &&= -\dot{\Delta}(0, \xbf-\ybf) &&= \ifrak \delta^3(\xbf-\ybf) \\ \left[ q(t, \xbf), q(t, \ybf) \right] &= \left[ \psi(t, \xbf), \psi(t, \ybf) \right] &&= \Delta(0, \xbf-\ybf) &&= 0 \\ \left[ p(t, \xbf), p(t, \ybf) \right] &= \left[ \dot{\psi}(t, \xbf), \dot{\psi}(t, \ybf) \right] &&= -\ddot{\Delta}(0, \xbf-\ybf) = 0 \end{alignat*}\end{split}\] Now for particles that are different from their antiparticles, we must modify Eq.6.1.2 as follows
\[\begin{split}q(t, \xbf) &= \psi(t, \xbf) \\ p(t, \xbf) &= \dot{\psi}^{\dagger}(t, \xbf)\end{split}\] and note that in this case \(\left[ \psi(t, \xbf), \psi(t', \ybf) \right] = 0\) for arbitrary times \(t\) and \(t'\), in contrast to the second equation in Eq.6.1.3, which is only an equal-time statement.
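The equal-time values of \(\Delta\) used in Eq.6.1.3 can be read off from its integral representation. As a sketch, assuming the convention \(p \cdot x = \pbf \cdot \xbf - p_0 t\), one differentiates under the integral sign and then sets \(t = 0\):

\[\begin{split}\Delta(0, \xbf) &= \frac{1}{(2\pi)^3} \int \frac{d^3 p}{2p_0} \left( e^{\ifrak \pbf \cdot \xbf} - e^{-\ifrak \pbf \cdot \xbf} \right) = 0 \\ \dot{\Delta}(0, \xbf) &= \frac{-\ifrak}{(2\pi)^3} \int \frac{d^3 p}{2} \left( e^{\ifrak \pbf \cdot \xbf} + e^{-\ifrak \pbf \cdot \xbf} \right) = -\ifrak \delta^3(\xbf) \\ \ddot{\Delta}(0, \xbf) &= \frac{-1}{(2\pi)^3} \int \frac{d^3 p}{2}~p_0 \left( e^{\ifrak \pbf \cdot \xbf} - e^{-\ifrak \pbf \cdot \xbf} \right) = 0\end{split}\]

The first and third integrals vanish because their integrands are odd under \(\pbf \to -\pbf\), while the second is the Fourier representation of the delta function.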
- Spin-\(1\) vector fields
Consider once again particles that are self-charge-dual. Using notations from Spin-1 vector fields, we recall the commutation relation Eq.4.3.17 as follows
\[\left[ \psi_{\mu}(x), \psi_{\nu}(y) \right] = \left( \eta_{\mu\nu} - \frac{\p_{\mu} \p_{\nu}}{m^2} \right) \Delta(x-y)\]The canonical variables in this case can be defined as follows
(6.1.4) \[\begin{split}q_i(t, \xbf) &= \psi_i(t, \xbf) \\ p_i(t, \xbf) &= \dot{\psi}_i(t, \xbf) - \frac{\p \psi_0(t, \xbf)}{\p x_i}\end{split}\] where \(i=1,2,3\). Indeed, let’s calculate the equal-time commutators as follows
\[\begin{split}\left[ q_i(t, \xbf), p_j(t, \ybf) \right] &= \left[ \psi_i(t, \xbf), \dot{\psi}_j(t, \ybf) \right] - \left[ \psi_i(t, \xbf), \frac{\p \psi_0(t, \ybf)}{\p y_j} \right] \\ &= -\left( \eta_{ij} -\frac{\p_i \p_j}{m^2} \right) \dot{\Delta}(0, \xbf-\ybf) - \left. \frac{\p_i \p_0}{m^2} \right|_{t=0} \left( \p_j \Delta(t, \xbf-\ybf) \right) \\ &= \ifrak \delta^3(\xbf-\ybf) \delta_{ij} \\ \left[ q_i(t, \xbf), q_j(t, \ybf) \right] &= \left( \eta_{ij} - \frac{\p_i \p_j}{m^2} \right) \Delta(0, \xbf-\ybf) = 0 \\ \left[ p_i(t, \xbf), p_j(t, \ybf) \right] &= \left[ \dot{\psi}_i(t, \xbf), \dot{\psi}_j(t, \ybf)\right] + \p_{x_i} \p_{y_j} \left[ \psi_0(t, \xbf), \psi_0(t, \ybf) \right] \\ &\qquad - \p_{x_i} \left[ \psi_0(t, \xbf), \dot{\psi}_j(t, \ybf) \right] - \p_{y_j} \left[ \dot{\psi}_i(t, \xbf), \psi_0(t, \ybf) \right] = 0\end{split}\] We’ve omitted some details about the vanishing of the last commutator – it turns out that the first and second terms cancel out, and so do the third and fourth terms.
In any case, we’ve constructed three pairs of canonical variables, one for each spatial index. But what about the time index? It turns out that \(\psi_0\) is not an independent variable. Indeed, we can derive from Eq.6.1.4, using Eq.4.3.19 and Eq.4.1.18, an expression of \(\psi_0\) as follows
\[\begin{split}p_i = \p_0 \psi_i - \p_i \psi_0 & \implies \p_i p_i = \p_0 \p_i \psi_i - \p^2_i \psi_0 \\ & \implies \nabla \cdot \pbf = \p_0 \sum_{i=1}^3 \p_i \psi_i - \sum_{i=1}^3 \p^2_i \psi_0 \\ & \implies \nabla \cdot \pbf = \p_0^2 \psi_0 - \sum_{i=1}^3 \p_i^2 \psi_0 = -\square \psi_0 \\ & \implies \psi_0 = -m^{-2} \nabla \cdot \pbf\end{split}\]
- Spin-\(1/2\) Dirac fields
Recall the anti-commutator of Dirac fields Eq.4.4.34 as follows
\[\left[ \psi_{\ell}(x), \psi^{\dagger}_{\ell'}(y) \right]_+ = \left( (-\gamma^{\mu} \p_{\mu} + m) \beta \right)_{\ell \ell'} \Delta(x-y)\] where \(\ell, \ell'\) index the components of the Dirac field, i.e., the rows and columns of the gamma matrices. Assuming that the particle in question has a distinct antiparticle, i.e., it’s not a Majorana fermion, the following holds trivially
\[\left[ \psi_{\ell}(x), \psi_{\ell'}(y) \right]_+ = 0\]It follows that the canonical variables can be defined by
\[\begin{split}q_{\ell}(x) &= \psi_{\ell}(x) \\ p_{\ell}(x) &= \ifrak \psi^{\dagger}_{\ell}(x)\end{split}\]Indeed, the only nontrivial (and non-vanishing) anti-commutator can be calculated as follows
\[\begin{split}\left[ q_{\ell}(t, \xbf), p_{\ell'}(t, \ybf) \right]_+ &= \ifrak \left[ \psi_{\ell}(t, \xbf), \psi_{\ell'}^{\dagger}(t, \ybf) \right]_+ \\ &= -\ifrak \left( \gamma^0 \beta \right)_{\ell \ell'} \dot{\Delta}(0, \xbf-\ybf) \\ &= \ifrak \delta^3(\xbf-\ybf) \delta_{\ell \ell'}\end{split}\]
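The last step above can be made explicit. As a sketch, assuming the conventions \(\beta = \ifrak \gamma^0\) and \((\gamma^0)^2 = -1\) (these are assumptions about the gamma-matrix conventions of the earlier chapter), together with \(\dot{\Delta}(0, \xbf) = -\ifrak \delta^3(\xbf)\) as in the scalar case, one has

\[\gamma^0 \beta = \ifrak (\gamma^0)^2 = -\ifrak \quad \implies \quad -\ifrak \left( \gamma^0 \beta \right)_{\ell \ell'} \dot{\Delta}(0, \xbf-\ybf) = (-\ifrak)^3 \delta_{\ell \ell'} \delta^3(\xbf-\ybf) = \ifrak \delta^3(\xbf-\ybf) \delta_{\ell \ell'}\]

The spatial-derivative and mass terms of the anti-commutator drop out at equal times because \(\Delta(0, \xbf)\) vanishes identically in \(\xbf\).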
Through these examples, we see that there is no particular pattern in how one may define canonical variables. In fact, one doesn’t really define canonical variables in this way either – they are simply taken for granted in the Lagrangian formalism, as we will see.
We begin with a general discussion of functionals \(F[q(t), p(t)]\) of canonical variables, since both Hamiltonians and Lagrangians will be such functionals. A few notes are in order. First, we’ve used the shorthand notation \(q(t)\) and \(p(t)\) to denote a collection of canonical variables. Moreover, in writing \(q(t)\) (and similarly for \(p(t)\)) we implicitly think of them as fields at a given time. Indeed, as we’ll see, the time variable plays an exceptional role in the Lagrangian formalism, in contrast to our mindset so far that space and time are all mixed up in a Lorentz-invariant theory. Finally, we’ve used square brackets to distinguish functionals from regular functions of spacetime or momentum variables.
At the heart of the Lagrangian formalism lies a variational principle. Hence it’s crucial to be able to take infinitesimal variations on \(F[q(t), p(t)]\), which we write as follows
Here the infinitesimal fields \(\delta q_n\) and \(\delta p_n\) are assumed to (anti-)commute with all other fields. Now assuming \(F[q(t), p(t)]\) is written so that all the \(q\) fields lie to the left of all the \(p\) fields, then Eq.6.1.5 can be realized by the following definition of variational derivatives
6.1.1. Hamiltonian and Lagrangian for free fields
For free fields we have
where \(H_0\) is the free field Hamiltonian, also known as the symmetry generator for the time translation, or the energy operator. However, rather than thinking of it as an abstract operator as we’ve done so far, we’ll (momentarily) make it a functional of canonical variables. With this in mind, we can take the time derivative of Eq.6.1.6 as follows
We recognize these as the quantum analog of Hamilton’s equation of motion.
To turn \(H_0\) into a functional of canonical variables, we first make it a functional of creation and annihilation operators. Remembering that \(H_0\) is the energy operator, and \(p_0 = \sqrt{\pbf^2 + m^2}\) is the energy in the \(4\)-momentum, we can write \(H_0\) as a diagonal operator as follows
For simplicity, let’s consider the case of a real scalar field \(\psi(x)\) given by Eq.4.2.8 as follows
The canonical conjugate variable is
These look a bit far from Eq.6.1.8. But since \(H_0\) involves products like \(a^{\dagger}(\pbf, \sigma, n) a(\pbf, \sigma, n)\), let’s try to square the canonical variables as follows
and
and finally, inspired by the calculations above
Putting these calculations together in a specific way, and using the identity \(p_0^2 - \pbf^2 = m^2\), we can eliminate the blue terms as follows
Here we’ve encountered for the first time an infinite term (which we’ve marked in blue). As far as the Hamiltonian dynamics Eq.6.1.7 is concerned, adding a constant to the Hamiltonian makes no difference. Hence we can write the free Hamiltonian for real scalar fields as follows
Warning
Throwing away the infinite term in Eq.6.1.9 is an instance of a practice that draws a well-known criticism of quantum field theory – just because something is infinite doesn’t mean it’s zero. Indeed, Weinberg mentions on page 297 of [Wei95] that such “infinities” shouldn’t be thrown away when, for example, the fields are constrained within a finite space, or when gravity is involved.
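The Hamiltonian dynamics Eq.6.1.7 can be illustrated with a purely classical toy computation. The following is a minimal sketch and not part of the derivation: it assumes the free Hamiltonian has the standard form \(H_0 = \tfrac{1}{2} \int d^3 x \left( p^2 + (\nabla q)^2 + m^2 q^2 \right)\), replaces space by a periodic one-dimensional lattice, and integrates \(\dot{q} = \delta H_0 / \delta p\), \(\dot{p} = -\delta H_0 / \delta q\) with a leapfrog scheme. The lattice spacing `a`, the integrator, and all function names are illustrative choices.

```python
import math


def hamiltonian(q, p, m=1.0, a=1.0):
    """Lattice version of H0 = (1/2) * integral of (p^2 + (dq/dx)^2 + m^2 q^2)."""
    n = len(q)
    total = 0.0
    for i in range(n):
        grad = (q[(i + 1) % n] - q[i]) / a  # forward difference, periodic boundary
        total += 0.5 * a * (p[i] ** 2 + grad ** 2 + (m * q[i]) ** 2)
    return total


def p_dot(q, m=1.0, a=1.0):
    """Right-hand side of p_dot = -dH0/dq: lattice Laplacian minus the mass term."""
    n = len(q)
    return [
        (q[(i + 1) % n] - 2.0 * q[i] + q[(i - 1) % n]) / a ** 2 - m ** 2 * q[i]
        for i in range(n)
    ]


def leapfrog(q, p, dt, steps, m=1.0, a=1.0):
    """Integrate q_dot = p, p_dot = -dH0/dq with a kick-drift-kick leapfrog scheme."""
    q, p = list(q), list(p)
    f = p_dot(q, m, a)
    for _ in range(steps):
        p = [pi + 0.5 * dt * fi for pi, fi in zip(p, f)]  # half kick
        q = [qi + dt * pi for qi, pi in zip(q, p)]        # full drift (q_dot = p)
        f = p_dot(q, m, a)
        p = [pi + 0.5 * dt * fi for pi, fi in zip(p, f)]  # half kick
    return q, p


# Start from a single standing wave at rest and compare the energy before and after.
n = 32
q0 = [math.sin(2.0 * math.pi * i / n) for i in range(n)]
p0 = [0.0] * n
e0 = hamiltonian(q0, p0)
q1, p1 = leapfrog(q0, p0, dt=0.01, steps=2000)
e1 = hamiltonian(q1, p1)
drift = abs(e1 - e0) / e0
print(f"relative energy drift: {drift:.2e}")
```

The relative drift stays small because the leapfrog scheme respects the Hamiltonian structure of the equations. Adding a constant to `hamiltonian` would change neither `p_dot` nor the trajectory, which is the classical counterpart of the remark that a constant term in the Hamiltonian does not affect the dynamics.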
Now it’s time to introduce the rather mysterious Lagrangian, which can be derived from the Hamiltonian via the so-called Legendre transformation as follows
where each occurrence of \(p_n(t)\) is replaced by its expression in \(q_n(t)\) and \(\dot{q}_n(t)\).
As a concrete example, let’s consider again the real scalar field, where \(p = \dot{q}\). It follows that
It should be noted that expressing \(p\) in terms of \(q\) and \(\dot{q}\) isn’t always easy. Indeed, it’s far from obvious how the \(p_i\) defined by Eq.6.1.4 could be expressed in terms of the corresponding \(q_i\) and \(\dot{q}_i\). (Un)Fortunately, we’ll never really need to do so – writing down a Lagrangian turns out to be mostly guesswork.
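For the real scalar field, the Legendre transformation can be carried out in a few lines. As a sketch, assuming the free Hamiltonian takes the standard form \(H_0 = \tfrac{1}{2} \int d^3 x \left( p^2 + (\nabla q)^2 + m^2 q^2 \right)\) and substituting \(p = \dot{q}\):

\[\begin{split}L_0 &= \int d^3 x~p \dot{q} - H_0 \\ &= \int d^3 x \left( \dot{q}^2 - \tfrac{1}{2} \dot{q}^2 - \tfrac{1}{2} (\nabla q)^2 - \tfrac{1}{2} m^2 q^2 \right) \\ &= \frac{1}{2} \int d^3 x \left( \dot{q}^2 - (\nabla q)^2 - m^2 q^2 \right)\end{split}\]

The kinetic term keeps its sign, while the gradient and mass terms flip theirs.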
6.1.2. Hamiltonian and Lagrangian for interacting fields
Let \(H\) be the full Hamiltonian. Then the Heisenberg picture canonical variables can be defined as follows
Then obviously these canonical variables also satisfy the canonical (anti-)commutation relations
Moreover, the analog of Eq.6.1.7 holds as follows
As an example, we note that, in light of Eq.6.1.10, the full Hamiltonian for real scalar fields may be written as
where \(\Hscr(Q)\) is the perturbation term giving rise to the interaction.
6.2. The Lagrangian Formalism
We’ll leave aside the discussion of canonical variables for a bit to introduce the Lagrangian formalism in its most general form. After that we’ll play the game backwards. Namely, instead of constructing canonical variables out of the free fields that we’ve been exclusively considering since Quantum Fields and Antiparticles, we’ll get canonically conjugate fields out of the (magically appearing) Lagrangians, and then impose the canonical commutation relations Eq.6.1.1 on them – a procedure generally known as “quantization”.
In the classical physical theory of fields, a Lagrangian is a functional \(L[\Psi(t), \dot{\Psi}(t)]\), where \(\Psi(t)\) is any field and \(\dot{\Psi}(t)\) is its time derivative. Here we’ve capitalized the field variables to distinguish them from the free fields considered in the previous section. Define the conjugate fields as follows
so that the field equations are given by
Warning
Unlike the functional derivatives considered in Eq.6.1.5 for canonical variables, the functional derivative Eq.6.2.1, interpreted quantum mechanically, is not really well-defined since \(\Psi(t)\) and \(\dot{\Psi}(t)\) don’t in general satisfy a simple (same time) commutation relation. According to Weinberg (see footnote on page 299 in [Wei95]), “no important issues hinge on the details here”. So we’ll pretend that it behaves just like usual derivatives.
Indeed, recall that in the classical Lagrangian formalism, the field equations are given by a variational principle applied to the so-called action, defined as follows
The infinitesimal variation of \(I[\Psi]\) is given by
where for the last equality, integration by parts is used under the assumption that the infinitesimal variation \(\delta \Psi_n(t, \xbf)\) vanishes at \(t \to \pm\infty\). Obviously \(\delta I[\Psi]\) vanishes for any \(\delta \Psi_n(t, \xbf)\) if and only if Eq.6.2.2 is satisfied.
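The steps of this variation can be spelled out as follows. This is a sketch assuming the action is \(I[\Psi] = \int dt~L[\Psi(t), \dot{\Psi}(t)]\) and that the boundary term from the integration by parts vanishes:

\[\begin{split}\delta I[\Psi] &= \sum_n \int dt \int d^3 x \left( \frac{\delta L}{\delta \Psi_n(t, \xbf)} \delta \Psi_n(t, \xbf) + \frac{\delta L}{\delta \dot{\Psi}_n(t, \xbf)} \delta \dot{\Psi}_n(t, \xbf) \right) \\ &= \sum_n \int dt \int d^3 x \left( \frac{\delta L}{\delta \Psi_n(t, \xbf)} - \frac{d}{dt} \frac{\delta L}{\delta \dot{\Psi}_n(t, \xbf)} \right) \delta \Psi_n(t, \xbf)\end{split}\]

Setting the expression in parentheses to zero and writing \(\Pi_n = \delta L / \delta \dot{\Psi}_n\) gives the field equations in the form \(\dot{\Pi}_n = \delta L / \delta \Psi_n\).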
Now we’re interested in constructing Lorentz invariant theories, but an action defined by Eq.6.2.3 apparently distinguishes the time from space variables. This motivates the hypothesis that the Lagrangian itself is given by a spatial integral of a so-called Lagrangian density as follows
In terms of the Lagrangian density, we can rewrite the action Eq.6.2.3 as a \(4\)-integral as follows
Note
The Lagrangian density \(\Lscr(\Psi, \p_{\mu} \Psi)\) is to be considered as a function-valued functional of \(\Psi\) and \(\p_{\mu} \Psi\). Thus it makes sense to take partial derivatives, instead of variational derivatives, with respect to its variables such as \(\p \Lscr / \p \Psi\).
We’d also like to reexpress the field equations Eq.6.2.2 in terms of the Lagrangian density. To this end, let’s first calculate the variation of Eq.6.2.4 by an amount \(\delta \Psi_n(t, \xbf)\) as follows
It follows that
Combining these with Eq.6.2.1 and Eq.6.2.2, we derive the so-called Euler-Lagrange equations for the Lagrangian density
Note that the summing \(4\)-index \(\mu\) here represents \(x_{\mu}\). Most importantly, the field equations given by Eq.6.2.5 will be Lorentz invariant if \(\Lscr\) is. Indeed, guessing such \(\Lscr\) will be more or less the only way to construct Lorentz invariant (quantum) field theories.
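As a worked example, take the free real scalar field. This is a sketch assuming the Lagrangian density \(\Lscr = -\tfrac{1}{2} \p_{\mu} \Psi \p^{\mu} \Psi - \tfrac{1}{2} m^2 \Psi^2\), the Euler-Lagrange equations in the form \(\p_{\mu} \left( \p \Lscr / \p (\p_{\mu} \Psi) \right) = \p \Lscr / \p \Psi\), and the convention \(\square = \p_{\mu} \p^{\mu} = \nabla^2 - \p_0^2\) used earlier for the vector field:

\[\frac{\p \Lscr}{\p (\p_{\mu} \Psi)} = -\p^{\mu} \Psi, \quad \frac{\p \Lscr}{\p \Psi} = -m^2 \Psi \quad \implies \quad \left( \square - m^2 \right) \Psi = 0\]

This is the Klein-Gordon equation, which is manifestly Lorentz invariant because \(\Lscr\) is a Lorentz scalar.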
Note
The Lagrangian density \(\Lscr\) is assumed to be real for two reasons. First, if \(\Lscr\) were complex, then splitting it into the real and imaginary parts, Eq.6.2.5 would contain twice as many equations as there are fields, regardless of whether they are real or complex. This is undesirable because generically there will be no solutions. The second reason has to wait until the next section, where symmetries will be discussed. It turns out that the reality of \(\Lscr\) will guarantee that the symmetry generators are Hermitian.
Now recall from the previous section that the anchor of our knowledge is the Hamiltonian – we know what it must look like, at least for free fields. To go from the Lagrangian to the Hamiltonian, we use again the Legendre transformation (cf. Eq.6.1.11) to define the Hamiltonian as follows
Warning
In order to realize \(H\) as a functional of \(\Psi\) and \(\Pi\), one must in principle be able to solve for \(\dot{\Psi}_n\) in terms of \(\Psi_n\) and \(\Pi_n\) from Eq.6.2.1. This isn’t always easy, if possible at all, but it rarely poses serious difficulties in applications either.
As a double check, let’s verify that the Hamiltonian defined by Eq.6.2.6 also satisfies Hamilton’s equations (cf. Eq.6.1.7). Indeed, the variational derivatives are calculated, using Eq.6.2.1 and Eq.6.2.2, as follows
It’s therefore tempting to demand, in the Lagrangian formalism, that \(\Psi_n\) and \(\Pi_n\), defined by Eq.6.2.1, satisfy the canonical commutation relations. In other words, that they are (Heisenberg picture) canonically conjugate fields. But this is not true in general, as it turns out.
The issue is that the Lagrangian \(L[\Psi(t), \dot{\Psi}(t)]\) may contain a certain field, but not its time derivative. One example is spin-\(1\) vector fields, where we see from Eq.6.1.4 that the spatial fields \(\psi_i\) are part of the canonical variables, but not \(\psi_0\), which nonetheless should be present in the Lagrangian by Lorentz invariance. It turns out that what’s missing from the Lagrangian is \(\dot{\psi}_0\), which causes its conjugate variable defined by Eq.6.2.1 to vanish.
But instead of dealing with vector fields further, we’ll turn back to the general setting to establish the fundamental principles. Inspired by the above discussion, we can rewrite the Lagrangian as
where each \(Q_n(t)\) comes with a corresponding \(\dot{Q}_n(t)\), while the \(C(t)\) appear without time derivatives. It follows that one can define the canonical conjugates by
and hence the Hamiltonian takes the following form
6.3. Global symmetries
Of course, the reason for introducing the Lagrangian formalism is not to reproduce the Hamiltonians and the fields that we already knew. The main motivation is that, as we’ll see, the Lagrangian formalism provides a framework for studying symmetries. Recall from What is a Symmetry? that a symmetry was defined to be a(n anti-)unitary transformation on the Hilbert space of states, i.e., a transformation that preserves amplitudes. Now in the Lagrangian formalism, field equations come out of the stationary action condition. Therefore in this context, we’ll redefine a symmetry as an infinitesimal variation of the fields that leaves the action invariant. As it turns out, symmetries in this sense lead to conserved currents, which are nothing but the symmetry operators considered earlier. Hence besides a slight abuse of terminology, the notion of symmetries will be consistent.
Note
Throughout this section, repeated indexes like \(n\), which are used to index various fields, in an equation are not automatically summed up. On the other hand, repeated \(4\)-indexes like \(\mu\) do follow the Einstein summation convention.
Consider an infinitesimal variation
which leaves the action \(I[\Psi]\) invariant
A few remarks are in order. First of all, if we think of Eq.6.3.1 as an infinitesimal (unitary) symmetry transformation, then the coefficient \(\ifrak\) can be justified by the intention of making \(\Fscr_n(x)\) Hermitian. Next, although Eq.6.3.2 always holds when \(\Psi_n(x)\) is stationary, the infinitesimal \(\Fscr_n(x)\) being a symmetry demands that Eq.6.3.2 holds true for any \(\Psi_n(x)\). Finally, we emphasize that \(\epsilon\) being an infinitesimal constant, rather than a function of \(x\), is the defining property for the symmetry to be called “global”. Indeed, we’ll be dealing with symmetries that are not global in the next chapter, namely, the gauge symmetries.
The general principle that “symmetries imply conservation laws” is mathematically known as Noether’s theorem, but we’ll not bother with any mathematical formality here. To see how to derive conserved quantities from an assumed symmetry, let’s change Eq.6.3.1 as follows
where \(\epsilon(x)\) now is an infinitesimal function of \(x\). Under this variation, the corresponding \(\delta I\) may not vanish. But it must take the following form
because it must vanish when \(\epsilon(x)\) is constant. Here \(J^{\mu}(x)\) is a function(al) to be determined in individual cases, and is usually known as a current. Now if \(\Psi_n(x)\) satisfies the field equations, i.e., it’s a stationary point of the action, then Eq.6.3.4 must vanish for any \(\epsilon(x)\). Applying integration by parts (and assuming \(\Fscr_n(x)\) vanishes at infinity), we must have
which is the conservation law for \(J\), which can then be called a conserved current. One also gets a conserved quantity, i.e., a quantity that doesn’t change with time, by integrating Eq.6.3.5 over the \(3\)-space as follows
Unfortunately, not much more can be said about the conserved current \(J\) at this level of generality. This is, however, not the case if one imposes stronger assumptions on the symmetry, as we now explain.
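The chain of steps from the variation to the conserved quantity can be summarized as follows. This is a sketch assuming the sign convention \(\delta I = -\int d^4 x~J^{\mu}(x) \p_{\mu} \epsilon(x)\) for Eq.6.3.4, which is the one consistent with the explicit formulas Eq.6.3.7 and Eq.6.3.9 below, and assuming that all boundary terms vanish:

\[\begin{split}0 = \delta I &= -\int d^4 x~J^{\mu}(x) \p_{\mu} \epsilon(x) = \int d^4 x~\epsilon(x) \p_{\mu} J^{\mu}(x) \implies \p_{\mu} J^{\mu}(x) = 0 \\ F &\coloneqq \int d^3 x~J^0(t, \xbf) \implies \dot{F} = -\int d^3 x \sum_{i=1}^3 \p_i J^i(t, \xbf) = 0\end{split}\]

The first line uses that \(\epsilon(x)\) is arbitrary, and the second that the spatial integral of a total divergence vanishes.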
- Lagrangian-preserving symmetry
This is the first strengthening of the symmetry assumption. Namely, instead of assuming that the variation Eq.6.3.1 fixes the action, we assume that it fixes the Lagrangian itself. Namely,
(6.3.6) \[\delta L = \ifrak \epsilon \sum_n \int d^3 x \left( \frac{\delta L}{\delta \Psi_n(t, \xbf)} \Fscr_n(t, \xbf) + \frac{\delta L}{\delta \dot{\Psi}_n(t, \xbf)} \dot{\Fscr}_n(t, \xbf) \right) = 0\] Now let \(\epsilon(t)\) be a time-dependent infinitesimal in Eq.6.3.3. Then we can calculate \(\delta I\) under such variation as follows
\[\begin{split}\delta I &= \ifrak \sum_n \int dt \int d^3 x \left( \frac{\delta L}{\delta \Psi_n(t, \xbf)} \epsilon(t) \Fscr_n(t, \xbf) + \frac{\delta L}{\delta \dot{\Psi}_n(t, \xbf)} \frac{d}{dt} \big( \epsilon(t) \Fscr_n(t, \xbf) \big) \right) \\ &= \ifrak \sum_n \int dt \int d^3 x~\frac{\delta L}{\delta \dot{\Psi}_n(t, \xbf)} \dot{\epsilon}(t) \Fscr_n(t, \xbf)\end{split}\]Comparing with Eq.6.3.4, we can derive an explicit formula for the conserved quantity as follows
(6.3.7) \[F = -\ifrak \sum_n \int d^3 x~\frac{\delta L}{\delta \dot{\Psi}_n(t, \xbf)} \Fscr_n(t, \xbf)\] Indeed, one can verify directly that \(\dot{F}(t) = 0\) using Eq.6.3.6 together with the field equations Eq.6.2.1 and Eq.6.2.2.
- Lagrangian-density-preserving symmetry
Taking the previous assumption further, let’s impose the even stronger condition that the Lagrangian density is invariant under Eq.6.3.1. It means that
(6.3.8) \[\delta \Lscr = \ifrak \epsilon \sum_n \left( \frac{\p \Lscr}{\p \Psi_n} \Fscr_n(x) + \frac{\p \Lscr}{\p (\p_{\mu} \Psi_n)} \p_{\mu} \Fscr_n(x) \right) = 0\] Now under Eq.6.3.3, we can calculate the variation of the action as follows
\[\begin{split}\delta I &= \ifrak \sum_n \int d^4 x~\left( \frac{\p \Lscr}{\p \Psi_n} \epsilon(x) \Fscr_n(x) + \frac{\p \Lscr}{\p (\p_{\mu} \Psi_n)} \p_{\mu} \big( \epsilon(x) \Fscr_n(x) \big) \right) \\ &= \ifrak \sum_n \int d^4 x~\frac{\p \Lscr}{\p (\p_{\mu} \Psi_n)} \Fscr_n(x) \p_{\mu}\epsilon(x)\end{split}\]Comparing with Eq.6.3.4 as before, we can derive an explicit formula for the conserved current as follows
(6.3.9) \[J^{\mu}(x) = -\ifrak \sum_n \frac{\p \Lscr}{\p (\p_{\mu} \Psi_n)} \Fscr_n(x)\] Once again, one can directly verify that \(\p_{\mu} J^{\mu}(x) = 0\) using Eq.6.3.8 together with the Euler-Lagrange equation Eq.6.2.5.
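The direct verification is short. As a sketch, assuming the Euler-Lagrange equations Eq.6.2.5 take the form \(\p_{\mu} \left( \p \Lscr / \p (\p_{\mu} \Psi_n) \right) = \p \Lscr / \p \Psi_n\), one finds

\[\begin{split}\p_{\mu} J^{\mu}(x) &= -\ifrak \sum_n \left( \left( \p_{\mu} \frac{\p \Lscr}{\p (\p_{\mu} \Psi_n)} \right) \Fscr_n(x) + \frac{\p \Lscr}{\p (\p_{\mu} \Psi_n)} \p_{\mu} \Fscr_n(x) \right) \\ &= -\ifrak \sum_n \left( \frac{\p \Lscr}{\p \Psi_n} \Fscr_n(x) + \frac{\p \Lscr}{\p (\p_{\mu} \Psi_n)} \p_{\mu} \Fscr_n(x) \right) = 0\end{split}\]

where the last equality is Eq.6.3.8 divided by \(\ifrak \epsilon\).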
So far everything has been completely classical. To make it a quantum theory, we’ll involve the canonical fields introduced in Hamiltonian and Lagrangian for interacting fields. More precisely, instead of any \(\Fscr_n(t, \xbf)\), we’ll suppose that it takes the following form
where \(Q(t)\) is defined by Eq.6.1.13. Next, recall from Eq.6.2.8 that the field \(\Psi_n\) is either a \(Q_n\), in which case \(\delta L / \delta \dot{Q}_n = P_n\), or a \(C_n\), in which case the functional derivative vanishes.
Now in the case of a Lagrangian-preserving symmetry, the conserved quantity Eq.6.3.7 takes the following form
which of course is time-independent. Moreover, one can show that \(F\) in fact generates the quantum symmetry in the following sense
where we’ve taken advantage of the time-independence of \(F\) to arrange the same-time commutator.
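The commutator can be checked explicitly for bosonic fields. This is a sketch under two assumptions: that the conserved quantity has the form \(F = -\ifrak \sum_n \int d^3 y~P_n(t, \ybf) \Fscr_n(Q(t); \ybf)\), as Eq.6.3.7 suggests once \(\delta L / \delta \dot{Q}_n = P_n\), and that \([Q_n(t, \xbf), P_{n'}(t, \ybf)] = \ifrak \delta^3(\xbf-\ybf) \delta_{n n'}\). The notation \(\Fscr_n(Q(t); \ybf)\) is used here only to indicate that \(\Fscr_n\) depends on the \(Q\) fields alone, so that it commutes with \(Q_{n'}(t, \xbf)\). Then

\[\left[ F, Q_{n'}(t, \xbf) \right] = -\ifrak \sum_n \int d^3 y \left[ P_n(t, \ybf), Q_{n'}(t, \xbf) \right] \Fscr_n(Q(t); \ybf) = -\Fscr_{n'}(Q(t); \xbf)\]

so that commuting with \(F\) reproduces the variation of the field up to a constant factor.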
6.3.1. Spacetime translations
So far the symmetries have been rather abstract. To make things more explicit, and also to get warmed up for the general case, let’s assume the action is invariant under the (spacetime) translation transformation given as follows
Comparing with Eq.6.3.1 we see that
It follows from Eq.6.3.4 and Eq.6.3.5 that there exists a conserved \(4\)-current \({T^{\nu}}_{\mu}\), which is known as the energy-momentum tensor, such that
The corresponding conserved quantities then take the form
such that \(\dot{P}_{\mu} = 0\). Here it’s important to not confuse \(P_{\mu}\) with a canonical variable – it’s just a conserved quantity which turns out to be the \(4\)-momentum.
Now recall from Eq.6.2.4 that the Lagrangian is usually the spatial integral of a density functional. Hence it’s not unreasonable to suppose that the Lagrangian is indeed invariant under spatial translations. Under this assumption, we can rewrite Eq.6.3.10 as follows
with the understanding that \(\Psi_n = Q_n\).
To verify that \(\Pbf\) indeed generates spatial translations, let’s calculate using the fact that \(\Pbf\) is time-independent as follows
It follows that
for any functional \(\Gscr\) that doesn’t explicitly involve \(\xbf\). This verifies that \(\Pbf\) indeed generates the spatial translation.
In contrast, one cannot expect the Lagrangian to be invariant under time translation if there is any interaction. But we already know the operator that generates time translation, namely, the Hamiltonian. In other words, we define \(P_0 \coloneqq -H\) such that
for any functional \(\Gscr\) that doesn’t explicitly involve \(t\).
In general, the Lagrangian density is not invariant under spacetime translations. However, it turns out that the conserved current, which in this case is \({T^{\mu}}_{\nu}\), can nonetheless be calculated. To spell out the details, let’s consider the following variation
The corresponding variation of the action is given as follows
where we’ve used the chain rule for derivatives in the second equality, and integration by parts in the third. Comparing with Eq.6.3.4, we see that
Note
The energy-momentum tensor \({T^{\nu}}_{\mu}\) is not yet suitable for general relativity since it’s not symmetric. As we’ll see in Lorentz symmetry, when taking homogeneous Lorentz transformation symmetry into account, one can supplement \({T^{\nu}}_{\mu}\) with some extra terms to make it both conserved and symmetric.
Indeed, this calculation recovers Eq.6.3.13 by letting \(\nu = 0\) and \(\mu \neq 0\). Moreover, it recovers the Hamiltonian by letting \(\mu = \nu = 0\) as follows
6.3.2. Linear transformations
As another example, let’s consider linear variations as follows
where we’ve adopted the Einstein summation convention for repeated upper and lower indexes because it’d otherwise be too tedious to write out the summations. Here \((t_{\square})^{\square}_{\square}\) should furnish a representation of the Lie algebra of the symmetry group.
As before, the invariance of action under such variations implies the existence of conserved currents \(J^{\mu}_a\) such that
as well as the conserved quantity
If, in addition, the Lagrangian is invariant under such variations, then \(T_a\) takes the following form by Eq.6.3.10
It follows that
In particular, when \(t_a\) is diagonal (e.g., in electrodynamics), the operators \(Q^n\) and \(P_n\) may be regarded as raising/lowering operators. In fact, we claim that \(T_a\) form a Lie algebra by the following calculation
Now if \(t_a\) form a Lie algebra with structure constants \({f_{ab}}^c\) as follows
then
In other words, the conserved quantities also form the same Lie algebra.
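The Lie algebra claim can be sketched for bosonic fields. Assume, following Eq.6.3.7 with \(\Fscr^n = {(t_a)^n}_m Q^m\), that \(T_a = -\ifrak \int d^3 x~P_n(t, \xbf) {(t_a)^n}_m Q^m(t, \xbf)\), and that \([t_a, t_b] = \ifrak {f_{ab}}^c t_c\). Both forms are assumptions chosen to be consistent with the conventions above. Suppressing the spatial integrals and the delta functions they absorb, the canonical commutation relations give

\[\begin{split}\left[ T_a, T_b \right] &= -{(t_a)^n}_m {(t_b)^k}_l \left[ P_n Q^m, P_k Q^l \right] \\ &= -{(t_a)^n}_m {(t_b)^k}_l \left( \ifrak \delta^m_k P_n Q^l - \ifrak \delta^l_n P_k Q^m \right) \\ &= -\ifrak P_n {\left( [t_a, t_b] \right)^n}_m Q^m \\ &= \ifrak {f_{ab}}^c T_c\end{split}\]

The second line uses \([Q^m, P_k] = \ifrak \delta^m_k\) and \([P_n, Q^l] = -\ifrak \delta^l_n\), and the last line inserts the assumed algebra of the matrices \(t_a\).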
Now if, in addition, the Lagrangian density is also invariant, then Eq.6.3.9 takes the following form
Note that since \(\Lscr\) doesn’t have \(\dot{C}_r\) dependencies, we have the following by letting \(\mu = 0\) in Eq.6.3.17
whose equal-time commutation relations with canonical variables \(P\) and \(Q\) can be easily calculated.
6.3.3. Lorentz invariance
The goal of this section is to show that the Lorentz invariance of the Lagrangian density implies the Lorentz invariance of the S-matrix, which justifies our interest in the Lagrangian formalism in the first place.
Recall from Eq.1.2.16 and Eq.1.2.17 that
is a \((\mu, \nu)\)-parametrized anti-symmetric variation. It follows then from Eq.6.3.4 that there exist \((\mu, \nu)\)-parametrized anti-symmetric conserved currents as follows
which, in turn, give conserved quantities
such that \(\dot{J}^{\mu \nu} = 0\). These, as we’ll see, turn out to be rather familiar objects that we’ve encountered as early as in Eq.1.2.18.
In light of Eq.6.3.9, one can work out an explicit formula for \(\Mscr^{\rho \mu \nu}\) if the Lagrangian density is invariant under the symmetry transformation. Now since the Lagrangian density is expressed in terms of quantum fields, one’d like to know how they transform under Lorentz transformations. Since the translation symmetry has already been dealt with in Spacetime translations, we’ll consider here homogeneous Lorentz transformations. Luckily this has been worked out already in Quantum Fields and Antiparticles. More precisely, recall from Eq.4.4.1 that the variation term can be written as follows
where \(\Jscr\) are matrices satisfying Eq.4.4.2. The corresponding derivatives then have the following variation term
where the second summand on the right-hand side corresponds to the fact that the Lorentz transformation also acts on the spacetime coordinates.
Now the invariance of the Lagrangian density under such variation can be written as follows
Since \(\omega^{\mu \nu}\) is not in general zero, its coefficient must be zero, which, taking Eq.6.3.18 into account, implies the following
Using the Euler-Lagrange equation Eq.6.2.5, we can get rid of the \(\p \Lscr / \p \Psi_n\) term in Eq.6.3.21 to arrive at the following
where we’ve also used Eq.6.3.16. Now we can address the issue of the energy-momentum tensor not being symmetric by introducing the following so-called Belinfante tensor
which is both conserved in the sense that
and symmetric in the sense that
Indeed Eq.6.3.24 follows from the observation that the term inside the parenthesis of Eq.6.3.23 is anti-symmetric in \(\mu\) and \(\kappa\), and Eq.6.3.25 is a direct consequence of Eq.6.3.22.
The conserved quantities corresponding to \(\Theta_{\mu\nu}\), according to Eq.6.3.12 are
where the first equality holds because, again, the term in the parenthesis of Eq.6.3.23 is anti-symmetric in \(\mu\) and \(\kappa\), and therefore \(\kappa \neq 0\) given \(\mu = 0\). Hence it’s at least equally legitimate to call \(\Theta_{\mu \nu}\) the energy-momentum tensor. Indeed, the fact that \(\Theta_{\mu \nu}\) is symmetric makes it suitable for general relativity.
Unlike the other conserved currents, which are derived under the general principles explained in Global symmetries, we’ll construct the anti-symmetric \(\Mscr^{\rho \mu \nu}\) declared in Eq.6.3.19 by hand as follows
While Eq.6.3.18 is automatically satisfied by definition, we can verify Eq.6.3.19, using Eq.6.3.24 and Eq.6.3.25, as follows
Moreover Eq.6.3.20 takes the following form
Now if we consider the rotation generators defined by
then it follows from Eq.6.3.15 that
since \(\Jbf\) doesn’t explicitly involve \(t\). This recovers one of the commutation relations Eq.1.2.22 for the Poincaré algebra. Next, let’s verify the commutation relation between \(\Pbf\) and \(\Jbf\), using Eq.6.3.14 and Eq.6.3.26, as follows
What come next are the boost operators defined as follows [1]
Bringing down the index, we can rewrite Eq.6.3.28 in vector form as follows
Now it follows from Eq.6.3.15 that
which is consistent with Eq.1.2.22.
Finally, using Eq.6.3.14 and Eq.6.3.29 together with the fact that \(\Pbf\) commutes with itself, one can evaluate the commutator between \(\Pbf\) and \(\Kbf\) as follows
which, again, is consistent with Eq.1.2.22.
It turns out that, following the line of argument in Lorentz symmetry of S-matrix, these commutation relations are enough to show the Lorentz invariance of the S-matrix under the same “smoothness” assumptions on the interaction terms. In addition, the other commutation relations between \(H, \Pbf, \Jbf, \Kbf\) also follow.
Though not necessary, it’s indeed possible to verify the other Poincaré algebra relations directly. In particular, the commutation relations between the rotation generators are verified as follows.
6.4. Transition to Interaction Picture#
In this section, we will investigate, through examples, how to derive from the Lagrangian formalism an interaction picture, on which our entire approach to quantum field theory has been based. As a byproduct, we will also generalize the quantization procedure considered in Quantization of Free Scalar Fields.
6.4.1. Scalar field with derivative coupling#
In light of the Lagrangian Eq.6.1.12 for free scalar field, let’s consider the following Lagrangian density with derivative coupling and interaction
where the coupling \(J^{\mu}\) may be either a scalar current or a functional of other fields, and should not be confused with the conserved quantity defined in Eq.6.3.27.
Now the canonical conjugate variable \(\Pi\) is, according to Eq.6.2.1, given by
and the Hamiltonian is, according to Eq.6.2.9 and Eq.6.4.2, given by
In light of Eq.6.1.15, we recognize the first three summands in the integrand as the free Hamiltonian density, and the rest as the interaction density. More explicitly, we can rewrite \(H\) as follows
Now we can pass to the interaction picture in the sense of Eq.2.5.7 as follows
where, for example, \(\pi(t, \xbf) \coloneqq e^{\ifrak H_0 t} \Pi(0, \xbf) e^{-\ifrak H_0 t}\). Moreover, we note that the free Hamiltonian \(H_0\) is time-independent, and recovers Eq.6.2.11.
Finally, in order to get the interaction density in terms of fields as explained in Quantum Fields and Antiparticles, we simply replace \(\pi(t, \xbf)\) with \(\dot{\phi}(t, \xbf)\) to get the following
It’s said in [Wei95] that the manifestly non-Lorentz-invariant summand \(\tfrac{1}{2} (J^0(t, \xbf))^2\) corresponds exactly to the local term in Eq.5.2.14, but I haven’t been able to see how.
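The Legendre transform of this subsection can be traced term by term. The following is a sketch under the assumption that Eq.6.4.1 has the form used in [Wei95], with metric signature \((-,+,+,+)\) and a non-derivative interaction \(\Hscr(\Phi)\). The overall signs are assumptions, the spacetime arguments are suppressed, and repeated spatial indices \(i\) are summed.

\[
\begin{aligned}
\Lscr &= -\tfrac{1}{2} \p_{\mu} \Phi \p^{\mu} \Phi - \tfrac{1}{2} m^2 \Phi^2 - J^{\mu} \p_{\mu} \Phi - \Hscr(\Phi) \\
      &= \tfrac{1}{2} \dot{\Phi}^2 - \tfrac{1}{2} (\nabla \Phi)^2 - \tfrac{1}{2} m^2 \Phi^2 - J^0 \dot{\Phi} - J^i \p_i \Phi - \Hscr(\Phi), \\
\Pi &= \frac{\p \Lscr}{\p \dot{\Phi}} = \dot{\Phi} - J^0, \\
\Pi \dot{\Phi} - \Lscr &= \tfrac{1}{2} \Pi^2 + \tfrac{1}{2} (\nabla \Phi)^2 + \tfrac{1}{2} m^2 \Phi^2 + \Pi J^0 + J^i \p_i \Phi + \tfrac{1}{2} (J^0)^2 + \Hscr(\Phi).
\end{aligned}
\]

The local term comes from completing the square: with \(\dot{\Phi} = \Pi + J^0\), the combination \(\Pi \dot{\Phi} - \tfrac{1}{2} \dot{\Phi}^2 + J^0 \dot{\Phi}\) equals \(\tfrac{1}{2} (\Pi + J^0)^2\). In the interaction picture, replacing \(\pi\) by \(\dot{\phi}\) turns \(\pi J^0 + J^i \p_i \phi\) into \(J^{\mu} \p_{\mu} \phi\), which leaves \(\tfrac{1}{2} (J^0)^2\) as the only summand that is not manifestly Lorentz invariant.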
6.4.2. Vector field with spin-\(1\)#
We start with a very general Lagrangian density defined as follows
where \(J^{\mu}\) is a coupling current just as in the case of scalar fields. As explained in Vector Fields, vector fields may come with spin \(0\) or \(1\), and we would like to consider here only the spin-\(1\) part. To achieve this, consider the Euler-Lagrange equation Eq.6.2.5 which takes the following form
Taking the divergence, we get
which we recognize, in light of Eq.6.2.12, as the field equation of a scalar field, or more precisely, a spin-\(0\) vector field. To avoid the appearance of \(\p_{\mu} V^{\mu}\) as an independently propagating (scalar) field, we take \(\alpha = -\beta = 1\), so that \(\p_{\mu} V^{\mu}\) may be expressed in terms of the external current \(J\).
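As a sketch of the computation just described, assume that Eq.6.4.4 has the form used in [Wei95]. The overall normalization and the sign of the coupling term below are assumptions, and they only affect the sign on the right-hand sides.

\[
\begin{aligned}
\Lscr &= -\tfrac{1}{2} \alpha~\p_{\mu} V_{\nu} \p^{\mu} V^{\nu} - \tfrac{1}{2} \beta~\p_{\mu} V_{\nu} \p^{\nu} V^{\mu} - \tfrac{1}{2} m^2 V_{\mu} V^{\mu} - J_{\mu} V^{\mu}, \\
0 &= -\alpha~\square V^{\nu} - \beta~\p^{\nu} (\p_{\mu} V^{\mu}) + m^2 V^{\nu} + J^{\nu}, \\
0 &= -(\alpha + \beta)~\square (\p_{\nu} V^{\nu}) + m^2~\p_{\nu} V^{\nu} + \p_{\nu} J^{\nu}.
\end{aligned}
\]

The last line is obtained by applying \(\p_{\nu}\) to the second. For \(\alpha + \beta \neq 0\) it is a Klein-Gordon type equation for \(\p_{\nu} V^{\nu}\) with a source, while for \(\alpha = -\beta\) the second-derivative term drops out and \(m^2~\p_{\nu} V^{\nu} = -\p_{\nu} J^{\nu}\) becomes an algebraic relation rather than a propagating degree of freedom.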
Now we can rewrite Eq.6.4.4 as follows
where
in analogy with Eq.4.7.19, where we’ve tried to construct a tensor for massless spin-\(1\) particles. This is the general Lagrangian for spin-\(1\) vector fields.
To work out the canonical variables, we note that
which is nonzero for \(\mu \neq 0\). It follows that for spatial indices \(i\), we have the canonical variables \(V_i\) whose canonical conjugates are, according to Eq.6.2.1, given by
while \(V_0\) is auxiliary since \(\p \Lscr / \p \dot{V}_0 = 0\). It turns out \(V_0\) can be explicitly solved in terms of the other fields as follows. Setting \(\mu = 0\) in Eq.6.4.5 and remembering \(\alpha = -\beta = 1\), we have
To be able to write down the Hamiltonian, we need the following preparations. First, using Eq.6.4.8, we can write
and second, we can expand
Putting these all together, we can finally write down the Hamiltonian as follows
where the last equality serves the purpose of separating the Hamiltonian \(H = H_0 + V\) into the free part and the interacting part.
Now we pass to the interaction picture by writing
Hamilton’s equations Eq.6.1.7 take the following form (cf. Eq.6.2.7)
We’re still missing \(v^0\) since \(V^0\) was an auxiliary variable, which was solved by Eq.6.4.9. Inspired by Eq.6.4.9, let’s define
This enables us to rewrite Eq.6.4.11 as follows
where we see again, just as in Eq.6.4.3, the non-Lorentz-invariant local term.
We can now solve for \(\bm{\pi}\) using Eq.6.4.13 and Eq.6.4.12 by
and write down the field equations, again using Eq.6.4.13 and Eq.6.4.12 as follows
These two equations can be unified using the \(4\)-vector notation as follows
Taking the divergence, we get
in agreement with Eq.4.3.19. Moreover it follows that Eq.6.4.14 reduces to the Klein-Gordon equation
General real solutions to Eq.6.4.15 and Eq.6.4.16 take the following form
where \(p^0 = \sqrt{\pbf^2 + m^2}\) and \(\sigma\) takes values in \(\{-1, 0, 1\}\) by convention, so that \(e(\pbf, \sigma)\) correspond to the three \(4\)-vectors orthogonal to \(p\), namely
As explained in Eq.4.3.9, Eq.4.3.12 and Eq.4.3.13, we may choose the polarization vectors \(e^{\mu}(\pbf, \sigma)\) such that
It’s then straightforward to verify the desired canonical commutation relations Eq.6.1.1
given that the following hold
This is rather convincing evidence for the validity of the free field Hamiltonian Eq.6.4.11.
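The two properties of the polarization vectors that enter this verification, orthogonality to \(p\) and the spin sum, can be checked exactly in a small example. The sketch below makes several assumptions that are not taken from the text: the metric is \(\eta = \mathrm{diag}(-1, 1, 1, 1)\) with the time index first, the spin sum is taken to be \(\eta^{\mu \nu} + p^{\mu} p^{\nu} / m^2\), and a real linear polarization basis is used instead of the helicity basis labeled by \(\sigma\) (the spin sum does not depend on this choice). The function names are illustrative only.

```python
from fractions import Fraction

# Assumed metric: signature (-, +, +, +), index order (0, 1, 2, 3).
ETA = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]


def dot(a, b):
    """Lorentz inner product eta_{mu nu} a^mu b^nu of two 4-vectors."""
    return sum(ETA[i][i] * a[i] * b[i] for i in range(4))


def polarization_vectors(pz, m, energy):
    """Three real polarization 4-vectors for the momentum (energy, 0, 0, pz).

    Two transverse vectors plus one longitudinal vector; this is a linear
    basis, not the helicity basis used in the text.
    """
    return [
        [0, 1, 0, 0],
        [0, 0, 1, 0],
        [Fraction(pz, m), 0, 0, Fraction(energy, m)],
    ]


def spin_sum(vectors):
    """Matrix sum_e e^mu e^nu over the given (real) polarization vectors."""
    return [[sum(e[mu] * e[nu] for e in vectors) for nu in range(4)]
            for mu in range(4)]


# A Pythagorean triple keeps everything rational: 5^2 = 4^2 + 3^2.
m, pz, energy = 3, 4, 5
p = [energy, 0, 0, pz]
es = polarization_vectors(pz, m, energy)

# Expected spin sum eta^{mu nu} + p^mu p^nu / m^2 under the assumed metric.
expected = [[ETA[mu][nu] + Fraction(p[mu] * p[nu], m * m) for nu in range(4)]
            for mu in range(4)]

print([dot(e, p) for e in es])
print(spin_sum(es) == expected)
```

Exact rational arithmetic is used so that the comparison is an identity rather than a floating-point approximation.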
6.4.3. Dirac field with spin-\(1/2\)#
Recall from Eq.4.4.9 that the Dirac representation is not unitary, which eventually led to the definition of \(\bar{\psi}\) in Eq.4.4.45 for constructing interaction densities for Dirac fields. Motivated by discussions in Construction of the interaction density and the desire to make Lagrangian real, let’s consider the following
where \(\Hscr\) is a real function. Such \(\Lscr\) is nonetheless not real, which can be seen by the following calculation using Eq.4.4.10, Eq.4.4.12, and Eq.4.4.45
However, the same calculation shows that the action, i.e., the spacetime integral of the Lagrangian, is real. It follows that one need not treat \(\Psi\) and \(\bar{\Psi}\) as independent variables, since the field equations for \(\Psi\), given as the stationary points of the action functional, are adjoint to those for \(\bar{\Psi}\). Therefore we can simply define the canonical conjugate
and write the Hamiltonian
where the two summands in the last integrand give the usual splitting \(H = H_0 + V\).
Passing to the interaction picture, we can write
so that the field equation is simply
which is recognized as the Dirac equation Eq.4.4.37. Indeed, the other Hamilton’s equation \(\dot{\pi} = -\delta H_0 / \delta \psi\) gives nothing but the adjoint of the Dirac equation, hence no new information.
A general solution to Eq.6.4.20 can be written as
where \(u(\pbf, \pm 1/2)\) are the two independent solutions of
and similarly \(v(\pbf, \pm 1/2)\) are solutions of
Now observe that \(\ifrak \gamma^{\mu} p_{\mu}\), being a \(4 \times 4\) matrix, has two eigenvalues \(\pm m\). It follows from a straightforward matrix calculation that
must be proportional to the projection \(-\ifrak \gamma^{\mu} p_{\mu} + m\). Likewise
must be proportional to the other projection \(\ifrak \gamma^{\mu} p_{\mu} + m\).
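The eigenvalue and projection claims can be made explicit. The following sketch assumes the conventions of [Wei95], namely \(\{\gamma^{\mu}, \gamma^{\nu}\} = 2 \eta^{\mu \nu}\) together with the mass-shell condition \(p_{\mu} p^{\mu} = -m^2\); these are assumptions about the conventions behind Eq.4.4.37, and \(P_{\pm}\) is a name introduced here for illustration.

\[
\begin{aligned}
\left( \ifrak \gamma^{\mu} p_{\mu} \right)^2 &= -\tfrac{1}{2} \left\{ \gamma^{\mu}, \gamma^{\nu} \right\} p_{\mu} p_{\nu} = -p_{\mu} p^{\mu} = m^2, \\
P_{\pm} &\coloneqq \frac{m \pm \ifrak \gamma^{\mu} p_{\mu}}{2m}, \qquad
P_{\pm}^2 = P_{\pm}, \qquad P_+ P_- = 0, \qquad P_+ + P_- = 1.
\end{aligned}
\]

The first line shows that the only possible eigenvalues are \(\pm m\), and since the gamma matrices are traceless each eigenvalue occurs with multiplicity two. This matches the two independent solutions \(u(\pbf, \pm 1/2)\) and the two independent solutions \(v(\pbf, \pm 1/2)\).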
Hence we can normalize \(u, v\) such that
which is consistent with the spin sum calculations Eq.4.4.33.
Knowing that spin \(1/2\) particles are fermions, one can verify that the canonical anti-commutation relations
are satisfied if operators \(a, b\) satisfy the following
and all the other anti-commutators vanish.
6.5. Constraints and Dirac Brackets#
We’ve seen in the case of spin-1 massive vector fields that the main difficulty in deriving Hamiltonian from Lagrangian is the appearance of constraints. In this specific case, the constraints came from the vanishing of certain canonical variables
according to Eq.6.4.7, as well as relations among canonical variables Eq.6.4.9 coming from the Euler-Lagrange equation Eq.6.4.5. In this case, we were lucky in the sense that the latter constraint gives us an explicit solution of \(V_0\), which is exactly the conjugate canonical variable of \(\Pi^0\), so we ended up in the comfortable situation again with only unconstrained canonical variables left.
We will not always be so lucky and therefore we need a more systematic solution to the problem, which was offered by Dirac. According to him, constraints like Eq.6.5.1 that come directly out of the structure of the Lagrangian, e.g., missing time derivatives of some fields, are called primary constraints. In addition, there may exist further constraints from the requirement of the equation of motion, e.g., the Euler-Lagrange equation, being consistent with the primary constraints. These are then called secondary constraints. In practice, it’s often the case that the primary and secondary constraints are considered together, and their distinction is not important.
What is important about the constraints is the distinction between the so-called first and second class constraints, which we now explain. The difficulty in defining conjugate canonical variables boils down to the incompatibility between canonical commutation relations Eq.6.1.1 and the constraints, and the solution from Dirac is simply to (re)define the bracket, known as the Dirac bracket.
The first step is to recall the Poisson bracket from classical mechanics. Let \(L(\Psi, \dot{\Psi})\) be any Lagrangian regarded as a function of fields \(\Psi_a(t)\) and their time derivatives \(\dot{\Psi}_a(t)\). Here the (compound-)index \(a\) may contain continuous parameters such as the spatial coordinates. Define the canonical conjugates
for all \(a\). Of course the \(\Psi\) and \(\Pi\) are not all independent variables, but rather are subject to primary and secondary constraint equations. Now the Poisson bracket between any two functions \(A, B\) of the canonical variables is defined as follows
where the partial derivatives are calculated without taking the constraints into account. It holds therefore trivially that
Warning
Since we’re working within the framework of the canonical formalism, all commutators are taken at the same time. This rule is understood throughout this section, although it’s nowhere explicit in any formula.
Now if \(\Psi\) and \(\Pi\) are all independent variables, then \(\left[ \Psi_a, \Pi^b \right] = \ifrak \left[ \Psi_a, \Pi^b \right]_P\) would give the desired commutation relations. But the existence of constraints would require a modification to the Poisson bracket and eventually lead to the Dirac bracket.
As a side note, it follows from Eq.6.5.2 that the Hamilton’s equations Eq.6.1.7 can be written in the following form
Let’s write a generic constraint as \(\chi_N = 0\) where \(\chi_N\) is a function of the canonical variables \(\Psi, \Pi\) and \(N\) is indexing the constraints. Again, here \(N\) may contain continuous parameters such as spacetime coordinates. Since the constraints come out of the Lagrangian itself and the Euler-Lagrange equations, they are constant along the trajectory of motion, i.e., \(\dot{\chi}_N = 0\) whenever \(\chi_N = 0\). It follows that
whenever \(\chi_N = 0\). It turns out that one of the key features of the Dirac bracket is to upgrade Eq.6.5.3 so that it holds for any function (of the canonical variables) in place of \(H\).
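The step from Hamilton’s equations to a bracket with \(H\) can be spelled out as follows. This is a sketch assuming the standard form of the Poisson bracket, in which the sum over the compound index \(a\) includes an integral over its continuous part and the partial derivatives are then functional derivatives; \(A\) is any function of the canonical variables without explicit time-dependence.

\[
\begin{aligned}
\left[ A, B \right]_P &= \frac{\p A}{\p \Psi_a} \frac{\p B}{\p \Pi^a} - \frac{\p B}{\p \Psi_a} \frac{\p A}{\p \Pi^a}, \\
\dot{A} &= \frac{\p A}{\p \Psi_a} \dot{\Psi}_a + \frac{\p A}{\p \Pi^a} \dot{\Pi}^a
         = \frac{\p A}{\p \Psi_a} \frac{\p H}{\p \Pi^a} - \frac{\p A}{\p \Pi^a} \frac{\p H}{\p \Psi_a}
         = \left[ A, H \right]_P.
\end{aligned}
\]

Taking \(A = \chi_N\), the requirement \(\dot{\chi}_N = 0\) becomes \(\left[ \chi_N, H \right]_P = 0\) on the constraint surface.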
6.5.1. First class constraints#
A constraint is of first class if it Poisson commutes with all other constraints. Such constraints typically arise from Lagrangians that carry gauge symmetries. The presence of gauge symmetry makes the system apparently underdetermined in the sense that there are more fields or their components than field equations.
Unfortunately, there appears to be no general recipe for handling first class constraints. However, it can typically be handled by “fixing the gauge”. A particularly important, and successful, example of such procedure, namely quantum electrodynamics, will be presented in the next chapter.
6.5.2. Second class constraints#
Assuming the first class constraints have been dealt with, the remaining constraints are called second class. On the space of second class constraints, we have a non-singular matrix \(C\) whose entries are defined by
Note
Since an anti-symmetric matrix of odd dimension necessarily has vanishing determinant, the dimension of \(C\) must be even. Indeed, it’s often convenient to pair constraints in the form of \(\chi_{1N}, \chi_{2N}\) and so on.
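The determinant statement follows from a one-line computation, valid for any anti-symmetric \(n \times n\) matrix \(C\):

\[
\det C = \det C^T = \det (-C) = (-1)^n \det C,
\]

so that \(\det C = 0\) whenever \(n\) is odd.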
Now define the Dirac bracket as follows
One checks easily that the Dirac bracket satisfies the same (Lie) algebraic properties as the Poisson bracket. Moreover, it satisfies
for any \(B\). It is this last property that guarantees the compatibility between commutator relations and constraints if the former is calculated as follows
6.5.3. Spin-\(1\) vector field revisited#
We’ll have to wait until the next chapter to illustrate how first class constraints may appear and how they may be handled, since they appear in the theory of massless helicity-\(1\) vector fields. But we’re ready to illustrate, in the absence of first class constraints, how second class constraints may be handled by the Dirac bracket.
Recall from Eq.6.4.7 that since \(\dot{V}_0\) is missing from \(\Lscr\), we get a primary constraint
and from the Euler-Lagrange equations Eq.6.4.9 a secondary constraint
Here we remind ourselves again that the time-dependence has been left out since all commutators will be taken at equal time.
The \(C\) matrix can now be calculated as follows
Clearly \(C\) is non-singular. Hence no first class constraints exist, and Dirac’s method applies.
In this case, the constrained canonical variables are \(V_0\) and \(\Pi^0\). Instead of solving them in terms of the unconstrained canonical variables explicitly as before, simply calculate the commutators using Dirac bracket as follows. First note that
It follows from Eq.6.5.6 and Eq.6.5.5 that
Together with the trivial Poisson bracket relations
we can now calculate all the commutation relations as follows
This turns out to be the same as if we use the explicit Eq.6.5.1 and Eq.6.4.9, as well as the canonical commutation relations among the unconstrained canonical variables, to calculate the commutators.
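The agreement between the Dirac bracket and the explicit solution of the constraints can be checked exactly in a finite-dimensional toy model. The sketch below is an illustration under stated assumptions, not the field-theoretic calculation: the phase space is \((q_1, q_2, p_1, p_2)\), where \(q_2, p_2\) mimic \(V^0, \Pi^0\) and \(q_1, p_1\) mimic one unconstrained pair; the hypothetical constant `c` stands in for the spatial divergence and `m2` for \(m^2\); the external current is dropped since a constant shift does not affect brackets. The Dirac bracket is assumed to take the standard form \([A, B]_D = [A, B]_P - [A, \chi_N]_P (C^{-1})^{NM} [\chi_M, B]_P\) with \(C_{NM} = [\chi_N, \chi_M]_P\). Only linear functions are handled, which suffices because both constraints are linear.

```python
from fractions import Fraction

# Toy phase space z = (q1, q2, p1, p2). A linear function f = a . z is stored
# as its coefficient tuple a, so its partial derivatives are just the entries.
Q1, Q2 = (1, 0, 0, 0), (0, 1, 0, 0)
P1, P2 = (0, 0, 1, 0), (0, 0, 0, 1)


def poisson(a, b):
    """Poisson bracket of the linear functions a.z and b.z."""
    n = len(a) // 2
    return sum(a[i] * b[n + i] - b[i] * a[n + i] for i in range(n))


def dirac(a, b, chi1, chi2):
    """Dirac bracket for a single pair of second class constraints."""
    c12 = poisson(chi1, chi2)
    if c12 == 0:
        raise ValueError("C is singular: the constraints are not second class")
    # C = [[0, c12], [-c12, 0]], hence C^{-1} = [[0, -1/c12], [1/c12, 0]].
    c_inv = ((0, -Fraction(1) / c12), (Fraction(1) / c12, 0))
    chis = (chi1, chi2)
    correction = sum(
        poisson(a, chis[n]) * c_inv[n][m] * poisson(chis[m], b)
        for n in range(2) for m in range(2)
    )
    return poisson(a, b) - correction


# Hypothetical parameters: m2 plays the role of m^2, c that of the divergence.
m2, c = Fraction(4), Fraction(3)
CHI1 = P2               # analogue of the primary constraint p2 = 0
CHI2 = (0, -m2, c, 0)   # analogue of the secondary constraint c*p1 - m2*q2 = 0

# Explicit solution of the constraints: q2 = (c / m2) * p1 and p2 = 0.
Q2_SOLVED = (0, 0, c / m2, 0)
P2_SOLVED = (0, 0, 0, 0)

# Dirac bracket with the constraints kept, versus Poisson bracket after
# eliminating the constrained variables by hand.
print(dirac(Q1, Q2, CHI1, CHI2), poisson(Q1, Q2_SOLVED))
print(dirac(Q2, P2, CHI1, CHI2), poisson(Q2_SOLVED, P2_SOLVED))
```

In this toy model the constrained pair no longer has a canonical bracket, while the unconstrained pair \(q_1, p_1\) keeps its bracket unchanged.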
Note
According to [Wei95] page 330 footnote (**), it’s not known in full generality whether the Dirac bracket always produces the correct commutation relations, and more importantly, whether the standard relation Eq.6.2.6 between Lagrangian and Hamiltonian holds even in the presence of constrained canonical variables. These issues are (partially) addressed in [Wei95] page 329 – 330 through the work of [MaNa76] and page 332 – 337 through the work of Weinberg himself.
Instead of working out all the details, we’ll simply take for granted that Dirac’s method works. One of the blessings of physics (as opposed to mathematics, for example), which I learned from R. Feynman, is that its pieces of knowledge are so highly interconnected that one can start nearly anywhere in physics and deduce anything else. If we apply Dirac’s method to a theory, e.g., a quantum field theory, and it fails, we will know it from other principles, e.g., the free particle commutation relations calculated in Canonical Variables, which come from the free fields derived in Quantum Fields and Antiparticles, which, ultimately, come from the principle of Lorentz invariance and causality.
Footnotes