6. The Canonical Formalism

The quantum theory we’ve developed so far has been based almost solely on symmetry principles, especially Lorentz symmetry. This is a very satisfying approach since it’s logically clean and relies only on the most fundamental principles. However, this is not how quantum theory was developed historically. Not surprisingly, the original development was much messier and required substantial experience in “classical” physics. It’s largely based on the so-called Lagrangian formalism, which was already well established in classical physics and can be “quantized”. The main goal of this chapter is to go through this formalism, not for history’s sake, but because it offers a particularly convenient way to construct Hamiltonians that generate Lorentz-invariant S-matrices, which has been difficult for us, as can be seen in Feynman rules in momentum space.

6.1. Canonical Variables

We’ve seen in Quantum Fields and Antiparticles a few ways of constructing (Lorentz-invariant) interaction densities. However, we don’t have a systematic way to do so. The so-called Lagrangian formalism will not provide a systematic solution either, but it’ll allow us to construct more interesting interaction densities (from classical physics theories), to the extent that all known quantum field theories arise in this way! In addition, it’ll shed light on the mysterious local terms, such as those in Eq.5.2.14, that are needed to compensate for a Lorentz-invariant momentum space propagator.

What the Lagrangian formalism offers for constructing a quantum field theory is the following. Instead of using the creation and annihilation fields defined by Eq.4.1.1 to construct the Hamiltonians, we’ll use the so-called canonical variables, which have particularly simple (equal-time) commutation relations. More precisely, they consist of a collection of quantum operators \(q_n(t, \xbf)\) and their canonical conjugates \(p_n(t, \xbf)\), which satisfy the following (anti-)commutation relations

(6.1.1)#\[\begin{split}\left[ q_n(t, \xbf), p_{n'}(t, \ybf) \right]_{\pm} &= \ifrak \delta^3(\xbf - \ybf) \delta_{n n'} \\ \left[ q_n(t, \xbf), q_{n'}(t, \ybf) \right]_{\pm} &= 0 \\ \left[ p_n(t, \xbf), p_{n'}(t, \ybf) \right]_{\pm} &= 0 \\\end{split}\]

where the sign \(+\) (anti-commutator) applies when the particle in question is fermionic, and the sign \(-\) (commutator) applies when it is bosonic.

To see how canonical variables may be constructed from fields considered in Quantum Fields and Antiparticles, let’s consider a few examples.

Scalar fields

Let’s start by considering scalar fields of particles that are their own antiparticles. Using notations from Scalar Fields, it means that \(\psi(x) = \psi^{ \dagger}(x)\), i.e., the field is Hermitian. It follows then from Eq.4.2.9 and Eq.4.2.10 that

\[\left[ \psi(x), \psi(y) \right] = \Delta(x-y) = \frac{1}{(2\pi)^3} \int \frac{d^3 p}{2p_0} \left( e^{\ifrak p \cdot (x-y)} - e^{-\ifrak p \cdot (x-y)} \right)\]

where \(p_0 = \sqrt{\pbf^2 + m^2}\).

We claim that the canonical commutation relations Eq.6.1.1 are satisfied by

(6.1.2)#\[\begin{split}q(t, \xbf) &\coloneqq \psi(t, \xbf) \\ p(t, \xbf) &\coloneqq \dot{\psi}(t, \xbf)\end{split}\]

Indeed, it follows from the following calculations

(6.1.3)#\[\begin{split}\begin{alignat*}{3} \left[ q(t, \xbf), p(t, \ybf) \right] &= \left[ \psi(t, \xbf), \dot{\psi}(t, \ybf) \right] &&= -\dot{\Delta}(0, \xbf-\ybf) &&= \ifrak \delta^3(\xbf-\ybf) \\ \left[ q(t, \xbf), q(t, \ybf) \right] &= \left[ \psi(t, \xbf), \psi(t, \ybf) \right] &&= \Delta(0, \xbf-\ybf) &&= 0 \\ \left[ p(t, \xbf), p(t, \ybf) \right] &= \left[ \dot{\psi}(t, \xbf), \dot{\psi}(t, \ybf) \right] &&= -\ddot{\Delta}(0, \xbf-\ybf) &&= 0 \end{alignat*}\end{split}\]
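As a sketch of why the three equal-time values of \(\Delta\) used in Eq.6.1.3 hold, one can differentiate the integral representation directly. Assuming the convention \(p \cdot x = \pbf \cdot \xbf - p_0 t\), we have

\[\begin{split}\Delta(0, \xbf) &= \frac{1}{(2\pi)^3} \int \frac{d^3 p}{2p_0} \left( e^{\ifrak \pbf \cdot \xbf} - e^{-\ifrak \pbf \cdot \xbf} \right) = 0 \\ \dot{\Delta}(0, \xbf) &= \frac{1}{(2\pi)^3} \int \frac{d^3 p}{2p_0} (-\ifrak p_0) \left( e^{\ifrak \pbf \cdot \xbf} + e^{-\ifrak \pbf \cdot \xbf} \right) = -\ifrak \delta^3(\xbf) \\ \ddot{\Delta}(0, \xbf) &= \frac{1}{(2\pi)^3} \int \frac{d^3 p}{2p_0} (-p_0^2) \left( e^{\ifrak \pbf \cdot \xbf} - e^{-\ifrak \pbf \cdot \xbf} \right) = 0\end{split}\]

The first and third integrals vanish because the integrand is odd under \(\pbf \to -\pbf\), while the second uses \(\int d^3 p~e^{\pm \ifrak \pbf \cdot \xbf} = (2\pi)^3 \delta^3(\xbf)\). The minus signs in front of \(\dot{\Delta}\) and \(\ddot{\Delta}\) in Eq.6.1.3 come from differentiating \(\Delta(x-y)\) once with respect to \(y^0\).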

Now for particles that are different from their antiparticles, we must modify Eq.6.1.2 as follows

\[\begin{split}q(t, \xbf) &= \psi(t, \xbf) \\ p(t, \xbf) &= \dot{\psi}^{\dagger}(t, \xbf)\end{split}\]

and note that in this case \(\left[ \psi(t, \xbf), \psi(t', \ybf) \right] = 0\) for all times \(t\) and \(t'\), in contrast to the second equation in Eq.6.1.3, which holds only at equal times.

Spin-\(1\) vector fields

Consider once again particles that are self-charge-dual. Using notations from Spin-1 vector fields, we recall the commutation relation Eq.4.3.17 as follows

\[\left[ \psi_{\mu}(x), \psi_{\nu}(y) \right] = \left( \eta_{\mu\nu} - \frac{\p_{\mu} \p_{\nu}}{m^2} \right) \Delta(x-y)\]

The canonical variables in this case can be defined as follows

(6.1.4)#\[\begin{split}q_i(t, \xbf) &= \psi_i(t, \xbf) \\ p_i(t, \xbf) &= \dot{\psi}_i(t, \xbf) - \frac{\p \psi_0(t, \xbf)}{\p x_i}\end{split}\]

where \(i=1,2,3\). Indeed, let’s calculate the equal-time commutators as follows

\[\begin{split}\left[ q_i(t, \xbf), p_j(t, \ybf) \right] &= \left[ \psi_i(t, \xbf), \dot{\psi}_j(t, \ybf) \right] - \left[ \psi_i(t, \xbf), \frac{\p \psi_0(t, \ybf)}{\p y_j} \right] \\ &= -\left( \eta_{ij} -\frac{\p_i \p_j}{m^2} \right) \dot{\Delta}(0, \xbf-\ybf) - \left. \frac{\p_i \p_0}{m^2} \right|_{t=0} \left( \p_j \Delta(t, \xbf-\ybf) \right) \\ &= \ifrak \delta^3(\xbf-\ybf) \delta_{ij} \\ \left[ q_i(t, \xbf), q_j(t, \ybf) \right] &= \left( \eta_{ij} - \frac{\p_i \p_j}{m^2} \right) \Delta(0, \xbf-\ybf) = 0 \\ \left[ p_i(t, \xbf), p_j(t, \ybf) \right] &= \left[ \dot{\psi}_i(t, \xbf), \dot{\psi}_j(t, \ybf)\right] + \p_{x_i} \p_{y_j} \left[ \psi_0(t, \xbf), \psi_0(t, \ybf) \right] \\ &\qquad - \p_{x_i} \left[ \psi_0(t, \xbf), \dot{\psi}_j(t, \ybf) \right] - \p_{y_j} \left[ \dot{\psi}_i(t, \xbf), \psi_0(t, \ybf) \right] = 0\end{split}\]

We’ve omitted some details about the vanishing of the last quantity: all four terms reduce to spatial derivatives of \(\Delta(0, \xbf-\ybf)\) and \(\ddot{\Delta}(0, \xbf-\ybf)\), both of which vanish identically.
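To fill in the omitted details, here is a sketch under the assumption that \(\eta_{00} = -1\) and \(\eta_{0i} = 0\). Each commutator entering \(\left[ p_i, p_j \right]\) follows from the commutation relation recalled above, and carries an even number of time derivatives acting on \(\Delta\):

\[\begin{split}\left[ \dot{\psi}_i(t, \xbf), \dot{\psi}_j(t, \ybf) \right] &= -\left( \eta_{ij} - \frac{\p_i \p_j}{m^2} \right) \ddot{\Delta}(0, \xbf-\ybf) \\ \left[ \psi_0(t, \xbf), \psi_0(t, \ybf) \right] &= -\Delta(0, \xbf-\ybf) - \frac{1}{m^2} \ddot{\Delta}(0, \xbf-\ybf) \\ \left[ \psi_0(t, \xbf), \dot{\psi}_j(t, \ybf) \right] &= \frac{1}{m^2} \p_j \ddot{\Delta}(0, \xbf-\ybf) \\ \left[ \dot{\psi}_i(t, \xbf), \psi_0(t, \ybf) \right] &= -\frac{1}{m^2} \p_i \ddot{\Delta}(0, \xbf-\ybf)\end{split}\]

Since \(\Delta(0, \xbf)\) and \(\ddot{\Delta}(0, \xbf)\) vanish identically in \(\xbf\) (cf. Eq.6.1.3), so do their spatial derivatives. In fact each of the four terms in \(\left[ p_i, p_j \right]\) therefore vanishes on its own.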

In any case, we’ve constructed three pairs of canonical variables, one for each spatial index. But what about the time index? It turns out that \(\psi_0\) is not an independent variable. Indeed, we can derive from Eq.6.1.4, using Eq.4.3.19 and Eq.4.1.18, an expression of \(\psi_0\) as follows

\[\begin{split}p_i = \p_0 \psi_i - \p_i \psi_0 & \implies \p_i p_i = \p_0 \p_i \psi_i - \p^2_i \psi_0 \\ & \implies \nabla \cdot \pbf = \p_0 \sum_{i=1}^3 \p_i \psi_i - \sum_{i=1}^3 \p^2_i \psi_0 \\ & \implies \nabla \cdot \pbf = \p_0^2 \psi_0 - \sum_{i=1}^3 \p_i^2 \psi_0 = -\square \psi_0 \\ & \implies \psi_0 = -m^{-2} \nabla \cdot \pbf\end{split}\]

Spin-\(1/2\) Dirac fields

Recall the anti-commutator of Dirac fields Eq.4.4.34 as follows

\[\left[ \psi_{\ell}(x), \psi^{\dagger}_{\ell'}(y) \right]_+ = \left( (-\gamma^{\mu} \p_{\mu} + m) \beta \right)_{\ell \ell'} \Delta(x-y)\]

where \(\ell, \ell'\) are the Dirac spinor indexes, which take four values rather than just the two spin \(z\)-components \(\pm 1/2\). Assuming that the particle in question has a distinct antiparticle, i.e., it’s not a Majorana fermion, the following holds trivially

\[\left[ \psi_{\ell}(x), \psi_{\ell'}(y) \right]_+ = 0\]

It follows that the canonical variables can be defined by

\[\begin{split}q_{\ell}(x) &= \psi_{\ell}(x) \\ p_{\ell}(x) &= \ifrak \psi^{\dagger}_{\ell}(x)\end{split}\]

Indeed, the only nontrivial (and non-vanishing) anti-commutator can be calculated as follows

\[\begin{split}\left[ q_{\ell}(t, \xbf), p_{\ell'}(t, \ybf) \right]_+ &= \ifrak \left[ \psi_{\ell}(t, \xbf), \psi_{\ell'}^{\dagger}(t, \ybf) \right]_+ \\ &= -\ifrak \left( \gamma^0 \beta \right)_{\ell \ell'} \dot{\Delta}(0, \xbf-\ybf) \\ &= \ifrak \delta^3(\xbf-\ybf) \delta_{\ell \ell'}\end{split}\]
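The middle step above can be unpacked as follows. This is only a sketch, and it assumes the conventions \(\beta = \ifrak \gamma^0\) and \((\gamma^0)^2 = -1\), which are assumptions about the earlier chapter that isn’t reproduced here. At equal times, \(\Delta(0, \xbf-\ybf)\) and its spatial derivatives vanish, so only the time-derivative term survives:

\[\begin{split}\left. \left( (-\gamma^{\mu} \p_{\mu} + m) \beta \right)_{\ell \ell'} \Delta(x-y) \right|_{x^0 = y^0} &= -\left( \gamma^0 \beta \right)_{\ell \ell'} \dot{\Delta}(0, \xbf-\ybf) \\ &= \ifrak \left( \gamma^0 \beta \right)_{\ell \ell'} \delta^3(\xbf-\ybf) \\ &= \delta_{\ell \ell'} \delta^3(\xbf-\ybf)\end{split}\]

Here the second equality uses \(\dot{\Delta}(0, \xbf) = -\ifrak \delta^3(\xbf)\), and the last one uses \(\gamma^0 \beta = \ifrak (\gamma^0)^2 = -\ifrak\). Multiplying by the factor \(\ifrak\) coming from \(p_{\ell'} = \ifrak \psi^{\dagger}_{\ell'}\) reproduces the result above.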

Through these examples, we see that there is no particular pattern in how one may define canonical variables. In fact, one doesn’t really define canonical variables in this way either: they are simply taken for granted in the Lagrangian formalism, as we will see.

We begin with a general discussion of functionals \(F[q(t), p(t)]\) of canonical variables, since both Hamiltonians and Lagrangians will be such functionals. A few notes are in order. First, we’ve used the shorthand notation \(q(t)\) and \(p(t)\) to denote a collection of canonical variables. Moreover, in writing \(q(t)\) (and similarly for \(p(t)\)) we implicitly think of them as fields at a given time. Indeed, as we’ll see, the time variable plays an exceptional role in the Lagrangian formalism, in contrast to our mindset so far that space and time are all mixed up in a Lorentz invariant theory. Finally, we’ve used square brackets to distinguish functionals from regular functions of spacetime or momentum variables.

At the heart of the Lagrangian formalism lies a variational principle. Hence it’s crucial to be able to take infinitesimal variations on \(F[q(t), p(t)]\), which we write as follows

(6.1.5)#\[\delta F[q(t), p(t)] = \int d^3 x \sum_n \left( \delta q_n(t, \xbf) \frac{\delta F[q(t), p(t)]}{\delta q_n(t, \xbf)} + \frac{\delta F[q(t), p(t)]}{\delta p_n(t, \xbf)} \delta p_n(t, \xbf) \right)\]

Here the infinitesimal fields \(\delta q_n\) and \(\delta p_n\) are assumed to (anti-)commute with all other fields. Assuming that \(F[q(t), p(t)]\) is written so that all the \(q\) fields lie to the left of all the \(p\) fields, Eq.6.1.5 can be realized by the following definition of variational derivatives

\[\begin{split}\frac{\delta F[q(t), p(t)]}{\delta q_n(t, \xbf)} \coloneqq \ifrak \big[ p_n(t, \xbf), F[q(t), p(t)] \big] \\ \frac{\delta F[q(t), p(t)]}{\delta p_n(t, \xbf)} \coloneqq \ifrak \big[ F[q(t), p(t)], q_n(t, \xbf) \big]\end{split}\]
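As a quick sanity check of these definitions, consider a single bosonic pair \((q, p)\) and the functional \(F = \int d^3 y~q(t, \ybf) p(t, \ybf)\), which is an illustrative choice not taken from the text. Using \(\left[ p(t, \xbf), q(t, \ybf) \right] = -\ifrak \delta^3(\xbf-\ybf)\) together with the vanishing of \(\left[ q, q \right]\) and \(\left[ p, p \right]\), one finds

\[\begin{split}\frac{\delta F}{\delta q(t, \xbf)} &= \ifrak \left[ p(t, \xbf), F \right] = \ifrak \int d^3 y \left[ p(t, \xbf), q(t, \ybf) \right] p(t, \ybf) = p(t, \xbf) \\ \frac{\delta F}{\delta p(t, \xbf)} &= \ifrak \left[ F, q(t, \xbf) \right] = \ifrak \int d^3 y~q(t, \ybf) \left[ p(t, \ybf), q(t, \xbf) \right] = q(t, \xbf)\end{split}\]

which is what one would expect from naively differentiating \(qp\) with respect to \(q\) and \(p\), respectively.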

6.1.1. Hamiltonian and Lagrangian for free fields

For free fields we have

(6.1.6)#\[\begin{split}q_n(t, \xbf) &= e^{\ifrak H_0 t} q_n(0, \xbf) e^{-\ifrak H_0 t} \\ p_n(t, \xbf) &= e^{\ifrak H_0 t} p_n(0, \xbf) e^{-\ifrak H_0 t}\end{split}\]

where \(H_0\) is the free field Hamiltonian, also known as the symmetry generator for the time translation, or the energy operator. However, rather than thinking of it as an abstract operator as we’ve done so far, we’ll (momentarily) make it a functional of canonical variables. With this in mind, we can take the time derivative of Eq.6.1.6 as follows

(6.1.7)#\[\begin{split}\begin{alignat*}{2} \dot{q}_n(t, \xbf) &= \ifrak \left[ H_0, q_n(t, \xbf) \right] &&= \frac{\delta H_0}{\delta p_n(t, \xbf)} \\ \dot{p}_n(t, \xbf) &= \ifrak \left[ H_0, p_n(t, \xbf) \right] &&= -\frac{\delta H_0}{\delta q_n(t, \xbf)} \end{alignat*}\end{split}\]

We recognize these as the quantum analog of Hamilton’s equations of motion.

To turn \(H_0\) into a functional of canonical variables, we first make it a functional of creation and annihilation operators. Remembering that \(H_0\) is the energy operator, and \(p_0 = \sqrt{\pbf^2 + m^2}\) is the energy in the \(4\)-momentum, we can write \(H_0\) as a diagonal operator as follows

(6.1.8)#\[H_0 = \sum_{n, \sigma} \int d^3 p~a^{\dagger}(\pbf, \sigma, n) a(\pbf, \sigma, n) \sqrt{\pbf^2 + m^2}\]

For simplicity, let’s consider the case of a real scalar field \(\psi(x)\) given by Eq.4.2.8 as follows

\[q(t, \xbf) = \psi(x) = \frac{1}{(2\pi)^{3/2}} \int \frac{d^3 p}{\sqrt{2p_0}} \left( e^{\ifrak p \cdot x} a(\pbf) + e^{-\ifrak p \cdot x} a^{\dagger}(\pbf) \right)\]

The canonical conjugate variable is

\[p(t, \xbf) = \dot{\psi}(x) = \frac{1}{(2\pi)^{3/2}} \int \frac{d^3 p}{\sqrt{2p_0}} (-\ifrak p_0) \left( e^{\ifrak p \cdot x} a(\pbf) - e^{-\ifrak p \cdot x} a^{\dagger}(\pbf) \right)\]

These look a bit far from Eq.6.1.8. But since \(H_0\) involves products like \(a^{\dagger}(\pbf, \sigma, n) a(\pbf, \sigma, n)\), let’s try to square the canonical variables as follows

\[\begin{split}\int d^3 x~q^2(t, \xbf) &= \frac{1}{(2\pi)^3} \int \frac{d^3 p~d^3 p'~d^3 x}{2\sqrt{p_0 p'_0}} \Big( e^{\ifrak p \cdot x} a(\pbf) + e^{-\ifrak p \cdot x} a^{\dagger}(\pbf) \Big) \Big( e^{\ifrak p' \cdot x} a(\pbf') + e^{-\ifrak p' \cdot x} a^{\dagger}(\pbf') \Big) \\ &= \int \frac{d^3 p}{2p_0} \left( \blue{ e^{-2\ifrak p_0 t} a(\pbf) a(-\pbf) + e^{2\ifrak p_0 t} a^{\dagger}(\pbf) a^{\dagger}(-\pbf) } + \left[ a(\pbf), a^{\dagger}(\pbf) \right]_+ \right)\end{split}\]

and

\[\begin{split}\int d^3 x~p^2(t, \xbf) &= \frac{1}{(2\pi)^3} \int \frac{d^3 p~d^3 p'~d^3 x}{2\sqrt{p_0 p'_0}} (-p_0 p'_0) \Big( e^{\ifrak p \cdot x} a(\pbf) - e^{-\ifrak p \cdot x} a^{\dagger}(\pbf) \Big) \\ &\qquad \times \Big( e^{\ifrak p' \cdot x} a(\pbf') - e^{-\ifrak p' \cdot x} a^{\dagger}(\pbf') \Big) \\ &= \int \frac{d^3 p}{2p_0} \left( -p_0^2 \right) \left( \blue{ e^{-2\ifrak p_0 t} a(\pbf) a(-\pbf) + e^{2\ifrak p_0 t} a^{\dagger}(\pbf) a^{\dagger}(-\pbf) } - \left[ a(\pbf), a^{\dagger}(\pbf) \right]_+ \right)\end{split}\]

and finally, inspired by the calculations above

\[\begin{split}\int d^3 x~\left( \nabla q(t, \xbf) \right)^2 &= \frac{1}{(2\pi)^3} \int \frac{d^3 p~d^3 p'~d^3 x}{2\sqrt{p_0 p'_0}} \left( -\pbf \cdot \pbf' \right) \Big( e^{\ifrak p \cdot x} a(\pbf) - e^{-\ifrak p \cdot x} a^{\dagger}(\pbf) \Big) \\ &\qquad \times \Big( e^{\ifrak p' \cdot x} a(\pbf') - e^{-\ifrak p' \cdot x} a^{\dagger}(\pbf') \Big) \\ &= \int \frac{d^3 p}{2p_0} \pbf^2 \left( \blue{ e^{-2\ifrak p_0 t} a(\pbf) a(-\pbf) + e^{2\ifrak p_0 t} a^{\dagger}(\pbf) a^{\dagger}(-\pbf) } + \left[ a(\pbf), a^{\dagger}(\pbf) \right]_+ \right)\end{split}\]

Putting these calculations together in a specific way, and using the identity \(p_0^2 - \pbf^2 = m^2\), we can eliminate the blue terms as follows

(6.1.9)#\[\begin{split}\frac{1}{2} \int d^3 x \left( p^2 + \left( \nabla q \right)^2 + m^2 q^2 \right) &= \frac{1}{2} \int d^3 p~p_0 \left[ a(\pbf), a^{\dagger}(\pbf) \right]_+ \\ &= \int d^3 p~p_0 \left( a^{\dagger}(\pbf) a(\pbf) + \frac{1}{2} \delta^3(\pbf-\pbf) \right) \\ &= H_0 + \blue{ \frac{1}{2} \int d^3 p~p_0 \delta^3(0) }\end{split}\]

Here we’ve encountered for the first time an infinite term (which we’ve marked in blue). As far as the Hamiltonian dynamics Eq.6.1.7 is concerned, adding a constant to the Hamiltonian makes no difference. Hence we can write the free Hamiltonian for real scalar fields as follows

(6.1.10)#\[H_0^{\text{RSF}} = \frac{1}{2} \int d^3 x \left( p^2 + \left( \nabla q \right)^2 + m^2 q^2 \right)\]
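As a consistency check, one can feed Eq.6.1.10 into Hamilton’s equations Eq.6.1.7. The following is a sketch that treats the variational derivatives like ordinary ones and integrates the gradient term by parts, assuming the fields fall off at spatial infinity:

\[\begin{split}\dot{q} &= \frac{\delta H_0^{\text{RSF}}}{\delta p} = p \\ \dot{p} &= -\frac{\delta H_0^{\text{RSF}}}{\delta q} = \nabla^2 q - m^2 q\end{split}\]

Combining the two gives \(\ddot{q} = \nabla^2 q - m^2 q\), i.e., \((\square - m^2) q = 0\) with \(\square = \nabla^2 - \p_0^2\). This is the Klein-Gordon equation, as one would expect for a free scalar field.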

Warning

Throwing away the infinite term in Eq.6.1.9 is an instance of a well-known criticism of quantum field theory: just because something is infinite doesn’t mean it’s zero. Indeed, Weinberg mentions on page 297 of [Wei95] that such “infinities” shouldn’t be thrown away when, for example, the fields are constrained within a finite space, or when gravity is involved.

Now it’s time to introduce the rather mysterious Lagrangian, which can be derived from the Hamiltonian via the so-called Legendre transformation as follows

(6.1.11)#\[L_0\left[ q(t), \dot{q}(t) \right] \coloneqq \sum_n \int d^3 x~p_n(t, \xbf) \dot{q}_n(t, \xbf) - H_0\]

where each occurrence of \(p_n(t)\) is replaced by its expression in \(q_n(t)\) and \(\dot{q}_n(t)\).

As a concrete example, let’s consider again the real scalar field, where \(p = \dot{q}\). It follows that

(6.1.12)#\[\begin{split}L_0^{\text{RSF}} &= \int d^3 x \left( p\dot{q} - \frac{1}{2} p^2 - \frac{1}{2} \left( \nabla q \right)^2 - \frac{1}{2} m^2 q^2 \right) \\ &= \frac{1}{2} \int d^3 x \left( \dot{q}^2 - \left( \nabla q \right)^2 - m^2 q^2 \right) \\ &= -\frac{1}{2} \int d^3 x \left( \p_{\mu} \psi \p^{\mu} \psi + m^2 \psi^2 \right)\end{split}\]

It should be noted that expressing \(p\) in terms of \(q\) and \(\dot{q}\) isn’t always easy. Indeed, it’s far from obvious how the \(p_i\) defined by Eq.6.1.4 could be expressed in terms of the corresponding \(q_i\) and \(\dot{q}_i\). (Un)fortunately, we’ll never really need to do so, since writing down a Lagrangian turns out to be mostly guesswork.

6.1.2. Hamiltonian and Lagrangian for interacting fields

Let \(H\) be the full Hamiltonian. Then the Heisenberg picture canonical variables can be defined as follows

(6.1.13)#\[\begin{split}Q_n(t, \xbf) &\coloneqq e^{\ifrak Ht} q_n(0, \xbf) e^{-\ifrak Ht} \\ P_n(t, \xbf) &\coloneqq e^{\ifrak Ht} p_n(0, \xbf) e^{-\ifrak Ht}\end{split}\]

Since conjugation by \(e^{\ifrak Ht}\) preserves equal-time (anti-)commutators, these canonical variables also satisfy the canonical (anti-)commutation relations

\[\begin{split}\left[ Q_n(t, \xbf), P_{n'}(t, \ybf) \right]_{\pm} &= \ifrak \delta^3(\xbf-\ybf) \delta_{n n'} \\ \left[ Q_n(t, \xbf), Q_{n'}(t, \ybf) \right]_{\pm} &= 0 \\ \left[ P_n(t, \xbf), P_{n'}(t, \ybf) \right]_{\pm} &= 0\end{split}\]

Moreover, the analog of Eq.6.1.7 holds as follows

(6.1.14)#\[\begin{split}\begin{alignat*}{2} \dot{Q}_n(t, \xbf) &= \ifrak \left[ H, Q_n(t, \xbf) \right] &&= \frac{\delta H}{\delta P_n(t, \xbf)} \\ \dot{P}_n(t, \xbf) &= \ifrak \left[ H, P_n(t, \xbf) \right] &&= -\frac{\delta H}{ \delta Q_n(t, \xbf)} \end{alignat*}\end{split}\]

As an example, we note that, in light of Eq.6.1.10, the full Hamiltonian for real scalar fields may be written as

(6.1.15)#\[H^{\text{RSF}} = \int d^3 x \left( \frac{1}{2} P^2 + \frac{1}{2} \left( \nabla Q \right)^2 + \frac{1}{2} m^2 Q^2 + \Hscr(Q) \right)\]

where \(\Hscr(Q)\) is the perturbation term giving rise to the interaction.
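To see what the interaction term does to the dynamics, one can apply Eq.6.1.14 to Eq.6.1.15. The following sketch assumes that \(\Hscr(Q)\) is an ordinary function of \(Q\) without derivatives, writes \(\Hscr'(Q)\) for its derivative, and treats variational derivatives like ordinary ones:

\[\begin{split}\dot{Q} &= \frac{\delta H^{\text{RSF}}}{\delta P} = P \\ \dot{P} &= -\frac{\delta H^{\text{RSF}}}{\delta Q} = \nabla^2 Q - m^2 Q - \Hscr'(Q)\end{split}\]

It follows that \((\square - m^2) Q = \Hscr'(Q)\), where \(\square = \nabla^2 - \p_0^2\). For instance, with the hypothetical choice \(\Hscr(Q) = g Q^3 / 3!\), the right-hand side becomes \(g Q^2 / 2\), so the interaction shows up as a nonlinear source term in the otherwise free field equation.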

6.2. The Lagrangian Formalism

We’ll leave aside the discussion of canonical variables for a bit to introduce the Lagrangian formalism in its most general form. After that we’ll play the game backwards. Namely, instead of constructing canonical variables out of the free fields that we’ve been exclusively considering since Quantum Fields and Antiparticles, we’ll get canonically conjugate fields out of the (magically appearing) Lagrangians, and then impose the canonical commutation relations Eq.6.1.1 on them – a procedure generally known as “quantization”.

In the classical physical theory of fields, a Lagrangian is a functional \(L[\Psi(t), \dot{\Psi}(t)]\), where \(\Psi(t)\) is any field and \(\dot{\Psi}(t)\) is its time derivative. Here we’ve capitalized the field variables to distinguish them from the free fields considered in the previous section. Define the conjugate fields as follows

(6.2.1)#\[\Pi_n(t, \xbf) \coloneqq \frac{\delta L[\Psi(t), \dot{\Psi}(t)]}{\delta \dot{\Psi}_n(t, \xbf)}\]

so that the field equations are given by

(6.2.2)#\[\dot{\Pi}_n(t, \xbf) = \frac{\delta L[\Psi(t), \dot{\Psi}(t)]}{\delta \Psi_n(t, \xbf)}\]

Warning

Unlike the functional derivatives considered in Eq.6.1.5 for canonical variables, the functional derivative Eq.6.2.1, interpreted quantum mechanically, is not really well-defined since \(\Psi(t)\) and \(\dot{\Psi}(t)\) don’t in general satisfy a simple (same time) commutation relation. According to Weinberg (see footnote on page 299 in [Wei95]), “no important issues hinge on the details here”. So we’ll pretend that it behaves just like usual derivatives.

Indeed, recall that in the classical Lagrangian formalism, the field equations are given by a variational principle applied to the so-called action, defined as follows

(6.2.3)#\[I[\Psi] \coloneqq \int_{-\infty}^{\infty} dt~L[\Psi(t), \dot{\Psi}(t)]\]

The infinitesimal variation of \(I[\Psi]\) is given by

\[\begin{split}\delta I[\Psi] &= \sum_n \int_{-\infty}^{\infty} dt \int d^3 x \left( \frac{\delta L[\Psi(t), \dot{\Psi}(t)]}{\delta \Psi_n(t, \xbf)} \delta \Psi_n(t, \xbf) + \frac{\delta L[\Psi(t), \dot{\Psi}(t)]}{\delta \dot{\Psi}_n(t, \xbf)} \delta \dot{\Psi}_n(t, \xbf) \right) \\ &= \sum_n \int_{-\infty}^{\infty} dt \int d^3 x \left( \frac{\delta L[\Psi(t), \dot{\Psi}(t)]}{\delta \Psi_n(t, \xbf)} - \frac{d}{dt} \frac{\delta L[\Psi(t), \dot{\Psi}(t)]}{\delta \dot{\Psi}_n(t, \xbf)} \right) \delta \Psi_n(t, \xbf)\end{split}\]

where for the last equality, integration by parts is used under the assumption that the infinitesimal variation \(\delta \Psi_n(t, \xbf)\) vanishes at \(t \to \pm\infty\). Obviously \(\delta I[\Psi]\) vanishes for any \(\delta \Psi_n(t, \xbf)\) if and only if Eq.6.2.2 is satisfied.

Now we’re interested in constructing Lorentz invariant theories, but an action defined by Eq.6.2.3 apparently distinguishes the time from space variables. This motivates the hypothesis that the Lagrangian itself is given by a spatial integral of a so-called Lagrangian density as follows

(6.2.4)#\[L[\Psi(t), \dot{\Psi}(t)] = \int d^3 x~\Lscr(\Psi(t, \xbf), \nabla\Psi(t, \xbf), \dot{\Psi}(t, \xbf))\]

In terms of the Lagrangian density, we can rewrite the action Eq.6.2.3 as a \(4\)-integral as follows

\[I[\Psi] = \int d^4 x~\Lscr(\Psi(x), \p_{\mu} \Psi(x))\]

Note

The Lagrangian density \(\Lscr(\Psi, \p_{\mu} \Psi)\) is to be considered as a function-valued functional of \(\Psi\) and \(\p_{\mu} \Psi\). Thus it makes sense to take partial derivatives, instead of variational derivatives, with respect to its variables such as \(\p \Lscr / \p \Psi\).

We’d also like to reexpress the field equations Eq.6.2.2 in terms of the Lagrangian density. To this end, let’s first calculate the variation of Eq.6.2.4 by an amount \(\delta \Psi_n(t, \xbf)\) as follows

\[\begin{split}\delta L &= \sum_n \int d^3 x \left( \frac{\p \Lscr}{\p \Psi_n} \delta\Psi_n + \frac{\p \Lscr}{\p (\nabla \Psi_n)} \cdot \nabla \delta\Psi_n + \frac{\p \Lscr}{\p \dot{\Psi}_n} \delta\dot{\Psi}_n \right) \\ &= \sum_n \int d^3 x \left( \left( \frac{\p \Lscr}{\p \Psi_n} - \nabla \cdot \frac{\p \Lscr}{\p (\nabla \Psi_n)} \right) \delta\Psi_n + \frac{\p \Lscr}{\p \dot{\Psi}_n} \delta\dot{\Psi}_n \right)\end{split}\]

It follows that

\[\begin{split}\frac{\delta L}{\delta\Psi_n} &= \frac{\p \Lscr}{\p \Psi_n} - \nabla \cdot \frac{\p \Lscr}{\p (\nabla \Psi_n)} \\ \frac{\delta L}{\delta\dot{\Psi}_n} &= \frac{\p \Lscr}{\p \dot{\Psi}_n}\end{split}\]

Combining these with Eq.6.2.1 and Eq.6.2.2, we’ve derived the so-called Euler-Lagrange equations for the Lagrangian density

(6.2.5)#\[\frac{\p \Lscr}{\p \Psi_n} = \p_{\mu} \frac{\p \Lscr}{\p (\p_{\mu} \Psi_n)}\]

Note that the summed \(4\)-index \(\mu\) here refers to the spacetime derivative \(\p_{\mu} = \p / \p x^{\mu}\). Most importantly, the field equations given by Eq.6.2.5 will be Lorentz invariant if \(\Lscr\) is. Indeed, guessing such \(\Lscr\) will be more or less the only way to construct Lorentz invariant (quantum) field theories.
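As an illustration, consider the density read off from Eq.6.1.12, namely \(\Lscr = -\frac{1}{2} \left( \p_{\mu} \Psi \p^{\mu} \Psi + m^2 \Psi^2 \right)\) for a single real scalar field \(\Psi\). This is a sketch that treats \(\Psi\) as a classical field. The two sides of Eq.6.2.5 are built from

\[\begin{split}\frac{\p \Lscr}{\p \Psi} &= -m^2 \Psi \\ \frac{\p \Lscr}{\p (\p_{\mu} \Psi)} &= -\p^{\mu} \Psi\end{split}\]

so that Eq.6.2.5 becomes \(-m^2 \Psi = -\p_{\mu} \p^{\mu} \Psi\), i.e., \((\square - m^2) \Psi = 0\) with \(\square = \p_{\mu} \p^{\mu}\). This is the Klein-Gordon equation, and it is manifestly Lorentz invariant because \(\Lscr\) is a Lorentz scalar.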

Note

The Lagrangian density \(\Lscr\) is assumed to be real for two reasons. First, if \(\Lscr\) were complex, then splitting it into real and imaginary parts, Eq.6.2.5 would contain twice as many equations as there are fields, regardless of whether the fields are real or complex. This is undesirable because generically there would be no solutions. The second reason has to wait until the next section, where symmetries will be discussed. It turns out that the reality of \(\Lscr\) will guarantee that the symmetry generators are Hermitian.

Now recall from the previous section that the anchor of our knowledge is the Hamiltonian: we know what it must look like, at least for free fields. To go from the Lagrangian to the Hamiltonian, we again use the Legendre transformation (cf. Eq.6.1.11) to define the Hamiltonian as follows

(6.2.6)#\[H[\Psi, \Pi] \coloneqq \sum_n \int d^3 x~\Pi_n(t, \xbf) \dot{\Psi}_n(t, \xbf) - L[\Psi(t), \dot{\Psi}(t)]\]

Warning

In order to realize \(H\) as a functional of \(\Psi\) and \(\Pi\), one must in principle be able to solve for \(\dot{\Psi}_n\) in terms of \(\Psi_n\) and \(\Pi_n\) from Eq.6.2.1. This isn’t always easy, if at all possible, but it rarely poses serious difficulties in applications either.
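For a case where the inversion is straightforward, consider a single real scalar field with the assumed density \(\Lscr = -\frac{1}{2} \p_{\mu} \Psi \p^{\mu} \Psi - \frac{1}{2} m^2 \Psi^2 - \Hscr(\Psi)\), where \(\Hscr(\Psi)\) contains no derivatives. As a sketch, using \(-\frac{1}{2} \p_{\mu} \Psi \p^{\mu} \Psi = \frac{1}{2} \dot{\Psi}^2 - \frac{1}{2} \left( \nabla \Psi \right)^2\), we get

\[\begin{split}\Pi &= \frac{\p \Lscr}{\p \dot{\Psi}} = \dot{\Psi} \\ H &= \int d^3 x \left( \Pi \dot{\Psi} - \Lscr \right) = \int d^3 x \left( \frac{1}{2} \Pi^2 + \frac{1}{2} \left( \nabla \Psi \right)^2 + \frac{1}{2} m^2 \Psi^2 + \Hscr(\Psi) \right)\end{split}\]

which has the same form as Eq.6.1.15 with \(Q = \Psi\) and \(P = \Pi\).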

As a double check, let’s verify that the Hamiltonian defined by Eq.6.2.6 also satisfies Hamilton’s equations (cf. Eq.6.1.7). Indeed, the variational derivatives are calculated, using Eq.6.2.1 and Eq.6.2.2, as follows

(6.2.7)#\[\begin{split}\frac{\delta H}{\delta \Pi_n(t, \xbf)} &= \sum_m \int d^3 y \left( \frac{\delta \Pi_m(t, \ybf)}{\delta \Pi_n(t, \xbf)} \dot{\Psi}_m(t, \ybf) + \Pi_m(t, \ybf) \frac{\delta \dot{\Psi}_m(t, \ybf)}{\delta \Pi_n(t, \xbf)} \right) \\ &\qquad - \sum_m \int d^3 y \frac{\delta L}{\delta \dot{\Psi}_m(t, \ybf)} \frac{\delta \dot{\Psi}_m(t, \ybf)}{\delta \Pi_n(t, \xbf)} \\ &= \sum_m \int d^3 y~\delta_{m,n} \delta^3(\ybf-\xbf) \dot{\Psi}_m(t, \ybf) \\ &= \dot{\Psi}_n(t, \xbf) \\ \frac{\delta H}{\delta \Psi_n(t, \xbf)} &= \sum_m \int d^3 y~\Pi_m(t, \ybf) \frac{\delta \dot{\Psi}_m(t, \ybf)}{\delta \Psi_n(t, \xbf)} \\ &\qquad - \sum_m \int d^3 y \left( \frac{\delta L}{\delta \Psi_m(t, \ybf)} \frac{\delta \Psi_m(t, \ybf)}{\delta \Psi_n(t, \xbf)} + \frac{\delta L}{\delta \dot{\Psi}_m(t, \ybf)} \frac{\delta \dot{\Psi}_m(t, \ybf)}{\delta \Psi_n(t, \xbf)} \right) \\ &= -\sum_m \int d^3 y~\delta_{m, n} \delta^3(\ybf-\xbf) \dot{\Pi}_m(t, \ybf) \\ &= -\dot{\Pi}_n(t, \xbf)\end{split}\]

It’s therefore tempting to demand, in the Lagrangian formalism, that \(\Psi_n\) and \(\Pi_n\), defined by Eq.6.2.1, satisfy the canonical commutation relations. In other words, they are (Heisenberg picture) canonically conjugate fields. But this is not true in general, as it turns out.

The issue is that the Lagrangian \(L[\Psi(t), \dot{\Psi}(t)]\) may contain a certain field, but not its time derivative. One example is spin-\(1\) vector fields, where we see from Eq.6.1.4 that the spatial fields \(\psi_i\) are part of the canonical variables, but not \(\psi_0\), which nonetheless should be present in the Lagrangian by Lorentz invariance. It turns out that what’s missing from the Lagrangian is \(\dot{\psi}_0\), which causes its conjugate variable defined by Eq.6.2.1 to vanish.

But instead of dealing with vector fields further, we’ll turn back to the general ground to establish the fundamental principles. Inspired by above discussion, we can rewrite the Lagrangian as

(6.2.8)#\[L[Q(t), \dot{Q}(t), C(t)]\]

where each \(Q_n(t)\) has a corresponding \(\dot{Q}_n(t)\), while the fields \(C(t)\) appear without their time derivatives. It follows that one can define the canonical conjugates by

\[P_n(t, \xbf) \coloneqq \frac{\delta L[Q(t), \dot{Q}(t), C(t)]}{\delta \dot{Q}_n(t, \xbf)}\]

and hence the Hamiltonian takes the following form

(6.2.9)#\[H[Q, P] = \sum_n \int d^3 x~P_n \dot{Q}_n - L[Q(t), \dot{Q}(t), C(t)]\]

6.3. Global symmetries

Of course, the reason for introducing the Lagrangian formalism is not to reproduce the Hamiltonians and the fields that we already knew. The main motivation is that, as we’ll see, the Lagrangian formalism provides a framework for studying symmetries. Recall from What is a Symmetry? that a symmetry was defined to be a(n anti-)unitary transformation on the Hilbert space of states, i.e., a transformation that preserves amplitudes. Now in the Lagrangian formalism, field equations come out of the stationary action condition. Therefore in this context, we’ll redefine a symmetry as an infinitesimal variation of the fields that leaves the action invariant. As it turns out, symmetries in this sense lead to conserved currents, whose associated conserved quantities are nothing but the symmetry generators considered earlier. Hence, apart from a slight abuse of terminology, the notion of symmetries will be consistent.

Note

Throughout this section, repeated indexes like \(n\), which are used to index various fields, in an equation are not automatically summed up. On the other hand, repeated \(4\)-indexes like \(\mu\) do follow the Einstein summation convention.

Consider an infinitesimal variation

(6.3.1)#\[\Psi_n(x) \to \Psi_n(x) + \ifrak \epsilon \Fscr_n(x)\]

which leaves the action \(I[\Psi]\) invariant

(6.3.2)#\[0 = \delta I = \ifrak \epsilon \sum_n \int d^4 x~\frac{\delta I[\Psi]}{\delta \Psi_n(x)} \Fscr_n(x)\]

A few remarks are in order. First of all, if we think of Eq.6.3.1 as an infinitesimal (unitary) symmetry transformation, then the coefficient \(\ifrak\) can be justified by the intention of making \(\Fscr_n(x)\) Hermitian. Next, although Eq.6.3.2 always holds when \(\Psi_n(x)\) is stationary, for the variation defined by \(\Fscr_n(x)\) to be a symmetry, Eq.6.3.2 must hold for any \(\Psi_n(x)\). Finally, we emphasize that \(\epsilon\) being an infinitesimal constant, rather than a function of \(x\), is the defining property for the symmetry to be called “global”. Indeed, we’ll be dealing with symmetries that are not global in the next chapter, namely, the gauge symmetries.

The general principle that “symmetries imply conservation laws” is mathematically known as Noether’s theorem, but we’ll not bother with any mathematical formality here. To see how to derive conserved quantities from an assumed symmetry, let’s change Eq.6.3.1 as follows

(6.3.3)#\[\Psi_n(x) \to \Psi_n(x) + \ifrak \epsilon(x) \Fscr_n(x)\]

where \(\epsilon(x)\) now is an infinitesimal function of \(x\). Under this variation, the corresponding \(\delta I\) may not vanish. But it must take the following form

(6.3.4)#\[\delta I = -\int d^4 x J^{\mu}(x) \p_{\mu} \epsilon(x)\]

because it must vanish when \(\epsilon(x)\) is constant. Here \(J^{\mu}(x)\) is a function(al) to be determined in individual cases, and is usually known as a current. Now if \(\Psi_n(x)\) satisfies the field equations, i.e., it’s a stationary point of the action, then Eq.6.3.4 must vanish for any \(\epsilon(x)\). Applying integration by parts (and assuming \(\Fscr_n(x)\) vanishes at infinity), we must have

(6.3.5)#\[\p_{\mu} J^{\mu}(x) = 0\]

which is the conservation law for \(J\), which can then be called a conserved current. One also gets a conserved quantity, i.e., a quantity that doesn’t change with time, by integrating Eq.6.3.5 over the \(3\)-space as follows

\[\begin{split}\dot{J}^0(x) = -\nabla \cdot \Jbf(x) & \implies \int d^3 x~\dot{J}^0(x) = -\int d^3 x~\nabla \cdot \Jbf(x) = 0 \\ & \implies F \coloneqq \int d^3 x~J^0(x) \text{ is conserved.}\end{split}\]

Unfortunately, not much more can be said about the conserved current \(J\) at this level of generality. This is, however, not the case if one imposes stronger assumptions on the symmetry, as we now explain.

Lagrangian-preserving symmetry

This is the first strengthening of the symmetry assumption. Instead of assuming that the variation Eq.6.3.1 fixes the action, we assume that it fixes the Lagrangian itself, namely,

(6.3.6)#\[\delta L = \ifrak \epsilon \sum_n \int d^3 x \left( \frac{\delta L}{\delta \Psi_n(t, \xbf)} \Fscr_n(t, \xbf) + \frac{\delta L}{\delta \dot{\Psi}_n(t, \xbf)} \dot{\Fscr}_n(t, \xbf) \right) = 0\]

Now let \(\epsilon(t)\) be a time-dependent infinitesimal in Eq.6.3.3. Then we can calculate \(\delta I\) under such variation as follows

\[\begin{split}\delta I &= \ifrak \sum_n \int dt \int d^3 x \left( \frac{\delta L}{\delta \Psi_n(t, \xbf)} \epsilon(t) \Fscr_n(t, \xbf) + \frac{\delta L}{\delta \dot{\Psi}_n(t, \xbf)} \frac{d}{dt} \big( \epsilon(t) \Fscr_n(t, \xbf) \big) \right) \\ &= \ifrak \sum_n \int dt \int d^3 x~\frac{\delta L}{\delta \dot{\Psi}_n(t, \xbf)} \dot{\epsilon}(t) \Fscr_n(t, \xbf)\end{split}\]

Comparing with Eq.6.3.4, we can derive an explicit formula for the conserved quantity as follows

(6.3.7)#\[F = -\ifrak \sum_n \int d^3 x~\frac{\delta L}{\delta \dot{\Psi}_n(t, \xbf)} \Fscr_n(t, \xbf)\]

Indeed, one can verify directly that \(\dot{F}(t) = 0\) using Eq.6.3.6 together with the field equations Eq.6.2.1 and Eq.6.2.2.
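The direct verification can be sketched as follows, writing \(\Pi_n = \delta L / \delta \dot{\Psi}_n\) as in Eq.6.2.1 and treating the fields classically:

\[\begin{split}\dot{F} &= -\ifrak \sum_n \int d^3 x \left( \dot{\Pi}_n(t, \xbf) \Fscr_n(t, \xbf) + \Pi_n(t, \xbf) \dot{\Fscr}_n(t, \xbf) \right) \\ &= -\ifrak \sum_n \int d^3 x \left( \frac{\delta L}{\delta \Psi_n(t, \xbf)} \Fscr_n(t, \xbf) + \frac{\delta L}{\delta \dot{\Psi}_n(t, \xbf)} \dot{\Fscr}_n(t, \xbf) \right) \\ &= 0\end{split}\]

Here the second equality uses the field equations Eq.6.2.2 for \(\dot{\Pi}_n\), and the last one is Eq.6.3.6 divided by \(\ifrak \epsilon\).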

Lagrangian-density-preserving symmetry

Taking the previous assumption further, let’s impose the even stronger condition that the Lagrangian density is invariant under Eq.6.3.1. It means that

(6.3.8)#\[\delta \Lscr = \ifrak \epsilon \sum_n \left( \frac{\p \Lscr}{\p \Psi_n} \Fscr_n(x) + \frac{\p \Lscr}{\p (\p_{\mu} \Psi_n)} \p_{\mu} \Fscr_n(x) \right) = 0\]

Now under Eq.6.3.3, we can calculate the variation of the action as follows

\[\begin{split}\delta I &= \ifrak \sum_n \int d^4 x~\left( \frac{\p \Lscr}{\p \Psi_n} \epsilon(x) \Fscr_n(x) + \frac{\p \Lscr}{\p (\p_{\mu} \Psi_n)} \p_{\mu} \big( \epsilon(x) \Fscr_n(x) \big) \right) \\ &= \ifrak \sum_n \int d^4 x~\frac{\p \Lscr}{\p (\p_{\mu} \Psi_n)} \Fscr_n(x) \p_{\mu}\epsilon(x)\end{split}\]

Comparing with Eq.6.3.4 as before, we can derive an explicit formula for the conserved current as follows

(6.3.9)#\[J^{\mu}(x) = -\ifrak \sum_n \frac{\p \Lscr}{\p (\p_{\mu} \Psi_n)} \Fscr_n(x)\]

Once again, one can directly verify that \(\p_{\mu} J^{\mu}(x) = 0\) using Eq.6.3.8 together with the Euler-Lagrange equation Eq.6.2.5.
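As a concrete illustration, which is not taken from the text above, consider a complex scalar field with the assumed density \(\Lscr = -\p_{\mu} \Psi^{\dagger} \p^{\mu} \Psi - m^2 \Psi^{\dagger} \Psi\), treating \(\Psi\) and \(\Psi^{\dagger}\) as independent classical fields. It’s invariant under the phase rotation \(\Psi \to \Psi + \ifrak \epsilon \lambda \Psi\) and \(\Psi^{\dagger} \to \Psi^{\dagger} - \ifrak \epsilon \lambda \Psi^{\dagger}\), where \(\lambda\) is a hypothetical real constant, so that \(\Fscr_{\Psi} = \lambda \Psi\) and \(\Fscr_{\Psi^{\dagger}} = -\lambda \Psi^{\dagger}\). Then Eq.6.3.9 gives

\[\begin{split}J^{\mu} &= -\ifrak \left( \frac{\p \Lscr}{\p (\p_{\mu} \Psi)} \lambda \Psi - \frac{\p \Lscr}{\p (\p_{\mu} \Psi^{\dagger})} \lambda \Psi^{\dagger} \right) \\ &= \ifrak \lambda \left( \Psi \p^{\mu} \Psi^{\dagger} - \Psi^{\dagger} \p^{\mu} \Psi \right)\end{split}\]

Using the field equations \(\square \Psi = m^2 \Psi\) and \(\square \Psi^{\dagger} = m^2 \Psi^{\dagger}\), which follow from Eq.6.2.5 with \(\square = \p_{\mu} \p^{\mu}\), one finds \(\p_{\mu} J^{\mu} = \ifrak \lambda \left( \Psi \square \Psi^{\dagger} - \Psi^{\dagger} \square \Psi \right) = 0\), since the cross terms \(\p_{\mu} \Psi \p^{\mu} \Psi^{\dagger}\) cancel.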

So far everything has been completely classical. To make it a quantum theory, we’ll involve the canonical fields introduced in Hamiltonian and Lagrangian for interacting fields. More precisely, instead of any \(\Fscr_n(t, \xbf)\), we’ll suppose that it takes the following form

\[\Fscr_n(Q(t), \xbf)\]

where \(Q(t)\) is defined by Eq.6.1.13. Next, recall from Eq.6.2.8 that the field \(\Psi_n\) is either a \(Q_n\), in which case \(\delta L / \delta \dot{Q}_n = P_n\), or a \(C_n\), in which case the functional derivative vanishes.

Now in the case of a Lagrangian-preserving symmetry, the conserved quantity Eq.6.3.7 takes the following form

(6.3.10)#\[F = -\ifrak \sum_n \int d^3 x~P_n(t, \xbf) \Fscr_n(Q(t), \xbf)\]

which of course is time-independent. Moreover, one can show that \(F\) in fact generates the quantum symmetry in the following sense

(6.3.11)#\[\left[ F, Q_n(t, \xbf) \right] = -\ifrak \sum_m \int d^3 y~\left[ P_m(t, \ybf), Q_n(t, \xbf) \right] \Fscr_m(Q(t), \ybf) = -\Fscr_n(Q(t), \xbf)\]

where we’ve taken advantage of the time-independence of \(F\) to arrange the same-time commutator.

6.3.1. Spacetime translations#

So far the symmetries have been rather abstract. To make them more explicit, and also to warm up for the general case, let’s assume the Lagrangian is invariant under the (spacetime) translation transformation given as follows

\[\Psi_n(x) \to \Psi_n(x + \epsilon) = \Psi_n(x) + \epsilon^{\mu} \p_{\mu} \Psi_n(x)\]

Comparing with Eq.6.3.1 we see that

\[(\Fscr_{\mu})_n = -\ifrak \p_{\mu} \Psi_n\]

It follows from Eq.6.3.4 and Eq.6.3.5 that there exists a conserved \(4\)-current \({T^{\nu}}_{\mu}\), which is known as the energy-momentum tensor, such that

\[\p_{\nu} {T^{\nu}}_{\mu} = 0\]

The corresponding conserved quantities then take the form

(6.3.12)#\[P_{\mu} \coloneqq \int d^3 x~{T^0}_{\mu}\]

such that \(\dot{P}_{\mu} = 0\). Here it’s important to not confuse \(P_{\mu}\) with a canonical variable – it’s just a conserved quantity which turns out to be the \(4\)-momentum.

Now recall from Eq.6.2.4 that the Lagrangian is usually the spatial integral of a density functional. Hence it’s not unreasonable to suppose that the Lagrangian is indeed invariant under spatial translations. Under this assumption, we can rewrite Eq.6.3.10 as follows

(6.3.13)#\[\Pbf \coloneqq -\sum_n \int d^3 x~P_n(t, \xbf) \nabla Q_n(t, \xbf)\]

with the understanding that \(\Psi_n = Q_n\).

To verify that \(\Pbf\) indeed generates spatial translations, let’s calculate using the fact that \(\Pbf\) is time-independent as follows

\[\begin{split}\left[ \Pbf, Q_n(t, \xbf) \right] &= \ifrak \nabla Q_n(t, \xbf) \\ \left[ \Pbf, P_n(t, \xbf) \right] &= \ifrak \nabla P_n(t, \xbf)\end{split}\]

It follows that

(6.3.14)#\[\left[ \Pbf, \Gscr \right] = \ifrak \nabla \Gscr\]

for any functional \(\Gscr\) that doesn’t explicitly involve \(\xbf\). This verifies that \(\Pbf\) indeed generates the spatial translation.

In contrast, one cannot hope for the Lagrangian to be invariant under time translation if there is any interaction. But we already know the operator that generates time translation, namely, the Hamiltonian. In other words, we define \(P_0 \coloneqq -H\) such that

(6.3.15)#\[\left[ H, \Gscr \right] = -\ifrak \dot{\Gscr}\]

for any functional \(\Gscr\) that doesn’t explicitly involve \(t\).

In general, the Lagrangian density is not invariant under spacetime translations. However, it turns out that the conserved current, which in this case is \({T^{\mu}}_{\nu}\), can nonetheless be calculated. To spell out the details, let’s consider the following variation

\[\Psi_n(x) \to \Psi_n(x + \epsilon(x)) = \Psi_n(x) + \epsilon^{\mu}(x) \p_{\mu} \Psi_n(x)\]

The corresponding variation of the action is given as follows

\[\begin{split}\delta I[\Psi] &= \sum_n \int d^4 x \left( \frac{\p \Lscr}{\p \Psi_n} \epsilon^{\mu} \p_{\mu} \Psi_n + \frac{\p \Lscr}{\p (\p_{\nu} \Psi_n)} \p_{\nu}(\epsilon^{\mu} \p_{\mu} \Psi_n) \right) \\ &= \int d^4 x \left( \epsilon^{\mu} \p_{\mu} \Lscr + \sum_n \frac{\p \Lscr}{\p (\p_{\nu} \Psi_n)} \p_{\mu}\Psi_n \p_{\nu} \epsilon^{\mu} \right) \\ &= -\int d^4 x \left( \delta^{\nu}_{\mu} \Lscr - \sum_n \frac{\p \Lscr}{\p (\p_{\nu} \Psi_n)} \p_{\mu} \Psi_n \right) \p_{\nu} \epsilon^{\mu}\end{split}\]

where we’ve used the chain rule for derivatives in the second equality, and integration by parts in the third. Comparing with Eq.6.3.4, we see that

(6.3.16)#\[{T^{\nu}}_{\mu} = \delta^{\nu}_{\mu} \Lscr - \sum_n \frac{\p \Lscr}{\p (\p_{\nu} \Psi_n)} \p_{\mu} \Psi_n\]

Note

The energy-momentum tensor \({T^{\nu}}_{\mu}\) is not yet suitable for general relativity since it’s not symmetric. As we’ll see in Lorentz symmetry, when taking homogeneous Lorentz transformation symmetry into account, one can supplement \({T^{\nu}}_{\mu}\) with some extra terms to make it both conserved and symmetric.

Indeed, this calculation recovers Eq.6.3.13 by letting \(\nu = 0\) and \(\mu \neq 0\). Moreover, it recovers the Hamiltonian by letting \(\mu = \nu = 0\) as follows

\[H = -P_0 = -\int d^3 x~{T^0}_0 = \int d^3 x \left( \sum_n P_n \dot{Q}_n - \Lscr \right)\]
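The formula Eq.6.3.16 can be sanity-checked symbolically. The following is only a minimal SymPy sketch under stated assumptions: a single real free scalar field in \(1+1\) dimensions with signature \((-, +)\), and an arbitrary polynomial test field `phi_test` that does not solve the field equation. All names in the code are hypothetical. It checks the off-shell identity \(\p_{\nu} {T^{\nu}}_{\mu} = E~\p_{\mu} \Phi\), where \(E\) is the Euler-Lagrange expression, so that \({T^{\nu}}_{\mu}\) is conserved exactly when the field equation holds, and that \(-{T^0}_0\) is the usual energy density.

```python
import sympy as sp

t, x, m = sp.symbols("t x m", real=True)
f, f_t, f_x = sp.symbols("f f_t f_x")  # placeholders for Phi, dPhi/dt, dPhi/dx

# Free scalar Lagrangian density in 1+1 dimensions, signature (-, +).
L_sym = f_t**2 / 2 - f_x**2 / 2 - m**2 * f**2 / 2

# Arbitrary off-shell polynomial test field (an assumption for illustration).
phi_test = x**3 * t + 2 * x * t**2 - t
coords = (t, x)
d_phi = [sp.diff(phi_test, c) for c in coords]
on_field = {f: phi_test, f_t: d_phi[0], f_x: d_phi[1]}

L = L_sym.subs(on_field)
dL_dphi = sp.diff(L_sym, f).subs(on_field)
dL_ddphi = [sp.diff(L_sym, s).subs(on_field) for s in (f_t, f_x)]

# T[nu][mu] = delta^nu_mu L - dL/d(d_nu Phi) * d_mu Phi, as in Eq.6.3.16.
T = [[(L if nu == mu else 0) - dL_ddphi[nu] * d_phi[mu] for mu in range(2)]
     for nu in range(2)]

# Euler-Lagrange expression E = dL/dPhi - d_nu (dL/d(d_nu Phi)).
E = dL_dphi - sum(sp.diff(dL_ddphi[nu], coords[nu]) for nu in range(2))

# Off-shell identity: d_nu T^nu_mu - E * d_mu Phi should vanish for each mu.
residuals = [
    sp.expand(sum(sp.diff(T[nu][mu], coords[nu]) for nu in range(2))
              - E * d_phi[mu])
    for mu in range(2)
]

# -T^0_0 should be the energy density (Phi_t^2 + Phi_x^2 + m^2 Phi^2) / 2.
energy_gap = sp.expand(
    -T[0][0] - (d_phi[0]**2 + d_phi[1]**2 + m**2 * phi_test**2) / 2)
print(residuals, energy_gap)
```

A polynomial test field is used only because its expansions cancel term by term; the identity itself is algebraic in the derivatives of the field.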

6.3.2. Linear transformations#

As another example, let’s consider linear variations as follows

\[\begin{split}Q_n(x) &\to Q_n(x) + \ifrak \epsilon^a {(t_a)_n}^m Q_m(x) \\ C_r(x) &\to C_r(x) + \ifrak \epsilon^a {(\tau_a)_r}^s C_s(x)\end{split}\]

where we’ve adopted the Einstein summation convention for repeated upper and lower indexes because it’d otherwise be too tedious to write out the summations. Here \((t_{\square})^{\square}_{\square}\) should furnish a representation of the Lie algebra of the symmetry group.

As before, the invariance of action under such variations implies the existence of conserved currents \(J^{\mu}_a\) such that

\[\p_{\mu} J^{\mu}_a = 0\]

as well as the conserved quantity

\[T_a \coloneqq \int d^3 x~J^0_a\]

If, in addition, the Lagrangian is invariant under such variations, then \(T_a\) takes the following form by Eq.6.3.10

\[T_a = -\ifrak \int d^3 x~P_n(t, \xbf) {(t_a)^n}_m Q^m(t, \xbf)\]

It follows that

\[\begin{split}\left[ T_a, Q^n(x) \right] &= -{(t_a)^n}_m Q^m(x) \\ \left[ T_a, P_n(x) \right] &= {(t_a)^m}_n P_m(x)\end{split}\]

In particular, when \(t_a\) is diagonal (e.g., in electrodynamics), the operators \(Q^n\) and \(P_n\) may be regarded as raising/lowering operators. In fact, we claim that \(T_a\) form a Lie algebra by the following calculation

\[\begin{split}\left[ T_a, T_b \right] &= -\left[ \int d^3 x~P_n(t, \xbf) {(t_a)^n}_m Q^m(t, \xbf), \int d^3 y~P_r(t, \ybf) {(t_b)^r}_s Q^s(t, \ybf) \right] \\ &= -\int d^3 x~d^3 y~{(t_a)^n}_m {(t_b)^r}_s \left[ P_n(t, \xbf) Q^m(t, \xbf), P_r(t, \ybf) Q^s(t, \ybf) \right] \\ &= -\int d^3 x~d^3 y~{(t_a)^n}_m {(t_b)^r}_s \Big( P_n(t, \xbf) \left[ Q^m(t, \xbf), P_r(t, \ybf) \right] Q^s(t, \ybf) \\ &\qquad - P_r(t, \ybf) \left[ Q^s(t, \ybf), P_n(t, \xbf) \right] Q^m(t, \xbf) \Big) \\ &= -\ifrak \int d^3 x \Big( {(t_a)^n}_m {(t_b)^m}_s P_n(t, \xbf) Q^s(t, \xbf) - {(t_a)^n}_m {(t_b)^r}_n P_r(t, \xbf) Q^m(t, \xbf) \Big) \\ &= -\ifrak \int d^3 x~{\left[ t_a, t_b \right]^n}_m P_n(t, \xbf) Q^m(t, \xbf)\end{split}\]

Now if \(t_a\) form a Lie algebra with structure constants \({f_{ab}}^c\) as follows

\[\left[ t_a, t_b \right] = \ifrak {f_{ab}}^c t_c\]

then

\[\left[ T_a, T_b \right] = \ifrak {f_{ab}}^c T_c\]

In other words, the conserved quantities also form the same Lie algebra.
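The closure of the \(T_a\) can be illustrated in a finite-dimensional toy model. The sketch below is an assumption-laden illustration rather than a field-theoretic proof: it drops the spatial integral, realizes \(Q^m\) as multiplication by a coordinate \(x_m\) and \(P_n\) as \(-\ifrak \p / \p x_n\) (so that \([Q^m, P_n] = \ifrak \delta^m_n\)), takes \(t_a\) to be the Pauli matrices divided by \(2\) with \({f_{ab}}^c = \epsilon_{abc}\), and applies the operators to an arbitrary polynomial. The helper names `T` and `closure_defect` are hypothetical.

```python
import sympy as sp

x = sp.symbols("x1 x2")

# Pauli matrices over 2, satisfying [t_a, t_b] = i eps_abc t_c.
t = [sp.Matrix([[0, 1], [1, 0]]) / 2,
     sp.Matrix([[0, -sp.I], [sp.I, 0]]) / 2,
     sp.Matrix([[1, 0], [0, -1]]) / 2]


def T(a, f):
    """Apply T_a = -i P_n (t_a)^n_m Q^m with Q^m = x_m and P_n = -i d/dx_n."""
    return -sum(t[a][n, m] * sp.diff(x[m] * f, x[n])
                for n in range(2) for m in range(2))


def closure_defect(a, b, f):
    """Return ([T_a, T_b] - i eps_abc T_c) f, which should vanish."""
    lhs = T(a, T(b, f)) - T(b, T(a, f))
    rhs = sum(sp.I * sp.LeviCivita(a, b, c) * T(c, f) for c in range(3))
    return sp.expand(lhs - rhs)


# Arbitrary polynomial test function (an assumption for illustration).
f_test = 3 * x[0]**3 * x[1] + x[0] * x[1]**2 - 7 * x[1] + 2
defects = [closure_defect(a, b, f_test) for a in range(3) for b in range(3)]
print(defects)
```

The same index bookkeeping as in the calculation above is at work here; only the operator realization has been simplified.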

Now if, in addition, the Lagrangian density is also invariant, then Eq.6.3.9 takes the following form

(6.3.17)#\[J^{\mu}_a = -\ifrak \frac{\p \Lscr}{\p (\p_{\mu} Q_n)} {(t_a)_n}^m Q_m - \ifrak \frac{\p \Lscr}{\p (\p_{\mu} C_r)} {(\tau_a)_r}^s C_s\]

Note that since \(\Lscr\) doesn’t have \(\dot{C}_r\) dependencies, we have the following by letting \(\mu = 0\) in Eq.6.3.17

\[J^0_a = -\ifrak P^n {(t_a)_n}^m Q_m\]

whose equal-time commutation relations with canonical variables \(P\) and \(Q\) can be easily calculated.

6.3.3. Lorentz invariance#

The goal of this section is to show that the Lorentz invariance of the Lagrangian density implies the Lorentz invariance of the S-matrix, which justifies our interest in the Lagrangian formalism in the first place.

Recall from Eq.1.2.16 and Eq.1.2.17 that

(6.3.18)#\[\begin{split}{\Lambda_{\mu}}^{\nu} &= {\delta_{\mu}}^{\nu} + {\omega_{\mu}}^{\nu} \\ \omega_{\mu \nu} &= -\omega_{\nu \mu}\end{split}\]

is a \((\mu, \nu)\)-parametrized anti-symmetric variation. It follows then from Eq.6.3.4 that there exist \((\mu, \nu)\)-parametrized anti-symmetric conserved currents as follows

(6.3.19)#\[\begin{split}\p_{\rho} \Mscr^{\rho \mu \nu} &= 0 \\ \Mscr^{\rho \mu \nu} &= -\Mscr^{\rho \nu \mu}\end{split}\]

which, in turn, give the conserved quantities

(6.3.20)#\[J^{\mu \nu} \coloneqq \int d^3 x~\Mscr^{0 \mu \nu}\]

such that \(\dot{J}^{\mu \nu} = 0\). These, as we’ll see, turn out to be rather familiar objects that we’ve encountered as early as in Eq.1.2.18.

In light of Eq.6.3.9, one can work out an explicit formula for \(\Mscr^{\rho \mu \nu}\) if the Lagrangian density is invariant under the symmetry transformation. Now since the Lagrangian density is expressed in terms of quantum fields, one would like to know how they transform under Lorentz transformations. Since the translation symmetry has already been dealt with in Spacetime translations, we’ll consider here homogeneous Lorentz transformations. Luckily this has been worked out already in Quantum Fields and Antiparticles. More precisely, recall from Eq.4.4.1 that the variation term can be written as follows

\[\delta \Psi_n = \frac{\ifrak}{2} \omega^{\mu \nu} {(\Jscr_{\mu \nu})_n}^m \Psi_m\]

where \(\Jscr\) are matrices satisfying Eq.4.4.2. The corresponding derivatives then have the following variation term

\[\delta (\p_{\kappa} \Psi_n) = \frac{\ifrak}{2} \omega^{\mu \nu} {(\Jscr_{\mu \nu})_n}^m \p_{\kappa} \Psi_m + {\omega_{\kappa}}^{\lambda} \p_{\lambda} \Psi_n\]

where the second summand on the right-hand side corresponds to the fact that the Lorentz transformation also acts on the spacetime coordinates.

Now the invariance of the Lagrangian density under such variation can be written as follows

\[\frac{\p \Lscr}{\p \Psi_n} \frac{\ifrak}{2} \omega^{\mu \nu} {(\Jscr_{\mu \nu})_n}^m \Psi_m + \frac{\p \Lscr}{\p (\p_{\kappa} \Psi_n)} \left( \frac{\ifrak}{2} \omega^{\mu \nu} {(\Jscr_{\mu \nu})_n}^m \p_{\kappa} \Psi_m + {\omega_{\kappa}}^{\lambda} \p_{\lambda} \Psi_n \right) = 0\]

Since \(\omega^{\mu \nu}\) is arbitrary apart from the anti-symmetry condition in Eq.6.3.18, the anti-symmetric part of its coefficient must vanish, which implies the following

(6.3.21)#\[\begin{split}& \frac{\ifrak}{2} \frac{\p \Lscr}{\p \Psi_n} {(\Jscr_{\mu\nu})_n}^m \Psi_m + \frac{\ifrak}{2} \frac{\p \Lscr}{\p (\p_{\kappa} \Psi_n)} {(\Jscr_{\mu\nu})_n}^m \p_{\kappa}\Psi_m \\ & \qquad + \frac{1}{2} \frac{\p \Lscr}{\p (\p_{\kappa} \Psi_n)} \left( \eta_{\kappa \mu} \p_{\nu} - \eta_{\kappa \nu} \p_{\mu} \right) \Psi_n = 0\end{split}\]

Using the Euler-Lagrange equation Eq.6.2.5, we can get rid of the \(\p\Lscr / \p\Psi_n\) term in Eq.6.3.21 to arrive at the following

(6.3.22)#\[\ifrak \p_{\kappa} \left( \frac{\p \Lscr}{\p (\p_{\kappa} \Psi_n)} {(\Jscr_{\mu\nu})_n}^m \Psi_m \right) - T_{\mu\nu} + T_{\nu\mu} = 0\]

where we’ve also used Eq.6.3.16. Now we can address the issue of energy-momentum tensor not being symmetric by introducing the following so-called Belinfante tensor

(6.3.23)#\[\begin{split}\Theta_{\mu\nu} &\coloneqq T_{\mu\nu} - \frac{\ifrak}{2} \p_{\kappa} \Big( \frac{\p \Lscr}{\p (\p_{\kappa} \Psi_n)} {(\Jscr_{\mu\nu})_n}^m \Psi_m - \frac{\p \Lscr}{\p (\p_{\mu} \Psi_n)} {(\Jscr_{\kappa\nu})_n}^m \Psi_m \\ &\qquad - \frac{\p \Lscr}{\p (\p_{\nu} \Psi_n)} {(\Jscr_{\kappa\mu})_n}^m \Psi_m \Big)\end{split}\]

which is both conserved in the sense that

(6.3.24)#\[\p^{\mu} \Theta_{\mu\nu} = 0\]

and symmetric in the sense that

(6.3.25)#\[\Theta_{\mu\nu} = \Theta_{\nu\mu}\]

Indeed Eq.6.3.24 follows from the observation that the term inside the parenthesis of Eq.6.3.23 is anti-symmetric in \(\mu\) and \(\kappa\), and Eq.6.3.25 is a direct consequence of Eq.6.3.22.

The conserved quantities corresponding to \(\Theta_{\mu\nu}\), according to Eq.6.3.12 are

(6.3.26)#\[\int d^3 x~{\Theta^0}_\nu = \int d^3 x~{T^0}_\nu = P_{\nu}\]

where the first equality holds because, again, the term inside the parentheses of Eq.6.3.23 is anti-symmetric in \(\mu\) and \(\kappa\), and therefore \(\kappa \neq 0\) given \(\mu = 0\), so that the extra terms are spatial total derivatives whose integrals vanish. Hence it’s at least equally legitimate to call \(\Theta_{\mu \nu}\) the energy-momentum tensor. Indeed, the fact that \(\Theta_{\mu \nu}\) is symmetric makes it suitable for general relativity.

Unlike the other conserved currents, which are derived under the general principles explained in Global symmetries, we’ll construct the anti-symmetric \(\Mscr^{\rho \mu \nu}\) declared in Eq.6.3.19 by hand as follows

\[\Mscr^{\rho\mu\nu} \coloneqq x^{\mu} \Theta^{\rho\nu} - x^{\nu} \Theta^{\rho\mu}\]

While the anti-symmetry condition in Eq.6.3.19 is automatically satisfied by definition, we can verify the conservation law in Eq.6.3.19, using Eq.6.3.24 and Eq.6.3.25, as follows

\[\p_{\rho} \Mscr^{\rho\mu\nu} = \Theta^{\mu\nu} - \Theta^{\nu\mu} = 0\]

Moreover Eq.6.3.20 takes the following form

(6.3.27)#\[J^{\mu\nu} = \int d^3 x \left( x^{\mu} \Theta^{0\nu} - x^{\nu} \Theta^{0\mu} \right)\]

Now if we consider the rotation generators defined by

\[J_i \coloneqq \tfrac{1}{2} \epsilon_{ijk} J^{jk}\]

then it follows from Eq.6.3.15 that

\[[H, \Jbf] = -\ifrak \dot{\Jbf} = 0\]

where the first equality applies since \(\Jbf\) doesn’t explicitly involve \(t\), and the second holds since \(\Jbf\) is conserved. This recovers one of the commutation relations Eq.1.2.22 for the Poincaré algebra. Next, let’s verify the commutation relation between \(\Pbf\) and \(\Jbf\), using Eq.6.3.14 and Eq.6.3.26, as follows

\[\begin{split}[P_i, J_j] &= \frac{1}{2} \epsilon_{jk\ell} \left[ P_i, J^{k\ell} \right] \\ &= \frac{\ifrak}{2} \epsilon_{jk\ell} \int d^3x \left( x^k \p_i \Theta^{0\ell} - x^{\ell} \p_i \Theta^{0k} \right) \\ &= \frac{\ifrak}{2} \epsilon_{jk\ell} \int d^3x \left( -\delta^k_i \Theta^{0\ell} + \delta^{\ell}_i \Theta^{0k} \right) \\ &= \ifrak \epsilon_{ijk} \int d^3x~\Theta^{0k} \\ &= \ifrak \epsilon_{ijk} P^k\end{split}\]

What come next are the boost operators defined as follows [1]

(6.3.28)#\[K^i \coloneqq J^{0i} = \int d^3x \left( x^0 \Theta^{0i} - x^i \Theta^{00} \right)\]

Bringing down the index, we can rewrite Eq.6.3.28 in vector form as follows

(6.3.29)#\[\Kbf = t \Pbf - \int d^3 x~\xbf \Theta^{00}\]

Now it follows from Eq.6.3.15 that

\[\begin{split}[H, \Kbf] &= t[H, \Pbf] + \ifrak \int d^3 x~\xbf \dot{\Theta}^{00} \\ &= \ifrak \int d^3 x~\xbf \dot{\Theta}^{00} \\ &= \ifrak (\Pbf - \dot{\Kbf}) = \ifrak \Pbf\end{split}\]

which is consistent with Eq.1.2.22.

Finally, using Eq.6.3.14 and Eq.6.3.29 together with the fact that \(\Pbf\) commutes with itself, one can evaluate the commutator between \(\Pbf\) and \(\Kbf\) as follows

\[\left[ P_j, K_k \right] = -\ifrak \int d^3 x~x_k \p_j \Theta^{00} = \ifrak \delta_{kj} \int d^3 x \Theta^{00} = \ifrak \delta_{kj} P^0 = \ifrak \delta_{kj} H\]

which, again, is consistent with Eq.1.2.22.

It turns out, following the lines of argument in Lorentz symmetry of S-matrix, that these commutation relations are enough to show the Lorentz invariance of the S-matrix under the same “smoothness” assumptions on the interaction terms. In addition, the other commutation relations between \(H, \Pbf, \Jbf, \Kbf\) also follow.

Though not necessary, it’s indeed possible to verify the other Poincaré algebra relations directly. In particular, the commutation relations between the rotation generators are verified as follows.

An explicit formula for rotation generators \(J^{ij}\)

According to Eq.6.3.27, Eq.6.3.23, and Eq.6.3.16, the rotation generator \(J^{ij}\) can be calculated as follows

\[\begin{split}J^{ij} &= \int d^3 x \left( x^i \Theta^{0j} - x^j \Theta^{0i} \right) \\ &= \int d^3 x \left( x^i T^{0j} - x^j T^{0i} \right) \\ &\mkern-24mu - \frac{\ifrak}{2} \int d^3 x~x^i \p_k \left( \frac{\p \Lscr}{\p (\p_k \Psi_n)} {\left( \Jscr^{0j} \right)_n}^m \Psi_m - \frac{\p \Lscr}{\p \dot{\Psi}_n} {\left( \Jscr^{kj} \right)_n}^m \Psi_m - \frac{\p \Lscr}{\p (\p_j \Psi_n)} {\left( \Jscr^{k0} \right)_n}^m \Psi_m \right) \\ &\mkern-24mu + \frac{\ifrak}{2} \int d^3 x~x^j \p_k \left( \frac{\p \Lscr}{\p (\p_k \Psi_n)} {\left( \Jscr^{0i} \right)_n}^m \Psi_m - \frac{\p \Lscr}{\p \dot{\Psi}_n} {\left( \Jscr^{ki} \right)_n}^m \Psi_m - \frac{\p \Lscr}{\p (\p_i \Psi_n)} {\left( \Jscr^{k0} \right)_n}^m \Psi_m \right) \\ &= \int d^3 x \left( x^i T^{0j} - x^j T^{0i} \right) \\ &\mkern-24mu + \frac{\ifrak}{2} \int d^3 x \left( \frac{\p \Lscr}{\p (\p_i \Psi_n)} {\left( \Jscr^{0j} \right)_n}^m \Psi_m - \frac{\p \Lscr}{\p \dot{\Psi}_n} {\left( \Jscr^{ij} \right)_n}^m \Psi_m - \frac{\p \Lscr}{\p (\p_j \Psi_n)} {\left( \Jscr^{i0} \right)_n}^m \Psi_m \right) \\ &\mkern-24mu - \frac{\ifrak}{2} \int d^3 x \left( \frac{\p \Lscr}{\p (\p_j \Psi_n)} {\left( \Jscr^{0i} \right)_n}^m \Psi_m - \frac{\p \Lscr}{\p \dot{\Psi}_n} {\left( \Jscr^{ji} \right)_n}^m \Psi_m - \frac{\p \Lscr}{\p (\p_i \Psi_n)} {\left( \Jscr^{j0} \right)_n}^m \Psi_m \right) \\ &= \int d^3 x \left( x^i T^{0j} - x^j T^{0i} \right) - \ifrak \int d^3 x \frac{\p \Lscr}{\p \dot{\Psi}_n} {\left( \Jscr^{ij} \right)_n}^m \Psi_m \\ &= \int d^3 x \frac{\p \Lscr}{\p \dot{\Psi}_n} \left( -x^i \p^j \Psi_n + x^j \p^i \Psi_n - \ifrak {\left( \Jscr^{ij} \right)_n}^m \Psi_m \right)\end{split}\]

Now since \(\p \Lscr / \p \dot{\Psi}_n\) vanishes when \(\Psi_n\) is an auxiliary field, we can rewrite \(J^{ij}\) in terms of canonical variables as follows

\[J^{ij} = \int d^3 x~P^n \left( -x^i \p^j Q_n + x^j \p^i Q_n - \ifrak {\left( \Jscr^{ij} \right)_n}^m Q_m \right)\]

The commutator between \(J^{ij}\) and the canonical variables follows

\[\begin{split}\left[ J^{ij}, Q_n \right] &= -\ifrak \left( -x^i \p^j + x^j \p^i \right) Q_n - {\left( \Jscr^{ij} \right)_n}^m Q_m \\ \left[ J^{ij}, P^n \right] &= -\ifrak \left( -x^i \p^j + x^j \p^i \right) P^n + {\left( \Jscr^{ij} \right)_m}^n P^m\end{split}\]

where we’ve used integration-by-parts in the second equality. The standard commutation relation between the components of \(\Jbf\) follows readily.

6.4. Transition to Interaction Picture#

In this section, we will investigate, through examples, how to derive from the Lagrangian formalism an interaction picture, on which our entire approach to quantum field theory has been based. As a byproduct, we will also generalize the quantization procedure considered in Quantization of Free Scalar Fields.

6.4.1. Scalar field with derivative coupling#

In light of the Lagrangian Eq.6.1.12 for free scalar field, let’s consider the following Lagrangian density with derivative coupling and interaction

(6.4.1)#\[\Lscr = -\frac{1}{2} \p_{\mu} \Phi \p^{\mu} \Phi - \frac{1}{2} m^2 \Phi^2 - J^{\mu} \p_{\mu} \Phi - \Hscr(\Phi)\]

where the coupling \(J^{\mu}\) may be either a c-number external current or a functional of other fields, and should not be confused with the conserved quantity defined in Eq.6.3.27.

Now the canonical conjugate variable \(\Pi\) is, according to Eq.6.2.1, given by

(6.4.2)#\[\Pi \coloneqq \frac{\p \Lscr}{\p \dot{\Phi}} = \dot{\Phi} - J^0\]

and the Hamiltonian is, according to Eq.6.2.9 and Eq.6.4.2, given by

\[\begin{split}H &= \int d^3 x \left( \Pi \dot{\Phi} - \Lscr \right) \\ &= \int d^3 x \left( \Pi (\Pi + J^0) - \frac{1}{2} (\Pi + J^0)^2 + \frac{1}{2} (\nabla \Phi)^2 + \frac{1}{2} m^2 \Phi^2 \right. \\ &\qquad \left. + J^0 (\Pi + J^0) + \Jbf \cdot \nabla \Phi + \Hscr(\Phi) \right) \\ &= \int d^3 x \left( \frac{1}{2} \Pi^2 + \frac{1}{2} (\nabla \Phi)^2 + \frac{1}{2} m^2 \Phi^2 + \Pi J^0 + \frac{1}{2} (J^0)^2 + \Jbf \cdot \nabla \Phi + \Hscr(\Phi) \right)\end{split}\]

In light of Eq.6.1.15, we recognize the first three summands in the integrand as the free Hamiltonian density, and the rest as the interaction density. More explicitly, we can rewrite \(H\) as follows

\[\begin{split}H &= H_0 + V \\ H_0 &= \frac{1}{2} \int d^3 x \left( \Pi^2 + (\nabla \Phi)^2 + m^2 \Phi^2 \right) \\ V &= \int d^3 x \left( \Pi J^0 + \frac{1}{2} (J^0)^2 + \Jbf \cdot \nabla \Phi + \Hscr(\Phi) \right)\end{split}\]
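The Legendre transform above is easy to double-check with a computer algebra system. The following is a minimal SymPy sketch, assuming that all quantities are treated as commuting classical symbols at a single spacetime point, that the signature is \((-, +, +, +)\), and that \(\Hscr(\Phi)\) can be represented by an opaque symbol `H_int`. The symbol names are hypothetical.

```python
import sympy as sp

Pi, Phi, Phi_dot, J0, m, H_int = sp.symbols(
    "Pi Phi Phi_dot J0 m H_int", real=True)
# Components of grad(Phi) and of the spatial current J.
gx, gy, gz, Jx, Jy, Jz = sp.symbols("gx gy gz Jx Jy Jz", real=True)

grad_sq = gx**2 + gy**2 + gz**2
J_dot_grad = Jx * gx + Jy * gy + Jz * gz

# Eq.6.4.1 written out in components with signature (-, +, +, +).
L = (Phi_dot**2 / 2 - grad_sq / 2 - m**2 * Phi**2 / 2
     - J0 * Phi_dot - J_dot_grad - H_int)

# Canonical conjugate and its inversion, cf. Eq.6.4.2.
Pi_expr = sp.diff(L, Phi_dot)
Phi_dot_solved = sp.solve(sp.Eq(Pi, Pi_expr), Phi_dot)[0]

# Hamiltonian density Pi * Phi_dot - L expressed through Pi.
H_density = sp.expand((Pi * Phi_dot - L).subs(Phi_dot, Phi_dot_solved))

# Claimed split into free and interaction densities.
H0_density = Pi**2 / 2 + grad_sq / 2 + m**2 * Phi**2 / 2
V_density = Pi * J0 + J0**2 / 2 + J_dot_grad + H_int
split_gap = sp.expand(H_density - H0_density - V_density)
print(Phi_dot_solved, split_gap)
```

Operator ordering is ignored in this sketch, which is harmless here as long as \(J^0\) commutes with \(\Pi\), for example when \(J^{\mu}\) is a c-number.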

Now we can pass to the interaction picture in the sense of Eq.2.5.7 as follows

\[\begin{split}H_0 &= \frac{1}{2} \int d^3 x \left( \pi^2(t, \xbf) + (\nabla \phi(t, \xbf))^2 + m^2 \phi^2(t, \xbf) \right) \\ V(t) &= \int d^3 x \left( \pi(t, \xbf) J^0(t, \xbf) + \frac{1}{2} (J^0(t, \xbf))^2 + \Jbf(t, \xbf) \cdot \nabla \phi(t, \xbf) + \Hscr(\phi(t, \xbf)) \right)\end{split}\]

where, for example, \(\pi(t, \xbf) \coloneqq e^{\ifrak H_0 t} \Pi(0, \xbf) e^{-\ifrak H_0 t}\). Moreover, we note that the free Hamiltonian \(H_0\) is time-independent, and recovers Eq.6.2.11.

Finally, in order to get the interaction density in terms of fields as explained in Quantum Fields and Antiparticles, we simply replace \(\pi(t, \xbf)\) with \(\dot{\phi}(t, \xbf)\) to get the following

(6.4.3)#\[V(t) = \int d^3 x \left( J^{\mu}(t, \xbf) \p_{\mu} \phi(t, \xbf) + \frac{1}{2} (J^0(t, \xbf))^2 + \Hscr(\phi(t, \xbf)) \right)\]

It’s said in [Wei95] that the manifestly non-Lorentz-invariant summand \(\tfrac{1}{2} (J^0(t, \xbf))^2\) corresponds exactly to the local term in Eq.5.2.14, but I haven’t been able to see how.

6.4.2. Vector field with spin-\(1\)#

We start with a very general Lagrangian density defined as follows

(6.4.4)#\[\Lscr = -\frac{1}{2} \alpha \p_{\mu} V_{\nu} \p^{\mu} V^{\nu} - \frac{1}{2} \beta \p_{\mu} V_{\nu} \p^{\nu} V^{\mu} - \frac{1}{2} m^2 V_{\mu} V^{\mu} + J^{\mu} V_{\mu}\]

where \(J^{\mu}\) is a coupling current just as in the case of scalar fields. As explained in Vector Fields, vector fields may come with spin \(0\) or \(1\), and we would like to consider here only the spin-\(1\) part. To achieve this, consider the Euler-Lagrange equation Eq.6.2.5 which takes the following form

(6.4.5)#\[-m^2 V^{\mu} + J^{\mu} = -\alpha \square V^{\mu} - \beta \p^{\mu} (\p_{\nu} V^{\nu})\]

Taking the divergence, we get

\[-(\alpha + \beta) \square (\p_{\mu} V^{\mu}) + m^2 (\p_{\mu} V^{\mu}) = \p_{\mu} J^{\mu}\]

which we recognize, in light of Eq.6.2.12, as the field equation of a scalar field, or more precisely, a spin-\(0\) vector field. To avoid the appearance of \(\p_{\mu} V^{\mu}\) as an independently propagating (scalar) field, we take \(\alpha = -\beta = 1\), so that \(\p_{\mu} V^{\mu}\) may be expressed in terms of the external current \(J\).

Now we can rewrite Eq.6.4.4 as follows

(6.4.6)#\[\Lscr = -\frac{1}{4} F_{\mu \nu} F^{\mu \nu} - \frac{1}{2} m^2 V_{\mu} V^{\mu} + J^{\mu} V_{\mu}\]

where

\[F_{\mu \nu} \coloneqq \p_{\mu} V_{\nu} - \p_{\nu} V_{\mu}\]

in analogy to Eq.4.7.19, where we’ve tried to construct a tensor for massless spin-\(1\) particles. This is the general Lagrangian for spin-\(1\) vector fields.

To work out the canonical variables, we note that

(6.4.7)#\[\frac{\p \Lscr}{\p \dot{V}_{\mu}} = -F^{0\mu}\]

which is nonzero for \(\mu \neq 0\). It follows that for spatial indexes \(i\), we have the canonical variables \(V_i\) whose canonical dual is, according to Eq.6.2.1, given by

(6.4.8)#\[\Pi^i = \frac{\p \Lscr}{\p \dot{V}_i} = F^{i0} = \p^i V^0 + \dot{V}^i\]

while \(V_0\) is auxiliary since \(\p \Lscr / \p \dot{V}_0 = 0\). It turns out \(V_0\) can be explicitly solved in terms of the other fields as follows. Setting \(\mu = 0\) in Eq.6.4.5 and remembering \(\alpha = -\beta = 1\), we have

(6.4.9)#\[\begin{split}& -m^2 V^0 + J^0 = -\square V^0 + \p^0(\p_{\nu} V^{\nu}) = \p_{\nu} F^{0 \nu} = \p_i F^{0i} \\ \implies & V^0 = \frac{1}{m^2} (\p_i F^{i0} + J^0) = \frac{1}{m^2} (\nabla \cdot \bm{\Pi} + J^0)\end{split}\]

To be able to write down the Hamiltonian, we need the following preparations. First, using Eq.6.4.8, we can write

\[\dot{\Vbf} = \bm{\Pi} - \nabla V^0 = \bm{\Pi} - \frac{1}{m^2} \nabla(\nabla \cdot \bm{\Pi} + J^0)\]

and second, we can expand

\[\frac{1}{4} F_{\mu \nu} F^{\mu \nu} = \frac{1}{2} F_{0i} F^{0i} + \frac{1}{4} F_{ij} F^{ij} = -\frac{1}{2} \bm{\Pi}^2 + \frac{1}{2} (\nabla \times \Vbf)^2\]

Putting these all together, we can finally write down the Hamiltonian as follows

(6.4.10)#\[\begin{split}H &= \int d^3 x \left( \bm{\Pi} \cdot \dot{\Vbf} - \Lscr \right) \\ &= \int d^3 x \left( \bm{\Pi} \cdot \dot{\Vbf} + \frac{1}{4} F_{\mu \nu} F^{\mu \nu} + \frac{1}{2} m^2 V_{\mu} V^{\mu} - J^{\mu} V_{\mu} \right) \\ &= \int d^3 x \Big( \bm{\Pi}^2 + \frac{1}{m^2} (\nabla \cdot \bm{\Pi}) (\nabla \cdot \bm{\Pi} + J^0) \\ &\qquad - \frac{1}{2} \bm{\Pi}^2 + \frac{1}{2} (\nabla \times \Vbf)^2 \\ &\qquad - \frac{1}{2m^2} (\nabla \cdot \bm{\Pi} + J^0)^2 + \frac{1}{2} m^2 \Vbf^2 \\ &\qquad + \frac{1}{m^2} J^0 (\nabla \cdot \bm{\Pi} + J^0) - \Jbf \cdot \Vbf \Big) \\ &= \int d^3 x \Big( \frac{1}{2} \bm{\Pi}^2 + \frac{1}{2m^2} (\nabla \cdot \bm{\Pi})^2 + \frac{1}{2} (\nabla \times \Vbf)^2 + \frac{1}{2} m^2 \Vbf^2 \\ &\qquad + \frac{1}{m^2} J^0 \nabla \cdot \bm{\Pi} + \frac{1}{2m^2} (J^0)^2 - \Jbf \cdot \Vbf \Big)\end{split}\]

where the last equality serves the purpose of separating the Hamiltonian \(H = H_0 + V\) into the free part and the interacting part.

Now we pass to the interaction picture by writing

(6.4.11)#\[\begin{split}H_0 &= \int d^3 x \left( \frac{1}{2} \bm{\pi}^2 + \frac{1}{2m^2} (\nabla \cdot \bm{\pi})^2 + \frac{1}{2} (\nabla \times \vbf)^2 + \frac{1}{2} m^2 \vbf^2 \right) \\ V(t) &= \int d^3 x \left( \frac{1}{m^2} J^0 \nabla \cdot \bm{\pi} + \frac{1}{2m^2} (J^0)^2 - \Jbf \cdot \bm{\vbf} \right)\end{split}\]

The Hamilton’s equations Eq.6.1.7 take the following form (cf. Eq.6.2.7)

(6.4.12)#\[\begin{split}\begin{alignat*}{2} \dot{\vbf} &= \frac{\delta H_0}{\delta \bm{\pi}} &&= \bm{\pi} - \frac{1}{m^2} \nabla(\nabla \cdot \bm{\pi}) \\ \dot{\bm{\pi}} &= -\frac{\delta H_0}{\delta \vbf} &&= -m^2 \vbf - \nabla \times (\nabla \times \vbf) = -m^2 \vbf + \nabla^2 \vbf - \nabla(\nabla \cdot \vbf) \end{alignat*}\end{split}\]

We’re still missing \(v^0\) since \(V^0\) was an auxiliary variable, which was solved by Eq.6.4.9. Inspired by Eq.6.4.9, let’s define

(6.4.13)#\[v^0 \coloneqq \frac{1}{m^2} \nabla \cdot \bm{\pi}\]

This enables us to rewrite Eq.6.4.11 as follows

\[V(t) = \int d^3 x \left( -J^{\mu} v_{\mu} + \frac{1}{2m^2} (J^0)^2 \right)\]

where we see again, just as in Eq.6.4.3, the non-Lorentz-invariant local term.

We can now solve for \(\bm{\pi}\) using Eq.6.4.13 and Eq.6.4.12 by

\[\bm{\pi} = \dot{\vbf} + \nabla v^0\]

and write down the field equations, again using Eq.6.4.13 and Eq.6.4.12 as follows

\[\begin{split}m^2 v^0 &= \nabla \cdot \dot{\vbf} + \nabla^2 v^0 \\ \ddot{\vbf} + \nabla \dot{v}^0 &= -m^2 \vbf + \nabla^2 \vbf - \nabla(\nabla \cdot \vbf)\end{split}\]

These two equations can be unified using the \(4\)-vector notation as follows

(6.4.14)#\[\square v^{\mu} - \p^{\mu} \p_{\nu} v^{\nu} - m^2 v^{\mu} = 0\]

Taking the divergence, we get

(6.4.15)#\[\p_{\mu} v^{\mu} = 0\]

in agreement with Eq.4.3.19. Moreover it follows that Eq.6.4.14 reduces to the Klein-Gordon equation

(6.4.16)#\[(\square - m^2) v^{\mu} = 0\]

General real solutions to Eq.6.4.15 and Eq.6.4.16 take the following form

\[v^{\mu} = (2\pi)^{-3/2} \sum_{\sigma} \int \frac{d^3 p}{\sqrt{2p^0}} \left( e^{\mu}(\pbf, \sigma) a(\pbf, \sigma) e^{\ifrak p \cdot x} + {e^{\mu}}^{\ast}(\pbf, \sigma) a^{\dagger}(\pbf, \sigma) e^{-\ifrak p \cdot x} \right)\]

where \(p^0 = \sqrt{\pbf^2 + m^2}\) and \(\sigma\) takes value in \(\{-1, 0, 1\}\) by convention so that \(e(\pbf, \sigma)\) correspond to the three \(4\)-vectors orthogonal to \(p\), namely

\[p_{\mu} e^{\mu}(\pbf, \sigma) = 0\]

As explained in Eq.4.3.9, Eq.4.3.12 and Eq.4.3.13, we may choose the polarization vectors \(e^{\mu}(\pbf, \sigma)\) such that

\[\sum_{\sigma} e^{\mu}(\pbf, \sigma) {e^{\nu}}^{\ast}(\pbf, \sigma) = \eta^{\mu \nu} + m^{-2} p^{\mu} p^{\nu}\]
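The transversality condition and the spin sum can be checked numerically. The sketch below is an illustration under assumptions, not the construction used in the referenced equations: it uses a real Cartesian basis of polarization vectors obtained by boosting the three spatial unit vectors from the rest frame, rather than the helicity basis labeled by \(\sigma\). The spin sum is unchanged by this choice, since the two bases differ by a unitary transformation. The mass and momentum values are arbitrary.

```python
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])   # metric, signature (-, +, +, +)
m = 1.3                                 # arbitrary test mass
p_vec = np.array([0.4, -1.1, 2.0])      # arbitrary test 3-momentum
p0 = np.sqrt(p_vec @ p_vec + m**2)
p = np.concatenate(([p0], p_vec))       # contravariant components p^mu

# Standard boost taking the rest momentum (m, 0, 0, 0) to p.
gamma = p0 / m
boost = np.eye(4)
boost[0, 0] = gamma
boost[0, 1:] = boost[1:, 0] = p_vec / m
boost[1:, 1:] += (gamma - 1.0) * np.outer(p_vec, p_vec) / (p_vec @ p_vec)

# Real (Cartesian) polarization vectors: boosted spatial unit vectors.
e = [boost[:, k] for k in (1, 2, 3)]

boosted_rest = boost @ np.array([m, 0.0, 0.0, 0.0])
transversality = np.array([p @ eta @ e_k for e_k in e])   # p_mu e^mu
spin_sum = sum(np.outer(e_k, e_k) for e_k in e)           # sum of e^mu e^nu
expected = eta + np.outer(p, p) / m**2
print(np.allclose(transversality, 0.0), np.allclose(spin_sum, expected))
```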

It’s then straightforward to verify the desired canonical commutation relations Eq.6.1.1

\[\begin{split}\left[ v^i(t, \xbf), \pi^j(t, \ybf) \right] &= \ifrak \delta^{ij} \delta(\xbf - \ybf) \\ \left[ v^i(t, \xbf), v^j(t, \ybf) \right] &= 0 \\ \left[ \pi^i(t, \xbf), \pi^j(t, \ybf) \right] &= 0\end{split}\]

given that the following hold

\[\begin{split}\left[ a(\pbf, \sigma), a^{\dagger}(\pbf', \sigma') \right] &= \delta_{\sigma \sigma'} \delta(\pbf - \pbf') \\ \left[ a(\pbf, \sigma), a(\pbf', \sigma') \right] &= 0\end{split}\]

This is rather convincing evidence for the validity of the free field Hamiltonian Eq.6.4.11.

6.4.3. Dirac field with spin-\(1/2\)#

Recall from Eq.4.4.9 that the Dirac representation is not unitary, which eventually led to the definition of \(\bar{\psi}\) in Eq.4.4.45 for constructing interaction densities for Dirac fields. Motivated by discussions in Construction of the interaction density and the desire to make Lagrangian real, let’s consider the following

(6.4.17)#\[\Lscr = -\bar{\Psi} \left( \gamma^{\mu} \p_{\mu} + m \right) \Psi - \Hscr(\Psi, \bar{\Psi})\]

where \(\Hscr\) is a real function. Such \(\Lscr\) is nonetheless not real, which can be seen by the following calculation using Eq.4.4.10, Eq.4.4.12, and Eq.4.4.45

\[\begin{split}\bar{\Psi} \gamma^{\mu} \p_{\mu}\Psi - \left( \bar{\Psi} \gamma^{\mu} \p_{\mu}\Psi \right)^{\dagger} &= \bar{\Psi} \gamma^{\mu} \p_{\mu}\Psi - \left( \p_{\mu}\Psi^{\dagger} \right) {\gamma^{\mu}}^{\dagger} \beta \Psi \\ &= \bar{\Psi} \gamma^{\mu} \p_{\mu}\Psi + \left( \p_{\mu} \bar{\Psi} \right) \gamma^{\mu} \Psi \\ &= \p_{\mu} \left( \bar{\Psi} \gamma^{\mu} \Psi \right)\end{split}\]

However, the same calculation shows that the action, i.e., the spacetime integral of the Lagrangian density, is real, since the imaginary part of the density is a total derivative. It follows that one need not treat \(\Psi\) and \(\bar{\Psi}\) as independent variables, since the field equation for \(\Psi\), given by the stationary point of the action functional, is adjoint to that for \(\bar{\Psi}\). Therefore we can simply define the canonical conjugate

(6.4.18)#\[\Pi \coloneqq \frac{\p \Lscr}{\p \dot{\Psi}} = -\bar{\Psi} \gamma^0\]

and write the Hamiltonian

\[\begin{split}H &= \int d^3 x \left( \Pi \dot{\Psi} - \Lscr \right) \\ &= \int d^3 x \left( \bar{\Psi} (\gamma^i \p_i + m) \Psi + \Hscr \right) \\ &= \int d^3 x \left( \Pi \gamma^0 (\bm{\gamma} \cdot \nabla + m) \Psi + \Hscr \right)\end{split}\]

where the two summands in the last integrand give the usual splitting \(H = H_0 + V\).

Passing to the interaction picture, we can write

(6.4.19)#\[H_0 = \int d^3 x~\pi \gamma^0 (\bm{\gamma} \cdot \nabla + m) \psi\]

so that the field equation is simply

(6.4.20)#\[\dot{\psi} = \frac{\delta H_0}{\delta \pi} = \gamma^0 (\bm{\gamma} \cdot \nabla + m) \psi \iff (\gamma^{\mu} \p_{\mu} + m) \psi = 0\]

which is recognized as the Dirac equation Eq.4.4.37. Indeed, the other Hamilton’s equation \(\dot{\pi} = -\delta H_0 / \delta \psi\) gives nothing but the adjoint of the Dirac equation, hence no new information.

A general solution to Eq.6.4.20 can be written as

\[\psi(x) = (2\pi)^{-3/2} \int d^3 p \sum_{\sigma = \pm 1/2} \left( u(\pbf, \sigma) e^{\ifrak p \cdot x} a(\pbf, \sigma) + v(\pbf, \sigma) e^{-\ifrak p \cdot x} b^{\dagger}(\pbf, \sigma) \right)\]

where \(u(\pbf, \pm 1/2)\) are the two independent solutions of

\[\left( \ifrak \gamma^{\mu} p_{\mu} + m \right) u(\pbf, \sigma) = 0\]

and similarly \(v(\pbf, \pm 1/2)\) are solutions of

\[\left( -\ifrak \gamma^{\mu} p_{\mu} + m \right) v(\pbf, \sigma) = 0\]

Now observe that \(\ifrak \gamma^{\mu} p_{\mu}\) squares to \(-p^2 = m^2\) and hence, being a traceless \(4 \times 4\) matrix, has the two eigenvalues \(\pm m\), each with multiplicity two. It follows from a straightforward matrix calculation that

\[\sum_{\sigma = \pm 1/2} u(\pbf, \sigma) \bar{u}(\pbf, \sigma) = \sum_{\sigma = \pm 1/2} u(\pbf, \sigma) u^{\dagger}(\pbf, \sigma) \beta\]

must be proportional to the projection \(-\ifrak \gamma^{\mu} p_{\mu} + m\). Likewise

\[\sum_{\sigma = \pm 1/2} v(\pbf, \sigma) \bar{v}(\pbf, \sigma)\]

must be proportional to the other projection \(\ifrak \gamma^{\mu} p_{\mu} + m\).

Hence we can normalize \(u, v\) such that

\[\begin{split}\sum_{\sigma = \pm 1/2} u(\pbf, \sigma) \bar{u}(\pbf, \sigma) &= (2p^0)^{-1} (-\ifrak \gamma^{\mu} p_{\mu} + m) \\ \sum_{\sigma = \pm 1/2} v(\pbf, \sigma) \bar{v}(\pbf, \sigma) &= -(2p^0)^{-1} (\ifrak \gamma^{\mu} p_{\mu} + m)\end{split}\]

which is consistent with the spin sum calculations Eq.4.4.33.
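The eigenvalue and projection statements above can be checked numerically. The following is a minimal sketch, assuming the metric \(\op{diag}(+1, +1, +1, -1)\) with the time component last and the representation \(\gamma^0 = -\ifrak \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}\), \(\gamma^i = -\ifrak \begin{bmatrix} 0 & \sigma_i \\ -\sigma_i & 0 \end{bmatrix}\). Neither convention is fixed by this section, and the sample mass and momentum, as well as the names `P_u` and `P_v`, are purely illustrative.

```python
import numpy as np

# Pauli matrices
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)

# Assumed gamma matrices: (gamma^0)^2 = -1, (gamma^i)^2 = +1
gamma0 = -1j * np.block([[Z2, I2], [I2, Z2]])
gammas = [-1j * np.block([[Z2, s], [-s, Z2]]) for s in sigma]

# Illustrative on-shell momentum
m = 1.3
p_vec = np.array([0.4, -0.7, 0.2])
energy = np.sqrt(p_vec @ p_vec + m**2)

# gamma^mu p_mu with p_i = p^i and p_0 = -p^0 (assumed metric)
slash_p = sum(pi * g for pi, g in zip(p_vec, gammas)) - energy * gamma0
M = 1j * slash_p  # the matrix i gamma^mu p_mu

# Normalized projections onto the eigenvalue -m and +m subspaces of M
P_u = (m * np.eye(4) - M) / (2 * m)  # proportional to -i gamma.p + m
P_v = (m * np.eye(4) + M) / (2 * m)  # proportional to  i gamma.p + m

print("M^2 == m^2:", np.allclose(M @ M, m**2 * np.eye(4)))
print("P_u idempotent:", np.allclose(P_u @ P_u, P_u))
print("rank of each projection:", np.trace(P_u).real, np.trace(P_v).real)
```

Each column of `P_u` is annihilated by \(\ifrak \gamma^{\mu} p_{\mu} + m\), and each column of `P_v` by \(-\ifrak \gamma^{\mu} p_{\mu} + m\), which is why the spin sums must be proportional to these two matrices.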

Knowing that spin \(1/2\) particles are fermions, one can verify that the canonical anti-commutation relations

\[\begin{split}\left[ \psi_{\alpha}(t, \xbf), \bar{\psi}_{\beta}(t, \ybf) \right]_+ &= \sum_{\kappa} \left[ \psi_{\alpha}(t, \xbf), \pi_{\kappa}(t, \ybf) \right]_+ (\gamma^0)_{\kappa \beta} \\ &= \ifrak (\gamma^0)_{\alpha \beta} \delta^3(\xbf - \ybf) \\ \left[ \psi_{\alpha}(t, \xbf), \psi_{\beta}(t, \ybf) \right]_+ &= 0\end{split}\]

are satisfied if operators \(a, b\) satisfy the following

\[\left[ a(\pbf, \sigma), a^{\dagger}(\pbf', \sigma') \right]_+ = \left[ b(\pbf, \sigma), b^{\dagger}(\pbf', \sigma') \right]_+ = \delta^3(\pbf - \pbf') \delta_{\sigma \sigma'}\]

and all the other anti-commutators vanish.

6.5. Constraints and Dirac Brackets#

We’ve seen in the case of spin-1 massive vector fields that the main difficulty in deriving the Hamiltonian from the Lagrangian is the appearance of constraints. In this specific case, the constraints came from the vanishing of certain canonical variables

(6.5.1)#\[\Pi^0 = 0\]

according to Eq.6.4.7, as well as relations among canonical variables Eq.6.4.9 coming from the Euler-Lagrange equation Eq.6.4.5. In this case, we were lucky in the sense that the latter constraint gives us an explicit solution for \(V_0\), which is exactly the conjugate canonical variable of \(\Pi^0\), so we ended up again in the comfortable situation with only unconstrained canonical variables left.

We will not always be so lucky and therefore we need a more systematic solution to the problem, which was offered by Dirac. According to him, constraints like Eq.6.5.1 that come directly out of the structure of the Lagrangian, e.g., missing time derivatives of some fields, are called primary constraints. In addition, there may exist further constraints from the requirement of the equation of motion, e.g., the Euler-Lagrange equation, being consistent with the primary constraints. These are then called secondary constraints. In practice, it’s often the case that the primary and secondary constraints are considered together, and their distinction is not important.

What is important about the constraints is the distinction between the so-called first and second class constraints, which we now explain. The difficulty in defining conjugate canonical variables boils down to the incompatibility between the canonical commutation relations Eq.6.1.1 and the constraints, and Dirac’s solution is simply to (re)define the bracket, giving what is known as the Dirac bracket.

The first step is to recall the Poisson bracket from classical mechanics. Let \(L(\Psi, \dot{\Psi})\) be any Lagrangian regarded as a function of fields \(\Psi_a(t)\) and their time derivatives \(\dot{\Psi}_a(t)\). Here the (compound-)index \(a\) may contain continuous parameters such as the spatial coordinates. Define the canonical conjugates

\[\Pi^a \coloneqq \frac{\p L}{\p \dot{\Psi}_a}\]

for all \(a\). Of course the \(\Psi\) and \(\Pi\) are not all independent variables, but rather are subject to the primary and secondary constraints. Now the Poisson bracket between any two functions \(A, B\) of the canonical variables is defined as follows (with summation over the repeated index \(a\) understood)

\[[A, B]_P \coloneqq \frac{\p A}{\p \Psi_a} \frac{\p B}{\p \Pi^a} - \frac{\p B}{\p \Psi_a} \frac{\p A}{\p \Pi^a}\]

where the partial derivatives are calculated without taking the constraints into account. It holds therefore trivially that

(6.5.2)#\[\left[ \Psi_a, \Pi^b \right]_P = \delta_a^b\]

Warning

Since we’re working within the framework of the canonical formalism, all commutators are taken at the same time. This rule is understood throughout this section, although it’s nowhere explicit in any formula.

Now if \(\Psi\) and \(\Pi\) are all independent variables, then \(\left[ \Psi_a, \Pi^b \right] = \ifrak \left[ \Psi_a, \Pi^b \right]_P\) would give the desired commutation relations. But the existence of constraints would require a modification to the Poisson bracket and eventually lead to the Dirac bracket.

As a side note, it follows from Eq.6.5.2 that Hamilton’s equations Eq.6.1.7 can be written in the following form

\[\dot{A} = [A, H]_P\]
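For finitely many degrees of freedom, the Poisson bracket and the bracket form of Hamilton’s equations can be reproduced symbolically. The following is a minimal sketch using SymPy. The helper `poisson`, the symbols `mass` and `omega`, and the Hamiltonian of two uncoupled harmonic oscillators are all illustrative assumptions, not part of the field theory discussed here.

```python
import sympy as sp


def poisson(A, B, qs, ps):
    """Poisson bracket [A, B]_P, summed over the conjugate pairs (q_a, p^a)."""
    return sp.simplify(
        sum(
            sp.diff(A, q) * sp.diff(B, p) - sp.diff(B, q) * sp.diff(A, p)
            for q, p in zip(qs, ps)
        )
    )


q1, q2, p1, p2 = sp.symbols("q1 q2 p1 p2", real=True)
mass, omega = sp.symbols("mass omega", positive=True)
qs, ps = [q1, q2], [p1, p2]

# Fundamental brackets [q_a, p^b]_P, expected to form the identity matrix
fundamental = sp.Matrix([[poisson(q, p, qs, ps) for p in ps] for q in qs])

# Hypothetical Hamiltonian: two uncoupled harmonic oscillators
H = sum(p**2 / (2 * mass) + mass * omega**2 * q**2 / 2 for q, p in zip(qs, ps))

# Hamilton's equations in bracket form: dA/dt = [A, H]_P
q1_dot = poisson(q1, H, qs, ps)
p1_dot = poisson(p1, H, qs, ps)

print(fundamental)
print(q1_dot, p1_dot)
```

In the field-theoretic setting the sum over \(a\) also includes an integral over the spatial coordinates, and partial derivatives become functional derivatives, but the algebra is the same.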

Let’s write a generic constraint as \(\chi_N = 0\) where \(\chi_N\) is a function of the canonical variables \(\Psi, \Pi\) and \(N\) is indexing the constraints. Again, here \(N\) may contain continuous parameters such as spacetime coordinates. Since the constraints come out of the Lagrangian itself and the Euler-Lagrange equations, they are constant along the trajectory of motion, i.e., \(\dot{\chi}_N = 0\) whenever \(\chi_N = 0\). It follows that

(6.5.3)#\[\left[ \chi_N, H \right]_P = 0\]

whenever \(\chi_N = 0\). It turns out that one of the key features of the Dirac bracket is to upgrade Eq.6.5.3 so that it holds for any function (of the canonical variables) in place of \(H\).

6.5.1. First class constraints#

A constraint is of first class if it Poisson commutes with all other constraints. Such constraints typically arise from Lagrangians that carry gauge symmetries. The presence of gauge symmetry makes the system apparently underdetermined in the sense that there are more fields or their components than field equations.

Unfortunately, there appears to be no general recipe for handling first class constraints. However, they can typically be handled by “fixing the gauge”. A particularly important, and successful, example of such a procedure, namely quantum electrodynamics, will be presented in the next chapter.

6.5.2. Second class constraints#

Assuming the first class constraints have been dealt with, the remaining constraints are called second class. On the space of second class constraints, we have a non-singular matrix \(C\) whose entries are defined by

(6.5.4)#\[C_{NM} \coloneqq \left[ \chi_N, \chi_M \right]_P\]

Note

Since an anti-symmetric matrix of odd dimension necessarily has vanishing determinant, the dimension of \(C\) must be even. Indeed, it’s often convenient to pair constraints in the form of \(\chi_{1N}, \chi_{2N}\) and so on.

Now define the Dirac bracket as follows

(6.5.5)#\[[A, B]_D \coloneqq [A, B]_P - [A, \chi_N]_P (C^{-1})^{NM} [\chi_M, B]_P\]

One checks easily that the Dirac bracket satisfies the same (Lie) algebraic properties as the Poisson bracket. Moreover, it satisfies

\[[\chi_N, B]_D = 0\]

for any \(B\). It is this last property that guarantees the compatibility between the commutation relations and the constraints, provided that the former are calculated as follows

(6.5.6)#\[[A, B] = \ifrak [A, B]_D\]
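The definition Eq.6.5.5 and the key property \([\chi_N, B]_D = 0\) can be tried out on a finite-dimensional toy system. The following is a minimal SymPy sketch. The two constraints `pi0` and `k*pi1 - m**2*v0 + j` are hypothetical and only loosely modeled on the massive vector field, with the constant `k` standing in for the spatial divergence. The helpers `poisson` and `dirac` and the test function `B` are likewise illustrative.

```python
import sympy as sp


def poisson(A, B, qs, ps):
    """Poisson bracket [A, B]_P, ignoring all constraints."""
    return sp.expand(
        sum(
            sp.diff(A, q) * sp.diff(B, p) - sp.diff(B, q) * sp.diff(A, p)
            for q, p in zip(qs, ps)
        )
    )


def dirac(A, B, qs, ps, chis):
    """Dirac bracket [A, B]_D for second class constraints chis."""
    C = sp.Matrix([[poisson(a, b, qs, ps) for b in chis] for a in chis])
    C_inv = C.inv()  # exists only if the constraints are second class
    n = len(chis)
    correction = sum(
        poisson(A, chis[N], qs, ps) * C_inv[N, M] * poisson(chis[M], B, qs, ps)
        for N in range(n)
        for M in range(n)
    )
    return sp.simplify(poisson(A, B, qs, ps) - correction)


v0, v1, pi0, pi1 = sp.symbols("v0 v1 pi0 pi1", real=True)
k, j = sp.symbols("k j", real=True)
m = sp.symbols("m", positive=True)
qs, ps = [v0, v1], [pi0, pi1]

# Hypothetical second class constraints chi_1 = 0 and chi_2 = 0
chis = [pi0, k * pi1 - m**2 * v0 + j]

# An arbitrary function of the canonical variables
B = v0 * pi1 + v1**2 * pi0 + sp.sin(v1)

print(dirac(chis[0], B, qs, ps, chis), dirac(chis[1], B, qs, ps, chis))
print(dirac(v1, v0, qs, ps, chis))
```

Note that the two coordinates `v1` and `v0` no longer commute under the Dirac bracket, even though their Poisson bracket vanishes. This is the finite-dimensional shadow of a nonzero commutator between field components.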

6.5.3. Spin-\(1\) vector field revisited#

We’ll have to wait until the next chapter to illustrate how first class constraints may appear and how they may be handled, since they appear in the theory of massless helicity-\(1\) vector fields. But we’re ready to illustrate, in the absence of first class constraints, how second class constraints may be handled by the Dirac bracket.

Recall from Eq.6.4.7 that since \(\dot{V}_0\) is missing from \(\Lscr\), we get a primary constraint

\[\chi_{1 \xbf} \coloneqq \Pi_0(\xbf) = 0\]

and from the Euler-Lagrange equations Eq.6.4.9 a secondary constraint

\[\chi_{2 \xbf} \coloneqq \nabla \cdot \bm{\Pi}(\xbf) - m^2 V^0(\xbf) + J^0(\xbf) = 0\]

Here we remind ourselves again that the time-dependence has been left out since all commutators will be taken at equal time.

The \(C\) matrix can now be calculated as follows

\[\begin{split}C = \begin{bmatrix*} C_{1 \xbf, 1 \ybf} & C_{1 \xbf, 2 \ybf} \\ C_{2 \xbf, 1 \ybf} & C_{2 \xbf, 2 \ybf} \end{bmatrix*} = \begin{bmatrix*} 0 & m^2 \delta^3(\xbf - \ybf) \\ -m^2 \delta^3(\xbf - \ybf) & 0 \end{bmatrix*}\end{split}\]

Clearly \(C\) is non-singular. Hence no first class constraints exist, and Dirac’s method applies.

In this case, the constrained canonical variables are \(V_0\) and \(\Pi^0\). Instead of solving for them explicitly in terms of the unconstrained canonical variables as before, we simply calculate the commutators using the Dirac bracket as follows. First note that

\[\begin{split}C^{-1} = \begin{bmatrix*} 0 & -m^{-2} \delta^3(\xbf - \ybf) \\ m^{-2} \delta^3(\xbf - \ybf) & 0 \end{bmatrix*}\end{split}\]

It follows from Eq.6.5.6 and Eq.6.5.5 that

\[[A, B] = \ifrak [A, B]_P + \ifrak m^{-2} \int d^3 \xbf \left( [A, \Pi_0(\xbf)]_P [\nabla \cdot \bm{\Pi}(\xbf) - m^2 V^0(\xbf) + J^0(\xbf), B]_P - A \leftrightarrow B \right)\]

Together with the trivial Poisson bracket relations

\[\begin{split}\left[ V^{\mu}(\xbf), \Pi_{\nu}(\ybf) \right]_P &= \delta^{\mu}_{\nu} \delta^3(\xbf - \ybf) \\ \left[ V^{\mu}(\xbf), V^{\nu}(\ybf) \right]_P &= \left[ \Pi_{\mu}(\xbf), \Pi_{\nu}(\ybf) \right]_P = 0\end{split}\]

we can now calculate all the commutation relations as follows

\[\begin{split}\left[ V^i(\xbf), \Pi_j(\ybf) \right] &= \ifrak \delta^i_j \delta^3(\xbf - \ybf) \\ \left[ V^i(\xbf), V^0(\ybf) \right] &= -\ifrak m^{-2} \p_i \delta^3(\xbf - \ybf) \\ \left[ V^i(\xbf), V^j(\ybf) \right] &= \left[ \Pi_{\mu}(\xbf), \Pi_{\nu}(\ybf) \right] \\ &= \left[ V^0(\xbf), \Pi_{\mu}(\ybf) \right] \\ &= \left[ V^{\mu}(\xbf), \Pi_0(\ybf) \right] = 0\end{split}\]

These turn out to be the same commutation relations that one obtains by using the explicit solutions Eq.6.5.1 and Eq.6.4.9 of the constraints, together with the canonical commutation relations among the unconstrained canonical variables.
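As a worked example, here is a sketch of how the only nontrivial new commutator \(\left[ V^i(\xbf), V^0(\ybf) \right]\) comes out of the Dirac bracket. It assumes that \(J^0\) Poisson commutes with \(V^i\), for example because it is built from other fields only, and it uses \(\mathbf{z}\) as an integration variable that doesn’t appear elsewhere in this section. Since \(\left[ V^i, V^0 \right]_P = \left[ V^i, \Pi_0 \right]_P = 0\), only the \(A \leftrightarrow B\) term survives, and \(\left[ \nabla \cdot \bm{\Pi}(\mathbf{z}), V^i(\xbf) \right]_P = -\p \delta^3(\mathbf{z} - \xbf) / \p z^i\) gives

\[\begin{split}\left[ V^i(\xbf), V^0(\ybf) \right] &= -\ifrak m^{-2} \int d^3 \mathbf{z} \left[ V^0(\ybf), \Pi_0(\mathbf{z}) \right]_P \left[ \nabla \cdot \bm{\Pi}(\mathbf{z}), V^i(\xbf) \right]_P \\ &= \ifrak m^{-2} \int d^3 \mathbf{z}~\delta^3(\ybf - \mathbf{z}) \frac{\p}{\p z^i} \delta^3(\mathbf{z} - \xbf) \\ &= \ifrak m^{-2} \frac{\p}{\p y^i} \delta^3(\ybf - \xbf) = -\ifrak m^{-2} \frac{\p}{\p x^i} \delta^3(\xbf - \ybf)\end{split}\]

In particular, the derivative \(\p_i\) in the commutator is understood to act on \(\xbf\).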

Note

According to [Wei95] page 330 footnote (**), it’s not known in full generality whether the Dirac bracket always produces the correct commutation relations, and more importantly, whether the standard relation Eq.6.2.6 between the Lagrangian and the Hamiltonian holds even in the presence of constrained canonical variables. These issues are (partially) addressed in [Wei95] pages 329–330 through the work of [MaNa76], and pages 332–337 through the work of Weinberg himself.

Instead of working out all the details, we’ll simply take for granted that Dirac’s method works. One of the blessings of physics (as opposed to mathematics, for example), which I learned from R. Feynman, is that its pieces of knowledge are so tightly interconnected that one can start nearly anywhere in physics and deduce anything else. If we apply Dirac’s method to a theory, e.g., a quantum field theory, and it fails, we will know it from other principles, e.g., the free particle commutation relations calculated in Canonical Variables, which come from the free fields derived in Quantum Fields and Antiparticles, which, ultimately, come from the principles of Lorentz invariance and causality.

Footnotes