Linear Optics#

Maxwell’s Equations#

Maxwell’s equations fully describe the dynamical evolution of classical electromagnetic fields. In the time-space domain \(\left(t, \mathbf{x}\right)\), the equations are as follows:

\[\begin{split}\begin{align} \rho_{f} &= \nabla \cdot \mathbf{D} \tag{1a} \\ 0 &= \nabla \cdot \mathbf{B} \tag{1b} \\ \mathbf{0} &= \nabla \times \mathbf{E} + \frac{\partial}{\partial t} \mathbf{B} \tag{1c} \\ \mathbf{J}_{f} &= \nabla \times \mathbf{H} - \frac{\partial}{\partial t} \mathbf{D} \tag{1d} \end{align}\end{split}\]

where \(\rho_f\) is the free charge density and \(\mathbf{J}_f\) is the free current density.

The auxiliary fields \(\mathbf{D}\) and \(\mathbf{H}\), the electric displacement and magnetic field, are typically defined in terms of externally applied \(\mathbf{E}\) and \(\mathbf{B}\) fields and the electric \(\left(\mathbf{P}\right)\) and magnetic \(\left(\mathbf{M}\right)\) polarization, which are themselves functions of the \(\mathbf{E}\) and \(\mathbf{B}\) fields:

\[\begin{split}\begin{align} \mathbf{D} &= \epsilon_0 \mathbf{E} + \mathbf{P}\!\left[\mathbf{E}, \mathbf{B}\right] \\ \mathbf{H} &= \frac{1}{\mu_0} \mathbf{B} - \mathbf{M}\!\left[\mathbf{E}, \mathbf{B}\right] \end{align}\end{split}\]

The vacuum permittivity \(\epsilon_0\) and the vacuum permeability \(\mu_0\) define the units of the two fields. Their product is equal to the reciprocal of the speed of light squared:

\[\begin{gather} \mu_{0} \, \epsilon_{0} = \frac{1}{c^{2}} \end{gather}\]
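As a quick numerical sanity check of this relation, the following sketch recovers the speed of light from the two constants. The numerical values are CODATA 2018 values supplied here as an assumption; they do not come from this note.

```python
import math

# CODATA 2018 values, assumed here for illustration
C = 299792458.0            # speed of light in vacuum, m/s (exact by definition)
MU_0 = 1.25663706212e-6    # vacuum permeability, H/m
EPS_0 = 8.8541878128e-12   # vacuum permittivity, F/m


def speed_of_light_from_constants(mu_0: float = MU_0, eps_0: float = EPS_0) -> float:
    """Return 1 / sqrt(mu_0 * eps_0), which should reproduce c."""
    return 1.0 / math.sqrt(mu_0 * eps_0)


c_est = speed_of_light_from_constants()
rel_err = abs(c_est - C) / C  # limited only by the precision of the constants
```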

Electric and Magnetic Polarization#

The magnetic response is negligible for common optical materials, so the electric polarization typically dominates light-matter interactions. Given as a power series, the electric polarization can be expanded in the time-space domain \(\left(t, \mathbf{x}\right)\) as tensor product convolutions of electric susceptibilities with the electric field:

\[\begin{split}\begin{align} P_{i} &= \epsilon_0 \left( \chi^{(1)}_{ij} * E_{j} + \chi^{(2)}_{ijk} ** E_{j}E_{k} + \chi^{(3)}_{ijkl} *** E_{j}E_{k}E_{l} + \ldots \right) \\ M_{i} &\approx 0 \end{align}\end{split}\]

where the “\(*\)” symbols indicate convolutions, i.e.:

\[\begin{split}\begin{align} \left(\chi^{(1)}_{ij} * E_{j}\right)\!\left[t\right] &= \int_{-\infty}^{\infty} \chi^{(1)}_{ij}\!\left[\tau_1\right] E_{j}\!\left[t-\tau_1\right] d\tau_1 \\ \left(\chi^{(2)}_{ijk} ** E_{j}E_{k}\right)\!\left[t\right] &= \iint_{-\infty}^{\infty} \chi^{(2)}_{ijk}\!\left[\tau_1, \tau_2\right] E_{j}\!\left[t-\tau_1\right] E_{k}\!\left[t-\tau_2\right] d\tau_1 \, d\tau_2 \\ \vdots & \end{align}\end{split}\]

Due to causality, each susceptibility \(\chi^{(n)}\) must vanish whenever any of its time arguments \(\tau_i\) is less than \(0\).
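The first-order convolution and the causality requirement can be illustrated with a small discrete sketch. The exponential response function, its decay rate, and the step-like field below are hypothetical choices made only for illustration, and the factor of \(\epsilon_0\) is omitted.

```python
import math


def causal_response(tau: float, rate: float = 2.0) -> float:
    """Hypothetical chi^(1)[tau]: exponential decay for tau >= 0, zero for tau < 0."""
    return rate * math.exp(-rate * tau) if tau >= 0.0 else 0.0


def linear_polarization(field, dt, response=causal_response):
    """Riemann-sum approximation of (chi * E)[t] on a uniform time grid."""
    n = len(field)
    out = []
    for i in range(n):
        total = 0.0
        for j in range(n):
            # tau = t_i - t_j; samples with j > i have tau < 0 and contribute nothing
            total += response((i - j) * dt) * field[j]
        out.append(total * dt)
    return out


dt = 0.01
# A field that switches on at sample 100 (t = 1.0)
field = [0.0] * 100 + [1.0] * 400
polarization = linear_polarization(field, dt)
```

Because the response vanishes for negative \(\tau\), the polarization stays at zero until the field switches on, and then relaxes toward a steady value set by the integral of the response.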

By Fourier transforming to the frequency domain \(\left(\omega, \mathbf{x}\right)\), the first term \(\left(\chi^{\left(1\right)}\right)\) collapses to a multiplication and the electric displacement and magnetic field can be rewritten in a form that separates the linear and nonlinear contributions:

\[\begin{split}\begin{align} \mathbf{D} &= \epsilon_0 \left(\mathbf{\epsilon} \cdot \mathbf{E}\right) + \mathbf{P}_{NL} \\ \mathbf{B} &= \mu_0 \mathbf{H} \end{align}\end{split}\]
\[\begin{gather} \epsilon = \mathbf{1} + \chi^{(1)} \end{gather}\]

The linear susceptibility \(\chi^{(1)}\) has been absorbed into the relative permittivity term \(\mathbf{\epsilon}\), while the higher-order terms have been condensed into the nonlinear polarization \(\mathbf{P}_{NL}\). For more details on the nonlinear polarization see the note on nonlinear optics.

Boundary Conditions#

Boundary conditions are necessary at the interface between different materials. With \(\hat{\mathbf{n}}\) being the unit normal to the surface, pointing from medium 1 into medium 2, the following boundary conditions always hold for the normal and tangential components (in the space domain):

Normal Components#

\[\begin{split}\begin{align} \sigma &= \hat{\mathbf{n}} \cdot \left( \mathbf{D}_{2} - \mathbf{D}_{1} \right) \\ 0 &= \hat{\mathbf{n}} \cdot \left( \mathbf{B}_{2} - \mathbf{B}_{1} \right) \end{align}\end{split}\]

where \(\sigma\) is the free charge density on the surface.

Tangential Components#

\[\begin{split}\begin{align} \mathbf{K} &= \hat{\mathbf{n}} \times \left( \mathbf{H}_{2} - \mathbf{H}_{1} \right) \\ \mathbf{0} &= \hat{\mathbf{n}} \times \left( \mathbf{E}_{2} - \mathbf{E}_{1} \right) \end{align}\end{split}\]

where \(\mathbf{K}\) is the free current density on the surface.
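As a minimal sketch of how these conditions are used, consider the special case of an interface with no free surface charge between two linear, isotropic dielectrics, so that \(\mathbf{D} = \epsilon_0 \, \epsilon \, \mathbf{E}\) on each side. The function name and the example permittivities are assumptions introduced here for illustration.

```python
def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def transmitted_field(e1, normal, eps1, eps2):
    """Electric field just inside medium 2, given the field just inside medium 1.

    Assumes sigma = 0 and linear, isotropic media, so the normal component of
    eps * E and the tangential component of E are continuous.
    `normal` must be a unit vector.
    """
    e1_n = dot(e1, normal)                              # normal component (scalar)
    e1_t = [x - e1_n * n for x, n in zip(e1, normal)]   # tangential part (vector)
    e2_n = (eps1 / eps2) * e1_n                         # from n . (D2 - D1) = 0
    return [t + e2_n * n for t, n in zip(e1_t, normal)]  # tangential part unchanged


normal = (0.0, 0.0, 1.0)
e1 = (1.0, 0.0, 1.0)
e2 = transmitted_field(e1, normal, eps1=1.0, eps2=2.25)
```

The tangential component passes through unchanged while the normal component is scaled by the ratio of the permittivities.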

Traveling Waves#

A wave equation can be derived by taking the curl of 1c and 1d of Maxwell’s equations. Assuming that the medium is linear, isotropic, homogeneous, and without any free charges or currents, this equation simplifies to the following in the doubly Fourier transformed frequency-wavenumber domain \(\left(\omega, \mathbf{k}\right)\):

\[\begin{split}\begin{align} \mathbf{0} &= \left(\epsilon \frac{\omega^2}{c^2} - k^2\right) \mathbf{E} \\ \mathbf{0} &= \left(\epsilon \frac{\omega^2}{c^2} - k^2\right) \mathbf{H} \end{align}\end{split}\]

These equations are satisfied when:

\[\begin{gather} k^2 = \epsilon \frac{\omega^2}{c^2} \end{gather}\]

There are four possible solutions depending on the sign of \(\omega\) and \(k\). Those solutions correspond to forward and backward traveling waves. By convention, we choose forward traveling waves as the solutions where \(\omega\) and \(k\) have the same sign, i.e. \(\omega, k > 0\) or \(\omega, k < 0\), and backward traveling waves as solutions where \(\omega\) and \(k\) have opposite signs, i.e. \(\omega > 0\) and \(k < 0\) or \(\omega < 0\) and \(k > 0\).

Using this convention, traveling waves in the time-space domain \(\left(t, \mathbf{x}\right)\) take the following form:

\[\begin{gather} \mathbf{E}, \mathbf{H} \sim \mathbf{a} \, e^{i \left( \omega \, t - \mathbf{k} \cdot \mathbf{x}\right)} + \text{c.c.} \end{gather}\]

where the vector \(\mathbf{k}\) points in the direction of propagation. These solutions are plane waves and have uniform amplitude and phase across an infinite 2D plane. The “\(\text{c.c.}\)” stands for complex conjugate and contains the negated \(\omega\) and \(k\) terms. In general, the value of \(k\) is both negated and conjugated at negative frequency, i.e. \(k[-\omega] = -\overline{k}[\omega]\).

The vectorial relationship between the \(\mathbf{E}\) and \(\mathbf{H}\) fields can be found by substituting these solutions into Maxwell’s equations. The electric and magnetic fields are orthogonal to the direction of propagation and to each other. In the frequency-wavenumber domain \(\left(\omega, \mathbf{k}\right)\), where \(\nabla \to -i \, \mathbf{k}\) and \(\partial / \partial t \to i \, \omega\), the relationships are as follows:

\[\begin{gather} 0 = \mathbf{k} \cdot \left(\mathbf{E} \ \text{or} \ \mathbf{H}\right) \end{gather}\]
\[\begin{split}\begin{align} \mathbf{E} &= -\frac{1}{\epsilon_0 \ \epsilon} \frac{\mathbf{k}}{\omega} \times \mathbf{H} \\ \mathbf{H} &= \frac{1}{\mu_0} \frac{\mathbf{k}}{\omega} \times \mathbf{E} \end{align}\end{split}\]
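The orthogonality of the fields can be checked numerically. The sketch below builds \(\mathbf{H}\) from \(\mathbf{E}\) using \(\mathbf{H} = \frac{1}{\mu_0} \frac{\mathbf{k}}{\omega} \times \mathbf{E}\) for a wave traveling along \(z\) in a lossless medium. The permittivity, frequency, and helper names are illustrative assumptions, and the constants are CODATA 2018 values.

```python
import math

EPS_0 = 8.8541878128e-12  # vacuum permittivity, F/m (CODATA 2018)
MU_0 = 1.25663706212e-6   # vacuum permeability, H/m (CODATA 2018)


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    return math.sqrt(dot(a, a))


def h_from_e(e, k, omega):
    """H = (1 / mu_0) (k / omega) x E for a single plane wave."""
    return tuple(c / (MU_0 * omega) for c in cross(k, e))


eps_r = 2.25                                   # assumed relative permittivity
omega = 2.0 * math.pi * 200e12                 # assumed angular frequency, rad/s
k_mag = math.sqrt(eps_r) * omega * math.sqrt(MU_0 * EPS_0)  # k = sqrt(eps) omega / c
k = (0.0, 0.0, k_mag)                          # propagation along +z
e = (1.0, 0.0, 0.0)                            # E polarized along x

h = h_from_e(e, k, omega)
poynting = cross(e, h)                         # should point along +z, like k
impedance = norm(e) / norm(h)                  # should equal sqrt(mu_0 / (eps_0 eps))
```

With \(\mathbf{E}\) along \(x\) and \(\mathbf{k}\) along \(z\), the magnetic field comes out along \(y\), and \(\mathbf{E} \times \mathbf{H}\) points along the direction of propagation.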

A Note on Fourier Transforms and Units#

Fourier transforms convert between complementary reciprocal quantities:

\[\begin{split}\begin{align} \left(t, \nu\right) &: \text{time and frequency (reciprocal period)} \\ \left(\mathbf{x}, \mathbf{\tilde{\nu}}\right) &: \text{length and wavenumber (reciprocal wavelength)} \end{align}\end{split}\]
\[\begin{split}\begin{align} f\!\left[t\right] &= \int_{-\infty}^{\infty} F\!\left[\nu\right] e^{+i \left(2 \ \pi \ \nu \ t\right)} d\nu & F\!\left[\nu\right] &= \int_{-\infty}^{\infty} f\!\left[t\right] e^{-i \left(2 \ \pi \ \nu \ t\right)} dt \\ f\!\left[\mathbf{x}\right] &= \int_{-\infty}^{\infty} F\!\left[\mathbf{\tilde{\nu}}\right] e^{-i \left(2 \ \pi \ \mathbf{\tilde{\nu}} \cdot \mathbf{x}\right)} d\tilde{\nu}^3 & F\!\left[\mathbf{\tilde{\nu}}\right] &= \int_{-\infty}^{\infty} f\!\left[\mathbf{x}\right] e^{+i \left(2 \ \pi \ \mathbf{\tilde{\nu}} \cdot \mathbf{x}\right)} dx^3 \end{align}\end{split}\]

For convenience and clearer typesetting when both \(\nu\) and \(\mathbf{\tilde{\nu}}\) are in the same set of equations, factors of \(2 \, \pi\) are combined with the Fourier domain quantities. This is only a notational change:

\[\begin{split}\begin{align} \text{Angular Frequency} &: \omega = 2 \pi \, \nu = 2 \pi / T \\ \text{Angular Wavenumber} &: \mathbf{k} = 2 \pi \, \mathbf{\tilde{\nu}} = 2 \pi / \lambda \end{align}\end{split}\]

The sign difference between the exponent in the time-domain transform and the space-domain transform is chosen to be consistent with our convention on forward and backward traveling waves.

Relative Permittivity#

The relative permittivity \(\epsilon\) may be cast in terms of the angular frequency and wavenumber of the traveling wave. This quantity can further be broken apart into two real quantities representing the index of refraction \(n\) and the gain \(\alpha\):

\[\begin{split}\begin{align} \epsilon &= \mathbb{1} + \chi^{(1)} \\ &= \left(\frac{c}{\omega} k \right)^{2} \\ &= \left(\frac{c}{\omega} \left(\beta + i \frac{\alpha}{2}\right) \right)^{2} \\ &= \left(n + i \frac{c}{\omega} \frac{\alpha}{2} \right)^{2} \end{align}\end{split}\]

with the wavenumber \(k\) and phase coefficient \(\beta\) defined as:

\[\begin{split}\begin{align} k\ &= \beta + i \frac{\alpha}{2} \\ \beta &= \frac{\omega}{c} n \end{align}\end{split}\]
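The decomposition can be sketched numerically by taking the complex square root of the relative permittivity. The function name and the example value of \(\epsilon\) are assumptions for illustration, and the sketch assumes \(\omega > 0\) so that the principal square root (with non-negative real part) corresponds to the forward traveling wave.

```python
import cmath
import math

C = 299792458.0  # speed of light in vacuum, m/s


def index_and_gain(eps: complex, omega: float):
    """Split k = (omega / c) sqrt(eps) = beta + i alpha / 2 into n and alpha.

    Assumes omega > 0 and uses the principal square root, so n >= 0.
    """
    root = cmath.sqrt(eps)
    n = root.real                              # beta = (omega / c) n
    alpha = 2.0 * (omega / C) * root.imag      # Im[k] = alpha / 2
    return n, alpha


omega = 2.0 * math.pi * 200e12                 # assumed angular frequency, rad/s
eps = (1.5 - 1e-6j) ** 2                       # assumed permittivity with small loss
n, alpha = index_and_gain(eps, omega)

# Round trip back to the permittivity using the last line of the expansion
eps_back = (n + 1j * (C / omega) * alpha / 2.0) ** 2
```

In this convention a negative imaginary part of \(\sqrt{\epsilon}\) yields a negative \(\alpha\).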

Gain and Loss#

Power grows or decays exponentially with propagation distance according to the gain coefficient \(\left(\exp\!{\left(\alpha \, z\right)}\right)\):

\[\begin{split}\begin{align} \text{Gain} &: \alpha > 0 \\ \text{Loss} &: \alpha < 0 \end{align}\end{split}\]
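As a worked example, and assuming the power follows \(P\!\left[z\right] = P\!\left[0\right] \exp\!{\left(\alpha \, z\right)}\), the gain coefficient can be converted to the decibel scale often used for propagation losses. The attenuation of \(0.2 \ \text{dB}/\text{km}\) is an illustrative value, not one taken from this note:

\[\begin{split}\begin{align} 10 \log_{10}\!\left(\frac{P\!\left[z\right]}{P\!\left[0\right]}\right) &= \left(10 \log_{10} e\right) \alpha \, z \approx 4.343 \ \alpha \, z \\ -0.2 \ \text{dB}/\text{km} \quad &\Rightarrow \quad \alpha \approx -0.046 \ \text{km}^{-1} \end{align}\end{split}\]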

Dispersion#

The change of \(\beta\) with respect to frequency determines the group velocity of a traveling wave and the rate at which frequency components spread apart during propagation:

\[\begin{split}\begin{align} \beta_{n} &= \frac{\partial^{n} \beta}{\partial \omega^{n}} \\ \frac{1}{v_{g}} = \frac{n_{g}}{c} = \beta_{1} &= \frac{1}{c} \left(n + \nu \frac{\partial n}{\partial \nu} \right) \\ -\frac{\lambda^{2}}{2 \pi c} D = \beta_{2} &= \frac{1}{2 \pi c} \left( 2 \frac{\partial n}{\partial \nu} + \nu \frac{\partial^{2} n}{\partial \nu^{2}} \right) \end{align}\end{split}\]

where \(v_{g}\) and \(n_{g}\) are the group velocity and group index, and \(\beta_{2}\) and \(D\) are the group velocity dispersion (GVD) and dispersion parameter. The dispersion is separated into two categories depending on sign:

\[\begin{split}\begin{align} \text{Normal} &: \beta_2 \gt 0, \ D \lt 0 \\ \text{Anomalous} &: \beta_2 \lt 0, \ D \gt 0 \end{align}\end{split}\]

In most cases, materials have normal dispersion, so high frequency components propagate more slowly than low frequency components. \(\beta_{2}\) and \(D\) are typically given in units of \(\text{fs}^2/\text{mm}\) and \(\text{ps}/(\text{nm km})\), respectively.
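These quantities can be estimated numerically from any model of \(n\). The sketch below uses the Sellmeier equation for fused silica with the commonly quoted Malitson (1965) coefficients, which are an assumption brought in for illustration, and approximates the derivatives of \(\beta\) with central finite differences. The function names and step size are likewise illustrative.

```python
import math

C = 299792458.0  # speed of light in vacuum, m/s

# Sellmeier coefficients for fused silica (Malitson 1965); resonances in micrometers
SELLMEIER_B = (0.6961663, 0.4079426, 0.8974794)
SELLMEIER_L_UM = (0.0684043, 0.1162414, 9.896161)


def n_silica(omega: float) -> float:
    """Refractive index of fused silica at angular frequency omega (rad/s)."""
    lam2 = (2.0 * math.pi * C / omega * 1e6) ** 2  # wavelength squared, um^2
    n_sq = 1.0 + sum(b * lam2 / (lam2 - l ** 2)
                     for b, l in zip(SELLMEIER_B, SELLMEIER_L_UM))
    return math.sqrt(n_sq)


def beta(omega: float) -> float:
    """Phase coefficient beta = (omega / c) n, in rad/m."""
    return omega / C * n_silica(omega)


def beta_derivatives(omega: float, rel_step: float = 1e-3):
    """Central finite-difference estimates of beta_1 (s/m) and beta_2 (s^2/m)."""
    h = rel_step * omega
    b_minus, b_zero, b_plus = beta(omega - h), beta(omega), beta(omega + h)
    beta_1 = (b_plus - b_minus) / (2.0 * h)
    beta_2 = (b_plus - 2.0 * b_zero + b_minus) / h ** 2
    return beta_1, beta_2


def dispersion_parameter(beta_2: float, wavelength: float) -> float:
    """D = -(2 pi c / lambda^2) beta_2, in s/m^2."""
    return -2.0 * math.pi * C / wavelength ** 2 * beta_2


def summarize(wavelength: float):
    """Return (n, n_g, beta_2 in fs^2/mm, D in ps/(nm km)) at a vacuum wavelength in m."""
    omega = 2.0 * math.pi * C / wavelength
    beta_1, beta_2 = beta_derivatives(omega)
    d = dispersion_parameter(beta_2, wavelength)
    return n_silica(omega), C * beta_1, beta_2 * 1e27, d * 1e6


n_1550, ng_1550, gvd_1550, d_1550 = summarize(1550e-9)
n_800, ng_800, gvd_800, d_800 = summarize(800e-9)
```

Under these assumptions the model gives normal dispersion at 800 nm and anomalous dispersion at 1550 nm, with the sign change occurring between the two wavelengths.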

Waveguides#

An interesting class of devices takes advantage of structures in which the refractive index is greater within some cross-sectional region than in the surrounding material. Such a configuration guides optical radiation along a set of spatial modes. Optical Waveguide Theory (1983) by Snyder and Love is an invaluable reference on this topic and the reader is encouraged to browse this resource for a more in-depth discussion of orthogonality, waveguide modes, and propagation equations.

Eigenmode Equations#

Waveguide modes may be derived by assuming that the optical medium is source free, that there is no nonlinearity, and that the relative electric permittivity is longitudinally invariant, separable into transverse and longitudinal components, and lossless. Under these conditions the wave equations become eigenvalue equations for the electric and magnetic fields:

\[\begin{split}\begin{align} \text{Source Free} &: \rho_f=0, \quad \mathbf{J}_f=0 \\ \text{Ignore Nonlinearity} &: \mathbf{P}_{NL}=0 \\ \text{Longitudinally Invariant} &: \frac{\partial}{\partial z} \epsilon = 0 \\ \text{Transverse Separable} &: \epsilon = \epsilon_t + \epsilon_z \\ \text{Lossless} &: \epsilon = \epsilon^{*T} = \epsilon^\dagger \end{align}\end{split}\]

These eigenvalue equations are derived by breaking apart Maxwell’s equations into transverse and longitudinal components (see the supplemental for more details).

Transverse Wave Equations#

In the \(\left(\omega, \left[\mathbf{x}_t, k_z\right]\right)\) mixed Fourier domain:

\[\begin{split}\begin{align} & \begin{split} \mathbf{0} &= k_z^2 \ \mathbf{E}_t - \nabla^2_t \mathbf{E}_t - \frac{\omega^2}{c^2} \left(\epsilon_t \cdot \mathbf{E}_t\right) \\ & + \nabla_t \left(\nabla_t \cdot \left(\left(\mathbf{1} - \epsilon_{zz}^{-1} \, \epsilon_t \right) \cdot \mathbf{E}_t \right)\right) + \nabla_t \left(\left( \nabla_t \epsilon_{zz}^{-1} \right) \cdot \left(\epsilon_t \cdot \mathbf{E}_t\right)\right) \end{split} \\ \\ & \begin{split} \mathbf{0} &= k_z^2 \, \mathbf{H}_t - \epsilon_{zz}^{-1} \, \mathbf{\tilde{\epsilon}}_t \cdot \left(\nabla_t^2 \mathbf{H}_t\right) - \frac{\omega^2}{c^2} \left(\mathbf{\tilde{\epsilon}}_t \cdot \mathbf{H}_t\right) \\ & - \left(\mathbf{1} - \epsilon_{zz}^{-1} \, \mathbf{\tilde{\epsilon}}_t\right) \cdot \nabla_t \left(\nabla_t \cdot \mathbf{H}_t\right) + \mathbf{\tilde{\epsilon}}_t \cdot \left(\left(\nabla_t \epsilon_{zz}^{-1}\right) \times \left(\nabla_t \times \mathbf{H}_t\right)\right) \end{split} \end{align}\end{split}\]

with \(\epsilon\) and \(\mathbf{\tilde{\epsilon}}\) defined as,

\[\begin{split}\begin{align} \epsilon &= \left( \begin{array}{ccc} \epsilon_{xx} & \epsilon_{xy} & 0 \\ \epsilon_{yx} & \epsilon_{yy} & 0 \\ 0 & 0 & \epsilon_{zz} \end{array}\right) \\ \mathbf{\tilde{\epsilon}} &= \left( \begin{array}{ccc} \epsilon_{yy} & -\epsilon_{yx} & 0 \\ -\epsilon_{xy} & \epsilon_{xx} & 0 \\ 0 & 0 & \epsilon_{zz} \end{array}\right) \end{align}\end{split}\]

The basic form and properties of these equations can be more easily seen by restricting the physical parameter space. If \(\nabla \epsilon = 0\) except at discrete boundaries:

\[\begin{split}\begin{align} \mathbf{0} &= k_z^2 \, \mathbf{E}_t - \nabla^2_t \mathbf{E}_t - \frac{\omega^2}{c^2} \, \epsilon \, \mathbf{E}_t \\ \mathbf{0} &= k_z^2 \, \mathbf{H}_t - \nabla^2_t \mathbf{H}_t - \frac{\omega^2}{c^2} \, \epsilon \, \mathbf{H}_t \end{align}\end{split}\]

Thus, the transverse wave equations are eigenvalue equations for the transverse space-domain fields \(\mathbf{E}\) and \(\mathbf{H}\). For every frequency \(\omega\) there is a set of \(\mathbf{E}_n\) and \(\mathbf{H}_n\) that satisfy the wave equations for discrete \(k_z = \beta_n\), where \(n\) is a unique mode identifier. These modes are typically found numerically. See Fallahkhair (2008) for an example of a mode solver that implements the transverse wave equations using finite difference techniques.
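To make the eigenvalue structure concrete, the sketch below solves the simplified equation for the simplest possible geometry: a symmetric step-index slab with a TE-polarized field that varies only along \(x\). This is not the finite difference method referenced above. The slab geometry, the refractive indices, and the function name are assumptions chosen for illustration. Inside the core the even solution is \(\cos\!\left(\kappa \, x\right)\), outside it decays as \(e^{-\gamma \left|x\right|}\), and continuity of the field and its derivative gives \(\kappa \tan\!\left(\kappa \, a\right) = \gamma\), which is solved here by bisection.

```python
import math


def fundamental_te_mode(wavelength, half_width, n_core, n_clad):
    """Return (beta, n_eff) of the fundamental even TE mode of a symmetric slab.

    Solves u tan(u) = sqrt(v^2 - u^2) for u = kappa * half_width by bisection,
    where v = k0 * half_width * sqrt(n_core^2 - n_clad^2).
    """
    k0 = 2.0 * math.pi / wavelength
    v = k0 * half_width * math.sqrt(n_core ** 2 - n_clad ** 2)

    def residual(u):
        return u * math.tan(u) - math.sqrt(v ** 2 - u ** 2)

    # The residual is negative at u = 0 and positive at the upper bracket
    lo, hi = 0.0, min(v, math.pi / 2.0 - 1e-9)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    kappa = 0.5 * (lo + hi) / half_width
    beta = math.sqrt((k0 * n_core) ** 2 - kappa ** 2)
    return beta, beta / k0


# Illustrative values: 1 um thick core of index 2.0 in a cladding of index 1.444
wavelength, half_width, n_core, n_clad = 1.55e-6, 0.5e-6, 2.0, 1.444
beta_0, n_eff = fundamental_te_mode(wavelength, half_width, n_core, n_clad)
```

The resulting effective index lies between the cladding and core indices, as expected for a bound mode. Higher-order modes correspond to further roots of the same condition at larger \(\kappa\).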

Longitudinal Modes#

By inspection of Maxwell’s equations in the \(\left(\omega, \left[\mathbf{x}_t, k_z\right]\right)\) mixed Fourier domain, the modal solutions of the electric and magnetic field must take the following form:

\[\begin{split}\begin{align} \mathbf{E} &\to \delta\!\left[k_z - \beta_n\!\left[\omega\right] \right] \, \mathbf{E}_n \\ \mathbf{H} &\to \delta\!\left[k_z - \beta_n\!\left[\omega\right] \right] \, \mathbf{H}_n \end{align}\end{split}\]

where \(\delta\!\left[...\right]\) is Dirac’s delta function.

Using this relationship, it is possible to show that in the mixed frequency-space domain \(\left(\omega, \mathbf{x}\right)\) the electric and magnetic fields are also eigenmodes of the spatial derivative along the propagation axis \(z\):

\[\begin{split}\begin{align} \frac{\partial}{\partial z} \mathbf{E}_n\!\left[\omega, z\right] &= \frac{\partial}{\partial z} \int_{-\infty}^{\infty} \delta\!\left[k_z - \beta_n\right] \mathbf{E}_n\!\left[\omega, k_z\right] e^{-i \left(k_z \ z\right)} \frac{dk_z}{2 \pi} \\ &= \int_{-\infty}^{\infty} \left(-i \ k_z\right)\delta\!\left[k_z - \beta_n\right] \mathbf{E}_n\!\left[\omega, k_z\right] e^{-i \left(k_z \ z\right)} \frac{dk_z}{2 \pi} \\ &= -i \ \beta_n \ \mathbf{E}_n\!\left[\omega, z\right] \\ \frac{\partial}{\partial z} \mathbf{H}_n\!\left[\omega, z\right] &= -i \ \beta_n \ \mathbf{H}_n\!\left[\omega, z\right] \end{align}\end{split}\]

Forward and Backward Modes#

Forward traveling and backward traveling modes both satisfy Maxwell’s equations. Given the relationship between transverse and longitudinal components there must be a fixed sign difference between the complementary modes. Of the two possible transformations between the forward and backward fields, the following convention will be used throughout the rest of this document.

\[\begin{gather} \beta_- = -\beta_+ \end{gather}\]
\[\begin{split}\begin{align} \mathbf{E}^-_t &= +\mathbf{E}^+_t & E^-_z &= -E^+_z \\ \mathbf{H}^-_t &= -\mathbf{H}^+_t & H^-_z &= +H^+_z \end{align}\end{split}\]

Reciprocity Theorem#

The reciprocity theorem is derived from the two-dimensional form of the divergence theorem:

\[\begin{gather} \int_A \nabla \cdot \mathbf{F} \ dA = \frac{\partial}{\partial z} \int_A \mathbf{F} \cdot \hat{\mathbf{z}} \, dA + \oint_l \mathbf{F} \cdot \hat{\mathbf{n}} \, dl \end{gather}\]

where \(A\) is an arbitrary cross-sectional area of a waveguide, and \(\hat{\mathbf{z}}\) is the unit vector parallel to the propagation axis. The line integral is along the boundary of \(A\), and \(\hat{\mathbf{n}}\) is the outward pointing unit vector normal to the boundary \(l\) in the plane of \(A\).

For optical waveguides, \(A\) is taken to be the infinite cross-section \(A_\infty\). The line integral is then over the circle \(r = \infty\), where \(r\) is the cylindrical radius. If the fields correspond to bound modes, then \(\mathbf{F}\) vanishes as \(r \to \infty\) (see Snyder and Love). The reciprocity theorem is obtained by expanding the integrals across the entire transverse plane and then dropping the line integral:

\[\begin{gather} \int_{A_\infty} \nabla \cdot \mathbf{F} \, dA = \frac{\partial}{\partial z} \int_{A_\infty} \mathbf{F} \cdot \hat{\mathbf{z}} \, dA \end{gather}\]

Orthogonality#

Optical modes are orthogonal to each other. This relationship is derived by applying the reciprocity theorem to Poynting-vector-like combinations of waveguide modes. We start by defining a composite vector function \(\mathbf{F}_c\) in the frequency-space \(\left(\omega, \mathbf{x}\right)\) domain containing two arbitrary modes of the waveguide (\(\mathbf{E}_r\), \(\mathbf{H}_r\), and \(\mathbf{E}_s\), \(\mathbf{H}_s\)):

\[\mathbf{F}_c = \mathbf{E}_r \times \mathbf{H}_s^* + \mathbf{E}_s^* \times \mathbf{H}_r\]

The divergence of this function is \(0\):

\[\begin{split}\begin{align} \nabla \cdot \mathbf{F}_c &= \mathbf{H}_s^* \cdot \left(\nabla \times \mathbf{E}_r\right) - \mathbf{E}_r \cdot \left(\nabla \times \mathbf{H}_s\right)^* + \mathbf{H}_r \cdot \left(\nabla \times \mathbf{E}_s\right)^* - \mathbf{E}_s^* \cdot \left(\nabla \times \mathbf{H}_r\right) \\ &= \mathbf{H}_s^* \cdot \left(- i \, \omega \, \mu_0 \, \mathbf{H}_r\right) - \mathbf{E}_r \cdot \left(-i \, \omega \, \epsilon_0 \, \mathbf{\epsilon}^* \cdot \mathbf{E}_s^*\right) + \mathbf{H}_r \cdot \left(i \, \omega \, \mu_0 \, \mathbf{H}_s^*\right) - \mathbf{E}_s^* \cdot \left(i \, \omega \, \epsilon_0 \, \mathbf{\epsilon} \cdot \mathbf{E}_r\right) \\ &= i \, \omega \, \epsilon_0 \left(\mathbf{E}_r \cdot \mathbf{\epsilon}^* \cdot \mathbf{E}_s^* - \mathbf{E}_s^* \cdot \mathbf{\epsilon} \cdot \mathbf{E}_r\right) \\ &= i \, \omega \, \epsilon_0 \left(\mathbf{E}_r \cdot \left(\mathbf{\epsilon}^* - \mathbf{\epsilon}^T\right) \cdot \mathbf{E}_s^*\right) \\ &= 0 \end{align}\end{split}\]

where \(\mathbf{\epsilon}^\ast = \mathbf{\epsilon}^T\), or equivalently \(\mathbf{\epsilon}^\dagger = \mathbf{\epsilon}\), which assumes that the waveguide is lossless.

Next, consider two forward propagating modes \(r\) and \(s\):

\[\begin{split}\begin{align} 0 &= \frac{\partial}{\partial z} \int_{A_\infty} \mathbf{F}_c \cdot \hat{\mathbf{z}} \ dA \\ &= \left(-i \ \beta_r + i \ \beta_s\right) \int_{A_\infty} \left(\mathbf{E}_r \times \mathbf{H}_s^\ast + \mathbf{E}_s^\ast \times \mathbf{H}_r\right) \cdot \hat{\mathbf{z}} \ dA \\ \end{align}\end{split}\]

Finally, consider the same forward propagating mode \(r\), but backward propagating mode \(s\):

\[\begin{split}\begin{align} 0 &= \frac{\partial}{\partial z} \int_{A_\infty} \mathbf{F}_c \cdot \hat{\mathbf{z}} \, dA \\ &= \left(-i \, \beta_r + i \, \beta_{-s}\right) \int_{A_\infty} \left(\mathbf{E}_r \times \mathbf{H}_{-s}^\ast + \mathbf{E}_{-s}^\ast \times \mathbf{H}_r\right) \cdot \hat{\mathbf{z}} \, dA \\ &= \left(-i \, \beta_r - i \, \beta_s\right) \int_{A_\infty} \left(-\mathbf{E}_r \times \mathbf{H}_{s}^\ast + \mathbf{E}_{s}^\ast \times \mathbf{H}_r\right) \cdot \hat{\mathbf{z}} \, dA \end{align}\end{split}\]

Dividing out the phase coefficients and then adding or subtracting the two relations gives the orthogonality condition:

\[\begin{split}\begin{gather} \text{if $r \ne s$} \\ 0 = \int_{A_\infty} \left(\mathbf{E}_r \times \mathbf{H}_{s}^\ast\right) \cdot \hat{\mathbf{z}} \, dA = \int_{A_\infty} \left(\mathbf{E}_s^\ast \times \mathbf{H}_{r}\right) \cdot \hat{\mathbf{z}} \, dA \end{gather}\end{split}\]
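The final step can be written out explicitly. As a sketch, abbreviate the two overlap integrals as \(I_1\) and \(I_2\) (labels introduced here only for illustration) and assume the two forward modes are non-degenerate, \(\beta_r \ne \beta_s\), with both \(\beta_r\) and \(\beta_s\) positive:

\[\begin{split}\begin{align} I_1 &= \int_{A_\infty} \left(\mathbf{E}_r \times \mathbf{H}_{s}^\ast\right) \cdot \hat{\mathbf{z}} \, dA & I_2 &= \int_{A_\infty} \left(\mathbf{E}_s^\ast \times \mathbf{H}_{r}\right) \cdot \hat{\mathbf{z}} \, dA \end{align}\end{split}\]
\[\begin{split}\begin{align} 0 &= -i \left(\beta_r - \beta_s\right) \left(I_1 + I_2\right) & &\Rightarrow & I_1 + I_2 &= 0 \\ 0 &= -i \left(\beta_r + \beta_s\right) \left(I_2 - I_1\right) & &\Rightarrow & I_2 - I_1 &= 0 \end{align}\end{split}\]

Adding and subtracting the two results gives \(I_2 = 0\) and \(I_1 = 0\) separately.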

Orthonormal Modes#

Orthogonal sets can reconstruct arbitrary distributions through linear combination. For electromagnetic fields these sets are best given as the product of amplitudes and orthonormal spatial modes. In the frequency-space \(\left(\omega, \mathbf{x}\right)\) domain:

\[\begin{align} \mathbf{E} &= \sum_n a_n \, e^{-i \, \beta_n \, z} \, \hat{\mathbf{e}}_n & \mathbf{H} &= \sum_n a_n \, e^{-i \, \beta_n \, z} \, \hat{\mathbf{h}}_n \end{align}\]

The orthogonality relationship applied between the same two modes directly yields the Poynting vector of the electromagnetic field. We use this information to set the normalization and units:

\[\begin{split}\begin{align} I_n = \mathbf{S}_n \cdot \hat{\mathbf{z}} &= \left(\mathbf{E}_n \times \mathbf{H}_n^*\right) \cdot \hat{\mathbf{z}} \\ &= \left|a_n\right|^2 \left(\hat{\mathbf{e}}_n \times \hat{\mathbf{h}}_n^*\right) \cdot \hat{\mathbf{z}} \end{align}\end{split}\]

Normalization#

The modes \(\hat{\mathbf{e}}_n\) and \(\hat{\mathbf{h}}_n\) are normalized so that the integral over the transverse plane is equal to 1:

\[\begin{gather} \int_{A_\infty} \left|\left(\hat{\mathbf{e}}_n \times \hat{\mathbf{h}}_n^*\right) \cdot \hat{\mathbf{z}}\right| dA = 1 \end{gather}\]

The total energy is the integral of \(\left|a_n\right|^2\) over all frequencies or, equivalently, over all time:

\[\begin{gather} E_n = \int_{-\infty}^{+\infty} \left|a_n\right|^2 d\nu = \int_{-\infty}^{+\infty} \left|a_n\right|^2 dt \end{gather}\]
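The equality of the two integrals is Parseval's theorem, and it can be checked on a discrete grid. The sketch below uses a direct discrete Fourier transform scaled by the time step so that it approximates the continuous transform. The Gaussian amplitude and grid parameters are illustrative assumptions.

```python
import cmath
import math


def dft(samples, dt):
    """Direct DFT scaled by dt, approximating F[nu] = int f[t] exp(-i 2 pi nu t) dt."""
    n = len(samples)
    return [dt * sum(x * cmath.exp(-2j * math.pi * j * m / n)
                     for j, x in enumerate(samples))
            for m in range(n)]


n_points, dt = 128, 0.05
times = [(j - n_points // 2) * dt for j in range(n_points)]
a_t = [math.exp(-t ** 2) for t in times]   # assumed Gaussian amplitude a_n[t]
a_nu = dft(a_t, dt)                        # a_n[nu] on a grid with spacing d_nu
d_nu = 1.0 / (n_points * dt)

energy_time = sum(abs(x) ** 2 for x in a_t) * dt
energy_freq = sum(abs(x) ** 2 for x in a_nu) * d_nu
```

The shift of the time origin only changes the phase of the spectrum, so it does not affect either energy sum.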

Replacing \(\hat{\mathbf{h}}\) with the transverse curl relationship yields the normalization in terms of the electric field only:

\[\begin{split}\begin{align} \hat{\mathbf{h}}_t &= \hat{\mathbf{z}} \times \left(\epsilon_0 \, c \, n_\text{eff} \left(\hat{\mathbf{e}}_t - \frac{i}{\beta} \nabla_t \hat{e}_z\right)\right) \\ \left(\hat{\mathbf{e}} \times \hat{\mathbf{h}}^*\right) \cdot \hat{\mathbf{z}} &= \epsilon_0 \, c \, n_\text{eff} \left(\left|\hat{\mathbf{e}}_t\right|^2 + \frac{i}{\beta} \hat{\mathbf{e}}_t \cdot \nabla_t \hat{e}_z^* \right) \end{align}\end{split}\]
\[\begin{gather} \int_{A_\infty} \left(\left|\hat{\mathbf{e}}_t\right|^2 + \frac{i}{\beta} \hat{\mathbf{e}}_t \cdot \nabla_t \hat{e}_z^* \right) dA = \frac{1}{\epsilon_0 \, c \, n_\text{eff}} \end{gather}\]

where the mode identifiers for \(\hat{\mathbf{e}}\) and \(\hat{\mathbf{h}}\) are implicit. The effective refractive index \(n_\text{eff}\) is defined through the mode’s \(\beta\) eigenvalue, \(\beta = n_\text{eff} \, \omega / c\).

From the divergence relations, the second term in the above equation scales with one over \(\beta^2\) times the second derivative of \(\hat{\mathbf{e}}_t\):

\[\begin{gather} \frac{i}{\beta} \nabla_t \hat{e}_z^* \sim \frac{1}{\beta^2} \nabla_t \left(\nabla_t \cdot \hat{\mathbf{e}}_t^*\right) \end{gather}\]

Thus, smaller \(\beta\) (lower frequencies) and smaller waveguide cross sections or higher order modes (larger spatial derivatives) increase the importance of this term. The first and second terms become comparable when the transverse spatial derivative is of the same order of magnitude as the phase coefficient \(\beta\).

Units#

The above normalization relations fix the units of their respective quantities:

\[\begin{split}\begin{align} \left(\hat{\mathbf{e}}_n \times \hat{\mathbf{h}}_n^*\right) &\sim \frac{1}{\text{area}} \\ \\ \bigl|a_n\bigr|^2 &\sim \text{energy density} \end{align}\end{split}\]
\[\begin{align} \Bigl|a_n\!\left[\nu\right]\Bigr|^2 &\sim \frac{\text{energy}}{\text{frequency}} = \text{action} & \Bigl|a_n\!\left[t\right]\Bigr|^2 &\sim \frac{\text{energy}}{\text{time}} = \text{power} \end{align}\]
\[\begin{align} \hat{e}^2 &\sim \frac{1}{\left(\epsilon_0 \ c\right) \text{area}} & \hat{h}^2 &\sim \frac{1}{\left(\mu_0 \ c\right) \text{area}} \end{align}\]

Effective Area#

The effective or equivalent area provides a reasonable measure of a mode’s spatial extent. It is a derived quantity and is defined as the ratio of the squared integral of the transverse intensity to the integral of the squared transverse intensity:

\[\begin{split}\begin{align} A_\text{eff} = \frac{\left(\int_{A_\infty} \mathbf{S}_n \cdot \hat{\mathbf{z}} \ dA\right)^2}{\int_{A_\infty} \left(\mathbf{S}_n \cdot \hat{\mathbf{z}}\right)^2 dA} &= \left(\int_{A_\infty} \left(\left(\hat{\mathbf{e}}_n \times \hat{\mathbf{h}}_n^*\right) \cdot \hat{\mathbf{z}}\right)^2 dA\right)^{-1} \\ &= \frac{1}{\left(\epsilon_0 \ c \ n_\text{eff}\right)^2} \left(\int_{A_\infty} \left(\left|\hat{\mathbf{e}}_t\right|^2 + \frac{i}{\beta} \hat{\mathbf{e}}_t \cdot \nabla_t \hat{e}_z^* \right)^2 dA\right)^{-1} \end{align}\end{split}\]

This definition has two illuminating cases. In the limit of a constant mode profile, the effective area is the area over which the mode is supported. For a Gaussian profile with \(1/e^2\) radius of \(r\), this formula gives an effective area of \(\pi \, r^2\), the area within the \(1/e^2\) radius.
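Both cases can be verified numerically. The sketch below evaluates the ratio of integrals for a radially symmetric intensity profile with a midpoint rule. The function names, the radius, and the integration limits are assumptions chosen for illustration.

```python
import math


def effective_area(intensity, r_max, steps=20000):
    """A_eff = (int I dA)^2 / int I^2 dA for a radially symmetric profile I[rho]."""
    dr = r_max / steps
    first_moment = 0.0
    second_moment = 0.0
    for i in range(steps):
        rho = (i + 0.5) * dr                 # midpoint of the radial bin
        weight = 2.0 * math.pi * rho * dr    # area of the annulus
        value = intensity(rho)
        first_moment += value * weight
        second_moment += value * value * weight
    return first_moment ** 2 / second_moment


r = 3e-6  # assumed 1/e^2 intensity radius (and top-hat radius), in meters


def gaussian(rho):
    """Gaussian intensity with 1/e^2 radius r."""
    return math.exp(-2.0 * rho ** 2 / r ** 2)


def top_hat(rho):
    """Constant intensity inside radius r, zero outside."""
    return 1.0 if rho <= r else 0.0


area_gaussian = effective_area(gaussian, 6.0 * r)
area_top_hat = effective_area(top_hat, 2.0 * r)
```

Under these assumptions both profiles return an effective area of \(\pi \, r^2\) to within the accuracy of the numerical integration.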